[R] Problem calculating multiple regressions on a data frame.

Gabor Grothendieck ggrothendieck at gmail.com
Tue Apr 27 14:27:33 CEST 2010


Replace lm(...) with try(lm(...))
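
As a minimal sketch of that suggestion (the toy data below is hypothetical and only stands in for the poster's data frame; the column names are taken from the question): split() on two factors keeps empty ID/variable combinations by default, and lm() on such an empty slice is one way to get the "0 (non-NA) cases" error. Wrapping the call in try() lets lapply() continue past those slices.

```r
# Hypothetical toy data; only the column names come from the original post
theTestLineal <- data.frame(
  ids      = c("a", "a", "a", "b", "b", "b"),
  variable = c("x", "x", "x", "y", "y", "y"),
  seqMonth = c(1, 2, 3, 1, 2, 3),
  value    = c(2, 4, 6, 1, 1, 1)
)

# split() on two factors keeps the empty combinations (b.x, a.y) by default
theSplit <- split(theTestLineal,
                  list(theTestLineal$ids, theTestLineal$variable))

# try() turns a failing fit into a "try-error" object instead of stopping lapply()
fits <- lapply(theSplit, function(x)
  try(lm(value ~ seqMonth, data = x), silent = TRUE))

# Keep only the groups where the fit succeeded, then pull out the slopes
ok     <- !vapply(fits, inherits, logical(1), what = "try-error")
slopes <- vapply(fits[ok], function(m) unname(coef(m)[2]), numeric(1))
```

Passing drop = TRUE to split() would avoid the empty slices in the first place, but try() also covers any other slice on which the fit fails.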


On Tue, Apr 27, 2010 at 7:48 AM, Luis Sisamón <luis.sisamon at gmail.com> wrote:
> Hi there,
> I am stuck trying to solve what should be a fairly easy problem.
> I have a data frame that essentially consists of (ID, time as seqMonth,
> variable, value) and I want to find the regression coefficient of value vs
> time for each combination of ID and variable.
> I have tried several approaches and none of them seems to work as I
> expected.
> For example, I have tried:
>
> theSplit <- split(theTestLineal, list(as.factor(theTestLineal$ids),
> as.factor(theTestLineal$variable)))
>
> I can then use
> lm(value~seqMonth,data=theSplit[[1]])
> ...
> lm(value~seqMonth,data=theSplit[[4]])
>
> That works well (it fails for some combinations of ID and variable where
> there is only one data point).
>
> However, when I try to use lapply:
> lapply(theSplit,function(x)lm(value~seqMonth,data=x,na.action=na.exclude))
>
> it fails, with error message:
> Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
>  0 (non-NA) cases
>
> I have tried to change the na.action with no success (na.pass, na.fail,
> na.exclude... all give the same error message)
>
>
> I have also tried to follow the approach suggested by Charles Sharpsteen
> (http://www.mail-archive.com/r-help@r-project.org/msg74759.html) with
> similar results.
> The code is as follows:
> theModels <- by( theTestLineal, list( theTestLineal$ids,
> theTestLineal$variable), function( dataSlice ){
> linMod <- lm( value ~ seqMonth, data = dataSlice )
>
> # Slope and intercept may be recovered from the output of the coef() function:
> intercept <- coef( linMod )[1]
> slope <- coef( linMod )[2]
>
> # The R-Squared value is returned by the summary() function:
> rsq <- summary( linMod )[[ 'r.squared' ]]
>
> # The summary function also provides statistics for the F-distribution,
> # extract them, reformat as a list, rename and feed to pf() using do.call()
> # in order to get the p-value:
> fStats <- as.list( summary( linMod )[[ 'fstatistic' ]] )
> names( fStats ) <- c( 'q', 'df1', 'df2' )
> fStats[[ 'lower.tail' ]] <- FALSE
>
> pVal <- do.call( pf, fStats )
>
> return(data.frame( slope, intercept, rsq, pVal ))
> })
>
> Any help will be appreciated!


