[R] apply lm function to dataset split by two variables

Michael Dewey info at aghmed.fsnet.co.uk
Wed Sep 28 13:48:10 CEST 2011


At 09:41 28/09/2011, Elena Guijarro wrote:

>Dear all,
>
>I am not fluent in R and am struggling to 1) apply a lm to a weight-size
>dataset, thus the model has to run separately for each species, each
>year; 2) extract coefs, r-squared, n, etc. The data look like this:
>
>year    sps     cm      w
>2009    50      16      22
>2009    50      17      42
>2009    50      18      45
>2009    51      15      45
>2009    51      16      53
>2009    51      17      73
>2010    50      15      22
>2010    50      16      41
>2010    50      16      21
>2010    50      17      36
>2010    51      15      43
>2010    51      16      67
>2010    51      17      79
>
>
>
>The following script works for data from a single year, but I don't find
>a way to subset the data by sps AND year and get the function running:

I think lmList from the nlme package does this for you: it fits a
separate lm for each level of a grouping factor. It comes with some
helpful extractors (coef, summary and so on), or you can write your
own as you have done. Personally I would return a list rather than a
vector, but that is a matter of taste.
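
A minimal sketch of the lmList route, under some assumptions: the data
frame is called dat and the combined grouping factor grp (both names are
made up here), and species and year are pasted into a single factor
because that is the simplest way I know to group by both at once. The
data are the rows from the original post.

```r
library(nlme)

# Example data copied from the original post
dat <- data.frame(
  year = rep(c(2009, 2010), times = c(6, 7)),
  sps  = c(50, 50, 50, 51, 51, 51, 50, 50, 50, 50, 51, 51, 51),
  cm   = c(16, 17, 18, 15, 16, 17, 15, 16, 16, 17, 15, 16, 17),
  w    = c(22, 42, 45, 45, 53, 73, 22, 41, 21, 36, 43, 67, 79)
)

# One factor level per species-by-year combination (hypothetical name)
dat$grp <- interaction(dat$sps, dat$year, drop = TRUE)

# Fit the same model separately within each level of grp
fits <- lmList(log(w) ~ log(cm + 0.5) | grp, data = dat)

# One row of coefficients per species-by-year group
cf <- coef(fits)
cf
```

summary(fits) should give the per-group standard errors as well, if those
are needed.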


>f <- function(data) lm(log(w) ~ log(cm+0.5), data = data)
>v <- lapply(split(data, data$sps), f)
>
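
If you would rather stay with split and lapply, split accepts a list of
factors, so the same approach can probably be extended to species AND
year. A sketch, assuming the data frame is called dat (my name, not from
the post) and holds the rows shown above:

```r
# Example data copied from the original post
dat <- data.frame(
  year = rep(c(2009, 2010), times = c(6, 7)),
  sps  = c(50, 50, 50, 51, 51, 51, 50, 50, 50, 50, 51, 51, 51),
  cm   = c(16, 17, 18, 15, 16, 17, 15, 16, 16, 17, 15, 16, 17),
  w    = c(22, 42, 45, 45, 53, 73, 22, 41, 21, 36, 43, 67, 79)
)

# Same model as in the post; the argument is one sub-data-frame
f <- function(d) lm(log(w) ~ log(cm + 0.5), data = d)

# Splitting on a list of two factors gives one piece per combination;
# drop = TRUE discards combinations that do not occur in the data
pieces <- split(dat, list(dat$sps, dat$year), drop = TRUE)
v <- lapply(pieces, f)

# List names are built as "sps.year"
names(v)
```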
>and then I can extract the data with this script from Peter Solymos
>(although I do not get the number of points used in the analysis):
>
>myFun <-
>function(lm)
>{
>out <- c(lm$coefficients[1],
>      lm$coefficients[2],
>      length(lm$run1$model$y),
>      summary(lm)$coefficients[2,2],
>      pf(summary(lm)$fstatistic[1], summary(lm)$fstatistic[2],
>summary(lm)$fstatistic[3], lower.tail = FALSE),
>      summary(lm)$r.squared)
>names(out) <- c("intercept","slope","n","slope.SE","p.value","r.squared")
>return(out)}
>
>results <- list()
>for (i in 1:length(v)) results[[names(v)[i]]] <- myFun(v[[i]])
>as.data.frame(results)
>
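
On the missing number of points: as far as I can see, a fitted lm object
has no component called run1, so lm$run1$model$y is NULL and its length
is 0. nobs() should give the count directly. Below is a sketch of an
extractor along the lines of yours that returns a one-row data frame per
fit and avoids the explicit loop. The names dat and myFun2 are mine, and
the data are the rows from the original post.

```r
# Example data copied from the original post
dat <- data.frame(
  year = rep(c(2009, 2010), times = c(6, 7)),
  sps  = c(50, 50, 50, 51, 51, 51, 50, 50, 50, 50, 51, 51, 51),
  cm   = c(16, 17, 18, 15, 16, 17, 15, 16, 16, 17, 15, 16, 17),
  w    = c(22, 42, 45, 45, 53, 73, 22, 41, 21, 36, 43, 67, 79)
)

f <- function(d) lm(log(w) ~ log(cm + 0.5), data = d)
v <- lapply(split(dat, list(dat$sps, dat$year), drop = TRUE), f)

# Hypothetical replacement for myFun: one row of statistics per model
myFun2 <- function(fit) {
  s  <- summary(fit)      # compute the summary once and reuse it
  fs <- s$fstatistic      # F value, numerator df, denominator df
  data.frame(
    intercept = unname(coef(fit)[1]),
    slope     = unname(coef(fit)[2]),
    n         = nobs(fit),             # number of points used in the fit
    slope.SE  = s$coefficients[2, 2],
    p.value   = unname(pf(fs[1], fs[2], fs[3], lower.tail = FALSE)),
    r.squared = s$r.squared
  )
}

# Stack the one-row data frames; row names come from names(v)
results <- do.call(rbind, lapply(v, myFun2))
results
```

This gives one row per species-by-year group rather than one column, which
I find easier to work with afterwards.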
>I have checked the plyr package, but the example that fits my data best
>uses a for loop and I would like to avoid these. I have also tried the
>following (among many other options) without results:
>
>v<-tapply(data$w,list(data$cm,data$year),f)
>
>Error in is.function(FUN) : 'FUN' is missing
>
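
Whatever the cause of that particular error, I do not think tapply can
work here: it hands FUN the pieces of the single vector data$w, whereas f
needs a data frame with both w and cm. Note also that the call groups by
cm and year rather than by sps and year. by() passes sub-data-frames
instead, so something like the following sketch might be closer to what
you want (dat is my name for the data shown above):

```r
# Example data copied from the original post
dat <- data.frame(
  year = rep(c(2009, 2010), times = c(6, 7)),
  sps  = c(50, 50, 50, 51, 51, 51, 50, 50, 50, 50, 51, 51, 51),
  cm   = c(16, 17, 18, 15, 16, 17, 15, 16, 16, 17, 15, 16, 17),
  w    = c(22, 42, 45, 45, 53, 73, 22, 41, 21, 36, 43, 67, 79)
)

f <- function(d) lm(log(w) ~ log(cm + 0.5), data = d)

# by() calls f once per sps/year cell, each time with a data frame
b <- by(dat, list(sps = dat$sps, year = dat$year), f)

# The result is a 2 x 2 array of fitted models indexed by sps and year
dim(b)
coef(b[["51", "2010"]])
```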
>Any ideas?
>
>Thanks for your help,
>
>Elena
>
>

Michael Dewey
info at aghmed.fsnet.co.uk
http://www.aghmed.fsnet.co.uk/home.html


