[R] linear model and by()

David Winsemius dwinsemius at comcast.net
Fri Nov 13 03:13:22 CET 2009


On Nov 12, 2009, at 8:26 PM, Sam Albers wrote:

> Hello R list,
>
> This is a question for anyone who has used the by() command. I would  
> like to
> perform a regression on a data frame by several factors. Using by()
> I think
> that I have been able to perform this using the following:
>
>> lm.r <- by(master, list(Sectionf=Sectionf, startd=startd),
>>            function(x) lm(tot.c ~ starttime, data = x))
>
> So that is, I would like to perform separate regressions for each  
> level of
> Sectionf for each level of startd. Now I can get the coefficients and
> intercepts from all the fitted models. However, I am now unsure how  
> to glean
> more information from the regressions. If I follow up with this  
> command I
> get this:
>
>> summary(lm.r)
> 9-09-04.Length  9-09-04.Class  9-09-04.Mode
> 12    lm    list
> 12    lm    list
> 12    lm    list
> 9-09-11.Length  9-09-11.Class  9-09-11.Mode
> 12    lm    list
> 12    lm    list
> 12    lm    list
>
> Under normal circumstances, this usually ends up giving me more info
> about
> the fitted model, but for some reason when using the by() command it
> doesn't.

Because the results are a layer deeper than summary() is describing:
lm.r is a list of lm objects, and summary() applied to the list only
reports on its elements. Try:
lapply(lm.r, summary)

(This theory was tested with one of the standard datasets, Indometh.)

library(utils)
data(Indometh)
reglist <- by(Indometh, Indometh$Subject,
              function(x) lm(conc ~ time, data = x))
 > summary(reglist)
   Length Class Mode
1 12     lm    list
4 12     lm    list
2 12     lm    list
5 12     lm    list
6 12     lm    list
3 12     lm    list

Whereas, without the assignment to reglist, I got the expected series of
lm printouts, one per subject, so by() "works".
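As a rough sketch of how to get more out of such a list (using Indometh
again, since that is what I tested; the object names sums, coefs and r2
are just illustrative choices), one can apply summary() to each model and
then pull out the pieces of interest:

```r
# Indometh ships with base R (package 'datasets'), so no download is needed.
reglist <- by(Indometh, Indometh$Subject,
              function(x) lm(conc ~ time, data = x))

# One summary.lm object per subject
sums <- lapply(reglist, summary)

# Collect selected pieces into compact structures
coefs <- t(sapply(reglist, coef))                 # one row per model: intercept, slope
r2    <- sapply(sums, function(s) s$r.squared)    # one R-squared per model

print(coefs)
print(r2)
```

The same pattern should carry over to lm.r, with tot.c ~ starttime in
place of conc ~ time.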

> Similarly, although I wasn't hopeful that this would work, when I  
> tried to
> examine the residuals, I get these warning messages:

Similarly, for the diagnostic plots:

lapply(lm.r, plot)
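The plot() call quoted below most likely fails because lm.r itself has no
fit or res component; those belong to the individual lm objects inside
it. If a plain residuals-versus-fitted plot per model is what is wanted,
something along these lines might do (a sketch on Indometh; resid_list
and fitted_list are names made up for the illustration, and the 2 x 3
layout simply matches the six subjects):

```r
reglist <- by(Indometh, Indometh$Subject,
              function(x) lm(conc ~ time, data = x))

# Extract residuals and fitted values from each model in the list
resid_list  <- lapply(reglist, residuals)
fitted_list <- lapply(reglist, fitted)

# One panel per model; restore the graphics settings afterwards
op <- par(mfrow = c(2, 3))
for (i in seq_along(reglist)) {
  plot(fitted_list[[i]], resid_list[[i]],
       xlab = "Fitted", ylab = "Residuals",
       main = paste("Subject", names(reglist)[i]))
  abline(h = 0, lty = 2)   # reference line at zero residual
}
par(op)
```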

>
>> plot(lm.r$fit,lm.r$res,xlab="Fitted",ylab="Residuals")
> Error in plot.window(...) : need finite 'xlim' values
> In addition: Warning messages:
> 1: In min(x) : no non-missing arguments to min; returning Inf
> 2: In max(x) : no non-missing arguments to max; returning -Inf
> 3: In min(x) : no non-missing arguments to min; returning Inf
> 4: In max(x) : no non-missing arguments to max; returning -Inf
>
> So my question might be answered by someone saying that this isn't a  
> good
> way to do this. Still, can anyone suggest some strategies that might  
> help me
> out here? Specifically, how might I be able to work with these
> regressions once
> I have fitted all the models. Obviously I need to examine the  
> residuals,
> look at the model fit, etc. Any suggestions on what I might be doing  
> wrong
> would be much appreciated. I am also including a subset of the data  
> and the
> commands I used in case that is helpful.
>
> I am using Ubuntu 9.04 and R 2.8.1
>
> Thanks in advance!
>
> Sam

Sorry not to have used your data, but it's not in a form that lends
itself very well to quick testing. If you had included the input
commands, I might have tried it.
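By "input commands" I mean something like the following sketch, which
assumes the CSV text is pasted inline via read.csv(text = ...) and uses
only a handful of the posted rows for illustration. It also avoids
attach() by storing the derived columns in the data frame. Note that the
posted dates have two-digit years, so "%y" rather than "%Y" looks like
the appropriate format; with "%Y" the year appears to be read as 9, which
would explain the "9-09-04" labels in the summary() output.

```r
# A few rows of the posted data, pasted inline (subset for illustration only)
csv_text <- '"Section","starttime","startdate","tot.c"
"Upstream",0,04/09/09,0.17
"Upstream",0,04/09/09,0.19
"Middle",0,04/09/09,0.2
"Middle",0,04/09/09,0.13
"Upstream",25,04/09/09,0.17
"Upstream",25,04/09/09,0.19
"Middle",25,04/09/09,0.2
"Middle",25,04/09/09,0.13
"Upstream",0,11/09/09,0.12
"Upstream",0,11/09/09,0.16
"Middle",0,11/09/09,0.08
"Middle",0,11/09/09,0.12
"Upstream",25,11/09/09,0.12
"Upstream",25,11/09/09,0.16
"Middle",25,11/09/09,0.08
"Middle",25,11/09/09,0.12'

master <- read.csv(text = csv_text)
master$Sectionf <- factor(master$Section)
# Two-digit years: use %y, not %Y
master$startd <- as.Date(master$startdate, format = "%d/%m/%y")

# One regression per Section x start date combination
lm.r <- by(master, list(Sectionf = master$Sectionf, startd = master$startd),
           function(x) lm(tot.c ~ starttime, data = x))

# One row of coefficients (intercept, slope) per fitted model
coefs <- t(sapply(lm.r, coef))
print(coefs)
```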
>
>> attach(master)
>> Sectionf <- factor(Section)
>> startd=as.Date(startdate, format="%d/%m/%Y")
>
>> lm.r <- by(master, list(Sectionf=Sectionf, startd=startd),
>>            function(x) lm(tot.c ~ starttime, data = x))
>
> Data:
>
> "Section","starttime","startdate","tot.c"
> "Upstream",0,04/09/09,0.17
> "Upstream",0,04/09/09,0.19
> "Upstream",0,04/09/09,0.14
> "Middle",0,04/09/09,0.2
> "Middle",0,04/09/09,0.13
> "Middle",0,04/09/09,0.11
> "Downstream",0,04/09/09,0.16
> "Downstream",0,04/09/09,0.17
> "Downstream",0,04/09/09,0.17
> "Upstream",25,04/09/09,0.17
> "Upstream",25,04/09/09,0.19
> "Upstream",25,04/09/09,0.14
> "Middle",25,04/09/09,0.2
> "Middle",25,04/09/09,0.13
> "Middle",25,04/09/09,0.11
> "Downstream",25,04/09/09,0.16
> "Downstream",25,04/09/09,0.17
> "Downstream",25,04/09/09,0.17
> "Upstream",50.06,04/09/09,0.17
> "Upstream",50.06,04/09/09,0.19
> "Upstream",50.06,04/09/09,0.14
> "Middle",50.06,04/09/09,0.2
> "Middle",50.06,04/09/09,0.13
> "Middle",50.06,04/09/09,0.11
> "Downstream",50.06,04/09/09,0.16
> "Downstream",50.06,04/09/09,0.17
> "Downstream",50.06,04/09/09,0.17
> "Upstream",75.42,04/09/09,0.17
> "Upstream",75.42,04/09/09,0.19
> "Upstream",75.42,04/09/09,0.14
> "Middle",75.42,04/09/09,0.2
> "Middle",75.42,04/09/09,0.13
> "Middle",75.42,04/09/09,0.11
> "Downstream",75.42,04/09/09,0.16
> "Downstream",75.42,04/09/09,0.17
> "Downstream",75.42,04/09/09,0.17
> "Upstream",100.14,04/09/09,0.17
> "Upstream",100.14,04/09/09,0.19
> "Upstream",100.14,04/09/09,0.14
> "Middle",100.14,04/09/09,0.2
> "Middle",100.14,04/09/09,0.13
> "Middle",100.14,04/09/09,0.11
> "Downstream",100.14,04/09/09,0.16
> "Downstream",100.14,04/09/09,0.17
> "Downstream",100.14,04/09/09,0.17
> "Upstream",125.31,04/09/09,0.17
> "Upstream",125.31,04/09/09,0.19
> "Upstream",125.31,04/09/09,0.14
> "Middle",125.31,04/09/09,0.2
> "Middle",125.31,04/09/09,0.13
> "Middle",125.31,04/09/09,0.11
> "Downstream",125.31,04/09/09,0.16
> "Downstream",125.31,04/09/09,0.17
> "Downstream",125.31,04/09/09,0.17
> "Upstream",150.29,04/09/09,0.17
> "Upstream",150.29,04/09/09,0.19
> "Upstream",150.29,04/09/09,0.14
> "Middle",150.29,04/09/09,0.2
> "Middle",150.29,04/09/09,0.13
> "Middle",150.29,04/09/09,0.11
> "Downstream",150.29,04/09/09,0.16
> "Downstream",150.29,04/09/09,0.17
> "Downstream",150.29,04/09/09,0.17
> "Upstream",0,11/09/09,0.12
> "Upstream",0,11/09/09,0.16
> "Upstream",0,11/09/09,0.12
> "Middle",0,11/09/09,0.08
> "Middle",0,11/09/09,0.12
> "Middle",0,11/09/09,0.1
> "Downstream",0,11/09/09,0.11
> "Downstream",0,11/09/09,0.13
> "Downstream",0,11/09/09,0.13
> "Upstream",25,11/09/09,0.12
> "Upstream",25,11/09/09,0.16
> "Upstream",25,11/09/09,0.12
> "Middle",25,11/09/09,0.08
> "Middle",25,11/09/09,0.12
> "Middle",25,11/09/09,0.1
> "Downstream",25,11/09/09,0.11
> "Downstream",25,11/09/09,0.13
> "Downstream",25,11/09/09,0.13
> "Upstream",50,11/09/09,0.12
> "Upstream",50,11/09/09,0.16
> "Upstream",50,11/09/09,0.12
> "Middle",50,11/09/09,0.08
> "Middle",50,11/09/09,0.12
> "Middle",50,11/09/09,0.1
> "Downstream",50,11/09/09,0.11
> "Downstream",50,11/09/09,0.13
> "Downstream",50,11/09/09,0.13
> "Upstream",75,11/09/09,0.12
> "Upstream",75,11/09/09,0.16
> "Upstream",75,11/09/09,0.12
> "Middle",75,11/09/09,0.08
> "Middle",75,11/09/09,0.12
> "Middle",75,11/09/09,0.1
> "Downstream",75,11/09/09,0.11
> "Downstream",75,11/09/09,0.13
> "Downstream",75,11/09/09,0.13
> "Upstream",100,11/09/09,0.12
> "Upstream",100,11/09/09,0.16
> "Upstream",100,11/09/09,0.12
> "Middle",100,11/09/09,0.08
> "Middle",100,11/09/09,0.12
> "Middle",100,11/09/09,0.1
> "Downstream",100,11/09/09,0.11
> "Downstream",100,11/09/09,0.13
> "Downstream",100,11/09/09,0.13
> "Upstream",125.04,11/09/09,0.12
> "Upstream",125.04,11/09/09,0.16
> "Upstream",125.04,11/09/09,0.12
> "Middle",125.04,11/09/09,0.08
> "Middle",125.04,11/09/09,0.12
> "Middle",125.04,11/09/09,0.1
> "Downstream",125.04,11/09/09,0.11
> "Downstream",125.04,11/09/09,0.13
> "Downstream",125.04,11/09/09,0.13
>
> -- 
> *****************************************************
> Sam Albers
> Geography Program

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



