[R] linear model and by()
David Winsemius
dwinsemius at comcast.net
Fri Nov 13 03:13:22 CET 2009
On Nov 12, 2009, at 8:26 PM, Sam Albers wrote:
> Hello R list,
>
> This is a question for anyone who has used the by() command. I would like to
> perform a regression on a data frame by several factors. Using by() I think
> that I have been able to perform this using the following:
>
>> lm.r <- by(master, list(Sectionf=Sectionf, startd=startd),
>>            function(x) lm(tot.c ~ starttime, data = x))
>
> So that is, I would like to perform separate regressions for each level of
> Sectionf for each level of startd. Now I can get the coefficients and
> intercepts from all the fitted models. However, I am now unsure how to glean
> more information from the regressions. If I follow up with this command I
> get this:
>
>> summary(lm.r)
> 9-09-04.Length 9-09-04.Class 9-09-04.Mode
> 12 lm list
> 12 lm list
> 12 lm list
> 9-09-11.Length 9-09-11.Class 9-09-11.Mode
> 12 lm list
> 12 lm list
> 12 lm list
>
> Under normal circumstances, this usually ends up giving me more info about
> the fitted model, but for some reason when using the by() command it doesn't.
Because the fitted models sit one layer deeper than summary() is describing:
summary() is summarizing the list that by() returned, not the lm objects
inside it. Try:
lapply(lm.r, summary)
This theory was tested with one of the standard datasets, Indometh:
library(utils)
data(Indometh)
reglist <- by(Indometh, Indometh$Subject,
              function(x) lm(conc ~ time, data = x))
> summary(reglist)
  Length Class Mode
1 12     lm    list
4 12     lm    list
2 12     lm    list
5 12     lm    list
6 12     lm    list
3 12     lm    list
Without the assignment to reglist, by() printed the expected lm output
for each subject, so by() "works".
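
To make the lapply() suggestion concrete, here is a minimal sketch on the same Indometh example. The names sumlist, coefs and r2 are only illustrative, and this is one way to collect the results rather than the only one:

```r
# Fit one regression per subject, as in the example above
reglist <- by(Indometh, Indometh$Subject,
              function(x) lm(conc ~ time, data = x))

# Each element is an ordinary "lm" object, so summarize them one at a time
sumlist <- lapply(reglist, summary)

# Intercepts and slopes in a matrix, one column per fitted model
coefs <- sapply(reglist, coef)

# R-squared for each fit, pulled from the summary objects
r2 <- sapply(sumlist, function(s) s$r.squared)

print(coefs)
print(r2)
```

The same pattern should carry over to lm.r, although with two grouping factors the by() result is a two-dimensional array, so the columns of coefs will probably need to be matched back to the factor levels by hand.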
> Similarly, although I wasn't hopeful that this would work, when I tried to
> examine the residuals, I get these warning messages:
Similarly, for the diagnostic plots of each model:
lapply(lm.r, plot)
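
The plot() call quoted below most likely fails because the object returned by by() has no $fit or $res component, so both arguments are NULL and plot() has nothing to scale the axes with. A rough sketch of a residuals-versus-fitted plot per model, using Indometh as a stand-in for the poster's data (fits, fit_vals and res_vals are made-up names):

```r
# Stand-in for lm.r: one model per subject of the built-in Indometh data
fits <- by(Indometh, Indometh$Subject,
           function(x) lm(conc ~ time, data = x))

# The by object itself has no $fit component; the lm objects inside it do
print(is.null(fits$fit))

# Pull fitted values and residuals out of each model instead
fit_vals <- lapply(fits, fitted)
res_vals <- lapply(fits, resid)

# One residuals-vs-fitted panel per model
op <- par(mfrow = c(2, 3))
for (i in seq_along(fits)) {
  plot(fit_vals[[i]], res_vals[[i]],
       xlab = "Fitted", ylab = "Residuals",
       main = paste("Subject", names(fits)[i]))
  abline(h = 0, lty = 2)  # reference line at zero
}
par(op)  # restore the previous plotting layout
```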
>
>> plot(lm.r$fit,lm.r$res,xlab="Fitted",ylab="Residuals")
> Error in plot.window(...) : need finite 'xlim' values
> In addition: Warning messages:
> 1: In min(x) : no non-missing arguments to min; returning Inf
> 2: In max(x) : no non-missing arguments to max; returning -Inf
> 3: In min(x) : no non-missing arguments to min; returning Inf
> 4: In max(x) : no non-missing arguments to max; returning -Inf
>
> So my question might be answered by someone saying that this isn't a good
> way to do this. Still, can anyone suggest some strategies that might help me
> out here? Specifically, how might I be able to work with these regressions
> once I have fitted all the models. Obviously I need to examine the residuals,
> look at the model fit, etc. Any suggestions on what I might be doing wrong
> would be much appreciated. I am also including a subset of the data and the
> commands I used in case that is helpful.
>
> I am using Ubuntu 9.04 and R 2.8.1
>
> Thanks in advance!
>
> Sam
Sorry to not use your data but it's not in a form that lends itself
very well to quick testing. If you had included the input commands I
might have tried it.
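
For what it is worth, a sketch of the kind of input commands that would make the posted data easy to test. It uses only a handful of rows copied from the data quoted below, and it avoids attach() by storing the derived columns in the data frame. Two things here are assumptions: that the dates are from 2009, which calls for the two-digit year code "%y" rather than the "%Y" in the quoted code (the "9-09-04" labels in the quoted summary() output suggest "%Y" read the year as 9), and that a reasonably recent R is in use, since read.csv(text = ) is not available in R 2.8.1.

```r
# A few rows copied from the posted data; the full set reads the same way
csv <- '"Section","starttime","startdate","tot.c"
"Upstream",0,04/09/09,0.17
"Upstream",25,04/09/09,0.19
"Upstream",50.06,04/09/09,0.14
"Middle",0,04/09/09,0.2
"Middle",25,04/09/09,0.13
"Middle",50.06,04/09/09,0.11
"Upstream",0,11/09/09,0.12
"Upstream",25,11/09/09,0.16
"Upstream",50,11/09/09,0.12
"Middle",0,11/09/09,0.08
"Middle",25,11/09/09,0.12
"Middle",50,11/09/09,0.1'

master <- read.csv(text = csv)

# Keep the derived columns inside the data frame instead of using attach()
master$Sectionf <- factor(master$Section)
# Assumption: two-digit years, hence "%y" rather than "%Y"
master$startd <- as.Date(master$startdate, format = "%d/%m/%y")

# One regression per Section x date combination
lm.r <- by(master, list(Sectionf = master$Sectionf, startd = master$startd),
           function(x) lm(tot.c ~ starttime, data = x))

# Coefficients of every fitted model, one column per model
coefs <- sapply(lm.r, coef)
print(dim(lm.r))
print(coefs)
```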
>
>> attach(master)
>> Sectionf <- factor(Section)
>> startd=as.Date(startdate, format="%d/%m/%Y")
>
>> lm.r <- by(master, list(Sectionf=Sectionf, startd=startd),
>>            function(x) lm(tot.c ~ starttime, data = x))
>
> Data:
>
> "Section","starttime","startdate","tot.c"
> "Upstream",0,04/09/09,0.17
> "Upstream",0,04/09/09,0.19
> "Upstream",0,04/09/09,0.14
> "Middle",0,04/09/09,0.2
> "Middle",0,04/09/09,0.13
> "Middle",0,04/09/09,0.11
> "Downstream",0,04/09/09,0.16
> "Downstream",0,04/09/09,0.17
> "Downstream",0,04/09/09,0.17
> "Upstream",25,04/09/09,0.17
> "Upstream",25,04/09/09,0.19
> "Upstream",25,04/09/09,0.14
> "Middle",25,04/09/09,0.2
> "Middle",25,04/09/09,0.13
> "Middle",25,04/09/09,0.11
> "Downstream",25,04/09/09,0.16
> "Downstream",25,04/09/09,0.17
> "Downstream",25,04/09/09,0.17
> "Upstream",50.06,04/09/09,0.17
> "Upstream",50.06,04/09/09,0.19
> "Upstream",50.06,04/09/09,0.14
> "Middle",50.06,04/09/09,0.2
> "Middle",50.06,04/09/09,0.13
> "Middle",50.06,04/09/09,0.11
> "Downstream",50.06,04/09/09,0.16
> "Downstream",50.06,04/09/09,0.17
> "Downstream",50.06,04/09/09,0.17
> "Upstream",75.42,04/09/09,0.17
> "Upstream",75.42,04/09/09,0.19
> "Upstream",75.42,04/09/09,0.14
> "Middle",75.42,04/09/09,0.2
> "Middle",75.42,04/09/09,0.13
> "Middle",75.42,04/09/09,0.11
> "Downstream",75.42,04/09/09,0.16
> "Downstream",75.42,04/09/09,0.17
> "Downstream",75.42,04/09/09,0.17
> "Upstream",100.14,04/09/09,0.17
> "Upstream",100.14,04/09/09,0.19
> "Upstream",100.14,04/09/09,0.14
> "Middle",100.14,04/09/09,0.2
> "Middle",100.14,04/09/09,0.13
> "Middle",100.14,04/09/09,0.11
> "Downstream",100.14,04/09/09,0.16
> "Downstream",100.14,04/09/09,0.17
> "Downstream",100.14,04/09/09,0.17
> "Upstream",125.31,04/09/09,0.17
> "Upstream",125.31,04/09/09,0.19
> "Upstream",125.31,04/09/09,0.14
> "Middle",125.31,04/09/09,0.2
> "Middle",125.31,04/09/09,0.13
> "Middle",125.31,04/09/09,0.11
> "Downstream",125.31,04/09/09,0.16
> "Downstream",125.31,04/09/09,0.17
> "Downstream",125.31,04/09/09,0.17
> "Upstream",150.29,04/09/09,0.17
> "Upstream",150.29,04/09/09,0.19
> "Upstream",150.29,04/09/09,0.14
> "Middle",150.29,04/09/09,0.2
> "Middle",150.29,04/09/09,0.13
> "Middle",150.29,04/09/09,0.11
> "Downstream",150.29,04/09/09,0.16
> "Downstream",150.29,04/09/09,0.17
> "Downstream",150.29,04/09/09,0.17
> "Upstream",0,11/09/09,0.12
> "Upstream",0,11/09/09,0.16
> "Upstream",0,11/09/09,0.12
> "Middle",0,11/09/09,0.08
> "Middle",0,11/09/09,0.12
> "Middle",0,11/09/09,0.1
> "Downstream",0,11/09/09,0.11
> "Downstream",0,11/09/09,0.13
> "Downstream",0,11/09/09,0.13
> "Upstream",25,11/09/09,0.12
> "Upstream",25,11/09/09,0.16
> "Upstream",25,11/09/09,0.12
> "Middle",25,11/09/09,0.08
> "Middle",25,11/09/09,0.12
> "Middle",25,11/09/09,0.1
> "Downstream",25,11/09/09,0.11
> "Downstream",25,11/09/09,0.13
> "Downstream",25,11/09/09,0.13
> "Upstream",50,11/09/09,0.12
> "Upstream",50,11/09/09,0.16
> "Upstream",50,11/09/09,0.12
> "Middle",50,11/09/09,0.08
> "Middle",50,11/09/09,0.12
> "Middle",50,11/09/09,0.1
> "Downstream",50,11/09/09,0.11
> "Downstream",50,11/09/09,0.13
> "Downstream",50,11/09/09,0.13
> "Upstream",75,11/09/09,0.12
> "Upstream",75,11/09/09,0.16
> "Upstream",75,11/09/09,0.12
> "Middle",75,11/09/09,0.08
> "Middle",75,11/09/09,0.12
> "Middle",75,11/09/09,0.1
> "Downstream",75,11/09/09,0.11
> "Downstream",75,11/09/09,0.13
> "Downstream",75,11/09/09,0.13
> "Upstream",100,11/09/09,0.12
> "Upstream",100,11/09/09,0.16
> "Upstream",100,11/09/09,0.12
> "Middle",100,11/09/09,0.08
> "Middle",100,11/09/09,0.12
> "Middle",100,11/09/09,0.1
> "Downstream",100,11/09/09,0.11
> "Downstream",100,11/09/09,0.13
> "Downstream",100,11/09/09,0.13
> "Upstream",125.04,11/09/09,0.12
> "Upstream",125.04,11/09/09,0.16
> "Upstream",125.04,11/09/09,0.12
> "Middle",125.04,11/09/09,0.08
> "Middle",125.04,11/09/09,0.12
> "Middle",125.04,11/09/09,0.1
> "Downstream",125.04,11/09/09,0.11
> "Downstream",125.04,11/09/09,0.13
> "Downstream",125.04,11/09/09,0.13
>
> --
> *****************************************************
> Sam Albers
> Geography Program
David Winsemius, MD
Heritage Laboratories
West Hartford, CT