[R] Apply same linear model to subset of dataframe
David Winsemius
dwinsemius at comcast.net
Thu Nov 8 18:19:37 CET 2012
On Nov 8, 2012, at 4:40 AM, Ross Ahmed wrote:
> There is a slight problem with the code. Due to the collapse="+" argument,
> the code only works if there is more than one predictor variable. How can I
> amend the code so it also works when the number of predictor variables is 1?
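
For what it's worth, collapse="+" should not be the part that fails: with a single name, paste() has nothing to join and simply returns that name. A minimal sketch of that point, using the same mtcars columns as in your code (reformulate() is shown only as an alternative I am suggesting here; it is not used elsewhere in this thread):

```r
# With one predictor, collapse has nothing to join, so the name is unchanged
rhs_one  <- paste(c("carb"), collapse = "+")
rhs_many <- paste(c("cyl", "disp", "hp"), collapse = "+")

# Build the formula string the same way the loop does, then fit
f_one   <- paste("gear", rhs_one, sep = "~")
fit_one <- lm(formula = as.formula(f_one), data = mtcars)

# Alternative (my suggestion): let reformulate() assemble the formula
f_alt   <- reformulate(c("cyl", "disp", "hp"), response = "mpg")
fit_alt <- lm(f_alt, data = mtcars)
```

So the model itself fits without complaint when there is only one predictor; the failure has to come from a later line in the loop.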
>
> # WORKING: EACH SET OF PREDICTORS HAS LENGTH > 1
>
> DV <- c("mpg", "drat", "gear")
> IV <- list(c("cyl", "disp", "hp"), c("wt", "qsec"), c("carb", "hp"))
> fits <- vector("list", length(DV))
>
> for(i in seq(DV)) {
> fit <- lm(formula=paste(DV[i], paste(IV[[i]], collapse="+"), sep="~"),
> data=mtcars)
> plot(fit$fitted, fit$resid, main=paste("DV", DV[i], sep="="))
> lapply(fit$model[, -1], function(x) plot(x, fit$resid))
> fits[[i]] <- fit
> }
>
> # NOT WORKING: LAST SET OF PREDICTORS (CARB) HAS LENGTH 1
>
> DV <- c("mpg", "drat", "gear")
> IV <- list(c("cyl", "disp", "hp"), c("wt", "qsec"), c("carb"))
> fits <- vector("list", length(DV))
>
> for(i in seq(DV)) {
> fit <- lm(formula=paste(DV[i], paste(IV[[i]], collapse="+"), sep="~"),
> data=mtcars)
> plot(fit$fitted, fit$resid, main=paste("DV", DV[i], sep="="))
> lapply(fit$model[, -1], function(x) plot(x, fit$resid))
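The above line is the culprit. With a single predictor, fit$model[, -1] is reduced from a data frame to a plain vector, so lapply() iterates over its individual values instead of over columns. Try instead: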
lapply(fit$model[, -1, drop=FALSE], function(x) plot(x, fit$resid))
# ---------------------^^^^^^^^^^^
> fits[[i]] <- fit
> }
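
To see what the drop argument changes, here is a small sketch using the single-predictor model from the failing case (the names `dropped` and `kept` are just for illustration):

```r
fit <- lm(gear ~ carb, data = mtcars)

# Default behaviour: a single remaining column is simplified to a vector
dropped <- fit$model[, -1]
# drop = FALSE keeps the data frame structure, even with one column
kept <- fit$model[, -1, drop = FALSE]

is.data.frame(dropped)  # FALSE
is.data.frame(kept)     # TRUE

# lapply() walks over list elements: 32 single values vs. 1 column
length(dropped)         # 32
length(kept)            # 1
names(kept)             # "carb"
```

With the vector, plot() is handed one value of carb against all 32 residuals, hence the error; with the one-column data frame it gets the whole carb column.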
>
> Many thanks
> Ross
>
> From: Jean V Adams <jvadams at usgs.gov>
> Date: Tuesday, 6 November 2012 19:20
> To: Ross Ahmed <rossahmed at googlemail.com>
> Cc: <r-help at r-project.org>
> Subject: Re: [R] Apply same linear model to subset of dataframe
>
> Ross,
>
> You can store the lm() results in a list, if you like.
> For example:
>
> DV <- c("mpg", "drat", "gear")
> IV <- list(c("cyl", "disp", "hp"), c("wt", "qsec"), c("carb", "hp"))
> fits <- vector("list", length(DV))
>
> for(i in seq(DV)) {
> fit <- lm(formula=paste(DV[i], paste(IV[[i]], collapse="+"),
> sep="~"), data=mtcars)
> plot(fit$fitted, fit$resid, main=paste("DV", DV[i], sep="="))
> lapply(fit$model[, -1], function(x) plot(x, fit$resid))
> fits[[i]] <- fit
> }
>
> Jean
>
>
>
> Ross Ahmed <rossahmed at googlemail.com> wrote on 11/06/2012 09:25:13 AM:
>>
>> Thanks Jean
>>
>> This works for the plots, but it only stores the last lm() computed
>>
>> Ross
>>
>> From: Jean V Adams <jvadams at usgs.gov>
>> Date: Tuesday, 6 November 2012 14:12
>> To: Ross Ahmed <rossahmed at googlemail.com>
>> Cc: <r-help at r-project.org>
>> Subject: Re: [R] Apply same linear model to subset of dataframe
>>
>> Ross,
>>
>> Here's one way to condense the code ...
>>
>> DV <- c("mpg", "drat", "gear")
>> IV <- list(c("cyl", "disp", "hp"), c("wt", "qsec"), c("carb", "hp"))
>>
>> for(i in seq(DV)) {
>> fit <- lm(formula=paste(DV[i], paste(IV[[i]], collapse="+"),
>> sep="~"), data=mtcars)
>> plot(fit$fitted, fit$resid, main=paste("DV", DV[i], sep="="))
>> lapply(fit$model[, -1], function(x) plot(x, fit$resid))
>> }
>>
>> Jean
>>
>>
>>
>> Ross Ahmed <rossahmed at googlemail.com> wrote on 11/04/2012 09:57:34 AM:
>>>
>>> I have applied the same linear model to several different subsets of a
>>> dataset. I recently read that in R, code should never be repeated. I feel my
>>> code as it currently stands has a lot of repetition, which could be
>>> condensed into fewer lines. I will use the mtcars dataset to replicate what
>>> I have done. My question is: how can I use fewer lines of code (for example
>>> using a for loop, a function or plyr) to achieve the same output as below?
>>>
>>> data(mtcars)
>>>
>>> # Apply the same model to the dataset but choosing different combinations of
>>> # dependent (DV) and independent (IV) variables in each case:
>>> lm.mpg = lm(mpg~cyl+disp+hp, data=mtcars)
>>> lm.drat = lm(drat~wt+qsec, data=mtcars)
>>> lm.gear = lm(gear~carb+hp, data=mtcars)
>>>
>>> # Plot residuals against fitted values for each model
>>> plot(lm.mpg$fitted,lm.mpg$residuals, main = "lm.mpg")
>>> plot(lm.drat$fitted,lm.drat$residuals, main = "lm.drat")
>>> plot(lm.gear$fitted,lm.gear$residuals, main = "lm.gear")
>>>
>>> # Plot residuals against IVs for each model
>>> plotResIV <- function (IV,lmResiduals)
>>> {
>>> lapply(IV, function (x) plot(x,lmResiduals))
>>> }
>>>
>>> plotResIV(lm.mpg$model[,-1],lm.mpg$residuals)
>>> plotResIV(lm.drat$model[,-1],lm.drat$residuals)
>>> plotResIV(lm.gear$model[,-1],lm.gear$residuals)
>>>
>>> Many thanks
>>> Ross Ahmed
>
>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
Alameda, CA, USA