[R-sig-eco] multiple regression

Kingsford Jones kingsfordjones at gmail.com
Mon Feb 8 19:56:35 CET 2010


...and you can also read in Frank Harrell's book why standardized
coefficients are a bad idea.  There is a large statistical literature
on variable importance in regression models.  For a discussion and
accompanying R package see

@article{grömping2006relative,
  title={{Relative importance for linear regression in R: the package
relaimpo}},
  author={Gr{\"o}mping, U.},
  journal={Journal of Statistical Software},
  volume={17},
  number={1},
  pages={139--147},
  year={2006},
  publisher={American Statistical Association}
}
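
For the two-predictor case in this thread, the LMG measure that relaimpo offers can be computed by hand in base R, which shows what the decomposition does. This is only a minimal sketch on simulated data: the variable names, effect sizes, and seed are assumptions for illustration, not the transect data from the original question.

```r
# Simulated stand-in data (hypothetical; not the data from the thread)
set.seed(1)
n <- 60
complexity <- rnorm(n)
coral      <- 0.5 * complexity + rnorm(n)   # predictors are correlated
richness   <- 2 * complexity + 0.3 * coral + rnorm(n)
dat <- data.frame(richness, complexity, coral)

# Helper: R^2 of a linear model fitted to dat
r2 <- function(formula) summary(lm(formula, data = dat))$r.squared

r2_full  <- r2(richness ~ complexity + coral)
r2_cmplx <- r2(richness ~ complexity)
r2_coral <- r2(richness ~ coral)

# LMG: average each predictor's R^2 contribution over both orders of entry
lmg <- c(
  complexity = mean(c(r2_cmplx, r2_full - r2_coral)),
  coral      = mean(c(r2_coral, r2_full - r2_cmplx))
)
lmg
sum(lmg) - r2_full   # zero up to rounding: the shares add up to the full R^2
```

With relaimpo installed, calc.relimp(lm(richness ~ complexity + coral, data = dat), type = "lmg") should report the same shares; I have not run that call against this simulated data, so treat it as a pointer rather than a checked result.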


hth,

Kingsford Jones

2010/2/8 Aitor Gastón <aitor.gaston at upm.es>:
>
> Hi Nathan,
>
> Many authors criticize stepwise variable selection, e.g., Harrell, F.E.,
> 2001, Regression Modeling Strategies: With Applications to Linear Models,
> Logistic Regression, and Survival Analysis.  You can find some of his
> arguments and extra references in
> http://childrens-mercy.org/stats/faq/faq12.asp
>
> Cheers,
>
> Aitor
>
> --------------------------------------------------
> From: "Nathan Lemoine" <lemoine.nathan at gmail.com>
> Sent: Saturday, February 06, 2010 5:17 PM
> To: <r-sig-ecology at r-project.org>
> Subject: [R-sig-eco] multiple regression
>
>> Hi everyone,
>>
>> I'm trying to fit a multiple regression model and have run into some
>> questions regarding the appropriate procedure to use. I am trying to compare
>> fish assemblages (species richness, total abundance, etc.) to metrics of
>> habitat quality. I swam transects and recorded all fish observed, then I
>> measured the structural complexity and live coral cover over each transect.
>> I am interested in determining which of these two metrics has the largest
>> influence on structuring fish assemblages.
>>
>> My strategy was to use a multiple linear regression. Since the data were
>> in two different measurement units, I scaled the variables to a mean of 0
>> and std. dev. of 1. This should allow me to compare the sizes of the beta
>> coefficients to determine the relative (but not absolute) importance of
>> each habitat variable on the fish assemblage, correct?
>>
>> My model was lm(SpeciesRichness ~ Complexity + CoralCover). I had run a
>> full model and found no evidence of interactions, so I ran it without the
>> interaction present.
>>
>> It turns out coral cover was not significant in any regression. I have
>> been told that the test I used was incorrect and that the appropriate
>> procedure is a stepwise regression, which would, undoubtedly, provide me
>> with Complexity as a significant variable and remove Coral Cover. This
>> seems to me to be the exact same interpretation as the above model. So,
>> since I'm very new to all of this, I am wondering how to tell whether one
>> model is 'incorrect' or 'inappropriate' given that they yield almost
>> identical results? What are the advantages of a stepwise regression over a
>> standard multiple regression like I have run?
>>
>> _______________________________________________
>> R-sig-ecology mailing list
>> R-sig-ecology at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>>
>

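The two procedures compared in the original question (a regression on variables scaled to mean 0 and sd 1, and a stepwise regression) can be sketched in base R as follows. The data are simulated, and the names and numbers are assumptions for illustration only. The sketch also checks an identity that is useful when reading standardized coefficients: each one is the raw slope multiplied by sd(x)/sd(y), so its size depends on the spread of the predictor in the sample at hand.

```r
# Simulated stand-in data (hypothetical; not the data from the thread)
set.seed(2)
n <- 60
complexity <- rnorm(n, mean = 10, sd = 3)
coral      <- rnorm(n, mean = 40, sd = 15)
richness   <- 5 + 1.2 * complexity + 0.02 * coral + rnorm(n, sd = 2)
dat <- data.frame(richness, complexity, coral)

# Same model on the raw scale and on variables scaled to mean 0, sd 1
fit_raw <- lm(richness ~ complexity + coral, data = dat)
fit_std <- lm(richness ~ complexity + coral, data = as.data.frame(scale(dat)))

# Standardized slope = raw slope * sd(x) / sd(y)
rescaled <- coef(fit_raw)[-1] *
  sapply(dat[c("complexity", "coral")], sd) / sd(dat$richness)
all.equal(unname(coef(fit_std)[-1]), unname(rescaled))

# AIC-based stepwise selection starting from the full model
fit_step <- step(fit_raw, trace = 0)
formula(fit_step)
```

Note that step() selects by AIC rather than by p-values, so the terms it keeps need not match the ones flagged as significant in summary(fit_raw).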