[R-sig-eco] interactions in stepAIC
Christopher David Desjardins
desja004 at umn.edu
Thu Apr 5 00:07:10 CEST 2012
Hi Vincenzo,
There are a couple of things that might be worth considering.
In your first model you consider only main effects and no interactions.
Do any of your main effects drop off after you run stepAIC? If so, when
you go to build an interaction model, don't include those main effects;
include only the parasites that stepAIC retained.
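
A minimal sketch of that two-step idea, assuming simulated data in place
of yours (the values in mydata, the effect sizes, and the seed are all
made up for illustration). It pulls the surviving terms out of the first
stepAIC fit and offers only their two-way interactions in a second pass:

```r
library(MASS)

# Simulated stand-in for the real data (hypothetical values)
set.seed(1)
n <- 200
mydata <- as.data.frame(matrix(rbinom(n * 6, 1, 0.4), ncol = 6))
names(mydata) <- paste0("Par", 1:6)
mydata$HealthVar <- 2 + 1.5 * mydata$Par1 - 1 * mydata$Par3 + rnorm(n)

# Step 1: main effects only
main_fit <- lm(HealthVar ~ Par1 + Par2 + Par3 + Par4 + Par5 + Par6,
               data = mydata)
main_step <- stepAIC(main_fit, direction = "both", trace = FALSE)

# Main effects that survived the selection
kept <- attr(terms(main_step), "term.labels")
kept

# Step 2: offer at most two-way interactions among the retained parasites
upper <- as.formula(paste0("~ (", paste(kept, collapse = " + "), ")^2"))
int_step <- stepAIC(main_step,
                    scope = list(lower = ~ 1, upper = upper),
                    direction = "both", trace = FALSE)
int_step$anova
```

The `^2` in the upper scope caps the search at two-way terms, so the
second pass cannot wander into three-way or higher interactions.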
Given that you don't have any a priori predictions about interactions,
why do you think there are any? Maybe it's best not to look for them,
so that you don't stumble upon some interactions by chance. Have you tried
plotting your data? This could help guide you with interactions. I would
also recommend against including higher-order interactions that you won't
be able to interpret.
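
One way to plot a candidate pair of parasites, sketched with base R's
interaction.plot. The data are simulated (an interaction between Par1 and
Par2 is deliberately built in), so the variable values and effect sizes
are assumptions, not anything from your dataset:

```r
# Simulated stand-in for the real data (hypothetical values)
set.seed(2)
n <- 200
mydata <- data.frame(Par1 = rbinom(n, 1, 0.4),
                     Par2 = rbinom(n, 1, 0.4))
mydata$HealthVar <- 2 + mydata$Par1 + 0.5 * mydata$Par2 -
  1.5 * mydata$Par1 * mydata$Par2 + rnorm(n)

# Mean response in each of the four infection combinations
cell_means <- with(mydata,
                   tapply(HealthVar, list(Par1 = Par1, Par2 = Par2), mean))
cell_means

# Clearly non-parallel lines hint at an interaction between the two parasites
with(mydata,
     interaction.plot(Par1, Par2, HealthVar,
                      xlab = "Par1 (0 = absent, 1 = present)",
                      trace.label = "Par2",
                      ylab = "Mean HealthVar"))
```

With six parasites there are only 15 pairs, so looking at each pair this
way is still manageable.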
What do you hope to get from the interactions?
Finally, since your approach is somewhat data driven and you seem to
want to reduce the number of parasite predictors, have you considered a
LASSO regression? There are several LASSO implementations in R.
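
As one example, a rough sketch using the glmnet package (one of several
options, and it needs to be installed separately). The data are simulated
and the choice of lambda.1se is an assumption for illustration only:

```r
library(glmnet)

# Simulated stand-in for the real data (hypothetical values)
set.seed(3)
n <- 200
mydata <- as.data.frame(matrix(rbinom(n * 6, 1, 0.4), ncol = 6))
names(mydata) <- paste0("Par", 1:6)
mydata$HealthVar <- 2 + 1.5 * mydata$Par1 - 1 * mydata$Par3 + rnorm(n)

# Design matrix with main effects and all two-way interactions;
# drop the intercept column because glmnet fits its own intercept
X <- model.matrix(HealthVar ~ .^2, data = mydata)[, -1]
y <- mydata$HealthVar

# alpha = 1 gives the LASSO penalty; lambda is chosen by cross-validation
cv_fit <- cv.glmnet(X, y, alpha = 1)

# Coefficients at the more conservative lambda.1se;
# terms shrunk exactly to zero have been dropped
b <- as.matrix(coef(cv_fit, s = "lambda.1se"))
selected <- rownames(b)[b[, 1] != 0]
selected
```

Note that a plain LASSO does not enforce marginality, so it can keep an
interaction while dropping one of its main effects.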
Best,
Chris
On 4/4/12 4:29 PM, Vincenzo Ellis wrote:
> Dear R Ecology Group Members,
>
> I have data on parasite prevalences (coded as 0s or 1s) for several species
> of parasites of one host species, and I am interested in seeing if these
> parasites can predict health parameters that I measured in the hosts. I
> wanted to tackle this with a multiple regression approach. I used the MASS
> package's stepAIC function to first figure out what parasites might be good
> predictors, if any. Code is:
>
> library(MASS)
>
> x <- lm(HealthVar ~ Par1 + Par2 + Par3 + Par4 + Par5 + Par6, data = mydata)
>
> step <- stepAIC(x, direction = "both")
>
> step$anova
>
> The problem with this method is it does not take into account interactions
> between parasites. I have tried rewriting the code to look for
> interactions:
>
> x <- lm(HealthVar ~ Par1 * Par2 * Par3 * Par4 * Par5 * Par6, data = mydata)
>
> step <- stepAIC(x, direction = "both")
> step$anova
>
> The resulting models from this code, however, don't make much sense (lots
> and lots of terms, and many two- and three-way interactions). I would try
> to code for interactions manually, but I have no a priori predictions about
> which parasites might be interacting, nor do I have any sense about what
> parasites might be making hosts sick. It just seems reasonable to assume
> that there may be interactions between parasites, even if I don't know
> which ones would be involved.
>
> Any thoughts on how to attack a dataset like this would be much appreciated.
>
> Thanks so much!!
>
> Vincenzo