[R-sig-eco] interactions in stepAIC

Christopher David Desjardins desja004 at umn.edu
Thu Apr 5 00:07:10 CEST 2012


Hi Vincenzo,
There are a couple of things that might be worth considering.

In your first model you consider only main effects and no interactions. 
Do any of your main effects drop off after you run stepAIC? If so, when 
you build an interaction model, leave those main effects out and include 
only the parasites that stepAIC retained.
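
A minimal sketch of that two-stage idea, in case it helps. The data 
here are simulated stand-ins for mydata (the effect sizes are made up), 
and restricting the second stage to two-way interactions through the 
scope argument is my assumption, not something your data require:

```r
library(MASS)

# Simulated stand-in for `mydata` (hypothetical values, for illustration only)
set.seed(1)
n <- 200
mydata <- as.data.frame(matrix(rbinom(n * 6, 1, 0.3), ncol = 6,
                               dimnames = list(NULL, paste0("Par", 1:6))))
mydata$HealthVar <- 10 - 2 * mydata$Par1 - 1.5 * mydata$Par3 + rnorm(n)

# Stage 1: main-effects model, then stepwise selection by AIC
main_fit  <- lm(HealthVar ~ Par1 + Par2 + Par3 + Par4 + Par5 + Par6,
                data = mydata)
main_step <- stepAIC(main_fit, direction = "both", trace = FALSE)

# Main effects that survived selection
kept <- attr(terms(main_step), "term.labels")
print(kept)

# Stage 2: let stepAIC consider two-way interactions among the kept terms only
if (length(kept) >= 2) {
  upper <- as.formula(paste("~ (", paste(kept, collapse = " + "), ")^2"))
  int_step <- stepAIC(main_step,
                      scope = list(lower = ~ 1, upper = upper),
                      direction = "both", trace = FALSE)
  print(int_step$anova)
}
```

The upper formula caps the search at pairwise terms among the retained 
parasites, so the candidate set stays small enough to interpret.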

Given that you don't have any a priori predictions about interactions, 
why do you think there are any? Maybe it's best not to look for them, 
so that you don't stumble upon some interactions by chance. Have you tried 
plotting your data? This could help guide you with interactions. I would 
also recommend against including higher-order interactions that you won't 
be able to interpret.
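
As a rough illustration of what plotting could look like for one pair 
of parasites, here is a sketch using interaction.plot from base R. The 
data are simulated (a made-up Par1 by Par2 interaction), so treat the 
variable names and numbers as placeholders:

```r
# Simulated stand-in for `mydata` with a built-in Par1:Par2 interaction
# (hypothetical values, for illustration only)
set.seed(2)
n <- 200
mydata <- data.frame(Par1 = rbinom(n, 1, 0.4),
                     Par2 = rbinom(n, 1, 0.4))
mydata$HealthVar <- 10 - mydata$Par1 - mydata$Par2 -
  2 * mydata$Par1 * mydata$Par2 + rnorm(n)

# Mean health value in each infection combination (2 x 2 table)
cell_means <- with(mydata,
                   tapply(HealthVar, list(Par1 = Par1, Par2 = Par2), mean))
print(cell_means)

# Roughly parallel lines suggest no interaction; diverging or crossing
# lines suggest the effect of Par1 depends on Par2
with(mydata,
     interaction.plot(x.factor = Par1, trace.factor = Par2,
                      response = HealthVar,
                      xlab = "Par1", ylab = "Mean HealthVar",
                      trace.label = "Par2"))
```

With six parasites there are only 15 pairs, so looking at each pair 
this way is feasible before fitting anything.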

What do you hope to get from the interactions?

Finally, since your approach is somewhat data driven and you seem to 
want to reduce the number of parasite predictors, have you considered a 
LASSO regression? There are several LASSO implementations in R.
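
For what it's worth, a minimal sketch with the glmnet package, which is 
just one of those implementations (my choice for illustration, not a 
recommendation over the others). The data are simulated stand-ins for 
mydata, and including all two-way interactions in the design matrix is 
an assumption you could drop:

```r
library(glmnet)

# Simulated stand-in for `mydata` (hypothetical values, for illustration only)
set.seed(3)
n <- 200
mydata <- as.data.frame(matrix(rbinom(n * 6, 1, 0.3), ncol = 6,
                               dimnames = list(NULL, paste0("Par", 1:6))))
mydata$HealthVar <- 10 - 2 * mydata$Par1 - 1.5 * mydata$Par3 + rnorm(n)

# Design matrix: main effects plus all two-way interactions,
# with the intercept column removed (glmnet adds its own intercept)
X <- model.matrix(HealthVar ~ .^2, data = mydata)[, -1]
y <- mydata$HealthVar

# Cross-validated LASSO (alpha = 1 selects the LASSO penalty)
cv_fit <- cv.glmnet(X, y, alpha = 1)

# Coefficients at the more conservative "lambda.1se" penalty
coefs    <- as.matrix(coef(cv_fit, s = "lambda.1se"))[, 1]
selected <- setdiff(names(coefs)[coefs != 0], "(Intercept)")
print(selected)
```

Note that a plain LASSO can keep an interaction while dropping one of 
its main effects, so the selected terms still need a sanity check.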

Best,
Chris

On 4/4/12 4:29 PM, Vincenzo Ellis wrote:
> Dear R Ecology Group Members,
>
> I have data on parasite prevalences (coded as 0s or 1s) for several species
> of parasites of one host species, and I am interested in seeing if these
> parasites can predict health parameters that I measured in the hosts.  I
> wanted to tackle this with a multiple regression approach. I used the MASS
> package's stepAIC function to first figure out what parasites might be good
> predictors, if any.  Code is:
>
> library(MASS)
>
> x <- lm(HealthVar ~ Par1 + Par2 + Par3 + Par4 + Par5 + Par6, data = mydata)
>
> step <- stepAIC(x, direction = "both")
>
> step$anova
>
> The problem with this method is it does not take into account interactions
> between parasites.  I have tried rewriting the code to look for
> interactions:
>
> x <- lm(HealthVar ~ Par1 * Par2 * Par3 * Par4 * Par5 * Par6, data = mydata)
>
> step <- stepAIC(x, direction = "both")
> step$anova
>
> The resulting models from this code, however, don't make much sense (lots
> and lots of terms, and many two- and three-way interactions).  I would try
> to code for interactions manually, but I have no a priori predictions about
> which parasites might be interacting, nor do I have any sense about what
> parasites might be making hosts sick.  It just seems reasonable to assume
> that there may be interactions between parasites, even if I don't know
> which ones would be involved.
>
> Any thoughts on how to attack a dataset like this would be much appreciated.
>
> Thanks so much!!
>
> Vincenzo
>
> _______________________________________________
> R-sig-ecology mailing list
> R-sig-ecology at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


