[R-sig-ME] Quadratic term in linear model and model over-parameterization

Wed Mar 15 10:46:02 CET 2017

Dear all,

I’m new to this mailing list and really hope that somebody here can help me with the following issue:

I fitted the following linear models to a Box-Cox-transformed response variable (382 data points):
Model 1: Y~x+a+b+c+d+e+(a*b)+(a*c)+ (a*d)+…+(a*b*c)+(a*b*d)+(a*b*e)+…
a: 'Experimental Temperature' (Temp1, Temp2)
b: 'Host Population' (PopX, PopY)
c: 'Parasite Population' (PopX, PopY)
d: 'Host Gender' (male, female)
In addition, the model includes the continuous predictor variable 'Parasite Weight' (e) and all possible 2-way (10) and 3-way (10) interactions.

In model 2 I replaced the two main effects 'Host Population' and 'Parasite Population' with one variable ('Sympatry/Allopatry') that combines the two effects. Apart from this, model 2 (six 2-way interactions and four 3-way interactions) was identical to model 1.

I am now interested in all interactions that include the continuous predictor variable 'Parasite Weight'. Model 1 yielded one such significant interaction ('Experimental Temperature x Parasite Population x Parasite Weight', p = 0.010).

We submitted a manuscript containing these two models to a journal. It came back with a comment from a reviewer who suggested that we look for non-linear relationships involving 'Parasite Weight'.

I therefore fitted model 1.2, which corresponds to model 1 but additionally contains the quadratic term of 'Parasite Weight' ('Parasite Weight^2') and its interactions (in total 14 2-way interactions and 16 3-way interactions). I did the same for model 2, which resulted in model 2.2 with nine 2-way interactions and seven 3-way interactions.
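
For reference, the interaction counts quoted for the four models can be reproduced with a short sketch. It assumes that every combination of distinct predictors is included, except combinations containing both 'Parasite Weight' and 'Parasite Weight^2'. The labels "s" (Sympatry/Allopatry) and "e2" (Parasite Weight^2) and the helper count_interactions are shorthand introduced here, not names from the analysis itself. The term x in the model 1 formula is left out, because its meaning is not stated.

```python
from itertools import combinations


def count_interactions(predictors, order, excluded_pairs=()):
    """Count interaction terms of a given order, skipping any term that
    contains an excluded pair of predictors (e.g. e together with e2)."""
    excluded = [frozenset(pair) for pair in excluded_pairs]
    return sum(
        1
        for combo in combinations(predictors, order)
        if not any(pair <= set(combo) for pair in excluded)
    )


# Predictor labels follow the post; "s" and "e2" are assumed shorthand.
models = {
    "model 1": (["a", "b", "c", "d", "e"], []),
    "model 2": (["a", "s", "d", "e"], []),
    "model 1.2": (["a", "b", "c", "d", "e", "e2"], [("e", "e2")]),
    "model 2.2": (["a", "s", "d", "e", "e2"], [("e", "e2")]),
}

# Map each model name to (number of 2-way, number of 3-way) interactions.
counts = {
    name: (
        count_interactions(predictors, 2, excluded),
        count_interactions(predictors, 3, excluded),
    )
    for name, (predictors, excluded) in models.items()
}

for name, (two_way, three_way) in counts.items():
    print(f"{name}: {two_way} two-way, {three_way} three-way interactions")
```

Because every factor has two levels, each of these terms costs one parameter, so the counts give a rough sense of how many coefficients each model estimates from the 382 data points.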

The interaction that was significant in model 1 was no longer significant in model 1.2. In model 2.2, two interactions became significant that were not significant in model 2: 'Host Gender x Sympatry/Allopatry x Parasite Weight' (p = 0.038) and 'Host Gender x Sympatry/Allopatry x Parasite Weight^2' (p = 0.044).
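
One possible contributor to such shifts, offered only as an assumption worth checking, is collinearity between a strictly positive predictor and its square. The sketch below uses hypothetical, evenly spaced weights (not the study data) to show how strongly the raw predictor correlates with its square, and how centering the predictor before squaring changes that correlation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical weights 1..20; these are NOT the data from the study.
weights = [float(w) for w in range(1, 21)]
mean_weight = sum(weights) / len(weights)
centered = [w - mean_weight for w in weights]

# Correlation between the predictor and its square, before and after centering.
r_raw = pearson(weights, [w ** 2 for w in weights])
r_centered = pearson(centered, [w ** 2 for w in centered])

print(f"raw weight vs. weight^2:           r = {r_raw:.3f}")
print(f"centered weight vs. its own square: r = {r_centered:.3f}")
```

With these symmetric toy values the centered correlation vanishes; with real, skewed weights it would usually shrink rather than disappear. This illustrates one mechanism only and does not establish what happened in models 1.2 and 2.2.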

Here are my questions:
1. Why is it that including the quadratic term removes some significant effects while adding others?
2. What does it mean when both an interaction including the linear term and the same interaction including the quadratic term become significant? Does this suggest a non-linear relationship or both a linear and a non-linear relationship?
3. Could the disappearance of the interaction that was significant in model 1 be caused by over-parameterization of model 1.2, and how can I demonstrate this (all of our models potentially suffer from having many interactions and main effects)?
4. Are there any general arguments for when to include a quadratic term in a model and when quadratic terms should be avoided?
5. Which model can I trust?

Thank you very much in advance for any advice you can give me,

Fred.