[R] Varying statistical significance in estimates of linear model
Stathis Kamperis
ekamperi at gmail.com
Thu Aug 8 12:43:27 CEST 2013
Hi everyone,
I have a response variable 'y' and several predictor variables 'x_i'.
I start with a linear model:
m1 <- lm(y ~ x1); summary(m1)
and I get a statistically significant estimate for 'x1'. Then, I
modify my model as:
m2 <- lm(y ~ x1 + x2); summary(m2)
At this point, the estimate for x1 may become non-significant while
the estimate for x2 is significant.
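To illustrate the kind of flip I mean, here is a minimal sketch on
simulated data (not my real data). It assumes that x1 and x2 are
strongly correlated and that only x2 actually drives y; the sample
size, coefficients and seed are made up for the example. Under those
assumptions x1 will typically look significant in m1 and often not in
m2, but I make no claim that this is what happens in my data.

```r
# Simulated data (an assumption for illustration, not the real data set):
# x1 and x2 are strongly correlated, and only x2 actually drives y.
set.seed(1)
n  <- 100
x2 <- rnorm(n)
x1 <- x2 + rnorm(n, sd = 0.5)   # x1 is a noisy copy of x2
y  <- 2 * x2 + rnorm(n)

m1 <- lm(y ~ x1)        # x1 alone stands in for the omitted x2
m2 <- lm(y ~ x1 + x2)   # x2 now competes with x1 for the same signal

# p-value of the x1 coefficient in each model
p1 <- summary(m1)$coefficients["x1", "Pr(>|t|)"]
p2 <- summary(m2)$coefficients["x1", "Pr(>|t|)"]
print(c(m1 = p1, m2 = p2))

# Correlation between the two predictors
print(cor(x1, x2))
```

Comparing the two p-values next to cor(x1, x2) shows how much of the
change could come from the predictors overlapping with each other.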
As I add more predictor variables (or interaction terms), the set of
estimates that come out statistically significant keeps changing.
Sometimes x1, x2 and x6 are significant; at other times x2, x4 and x5
are.
It seems to me that I could tweak my model in such a way (by
adding/removing predictor variables or "suitable" interaction terms)
that I could "prove" whatever I'd like to prove.
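To make the worry concrete, here is a rough sketch on pure noise
(simulated, hypothetical data; the names n, k, X and pvals are made up
for the example). y is generated independently of every predictor, yet
when enough candidate predictors are tried, some p-values can fall
below 0.05 by chance alone.

```r
# Pure-noise illustration (simulated, hypothetical data): y is unrelated
# to every predictor by construction.
set.seed(42)
n <- 50   # observations
k <- 20   # candidate predictors
X <- as.data.frame(matrix(rnorm(n * k), nrow = n))
names(X) <- paste0("x", 1:k)
y <- rnorm(n)

# Fit one single-predictor model per column and collect the slope p-values
pvals <- sapply(names(X), function(v) {
  fit <- lm(y ~ X[[v]])
  summary(fit)$coefficients[2, "Pr(>|t|)"]
})

print(sort(pvals))          # smallest p-values first
print(sum(pvals < 0.05))    # how many look "significant" at the 5% level
```

Any predictor that shows up below 0.05 here is a false positive, since
none of them has a real relationship with y.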
What is the proper methodology here? What do people do in such cases?
I can provide the data if anyone would like to have a look at it.
Best regards,
Stathis Kamperis