[R-sig-ME] LME and overall treatment contrasts / estimates

Fri Jul 15 12:35:29 CEST 2011

Hello fellow R users,

I am having a problem finding the estimates (and therefore fulfilling  
my hypothesis testing) for each separate treatment within my mixed  
models using 'lme' (package nlme). I have had problems explaining this  
on another forum, so apologies if sometimes my terminology isn't  
perfect, and that this ends up being a long message for a simple  
problem, but with patience I hope someone can help.

Firstly then, the model:
The data: Plant biomass
Fixed Factors: Treatment(x3 Dry, Wet, Control) Year(x8 2002-2009)
Random Factors: 5 plots per treatment, 10 quadrats per plot (N=1200  
(3*5*10*8)).

I am modelling this in two ways, firstly with year as a continuous variable
(interested in the difference in estimated slope over time in each treatment
'year*treatment'), and secondly with year as a categorical variable
(interested in difference between 'treatments'). My problem only comes  
with the categorical data, but I'll take you through the continuous  
first.

--------------
ie: (with Year as either numeric or factor)

Model<-lme(Species~Year*Treatment,random=~1|Plot/Quadrat,na.action =  
na.omit,data=UDD)
---------------------

When using Year as a continuous variable, the summary output of the  
lme function means that I can compare the 3 treatments within my  
model...
i.e. it takes one of the Treatment*year interactions as the baseline  
(Year in the example below) and compares (contrasts) the other two to  
that.
-----------------
ie
summary(Model)
Fixed effects: Species ~ Year * Treatment
                      Value Std.Error   DF   t-value p-value
(Intercept)      1514.3700  352.7552 1047  4.292978  0.0000
Year               -0.7519    0.1759 1047 -4.274786  0.0000
Treatment0       -461.9500  498.8711   12 -0.925991  0.3727
Treatment1      -1355.0450  498.8711   12 -2.716222  0.0187
Year:Treatment0     0.2305    0.2488 1047  0.926537  0.3544
Year:Treatment1     0.6776    0.2488 1047  2.724094  0.0066

so the baseline trend for Year:Treatment-1 is indicated by Year  
-0.7519, Year:Treatment0 differs from that baseline Year:Treatment-1  
by 0.2305
and Year:Treatment1 is significantly different (p=0.0066) from  
Year:Treatment-1
----------------------

I can then calculate the overall treatment*year effect using  
'anova.lme(Model)'.
-------------------------
anova.lme(Model1)
                numDF denDF   F-value p-value (Intercept)        1   
1047 143.15245  <.0001
Year               1  1047  19.56663  <.0001
Treatment          2    12   3.73890  0.0547
Year:Treatment     2  1047   3.83679  0.0219

so there is an overall difference in slope between treatments  
(Year:Treatment interaction) p=0.0219
--------------------------

However, the problem comes when I use Year as a categorical variable.  
Here, I am interested solely in the Treatment effect (not the  
interaction with year, but I want to include this as a fixed effect in  
the model). However, the output for the two labelled 'Treatment's  
against the third comparison, are not the overall effect but are a  
comparison within a year (2002). I know this because if I was  
modelling data which was untransformed, non generalised, no missing  
data, with no other covariates (which in my real final model most of  
these things are included), the overall estimates for each treatment  
would be the grand mean for each treatment (across all years and  
quadrats). This is how I know the fixed effects output given in R is  
not what I want, as the estimates correspond to the mean within 2002.  
The contrasts for hypothesis testing then give you a t-value and  
p-value for tests between treatments within 2002
--------------------------
summary(Model1)
Fixed effects: Species ~ Year * Treatment
                     Value Std.Error   DF   t-value p-value
(Intercept)          6.42  1.528179 1029  4.201079  0.0000
Year2003             4.10  1.551578 1029  2.642471  0.0084
Year2004             5.00  1.551578 1029  3.222526  0.0013
Year2005            -1.52  1.551578 1029 -0.979648  0.3275
Year2006            -3.08  1.551578 1029 -1.985076  0.0474
Year2007            -2.40  1.551578 1029 -1.546813  0.1222
Year2008             2.24  1.551578 1029  1.443692  0.1491
Year2009            -4.30  1.551578 1029 -2.771372  0.0057
Treatment0           0.46  2.161171   12  0.212848  0.8350
Treatment1           0.50  2.161171   12  0.231356  0.8209
Year2003:Treatment0 -2.46  2.194262 1029 -1.121106  0.2625
Year2004:Treatment0 -1.34  2.194262 1029 -0.610684  0.5415
Year2005:Treatment0  0.34  2.194262 1029  0.154950  0.8769
Year2006:Treatment0  1.60  2.194262 1029  0.729174  0.4661
Year2007:Treatment0  1.76  2.194262 1029  0.802092  0.4227
Year2008:Treatment0 -3.22  2.194262 1029 -1.467464  0.1426
Year2009:Treatment0  1.80  2.194262 1029  0.820321  0.4122
Year2003:Treatment1  0.22  2.194262 1029  0.100261  0.9202
Year2004:Treatment1  3.48  2.194262 1029  1.585954  0.1131
Year2005:Treatment1  5.00  2.194262 1029  2.278670  0.0229
Year2006:Treatment1  4.70  2.194262 1029  2.141950  0.0324
Year2007:Treatment1  6.08  2.194262 1029  2.770863  0.0057
Year2008:Treatment1  2.32  2.194262 1029  1.057303  0.2906
Year2009:Treatment1  5.56  2.194262 1029  2.533881  0.0114

so Treatment0 (in year 2002) is different to baseline Treatment-1 (in  
year 2002)  by 0.46 p=0.8350
----------------------------------------

I can still get my overall effects (using anova.lme) which for year  
the year*treatment interaction, with year as a categorical variable,  
tells me if the differences in treatment values change in different  
years. And for Treatment (the thing I am interested in) it tells me if  
there is a difference between Treatments.
However, for my hypothesis testing, I want to compare the grand means  
for each separate treatment, taking into account the variance at my  
different scales of time and space. I can do this in SAS using  
'estimate' or 'lsmeans', but for various reasons I want to do it in R  
as well.
I tried 'glht' (package 'multcomp')
ie
require(multcomp)
summary(glht(Model, linfct = mcp(Treatment = "Tukey")))
but this only works if there are no interactions present, otherwise  
again it gives a comparison for one particular year. I would prefer to  
use contrasts rather the multiple comparison testing, but this is  
still an option.

I think this is a fairly simple ask / hypothesis, but not something I  
have not found yet in R. That is the joy of statistics and R is that  
there are always new things to learn.

Very grateful for any assistance,
Mark