[R-sig-ME] Help establishing mixed model equation for split, plot design

Wed Apr 4 11:14:34 CEST 2018

Hi, 

> Sorry, lme only takes nested random effects 

Other designs are possible, e.g. fitting crossed random effects in lme is possible but fiddly. Pinheiro & Bates (Mixed Effects Models in S and S-Plus, 2000, p163-167) show how to do this using pdBlocked.

Best wishes,
Paul

> On 4 Apr 2018, at 11:34, Samuel Knapp <samuel.knapp at tum.de> wrote:
> 
> Hi Felipe,
> 
> 
> please also respond to the mailing list.
> 
> 
> Sorry, lme only takes nested random effects (see 
> https://stats.stackexchange.com/questions/228800/crossed-vs-nested-random-effects-how-do-they-differ-and-how-are-they-specified). 
> In order to use the rep:salinity effect, you could add a column for the 
> main plots in your data set:
> 
> 
> sp.datos$rep_salinityF <- paste(sp.datos$rep,sp.datos$salinityF,paste="_")
> 
> lmemodel <- lme(Plantsurvival ~ salinityF*immersionF*SpecTF, random= 
> ~1|rep_salinityF, data=sp.datos)
> 
> 
> I hope, it works like this.
> 
> 
> I'm afraid, I don't fully understand your question regarding equal 
> number of samples. Do you mean missing observations? In lme, you need to 
> add na.action="na.omit", while lmer is automatically omitting missing 
> observations.
> 
> You're right, that the ANOVA is doing type III SS. If you're interested 
> in other SS type, you could look at the Anova() function from the car 
> package. Apparently it can also deal with mixed models. Furthermore, the 
> new version of lmerTest can also handle other SS type. See 
> https://github.com/runehaubo/lmerTestR/blob/master/pkg_notes/new_lmerTest.pdf
> 
> 
> Best,
> 
> Samuel
> 
> 
> 
> On 04/04/18 10:15, CALLEJA APESTEGUI, FELIPE FRANCISCO wrote:
>> 
>> Hi Samuel,
>> 
>> 
>> first of all thanks a lot for the quick and clear response.
>> 
>> 
>> As you state, all models give the same results once they are 
>> established following your examples. Now I have a clear view about 
>> what is really doing the command and how to continue.
>> 
>> 
>> Just one quick comment. The line:
>> 
>> *lmemodel <- lme(Plantsurvival ~ salinityF*immersionF*SpecTF, random= 
>> ~**1|rep:salinityF, data=sp.datos)*
>> 
>> 
>> Throws the following error:
>> 
>> *Error in getGroups.data.frame(dataMix, groups) : **invalid formula 
>> for groups*
>> 
>> I leave it here in case is interesting for you. I've decided to use 
>> the lmer package because I have other variables that are not balanced, 
>> so I prefer to use the same model in all cases, and that package is 
>> working fine.
>> 
>> Just one final question. Is there any consideration I should have when 
>> working with variables without equal number of samples? I mean, 
>> something specific in the command of the lmer I should be careful with 
>> or investigate further on. Or does the function itself has all 
>> considerations included. i see from the *anova(fit.okay3, 
>> ddf="Kenward-Roger") *that the anova is made using the type III SS so 
>> I wouldn't have to change that part. *
>> *
>> 
>> Also, fyi,� the aov function in the case of not grouping by rep throws 
>> the same result as the lmer.
>> 
>> Again, I really appreciate your help, it has been a lifesaver.
>> 
>> Best regards,
>> 
>> *Felipe Calleja Ap�stegui*
>> 
>> *Predoctoral researcher*
>> 
>> *
>> *
>> 
>> *Instituto de Hidr�ulica Ambiental "IH Cantabria"*
>> 
>> C/ Isabel Torres, N� 15
>> 
>> Parque Cient�fico y Tecnol�gico de Cantabria
>> 
>> 39011 Santander (Espa�a)
>> 
>> www.ihcantabria.es <http://www.ihcantabria.es/>
>> 
>> Tel:� +34 942 20 16 16 Ext. 1153
>> 
>> Fax: +34 942 26 63 61
>> 
>> e-mail: f*elipe-francisco.calleja at alumnos.unican.es*
>> 
>> 
>> ------------------------------------------------------------------------
>> *De:* Samuel Knapp <samuel.knapp at tum.de>
>> *Enviado:* martes, 3 de abril de 2018 21:26:31
>> *Para:* r-sig-mixed-models at r-project.org; CALLEJA APESTEGUI, FELIPE 
>> FRANCISCO
>> *Asunto:* Re: Help establishing mixed model equation for split, plot 
>> design
>> Hi Felipe,
>> 
>> if I understand the design correctly, it is a split-plot design with
>> salinityF as the whole-plot. Only, if you have arranged the boxes in a
>> way that replicate blocks are formed, you should include the rep effect
>> in a model. These replicate blocks could be, that you always have three
>> boxes with the three different salinity levels grouped together, e.g. on
>> one shelf. If you have simply replicated each box 5 times and they are
>> set up fully randomly, you should not include a repliate block.
>> 
>> If you have no missing values, and you specify the right model, both
>> aov() and lme() should give you the same results. I would suggest the
>> following models:
>> 
>> 
>> # with replicate blocks
>> 
>> aovmodel <- aov(Plantsurvival ~ salinityF*immersionF*SpecTF +
>> Error(rep/salinityF), data=sp.datos)
>> 
>> lmemodel <- lme(Plantsurvival ~ salinityF*immersionF*SpecTF, random=
>> ~1|rep/salinityF, data=sp.datos)
>> 
>> # or with lmer from lme4 package (library (lme4)
>> 
>> lmermodel <- lmer(Plantsurvival ~ salinityF*immersionF*SpecTF
>> +(1|rep/salinityF), data=sp.datos)
>> 
>> # without replicate blocks (I'm not completely sure, if aov will report
>> the exact same results here)
>> 
>> aovmodel <- aov(Plantsurvival ~ salinityF*immersionF*SpecTF +
>> Error(rep:salinityF), data=sp.datos)
>> 
>> lmemodel <- lme(Plantsurvival ~ salinityF*immersionF*SpecTF, random=
>> ~1|rep:salinityF, data=sp.datos)
>> 
>> lmermodel <- lmer(Plantsurvival ~ salinityF*immersionF*SpecTF
>> +(1|rep:salinityF), data=sp.datos)
>> 
>> 
>> As rep/salinityF is simply a short form for rep+rep:salinityF the
>> difference will be that you have no rep block effect in the models
>> without replicate blocks.
>> 
>> 
>> If you have no missing values, the point about using a mixed model here
>> is less about variance estimation through a random effect, but
>> automatically getting the right F-Test of the main-plot effect in ANOVA
>> (which is the main part of doing a proper split-plot analysis!!!).
>> 
>> My experience is that the anova() function from the lmerTest package
>> will give you the right denominator df when using the Kenward-Roger
>> method. You need to use lmer for this:
>> 
>> library(lmerTest)
>> 
>> anova(lmermodel,ddf="Kenward-Roger")
>> 
>> For proper mean comparisons, I suggest to use the emmeans package, with
>> the Tukey test and letter display from the cld() function:
>> 
>> library(emmeans)
>> 
>> cld(emmeans(lmermodel,"salinityF") #replace salinityF by the factor you
>> are interested
>> 
>> 
>> Finally, if you have one or more missing values, aov() will return
>> strange anova tables, thus better use a mixed model!
>> 
>> 
>> Best regards,
>> 
>> Samuel
>> 
>> -- 
>> Samuel Knapp
>> 
>> Lehrstuhl f�r Pflanzenern�hrung
>> Technische Universit�t M�nchen
>> (Chair of Plant Nutrition
>> Technical University of Munich)
>> 
>> Emil-Ramann-Strasse 2
>> D-85354 Freising
>> 
>> Tel. +49 8161 71-3578
>> samuel.knapp at tum.de
>> www.researchgate.net/profile/Samuel_Knapp 
>> <http://www.researchgate.net/profile/Samuel_Knapp>
>> 
>> 
>> 
>> On 03/04/18 19:16, r-sig-mixed-models-request at r-project.org wrote:
>>> Message: 1
>>> Date: Tue, 3 Apr 2018 07:53:15 +0000
>>> From: "CALLEJA APESTEGUI, FELIPE FRANCISCO"
>>> <felipe-francisco.calleja at alumnos.unican.es>
>>> To: "r-sig-mixed-models at r-project.org"
>>> ������� <r-sig-mixed-models at r-project.org>
>>> Subject: [R-sig-ME] Help establishing mixed model equation for split
>>> ������� plot design
>>> Message-ID:
>>> 
>> <DB6P191MB0088D09AF911107B33C1DE5C87A50 at DB6P191MB0088.EURP191.PROD.OUTLOOK.COM>
>>> 
>>> Content-Type: text/plain; charset="iso-8859-1"
>>> 
>>> Hello,
>>> 
>>> 
>>> I'm looking for some help establishing a mixed model ANOVA using R, 
>> for a split plot design I've made for an experiment of saltmarsh 
>> germination. I'll explain as clear as possible the experimental design 
>> and afterwards what I've done and my doubts. Hope you can help me. I 
>> haven't worked very much with R so many of my doubts are about what 
>> I'm "telling" it to do with one or other command.
>>> 
>>> 
>>> The experiment consists in meassuring the germination percentage of 
>> one species in variable conditions of salinity, immersion time and 
>> presence of other species. I sow seeds in soil core's, that are inside 
>> plastic boxes that are filled periodically with saline water. There 
>> are three factors: salinity (3 levels: 0, 5 and 18), immersion time (3 
>> levels: 0, 20, 40%), species treatment (2 levels: Baccharis, Baccharis 
>> + Juncus). Inside each box there are 6 cores that combine in a 
>> complete random design the factors of immersion and species treatment. 
>> The box is filled with water at one of the levels of salinity. Thus, 
>> I'm using a split plot design with salinity as the whole plot factor, 
>> and immersion and species treatment as the subplot factors. All 
>> factors are considered fixed. Each box is repeated 5 times. Thus, 
>> there are 15 boxes and 90 soil cores. The dependent variable is the 
>> percentage of germination of the species Baccharis in each core.
>>> 
>>> 
>>> I have several doubts about how to analyze the array.
>>> 
>>> 
>>> 1 - As far as I understand, although I'm treating all my factors as 
>> fixed, this is a mixed model because of the interaction between 
>> subjects (the boxes I believe), and the "split plot nature" of the 
>> array, right? In that sense, which function would be better to analyze 
>> this, the aov of the stats package, the lme of the nlme package, the 
>> lmer of lme4?
>>> 
>>> 
>>> 
>>> �� 2 - I've had trouble calculating the degrees of freedom for the 
>> residuals. The only reference I have is that the error of the whole 
>> plot part should have 12 df's, and the within error should have 60 
>> df's. With that reference I've established two possible R commands:
>>> 
>>> 
>>> fit.aov2 <- aov(Plantsurvival ~ salinityF*immersionF*SpecTF + 
>> Error(rep:salinityF/immersionF:SpecTF), data=sp.datos)
>>> 
>>> 
>>> With rep being the number of repetition. This last one gives the 12 
>> and 60 df's for the error terms.
>>> 
>>> 
>>> The other option is:
>>> 
>>> fit.okay <- lme(Plantsurvival ~ salinityF*immersionF*SpecTF, random= 
>> ~1|rep/salinityF, data=sp.datos)
>>> 
>>> But in this last case, the df's are 8 and 60, which makes me suspect 
>> maybe there is something wrong. But as I said, I haven't cleared my 
>> head on which should be the correct df's.
>>> 
>>> 
>>> Questions: Is the aov line solving a mixed model adequate for my 
>> design?,
>>> 
>>> 
>>> 
>>> ���������������������� Is the lme line considering salinityF as a 
>> random factor? If it is, how can I tell it to consider all factors as 
>> fixed, but put salinity at the "higher level" of the whole plot and 
>> the other ones in the "lower level" of the subplot?
>>> 
>>> 
>>> I hope the questions and the experimental array are clear. If there 
>> is any doubt or need more information please let me know. I attach a 
>> csv file with the data in case you want to see it.
>>> 
>>> 
>>> Finally, if is not too much to ask, I'm fairly new to the splitplot 
>> anova's and R, so I would really appreciate if you could answer be 
>> with as much detail as possible, to fully understand what's going on 
>> and where to continue.
>>> 
>>> 
>>> Thanks a lot,
>>> 
>>> 
>>> Felipe Calleja Ap�stegui
>>> 
>>> Predoctoral researcher
>>> 
>>> 
>>> Instituto de Hidr�ulica Ambiental "IH Cantabria"
>>> 
>>> C/ Isabel Torres, N� 15
>>> 
>>> Parque Cient�fico y Tecnol�gico de Cantabria
>>> 
>>> 39011 Santander (Espa�a)
>>> 
>>> www.ihcantabria.es<http://www.ihcantabria.es/>
>>> 
>>> Tel:� +34 942 20 16 16 Ext. 1153
>>> 
>>> Fax: +34 942 26 63 61
>>> 
>>> e-mail: felipe-francisco.calleja at alumnos.unican.es
>>> 
>>> 
>>> 
>> 
> 
> 
> 	[[alternative HTML version deleted]]
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models