[R-sig-ME] random effect variance greater than output variable variance

Ben Bolker
Fri Nov 18 01:52:49 CET 2022


   Yes, the BLUPs (best linear unbiased predictors, i.e. the conditional modes of the random effects) are what ranef() returns.
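
For readers following along, here is a minimal sketch of that inspection. It is not the poster's analysis: the thread's data set is not available, so the Penicillin data shipped with lme4 stands in for it, and the object names (fit, re, blups, vc) are purely illustrative.

```r
# Sketch only: Penicillin (shipped with lme4) stands in for the thread's data;
# plate and sample are crossed grouping factors, much like Location and Variety.
library(lme4)

fit <- lmer(diameter ~ 1 + (1 | plate) + (1 | sample), data = Penicillin)

# ranef() returns the conditional modes (BLUPs): a list with one data frame
# per grouping factor and one row per level of that factor.
re <- ranef(fit)

# The long format (columns grpvar, term, grp, condval, condsd) is easier to sort.
blups <- as.data.frame(re)

# Sort by predicted effect, then look at both extremes across all factors.
blups <- blups[order(blups$condval), ]
head(blups, 3)
tail(blups, 3)

# Estimated variance components, to set beside the spread of the BLUPs.
vc <- as.data.frame(VarCorr(fit))
vc[, c("grp", "vcov", "sdcor")]
```

With the model from this thread the corresponding call would be ranef(MELM.1), whose components are named Location, Year and Variety after the grouping factors in the formula.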

On 2022-11-17 4:01 a.m., Norman DAURELLE via R-sig-mixed-models wrote:
> 
> Dear Thierry,
> 
> Thank you for your answer. You say "inspect the BLUP of the random effects". Does that mean using ranef()? If not, could you please explain what you mean? I don't really understand it.
> 
> Yes, you're probably right about the fixed effect of rainfall, thanks.
> 
> regards,
> 
> Norman
> 
> 
> From: "Thierry Onkelinx" <thierry.onkelinx using inbo.be>
> To: "Norman DAURELLE" <norman.daurelle using agroparistech.fr>
> Cc: "r-sig-mixed-models" <r-sig-mixed-models using r-project.org>
> Sent: Thursday, 10 November 2022 10:19:02
> Subject: Re: [R-sig-ME] random effect variance greater than output variable variance
> 
> Dear Norman,
> 
> I think this might be due to the imbalance in your design. You need to inspect the BLUPs of the random effects. Look for the extremes in location and variety. I would expect some combinations with an extreme positive (negative) location effect compensated by an extreme negative (positive) variety effect.
> 
> Furthermore, look into the fixed effects. Long.term.Apr.Jun is highly correlated with Long.term.total, so their effects cancel each other to a certain extent. I recommend replacing Long.term.total with its difference from Long.term.Apr.Jun.
> 
> Best regards,
> 
> ir. Thierry Onkelinx
> Statisticus / Statistician
> 
> Vlaamse Overheid / Government of Flanders
> INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST
> Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
> thierry.onkelinx using inbo.be
> Havenlaan 88 bus 73, 1000 Brussel
> www.inbo.be
> 
> ///////////////////////////////////////////////////////////////////////////////////////////
> To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner
> The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey
> ///////////////////////////////////////////////////////////////////////////////////////////
> 
> 
> 
> On Wed, 9 Nov 2022 at 21:40, Norman DAURELLE <norman.daurelle using agroparistech.fr> wrote:
> 
> 
> 
> 
> Dear Thierry,
> 
> I used these lines:
> 
> MELM.1 <- lmer(Yield..kg.Ha. ~ Rep.severity.means + Long.term.Apr.Jun + Long.term.total
>                + (1|Location) + (1|Year) + (1|Variety),
>                data = yield.disease.rainfall.df)
> 
> summary(MELM.1)
> 
> and compared the outputs of the summary
> 
> summary(MELM.1)
> Linear mixed model fit by REML ['lmerMod']
> Formula: Yield..kg.Ha. ~ Rep.severity.means + Long.term.Apr.Jun + Long.term.total +
> (1 | Location) + (1 | Year) + (1 | Variety)
> Data: yield.disease.rainfall.df
> 
> REML criterion at convergence: 19679.6
> 
> Scaled residuals:
>     Min      1Q  Median      3Q     Max
> -4.1926 -0.5998 -0.0246  0.5572  5.0190
> 
> Random effects:
>  Groups   Name        Variance Std.Dev.
>  Variety  (Intercept) 106888   326.9
>  Location (Intercept) 512674   716.0
>  Year     (Intercept)  15724   125.4
>  Residual             109754   331.3
> Number of obs: 1352, groups: Variety, 22; Location, 16; Year, 4
> 
> Fixed effects:
>                     Estimate Std. Error t value
> (Intercept)         160.9075   236.6696   0.680
> Rep.severity.means   -3.7333     0.6512  -5.733
> Long.term.Apr.Jun   -10.1864     0.8009 -12.719
> Long.term.total       9.8103     0.4631  21.182
> 
> Correlation of Fixed Effects:
>             (Intr) Rp.sv. L..A.J
> Rp.svrty.mn -0.038
> Lng.trm.A.J -0.061 -0.061
> Lng.trm.ttl -0.314  0.016 -0.699
> 
> to the var() of my output variable:
> 
>> var(yield.disease.rainfall.df$Yield..kg.Ha.)
> [1] 435938
> 
> and it bothers me that this variance is smaller than the Location variance reported under random effects in the summary, because it prevents me from using the method I had planned for presenting the results. I wanted to show how much each factor (year, location, and variety/cultivar) influences yield, apart from disease severity and rainfall.
> 
> Am I misunderstanding what these variance values mean for the random effects in the summary?
> Can they not be compared to the var() of my variable of interest?
> 
> Thanks!
> 
> Norman
> 
> 
> 
> 
> From: "Thierry Onkelinx" <thierry.onkelinx using inbo.be>
> To: "Norman DAURELLE" <norman.daurelle using agroparistech.fr>
> Cc: "r-sig-mixed-models" <r-sig-mixed-models using r-project.org>
> Sent: Wednesday, 9 November 2022 09:34:07
> Subject: Re: [R-sig-ME] random effect variance greater than output variable variance
> 
> Dear Norman,
> 
> Can you show us the full code of the lme4 call and the output of summary(model)? How did you calculate the variances for Y and the random effect?
> 
> Best regards,
> 
> ir. Thierry Onkelinx
> 
> On Tue, 8 Nov 2022 at 17:37, Norman DAURELLE via R-sig-mixed-models <r-sig-mixed-models using r-project.org> wrote:
> 
> 
> Dear list members,
> 
> I used a linear mixed-effects model to estimate the effect of a disease on the yield of a crop,
> with the following formula:
> 
> Y ~ X + R1 + R2 + (1|year) + (1|location) + (1|cultivar)
> 
> where, for each observation:
> 
> Y is the yield of the crop,
> X is the average disease severity in the field,
> R1 and R2 are the rainfall values in the first and second parts of the growing season, respectively,
> and year, location and cultivar are the year, location and cultivar of the observation.
> 
> I have 5 years, 16 locations and a lot of cultivars, with an unbalanced experimental design.
> 
> The variance given in the summary for the factor Location is greater than the variance of the yield variable taken by itself, and this surprises me.
> 
> I wanted to show the relative importance of each factor for yield with a Venn diagram that presents each factor's variance as a part of the overall yield variance, with the factors' variances overlapping. However, the fact that the variance associated with one factor is greater than the variance of the output variable makes me doubt my understanding of the variances shown in the summary of a mixed-effects model.
> 
> Would someone have a simple explanation of what exactly these variances represent?
> 
> I thought that for a factor with N levels, you had V = ( Σ (x_i - μ)² ) / N, with i = 1, ..., N, x_i the mean of the output variable in the i-th level of the factor, and μ the overall mean of the output variable.
> 
> Is this not how the variance for a random effect is computed?
> 
> Thanks for any answer!
> 
> Cheers,
> 
> Norman
> 
> 
> 
> 
> 
> 
> [[alternative HTML version deleted]]
> 
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 
> 
> 
> 
> 
> 
> 

-- 
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
(Acting) Graduate chair, Mathematics & Statistics
 > E-mail is sent at my convenience; I don't expect replies outside of 
working hours.


