[R-sig-ME] Question about flexible LME4 Variance-Covariance Structure

Mon Aug 16 23:18:43 CEST 2021

   I agree that the 'hack' is suboptimal.  It can be done if you set up 
a bunch of dummy variables **and** use the lowest-variance group as the 
reference, e.g.

   NUM_VALUE ~ TREATMENT + (1|block) + (0+block2dummy|obs) + (0 + 
block3dummy|obs) + ...

   This is ugly (but can be automated); I think it *is* the correct 
model, but reconstructing the residual variances for each block is a 
nuisance.

   glmmTMB can also  do dispersion models, via

glmmTMB(NUM_VALUE ~ TREATMENT + (1|block), dispformula  = ~ block, ...)

   cheers
    Ben Bolker

On 8/16/21 5:12 PM, Peter Claussen via R-sig-mixed-models wrote:
> Jared,
> 
> I’ll preface this by saying I’m the biometrician for GDM Solutions, and we provide software for managing agronomic trials. Our software is used by a wide range of scientists, many who collaborate with companies like BCS (and, IIRC, BCS holds a large number of licenses for our software). I think I’m familiar with the type of problem you’re trying to solve. We provide an interface for diagnostic screening of individual trials, including RCB designs, and we use R as the computational engine; we also write R scripts programmatically for independent analysis.
> 
> That said, I wouldn’t use lmer for this. I think lme might give you what you’re looking for. Try
> 
> 
> lme(NUM_VALUE ~ TREATMENT, random =  ~ 1 | BlockingName, data = zdf, weights=varIdent(form= ~ 1| BlockingName)
> 
> 
> I would avoid specifying TREATMENT as a random effect to ’trick’ lmer into calculating variances. You might get numbers, but they don’t correspond to the correct model, so you really can’t be certain the variances are correct. The code I’ve given above should allow you to keep TREATMENT as fixed but will give ratios of residual variance within replicates. From an example trial, I get,
> 
> 
> Linear mixed-effects model fit by REML
>    Data: ARMdata
>    Log-restricted-likelihood: 5.914592
>    Fixed: assessment18 ~ treatment
> (Intercept)  treatment2  treatment3  treatment4  treatment5  treatment6  treatment7  treatment8  treatment9 treatment10
>    0.9841966  -0.8691926  -0.8234544  -0.9225589  -0.9772064  -0.8645547  -0.9772064  -0.9772064  -0.9772064  -0.7643471
> treatment11 treatment12
>   -0.9657557  -0.9772064
> 
> Random effects:
>   Formula: ~1 | replicate
>          (Intercept) Residual
> StdDev:  0.02308651 0.134405
> 
> Variance function:
>   Structure: Different standard deviations per stratum
>   Formula: ~1 | replicate
>   Parameter estimates:
>          1         2         3         4         5         6
> 1.0000000 1.5128266 2.3479912 0.6976083 2.3369190 0.8236288
> Number of Observations: 72
> Number of Groups: 6
> 
> 
> I’m making some assumptions about what you intend to do with the variances, but those would be beyond the scope of this group, I think. You can direct questions relating to those assumptions to me at Pete using gdmdata.com <mailto:Pete using gdmdata.com>
> 
> Cheers,
> 
> 
>> On Aug 11, 2021, at 6:19 PM, Jared Anderson <jared.anderson.ext using bayer.com> wrote:
>>
>> Hello All,
>>
>> I am a Data Scientist with Bayer Crop Science and I deal often with generalized linear mixed models in my role. I am new to this subscription group and wanted to reach out because I am trying to fit two different kinds of models on the same data and use their residual variances as a statistic for assessing data quality.
>> For our variance-ratio statistic, two slightly different linear models are used. Both models are simple single treatment factor randomized complete block design models. The difference is in the assumed covariance structure of the residuals.
>> The first model makes the standard assumption that the residuals all have the same variance, while the second model assumes that the residuals variances differ from rep to rep.
>> Model 1: y = m + T + R + e, where m = overall mean, T = treatment effect, R = blocking effect and e = random error which is assumed to be homoschedastic: cov(e) = common variance*I = Diag(common variance).
>> Model 2: y = m + T + R + e, where m = overall mean, T = treatment effect, R = blocking effect and e = random error which is assumed to be heteroschedastic: cov(e) = D(var[i]) i=1, …, number of reps. The variance of the residuals is different for each rep.
>> VR[i] = var[i]/common variance, one VR value for each rep.
>>
>> I have been able to build the desired Model1 in LME4
>> Formula_LME4 <- as.formula(paste0(zParams$value,” ~ 1 + (1|”,zParams$factorLevelId,”) + (1|”BlockingName,”)”))
>> HomoS_LME4 <- lmer(Formula_LME4,data=zdf) #covariance structure is the same for all reps
>> (Treating the fixed effects (factorLevelId) as random is a way to trick the routines into calculating the needed variances.)
>>
>>
>> Where I am running into a problem and was hoping to get your advice: I cannot seem to implement Model 2 successfully
>> I think the LME4 Heteroscedastic formula is supposed to be this, if I understand it correctly:
>> lmer(NUM_VALUE ~ 1 + (1 | TREATMENT) + (1 | BlockingName) + (0 | BlockingName),data=zdf)
>> But it is throwing an error:
>> Linear mixed model fit by REML ['lmerMod']
>> Formula: NUM_VALUE ~ 1 + (1 | TREATMENT) + (1 | BlockingName) + (0 | BlockingName)
>>    Data: zdf
>> REML criterion at convergence: 1384.89
>> Error in thl[[i]] : subscript out of bounds
>> And I was wondering if you’ve ever come across the error above?
>>
>>
>> If it helps, back when we leveraged Asreml licenses for our work, I was able to generate Model 2 this way:
>>
>> fixedFormula_ASREML <- as.formula(paste0(zParams$value," ~ 1"))
>> randomFormula_ASREML <- as.formula(paste0(" ~ ",zParams$factorLevelId," + ", BlockingName))
>> covFormula_ASREML    <- as.formula(paste0(" ~ at(", BlockingName, "):units"))
>>
>> HeteroS_ASREML <- asreml(fixed  = fixedFormula_ASREML,
>>                   random = randomFormula_ASREML,
>>                   rcov   = covFormula_ASREML,
>>                   data   = zdf)
>>
>>
>> Best,
>> Jared A.
>>
>>
>> 	[[alternative HTML version deleted]]
>>
>> _______________________________________________
>> R-sig-mixed-models using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 
> 
> 	[[alternative HTML version deleted]]
> 
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 

-- 
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
Graduate chair, Mathematics & Statistics