[R-sig-ME] How to interpret verbose output

Thu May 22 02:05:26 CEST 2014

On 14-05-21 06:15 PM, Stuart Luppescu wrote:
> 
> Hello, I'm trying to do a variance decomposition of teacher performance
> ratings. Each teacher is observed four times and rated on 9 components.
> The model looks like this:
> 
>  lme8 <- lmer(rating ~
>               (1|tid.f) + (1|obsorder.f) + (1|comp.f) +
>                (1|tid.f:comp.f) +  (1|obsorder.f:comp.f) + (1|
> tid.f:obsorder.f) ,
>                data=ratings, REML=FALSE, verbose=2)
> 
> where tid.f is the teacher identifier, comp.f the component identifier,
> and obsorder.f the observation number. 
> 
> This works fine for the whole dataset, but I want to do it separately by
> decile based on the average rating for each teacher, so I added this: ,
> subset=ratings$bins==quantile
> where the bins are {1, ..., 10} indicating the decile, and quantile is a
> loop index used like this: for(quantile in 1:10) {}
> 
> It works fine for the lowest decile, but fails for every decile after
> that. I get 0.00 for the tid.f variance component, which is the one I'm
> really interested in. I have no idea why. I checked the distribution of
> average ratings by decile and it all looks unremarkable. I think there
> may be a clue in the iteration details as shown by the verbose output
> but I don't know how to interpret it. Here it is for decile 1:
> 

  This printing comes from minqa::bobyqa.  ?bobyqa says:

iprint The value of ‘iprint’ should be set to ‘0, 1, 2 or 3’,
          which controls the amount of printing. Specifically, there is
          no output if ‘iprint=0’ and there is output only at the
          return if ‘iprint=1’. Otherwise, each new value of ‘rho’ is
          printed, with the best vector of variables so far and the
          corresponding value of the objective function. Further, each
          new value of the objective function with its variables are
          output if ‘iprint=3’.  Default value is ‘0’.

> npt = 8 , n =  6

Number of points used for quadratic approximation, number of points
(parameters)

> rhobeg =  0.2 , rhoend =  2e-07 
>    0.020:  16:      18359.8;0.454554 0.735854 0.376302 0.705437 0.787591
> 0.802885 

   rho (current trust region radius): number of function evaluations
(?): current value of objective function (PWRSS/deviance); values of
parameters ("theta") -- lower triangle of the Cholesky factor of the RE
variance-covariance matrices, scaled by the residual standard deviation.

>   0.0020:  32:      18346.2;0.454004 0.699893 0.407403 0.574987 0.702683
> 0.880645 
>  0.00020:  77:      18263.7;0.432231 0.697895 0.398860 0.0461275
> 0.326475  1.55039 
>  2.0e-05: 292:      18252.6;0.432051 0.700841 0.394275 0.0472994
> 0.315551 0.130071 
>  2.0e-06: 339:      18252.6;0.432029 0.700839 0.394515 0.0472285
> 0.314849 0.129962 
>  2.0e-07: 364:      18252.6;0.432028 0.700822 0.394548 0.0472181
> 0.314777 0.129904 
> At return
> 395:     18252.578: 0.432030 0.700822 0.394554 0.0472204 0.314755
> 0.129913
>>   print(summary(lme8))
> Linear mixed model fit by maximum likelihood  ['lmerMod']
> Formula: rating ~ (1 | tid.f) + (1 | obsorder.f) + (1 | comp.f) + (1 |  
>     tid.f:comp.f) + (1 | obsorder.f:comp.f) + (1 | tid.f:obsorder.f)
>    Data: ratings
>  Subset: ratings$bins == quantile
> 
>      AIC      BIC   logLik deviance df.resid 
>  18268.6  18328.1  -9126.3  18252.6    12622 
> 
> Scaled residuals: 
>     Min      1Q  Median      3Q     Max 
> -3.2200 -0.5592 -0.0306  0.5711  4.4942 
> 
> Random effects:
>  Groups            Name        Variance Std.Dev.
>  tid.f:comp.f      (Intercept) 0.033067 0.18184 
>  tid.f:obsorder.f  (Intercept) 0.087013 0.29498 
>  tid.f             (Intercept) 0.027579 0.16607 
>  obsorder.f:comp.f (Intercept) 0.000395 0.01988 
>  comp.f            (Intercept) 0.017551 0.13248 
>  obsorder.f        (Intercept) 0.002990 0.05468 
>  Residual                      0.177161 0.42090 
> Number of obs: 12630, groups: tid.f:comp.f, 3159; tid.f:obsorder.f,
> 1404; tid.f, 351; obsorder.f:comp.f, 45; comp.f, 9; obsorder.f, 5
> 
> Fixed effects:
>             Estimate Std. Error t value
> (Intercept)  2.01657    0.05355   37.65
> 
> And here for decile 5:
> 
> npt = 8 , n =  6 
> rhobeg =  0.2 , rhoend =  2e-07 
>    0.020:  18:      15847.4;0.261469 0.514852 0.178718 0.0962381
> 0.152470 0.160530 
>   0.0020:  35:      15749.5;0.356245 0.532056  0.00000 0.0961818
> 0.219586 0.166497 
>  0.00020:  90:      15742.1;0.349036 0.516678  0.00000 0.0755764
> 0.330633 0.248861 
>  2.0e-05: 107:      15742.1;0.348612 0.515454  0.00000 0.0766747
> 0.332641 0.251693 
>  2.0e-06: 250:      15742.1;0.348335 0.515211  0.00000 0.0765898
> 0.330456 0.260112 
>  2.0e-07: 273:      15742.1;0.348339 0.515207  0.00000 0.0765856
> 0.330510 0.260162 
> At return
> 281:     15742.076: 0.348339 0.515207 1.97296e-07 0.0765858 0.330510
> 0.260162
>>   print(summary(lme8))
> Linear mixed model fit by maximum likelihood  ['lmerMod']
> Formula: rating ~ (1 | tid.f) + (1 | obsorder.f) + (1 | comp.f) + (1 |  
>     tid.f:comp.f) + (1 | obsorder.f:comp.f) + (1 | tid.f:obsorder.f)
>    Data: ratings
>  Subset: ratings$bins == quantile
> 
>      AIC      BIC   logLik deviance df.resid 
>  15758.1  15817.7  -7871.0  15742.1    12772 
> 
> Scaled residuals: 
>     Min      1Q  Median      3Q     Max 
> -3.6535 -0.3833  0.1362  0.5225  3.4610 
> 
> Random effects:
>  Groups            Name        Variance  Std.Dev.
>  tid.f:comp.f      (Intercept) 0.0192757 0.13884 
>  tid.f:obsorder.f  (Intercept) 0.0421666 0.20535 
>  tid.f             (Intercept) 0.0000000 0.00000 
>  obsorder.f:comp.f (Intercept) 0.0009318 0.03052 
>  comp.f            (Intercept) 0.0173531 0.13173 
>  obsorder.f        (Intercept) 0.0107521 0.10369 
>  Residual                      0.1588568 0.39857 
> Number of obs: 12780, groups: tid.f:comp.f, 3195; tid.f:obsorder.f,
> 1420; tid.f, 355; obsorder.f:comp.f, 45; comp.f, 9; obsorder.f, 5
> 
> Fixed effects:
>             Estimate Std. Error t value
> (Intercept)  2.85126    0.06774   42.09
> 
> For decile 5, I notice that the column fourth from the right goes to 0
> after the first line. Does that mean something? Any ideas about why this
> is failing?
> 
> Thanks in advance.
> 
> 

  It means the second element of the theta vector (which controls the
among-'tid.f' variation in the intercept) is going to zero.  This is
probably the correct numeric value ...

  I would consider further

(1) scaling and centering continuous predictors to see if that helps
(2) simulating pseudo-data and fitting it to see how it performs
(?simulate.merMod will help with this)
(3) using bbmle::slice2D to view 2D slices of the deviance surface.