[R-sig-ME] [R] understanding I() in lmer formula

Sat Jun 17 05:48:41 CEST 2017

(Probably I should change the subject line and admit that this is no
longer about I().)

Emmanuel Curis writes:
 > L = \prod_{i=1}^n \prod_{,j=1}^{n_i} F'(Yij=y_ij|Ai=ai,Bi=bi) G'(Ai=ai, Bi=Bi)
 > 
 > where F' is the density of a Gaussian of mean ai + bi * days_{i,j} and
 > variance sigma (the variance of epsilon, the residual) and G' the
 > joint density of a binormal Gaussian vector of expectation (uA, uB)
 > and of covariance matrix Sigma = ( (f, g ), ( g, h ) ) and you
 > minimize -log L. i the subject index, j the measure index within the subject.
 ...
 > Once you wrote it, you can use dnorm to compute it, and use the logLik
 > function to check the results.

I'm trying to do this and things are not working out quite as I hoped.
Starting with the result of lmer on my sample data:
  > summary(xm) 
  Linear mixed model fit by maximum likelihood  ['lmerMod'] 
  Formula: xout ~ xin + (xin | xid) 
     Data: xd 

       AIC      BIC   logLik deviance df.resid  
     -16.2    -15.7     14.1    -28.2        2  

  Scaled residuals:  
       Min       1Q   Median       3Q      Max  
  -0.89383 -0.29515  0.05005  0.33286  0.80360  

  Random effects: 
   Groups   Name        Variance  Std.Dev. Corr  
   xid      (Intercept) 2.814e-03 0.053051       
            xin         6.583e-03 0.081134 -0.03 
   Residual             4.917e-05 0.007012       
  Number of obs: 8, groups:  xid, 3 

  Fixed effects: 
              Estimate Std. Error t value 
  (Intercept)  0.06061    0.03172   1.911 
  xin          1.00397    0.04713  21.301 

  Correlation of Fixed Effects: 
      (Intr) 
  xin -0.058 

  > coef(xm) 
  $xid 
    (Intercept)      xin 
  a  0.09870105 1.100632 
  b -0.01183068 1.008098 
  c  0.09496031 0.903172 

The good news is that I seem to be correctly computing the predicted
values, e.g., 0.09870105 + 1.100632 * xin for a data point in
group "a" gives a value which differs from its xout value by the same
amount as the corresponding value of xm at resp$wtres (which I surmise
is a vector of the residues).
So I expect that F' will be 
 dnorm(xm at resp$wtres ...)
The mean should be zero, by design ...
  > mean(xm at resp$wtres) 
  [1] -2.775558e-17 
as expected, essentially zero

The sd to pass to dnorm should be what?  Is that the .007012 from 
this line above?
   Residual             4.917e-05 0.007012       
That's not very close to what I see in the actual residuals:
  > var(xm at resp$wtres) 
  [1] 1.566477e-05 
  > sd(xm at resp$wtres) 
  [1] 0.003957874 
And I'd expect the value to be at least close to the actual SD of
the residuals.  If the correct value is neither of the two above,
please tell me what it is (or how to find it).

Then for G, I'm even less certain.  I suspect I'm supposed to use
dmvnorm(...)
The first argument would be one of the lines from the coef output above,
such as c(0.09870105, 1.100632), right?
The second (mean) would be the estimates of the fixed effects,
 c(0.06061,1.00397), right?
And the third, is supposed to be the variance-covariance matrix.
That would be your 
 f g
 g h
where f and h, are the two variances shown under:
  Random effects: 
   Groups   Name        Variance  Std.Dev. Corr  
   xid      (Intercept) 2.814e-03 0.053051       
            xin         6.583e-03 0.081134 -0.03 
   Residual             4.917e-05 0.007012       
so f = 2.814e-03, h = 6.583e-03
and g, I *think* would be the product of the two SD's and the Corr,
or 0.053051 * 0.081134 * -0.03
Is that all correct?

I hope you can see some errors above, since this doesn't look like
it's going to end up computing anything like the logLik of 14.1 !

Also, I have the feeling that all of this output is rounded, and that
I'd get better results (or at least results that agree with the other
results computed in the model) if I could find these data in the model
rather than copying them from the output.  I've found some of it, like
the residues (at least that's what I think wtres is!), but other data
above, such f,g,h I have not yet found.  If you can tell me how to find
(or compute) them, that would be helpful.