[R-sig-ME] ordinal model and lsmeans mean class computing

Thu Jun 16 18:56:48 CEST 2016

One way to answer this is to say that `lsmeans` does not use `fm22.clmm$fitted` to obtain its predictions. Instead, it uses the regression coefficients for the fixed-effects portion of the model, along with the model matrix for each combination of `temp` and `contact`, and then uses these to obtain predictions on the cumulative logit scale (along with an artificial factor, `cut`, for which intercept is being used).  These are then suitably back-transformed to obtain the mean class for each combination of `temp` and `contact`. The random effects of `judge` are not used at all by `lsmeans` (and in general, for any mixed model, `lsmeans` always uses only the fixed effects in calculating predictions). I suspect that the regression coefficients are obtained as some kind of average of those for the judges, and that this averaging takes place on the cumulative-logit scale -- whereas your averaging with `ddply` is being done on the back-transformed-cell-probability scale. There is a nonlinear relationship between these, so you can't expect to obtain the same results.

Here are some more details of the calculations:

R> # Obtain the reference grid upon which the lsmeans are based:
R> rg <- ref.grid(fm22.clmm, mode = "linear.predictor")
R> rg
'ref.grid' object with variables:
    temp = cold, warm
    contact = no, yes
    cut = multivariate response levels: 1|2, 2|3, 3|4, 4|5
Transformation: "logit"

R> # Here are the regression coefficients used:
R> rg at bhat
       1|2        2|3        3|4        4|5   tempwarm contactyes 
 -1.623667   1.513365   4.228527   6.088773   3.062997   1.834885 

R> # Here is the model matrix used for predictions
R> rg at linfct
      1|2 2|3 3|4 4|5 tempwarm contactyes
 [1,]   1   0   0   0        0          0
 [2,]   1   0   0   0       -1          0
 [3,]   1   0   0   0        0         -1
 [4,]   1   0   0   0       -1         -1
 [5,]   0   1   0   0        0          0
 [6,]   0   1   0   0       -1          0
 [7,]   0   1   0   0        0         -1
 [8,]   0   1   0   0       -1         -1
 [9,]   0   0   1   0        0          0
[10,]   0   0   1   0       -1          0
[11,]   0   0   1   0        0         -1
[12,]   0   0   1   0       -1         -1
[13,]   0   0   0   1        0          0
[14,]   0   0   0   1       -1          0
[15,]   0   0   0   1        0         -1
[16,]   0   0   0   1       -1         -1

The linear predictions are thus `rg at linfct %*% rg at bhat`, and these are on the logit scale. Back-transform them and you get cumulative probabilities; difference those results to get cell probabilities; and then use those to obtain mean class values.

I'm dismayed to see that warning message about `is.na`, and will try to track that down and fix it in the next update.

Russell V. Lenth  -  Professor Emeritus
Department of Statistics and Actuarial Science   
The University of Iowa  -  Iowa City, IA 52242  USA   
Voice (319)335-0712 (Dept. office)  -  FAX (319)335-3017

-----Original Message-----

Date: Wed, 15 Jun 2016 23:33:06 +0200
From: espesser <robert.espesser at lpl-aix.fr>
To: R-sig-mixed-models at r-project.org
Subject: [R-sig-ME] ordinal model and  lsmeans  mean class computing
Message-ID: <5761C992.3050707 at lpl-aix.fr>
Content-Type: text/plain; charset="UTF-8"

Dear All,

I tried to understand how the predicted mean class of an ordinal mixed model was computed in lsmeans().
(I want to compute mean classes from the fixed effects only; lsmeans included all the random terms)

# from the data wine, I estimated the model
library(ordinal)
  data(wine)
  fm22.clmm= clmm(rating~temp+contact +(1|judge),data=wine, Hess=T)
# get mean classes with lsmeans
lsmeans(fm22.clmm,   ~ temp, mode="mean.class")
  temp mean.class        SE df asymp.LCL asymp.UCL
  cold    2.339608 0.166799 NA   2.012688   2.666528
  warm    3.479936 0.201921 NA   3.084178   3.875694

Results are averaged over the levels of: contact
Confidence level used: 0.95
Warning message:
In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'

I supposed that the predicted mean class is the mean class number (from
1 to 5) weighted by the estimated probabilities. Therefore I tried:

library(plyr)
wine$fitted = fm22.clmm$fitted
ddply(wine, ~temp, summarise, sum(fitted*as.numeric(rating)) /sum(fitted))
   temp       ..1
1 cold 2.269931
2 warm 3.487144

There are some discrepancies with the results of lsmeans.
I got similar (small) discrepancies for other data and clmm models.

I certainly missed something, and I would really appreciate your help!

Best regards,

Robert