[R-sig-ME] two questions about clmm

Mon Aug 20 10:16:05 CEST 2012

Dear Malcolm

On 3 August 2012 19:11, Malcolm Fairbrother <m.fairbrother at bristol.ac.uk> wrote:
> Dear Rune (and list),
>
> I've been making use of clmm, and have two (potentially over-ambitious) questions. If Rune or anyone else can offer any insights about either, that would be much appreciated.
>
> First, I noticed something intriguing in the ordinal package documentation: the "## Binomial example with data from the lme4-package example", for clmm2. This suggests a way of shortening large datasets (with one Bernoulli trial per row) into shorter ones (with counts on each row, representing many trials), rather like "glmer(cbind(incidence, size - incidence)…" does for lme4. This obviously speeds up model fitting tremendously. However, I was wondering if there's any way to do this for outcomes with more than two levels (i.e., not just binomial, but multinomial)? This may not be possible or even make sense, but I thought I'd ask, given the example that was in the documentation.

It makes very good sense, but different computational strategies make
different specifications more natural.

In a binomial glm there are basically three ways to specify the data:
1) binary trials with one row for each trial, 2) binary trials with
one row for each unique trial/covariate setting and a weight indicating
the number of times that trial/setting was observed, and 3) a
two-column matrix with the number of successes and failures in each
row, one row for each covariate setting.
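
As a rough illustration of the three specifications, here is a minimal
sketch using base R's glm on simulated data. The data and the names
(long, agg, wide, succ, fail, n) are made up for this example and do
not come from the original question.

```r
# Simulated data (hypothetical): one binary trial per row, one 0/1 covariate
set.seed(1)
long <- data.frame(x = rep(c(0, 1), each = 50))
long$y <- rbinom(nrow(long), size = 1, prob = plogis(-0.5 + 1.0 * long$x))

# 1) one row per Bernoulli trial
fit1 <- glm(y ~ x, family = binomial, data = long)

# 2) one row per unique (response, covariate) combination plus a frequency weight
agg <- as.data.frame(table(y = long$y, x = long$x), responseName = "n")
agg$y <- as.numeric(as.character(agg$y))  # table() returns factors
agg$x <- as.numeric(as.character(agg$x))
fit2 <- glm(y ~ x, family = binomial, data = agg, weights = n)

# 3) two-column matrix of successes and failures, one row per covariate setting
succ <- tapply(long$y, long$x, sum)
size <- tapply(long$y, long$x, length)
wide <- data.frame(x    = as.numeric(names(succ)),
                   succ = as.vector(succ),
                   fail = as.vector(size - succ))
fit3 <- glm(cbind(succ, fail) ~ x, family = binomial, data = wide)

# The three fits should give the same coefficient estimates
print(cbind(trials = coef(fit1), weights = coef(fit2), matrix = coef(fit3)))
```

The coefficient estimates agree across the three forms; the reported
deviances and degrees of freedom can differ because the saturated
model is not the same in each representation.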

All three structures extend to the ordered multinomial situation, but
clm and clmm only work with the first two due to the way the internal
computations are carried out. As briefly mentioned in this vignette
(http://www.cran.r-project.org/web/packages/ordinal/vignettes/clm_intro.pdf)
on pages 10-11, setting up the data as in 2) is much more efficient
than in 1). Other packages like VGAM work with the matrix
representation of the multinomial response as in 3), and use an
iterated weighted least squares estimation scheme somewhat different
from the Newton-Raphson scheme employed in clm and clmm.

>
> Second, I often use "simulate" with fitted mer objects (from lme4), to get confidence intervals for quantities of interest. (Using "refit" and "simulate" together is fast.) Is there any similar way to simulate and refit fitted clmm objects?

Not currently, though that may change at some point.

Cheers,
Rune

>
> Many thanks,
> Malcolm
>
>
> Dr Malcolm Fairbrother
> School of Geographical Sciences
> University of Bristol
>

-- 
Rune H B Christensen, PhD
DTU Informatics, Section for Statistics
Technical University of Denmark, Build. 305, Room 122,
DK-2800 Kgs. Lyngby, Denmark
Phone: (+45) 45 25 33 63
Mobile: (+45) 30 26 45 54