[R-sig-ME] [R] lmer syntax, matrix of (grouped) covariates?

Martin Maechler maechler at stat.math.ethz.ch
Tue Aug 19 08:59:58 CEST 2008


>>>>> "AM" == Aaron Mackey <ajmackey at gmail.com>
>>>>>     on Mon, 18 Aug 2008 15:41:20 -0400 writes:

    AM> Dear Dr. Bates, Thanks for the info about the SIG on
    AM> mixed models.

    AM> A followup question: if I did not want a generalized
    AM> variance-covariance matrix, and instead wanted to assume
    AM> independence between grouped covariates, then instead of
    AM> (3n-n^2)/2 variance-covariance parameters per grouping
    AM> (of size n), I'd have only n variance parameters, yes?
    AM> Would that be specified (x1|fac) + (x2|fac) instead of
    AM> (x1 + x2 | fac)?

yes.

    AM> And either way I formulate it, my syntax question
    AM> remains: is there a way to write the formula that is
    AM> generic for a given covariate matrix of unspecified
    AM> dimension?  Said another way, how would you write a lmer
    AM> "wrapper" function that was given a matrix of
    AM> covariates?  How would you "unpack" the columns of that
    AM> matrix to generate the formula required for lmer()?
    AM> Would you build up a formula string programmatically,
    AM> and then "eval" it?

yes / "kind of" :

I would build the formula programmatically
but then use  as.formula(.)  or possibly in some cases  parse().

Package 'sfsmisc' has a utility  wrapFormula() that I had need
for at some time:

----------------------------------------

install.packages("sfsmisc")
library(sfsmisc)

myF <- wrapFormula(Fertility ~ . , data = swiss)
myF
## Fertility ~ s(Agriculture) + s(Examination) + s(Education) + 
##             s(Catholic) + s(Infant.Mortality)

------------------------------------------

and you can look at the short definition of  wrapFormula
to see how I use gsub(.) on character strings to construct the
new formula.

Martin Maechler, ETH Zurich


    AM> Thanks again,

    AM> -Aaron

    AM> On Mon, Aug 18, 2008 at 3:21 PM, Douglas Bates
    AM> <bates at stat.wisc.edu> wrote:

    >> On Mon, Aug 18, 2008 at 11:20 AM, Aaron Mackey
    >> <ajmackey at gmail.com> wrote: > I have a fairly large
    >> model:
    >> 
    >> >> length(Y) > [1] 3051 >> dim(covariates) > [1] 3051 211
    >> 
    >> > All of these 211 covariates need to be nested
    >> hierarchically within a > grouping "class", of which
    >> there are 8.  I have an accessory vector, " > cov2class"
    >> that specifies the mapping between covariates and the 8
    >> classes.
    >> 
    >> > Now, I understand I can break all this information up
    >> into individual > vectors (cov1, cov2, ..., cov211,
    >> class1, class2, ..., class8), and do > something like
    >> this:
    >> 
    >> > model <- lmer(Y ~ 1 + cov1 + cov2 + ... + cov211 + >
    >> (cov1 + cov2 + ... | class1) + > (...) + > (... + cov210
    >> + cov211 | class8)
    >> 
    >> > But I'd like keep things syntactically simpler, and use
    >> the covariates > and cov2class > variables directly.  I
    >> haven't been able to find the right syntactic sugar > to
    >> get this done.
    >> 
    >> Before considering the syntax you should consider the
    >> complexity of the model.  Even with 3000+ observations
    >> estimating fixed effects for over 200 covariates is a
    >> risky business.  To follow that up by incorporating
    >> dozens of covariates in a vector-valued random effects
    >> term with a general variance covariance matrix is not
    >> likely to be productive.  When estimating variances and
    >> covariances of random effects the complexity of the
    >> estimation problem increases according to the square of
    >> the dimension of each random-effects vector.  For a term
    >> of the form (1|fac) there is one variance-covariance
    >> parameter to estimate.  For a term of the form (1+x|fac)
    >> there are 3 such parameters to estimate, (1+x1+x2|fac)
    >> requires 6, etc.  Assuming that your 211 covariates are
    >> more-or-less equally distributed among the 8 classes you
    >> would have about 27 per class which means that you have
    >> 378 variance-covariance parameters to estimate for each
    >> of your 8 classes.  Even with 3000+ observations it would
    >> be optimistic to expect to estimate that many
    >> variance-covariance parameters.
    >> 
    >> You may want to move your question to the
    >> R-SIG-Mixed-Models mailing list and follow up on the
    >> model specification there.
    >> 

	[[alternative HTML version deleted]]

    AM> _______________________________________________
    AM> R-sig-mixed-models at r-project.org mailing list
    AM> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models




More information about the R-sig-mixed-models mailing list