[R-sig-ME] modified lmer2 code problem

Wed Feb 14 12:18:50 CET 2007

I meant to send this to the list too.

>>>> "Douglas Bates" <bates at stat.wisc.edu> 13/02/2007 18:54 >>>
>On 2/13/07, Colin Beale <c.beale at macaulay.ac.uk> wrote:
>
>> I'm trying to replicate some code I have that works in GenStat
>> (http://www.vsni.co.uk/products/genstat/) using a modification of
the
>> lmer2 function. This modification allows a matrix to be passed
directly
>> to the random effects, rather than being built up from the factor
>> levels. I've written a little wrapper that passes things to the
modified
>> version of lmer2 and called it lmerWrap (code for both the modified
>> lmer2 and lmerWrap functions are at the end). I've tested it by
passing
>> model matrices to the function and comparing this to the lmer2
output
>> and all works well. For example:

>> fm1 <- lmer(Reaction ~ 1 + (1|Days) + (1|Subject), data =
sleepstudy)
>> fm2 <- lmerWrap(Reaction ~ 1 + (1|Days) , data = sleepstudy, mat =
>> model.matrix ( ~ -1 + sleepstudy$Subject ))
>> give apparently identical parameter estimates, as they should do.
I've
>> then added a small random number to model.matrix(~ -1 +
>> sleepstudy$Subject ):

>lmer or lmer2?  You mentioned lmer2 above but seem to be calling lmer
here

Sorry, the modified code adapts lmer2, so:

 fm1 <- lmer2(Reaction ~ 1 + (1|Days) + (1|Subject), data =
sleepstudy)

is indeed the appropriate model to fit - it makes no practical
difference though (identical parameter estimates).

>> mat2 <- model.matrix ( ~ -1 + sleepstudy$Subject ) + runif (length
>> (prod (dim (model.matrix ( ~ -1 + sleepstudy$Subject)))))
>>
>> and fitted a third model:
>>
>> fm3 <- lmerWrap (Reaction ~ 1 + (1|Days), data = sleepstudy,
>>        mat = mat2)
>>
>> I export this "smudged" dataset to GenStat and run the code I have
>> there and I get results identical to 4 significant figures, so all
seems
>> to be working well.
>>
>> However, I then try to run the code using the real dataset for which
I
>> want to replicate the analysis, and I get completely different
results
>> to that when run in GenStat, as the variance is estimated to be 0 in
R:
>> here's my code and output (if anyone wants the data too let me
know).
>>
>> try2 <- lmerWrap(BirthWeight ~ m_sum + Motherrs + MotherAge
>>          + MotherAge2 + Sex + birthday +  (1|Mother)
>>          + (1|BirthYear), data = deer, mat = rand_mat, control =
>> list(msVerbose = T,
>>          niterEM = 0, gradient = FALSE))
>>   0      3564.56: 0.916339 0.235294 0.436794
>>   1      3563.77: 0.823463 0.263738 0.435422
>>   2      3563.44: 0.838527 0.253850 0.434686
>>   3      3563.39: 0.844450 0.232247 0.406417
>>   4      3563.31: 0.852666 0.247116 0.405113
>>   5      3563.31: 0.853871 0.244576 0.403450
>>   6      3563.31: 0.852928 0.245936 0.400634
>>   7      3563.30: 0.854648 0.244316 0.394544
>>   8      3563.24: 0.871616 0.260905 0.198572
>>   9      3562.99: 0.859845 0.253159 0.00167238
>>  10      3562.96: 0.852482 0.245837 0.00164461
>>  11      3562.96: 0.853410 0.247334  0.00000
>>  12      3562.96: 0.853197 0.246962  0.00000
>>  13      3562.96: 0.853194 0.246967  0.00000

>> As I know the GenStat code recovers the correct parameters for
>> simulated data, and I get similar results in the sleepstudy example
>> above, I am at a loss as to what is happening here. Does anyone have
any
>> suggestions as to where I go from here? Is lmer struggling to find
>> sensible starting values or does anyone have any other suggestions?

>I'm not sure I understand the model.  As I see it there are two
>variance components in the model, (1|Mother) and (1|BirthYear) so I
>don't understand the 3 parameters in the optimization.

The purpose of the modified code is to be able to pass directly a
matrix to the lmer function as a random effect. In this model there are
two 'standard' variance components and one matrix I want to use directly
- rand_mat - hence three variance components. In the working examples I
gave, fm1 is the unmodified version of lmer2 with two variance
components ((1|Days) and (1|Subject)); fm2 has one 'standard' variance
component (1|Days) and one matrix to replace the Subject one in fm1
(given by model.matrix ( ~ -1 + sleepstudy$Subject)), thus the model
fitted by the adapted lmer2 code has two variance components and is (as
it should be) identical to fm1.

In my function lmerWrap, the code 'tricks' the modified lmer2 code into
fitting a model with one extra variance component that generates a
matrix of the right dimensions in Ztl, which is then switched with the
matrix I actually want to use.

Hope that makes it clearer,

Colin

>>> "Douglas Bates" <bates at stat.wisc.edu> 13/02/2007 18:54 >>>
On 2/13/07, Colin Beale <c.beale at macaulay.ac.uk> wrote:

> I'm trying to replicate some code I have that works in GenStat
> (http://www.vsni.co.uk/products/genstat/) using a modification of
the
> lmer2 function. This modification allows a matrix to be passed
directly
> to the random effects, rather than being built up from the factor
> levels. I've written a little wrapper that passes things to the
modified
> version of lmer2 and called it lmerWrap (code for both the modified
> lmer2 and lmerWrap functions are at the end). I've tested it by
passing
> model matrices to the function and comparing this to the lmer2
output
> and all works well. For example:

> fm1 <- lmer(Reaction ~ 1 + (1|Days) + (1|Subject), data =
sleepstudy)
> fm2 <- lmerWrap(Reaction ~ 1 + (1|Days) , data = sleepstudy, mat =
> model.matrix ( ~ -1 + sleepstudy$Subject ))
> give apparently identical parameter estimates, as they should do.
I've
> then added a small random number to model.matrix(~ -1 +
> sleepstudy$Subject ):

lmer or lmer2?  You mentioned lmer2 above but seem to be calling lmer
here

> mat2 <- model.matrix ( ~ -1 + sleepstudy$Subject ) + runif (length
> (prod (dim (model.matrix ( ~ -1 + sleepstudy$Subject)))))
>
> and fitted a third model:
>
> fm3 <- lmerWrap (Reaction ~ 1 + (1|Days), data = sleepstudy,
>        mat = mat2)
>
> I export this "smudged" dataset to GenStat and run the code I have
> there and I get results identical to 4 significant figures, so all
seems
> to be working well.
>
> However, I then try to run the code using the real dataset for which
I
> want to replicate the analysis, and I get completely different
results
> to that when run in GenStat, as the variance is estimated to be 0 in
R:
> here's my code and output (if anyone wants the data too let me
know).
>
> try2 <- lmerWrap(BirthWeight ~ m_sum + Motherrs + MotherAge
>          + MotherAge2 + Sex + birthday +  (1|Mother)
>          + (1|BirthYear), data = deer, mat = rand_mat, control =
> list(msVerbose = T,
>          niterEM = 0, gradient = FALSE))
>   0      3564.56: 0.916339 0.235294 0.436794
>   1      3563.77: 0.823463 0.263738 0.435422
>   2      3563.44: 0.838527 0.253850 0.434686
>   3      3563.39: 0.844450 0.232247 0.406417
>   4      3563.31: 0.852666 0.247116 0.405113
>   5      3563.31: 0.853871 0.244576 0.403450
>   6      3563.31: 0.852928 0.245936 0.400634
>   7      3563.30: 0.854648 0.244316 0.394544
>   8      3563.24: 0.871616 0.260905 0.198572
>   9      3562.99: 0.859845 0.253159 0.00167238
>  10      3562.96: 0.852482 0.245837 0.00164461
>  11      3562.96: 0.853410 0.247334  0.00000
>  12      3562.96: 0.853197 0.246962  0.00000
>  13      3562.96: 0.853194 0.246967  0.00000

> As I know the GenStat code recovers the correct parameters for
> simulated data, and I get similar results in the sleepstudy example
> above, I am at a loss as to what is happening here. Does anyone have
any
> suggestions as to where I go from here? Is lmer struggling to find
> sensible starting values or does anyone have any other suggestions?

I'm not sure I understand the model.  As I see it there are two
variance components in the model, (1|Mother) and (1|BirthYear) so I
don't understand the 3 parameters in the optimization.

> Thanks very much,
>
> Colin
>
>
> lmerWrap <- function (formula, data = NULL, mat, ...)
> {
>     colsums <- colSums(mat)
>     matAdd <- as.factor(rep(1:(dim(mat)[2]), length = dim(mat)[1]))
>     formula <- as.formula(formula)
>     formula <- update(formula, .~. +  (1|matAdd))
>     if (!is.null(data)) dat <- as.data.frame(cbind(data, matAdd))
>     smoothMat <- Matrix(mat, sparse = T)
>     run <- lmer2a(formula = formula, data = dat, matDimI = dim(mat),
>                  rand_matI = smoothMat, ...)
> }
>
>
> lmer2a <- function (formula, data, family = gaussian, method =
> c("REML",
>     "ML", "PQL", "Laplace", "AGQ"), control = list(), start = NULL,
>     subset, weights, na.action, offset, contrasts = NULL, model =
> TRUE,
>     rand_matI, ...)
> {
>     print(t(rand_matI))
>     method <- match.arg(method)
>     formula <- as.formula(formula)
>     if (length(formula) < 3)
>         stop("formula must be a two-sided formula")
>     cv <- do.call("lmerControl", control)
>     mc <- match.call()
>     fr <- lmerFrames(mc, formula, data, contrasts)
>     Y <- fr$Y
>     X <- fr$X
>     weights <- fr$weights
>     offset <- fr$offset
>     mf <- fr$mf
>     mt <- fr$mt
>     if (is.character(family))
>         family <- get(family, mode = "function", envir =
> parent.frame())
>     if (is.function(family))
>         family <- family()
>     if (is.null(family$family)) {
>         print(family)
>         stop("'family' not recognized")
>     }
>     fltype <- mkFltype(family)
>     FL <- lmerFactorList(formula, mf, fltype)
>     cnames <- with(FL, c(lapply(Ztl, rownames), list(.fixed =
> colnames(X))))
>     nc <- with(FL, sapply(Ztl, nrow))
>     Ztl <- with(FL, .Call(Ztl_sparse, fl, Ztl))
>     nFacs <- list(length(Ztl))                 #My insert
>     for (i in 1:length(Ztl)) {                  #My insert
>       nFacs[[i]] <- dim(Ztl[[i]])              #My insert
>     }                                         #My insert
>     id <- unlist(lapply (nFacs, all.equal, dim(t(rand_matI))))   
#My
> insert
>     id <- which(id == "TRUE")                                   #My
> insert
>     Ztl[[id]] <- t(rand_matI)                                  #My
> insert
>     Zt <- if (length(Ztl) == 1)
>         Ztl[[1]]
>     else do.call("rbind", Ztl)
>     fl <- FL$fl
>     if (fltype < 0) {
>         mer <- .Call(mer2_create, fl, Zt, t(X), as.double(Y),
>             method == "REML", nc, cnames, fr$offset, fr$weights)
>         if (!is.null(start))
>             mer <- setST(mer, start)
>         const <- unlist(lapply(mer at nc, function(n) rep(1:0, c(n,
>             (n * (n - 1))/2))))
>         optimRes <- nlminb(.Call(mer2_getPars, mer), function(x)
> .Call(mer2_deviance,
>             .Call(mer2_setPars, mer, x), as.integer(0)), lower =
> ifelse(const,
>             0, -Inf), control = list(trace = cv$msVerbose, iter.max
=
> cv$msMaxIter))
>         if (optimRes$convergence)
>             warning(paste("nlminb failed to converge:",
> optimRes$message))
>         .Call(mer2_setPars, mer, optimRes$par)
>         .Call(mer2_update_effects, mer)
>         return(new("lmer2", mer, frame = if (model) fr$mf else
> data.frame(),
>             terms = mt, call = mc))
>     }
>     mer
> }
> environment(lmer2a) <- environment(lmer2)
>
>
> --
> Please note that the views expressed in this e-mail are those of the
> sender and do not necessarily represent the views of the Macaulay
> Institute. This email and any attachments are confidential and are
> intended solely for the use of the recipient(s) to whom they are
> addressed. If you are not the intended recipient, you should not
read,
> copy, disclose or rely on any information contained in this e-mail,
and
> we would ask you to contact the sender immediately and delete the
email
> from your system. Thank you.
> Macaulay Institute and Associated Companies, Macaulay Drive,
> Craigiebuckler, Aberdeen, AB15 8QH.
>
>
>         [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models 
>

-- 
Please note that the views expressed in this e-mail are those of the
sender and do not necessarily represent the views of the Macaulay
Institute. This email and any attachments are confidential and are
intended solely for the use of the recipient(s) to whom they are
addressed. If you are not the intended recipient, you should not read,
copy, disclose or rely on any information contained in this e-mail, and
we would ask you to contact the sender immediately and delete the email
from your system. Thank you.
Macaulay Institute and Associated Companies, Macaulay Drive,
Craigiebuckler, Aberdeen, AB15 8QH.