[R-sig-ME] Fw: Nested subject-longitudinal logit design

Sat May 5 16:50:05 CEST 2012

HI Ben,

Thanks for helping me.
I posted this 2 days back.  Probably, you haven't seen this.  Any ideas on how to fix the convergence issue.  The parameter estimates are now half of the all the other models.  I used verbose=TRUE in the statement.  Is it because of the unbalanced data?  For some of the treatment combinations, there are 25 replications, while for some there are only 10 replications. 

----- Forwarded Message -----
From: arun <smartpink111 at yahoo.com>
To: Ben Bolker <bbolker at gmail.com>
Cc: R mixed models <r-sig-mixed-models at r-project.org>
Sent: Thursday, May 3, 2012 5:11 PM
Subject: Re: [R-sig-ME] Nested subject-longitudinal logit design

Hi Ben,

I tried to run the model,

(fm<-lmer(Response~Wavelength*Start_Resp*time+(1|resid)+(1+time|Subject)+(1|RepNest), family=binomial,data=BehavdatOrig2))

I got the result, but at the end there was a warning sign 

Warning message:
In mer_finalize(ans) : false convergence (8)

  Data: BehavdatOrig2 
  AIC  BIC logLik deviance
 1357 1452 -661.7     1323
Random effects:
 Groups  Name        Variance Std.Dev. Corr   
 resid   (Intercept) 0.094236 0.30698         
 RepNest (Intercept) 2.859728 1.69107         
 Subject (Intercept) 1.921401 1.38615         
         time        0.172905 0.41582  -0.724 
Number of obs: 1960, groups: resid, 1960; RepNest, 98; Subject, 98

Then, I tried the uncorrelated model.
 (fma<-lmer(Response~Wavelength*Start_Resp*time+(1|resid)+(1|RepNest)+(0+time|Subject), family=binomial,data=BehavdatOrig2))
There were no warnings.  

 AIC  BIC logLik deviance
 1333 1417 -651.6     1303
Random effects:
 Groups  Name        Variance   Std.Dev.  
 resid   (Intercept) 8.4733e-12 2.9109e-06
 Subject time        1.0964e-01 3.3112e-01
 RepNest (Intercept) 1.1732e+01 3.4252e+00

I compared the two models:

> anova(fma, fm)
Data: BehavdatOrig2
Models:
fma: Response ~ Wavelength * Start_Resp * time + (1 | resid) + (1 | 
fma:     RepNest) + (0 + time | Subject)
fm: Response ~ Wavelength * Start_Resp * time + (1 | resid) + (1 + 
fm:     time | Subject) + (1 | RepNest)
    Df    AIC    BIC  logLik Chisq Chi Df Pr(>Chisq)
fma 15 1333.2 1416.9 -651.57                        
fm  17 1357.4 1452.2 -661.69     0      2          1

The results implies I should select the uncorrelated model over the correlated one (-0.724).  It is like the one I described in one of my previous posts.

I am also worried about the false convergence in my correlated model.  This was not observed when I run my previous correlated model 
BehavdatOrig <- within(BehavdatOrig, Subject <- factor(Treatment:Rep)) 
((fm<-lmer(Response~Wavelength*Start_Resp*time+(1|resid)+(1+time|Subject), 
     family=binomial,data=BehavdatOrig)  

The correlated model was selected with model comparison (P< 0.0006031 ***).

Thanking you,
A.K.

----- Original Message -----
From: Ben Bolker <bbolker at gmail.com>
To: r-sig-mixed-models at r-project.org
Cc: 
Sent: Thursday, May 3, 2012 2:44 PM
Subject: Re: [R-sig-ME] Nested subject-longitudinal logit design

arun <smartpink111 at ...> writes:

> I posted recently regarding inputs for a longitudinal logit model
> using lme4.  

[snip]

> ... I am measuring the binary response (1- insect present in light
> area, 0- insect present in dark area) as dependent variable, with
> fixed effects of Wavelength (3 levels) of light applied on insect in
> a petridish (half-covered with aluminium foil) for a period starting
> from 1 min to 20 min.   Depending on the starting response (animal
> present in dark or light zone of petridish at 0 min), there is
> another factor (Start_Resp - 2 levels - L starting in light zone, D-
> starting in dark zone).  It was really important to introduce this
> factor, as the response is drastically different in both levels of
> the factor for each of the wavelengths.   Since, we are measuring
> the response every 1 min (0 or 1), for 20 min, time is also a factor
> with 20 levels or a covariate.   In the present model, I introduce
> time as a covariate and extracted deviations for individual
> measurements as a random effect in "resid".

> BehavdatOrig$resid<-as.factor(1:dim(BehavdatOrig)[1])

  I would say that factor(1:nrow(BehavdatOrig)) is slightly
more readable, but OK

[A] (fm<-lmer(Response~Wavelength*Start_Resp*time+(1|resid)+(1+time|Subject), 
     family=binomial,data=BehavdatOrig)

This seems reasonable

> If Wavelength, starting response are fixed effects and time as
> covariate, I think I can have an interaction term as
> Wavelength*Start_Resp*time.  But, subjects are not repeated for the
> experiment.  

I'm guessing this means that each individual is only measured
in a single level of Wavelength*Start_Resp ... ?

> If I have column called "Rep" (replication) in the
> dataset for each treatment (another column - Treatment = Combination
> of Wavelength*Strain*Start_Resp) does it make sense to introduce
> Subject as nested within Treatment using the 

> BehavdatOrig <- within(BehavdatOrig, Subject <- factor(Treatment:Rep)) 

How is Subject coded?  i.e. is it coded 1..n_i for each Treatment:Rep
combination (explicit nesting), or is it coded 1..N for the entire
data set (implicit nesting)?

Similarly, how is Rep coded?

Assuming for the moment that Subject is implicitly nested and Rep is
explicitly nested, and that there is more than one Subject per Rep
(and that Rep is not crossed with Subject, i.e. each Subject is
measured only within a single Rep), then you should use something like

BehavdatOrig$RepNest <- with(BehavdatOrig,interaction(Treatment,Rep))

(1|resid) + (1+time|Subject) + (1|RepNest)

This allows for variation in intercept and slope across Subject, and
intercept (only) across RepNest.

  This specification would also be correct if Subject *were* crossed
with RepNest and numbered appropriately (i.e. 1..n for each level of
RepNest).  The only problem is if it is explicitly nested, in which
case you need (1+time|Subject:RepNest)

> It is an unbalanced dataset with respect to number of replications. 
> I made a mistake in my previous analysis as I used Subject as
> replicates which gave me completely different results.  The shrink
> fit model graph for each subject looks to have small deviations from
> the population.  In this scenario, should I have to replace this
> model with quasibinomial model.

  I don't know what the "shrink fit model graph" is ...

>  Quasibinomial model with cbind( sum of response for every 5 minute,
> and 5-response) with this new settings.  For one thing, there is no
> quasibinomial link function in lmer, still I used binomial link to
> get results.  Another problem with quasibinomial is that I am not
> able to get the shrink fit graph with quasibinomal response.  Should
> be a problem with lattice graph plots.  Any advice on this direction
> will be appreicated.

   Not sure what's going on here.  Are you using

glmmPQL(...,family="quasibinomial")

(which is the only way I know of to fit quasibinomial GLMMs in R?)

Your inclusion of the 'resid' random effect above should have
taken care of overdispersion.

# The above fm model provides output as:
# Formula: Response ~ Wavelength * Start_Resp * time + 
## (1 | resid) + (1 + time | Subject) 
#    Data: BehavdatOrig 
#   AIC  BIC logLik deviance
#  1323 1413 -645.7     1291
# Random effects:
#  Groups  Name        Variance   Std.Dev.   Corr   
#  resid   (Intercept) 8.3103e-12 2.8828e-06        
#  Subject (Intercept) 1.6026e+01 4.0033e+00        
#          time        1.2760e-01 3.5722e-01 -0.552 
# Number of obs: 1960, groups: resid, 1960; Subject, 98
# 
# I also tried uncorrelated model fma. 
# 
# Formula: Response ~ Wavelength * Start_Resp * time + 
## (1 | resid) + 
## (1 |     Subject) + (0 + time |
# Subject) 
#    Data: BehavdatOrig 
#   AIC  BIC logLik deviance
#  1333 1417 -651.6     1303
# Random effects:
#  Groups  Name        Variance   Std.Dev.  
#  resid   (Intercept) 6.9660e-14 2.6393e-07
#  Subject time        1.0964e-01 3.3112e-01
#  Subject (Intercept) 1.1732e+01 3.4252e+00
# et results:
# 
# Anova comparison favors the correlated model:
# > anova(fma,fm)
# Data: BehavdatOrig
# Models:
# fma: Response ~ Wavelength * Start_Resp * time + (1 | resid) + (1 | 
# fma:     Subject) + (0 + time | Subject)
# fm: Response ~ Wavelength * Start_Resp * time + (1 | resid) + (1 + 
# fm:     time | Subject)
#     Df    AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)    
# fma 15 1333.2 1416.9 -651.57                             
# fm  16 1323.4 1412.7 -645.69 11.767      1  0.0006031 ***

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models