[R] Syntax for a three-level logistic model

Diaz-Escamilla, Rafael E rdiaz at saclink.csus.edu
Thu Aug 25 20:56:35 CEST 2011


Dear People at R help, I am trying to figure out the syntax for a three-level logistic model with a single random effect (intercept):

Data Collected
My data consist of three levels: level 1 is four setting for each student (setting nested within student), and each student is registered in one of 14 universities (students nested within university). More detailed:

A. 2,479 students who have a dichotomous outcome, engaged in risk behavior (yes or no) in EACH of 4 different settings.  Then I am using these settings as the level-1 units nested within the students, and the students as level-2 units.  I also have three dichotomous covariates for each student: age (less than 21 and 21 and older), and gender (male and female), and weather the student got drunk in each of the four instances (yes and no).  However, I am considering this last covariate “drunk”  as a level 1 covariate since the values for each individual student may change in each setting.
B. The students, however, are also nested within 14 universities (Level-3 units)

Data File
My data file consists of the following columns:
V1 is student ID (each student id appears four times to account for the outcome of each of the four settings).  Student ID’s range from 1001 to 3,479 (total of 2,479)
V2 and V3 contain the age and gender of each student (each of these values repeats 4 times, same as the student ID) 
V4, V5, and V6 contain the dummy coding of the 4 settings, say, as:
                      V1      V2   V3        V4   V5  V6 …
                   1001       0     1        0      0      0  
                   1001       0     1         0      0     1
                   1001       0     1         0      1     0
                   1001       0     1         1      0     0
                      .           .         .      .
                    Etc.

V7 contains the values of the covariate “drunk”  (whether the student was or not drunk during in a setting)
V8 contains the ID of the university where the student is enrolled.  Universities are coded with the numbers 11 – 24 (14 universities total).
V9 contains the dichotomous response variable: the outcome in each setting for each student.

Model (random intercept only):
Logit(p|G00) = P0 + P1(V4) + P2(V5) + P3(V6) + P4(V7)     (Level 1, setting and whether “drunk” in setting)
                  P0 =  B00 + B01(V2) + B02(V3)                           (Level 2, age and gender of student )
                  B00 = G00                                                             (Level 3, university – random effect)

Questions:
a) How to set the level 2 covariates (V2, V3) nested within student (V1) in the model below, is the nesting of students within universities correct as (1|V8) for the random intercept?  
model <- lmer(V9 ~ V4 + V5 + V6  + ….+ (1|V8) ???, data=datos, family = "binomial")
or should I use “glmPQL”
b) Do I need to change any coding in the column(s) of the data file?

Any help would be greatly appreciated. Sincerely,

Rafael Diaz
Assistant Professor
Dept of Mathematics and Statistics
California State University Sacramento
   



More information about the R-help mailing list