[R-sig-ME] data layout for crossed factors w/interaction in linear mix models

Douglas Bates bates at stat.wisc.edu
Sat Aug 22 15:45:05 CEST 2009


On Fri, Aug 21, 2009 at 4:59 PM, Rafael Diaz<tuteson at yahoo.com> wrote:
> Please find attached data files "datos1" and "datos2" containing the data layout as described in 1) and 2), respectively, in original email below.

Thank you for sending the data.  I should have mentioned that only the
first form, "datos1", is needed.

> Note: when opening "datos2" with Notepad, the columns get scrambled (this doesn't happen with "datos1"; however, both files open nicely in Wordpad.

Thanks for the warning.  It happens that I never use Windows so it is
not an issue for me.

I did not have any problem with fitting the model.  I enclose a
modified form of the data as a csv file (suitable for reading with
read.csv) and a transcript of fitting the model.

> Thank you very much again,
>
> Rafael Diaz
>
> --- On Fri, 8/21/09, Douglas Bates <bates at stat.wisc.edu> wrote:
>
>> From: Douglas Bates <bates at stat.wisc.edu>
>> Subject: Re: [R-sig-ME] data layout for crossed factors w/interaction in  linear mix models
>> To: "Rafael Diaz" <tuteson at yahoo.com>
>> Cc: r-sig-mixed-models at r-project.org
>> Date: Friday, August 21, 2009, 2:01 PM
>> On Fri, Aug 21, 2009 at 2:22 PM,
>> Rafael Diaz<tuteson at yahoo.com>
>> wrote:
>> > Dear All,
>>
>> > I am trying to fit a simple linear mixed model (see
>> below
>> > this paragraph) arising from a crossed factorial
>> design with
>> > 2 factors and ubalanced number of replicates (from two
>> to
>> > five) in each cell, but I keep getting an error
>> message (see
>> > bottom of message).  The model is:
>>
>> > yijk = intercept + ai + bj + abij + ejik, where:
>>
>> > "intercept" is fixed, and the crosss factors, ai, i =
>> > 1,..,10, and bj, j= 1,..,10, are random.  I am
>> > interested in estimating the variance components of
>> these
>> > factors AND their interaction.  I have tried:
>>
>> > fm1 <- lmer(formula = V1~1 + (1|V2) + (1|V3) +
>> (1|V4),
>> > data = 'datos') using two types of data layout for "
>>
>> We will need more information before we can help you.
>> It is best if
>> you can make the data available in some form.
>> Otherwise, please
>> include the results of
>>
>> library(lme4)
>> sessionInfo()
>> str(datos)
>> summary(datos)
>> fm1 <- lmer(V1 ~ 1 + (1|V2) + (1|V3) + (1|V4), datos)
>>
>> and, if the error still occurs,
>>
>> traceback()
>>
>> > 1) using a matrix with 3 columns:
>> >
>> > y     intercept   ai's  bj's  abij's
>> > y111  1           1     1     1 (1x1)
>> > y112  1           1     1     "
>> > y121  1           1     2     2 (1x2)
>> > y122  1           1     2     "
>> > y123  1           1     2     "
>> > y131  1           1     3     3 (1x3)
>> > .     .           .     .     .
>> > .     .           .     .     .
>> >
>> >
>> >
>> > 2) using the design matrix from  Y = XBeta +Zb.
>> > That is, using the same first two columns as above,
>> but
>> > substituting 120 columns (10 for ai's, 10 for bj's
>> and 100
>> > for abij's) for the last three columns.
>> >
>> > I get the message: "Error in eval(predvars, data, env)
>> :
>> > invalid envir argument"
>> >
>> > Is my data layout mispecified? Do I need to input
>> initial
>> > values for the random components in order to get the
>> REML
>> > estimates?  I lmer valid for unbalanced designs?
>> Any help would be greatly appreciated.
>> >
>> > Rafael Diaz
>> > California State University Sacramento
>> > Math and Stats
>> >
>> >>
>> >>
>> >>
>> >
>> >
>> >
>> >
>> >
>> > _______________________________________________
>> > R-sig-mixed-models at r-project.org
>> mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>> >
>>
>
>
>
-------------- next part --------------
options(show.signif.stars = FALSE)
library(lme4)
sessionInfo()
datos <- within(read.table("~/Desktop/datos1.txt",
                           col.names = c("y", "one", "A", "B", "AB")),
            {
                A <- factor(A)
                B <- factor(B)
                AB <- factor(A:B)
            })
str(datos)
summary(datos)
datos$one <- NULL
xtabs(~ A + B, datos)
print(fm1 <- lmer(y ~ (1|A) + (1|B) + (1|AB), datos))
-------------- next part --------------

R version 2.9.1 (2009-06-26)
Copyright (C) 2009 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> options(show.signif.stars = FALSE)
> library(lme4)
Loading required package: Matrix
Loading required package: lattice

Attaching package: 'Matrix'


	The following object(s) are masked from package:stats :

	 contr.helmert,
	 contr.poly,
	 contr.SAS,
	 contr.sum,
	 contr.treatment,
	 xtabs 


	The following object(s) are masked from package:base :

	 rcond 

> sessionInfo()
R version 2.9.1 (2009-06-26) 
i486-pc-linux-gnu 

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] lme4_0.999375-32   Matrix_0.999375-30 lattice_0.17-25   

loaded via a namespace (and not attached):
[1] grid_2.9.1
> datos <- within(read.table("~/Desktop/datos1.txt",
+                            col.names = c("y", "one", "A", "B", "AB")),
+             {
+                 A <- factor(A)
+                 B <- factor(B)
+                 AB <- factor(A:B)
+             })
> str(datos)
'data.frame':	301 obs. of  5 variables:
 $ y  : num  -0.324 -0.468 -0.152 -1.341 -0.79 ...
 $ one: num  1 1 1 1 1 1 1 1 1 1 ...
 $ A  : Factor w/ 10 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ B  : Factor w/ 10 levels "1","2","3","4",..: 1 1 1 2 2 3 3 3 4 4 ...
 $ AB : Factor w/ 100 levels "1:1","1:2","1:3",..: 1 1 1 2 2 3 3 3 4 4 ...
> summary(datos)
       y                one          A             B             AB     
 Min.   :-5.2603   Min.   :1   2      : 35   5      : 34   2:1    :  7  
 1st Qu.:-1.1330   1st Qu.:1   3      : 35   1      : 33   3:10   :  6  
 Median : 0.2585   Median :1   4      : 32   7      : 32   4:7    :  6  
 Mean   : 0.4048   Mean   :1   8      : 32   8      : 32   6:4    :  5  
 3rd Qu.: 1.9831   3rd Qu.:1   7      : 31   4      : 30   7:5    :  5  
 Max.   : 5.9446   Max.   :1   9      : 31   10     : 30   9:5    :  5  
                               (Other):105   (Other):110   (Other):267  
> datos$one <- NULL
> xtabs(~ A + B, datos)
    B
A    1 2 3 4 5 6 7 8 9 10
  1  3 2 3 3 4 2 3 2 2  3
  2  7 2 2 3 3 4 4 3 3  4
  3  3 2 4 2 4 3 3 4 4  6
  4  3 3 2 3 2 3 6 4 4  2
  5  2 3 3 3 2 4 4 2 2  2
  6  2 2 4 5 2 2 2 2 3  2
  7  3 2 2 4 5 2 3 4 2  4
  8  2 3 4 2 4 4 3 4 3  3
  9  4 3 3 3 5 2 2 5 2  2
  10 4 2 2 2 3 3 2 2 3  2
> print(fm1 <- lmer(y ~ (1|A) + (1|B) + (1|AB), datos))
Linear mixed model fit by REML 
Formula: y ~ (1 | A) + (1 | B) + (1 | AB) 
   Data: datos 
  AIC  BIC logLik deviance REMLdev
 1039 1058 -514.7     1030    1029
Random effects:
 Groups   Name        Variance Std.Dev.
 AB       (Intercept) 1.56560  1.25124 
 B        (Intercept) 1.69080  1.30031 
 A        (Intercept) 0.57243  0.75659 
 Residual             0.87333  0.93452 
Number of obs: 301, groups: AB, 100; B, 10; A, 10

Fixed effects:
            Estimate Std. Error t value
(Intercept)   0.4284     0.4951  0.8653
> 
> proc.time()
   user  system elapsed 
 11.268   0.108  11.448 


More information about the R-sig-mixed-models mailing list