[R-sig-ME] nested factor
Douglas Bates
bates at stat.wisc.edu
Thu Jan 27 23:48:09 CET 2011
On Wed, Jan 19, 2011 at 11:27 AM, DUYME Florent
<F.DUYME at arvalisinstitutduvegetal.fr> wrote:
> Hi,
>
> I would like to include a nested factor in a mixed model (using lmer):
>
> The data:
>
> my_data<-structure(list(Y = c(4L, 6L, 8L, 2L, 4L, 5L, 7L, 8L, 9L, 4L,
> 5L, 6L, 3L, 8L, 3L, 6L, 6L, 7L, 8L, 10L, 11L, 15L, 9L, 13L),
> A = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("a",
> "b", "c"), class = "factor"), B = structure(c(1L, 2L, 3L,
> 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L,
> 3L, 4L, 5L, 6L, 7L, 8L), .Label = c("b1", "b2", "b3", "b4",
> "b5", "b6", "b7", "b8"), class = "factor"), C = structure(c(1L,
> 12L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 2L, 3L, 4L, 5L, 6L,
> 7L, 8L, 9L, 10L, 11L, 13L, 14L, 15L, 16L, 17L), .Label = c("c1",
> "c10", "c11", "c12", "c13", "c14", "c15", "c16", "c17", "c18",
> "c19", "c2", "c20", "c21", "c22", "c23", "c24", "c3", "c4",
> "c5", "c6", "c7", "c8", "c9"), class = "factor"), fact = structure(c(1L,
> 1L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L,
> 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L), .Label = c("f1", "f2"), class = "factor")), .Names = c("Y",
> "A", "B", "C", "fact"), class = "data.frame", row.names = c(NA,
> -24L))
>
> A = countries
> B = locations in each country (b1 = 1st location, b2 = 2nd location, ...); that factor is nested within A
> C = another way of coding the locations (each location is different from all the others)
> fact = qualitative fixed factor
>
> the model:
>
> res_mm<-lmer(Y~fact+(1|C%in%A),my_data)
>
> the results:
>
> ranef(res_mm) gives
>
> $`C %in% A`
>       (Intercept)
> FALSE 5.211116e-15
>
> and ranef(lmer(Y~fact+(1|B%in%A),my_data)) gives
>
> $`B %in% A`
>       (Intercept)
> FALSE 5.211116e-15
>
> I don't understand the "FALSE" in the result. Is there something wrong with the coding or in the model?
Well, two things. Because you have a factor C with a distinct level for
each location, you don't need to do anything special to indicate that
the factors are nested. Nesting can be, and is, determined from the
data as long as you don't use "implicit nesting". So you would use the
formula

    Y ~ fact + (1|A) + (1|C)

The factor B is the implicitly nested factor: its labels b1, ..., b8 are
reused in every country. You could still use it if you specified the
model as

    Y ~ fact + (1|A) + (1|A:B)

There is no use of the %in% specification in lme4.
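
A quick check in base R may help here. It is only a sketch: the factors are rebuilt with rep() and paste0() rather than taken from the dput() output (on the assumption that they match my_data row for row), and in_A and AB are names chosen for illustration. It suggests why the grouping factor ended up with the single level FALSE, since %in% does ordinary value matching, and that A:B recovers the same 24 groups as C.

```r
# Rebuild the three grouping factors from the original post
# (assumed to match my_data row for row).
A <- factor(rep(c("a", "b", "c"), each = 8))    # countries
B <- factor(rep(paste0("b", 1:8), times = 3))   # location labels reused in each country
C <- factor(paste0("c", 1:24))                  # one unique label per location

# %in% is value matching: no label of C (or of B) equals a label of A,
# so every row is FALSE and the resulting "grouping factor" has one level.
in_A <- C %in% A
levels(factor(in_A))

# A:B crosses the labels of A and B, giving one level per
# (country, location) pair.
AB <- A:B
nlevels(AB)

# Each level of A:B pairs with exactly one level of C, so (1|A:B) and
# (1|C) describe the same grouping.
nrow(unique(data.frame(AB, C)))
```

Both counts at the end should equal the number of rows, 24, which is the situation taken up next.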
However, neither of these specifications will produce a fitted model
because you have 24 observations and 24 levels of factor C. Hence the
random effect for factor C is confounded with the residual error term.
So you either need more than one observation per location or you
simplify the model to

    Y ~ fact + (1|A)
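
The confounding can be seen by counting observations per level, sketched below in base R. The data frame d is a hand-rebuilt copy of the relevant columns of my_data (an assumption; the dput() output in the question is authoritative, and Y is stored as numeric here rather than integer). The lmer() call at the end is attempted only if lme4 is installed.

```r
# Hand-rebuilt copy of the posted data (assumed equivalent to my_data).
d <- data.frame(
  Y = c(4, 6, 8, 2, 4, 5, 7, 8, 9, 4, 5, 6,
        3, 8, 3, 6, 6, 7, 8, 10, 11, 15, 9, 13),
  A = factor(rep(c("a", "b", "c"), each = 8)),
  C = factor(paste0("c", 1:24)),
  fact = factor(c("f1", "f1", "f2", "f2", "f2", "f2", "f1", "f1",
                  "f2", "f2", "f2", "f1", "f2", "f2", "f1", "f2",
                  "f1", "f2", "f2", "f1", "f2", "f2", "f1", "f2"))
)

# One observation per level of C: a random effect for C cannot be
# separated from the residual error.
obs_per_C <- table(d$C)
c(n_obs = nrow(d), n_levels_C = nlevels(d$C), max_per_level = max(obs_per_C))

# Several observations per country, so (1|A) can be estimated.
obs_per_A <- table(d$A)
obs_per_A

# Fit the simplified model only when lme4 is available.
if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::lmer(Y ~ fact + (1 | A), data = d)
  print(lme4::ranef(fit))
}
```

With only three countries, the variance component for A rests on three levels, so its estimate should be read with some caution.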