[R-sig-ME] Differing variable lengths (missing data) and Model errors in lmer()
Jerome Goudet
jerome.goudet at unil.ch
Thu Jul 9 21:21:57 CEST 2009
Dear Aditi,
I'm puzzled with the nesting of loci within families (the presence of an
allele at a locus means the same whatever the family, I would think). I
would treat loci as fixed effect and hence use something like that:
(fm1<-lmer(peg.no~P1L55+P1L73+P1L74+P1L77+P1L91+P1L96+P1L98+P1L100+P1L114+P1L118+(1|family),dat))
Linear mixed model fit by REML
Formula: peg.no ~ P1L55 + P1L73 + P1L74 + P1L77 + P1L91 + P1L96 + P1L98
+ P1L100 + P1L114 + P1L118 + (1 | family)
Data: dat
AIC BIC logLik deviance REMLdev
2954 3006 -1464 2963 2928
Random effects:
Groups Name Variance Std.Dev.
family (Intercept) 89.043 9.4363
Residual 91.377 9.5592
Number of obs: 390, groups: family, 57
Fixed effects:
Estimate Std. Error t value
(Intercept) 96.2793 3.2169 29.929
P1L55 -2.0185 1.5831 -1.275
P1L73 -3.0256 1.8900 -1.601
P1L74 1.5112 1.6189 0.933
P1L77 -2.7898 2.3529 -1.186
P1L91 0.6480 3.3118 0.196
P1L96 -1.4718 1.4938 -0.985
P1L98 2.6431 2.6401 1.001
P1L100 -3.4529 2.0842 -1.657
P1L114 0.8099 1.9218 0.421
P1L118 -2.7119 2.3635 -1.147
But perhaps I missed something?
Best wishes,
A Singh wrote:
> Dear All,
>
> I am trying to run a nested random effects model in lmer (for R 2.9.1,
> lme4 version 0.999375-31 ) using data which is structured as follows:
>
> family offsp. P1L74 P1L77 P1L91 P1L96..(n=426) peg.no ec.length
> syll.length
> 1 2 1 0 0 0 86
> 5.445 2.479
> 1 3 1 0 0 0 91
> 5.215 2.356
> 1 4 0 0 0 0 79
> 5.682 2.896
> 1 5 1 0 0 0 83
> 5.149 2.641
> 1 6 0 0 0 0 77
> 5.044 2.288
> 1 7 1 0 0 0 78
> 5.450 2.420
> 1 8 1 0 1 1 82
> 5.377 2.505
> 1 9 1 0 0 0 95
> 5.389 2.706
> 1 10 1 0 0 0 88
> 5.354 2.461
> 1 11 1 0 0 0 87
> 5.262 3.079
> 1 12 1 0 0 0 84
> 5.191 2.858
> 1 13 1 0 0 1 87
> 5.194 2.264
> 2 23 1 0 0 1 116
> 5.863 2.876
> 2 24 1 0 0 0 122
> 5.475 3.114
> 2 25 1 0 0 0 110
> 5.563 3.059
> . . . . . . . . .
> . . . . . . . . .
> .
> .
> (60 families)
>
> 'Family' is the first Random effect (categorical variable), with 60
> levels.
>
> All columns labeled P1L(x) are a matrix of presence/absence genetic
> markers for each individual in each family. There are 426 such
> columns(not numbered in sequence) and each one is a random effect.
>
> The last three columns (peg.no, ec.length and syll.length) are the
> three dependent variables.
>
> Each genetic marker column needs to be nested within each family,
> which means that if I take the first phenotype, peg.no, for example,
> then I need to run an analysis that partitions variance 60*426 times
> (~25,00 runs) for that phenotype.
> I also need an output with the Anova table, and summary, at each stage
> of the run, so that I can get P values for each marker for each
> family, as a way of determining whether it contributes significantly
> to explaining within-family variance for that trait.
>
> To do the above, I tried to write code for lmer() using two nested
> 'for' loops, one for each level of random factor nesting (marker
> within family) as follows (using a test data set [please find
> attached] with only the first 10 marker columns, to see if this works):
>
>
>> vc<-read.table("P:\\R\\Testvcomp10.txt",header=T)
>> attach(vc)
>
>> family<-factor(family)
>> colms<-(vc)[,4:13] ## this to assign the 10 columns containing marker
> data to a new variable, as column names are themselves not in any
> recognisable sequence
>
>> vcdf<-data.frame(family,peg.no,ec.length,syll.length,colms)
>> library(lme4)
>
>> for (c in levels(family))
> + { for (i in 1:length(colms))
> + { fit<-lmer(peg.no~1 + (1|c/i), vcdf)
> + }
> + summ<-summary(fit)
> + av<-anova(fit)
> + print(summ)
> + print(av)
> + }
>
> This gives me:
>
> Error in model.frame.default(data = vcdf, formula = peg.no ~ 1 + (1 + :
> variable lengths differ (found for 'c')
>
>
>
> On suggestion from a colleague I reframed this as:
>
>
>> for(c in levels(family))
> + {
> + print("----New C:----")
> + print(c)
> + for (i in 1:length(colms))
> + {
> + fit<-lmer(peg.no~1+ (1|c/i),vcdf)
> + print(i)
> + summ<-summary(fit)
> + av<-anova(fit)
> + print(summ)
> + print(av)
> + }
> + }
>
> ..and this gave me the output:
>
> [1] "----New C:----"
> [1] "1"
> Error in model.frame.default(data = vcdf, formula = peg.no ~ 1 + (1 + :
> variable lengths differ (found for 'c')
>
>
> Google-ing the error message has led me to plenty of links that
> suggest forcing data into a data frame to fix this, but that hasn't
> worked.
> My markers and phenotypes both have plenty of missing data (NA's in
> the data.frame), and na.action=na.omit isn't solving the problem.
>
> (I have tried this with lme, and tried to do it with the aov() command
> as well and the error is pretty much the same).
>
> I am completely new to R, and despite searching and trying various
> things, can't get the code to work.
>
> I really appreciate any corrections to this code, or alternative
> command/function suggestions that I can look into, to try to do this
> again.
>
>
> Thanks a bunch for your help,
>
> Aditi
>
>
> ----------------------
> A Singh
> Aditi.Singh at bristol.ac.uk
> School of Biological Sciences
> University of Bristol
>
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
--
Jérôme Goudet
Dept. Ecology & Evolution
UNIL-Sorge, CH-1015 Lausanne
mail:jerome.goudet at unil.ch
Tel:+41 (0)21 692 4242
Fax:+41 (0)21 692 4265
More information about the R-sig-mixed-models
mailing list