[R-sig-ME] Differing variable lengths (missing data) and Model errors in lmer()

Jerome Goudet jerome.goudet at unil.ch
Thu Jul 9 21:21:57 CEST 2009


Dear Aditi,

I'm puzzled with the nesting of loci within families (the presence of an 
allele at a locus means the same whatever the family, I would think). I 
would treat loci as fixed effect and hence use something like that:

 (fm1<-lmer(peg.no~P1L55+P1L73+P1L74+P1L77+P1L91+P1L96+P1L98+P1L100+P1L114+P1L118+(1|family),dat))
Linear mixed model fit by REML
Formula: peg.no ~ P1L55 + P1L73 + P1L74 + P1L77 + P1L91 + P1L96 + P1L98 
+      P1L100 + P1L114 + P1L118 + (1 | family)
   Data: dat
  AIC  BIC logLik deviance REMLdev
 2954 3006  -1464     2963    2928
Random effects:
 Groups   Name        Variance Std.Dev.
 family   (Intercept) 89.043   9.4363 
 Residual             91.377   9.5592 
Number of obs: 390, groups: family, 57

Fixed effects:
            Estimate Std. Error t value
(Intercept)  96.2793     3.2169  29.929
P1L55        -2.0185     1.5831  -1.275
P1L73        -3.0256     1.8900  -1.601
P1L74         1.5112     1.6189   0.933
P1L77        -2.7898     2.3529  -1.186
P1L91         0.6480     3.3118   0.196
P1L96        -1.4718     1.4938  -0.985
P1L98         2.6431     2.6401   1.001
P1L100       -3.4529     2.0842  -1.657
P1L114        0.8099     1.9218   0.421
P1L118       -2.7119     2.3635  -1.147


But perhaps I missed something?

Best wishes,

A Singh wrote:
> Dear All,
>
> I am trying to run a nested random effects model in lmer (for R 2.9.1, 
> lme4 version 0.999375-31 ) using data which is structured as follows:
>
> family offsp. P1L74 P1L77 P1L91 P1L96..(n=426) peg.no ec.length  
> syll.length
> 1        2      1      0      0      0                86    
> 5.445         2.479
> 1        3      1      0      0      0                91    
> 5.215         2.356
> 1        4      0      0      0      0                  79    
> 5.682         2.896
> 1        5      1      0      0      0               83    
> 5.149         2.641
> 1        6      0      0      0      0                77    
> 5.044         2.288
> 1        7      1      0      0      0                  78    
> 5.450         2.420
> 1        8      1      0      1      1                82    
> 5.377         2.505
> 1        9      1      0      0      0                95    
> 5.389         2.706
> 1       10      1      0      0      0                88    
> 5.354         2.461
> 1       11      1      0      0      0                87    
> 5.262         3.079   
> 1       12      1      0      0      0                84    
> 5.191         2.858
> 1       13      1      0      0      1                  87    
> 5.194         2.264
> 2       23      1      0      0      1                  116    
> 5.863         2.876
> 2       24      1      0      0      0                122    
> 5.475         3.114
> 2       25      1      0      0      0                110    
> 5.563         3.059
> .       .       .   .     .     .               .         .          .
> .       .       .   .     .     .               .         .          .
> .
> .
> (60 families)
>
> 'Family' is the first Random effect (categorical variable), with 60 
> levels.
>
> All columns labeled P1L(x) are a matrix of presence/absence genetic 
> markers for each individual in each family. There are 426 such 
> columns(not numbered in sequence) and each one is a random effect.
>
> The last three columns (peg.no, ec.length and syll.length) are the 
> three dependent variables.
>
> Each genetic marker column needs to be nested within each family, 
> which means that if I take the first phenotype, peg.no, for example, 
> then I need to run an analysis that partitions variance 60*426 times 
> (~25,00 runs) for that phenotype.
> I also need an output with the Anova table, and summary, at each stage 
> of the run, so that I can get P values for each marker for each 
> family, as a way of determining whether it contributes significantly 
> to explaining within-family variance for that trait.
>
> To do the above, I tried to write code for lmer() using two nested 
> 'for' loops, one for each level of random factor nesting (marker 
> within family) as follows (using a test data set [please find 
> attached] with only the first 10 marker columns, to see if this works):
>
>
>> vc<-read.table("P:\\R\\Testvcomp10.txt",header=T)
>> attach(vc)
>
>> family<-factor(family)
>> colms<-(vc)[,4:13] ## this to assign the 10 columns containing marker 
> data    to a new variable, as column names are themselves not in any 
> recognisable sequence
>
>> vcdf<-data.frame(family,peg.no,ec.length,syll.length,colms)
>> library(lme4)
>
>> for (c in levels(family))
> + {    for (i in 1:length(colms))
> +        { fit<-lmer(peg.no~1 + (1|c/i), vcdf)
> +        }
> +    summ<-summary(fit)
> +    av<-anova(fit)
> +    print(summ)
> +    print(av)
> + }
>
> This gives me:
>
> Error in model.frame.default(data = vcdf, formula = peg.no ~ 1 + (1 +  :
>  variable lengths differ (found for 'c')
>
>
>
> On suggestion from a colleague I reframed this as:
>
>
>> for(c in levels(family))
> + {
> + print("----New C:----")
> + print(c)
> + for (i in 1:length(colms))
> + {
> + fit<-lmer(peg.no~1+ (1|c/i),vcdf)
> + print(i)
> + summ<-summary(fit)
> + av<-anova(fit)
> + print(summ)
> + print(av)
> + }
> + }
>
> ..and this gave me the output:
>
> [1] "----New C:----"
> [1] "1"
> Error in model.frame.default(data = vcdf, formula = peg.no ~ 1 + (1 +  :
>  variable lengths differ (found for 'c')
>
>
> Google-ing the error message has led me to plenty of links that 
> suggest forcing data into a data frame to fix this, but that hasn't 
> worked.
> My markers and phenotypes both have plenty of missing data (NA's in 
> the data.frame), and na.action=na.omit isn't solving the problem.
>
> (I have tried this with lme, and tried to do it with the aov() command 
> as well and the error is pretty much the same).
>
> I am completely new to R, and despite searching and trying various 
> things, can't get the code to work.
>
> I really appreciate any corrections to this code, or alternative 
> command/function suggestions that I can look into, to try to do this 
> again.
>
>
> Thanks a bunch for your help,
>
> Aditi
>
>
> ----------------------
> A Singh
> Aditi.Singh at bristol.ac.uk
> School of Biological Sciences
> University of Bristol
>
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

-- 
Jérôme Goudet
Dept. Ecology & Evolution
UNIL-Sorge, CH-1015 Lausanne

mail:jerome.goudet at unil.ch
Tel:+41 (0)21 692 4242
Fax:+41 (0)21 692 4265




More information about the R-sig-mixed-models mailing list