[R] Error: length(f1) == length(f2) is not TRUE

Sat Sep 19 00:35:41 CEST 2009

See brief comment inline below.

A Singh wrote:
> 
> Bill,
> 
> It worked!!!
> 
>> lmer(data=vcdf, peg.no~1 + (1|family/P1L55))
> 
> Linear mixed model fit by REML
> Formula: peg.no ~ 1 + (1 | family/P1L55)
>   Data: vcdf
>  AIC  BIC logLik deviance REMLdev
> 2981 2997  -1487     2976    2973
> Random effects:
> Groups       Name        Variance   Std.Dev.
> P1L55:family (Intercept) 4.6659e-11 6.8308e-06
> family       (Intercept) 8.7571e+01 9.3579e+00
> Residual                 9.2269e+01 9.6057e+00
> Number of obs: 390, groups: P1L55:family, 82; family, 57
> 
> Fixed effects:
>            Estimate Std. Error t value
> (Intercept)   96.577      1.376    70.2
> 
> It seems as if all along it was the na.action=na.exclude that was 
> working against me, because it probably threw out NAs in a way that made 
> column lengths incompatible.

I don't think so. It was not ensuring that 'family' was a factor.

Peter Ehlers

> 
> It seemed to work with some other models I had tried, but here it was 
> clearly a problem.
> Your example was brilliant for breaking it down.
> 
> Thanks a huge bunch, and cheers!
> 
> Aditi
> 
> 
> --On 18 September 2009 11:40 -0700 William Dunlap <wdunlap at tibco.com> 
> wrote:
> 
>>> From: A Singh [mailto:Aditi.Singh at bristol.ac.uk]
>>> Sent: Friday, September 18, 2009 11:24 AM
>>> To: William Dunlap
>>> Subject: RE: [R] Error: length(f1) == length(f2) is not TRUE
>>>
>>> Yup, they are all factors- and its still doesn't work.
>>> Getting to the stage where I can use 'summary()' is the
>>> problem- the error
>>> stalls the process before a summary can be output.
>>
>> I meant to get a summary of the input, vcdf, not of
>> the output of lmer.  Make sure that vcdf$family and
>> all the vcdf$P* things are factors.
>>
>> Here is a short example that gives an error like yours.
>> Without seeing your data I can't say for sure yours
>> is similar.  You need to at least show the output of
>> summary(vcdf).
>>
>>> library(lme4)
>>> d <- data.frame(y=log(1:36), expand.grid(g2=1:2,g3=11:13,g6=101:106))
>>> sapply(d,class)
>>         y        g2        g3        g6
>> "numeric" "integer" "integer" "integer"
>>> lmer(data=d, y~1 + (1|g2))
>> Linear mixed model fit by REML
>> Formula: y ~ 1 + (1 | g2)
>>    Data: d
>>              AIC             BIC           logLik        deviance
>>  98.784754747615 103.53531156298 -46.392377373808 90.735690977203
>>          REMLdev
>>  92.784754747615
>> Random effects:
>>  Groups   Name        Variance          Std.Dev.
>>  g2       (Intercept) 0.000000000000000 0.000000000000000
>>  Residual             0.748809753022119 0.865337941513094
>> Number of obs: 36, groups: g2, 2
>>
>> Fixed effects:
>>                     Estimate       Std. Error         t value
>> (Intercept) 2.65888040394842 0.14422299035915 18.435898446754
>>> lmer(data=d, y~1 + (1|g2/g6))
>> Error: length(f1) == length(f2) is not TRUE
>> In addition: Warning messages:
>> 1: In g6:g2 : numerical expression has 36 elements: only the first used
>> 2: In g6:g2 : numerical expression has 36 elements: only the first used
>>
>> Now change the d$g* columns to be factors with
>>> d[2:4] <- lapply(d[2:4],factor)
>>> # and check with summary(d)
>>> summary(d) # or sapply(d,class)
>>        y                 g2      g3       g6
>>  Min.   :0.00000000000   1:18   11:12   101:6
>>  1st Qu.:2.27624496408   2:18   12:12   102:6
>>  Median :2.91740536853          13:12   103:6
>>  Mean   :2.65888040395                  104:6
>>  3rd Qu.:3.30492877705                  105:6
>>  Max.   :3.58351893846                  106:6
>>> lmer(data=d, y~1 + (1|g2))
>> Linear mixed model fit by REML
>> Formula: y ~ 1 + (1 | g2)
>>    Data: d
>>              AIC             BIC           logLik        deviance
>>  98.784754747615 103.53531156298 -46.392377373808 90.735690977203
>>          REMLdev
>>  92.784754747615
>> Random effects:
>>  Groups   Name        Variance          Std.Dev.
>>  g2       (Intercept) 0.000000000000000 0.000000000000000
>>  Residual             0.748809753022119 0.865337941513094
>> Number of obs: 36, groups: g2, 2
>>
>> Fixed effects:
>>                     Estimate       Std. Error         t value
>> (Intercept) 2.65888040394842 0.14422299035915 18.435898446754
>>> lmer(data=d, y~1 + (1|g2/g6))
>> Linear mixed model fit by REML
>> Formula: y ~ 1 + (1 | g2/g6)
>>    Data: d
>>              AIC             BIC           logLik        deviance
>>  63.097147876589 69.431223630414 -27.548573938295 54.113964539198
>>          REMLdev
>>  55.097147876589
>> Random effects:
>>  Groups   Name        Variance           Std.Dev.
>>  g6:g2    (Intercept) 0.6928532833231706 0.832378089165717
>>  g2       (Intercept) 0.0000000000000000 0.000000000000000
>>  Residual             0.0955486486847417 0.309109444509128
>> Number of obs: 36, groups: g6:g2, 12; g2, 2
>>
>> Fixed effects:
>>                    Estimate      Std. Error         t value
>> (Intercept) 2.6588804039484 0.2457461012621 10.819623954533
>>
>>>
>>> I tried the same code, but using 'family' and P1L55 as
>>> independent random
>>> factors, and that works.
>>>
>>> (I actually tried this with a new marker P1L74)
>>>
>>> > try(fit<-lmer(peg.no~1 + (1|family) + (1|P1L74),
>>> na.action=na.exclude))
>>> > summary(fit)
>>>
>>> Linear mixed model fit by REML
>>> Formula: peg.no ~ 1 + (1 | family) + (1 | P1L74)
>>>   AIC  BIC logLik deviance REMLdev
>>>  2981 2997  -1487     2976    2973
>>> Random effects:
>>>  Groups   Name        Variance Std.Dev.
>>>  family   (Intercept) 87.86380 9.37357
>>>  P1L74    (Intercept)  0.48815 0.69868
>>>  Residual             92.11681 9.59775
>>> Number of obs: 390, groups: family, 57; P1L74, 2
>>>
>>> Fixed effects:
>>>             Estimate Std. Error t value
>>> (Intercept)    96.40       1.49   64.71
>>>
>>> ..but the nested random effect ..(1|family/P1L74).. is where
>>> the error
>>> occurs. And its the same error.
>>>
>>> Error : length(f1) == length(f2) is not TRUE
>>> In addition: Warning messages:
>>> 1: In P1L55:family :
>>>   numerical expression has 390 elements: only the first used
>>> 2: In P1L55:family :
>>>   numerical expression has 390 elements: only the first used
>>>
>>> My design is a very unbalanced one. Do you think that can be
>>> the problem?
>>>
>>>
>>>
>>>
>>> --On 18 September 2009 09:11 -0700 William Dunlap
>>> <wdunlap at tibco.com> wrote:
>>>
>>> > Look at the output of summary(vcdf) and make sure
>>> > that 'family' and 'P1L55' columns are factors.  Perhaps
>>> > columns 4:13 did not include them.
>>> >
>>> > Bill Dunlap
>>> > TIBCO Software Inc - Spotfire Division
>>> > wdunlap tibco.com
>>> >
>>> >> -----Original Message-----
>>> >> From: A Singh [mailto:Aditi.Singh at bristol.ac.uk]
>>> >> Sent: Friday, September 18, 2009 9:03 AM
>>> >> To: William Dunlap; r-help at r-project.org
>>> >> Subject: RE: [R] Error: length(f1) == length(f2) is not TRUE
>>> >>
>>> >> Hi Bill,
>>> >>
>>> >> Thanks, I did try out what you suggested but it doesn't
>>> seem to work.
>>> >> I get the same error again.
>>> >>
>>> >> There's obviously something here that I don't get. Need to
>>> >> figure it out.
>>> >>
>>> >> Aditi
>>> >>
>>> >> --On 18 September 2009 08:34 -0700 William Dunlap
>>> >> <wdunlap at tibco.com> wrote:
>>> >>
>>> >> >> From: r-help-bounces at r-project.org
>>> >> >> [mailto:r-help-bounces at r-project.org] On Behalf Of A Singh
>>> >> >> Sent: Friday, September 18, 2009 4:42 AM
>>> >> >> To: r-help at r-project.org
>>> >> >> Subject: [R] Error: length(f1) == length(f2) is not TRUE
>>> >> >>
>>> >> >>
>>> >> >> Dear R users,
>>> >> >>
>>> >> >> I am trying to fit an lmer model with only random effects
>>> >> >> which is giving
>>> >> >> me the following error:
>>> >> >>
>>> >> >> Error : length(f1) == length(f2) is not TRUE
>>> >> >> In addition: Warning messages:
>>> >> >> 1: In P1L55:family :
>>> >> >>   numerical expression has 390 elements: only the first used
>>> >> >> 2: In P1L55:family :
>>> >> >>   numerical expression has 390 elements: only the first used
>>> >> >>
>>> >> >>
>>> >> >> I am trying to extract variance components for a phenotype
>>> >> >> 'peg.no', using
>>> >> >> the variable 'family' and a marker column 'P1L55' (which is
>>> >> >> categorical and
>>> >> >> has 2 levels- 0 and 1), as random effects. There are no
>>> >> fixed effects.
>>> >> >>
>>> >> >> The code I used is as follows:
>>> >> >>
>>> >> >> vc<-read.table(...)
>>> >> >>
>>> >> >> vcdf<-data.frame(vc)
>>> >> >> colms<-(vc)[4:13] ##these are the markers
>>> >> >> lapply(colms,factor)
>>> >> >
>>> >> > Note that you did not change columns 4:13 of vcdf to factors:
>>> >> > you changed copies of them to factors.  Do
>>> >> >    vcdf[4:13] <- lapply(vcdf[4:13], factor)
>>> >> > and lmer should be happier.
>>> >> >
>>> >> > Bill Dunlap
>>> >> > TIBCO Software Inc - Spotfire Division
>>> >> > wdunlap tibco.com
>>> >> >
>>> >> >>
>>> >> >> try(fit<-lmer(peg.no~1 + (1|family/P1L55),
>>> >> >> na.action=na.include, data=vcdf))
>>> >> >>
>>> >> >>
>>> >> >> I thought that putting the data into a dataframe would help,
>>> >> >> along with the
>>> >> >> na.exclude command, because there is a lot of missing data in
>>> >> >> patches which
>>> >> >> I have replaced with NA's, but I  don't know how to fix this
>>> >> >> error at all.
>>> >> >>
>>> >> >> Its a bit urgent, and any help is hugely appreciated.
>>> >> >>
>>> >> >> The data files are at:
>>> >> >>
>>> >> >>
>>> >> >> <http://www.4shared.com/file/131980362/460bdafe/Testvcomp10.ht
>>> >> >> ml> (excel)
>>> >> >> http://www.4shared.com/file/131980512/dc7308b/Testvcomp10.html
>>> >> >> (txt)
>>> >> >>
>>> >> >>
>>> >> >> Cheers,
>>> >> >> Aditi
>>> >> >>
>>> >> >> ----------------------
>>> >> >> A Singh
>>> >> >> Aditi.Singh at bristol.ac.uk
>>> >> >> School of Biological Sciences
>>> >> >> University of Bristol
>>> >> >>
>>> >> >> ______________________________________________
>>> >> >> R-help at r-project.org mailing list
>>> >> >> https://stat.ethz.ch/mailman/listinfo/r-help
>>> >> >> PLEASE do read the posting guide
>>> >> >> http://www.R-project.org/posting-guide.html
>>> >> >> and provide commented, minimal, self-contained,
>>> reproducible code.
>>> >> >>
>>> >>
>>> >>
>>> >>
>>> >> ----------------------
>>> >> A Singh
>>> >> Aditi.Singh at bristol.ac.uk
>>> >> School of Biological Sciences
>>> >> University of Bristol
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>>
>>>
>>>
>>> ----------------------
>>> A Singh
>>> Aditi.Singh at bristol.ac.uk
>>> School of Biological Sciences
>>> University of Bristol
>>>
>>>
>>>
>>>
>>>
> 
> 
> 
> ----------------------
> A Singh
> Aditi.Singh at bristol.ac.uk
> School of Biological Sciences
> University of Bristol
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
>