[R-sig-ME] [Fwd: Errors in lme]

Mon Oct 15 22:51:33 CEST 2012

Gerhard Göldner <u28101104 at ...> writes:

> I am trying to do analysis on a dataset using mixed models because I have
> samples from repeated individual animals.  Please find attached my
> dataset.  The question I would like to pose is if and what are the effects
> of the variable of distances and time on the proportional mass
> change(propmc).  I have included other factorial effects such as age sex
> and phase as well as year, but I don't know if I am using them correctly
> in the model.
> 
> I have also attached the codes that I use for the model and the error
> messages I am getting.  I have done some internet searches and some
> suggest missing data as the culprit or fixed effects that are correlated.
> Even if I exclude correlated fixed effects and only include fixed effects
> which are not correlated with each other, I still get the same message
> "error in MEEM - singularity in backsolve, level 0, block 1".
> 
> In addition, if I use for example a simpler model with only one fixed
> effect, the model runs, but the output doesn't have p-values.. it only
> says NaN.  I also did a search on that and found that it means Not a
> number, though I don't know why it is producing that.  I noticed that the
> DF is negative however and that if I use less variables, the DF increases
> and the p value is calculated.  Is there a statistical way of using
> t-values, without using p-values, as someone told me using AIC and
> p-values are two different methodologies or philosophies.  In other words,
> is it correct to still use the AIC values even if I don't get p-values and
> just use the AIC values.
> 
> Can you please help me by making any suggestions about how to correct
> this.  The "missing" data in my dataset are not missing - the values are
> actually 0!

  I don't quite know what this means. I see seven observations in the data set
below, none missing. Individual YY193 has a few of its predictor variables
equal to zero, but that shouldn't necessarily be a problem.

> Thank you for any help you can give me - if not, could you please refer me
> elsewhere?  I have your book "Mixed effects models in S and S-plus" 2000,
> but would appreciate any help you can give me.

[snip]

> 
> > library(nlme)
> > dataf<-read.table("C:/Documents and Settings/New
user.HOME-2DCF24A8DD/Desktop/analysis/pdataf.txt",header=T)
> > dataf
>     tag year sex phase age   mstart     mend        mc    propmc    kmtoF1
> 1 PO043 2009  AF    PM  10 373.4657 475.6858 102.22008 27.370676  420.2365
> 2 RR435 2009  AF    PM   4 270.0599 516.3509 246.29104 91.198685 2195.4183
> 3 RR435 2010  AF    PM   5 331.8893 531.0843 199.19494 60.018484 1623.7955
> 4 WR336 2008  AF    PB  11 418.6430 507.9936  89.35066 21.342926  781.5365
> 5 YY189 2009  AF    PM   5 300.1781 508.0098 207.83174 69.236153 1010.0676
> 6 YY193 2007  AF    PB   4 346.3593 430.2214  83.86210 24.212458    0.0000
> 7 YY240 2007  AF    PB   4 372.1941 406.9324  34.73827  9.333375 1268.4038
>   meankmtoF1 htoF1 durF1 meandurFpat totFtime tripdur propforage    farkm fx
> 1   394.4627   280   232    214.2222     1928    5632   34.23295 1476.228  9
> 2  2217.2725  1136   240    433.0000     3464    6480   53.45679 2268.101  8
> 3  1642.6462  1240   168    288.0000     2880    8064   35.71429 2022.378 10
> 4   896.8356   376   736    736.0000      736    1536   47.91667 1031.621  1
> 5  1011.0565   496    96    190.0000     1520    5832   26.06310 2201.082  8
> 6     0.0000     0     0      0.0000        0    1824    0.00000 1931.764  0
> 7  1324.0800   616   448    448.0000      448    2056   21.78988 1425.120  1
>           mv    febpm
> 1 0.20312652 30.80966
> 2 0.25818408 48.11111
> 3 0.07446295 32.14286
> 4 1.96484928 43.12500
> 5 0.07896997 23.45679
> 6 0.00000000  0.00000
> 7 2.71668110 19.61089

  is this really your full dataset (i.e. you have a *total* of 7 observations)?

> > attach(dataf)

  A minor note, it's generally better (more stable) to use data=dataf in
your lme() call rather than attach()/detach()

> > fm0f<-lme(propmc~kmtoF1+htoF1+tripdur+farkm,random=~1|tag,method="ML")

> > fm0f<-lme(propmc~kmtoF1+tripdur,random=~1|tag,method="ML")
> > summary(fm0f)
> Linear mixed-effects model fit by maximum likelihood
>  Data: NULL 
>        AIC      BIC    logLik
>   68.36043 68.08998 -29.18022
> 
> Random effects:
>  Formula: ~1 | tag
>          (Intercept) Residual
> StdDev: 0.0007653108 15.63737
> 
> Fixed effects: propmc ~ kmtoF1 + tripdur 
>                 Value Std.Error DF   t-value p-value
> (Intercept) -1.619729 17.010746  5 -0.095218  0.9278
> kmtoF1       0.014981  0.013920 -1  1.076211     NaN
> tripdur      0.006514  0.003898 -1  1.671239     NaN

  [snip]

> Number of Observations: 7
> Number of Groups: 6 
> Warning message:
> In pt(q, df, lower.tail, log.p) : NaNs produced
> > 

  It looks like all of your predictor variables are numeric, and that you
have a single individual (RR435) for which you have repeated measurement in
two different years.  A general rule of thumb is that you should have at
least ten (!!) data points per parameter you are trying to estimate, which
in this case means that even your simpler model is quite badly overfitted (you
are estimating 3 fixed-effect parameters, so you should hope for at least
30 data points ... even with 15 this would be a little bit optimistic).
lme() follows a set of rules (described in detail in Pinheiro & Bates for
calculating the residual degrees of freedom, and in this case they come
out negative, hence the NaN p-values in your simpler table.

   The bottom line is that I think you just don't have enough data to
do what you want.  In your situation I would probably do separate linear
regressions on each predictor variable you're interested in, showing the
data graphically and indicating to the reader that (1) two of the points were
repeated measurements from the same individual and (2) you are not considering
any collinearity or overlap in the predictive ability, nor any multiple
comparisons corrections for considering multiple predictors. In other
words, this is an *exploratory* analysis.

Your estimate of the among-tag standard deviation is very small 
relative to the residual standard deviation (about 4 orders of magnitude
smaller); although it's probably extremely imprecisely estimated, it
does somewhat support the procedure of ignoring the random effect.