[R] Fix for augPred/gsummary problem (nlme library)

Wed May 17 14:59:36 CEST 2006

On Wed, 2006-05-17 at 09:48 +0000, Mark Difford wrote:
> Dear R-users,
> 
> I am a newbie to this site and a relative new-comer to S/R, so please tread lightly, for you tread...
> 
> There have been several posting relating to problems with augPred() from the nlme library. Here is a "fix" for one of these problems which may lie at the root of others.
> 
> In my case the problem with augPred() lay in gsummary(), which augPred() uses, causing it to fail. [From mucking around &c using getAnywhere("augPred.lme"), and setting: debug(gsummary).]
> 
> Further ferreting around showed that the data structures within gsummary() are fine, but that any (numeric only?) variable that has a label attached to it (in my case from using Harrell's Hmisc library) causes the following sub-routine in gsummary() to fail:
> 
> debug: if (dClass == "numeric") {
> 
>       value[[nm]] <- as.vector(tapply(object[[nm]], groups, FUN[["numeric"]],
>             ...)) 
> 
> } else {
> 
>       value[[nm]] <- as.vector(tapply(as.character(object[[nm]]),
>             groups, FUN[[dClass]])) if (inherits(object[, nm], "ordered")) {
>             value[[nm]] <- ordered(value[, nm], levels = levels(object[,
>                   nm]))[drop: TRUE] }
>       else {
>             value[[nm]] <- factor(value[, nm], levels = levels(object[,
>                   nm]))[drop: TRUE] }
> 
> }
> 
> Error Message:
> 
> Error in "[[<-.data.frame"(`tmp`, nm, value = c(1, 1, 1, 1, 1, 1, 1, : replacement has 170 rows, data has 5
> 
> The immediate problem is that dClass comes through as "labeled" rather than as "numeric", and the object is erroneously passed through to the else{} group.
> 
> In fact, the problem is general: any variable that carries the class "labeled" will cause the sub-routine to choke, as will any variable with a class attribute other than ' ordered' , e.g. POSIXt. This is true even if the variable carrying this 'other' class attribute isn't used in any lme() formula &c.
> 
> Code-wise the fix for this should be straight-forward. Though I've never coded in R/S, it's clear that the authors of the package should be using different conditional tests, something along the lines of is.numeric(obj)/is.factor(obj), if that's possible.
> 
> Until a fix is posted, here is a work-around for groupedData() objects (and for raw data frames). You need to do this for all variables in the groupedData() object, even if you are not using them in your lme() call:
> 
> 1) Use contents(obj) from the Hmisc package to look for variables with class attributes and labels. [You can also use str(obj); then look (i) for names in quotes immediately after the colon, e.g. DateTime: 'POSIXct'), or (ii) Class 'labeled' after the colon.] Remove these, or change them, using, e.g.:
> 
> class(obj$DateTime) <- NULL
> class(obj$AnyVariable) <- 'numeric'     ## leaves the actual labels/units intact so that you can later restore them.
> 
> 2) Execute your lme() statement &c on the object, e.g.:
> 
> test.1 <- lme(Chla ~ PO4, random=~1|Site, data=obj)    ## or simply: lme(obj)
> augPred(test.1)
> plot(augPred(test.1))
> 
> (Note that if you are using a data.frame() as your data object you will need to supply a 'primary' statement to augPred(), e.g. augPred(test.1, primary=~PO4).
> 
> Regards,
> 
> Mark Difford.
> 
> ---------------------------------------------------------------------
> Ph.D. candidate, Botany Department,
> Nelson Mandela Metropolitan University,
> Port Elizabeth, SA.

Is this related to the same problem? I fit an intercepts-only model
(both random and fixed):

> summary(fit.lme.1 <- lme(nflnas.diff~1,
+   random=~1|id,
+   data=nfl.diff.iopec.gd,method="ML"))
Linear mixed-effects model fit by maximum likelihood
 Data: nfl.diff.iopec.gd
      AIC      BIC   logLik
  561.682 567.8112 -277.841

Random effects:
 Formula: ~1 | id
        (Intercept) Residual
StdDev:    20.86548 28.10644

Fixed effects: nflnas.diff ~ 1
              Value Std.Error DF  t-value p-value
(Intercept) -26.384    7.8022 47 -3.38161  0.0015

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-2.15240420 -0.49224313  0.06435735  0.51602333  2.78229869

Number of Observations: 57
Number of Groups: 10

> plot(augPred(fit.lme.1,level=0:1),layout=c(5,2),aspect=1)
Error in terms.default(formula, data = data) :
        no terms component

If I replace:

nflnas.diff~1

with something like:

nflnas.diff~group

where group is a dichotomous factor, augPred works as expected.

Rick B.