[R] Fix for augPred/gsummary problem (nlme library)

Mark Difford mark_difford at yahoo.co.uk
Wed May 17 11:48:24 CEST 2006


Dear R-users,

I am a newbie to this site and a relative new-comer to S/R, so please tread lightly, for you tread...

There have been several posting relating to problems with augPred() from the nlme library. Here is a "fix" for one of these problems which may lie at the root of others.

In my case the problem with augPred() lay in gsummary(), which augPred() uses, causing it to fail. [From mucking around &c using getAnywhere("augPred.lme"), and setting: debug(gsummary).]

Further ferreting around showed that the data structures within gsummary() are fine, but that any (numeric only?) variable that has a label attached to it (in my case from using Harrell's Hmisc library) causes the following sub-routine in gsummary() to fail:

debug: if (dClass == "numeric") {

      value[[nm]] <- as.vector(tapply(object[[nm]], groups, FUN[["numeric"]],
            ...)) 

} else {

      value[[nm]] <- as.vector(tapply(as.character(object[[nm]]),
            groups, FUN[[dClass]])) if (inherits(object[, nm], "ordered")) {
            value[[nm]] <- ordered(value[, nm], levels = levels(object[,
                  nm]))[drop: TRUE] }
      else {
            value[[nm]] <- factor(value[, nm], levels = levels(object[,
                  nm]))[drop: TRUE] }

}

Error Message:

Error in "[[<-.data.frame"(`tmp`, nm, value = c(1, 1, 1, 1, 1, 1, 1, : replacement has 170 rows, data has 5

The immediate problem is that dClass comes through as "labeled" rather than as "numeric", and the object is erroneously passed through to the else{} group.

In fact, the problem is general: any variable that carries the class "labeled" will cause the sub-routine to choke, as will any variable with a class attribute other than ' ordered' , e.g. POSIXt. This is true even if the variable carrying this 'other' class attribute isn't used in any lme() formula &c.

Code-wise the fix for this should be straight-forward. Though I've never coded in R/S, it's clear that the authors of the package should be using different conditional tests, something along the lines of is.numeric(obj)/is.factor(obj), if that's possible.

Until a fix is posted, here is a work-around for groupedData() objects (and for raw data frames). You need to do this for all variables in the groupedData() object, even if you are not using them in your lme() call:

1) Use contents(obj) from the Hmisc package to look for variables with class attributes and labels. [You can also use str(obj); then look (i) for names in quotes immediately after the colon, e.g. DateTime: 'POSIXct'), or (ii) Class 'labeled' after the colon.] Remove these, or change them, using, e.g.:

class(obj$DateTime) <- NULL
class(obj$AnyVariable) <- 'numeric'     ## leaves the actual labels/units intact so that you can later restore them.

2) Execute your lme() statement &c on the object, e.g.:

test.1 <- lme(Chla ~ PO4, random=~1|Site, data=obj)    ## or simply: lme(obj)
augPred(test.1)
plot(augPred(test.1))

(Note that if you are using a data.frame() as your data object you will need to supply a 'primary' statement to augPred(), e.g. augPred(test.1, primary=~PO4).

Regards,

Mark Difford.

---------------------------------------------------------------------
Ph.D. candidate, Botany Department,
Nelson Mandela Metropolitan University,
Port Elizabeth, SA.




More information about the R-help mailing list