[Rd] unlist on nested lists of factors (PR#12572)
davison at stats.ox.ac.uk
Wed Aug 20 15:25:10 CEST 2008
Here is a description and a proposed solution for a bug in unlist().
I've used version 2.7.2 RC (2008-08-18 r46382) to look at this, under
Linux.
unlist(recursive=TRUE) incorrectly returns a factor with zero levels
when passed either a nested list of factors, or a data frame
containing only factor columns. You can't print() the result.
x <- list(list(v=factor("a")))
str(unlist(x))
## Factor w/ 0 levels: NA
## - attr(*, "names")= chr "v"
## Warning message:
## In str.default(unlist(x)) : 'object' does not have valid levels()
y <- list(data.frame(v=factor("a")))
str(unlist(y))
## Factor w/ 0 levels: NA
## - attr(*, "names")= chr "v"
## Warning message:
## In str.default(unlist(y)) : 'object' does not have valid levels()
unlist is defined as:

    unlist <- function(x, recursive=TRUE, use.names=TRUE)
    {
        if(.Internal(islistfactor(x, recursive))) {
            lv <- unique(.Internal(unlist(lapply(x, levels), recursive, FALSE)))
            nm <- if(use.names) names(.Internal(unlist(x, recursive, use.names)))
            res <- .Internal(unlist(lapply(x, as.character), recursive, FALSE))
            res <- match(res, lv)
            ## we cannot make this ordered as level set may have been changed
            structure(res, levels=lv, names=nm, class="factor")
        } else .Internal(unlist(x, recursive, use.names))
    }

The error occurs because, in both cases, islistfactor (at the C level)
recurses, finds that every element is a factor, and so makes the if
test condition TRUE. However, the two instances of lapply do not
recurse: they apply levels() and as.character() to the inner list or
data frame rather than to the factors inside it. levels() of a list is
NULL, so lv is empty and match() returns NA. A possible solution is to
replace both instances of lapply with rapply. This results in
appropriate factor answers for the two examples above:
str(unlist(x))
## Factor w/ 1 level "a": 1
## - attr(*, "names")= chr "v"
str(unlist(y))
## Factor w/ 1 level "a": 1
## - attr(*, "names")= chr "v"
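
For illustration, the rapply idea can be sketched at the R level without
touching .Internal. This is only a sketch under assumptions:
unlist_factors() is a hypothetical helper name, it assumes every leaf of
x is a factor (the case where islistfactor is TRUE), and it uses
rapply(..., how = "list"), which is my choice and not necessarily what a
patch to base R would use.

```r
## Hypothetical helper, not part of base R: flatten a (possibly nested)
## list whose leaves are all factors into a single factor.
unlist_factors <- function(x, use.names = TRUE)
{
    ## rapply() descends into nested lists and data frames, so levels()
    ## and as.character() are applied to the factors themselves
    lv  <- unique(unlist(rapply(x, levels, how = "list"),
                         use.names = FALSE))
    chr <- unlist(rapply(x, as.character, how = "list"),
                  use.names = use.names)
    ## we cannot make this ordered as the level set may have changed
    structure(match(chr, lv), levels = lv, names = names(chr),
              class = "factor")
}

x <- list(list(v = factor("a")))
y <- list(data.frame(v = factor("a")))
str(unlist_factors(x))
str(unlist_factors(y))
```

Taking the names from the character vector, rather than from a second
call to unlist(x), keeps the helper independent of the function it is
meant to illustrate.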
An alternative is to not return a factor result at all in these cases,
by altering the if test condition so that nested lists of factors, and
lists of factor-only data frames, fail the test and fall through to
the else branch.
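
As a rough R-level illustration of such a stricter test (the real check
lives in C inside islistfactor, so this is only a sketch):
is_flat_factor_list() is a hypothetical name, and treating an empty list
as FALSE is an assumption on my part.

```r
## Hypothetical predicate, not part of base R: TRUE only when every
## top-level element of x is itself a factor (no recursion).
is_flat_factor_list <- function(x)
{
    is.list(x) && length(x) > 0 &&
        all(vapply(x, is.factor, logical(1)))
}

is_flat_factor_list(list(factor("a"), factor("b")))      # flat list
is_flat_factor_list(list(list(v = factor("a"))))         # nested list
is_flat_factor_list(list(data.frame(v = factor("a"))))   # data frame
```

With a test like this, only the flat list would take the factor branch;
the nested list and the data frame would be handled by the plain
.Internal(unlist()) call.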
Dan
--
www.stats.ox.ac.uk/~davison