[Rd] unlist errors on a nested list of empty lists

Duncan Murdoch murdoch.duncan at gmail.com
Wed May 9 02:51:16 CEST 2018


On 08/05/2018 4:50 PM, Steven Nydick wrote:
> The same error occurs when the factor is not at the first level of the 
> list. This seems to happen because islistfactor is recursive, but once 
> a list is judged to be a list-factor, its first-level lists are coerced 
> to character strings.
> 
>  > x <- list(list(factor(LETTERS[1])))
>  > unlist(x)
> Error in as.character.factor(x) : malformed factor
> 
> However, if one of the factors is at the top level, and one is nested, 
> then the result is:
> 
>  > x <- list(list(factor(LETTERS[1])), factor(LETTERS[2]))
>  > unlist(x)
> 
> [1] <NA> B
> Levels: B
> 
> ... which does not seem to me to be desired behavior.

The patch I suggested doesn't help with either of these.  I'd suggest 
collecting examples, and posting a bug report to bugs.r-project.org.

Duncan Murdoch


> 
> 
> On Tue, May 8, 2018 at 2:22 PM Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
> 
>     On 08/05/2018 2:58 PM, Duncan Murdoch wrote:
>      > On 08/05/2018 1:48 PM, Steven Nydick wrote:
>      >> Reproducible example:
>      >>
>      >> x <- list(list(list(), list()))
>      >> unlist(x)
>      >>
>      >> Error in as.character.factor(x) : malformed factor
>      >
>      > The error comes from the line
>      >
>      > structure(res, levels = lv, names = nm, class = "factor")
>      >
>      > which is called because unlist() thinks that some entry is a factor,
>      > with NULL levels and NULL names.  It's not legal for a factor to have
>      > NULL levels.  Probably it should never get here; the earlier test
>      >
>      > if (.Internal(islistfactor(x, recursive))) {
>      >
>      > should have been false, and then the result would have been
>      >
>      > .Internal(unlist(x, recursive, use.names))
>      >
>      > (with both recursive and use.names being TRUE), which returns NULL.
> 
>     And the problem is in the islistfactor function in src/main/apply.c,
>     which looks like this:
> 
>     static Rboolean islistfactor(SEXP X)
>     {
>           int i, n = length(X);
> 
>           switch(TYPEOF(X)) {
>           case VECSXP:
>           case EXPRSXP:
>               if(n == 0) return NA_LOGICAL;
>               for(i = 0; i < LENGTH(X); i++)
>                   if(!islistfactor(VECTOR_ELT(X, i))) return FALSE;
>               return TRUE;
>               break;
>           }
>           return isFactor(X);
>     }
> 
>     One of those deeply nested lists has length 0, so at the lowest level
>     islistfactor returns NA_LOGICAL.  But the caller then applies a C-style
>     logical test (!) to that result.  I think to C NA_LOGICAL counts as true
>     (it is a nonzero integer, INT_MIN), so at the next level up we get the
>     wrong answer.
> 
>     A fix would be to rewrite it like this:
> 
>     static Rboolean islistfactor(SEXP X)
>     {
>           int i, n = length(X);
>           Rboolean result = NA_LOGICAL, childresult;
>           switch(TYPEOF(X)) {
>           case VECSXP:
>           case EXPRSXP:
>               for(i = 0; i < LENGTH(X); i++) {
>                   childresult = islistfactor(VECTOR_ELT(X, i));
>                   if(childresult == FALSE) return FALSE;
>                   else if(childresult == TRUE) result = TRUE;
>               }
>               return result;
>               break;
>           }
>           return isFactor(X);
>     }
> 
> 
> 
> -- 
> Steven Nydick
> PhD, Quantitative Psychology
> M.A., Psychology
> M.S., Statistics
> --
> "Beware of the man who works hard to learn something, learns it, and 
> finds himself no wiser than before, Bokonon tells us. He is full of 
> murderous resentment of people who are ignorant without having come by 
> their ignorance the hard way."
> -Kurt Vonnegut
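
The NA_LOGICAL diagnosis quoted above can be checked outside R with a small stand-alone model. The sketch below is not R source: the Node type, the MODEL_* constants, and the names islistfactor_old, islistfactor_new and check_examples are hypothetical stand-ins for SEXP, R's logical values, and the two versions of islistfactor shown in the thread. The only fact borrowed from R is that NA_LOGICAL is stored as INT_MIN.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Stand-ins for R's logical values; R stores NA_LOGICAL as INT_MIN. */
enum { MODEL_FALSE = 0, MODEL_TRUE = 1, MODEL_NA = INT_MIN };

/* Hypothetical toy node: a list of children, a factor, or any other leaf. */
typedef enum { NODE_LIST, NODE_FACTOR, NODE_OTHER } NodeKind;

typedef struct Node {
    NodeKind kind;
    size_t n;                        /* number of children (lists only) */
    const struct Node *const *kids;  /* children (lists only) */
} Node;

/* Mirrors the quoted original: a child's NA result is tested with `!`. */
static int islistfactor_old(const Node *x)
{
    if (x->kind == NODE_LIST) {
        if (x->n == 0) return MODEL_NA;
        for (size_t i = 0; i < x->n; i++)
            if (!islistfactor_old(x->kids[i])) return MODEL_FALSE;
        return MODEL_TRUE;
    }
    return x->kind == NODE_FACTOR;
}

/* Mirrors the proposed rewrite: only an explicit TRUE child yields TRUE. */
static int islistfactor_new(const Node *x)
{
    if (x->kind == NODE_LIST) {
        int result = MODEL_NA;
        for (size_t i = 0; i < x->n; i++) {
            int child = islistfactor_new(x->kids[i]);
            if (child == MODEL_FALSE) return MODEL_FALSE;
            if (child == MODEL_TRUE) result = MODEL_TRUE;
        }
        return result;
    }
    return x->kind == NODE_FACTOR;
}

/* Exercises both versions on two shapes taken from the thread. */
static void check_examples(void)
{
    /* Models list(list(list(), list())). */
    static const Node empty = { NODE_LIST, 0, NULL };
    static const Node *const inner_kids[] = { &empty, &empty };
    static const Node inner = { NODE_LIST, 2, inner_kids };
    static const Node *const outer_kids[] = { &inner };
    static const Node outer = { NODE_LIST, 1, outer_kids };

    /* Models list(list(factor(...))). */
    static const Node fac = { NODE_FACTOR, 0, NULL };
    static const Node *const fl_kids[] = { &fac };
    static const Node fl = { NODE_LIST, 1, fl_kids };
    static const Node *const fll_kids[] = { &fl };
    static const Node fll = { NODE_LIST, 1, fll_kids };

    assert(islistfactor_old(&outer) == MODEL_TRUE); /* NA slipped past `!` */
    assert(islistfactor_new(&outer) == MODEL_NA);   /* NA is propagated */
    assert(islistfactor_old(&fll) == MODEL_TRUE);
    assert(islistfactor_new(&fll) == MODEL_TRUE);   /* unchanged by rewrite */
}
```

Calling check_examples() from a main() passes silently. In this model the old version reports TRUE for the nested empty lists because !INT_MIN is 0, while the rewrite returns NA. Both versions agree on the nested-factor shape, which is consistent with the remark that the patch does not affect the factor examples. How R's caller treats an NA result is outside this sketch.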


