[R] Inconvenient behavior of as.data.frame() for lists without names

David Winsemius dwinsemius at comcast.net
Thu Sep 16 03:07:48 CEST 2010


On Sep 15, 2010, at 6:10 PM, Magnus Thor Torfason wrote:

> Hi all,
>
> I ran into a small issue when converting a list of vectors to a data  
> frame.  The issue I'm having is described by the snippet below:
>
> #########################################################
> # Convert a list of vectors into a data.frame
> strlen = 256
> s.long.a = paste( letters[1+(0:strlen %% 26)], collapse="")
> s.long.b = paste( letters[1+(strlen:0 %% 26)], collapse="")
> v.long.a = rep(s.long.a, 2)
> v.long.b = rep(s.long.b, 2)
>
> # Convert when the list has no names for its elements
> my.list = list(v.long.a, v.long.b)
> my.df   = as.data.frame(my.list)
>
> # Here we get an error
> my.df
>

I have also been annoyed at that behavior. I can make the problem go  
away by shortening the default names that data.frame() assigns to  
unnamed list elements. That assignment happens about halfway through  
the code of the data.frame function:

.
.
else if (no.vn[[i]]) {
                 # changed from base data.frame(): keep only the first 10
                 # characters of the deparsed element as the default name
                 tmpname <- substr(deparse(object[[i]])[1L], 1, 10)
                 if (substr(tmpname, 1L, 2L) == "I(") {
                   ntmpn <- nchar(tmpname, "c")
                   if (substr(tmpname, ntmpn, ntmpn) == ")")
                     # ntmpn - 1L drops the closing ")" of I(...)
                     tmpname <- substr(tmpname, 3L, ntmpn - 1L)
                 }
                 vnames[[i]] <- tmpname
.
.

With that change there is no error, and the column names are  
c..abcdefg and c..wvutsrq.
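
As a rough check of where those two names come from, the truncation can be reproduced outside data.frame(). This is only a sketch: it assumes the 10-character default names are afterwards passed through make.names(), as data.frame() does when check.names = TRUE.

```r
# Rebuild the long strings from the original example
strlen <- 256
s.long.a <- paste(letters[1 + (0:strlen %% 26)], collapse = "")
s.long.b <- paste(letters[1 + (strlen:0 %% 26)], collapse = "")
v.long.a <- rep(s.long.a, 2)
v.long.b <- rep(s.long.b, 2)

# First line of the deparsed vector, cut to 10 characters
short.a <- substr(deparse(v.long.a)[1L], 1, 10)
short.b <- substr(deparse(v.long.b)[1L], 1, 10)

# make.names() replaces the invalid characters ( and " with dots
short.names <- make.names(c(short.a, short.b))
short.names
```

Without the substr() step the default name is the entire first line of the deparsed vector, which is why it grows with the length of the strings.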

Whether you want to muck with the code of data.frame is up to you.  
It's your machine, and if it breaks, the standard warranty applies:  
you get to keep both pieces.

-- 
David.


> # This solves the problem
> names(my.list) = c("a","b")
> my.fixed.df = as.data.frame(my.list)
> my.fixed.df
> #########################################################
>
> In short, the problem is that when there are no names attached to  
> the elements of the list, as.data.frame() creates very long names if  
> the elements of the vectors themselves are long. Further, names  
> that are in some sense disallowed (they can't be printed, for one) are  
> silently injected into a data.frame, leading to an error later on.
>
> Better would be to error out in as.data.frame
>
> Best would be if the way of generating default names in this function  
> were intelligent enough to never create names longer than, say,  
> 30 characters. Of course, explicit names should be honored.
>
> Anyway, those are my thoughts on this issue. No patch attached, and I  
> will work around this, but at least it is out there now.
>
> Best,
> Magnus Thor