[R] as.data.frame doesn't set col.names

Peter Dalgaard pdalgd at gmail.com
Wed Oct 25 13:27:02 CEST 2017


> On 24 Oct 2017, at 22:45 , David L Carlson <dcarlson at tamu.edu> wrote:
> 
> You left out all the most important bits of information. What is yo? Are you trying to assign a data frame to a single column in another data frame? Printing head(samples) tells us nothing about what data types you have, especially if the things that look like text are really factors that were created when you used one of the read.*() functions. Use str(samples) to see what you are dealing with. 

Actually, I think there is enough information to diagnose this. The main issue is, as you point out, the assignment of an entire data frame to a column of another data frame:

> l <- letters[1:5]
> s <- as.data.frame(sapply(l,toupper))
> dput(s)
structure(list(`sapply(l, toupper)` = structure(1:5, .Label = c("A", 
"B", "C", "D", "E"), class = "factor")), .Names = "sapply(l, toupper)", row.names = c("a", 
"b", "c", "d", "e"), class = "data.frame")

(Incidentally, setting col.names has no effect on this; notice that it is only documented as an argument to the "list" and "matrix" methods of as.data.frame(), and sapply() returns a vector here.)
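
If the goal is just a one-column data frame with a chosen column name, here is a minimal sketch of two ways that should work without col.names. The name "B" and the objects s1 and s2 are only for illustration; they are not from the original question.

```r
l <- letters[1:5]

# Option 1: name the column while constructing the data frame.
s1 <- data.frame(B = sapply(l, toupper))

# Option 2: convert first, then rename the single column.
s2 <- as.data.frame(sapply(l, toupper))
names(s2) <- "B"

names(s1)
names(s2)
```

Either way the column name is set explicitly, rather than being derived from the deparsed call.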

Now, if we do this:

> dd <- data.frame(A=l)
> dd$B <- s

we end up with a data frame whose B "column" is another data frame

> dput(dd)
structure(list(A = structure(1:5, .Label = c("a", "b", "c", "d", 
"e"), class = "factor"), B = structure(list(`sapply(l, toupper)` = structure(1:5, .Label = c("A", 
"B", "C", "D", "E"), class = "factor")), .Names = "sapply(l, toupper)", row.names = c("a", 
"b", "c", "d", "e"), class = "data.frame")), .Names = c("A", 
"B"), row.names = c(NA, -5L), class = "data.frame")

In printing such data frames, the inner frame "wins" the column names, which is sensible if you consider what would happen if it had more than one column:

> dd
  A sapply(l, toupper)
1 a                  A
2 b                  B
3 c                  C
4 d                  D
5 e                  E
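
The multi-column case can be tried directly. This is a sketch that was not part of the original session; the data frame inner and its columns x and y are made up for illustration.

```r
dd <- data.frame(A = letters[1:5])

# A hypothetical inner data frame with two columns.
inner <- data.frame(x = 1:5, y = 6:10)
dd$B <- inner

names(dd)     # names of the outer data frame
names(dd$B)   # names of the nested data frame
ncol(dd)      # the outer frame still has only two components

print(dd)     # the printout has to show both inner columns
```

The outer frame still has a single B component, so the printed header has to draw on the inner names to tell the nested columns apart.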

To get the effect that Ed probably expected, do

> dd <- data.frame(A=l)
> dd["B"] <- s
> dd
  A B
1 a A
2 b B
3 c C
4 d D
5 e E

(and notice that single-bracket indexing is crucial here)
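
To check which situation you are in, is.data.frame() on the column tells the two cases apart. Below is a minimal sketch comparing the assignment forms. The names dd1, dd2 and dd3 are only for illustration, and the third variant (extracting the vector with [[ before using $) is an additional alternative, not something from the original question.

```r
l <- letters[1:5]
s <- as.data.frame(sapply(l, toupper))

dd1 <- data.frame(A = l)
dd1$B <- s          # stores the whole data frame as one "column"

dd2 <- data.frame(A = l)
dd2["B"] <- s       # stores the column of s under the name B

dd3 <- data.frame(A = l)
dd3$B <- s[[1]]     # extract the vector first; then $ works as well

is.data.frame(dd1$B)   # TRUE: B is a nested data frame
is.data.frame(dd2$B)   # FALSE: B is an ordinary column
is.data.frame(dd3$B)   # FALSE: B is an ordinary column
```

str(dd1) and str(dd2) show the same difference in more detail.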

-pd


