[Rd] Unexpected result of as.character() and unlist() applied to a data frame
Heinz Tuechler
tuechler at gmx.at
Wed Mar 28 13:16:49 CEST 2007
At 17:25 27.03.2007 +0200, Martin Maechler wrote:
>>>>>> "Herve" == Herve Pages <hpages at fhcrc.org>
>>>>>> on Mon, 26 Mar 2007 20:48:33 -0700 writes:
>
> Herve> Hi,
> >> dd <- data.frame(A=c("b","c","a"), B=3:1)
> >> dd
> Herve>   A B
> Herve> 1 b 3
> Herve> 2 c 2
> Herve> 3 a 1
> >> unlist(dd)
> Herve> A1 A2 A3 B1 B2 B3
> Herve>  2  3  1  3  2  1
>
> Herve> Someone else might get something different. It all
> Herve> depends on the value of their 'stringsAsFactors' option:
>
>yes, and I don't like that (last) fact either.
>IMO, an option should never be allowed to influence such a basic
>function as data.frame().
>
>I know I would have had time earlier to start discussing this,
>but for some (probably good) reasons, I didn't get to it at the
>time.
>As Andy comments, everything is behaving as it should / is documented,
>including the 'stringsAsFactors' option;
>but personally, I really would want to consider changing
>the default for data.frame()'s stringsAsFactors back (as
>pre-R-2.4.0) to 'TRUE' instead of default.stringsAsFactors()
>which is a smart version of getOption("stringsAsFactors").
>I find it ok ("acceptable") if it's influencing read.table()
>but feel differently for data.frame().
>
>Martin
>
Martin!
I see the problem with options influencing "such a basic function as
data.frame()", but in my view the difficulty starts earlier. As I
understand it, data.frame() is _the_ basic way to store empirical source
data in R, and I found the earlier default behaviour of changing character
variables to factors problematic.
If changing character variables to factors were only an internal process,
not visible to the user, I would not mind. But including a character
variable in a data frame and getting a factor out of it is somewhat
disturbing.
As a naive user, I was especially confused that I could read an SPSS file
with spss.get (default: charfactor=FALSE) and get a character variable in a
data.frame as a character variable, but when I put it in a different
data.frame it changed to a factor.
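The effect can be reproduced with a short sketch. Note the assumptions:
stringsAsFactors=TRUE is passed explicitly to mimic the default under
discussion (the default differs between R versions), the first data frame
merely stands in for what spss.get returned, and the names src and dest are
made up for illustration.

```r
# Source data frame holding a genuine character column
# (a stand-in for a data frame read with charfactor=FALSE).
src <- data.frame(id = 1:3, name = c("b", "c", "a"),
                  stringsAsFactors = FALSE)
class(src$name)   # "character"

# Putting that column into a different data frame, with the
# conversion switched on, silently yields a factor.
dest <- data.frame(name = src$name, score = 3:1,
                   stringsAsFactors = TRUE)
class(dest$name)  # "factor"
```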
I would wish for a data.frame() function that behaves as a "data container",
with the idea of rows (=cases) and columns (=variables), but without changing
the mode/class of the objects.
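For what it is worth, a minimal sketch of two existing ways to get close to
that "container" behaviour, assuming only base R: either switch the
conversion off for the whole call, or protect a single column with I().
The object names x, d1 and d2 are made up for illustration.

```r
# A character vector that should stay character inside a data frame.
x <- c("b", "c", "a")

# 1. Switch the conversion off for the whole call.
d1 <- data.frame(A = x, B = 3:1, stringsAsFactors = FALSE)

# 2. Protect a single column with I(), even when conversion is on.
d2 <- data.frame(A = I(x), B = 3:1, stringsAsFactors = TRUE)

is.character(d1$A)  # TRUE
is.character(d2$A)  # TRUE; the column also carries class "AsIs"
is.factor(d2$A)     # FALSE
```

Neither is quite a data.frame() that never touches its inputs: the first
relies on an argument being set in every call, and the second adds the
"AsIs" class to the column.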
Heinz
>
> >> dd2 <- data.frame(A=c("b","c","a"), B=3:1,
> >> stringsAsFactors=FALSE)
> >> dd2
> Herve>   A B
> Herve> 1 b 3
> Herve> 2 c 2
> Herve> 3 a 1
> >> unlist(dd2)
> Herve>  A1  A2  A3  B1  B2  B3
> Herve> "b" "c" "a" "3" "2" "1"
>
> Herve> Same thing with as.character:
>
> >> as.character(dd)
> Herve> [1] "c(2, 3, 1)" "c(3, 2, 1)"
> >> as.character(dd2)
> Herve> [1] "c(\"b\", \"c\", \"a\")" "c(3, 2, 1)"
>
> Herve> Bug or "feature"?
>
> Herve> Note that as.character applied directly on dd$A
> Herve> doesn't have this "feature":
>
> >> as.character(dd$A)
> Herve> [1] "b" "c" "a"
> >> as.character(dd2$A)
> Herve> [1] "b" "c" "a"
>
> Herve> Cheers, H.
>
> Herve> ______________________________________________
> Herve> R-devel at r-project.org mailing list
> Herve> https://stat.ethz.ch/mailman/listinfo/r-devel
>