[Rd] should `data` respect default.stringsAsFactors()?
Cook, Malcolm
MEC at stowers.org
Fri Feb 19 15:54:36 CET 2016
Joshua,
> On Thu, Feb 18, 2016 at 6:03 PM, Cook, Malcolm <MEC at stowers.org> wrote:
> > Hi Peter,
> >
> > Sorry if I was not clear. Perhaps an example will make my point:
> >
> >> data(iris)
> >> class(iris$Species)
> > [1] "factor"
> >> write.table(iris,'data/myiris.tab')
> >> data(myiris)
> >> class(myiris$Species)
> > [1] "factor"
> >> rm(myiris)
> >> options(stringsAsFactors = FALSE)
> >> data(myiris)
> >> class(myiris$Species)
> > [1] "factor"
> >> myiris<-read.table("data/myiris.tab",header=TRUE)
> >> class(myiris$Species)
> > [1] "character"
> >
> > I am surprised to find that in the above
> > setting the global option stringsAsFactors = FALSE does NOT affect
> > how Species is being read in by the `data` function
> > whereas
> > setting the global option stringsAsFactors = FALSE DOES affect how
> > Species is being read in by read.table
> >
> > especially since data is documented as calling read.table.
> >
> To be explicit, it's documented as calling read.table(..., header =
> TRUE) in this case, but it actually calls read.table(..., header =
> TRUE, as.is = FALSE), which results in class(myiris$Species) of
> "factor".
Aha - makes sense.
>
> R> myiris<-read.table("data/myiris.tab",header=TRUE,as.is=FALSE)
> R> class(myiris$Species)
> [1] "factor"
>
> So it seems like adding as.is = FALSE to the call in the documentation
> would clear this up.
I agree - thanks for digging into the source - you have unearthed the root cause.
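For anyone following along, here is a minimal sketch of that difference. It assumes a temporary file stands in for data/myiris.tab, and the variable names are only illustrative. It calls read.table directly rather than going through `data`, so it shows only the effect of the as.is argument:

```r
# Write iris to a temporary tab file (stand-in for data/myiris.tab).
tmp <- tempfile(fileext = ".tab")
write.table(iris, tmp)

# Explicit as.is = FALSE: character columns are converted to factors,
# whatever the stringsAsFactors setting happens to be.
as_factor <- read.table(tmp, header = TRUE, as.is = FALSE)
class(as_factor$Species)  # "factor"

# Explicit as.is = TRUE: character columns are left as character.
as_char <- read.table(tmp, header = TRUE, as.is = TRUE)
class(as_char$Species)    # "character"

unlink(tmp)
```

Because as.is is passed explicitly in both calls, the result does not depend on options(stringsAsFactors = ...).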
~Malcolm
> > In my opinion, one or the other should change (the behavior of data, or the
> > documentation).
> >
> > <bleep> <bleep>,
> >
> > ~ Malcolm
> >
> >
> > > -----Original Message-----
> > > From: peter dalgaard [mailto:pdalgd at gmail.com]
> > > Sent: Thursday, February 18, 2016 3:32 PM
> > > To: Cook, Malcolm <MEC at stowers.org>
> > > Cc: r-devel at stat.math.ethz.ch
> > > Subject: Re: [Rd] should `data` respect default.stringsAsFactors()?
> > >
> > > What the <bleep> are you on about? data() does many things, only some of
> > > which call read.table() et al., and the ones that do have no special treatment
> > > of stringsAsFactors.
> > >
> > > -pd
> > >
> > > > On 18 Feb 2016, at 21:25 , Cook, Malcolm <MEC at stowers.org> wrote:
> > > >
> > > > Hiya,
> > > >
> > > > Probably been debated elsewhere....
> > > >
> > > > I note that R's `data` function does not respect default.stringsAsFactors
> > > >
> > > > By my lights, it should, especially as it is documented to call read.table,
> > > > which DOES respect it.
> > > >
> > > > Oh, but: http://r.789695.n4.nabble.com/stringsAsFactors-FALSE-tp921891p921893.html
> > > >
> > > > Compelling. I have to agree.
> > > >
> > > > So, I change my mind.
> > > >
> > > > By my lights, `data` should then be documented to NOT respect
> > > > default.stringsAsFactors.
> > > >
> > > > Else?
> > > >
> > > > ~Malcolm Cook
> > > >
> > > > ______________________________________________
> > > > R-devel at r-project.org mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-devel
> > >
> > > --
> > > Peter Dalgaard, Professor,
> > > Center for Statistics, Copenhagen Business School
> > > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> > > Phone: (+45)38153501
> > > Office: A 4.23
> > > Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> >
>
>
>
> --
> Joshua Ulrich | about.me/joshuaulrich
> FOSS Trading | www.fosstrading.com
> R/Finance 2016 | www.rinfinance.com