[Rd] stringsAsFactors = FALSE

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Nov 17 16:03:16 CET 2008


On Mon, 17 Nov 2008, hadley wickham wrote:

> Hi all,
>
> I love the option to not automatically convert strings into factors,
> but there are three places that the current option doesn't work where
> I think it should:

Perhaps you mean 'when I would like it to'?   Things *should* work as 
documented, surely?

> options(stringsAsFactors = FALSE)
>
> str(expand.grid(letters))
> str(type.convert(letters))
>
> df <- read.fwf(textConnection(paste(letters,collapse="\n")), 1)
> str(df)

I get

> str(df)
'data.frame':   26 obs. of  1 variable:
  $ V1: chr  "a" "b" "c" "d" ...

so what is wrong with that?  read.fwf just calls read.table, so the 
default options of read.table apply.
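
As a small illustration of that pass-through (a sketch, not a change to read.fwf itself): extra arguments to read.fwf are forwarded to read.table through '...', so stringsAsFactors can also be set per call. The helper name read_letters is made up for this example.

```r
# Build the same 26-line, one-character-per-line input used above.
txt <- paste(letters, collapse = "\n")

# Helper for this sketch only (hypothetical name): read the text with
# read.fwf(), forwarding stringsAsFactors to read.table() via '...'.
read_letters <- function(stringsAsFactors) {
  con <- textConnection(txt)
  on.exit(close(con))
  read.fwf(con, widths = 1, stringsAsFactors = stringsAsFactors)
}

str(read_letters(FALSE))  # V1 is a character column
str(read_letters(TRUE))   # V1 is a factor column
```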

> I think type.convert and read.fwf can be fixed by giving them a
> stringsAsFactors argument and then using as.is = !stringsAsFactors
> (like read.table).

Seems to me that there is nothing wrong with read.fwf.  For type.convert() 
we could have the default

as.is = !default.stringsAsFactors()

but I think a strong case needs to be made to change the documented 
behaviour.
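
For concreteness, the idea can be sketched as a user-level wrapper rather than a change to type.convert() itself. The function name type_convert_saf is hypothetical, and reading the default from getOption("stringsAsFactors") is an assumption standing in for default.stringsAsFactors().

```r
# Hypothetical wrapper, not part of base R: exposes a stringsAsFactors
# argument and maps it onto type.convert()'s existing as.is argument.
# The default is taken from the global option (assumption: an unset
# option is treated as FALSE).
type_convert_saf <- function(x,
                             stringsAsFactors = isTRUE(getOption("stringsAsFactors")),
                             ...) {
  type.convert(x, as.is = !stringsAsFactors, ...)
}

str(type_convert_saf(letters, stringsAsFactors = FALSE))  # character vector
str(type_convert_saf(letters, stringsAsFactors = TRUE))   # factor
```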

>  The key lines in expand.grid would seem to be
>
>            if (!is.factor(x) && is.character(x))
>                x <- factor(x, levels = unique(x))
>
> but I'm not sure why they are being converted to factors in the first place.

Nor am I, but it goes back to at least r2107, over 10 years ago.  I don't 
see much problem with adding a 'stringsAsFactors' argument there.
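
A minimal sketch of how the quoted conversion step could be guarded by such an argument. This is an isolated helper for illustration (convert_arg is a hypothetical name), not the expand.grid source.

```r
# Hypothetical helper mirroring the quoted lines: character vectors are
# turned into factors (levels in order of first appearance) only when
# stringsAsFactors is TRUE; everything else is returned unchanged.
convert_arg <- function(x, stringsAsFactors = TRUE) {
  if (stringsAsFactors && !is.factor(x) && is.character(x))
    x <- factor(x, levels = unique(x))
  x
}

str(convert_arg(c("b", "a", "b")))                            # factor, levels "b", "a"
str(convert_arg(c("b", "a", "b"), stringsAsFactors = FALSE))  # stays character
```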

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


