[R] factor always have type integer
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed Sep 8 22:44:56 CEST 2004
On Wed, 8 Sep 2004, Erich Neuwirth wrote:
> typeof applied to a factor always seems to return "integer",
> independently of the type of the levels.
typeof is telling you the internal structure. From ?factor:

    'factor' returns an object of class '"factor"' which has a set of
    integer codes the length of 'x' with a '"levels"' attribute of
    mode 'character'.

(Despite that, we don't enforce this and people have managed to create
factors with non-integer numeric codes.)
Now ?typeof says

    'typeof' determines the (R internal) type or storage mode of any
    object

and that is the "integer" as the codes are stored in an INTSXP.
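
To make the distinction concrete, here is a minimal sketch (the names `f` and `g` are only for illustration) of how typeof reports the storage of the codes while the levels sit in a character attribute:

```r
# A small factor built from character data
f <- factor(c("a", "b", "c", "a"))

typeof(f)           # storage type of the codes: "integer"
class(f)            # the class attribute: "factor"
unclass(f)          # the bare integer codes, with a "levels" attribute
typeof(levels(f))   # the levels themselves: "character"

# The levels are character even when the input was numeric
g <- factor(c(10.5, 2, 10.5))
typeof(g)           # "integer"
levels(g)           # "2" "10.5"
```

So typeof describes how the object is stored, and class describes how it is meant to behave.
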
BTW, factors were an internal type long ago, and were one of the two
unnamed types which appear in output from memory.profile().
> This has a strange side effect.
It's a very well documented feature of data.frame, as others have
pointed out.
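
For anyone who wants to control that conversion, a hedged sketch follows. It assumes a version of R in which data.frame() has the stringsAsFactors argument, and it spells the argument out because its default is FALSE from R 4.0.0 onwards. The data frame names are made up for the example.

```r
v1 <- 1:3
v2 <- c("a", "b", "c")

# Explicitly request the character-to-factor conversion
df_fac <- data.frame(v1, v2, stringsAsFactors = TRUE)
typeof(df_fac$v2)    # "integer"
class(df_fac$v2)     # "factor"

# Keep the column as character
df_chr <- data.frame(v1, v2, stringsAsFactors = FALSE)
typeof(df_chr$v2)    # "character"

# I() protects a single column from conversion
df_asis <- data.frame(v1, v2 = I(v2))
typeof(df_asis$v2)   # "character"
```
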
> When a variable is "imported" into a data frame,
> its type changes.
> character variables automatically are converted
> to factors when imported into data frames.
>
> Here is an example:
>
> > v1<-1:3
> > v2<-c("a","b","c")
> > df<-data.frame(v1,v2)
> > typeof(v2)
> [1] "character"
> > typeof(df$v2)
> [1] "integer"
>
> It is somewhat surprising that
> the types of v2 and df$v2 are different.
>
> the answer is to do
> levels(df$v2)[df$v2]
> but that is somewhat involved.
>
> Should the types not be identical, and typeof applied to factors
> return the type of the levels?
>
>
>
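On getting the original values back: as.character() gives the same result as the levels(x)[x] idiom quoted above. A small sketch, with illustrative variable names:

```r
f <- factor(c("b", "a", "c", "a"))

as.character(f)      # "b" "a" "c" "a"
levels(f)[f]         # the same values, via indexing the levels by the codes
identical(as.character(f), levels(f)[f])   # TRUE

# For levels that look like numbers, convert via the levels, not the codes
n <- factor(c("10", "2.5", "10"))
as.numeric(as.character(n))   # 10.0 2.5 10.0
as.numeric(levels(n))[n]      # same result
as.numeric(n)                 # the integer codes, not the values
```
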
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595