[R] Odp: When factor is better than other types, such as vector and frame?
Petr PIKAL
petr.pikal at precheza.cz
Mon Aug 24 09:26:24 CEST 2009
Hi
r-help-bounces at r-project.org wrote on 23.08.2009 05:00:11:
> Hi,
>
> It is easy to understand the types vector and frame.
>
> But I am wondering why the type factor is designed in R. What is the
> advantage of factor compare with other data types in R? Can somebody
> give an example in which case the type factor is much better than
> other data types?
Although your terms do not quite match the naming conventions in R (the
type is called a data frame, not a frame), a factor is sometimes
preferable to character values. Consider e.g.
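set.seed(111)
# stringsAsFactors=TRUE is needed in R >= 4.0.0, where strings stay character by default
df<-data.frame(1:5, fac=sample(letters[1:2], 5, replace=TRUE), stringsAsFactors=TRUE)
# factor: as.numeric() returns the integer level codes, usable as plotting symbols
plot(df[,1], pch=as.numeric(df[,2]))
# character: as.numeric() cannot convert the letters and returns NA
df[,2]<-as.character(df[,2])
plot(df[,1], pch=as.numeric(df[,2]))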
Warning message:
In plot.xy(xy, type, ...) : NAs introduced by coercion
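The difference comes from how the two types are stored. A minimal sketch, using a small made-up vector rather than the data frame above (the names f, ch, codes and ch_num are only for illustration):

```r
# Hypothetical example data, not taken from the post
f <- factor(c("b", "b", "a", "b", "a"))
ch <- as.character(f)

# A factor stores integer codes that index into its levels
levels(f)                   # "a" "b"
codes <- as.integer(f)      # 2 2 1 2 1

# Character strings that are not numbers coerce to NA (with a warning,
# suppressed here so the sketch runs quietly)
ch_num <- suppressWarnings(as.numeric(ch))
ch_num                      # NA NA NA NA NA
```

So a factor can be passed directly wherever an integer code per group is wanted (pch, col and similar), while a character vector cannot.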
Another advantage is simple and straightforward manipulation of levels
(here after converting the column back to a factor).
df[,2]<-factor(df[,2])
levels(df[,2])<-c("yes", "no")
> df
  X1.5 fac
1    1  no
2    2  no
3    3 yes
4    4  no
5    5 yes
Levels are also easy to reorder, and their order determines the
subsequent plotting order in boxplots and similar displays.
> factor(df$fac, levels=levels(df$fac))
[1] no no yes no yes
Levels: yes no
> factor(df$fac, levels=levels(df$fac)[2:1])
[1] no no yes no yes
Levels: no yes
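To show that only the level order changes and not the data, here is a small sketch with made-up values (grp and grp2 are hypothetical names, not objects from the session above):

```r
# Hypothetical data for illustration only
grp <- factor(c("low", "high", "mid", "low", "high"))
levels(grp)                 # sorted by default: "high" "low" "mid"

# Put the levels in a meaningful order; the values themselves are unchanged
grp2 <- factor(grp, levels = c("low", "mid", "high"))
levels(grp2)                # "low" "mid" "high"

# Summaries follow the level order
table(grp2)                 # counts reported as low, mid, high

# boxplot() would draw the groups in the same order, e.g.
# boxplot(value ~ grp2)     # 'value' is a hypothetical numeric vector
```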
You need to get used to some features that are sometimes surprising but
have a reason, such as levels persisting in a subset.
> str(df[df$fac=="no",])
'data.frame': 3 obs. of 2 variables:
$ X1.5: int 1 2 4
$ fac : Factor w/ 2 levels "yes","no": 2 2 2
>
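One reason for keeping the levels is that summaries of a subset still report the empty groups. If the unused levels are not wanted they can be dropped explicitly. A sketch with made-up data (d, sub and sub2 are hypothetical names); it assumes a current version of R in which droplevels() is available, and factor(sub$fac) would do the same for a single column:

```r
# Hypothetical data for illustration only
d <- data.frame(x = 1:5,
                fac = factor(c("no", "no", "yes", "no", "yes"),
                             levels = c("yes", "no")))
sub <- d[d$fac == "no", ]

levels(sub$fac)             # still "yes" "no"
table(sub$fac)              # yes 0, no 3: the empty group is still reported

# Drop the unused levels when they are not wanted
sub2 <- droplevels(sub)
levels(sub2$fac)            # "no"
```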
Regards
Petr
>
> Regards,
> Peng
>