[R] Factor levels
Gabor Grothendieck
ggrothendieck at gmail.com
Tue Aug 28 23:16:55 CEST 2007
It's the same principle. Just change the function to suit. This one
arranges the levels in the order in which they appear in the input:
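library(methods)
# Declare a (virtual) class name that read.table can use in colClasses.
setClass("my.factor")
# Coerce character input to a factor whose levels follow first appearance.
setAs("character", "my.factor",
  function(from) factor(from, levels = unique(from)))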
Input <- "a b c
1 1 176 w
2 2 141 k
3 3 172 r
4 4 182 s
5 5 123 k
6 6 153 p
7 7 176 l
8 8 170 u
9 9 140 z
10 10 194 s
11 11 164 j
12 12 100 j
13 13 127 x
14 14 137 r
15 15 198 d
16 16 173 j
17 17 113 x
18 18 144 w
19 19 198 q
20 20 122 f
"
DF <- read.table(textConnection(Input), header = TRUE,
  colClasses = list(c = "my.factor"))
str(DF)
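
If the reordering has to happen after the read.table step instead, for every factor column and without knowing the column names in advance, something along these lines might do. This is only a sketch: the helper name reorder_by_appearance and the small data frame DF2 are made up for illustration and are not part of the code above.

```r
# Hypothetical helper (not from the thread): reset the levels of every
# factor column of a data frame to their order of first appearance.
reorder_by_appearance <- function(df) {
  for (nm in names(df)) {
    if (is.factor(df[[nm]])) {
      df[[nm]] <- factor(df[[nm]], levels = unique(as.character(df[[nm]])))
    }
  }
  df
}

# Small illustrative data frame. stringsAsFactors = TRUE is spelled out
# because character columns are not converted to factors by default in
# R >= 4.0.0.
DF2 <- data.frame(a = 1:5, c = c("w", "k", "r", "s", "k"),
                  stringsAsFactors = TRUE)
levels(DF2$c)    # default, sorted levels
DF2 <- reorder_by_appearance(DF2)
levels(DF2$c)    # levels now follow the order of appearance
```

Non-factor columns are left untouched, so the same call can be applied to data frames whose contents are not known beforehand. Note that an ordered factor would come back as a plain factor with this sketch.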
On 8/28/07, Sébastien <pomchip at free.fr> wrote:
> OK, I cannot send you one of my datasets since they are confidential. But
> I can produce a dummy "mini" dataset to illustrate my question. Let's say I
> have a csv file with 3 columns and 20 rows whose content is reproduced by
> the following line.
>
> > mydata <- data.frame(a = 1:20, b = sample(100:200, 20, replace = TRUE),
>     c = sample(letters[1:26], 20, replace = TRUE))
> > mydata
> a b c
> 1 1 176 w
> 2 2 141 k
> 3 3 172 r
> 4 4 182 s
> 5 5 123 k
> 6 6 153 p
> 7 7 176 l
> 8 8 170 u
> 9 9 140 z
> 10 10 194 s
> 11 11 164 j
> 12 12 100 j
> 13 13 127 x
> 14 14 137 r
> 15 15 198 d
> 16 16 173 j
> 17 17 113 x
> 18 18 144 w
> 19 19 198 q
> 20 20 122 f
>
> If I had to read the csv file, I would use something like:
> mydata<-data.frame(read.table(file="c:/test.csv",header=T))
>
> Now, if you look at mydata$c, the levels are alphabetically ordered.
> > mydata$c
> [1] w k r s k p l u z s j j x r d j x w q f
> Levels: d f j k l p q r s u w x z
>
> What I am trying to do is reorder the levels so that they are in the order
> they appear in the table, i.e.
> Levels: w k r s p l u z j x d q f
>
> Again, keep in mind that my script should be used on datasets whose content
> is unknown to me. In my example, I have used letters for mydata$c, but my
> code may have to handle factors of numeric or character values (I need to
> transform specific columns of my dataset into factors for plotting
> purposes). My goal is to let the code scan the content of each factor of my
> data.frame during or after the read.table step and reorder their levels
> automatically without having to ask the user to hard-code the level order.
>
> In a way, my problem is more related to the way the factor levels are
> ordered than to the read.table function, although I guess there is a link...
>
> Gabor Grothendieck wrote:
> It's not clear from your description what you want. Could you be a bit more
> specific, including an example?
>
> On 8/28/07, Sébastien <pomchip at free.fr> wrote:
> Thanks Gabor, I have two questions:
>
> 1- Is there any difference between your code and the following one, with
> regards to Fld2?
>
> ### test ###
> Input <- "Fld1 Fld2
> 10 A
> 20 B
> 30 C
> 40 A
> "
> DF <- read.table(textConnection(Input), header = TRUE)
> DF$Fld2 <- factor(DF$Fld2, levels = c("C", "A", "B"))
>
> 2- Do you see any way to bring flexibility to your method? Because it
> looks to me as if, at this stage, I have to i) know the order of my levels
> before I read the table and ii) create one class per factor.
>
> My problem is that I am not really working on a specific dataset. My goal is
> to develop R scripts capable of handling datasets which have various contents
> but similar structures. So, I really need to minimize the quantity of
> "user-specific" code.
>
> Sebastien
>
> Gabor Grothendieck wrote:
> You can create your own class and pass that to read.table. In the example
> below Fld2 is read in with factor levels C, A, B in that order.
>
> library(methods)
> setClass("my.levels")
> setAs("character", "my.levels",
>   function(from) factor(from, levels = c("C", "A", "B")))
>
> ### test ###
> Input <- "Fld1 Fld2
> 10 A
> 20 B
> 30 C
> 40 A
> "
> DF <- read.table(textConnection(Input), header = TRUE,
>   colClasses = c("numeric", "my.levels"))
> str(DF)
> # or
> DF <- read.table(textConnection(Input), header = TRUE,
>   colClasses = list(Fld2 = "my.levels"))
> str(DF)
>
> On 8/28/07, Sébastien <pomchip at free.fr> wrote:
>
> Dear R-users,
>
> I have found this not-so-recent post in the archives -
> http://tolstoy.newcastle.edu.au/R/devel/00a/0291.html - while I was
> looking for a particular way to reorder factor levels. The question
> addressed by the author was whether the read.table function could be
> modified to order the levels of newly created factors "according to the
> order that they appear in the data file". Exactly what I am looking for.
> As there was no reply to this post, I wonder if any move has been made
> towards the implementation of this suggestion. A quick look at
> ?read.table tells me that if this option was implemented, it was not in
> the read.table function...
>
> Sebastien
>
> PS: I am sorry to post so many messages on the list, but I am learning R
> (basically by trial and error ;-) ) and no one around me has even a
> slight notion about it...