[R] How to split a factor (unique identifier) into several others?

Tribo Laboy tribolaboy at gmail.com
Thu Feb 7 10:33:19 CET 2008


Hi Dimitris,


Your code works like a charm, but I don't really understand how. If you
have some time, I'd appreciate it if you could explain a bit more.

The contents of "vals" in your example are equivalent to the contents
of "splitfctr" in mine.

"as.data.frame" is quite clear, but "do.call("rbind", vals)" has me puzzled.

I checked the "do.call" help, but I could not replicate the results on
the command line by directly using "rbind".

Could you show me how to get the same result by calling "rbind" directly?
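
For the archives, a minimal sketch of the equivalence in question, assuming
the three-element list from the original example (the names "m1" and "m2"
are only illustrative). do.call("rbind", vals) passes each element of the
list to "rbind" as a separate argument, so it corresponds to writing out
rbind(vals[[1]], vals[[2]], ...) by hand. Calling rbind(vals) instead passes
the whole list as a single argument, which gives a one-row list matrix
rather than one row per string.

```r
# Same three-element example as in the original post
fctr <- factor(c("sample1_condition1_place1",
                 "sample2_condition1_place1",
                 "sample3_condition1_place1"))
vals <- strsplit(as.character(fctr), "_")

# do.call() hands each list element to rbind() as its own argument
m1 <- do.call("rbind", vals)

# The same call written out by hand (only practical for a short list)
m2 <- rbind(vals[[1]], vals[[2]], vals[[3]])

# Both are 3 x 3 character matrices with one row per original string
print(identical(m1, m2))
print(dim(m1))

# rbind(vals) treats the whole list as ONE argument, hence a single row
print(dim(rbind(vals)))
```

The hand-written form needs to know the length of the list in advance, which
is why do.call is the usual choice when the list can have any length.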


I really appreciate your help.


In the meantime I came up with another solution, which is much more
clunky than yours, but at least I can understand how it works. I am
putting it here, just as an additional thing for the archives.

After "splitfctr" (or "vals" in Dimitris' example) is obtained, I use the
"unlist" function on the list and then make new factors like this:

all_fctrs <- unlist(splitfctr)
sample_fctr <- factor(all_fctrs[seq(1, length(all_fctrs), 3)])
condition_fctr <- factor(all_fctrs[seq(2, length(all_fctrs), 3)])
place_fctr <- factor(all_fctrs[seq(3, length(all_fctrs), 3)])

Then I bundle the factors into the data frame with "cbind".
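
A minimal end-to-end sketch of this approach, assuming the three-string
example from the original post. The name "result" and the column names are
only illustrative. data.frame() is used for the final step here because
"cbind" applied to bare factors (with no data frame among its arguments)
returns a matrix of integer codes instead of factor columns.

```r
# Same three-element example as in the original post
fctr <- factor(c("sample1_condition1_place1",
                 "sample2_condition1_place1",
                 "sample3_condition1_place1"))
splitfctr <- strsplit(as.character(fctr), "_")

# Flatten the list; the parts of each string end up in consecutive positions
all_fctrs <- unlist(splitfctr)
n <- length(all_fctrs)

# Every third element, starting at offsets 1, 2 and 3
sample_fctr    <- factor(all_fctrs[seq(1, n, 3)])
condition_fctr <- factor(all_fctrs[seq(2, n, 3)])
place_fctr     <- factor(all_fctrs[seq(3, n, 3)])

# Bundle the factors as columns of a new data frame
result <- data.frame(sample = sample_fctr,
                     condition = condition_fctr,
                     place = place_fctr)
str(result)
```

Note that the stride of 3 is hard-coded, so this only works when every
string splits into exactly three parts.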


Thanks for the help.

TL



On Thu, Feb 7, 2008 at 5:20 PM, Dimitris Rizopoulos
<dimitris.rizopoulos at med.kuleuven.be> wrote:
> try the following:
>
>  dat <- data.frame(x = c("sample1_condition1_place1",
>     "sample2_condition1_place1", "sample3_condition1_place1",
>     "sample1_condition2_place1", "sample1_condition2_place1"))
>
>  vals <- strsplit(as.character(dat$x), "_")
>  as.data.frame(do.call("rbind", vals))
>
>
>  I hope it helps.
>
>  Best,
>  Dimitris
>
>  ----
>  Dimitris Rizopoulos
>  Ph.D. Student
>  Biostatistical Centre
>  School of Public Health
>  Catholic University of Leuven
>
>  Address: Kapucijnenvoer 35, Leuven, Belgium
>  Tel: +32/(0)16/336899
>  Fax: +32/(0)16/337015
>  Web: http://med.kuleuven.be/biostat/
>      http://www.student.kuleuven.be/~m0390867/dimitris.htm
>
>
>
>
>  ----- Original Message -----
>  From: "Tribo Laboy" <tribolaboy at gmail.com>
>  To: <r-help at r-project.org>
>  Sent: Thursday, February 07, 2008 7:44 AM
>  Subject: [R] How to split a factor (unique identifier) into several
>  others?
>
>
>  > Hello,
>  >
>  > I have a data frame with a factor column, which uniquely identifies
>  > the observations in the data frame and it looks like this:
>  >
>  > sample1_condition1_place1
>  > sample2_condition1_place1
>  > sample3_condition1_place1
>  > .
>  > .
>  > .
>  > sample3_condition3_place3
>  >
>  > I want to turn it into three separate factor columns "sample",
>  > "condition" and "place".
>  >
>  > This is what I did so far:
>  >
>  > # generate a factor column for the example
>  > fctr<- factor(c("sample1_condition1_place1",
>  > "sample2_condition1_place1", "sample3_condition1_place1"))
>  > splitfctr <- strsplit(as.character(fctr),"_")
>  >
>  >> splitfctr
>  > [[1]]
>  > [1] "sample1"    "condition1" "place1"
>  >
>  > [[2]]
>  > [1] "sample2"    "condition1" "place1"
>  >
>  > [[3]]
>  > [1] "sample3"    "condition1" "place1"
>  >
>  >
>  > Now this is all fine, but how do I make three separate factors of
>  > this?
>  > The object "splitfctr" is a list of character vectors, each character
>  > vector being composed of the words after splitting the long original
>  > word.
>  > Now I want to form new character vectors, which contain the first
>  > component of each list entry, then another vector for the second
>  > component, etc.
>  > I don't want to use loops, unless that's the only way to do it. I guess
>  > I have some difficulty with understanding how R indexing works...
>  >
>  > ______________________________________________
>  > R-help at r-project.org mailing list
>  > https://stat.ethz.ch/mailman/listinfo/r-help
>  > PLEASE do read the posting guide
>  > http://www.R-project.org/posting-guide.html
>  > and provide commented, minimal, self-contained, reproducible code.
>  >
>
>
>  Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
>
>


