[R] data frame from list of lists with unequal lengths
    Ben Mazzotta 
    benjamin.mazzotta at tufts.edu
       
    Mon Jul 20 22:46:28 CEST 2009
    
    
  
Hello,
I have a dataset with multiple entries in one field separated by "/"
characters. (The true dataset has long names, 20-odd variables, and
hundreds of observations.)
     v1 v2
1     A  L
2   A/B  M
3     C  N
4 D/E/F  O
5     A  P
6     C  L
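For anyone who wants to try this, the toy data above can be rebuilt roughly as follows. This is only a sketch: the name my.df is the one used further down, and stringsAsFactors = FALSE is an assumption made here so that v1 is stored as character rather than factor.

```r
# Rebuild the six-row example; v1 holds the "/"-separated entries.
my.df <- data.frame(
  v1 = c("A", "A/B", "C", "D/E/F", "A", "C"),
  v2 = c("L", "M", "N", "O", "P", "L"),
  stringsAsFactors = FALSE  # assumption: keep both columns as character
)
print(my.df)
```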
What I would like instead is a dataset with one row per entry, like this:
> my.df
  v1 v2
1  A  L
2  A  M
3  B  M
4  C  N
5  D  O
6  E  O
7  F  O
8  A  P
9  C  L
My original thought was to split the strings into separate columns using
strsplit(), add those columns to the data frame using cbind(), and then
reshape the dataset with the melt() function.
> v1.new <- as.character(my.df$v1)
> v1.new <- strsplit(v1.new, "/")
> v1.new
[[1]]
[1] "A"
[[2]]
[1] "A" "B"
[[3]]
[1] "C"
[[4]]
[1] "D" "E" "F"
[[5]]
[1] "A"
[[6]]
[1] "C"
My next thought was to coerce the list into a data frame, but I ran
into an error because the list returned by strsplit() does not contain
vectors of equal length.
> v1.cols <- data.frame(v1.new, check.rows=FALSE)
Error in data.frame("A", c("A", "B"), "C", c("D", "E", "F"), "A", "C",  :
  arguments imply differing number of rows: 1, 2, 3
How can I create a data frame from the unequal length vectors that
result from strsplit(as.character(my.df$v1), "/")?
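One way this might work, as far as I can tell, is to pad every vector with NA up to the length of the longest one before binding them together. The sketch below uses only base R; the helper names n.max and padded and the column names v1.1, v1.2, ... are made up for illustration.

```r
# The list that strsplit() returns for the example column.
v1.new <- strsplit(c("A", "A/B", "C", "D/E/F", "A", "C"), "/", fixed = TRUE)

# Length of the longest element (3 here).
n.max <- max(sapply(v1.new, length))

# Pad each element with NA so every vector has n.max entries.
padded <- lapply(v1.new, function(x) c(x, rep(NA_character_, n.max - length(x))))

# Bind the padded vectors as rows, then convert the matrix to a data frame.
v1.cols <- as.data.frame(do.call(rbind, padded), stringsAsFactors = FALSE)
names(v1.cols) <- paste("v1", seq_len(n.max), sep = ".")  # hypothetical names
print(v1.cols)
```

The padded columns could then be attached with cbind() and reshaped to long form, with the NA rows dropped afterwards.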
Am I going about this the wrong way? I have also tried to use
colsplit() from the reshape package, without success.
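A more direct route may be to skip the wide intermediate step entirely: repeat each row once per entry, then overwrite v1 with the flattened pieces. This is a sketch under the assumption that v1 is the only field that needs splitting; parts, n, and long.df are names invented here.

```r
# Example data, rebuilt so the snippet runs on its own.
my.df <- data.frame(
  v1 = c("A", "A/B", "C", "D/E/F", "A", "C"),
  v2 = c("L", "M", "N", "O", "P", "L"),
  stringsAsFactors = FALSE
)

# Split each v1 entry on "/" and count the pieces per row.
parts <- strsplit(as.character(my.df$v1), "/", fixed = TRUE)
n <- sapply(parts, length)

# Repeat every row as many times as it has pieces; all other columns
# are carried along, however many there are.
long.df <- my.df[rep(seq_len(nrow(my.df)), times = n), , drop = FALSE]

# Replace v1 with the individual pieces, in the original order.
long.df$v1 <- unlist(parts)
rownames(long.df) <- NULL
print(long.df)
```

Because the rows are indexed rather than rebuilt column by column, the same lines should carry over to a data frame with 20-odd variables without change.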
Thank you for any advice you can offer. I hope the answer to this
question is not too obvious.
    
    