[R] parsing strings between [ ] in columns

Barry Rowlingson b.rowlingson at lancaster.ac.uk
Thu Feb 18 09:49:55 CET 2010


On Thu, Feb 18, 2010 at 8:29 AM, milton ruser <milton.ruser at gmail.com> wrote:
> Dear all,
>
> I have a data.frame with a column like the x shown below
> myDF<-data.frame(cbind(x=c("[[1, 0, 0], [0, 1]]",
>   "[[1, 1, 0], [0, 1]]","[[1, 0, 0], [1, 1]]",
>   "[[0, 0, 1], [0, 1]]")))
>> myDF

> After identifying the groups I would like
> to identify the subgroups:
>   A1 A2 A3 B1 B2
> 1  1  0  0  0  1
> 2  1  1  0  0  1
> 3  1  0  0  1  1
> 4  0  0  1  0  1

Maybe it's not too early in the morning. Given your myDF above:

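# fromJSON() needs a JSON package; rjson is assumed here
# (RJSONIO and jsonlite also provide a fromJSON)
> library(rjson)

# how is the first one structured?
> lets = unlist(lapply(fromJSON(as.character(myDF[1,])),length))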

# 3 then 2:
> lets
[1] 3 2

# make the letters (only good for up to 26 groups; past that LETTERS gives NA)
> rep(LETTERS[1:length(lets)],lets)
[1] "A" "A" "A" "B" "B"

# handy sequence function makes the numbers:
> sequence(lets)
[1] 1 2 3 1 2
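
If sequence() is unfamiliar: for a vector of counts it simply runs 1..n for each count and joins the results. A small sketch of the equivalent long-hand, using the same counts as above (the variable name counts is only for illustration):

```r
# sequence(c(3, 2)) is the same as concatenating seq_len(3) and seq_len(2)
counts <- c(3L, 2L)
long_hand <- unlist(lapply(counts, seq_len))
identical(sequence(counts), long_hand)
```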

# splat them together:
> paste(rep(LETTERS[1:length(lets)],lets),sequence(lets),sep="")
[1] "A1" "A2" "A3" "B1" "B2"

Then you can just use this as the column names of your new data frame.
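
For completeness, here is a minimal sketch of the whole thing end to end. It is only one way to do it and makes some assumptions not stated above: that fromJSON() comes from the rjson package, that every row has the same group structure as the first row, and that the names parsed, flat and newDF are just illustrative.

```r
library(rjson)  # assumption: any package providing fromJSON() should work similarly

myDF <- data.frame(x = c("[[1, 0, 0], [0, 1]]",
                         "[[1, 1, 0], [0, 1]]",
                         "[[1, 0, 0], [1, 1]]",
                         "[[0, 0, 1], [0, 1]]"),
                   stringsAsFactors = FALSE)

# parse every row into a nested list, one element per group
parsed <- lapply(as.character(myDF$x), fromJSON)

# flatten each row to a plain vector and stack the rows into a matrix
flat <- do.call(rbind, lapply(parsed, unlist))

# group sizes from the first row (assumes all rows share this structure)
lets <- unlist(lapply(parsed[[1]], length))

# build the data frame and name the columns A1, A2, A3, B1, B2
newDF <- as.data.frame(flat)
names(newDF) <- paste(rep(LETTERS[1:length(lets)], lets), sequence(lets), sep = "")
newDF
```

If the rows can differ in structure, the rbind step will not line up and you would need to check the lengths per row first.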

I think the morning coffee has got through the blood-brain barrier now.

Barry


