[R] divide column in a dataframe based on a character

Daisy Englert Duursma daisy.duursma at gmail.com
Tue Oct 26 06:47:19 CEST 2010


Thanks for the help. Easy as..

On Tue, Oct 26, 2010 at 3:33 PM, David Winsemius <dwinsemius at comcast.net> wrote:
>
> On Oct 25, 2010, at 8:56 PM, Daisy Englert Duursma wrote:
>
>> Hello,
>>
>> If I have a dataframe:
>>
>> example(data.frame)
>>
>> zz<-c("aa_bb","bb_cc","cc_dd","dd_ee","ee_ff","ff_gg","gg_hh","ii_jj","jj_kk","kk_ll")
>> ddd <- cbind(dd, group = zz)
>>
>> and I want to divide the column named group by the "_", how would I do
>> this?
>>
>> so instead of the first row being
>> x   y  fac char  group
>> 1  1   C    a     aa_bb
>>
>> it should be:
>> x  y fac  char group_a    group_b
>> 1  1   C    a      aa             bb
>>
>>
>>
>> I know for a vector I can:
>> x1 <- c("a_b","b_c","c_d")
>> do.call("rbind",strsplit(x1, "_"))
>>
>> but I am not sure how this relates to my data.frame
>
> The group column is a factor, which is the default for character
> arguments to data.frame() and cbind.data.frame(). If you want the
> split values, you must first convert to character:
>
>> ddd$group_a <- sapply(strsplit(as.character(ddd$group), "_"), "[", 1)
>> ddd$group_b <- sapply(strsplit(as.character(ddd$group), "_"), "[", 2)
>> ddd
>   x  y fac char group group_a group_b
> 1  1  1   C    a aa_bb    aa     bb
> 2  1  2   B    b bb_cc    bb     cc
> 3  1  3   C    c cc_dd    cc     dd
> 4  1  4   C    d dd_ee    dd     ee
> 5  1  5   B    e ee_ff    ee     ff
> 6  1  6   A    f ff_gg    ff     gg
> 7  1  7   C    g gg_hh    gg     hh
> 8  1  8   A    h ii_jj    ii     jj
> 9  1  9   B    i jj_kk    jj     kk
> 10 1 10   B    j kk_ll    kk     ll
>
> --
> David.
>
>
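
For the archive, here is a minimal sketch of how the do.call("rbind", strsplit(...)) idiom from my original question ties in with the advice above. The small data frame is a hypothetical stand-in for ddd, not the one built by example(data.frame), and stringsAsFactors = TRUE is set explicitly because R 4.0.0 and later no longer convert strings to factors by default. It assumes every value contains exactly one "_".

```r
# Hypothetical stand-in for ddd; only the group column matters here.
ddd <- data.frame(x = 1, y = 1:3,
                  group = c("aa_bb", "bb_cc", "cc_dd"),
                  stringsAsFactors = TRUE)  # mimic the factor column discussed above

# Convert the factor to character, split once on the literal "_",
# and bind the pieces row-wise into a two-column character matrix.
parts <- do.call(rbind, strsplit(as.character(ddd$group), "_", fixed = TRUE))

# Each matrix column becomes an ordinary character column.
ddd$group_a <- parts[, 1]
ddd$group_b <- parts[, 2]

str(ddd)
```

Splitting once and indexing the matrix avoids calling strsplit() twice. If some values had more or fewer pieces, rbind() would recycle elements, so the result should be checked in that case.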



-- 
Daisy Englert Duursma

Room E8C156
Dept. Biological Sciences
Macquarie University  NSW  2109
Australia

Tel +61 2 9850 9256





