[R] divide column in a dataframe based on a character
David Winsemius
dwinsemius at comcast.net
Tue Oct 26 06:33:30 CEST 2010
On Oct 25, 2010, at 8:56 PM, Daisy Englert Duursma wrote:
> Hello,
>
> If I have a dataframe:
>
> example(data.frame)
> zz <- c("aa_bb","bb_cc","cc_dd","dd_ee","ee_ff","ff_gg","gg_hh","ii_jj","jj_kk","kk_ll")
> ddd <- cbind(dd, group = zz)
>
> and I want to divide the column named group by the "_", how would I
> do this?
>
> so instead of the first row being
> x y fac char group
> 1 1 C a aa_bb
>
> it should be:
> x y fac char group_a group_b
> 1 1 C a aa bb
>
>
>
> I know for a vector I can:
> x1 <- c("a_b","b_c","c_d")
> do.call("rbind",strsplit(x1, "_"))
>
> but I am not sure how this relates to my data.frame
The group column is a factor, which is the default structure for character
arguments to data.frame() and cbind.data.frame(). If you want to split the
values, you must first convert to character:
> ddd$group_a <- sapply(strsplit(as.character(ddd$group), "_"), "[", 1)
> ddd$group_b <- sapply(strsplit(as.character(ddd$group), "_"), "[", 2)
> ddd
   x  y fac char group group_a group_b
1  1  1   C    a aa_bb      aa      bb
2  1  2   B    b bb_cc      bb      cc
3  1  3   C    c cc_dd      cc      dd
4  1  4   C    d dd_ee      dd      ee
5  1  5   B    e ee_ff      ee      ff
6  1  6   A    f ff_gg      ff      gg
7  1  7   C    g gg_hh      gg      hh
8  1  8   A    h ii_jj      ii      jj
9  1  9   B    i jj_kk      jj      kk
10 1 10   B    j kk_ll      kk      ll
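
The do.call("rbind", ...) idiom from the question can also be attached to the
data frame directly. What follows is only a minimal sketch: the small ddd built
here is a hypothetical stand-in for the poster's object, and parts and res are
names made up for illustration. It assumes every value of group contains
exactly one "_"; with a different number of pieces, rbind() would recycle the
shorter ones rather than fail.

```r
# Hypothetical stand-in for the poster's ddd; group is forced to a factor
# to mimic the situation described above.
ddd <- data.frame(x = 1, y = 1:3,
                  group = c("aa_bb", "bb_cc", "cc_dd"),
                  stringsAsFactors = TRUE)

# strsplit() needs character input, so convert the factor first.
# fixed = TRUE treats "_" as a literal string, not a regular expression.
parts <- do.call(rbind, strsplit(as.character(ddd$group), "_", fixed = TRUE))
colnames(parts) <- c("group_a", "group_b")

# Drop the original group column and bind the two new character columns.
res <- cbind(ddd[setdiff(names(ddd), "group")],
             as.data.frame(parts, stringsAsFactors = FALSE))
res
```

This splits each string once instead of twice, and the new columns come out as
plain character vectors. The as.character() call is harmless if group is
already character, as it would be under the stringsAsFactors = FALSE default
of R 4.0.0 and later.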
--
David.