[R] dataframe: string operations on columns
Waclaw Kusnierczyk
waku at idi.ntnu.no
Wed Jan 19 02:02:49 CET 2011
Assuming every row is split into exactly two values by whatever string
you choose as split, one fancy exercise in R data structures is
    dfsplit = function(df, split)
        as.data.frame(
            t(
                structure(dim=c(2, nrow(df)),
                    unlist(
                        strsplit(split=split,
                            as.matrix(df))))))
so that if your data frame is
df = data.frame(c('1 2', '3 4', '5 6'))
then
dfsplit(df, ' ')
    #   V1 V2
    # 1  1  2
    # 2  3  4
    # 3  5  6
Renaming the columns is left as an exercise.
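For the impatient, one possible way to do the renaming is sketched below. This is only an illustration, not a replacement for the function above: the name dfsplit_named and the col_names argument are made up for this example, the split string is assumed to be a literal (hence fixed = TRUE), and only the first column of the data frame is split.

```r
# Hypothetical variant of dfsplit(); the function name and the
# `col_names` argument are assumptions made for this sketch.
dfsplit_named = function(df, split, col_names = c("first", "second")) {
    # Split the first column only; as.character() guards against factors.
    parts = strsplit(as.character(df[[1]]), split = split, fixed = TRUE)
    # Fail early if some row does not split into exactly two values.
    stopifnot(all(lengths(parts) == 2))
    # Bind the pieces row-wise into a two-column character matrix.
    out = as.data.frame(do.call(rbind, parts), stringsAsFactors = FALSE)
    names(out) = col_names
    out
}

df = data.frame(x = c('1 2', '3 4', '5 6'))
res = dfsplit_named(df, ' ')
```

Both resulting columns are character vectors; wrap them in as.numeric() if numbers are wanted.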
vQ
On 01/18/2011 05:22 PM, Peter Ehlers wrote:
> On 2011-01-18 08:14, Ivan Calandra wrote:
>> Hi,
>>
>> I guess it's not the nicest way to do it, but it should work for you:
>>
>> #create some sample data
>> df <- data.frame(a=c("A B", "C D", "A C", "A D", "B D"),
>>                  stringsAsFactors=FALSE)
>> #split the column by space
>> df_split <- strsplit(df$a, split=" ")
>>
>> #place the first element into column a1 and the second into a2
>> for (i in 1:length(df_split[[1]])){
>>   df[i+1] <- unlist(lapply(df_split, FUN=function(x) x[i]))
>>   names(df)[i+1] <- paste("a",i,sep="")
>> }
>>
>> I hope people will give you more compact solutions.
>> HTH,
>> Ivan
>>
> You can replace the loop with
>
>   df <- transform(df, a1 = sapply(df_split, "[[", 1),
>                       a2 = sapply(df_split, "[[", 2))
>
> Peter Ehlers
>
>>
>>
>> On 1/18/2011 16:30, boris pezzatti wrote:
>>>
>>> Dear all,
>>> how can I perform a string operation like strsplit(x," ") on a column
>>> of a dataframe, and put the first or the second item of the split into
>>> a new dataframe column?
>>> (so that on each row it is consistent)
>>>
>>> Thanks
>>> Boris