[R] Creating a dataframe from a vector of character strings

Rolf Turner rolf.turner at xtra.co.nz
Fri Apr 15 00:41:21 CEST 2011


On 15/04/11 09:04, Cliff Clive wrote:
> I have a vector of character strings that I would like to split in two, and
> place in columns of a dataframe.
>
> So for example, I start with this:
>
> beatles <- c("John Lennon", "Paul McCartney", "George Harrison",
>              "Ringo Starr")
>
> and I want to end up with a data frame that looks like this:
>
>> Beatles = data.frame(firstName=c("John", "Paul", "George", "Ringo"),
>                         lastName=c("Lennon", "McCartney", "Harrison", "Starr"))
>> Beatles
>    firstName  lastName
> 1      John    Lennon
> 2      Paul McCartney
> 3    George  Harrison
> 4     Ringo     Starr
>
>
> I tried string-splitting the first vector on the spaces between first and
> last names, and it returned a list:
>
>> strsplit(beatles, " ")
> [[1]]
> [1] "John"   "Lennon"
>
> [[2]]
> [1] "Paul"      "McCartney"
>
> [[3]]
> [1] "George"   "Harrison"
>
> [[4]]
> [1] "Ringo" "Starr"
>
>
> Is there a fast way to convert this list into a data frame?  Right now all I
> can think of is using a for loop, which I would like to avoid, since the
> real application I am working on involves a much larger dataset.

Whenever you think of using a for loop, stop and think about using
some flavour of apply() instead:

melvin <- strsplit(beatles, " ")
clyde <- data.frame(firstName = sapply(melvin, function(x) x[1]),
                    lastName  = sapply(melvin, function(x) x[2]))
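
For what it's worth, here is a self-contained sketch of the same idea with two small variations. It assumes every name consists of exactly two space-separated parts; a name like "George Harrison Jr" would need extra handling. The object names (parts, first, last, m, Beatles2) are purely illustrative. The first variant uses vapply(), which behaves like sapply() but insists that each result is a single character string; the second binds the list elements into a character matrix with do.call(rbind, ...).

```r
beatles <- c("John Lennon", "Paul McCartney", "George Harrison", "Ringo Starr")

# Split each name on a literal space; the result is a list of character vectors.
parts <- strsplit(beatles, " ", fixed = TRUE)

# Variant 1: extract the first and second piece of every element.
# vapply() raises an error if any result is not a length-1 character vector.
first <- vapply(parts, function(x) x[1], character(1))
last  <- vapply(parts, function(x) x[2], character(1))
Beatles <- data.frame(firstName = first, lastName = last,
                      stringsAsFactors = FALSE)

# Variant 2: stack the list elements as rows of a character matrix.
# This assumes every element has the same length (two, here).
m <- do.call(rbind, parts)
Beatles2 <- data.frame(firstName = m[, 1], lastName = m[, 2],
                       stringsAsFactors = FALSE)

print(Beatles)
```

Both variants avoid an explicit for loop. If some names might have more or fewer than two parts, checking lengths(parts) before building the data frame would be a sensible precaution.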

     cheers,

             Rolf Turner


