[R] how to split a data frame by two variables
MacQueen, Don
macqueen1 at llnl.gov
Thu Sep 1 20:28:08 CEST 2011
Even though it's not needed, here's a small follow-up.
I usually use this:
split(x, paste(x$let,x$g))
But since
split(x, list(x$let,x$g))
works, so does
split(x, x[,c('let','g')])
> all.equal( split(x, x[,c('let','g')]) , split(x,list(x$let,x$g)))
[1] TRUE
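
For anyone who wants to run the comparison, here is a minimal sketch that rebuilds x from the original question and applies all three forms. The names by_paste, by_list and by_cols are only illustrative. One difference worth knowing: the paste() form labels groups with a space ("a 1") and orders them alphabetically, while the list forms go through interaction(), so labels use a dot ("a.1") and the first variable varies fastest.

```r
# Rebuild the data frame from the original question; the short vectors
# let and g are recycled to the length of num.
x <- data.frame(
  num = c(10,11,12,43,23,14,52,52,12,23,21,23,32,31,24,45,56,56,76,45),
  let = letters[1:5],
  g   = 1:2
)

by_paste <- split(x, paste(x$let, x$g))     # group labels like "a 1"
by_list  <- split(x, list(x$let, x$g))      # group labels like "a.1"
by_cols  <- split(x, x[, c("let", "g")])    # same grouping as by_list

length(by_list)         # number of let/g combinations
names(by_paste)[1:2]    # labels built by paste()
names(by_list)[1:2]     # labels built by interaction()

# The pieces themselves hold the same rows; only the labels and order differ.
identical(by_paste[["a 1"]], by_list[["a.1"]])
```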
As to which is best, it's hard to say. If the names of the variables you want to
split by are held in a character vector, then the third one has an advantage:
splt.by <- c('let','g')
split(x, x[,splt.by] )
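
The character-vector form can be wrapped in a small helper, sketched below. split_by is a hypothetical name, not something from the thread, and the toy data frame d is made up for illustration. The sketch also shows a case the example data does not exercise: when some combination of the variables never occurs, the list and data-frame forms keep it as an empty group unless drop = TRUE is passed, whereas the paste() form never creates it.

```r
# Hypothetical helper (not from the thread): split a data frame by the
# columns named in a character vector.
split_by <- function(df, cols, drop = TRUE) {
  # df[cols] always returns a data frame, even when cols has length 1,
  # whereas df[, cols] collapses to a plain vector in that case.
  split(df, df[cols], drop = drop)
}

# Made-up, unbalanced toy data: the combination let = "b", g = 2 never occurs.
d <- data.frame(num = 1:5,
                let = c("a", "a", "b", "a", "b"),
                g   = c(1, 2, 1, 2, 1))

splt.by <- c("let", "g")
names(split(d, d[, splt.by]))   # includes the empty group "b.2"
names(split_by(d, splt.by))     # empty group dropped
```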
If x were large, and the number of columns to split by were large, there
might be performance differences, but I suspect they would have to be
*very* large before it mattered.
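
A rough way to check this on your own machine is sketched below. The data frame big, its size n and the seed are arbitrary assumptions, and the timings will vary by machine and R version, so treat the result as indicative only.

```r
set.seed(1)    # arbitrary seed, only for reproducibility
n <- 1e5       # assumed size; increase it to stress the comparison
big <- data.frame(num = rnorm(n),
                  let = sample(letters, n, replace = TRUE),
                  g   = sample(1:10, n, replace = TRUE))

# Time each of the three forms on the same data.
t_paste <- system.time(s1 <- split(big, paste(big$let, big$g)))
t_list  <- system.time(s2 <- split(big, list(big$let, big$g)))
t_cols  <- system.time(s3 <- split(big, big[, c("let", "g")]))

# Elapsed seconds for each form.
c(paste = t_paste[["elapsed"]],
  list  = t_list[["elapsed"]],
  cols  = t_cols[["elapsed"]])
```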
-Don
--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
On 9/1/11 11:08 AM, "Changbin Du" <changbind at gmail.com> wrote:
>Thanks for the great help from David, Jim and Liviu. It solved my
>problem.
>
>Appreciated!
>
>On Thu, Sep 1, 2011 at 11:01 AM, David Winsemius
><dwinsemius at comcast.net> wrote:
>
>>
>> On Sep 1, 2011, at 1:53 PM, Changbin Du wrote:
>>
>> HI, Dear R community,
>>>
>>> I want to split a data frame by using two variables: let and g
>>>
>>> x = data.frame(num =
>>> c(10,11,12,43,23,14,52,52,12,23,21,23,32,31,24,45,56,56,76,45),
>>> let = letters[1:5], g = 1:2)
>>>
>>>> x
>>>>
>>> num let g
>>> 1 10 a 1
>>> 2 11 b 2
>>> 3 12 c 1
>>> 4 43 d 2
>>> 5 23 e 1
>>> 6 14 a 2
>>> 7 52 b 1
>>> 8 52 c 2
>>> 9 12 d 1
>>> 10 23 e 2
>>> 11 21 a 1
>>> 12 23 b 2
>>> 13 32 c 1
>>> 14 31 d 2
>>> 15 24 e 1
>>> 16 45 a 2
>>> 17 56 b 1
>>> 18 56 c 2
>>> 19 76 d 1
>>> 20 45 e 2
>>>
>>> I tried the following:
>>>
>>> xs = split(x,x$g*x$let)
>>>
>>
>> Probably
>>
>> xs = split(x,list(x$g,x$let))
>>
>>>
>>> Warning message:
>>> In Ops.factor(x$g, x$let) : * not meaningful for factors
>>>
>>>
>>> xs = split(x,c(x$g*x$let))
>>>
>>> Warning message:
>>> In Ops.factor(x$g, x$let) : * not meaningful for factors
>>>
>>> Can someone give some hints?
>>>
>>> Thanks!
>>>
>>>
>>> --
>>> Sincerely,
>>> Changbin
>>> --
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>> David Winsemius, MD
>> West Hartford, CT
>>
>>
>
>
>--
>Sincerely,
>Changbin
>--
>
>Changbin Du
>Data Analysis Group, Affymetrix Inc
>6550 Emeryville, CA, 94608
>