[R] From data frame to list object
David Winsemius
dwinsemius at comcast.net
Mon Jan 31 20:31:21 CET 2011
On Jan 31, 2011, at 2:26 PM, David Winsemius wrote:
>
> On Jan 31, 2011, at 2:18 PM, Bogaso Christofer wrote:
>
>> Sorry if I did not clarify that. Here I have a data frame with many
>> columns, which was taken from some outside DB. Now I want to split that
>> data frame and create a "list" object (to make my further calculations
>> easier), on the basis of a typical column of that DB. I cannot post my
>> original DB here (due to security reasons and, of course, its huge
>> size), therefore I posted an artificial DB.
>>
>> Here my artificial DB with 3 columns:
>>
>> dfrm <- data.frame(x=rnorm(18), y=rep(c("a", "b", "c"), each=6),
>> z=rep(c("x", "y", "z"), each=2))
>>
>> I would like to create a list object where each element is a matrix or
>> data frame, based on that "y" column. The 1st element of that list will
>> be a data frame with the observations of the "x" and "z" columns that
>> correspond to "y = a". Similarly for the other two.
>
> It is a job for split. The result is a data.frame but that is, after
> all, just a list with certain attributes.
>
Er, the result is a list of dataframes.
> > split(dfrm, dfrm$y)
> $a
>             x y z
> 1 -1.16790385 a x
> 2 -0.84831139 a x
> 3 -0.64312051 a y
> 4 -1.66841121 a y
> 5  0.03737404 a z
> 6 -0.42165643 a z
>
> $b
>             x y z
> 7   1.1045024 b x
> 8   1.4787933 b x
> 9   0.5278083 b y
> 10  0.1770083 b y
> 11 -0.5054573 b z
> 12 -0.6512499 b z
>
> $c
>              x y z
> 13  0.61225420 c x
> 14 -0.45032691 c x
> 15  0.36502921 c y
> 16  0.33505288 c y
> 17  0.02189088 c z
> 18 -0.53893624 c z
>
> > str(split(dfrm, dfrm$y)$a)
> 'data.frame': 6 obs. of 3 variables:
>  $ x: num -1.1679 -0.8483 -0.6431 -1.6684 0.0374 ...
>  $ y: Factor w/ 3 levels "a","b","c": 1 1 1 1 1 1
>  $ z: Factor w/ 3 levels "x","y","z": 1 1 2 2 3 3
>
> --
> David
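
Since the request was for elements holding only the "x" and "z" columns, one possible variation is to subset the columns before splitting. This is a minimal sketch rather than anything posted above: the set.seed() call and the name by_y are illustrative additions.

```r
# Artificial data from the thread; set.seed() is added only so the
# random draws are reproducible (an assumption, not in the original post).
set.seed(1)
dfrm <- data.frame(x = rnorm(18),
                   y = rep(c("a", "b", "c"), each = 6),
                   z = rep(c("x", "y", "z"), each = 2))

# Keep only the x and z columns, then split the rows by the values of y.
# by_y is a hypothetical name chosen for this sketch.
by_y <- split(dfrm[, c("x", "z")], dfrm$y)

class(by_y)            # the container is a plain list
names(by_y)            # one element per value of y
sapply(by_y, class)    # each element is a data frame
```

Each element can then be used on its own, for example by_y$a or by_y[["a"]].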
>>
>> Hope I could be able to make my intentions clearer.
>>
>> Any idea how I can achieve that?
>>
>> Thanks,
>>
>>
>>
>> -----Original Message-----
>> From: David Winsemius [mailto:dwinsemius at comcast.net]
>> Sent: 01 February 2011 00:13
>> To: Bogaso Christofer
>> Cc: r-help at r-project.org
>> Subject: Re: [R] From data frame to list object
>>
>>
>> On Jan 31, 2011, at 1:56 PM, Bogaso Christofer wrote:
>>
>>> Thanks David for this reply. It works fine when my data frame has only
>>> 2 columns, but it does not work in a more general setting:
>>>
>>> dfrm <- data.frame(x=rnorm(18), y=rep(c("a", "b", "c"), each=6),
>>> z=rep(c("x", "y", "z"), each=2))
>>> tapply(dfrm[,1], dfrm$y, c) # this is working fine
>>>
>>>> tapply(dfrm[,c(1,3)], dfrm$y, c) # this is giving error!
>>> Error in tapply(dfrm[, c(1, 3)], dfrm$y, c) :
>>> arguments must have same length
>>>
>>> Can you please help me modify that?
>>
>> You will need to specify what your goals are. What do you want to
>> happen to those two columns referred to by dfrm[, c(1,3)]? It's
>> possible that split() may be the answer, but clarify the goals first.
>> You should provide an example that represents the complexity of the
>> task.
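
A likely reason for the "arguments must have same length" error, offered as an assumption about the R version in use at the time: tapply() compares the length of its first argument with the length of the index, and the length of a data frame is its number of columns, not its number of rows. The sketch below only checks those two lengths and then does a per-group calculation through split() and sapply(). The names groups and mean_x, and the set.seed() call, are illustrative and not part of the thread.

```r
# Artificial data from the thread; set.seed() added for reproducibility.
set.seed(1)
dfrm <- data.frame(x = rnorm(18),
                   y = rep(c("a", "b", "c"), each = 6),
                   z = rep(c("x", "y", "z"), each = 2))

# A data frame is a list of columns, so its length is the column count.
length(dfrm[, c(1, 3)])   # number of columns selected
length(dfrm$y)            # number of rows

# One way around the mismatch: split the rows first, then apply a
# function to each piece. Here the function is the mean of x per group.
groups <- split(dfrm[, c(1, 3)], dfrm$y)
mean_x <- sapply(groups, function(d) mean(d$x))
mean_x
```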
>>
>>>
>>> Thanks,
>>>
>>> -----Original Message-----
>>> From: David Winsemius [mailto:dwinsemius at comcast.net]
>>> Sent: 31 January 2011 23:26
>>> To: Bogaso Christofer
>>> Cc: r-help at r-project.org
>>> Subject: Re: [R] From data frame to list object
>>>
>>>
>>> On Jan 31, 2011, at 1:03 PM, Bogaso Christofer wrote:
>>>
>>>> Dear all, let's say I have the following data frame:
>>>>
>>>>
>>>
>>> > dfrm <- data.frame(x=rnorm(18), y=rep(c("a", "b", "c"), each=6))
>>> > tapply(dfrm$x, dfrm$y, c)
>>> $a
>>> [1]  0.9711995  1.4018345 -1.4355713 -0.5106138 -0.8470171
>>> [6]  1.1634586
>>>
>>> $b
>>> [1] -0.8058164  0.4977112  1.1556391  0.8158588  0.2549273
>>> [6]  3.0758260
>>>
>>> $c
>>> [1]  0.437345128 -0.415874363  0.003230285 -0.737117910
>>> [5]  1.247972964  0.903001077
>>>
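
For a single vector, split() can be used in the same way as the tapply(..., c) call above. A minimal sketch for comparison, assuming the two-column data frame from the original question: the names by_split and by_tapply and the set.seed() call are illustrative additions.

```r
# Two-column artificial data from the original question;
# set.seed() added only for reproducibility.
set.seed(1)
dfrm <- data.frame(x = rnorm(18),
                   y = rep(c("a", "b", "c"), each = 6))

# split() on a vector returns a plain named list of vectors, one per group.
by_split <- split(dfrm$x, dfrm$y)

# tapply() with c collects the same values per group.
by_tapply <- tapply(dfrm$x, dfrm$y, c)

names(by_split)
length(by_split)

# Compare the first group from both approaches by position.
all.equal(by_split[[1]], by_tapply[[1]])
```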
>>>
>>>>
>>>> > data.frame(x=rnorm(18), y=rep(c("a", "b", "c"), each=6))
>>>>               x y
>>>> 1  -1.072152537 a
>>>> 2   0.382985265 a
>>>> 3   0.058877377 a
>>>> 4  -0.006911939 a
>>>> 5  -2.355269051 a
>>>> 6  -0.303095553 a
>>>> 7   0.484038422 b
>>>> 8   0.733928931 b
>>>> 9  -1.136014346 b
>>>> 10  0.503552090 b
>>>> 11  1.708609658 b
>>>> 12 -0.294599403 b
>>>> 13  1.239308497 c
>>>> 14  0.754081946 c
>>>> 15 -0.237346858 c
>>>> 16 -0.051011439 c
>>>> 17 -0.618675146 c
>>>> 18  0.537612359 c
>>>>
>>>>
>>>>
>>>> From this data frame I want to create a "list" of length 3, where each
>>>> element of this list will be a vector corresponding to a value of y.
>>>> For example, the 1st element will be all "x" values corresponding to
>>>> "y=a", and similarly for the other elements of this list. Can somebody
>>>> point me to how to do this without a "for" loop?
>>>>
>>>>
>>>>
>>>> Thanks and regards,
>>>>
>>>>
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>> David Winsemius, MD
>>> West Hartford, CT
>>>
David Winsemius, MD
West Hartford, CT