[R] From data frame to list object
David Winsemius
dwinsemius at comcast.net
Mon Jan 31 20:26:16 CET 2011
On Jan 31, 2011, at 2:18 PM, Bogaso Christofer wrote:
> Sorry if I did not clarify that. Here I have a data frame with many columns,
> which was taken from some outside DB. Now I want to split that data frame
> and create a "list" object (to make my further calculation easier), on the
> basis of a typical column of that DB. I cannot post my original DB here (due
> to some security reason and of course its huge size), therefore I posted an
> artificial DB.
>
> Here my artificial DB with 3 columns:
>
> dfrm <- data.frame(x=rnorm(18), y=rep(c("a", "b", "c"), each=6),
> z=rep(c("x", "y", "z"), each=2))
>
> I would like to create a list object where each element now is a matrix or
> data frame, based on that "y" column. The 1st element of that list will be a
> data frame with observations of the "x" and "z" columns that correspond to
> "y = a". Similarly for the other two.
This is a job for split(). The result is a list of data frames, and a
data.frame is, after all, just a list with certain attributes.
> split(dfrm, dfrm$y)
$a
            x y z
1 -1.16790385 a x
2 -0.84831139 a x
3 -0.64312051 a y
4 -1.66841121 a y
5  0.03737404 a z
6 -0.42165643 a z

$b
            x y z
7   1.1045024 b x
8   1.4787933 b x
9   0.5278083 b y
10  0.1770083 b y
11 -0.5054573 b z
12 -0.6512499 b z

$c
             x y z
13  0.61225420 c x
14 -0.45032691 c x
15  0.36502921 c y
16  0.33505288 c y
17  0.02189088 c z
18 -0.53893624 c z
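
If only the "x" and "z" columns are wanted in each element, as the question describes, one possible sketch is to subset the columns before splitting. It assumes the same dfrm as in the question; the set.seed() call and the name `pieces` are additions made only for illustration.

```r
set.seed(1)  # assumption: added only so the random x values are reproducible
dfrm <- data.frame(x = rnorm(18),
                   y = rep(c("a", "b", "c"), each = 6),
                   z = rep(c("x", "y", "z"), each = 2))

# Keep only the x and z columns, then split the rows by the grouping column y
pieces <- split(dfrm[, c("x", "z")], dfrm$y)

names(pieces)   # one list element per distinct value of y
dim(pieces$a)   # rows and columns of the first element
```

Each element of `pieces` is still a data frame, so `pieces$a$x` or `pieces[["b"]]` can be used in later calculations.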
> str(split(dfrm, dfrm$y)$a)
'data.frame': 6 obs. of 3 variables:
$ x: num -1.1679 -0.8483 -0.6431 -1.6684 0.0374 ...
$ y: Factor w/ 3 levels "a","b","c": 1 1 1 1 1 1
$ z: Factor w/ 3 levels "x","y","z": 1 1 2 2 3 3
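
As for the tapply() error quoted below, it appears to arise because length() of a data frame counts its columns, so the two-column subset has length 2 while the grouping vector has length 18. A minimal sketch of that check, followed by per-group work on the split() result; the per-group mean and the names `groups` and `group_means` are illustrative choices, not part of the original question.

```r
set.seed(1)  # assumption: added only so the random x values are reproducible
dfrm <- data.frame(x = rnorm(18),
                   y = rep(c("a", "b", "c"), each = 6),
                   z = rep(c("x", "y", "z"), each = 2))

length(dfrm[, c(1, 3)])  # length() of a data frame is its number of columns
length(dfrm$y)           # one grouping value per row

# split() divides the data frame by rows, so each group is a data frame
groups <- split(dfrm, dfrm$y)

# Apply any per-group calculation; here, the mean of x within each group
group_means <- sapply(groups, function(d) mean(d$x))
group_means
```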
--
David
>
> I hope I have made my intentions clearer.
>
> Any idea how I can achieve that?
>
> Thanks,
>
>
>
> -----Original Message-----
> From: David Winsemius [mailto:dwinsemius at comcast.net]
> Sent: 01 February 2011 00:13
> To: Bogaso Christofer
> Cc: r-help at r-project.org
> Subject: Re: [R] From data frame to list object
>
>
> On Jan 31, 2011, at 1:56 PM, Bogaso Christofer wrote:
>
>> Thanks David for this reply. However if my data frame has only 2 columns
>> then it is working fine. It is not working for a general setting:
>>
>> dfrm <- data.frame(x=rnorm(18), y=rep(c("a", "b", "c"), each=6),
>> z=rep(c("x", "y", "z"), each=2))
>> tapply(dfrm[,1], dfrm$y, c) # this is working fine
>>
>>> tapply(dfrm[,c(1,3)], dfrm$y, c) # this is giving error!
>> Error in tapply(dfrm[, c(1, 3)], dfrm$y, c) :
>> arguments must have same length
>>
>> Can you please help me how to modify that?
>
> You will need to specify what your goals are. What do you want to
> happen to those two columns referred to by dfrm[, c(1,3)]? It's
> possible that split() may be the answer, but clarify the goals first.
> You should provide an example that represents the complexity of the
> task.
>
>>
>> Thanks,
>>
>> -----Original Message-----
>> From: David Winsemius [mailto:dwinsemius at comcast.net]
>> Sent: 31 January 2011 23:26
>> To: Bogaso Christofer
>> Cc: r-help at r-project.org
>> Subject: Re: [R] From data frame to list object
>>
>>
>> On Jan 31, 2011, at 1:03 PM, Bogaso Christofer wrote:
>>
>>> Dear all, let's say I have the following data frame:
>>>
>>>
>>
>> > dfrm <- data.frame(x=rnorm(18), y=rep(c("a", "b", "c"), each=6))
>> > tapply(dfrm$x, dfrm$y, c)
>> $a
>> [1]  0.9711995  1.4018345 -1.4355713 -0.5106138 -0.8470171
>> [6]  1.1634586
>>
>> $b
>> [1] -0.8058164  0.4977112  1.1556391  0.8158588  0.2549273
>> [6]  3.0758260
>>
>> $c
>> [1]  0.437345128 -0.415874363  0.003230285 -0.737117910
>> [5]  1.247972964  0.903001077
>>
>>
>>>
>>> > data.frame(x=rnorm(18), y=rep(c("a", "b", "c"), each=6))
>>>               x y
>>> 1  -1.072152537 a
>>> 2   0.382985265 a
>>> 3   0.058877377 a
>>> 4  -0.006911939 a
>>> 5  -2.355269051 a
>>> 6  -0.303095553 a
>>> 7   0.484038422 b
>>> 8   0.733928931 b
>>> 9  -1.136014346 b
>>> 10  0.503552090 b
>>> 11  1.708609658 b
>>> 12 -0.294599403 b
>>> 13  1.239308497 c
>>> 14  0.754081946 c
>>> 15 -0.237346858 c
>>> 16 -0.051011439 c
>>> 17 -0.618675146 c
>>> 18  0.537612359 c
>>>
>>>
>>>
>>> From this data frame I want to create a "list" of length 3, where each
>>> element of this list will be a vector corresponding to the value of y.
>>> For example, the 1st element will be all "x" values corresponding to
>>> "y=a", and similarly for the other elements of this list. Can somebody
>>> show me how to do this without a "for" loop?
>>>
>>>
>>>
>>> Thanks and regards,
>>>
>>>
>>
>> David Winsemius, MD
>> West Hartford, CT
>>
>
> David Winsemius, MD
> West Hartford, CT
>
David Winsemius, MD
West Hartford, CT