[R] From data frame to list object

David Winsemius dwinsemius at comcast.net
Mon Jan 31 20:26:16 CET 2011


On Jan 31, 2011, at 2:18 PM, Bogaso Christofer wrote:

> Sorry if I did not clarify that. Here I have a data frame with many columns,
> which was taken from some outside DB. Now I want to split that data frame
> and create a "list" object (to make my further calculation easier), on the
> basis of a typical column of that DB. I cannot post my original DB here (due
> to security reasons and, of course, its huge size), therefore I posted an
> artificial DB.
>
> Here is my artificial DB with 3 columns:
>
> dfrm <- data.frame(x=rnorm(18), y=rep(c("a", "b", "c"), each=6),
> z=rep(c("x", "y", "z"), each=2))
>
> I would like to create a list object where each element is a matrix or
> data frame, based on that "y" column. The 1st element of that list will be a
> data frame with the observations of the "x" and "z" columns that correspond
> to "y = a". Similarly for the other two.

This is a job for split(). It returns a list with one element per value of
y. Each element is a data.frame, but that is, after all, just a list with
certain attributes.

 > split(dfrm, dfrm$y)
$a
             x y z
1 -1.16790385 a x
2 -0.84831139 a x
3 -0.64312051 a y
4 -1.66841121 a y
5  0.03737404 a z
6 -0.42165643 a z

$b
             x y z
7   1.1045024 b x
8   1.4787933 b x
9   0.5278083 b y
10  0.1770083 b y
11 -0.5054573 b z
12 -0.6512499 b z

$c
              x y z
13  0.61225420 c x
14 -0.45032691 c x
15  0.36502921 c y
16  0.33505288 c y
17  0.02189088 c z
18 -0.53893624 c z

 > str(split(dfrm, dfrm$y)$a)
'data.frame':	6 obs. of  3 variables:
  $ x: num  -1.1679 -0.8483 -0.6431 -1.6684 0.0374 ...
  $ y: Factor w/ 3 levels "a","b","c": 1 1 1 1 1 1
  $ z: Factor w/ 3 levels "x","y","z": 1 1 2 2 3 3
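
If only the "x" and "z" columns are wanted in each element, as the question describes, one possible variation is to subset the columns before splitting. This is only a sketch on the artificial data from this thread: the name by_y and the set.seed() call are my additions for illustration, not part of the original example.

```r
# Artificial data mirroring the example in this thread
# (set.seed() is added here only to make the sketch reproducible).
set.seed(1)
dfrm <- data.frame(x = rnorm(18),
                   y = rep(c("a", "b", "c"), each = 6),
                   z = rep(c("x", "y", "z"), each = 2))

# Keep only the "x" and "z" columns, then group the rows by "y".
# by_y is a hypothetical name chosen for this sketch.
by_y <- split(dfrm[, c("x", "z")], dfrm$y)

# split() returns an ordinary list; each element is a data.frame.
is.list(by_y)
names(by_y)
sapply(by_y, class)
sapply(by_y, nrow)
```

Each element can then be pulled out by name, for example by_y$a or by_y[["a"]], and lapply() or sapply() can be used over by_y in place of a "for" loop.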

-- 
David
>
> I hope I have made my intentions clearer.
>
> Any idea how I can achieve that?
>
> Thanks,
>
>
>
> -----Original Message-----
> From: David Winsemius [mailto:dwinsemius at comcast.net]
> Sent: 01 February 2011 00:13
> To: Bogaso Christofer
> Cc: r-help at r-project.org
> Subject: Re: [R] From data frame to list object
>
>
> On Jan 31, 2011, at 1:56 PM, Bogaso Christofer wrote:
>
>> Thanks David for this reply. However, it works fine only if my data frame
>> has just 2 columns. It is not working for a general setting:
>>
>> dfrm <- data.frame(x=rnorm(18), y=rep(c("a", "b", "c"), each=6),
>> z=rep(c("x", "y", "z"), each=2))
>> tapply(dfrm[,1], dfrm$y, c) # this is working fine
>>
>>> tapply(dfrm[,c(1,3)], dfrm$y, c)  # this is giving error!
>> Error in tapply(dfrm[, c(1, 3)], dfrm$y, c) :
>> arguments must have same length
>>
>> Can you please help me how to modify that?
>
> You will need to specify what your goals are. What do you want to
> happen to those two columns referred to by dfrm[, c(1,3)]? It's
> possible that split() may be the answer, but clarify the goals first.
> You should provide an example that represents the complexity of the
> task.
>
>>
>> Thanks,
>>
>> -----Original Message-----
>> From: David Winsemius [mailto:dwinsemius at comcast.net]
>> Sent: 31 January 2011 23:26
>> To: Bogaso Christofer
>> Cc: r-help at r-project.org
>> Subject: Re: [R] From data frame to list object
>>
>>
>> On Jan 31, 2011, at 1:03 PM, Bogaso Christofer wrote:
>>
>>> Dear all, let's say I have the following data frame:
>>>
>>>
>>
>> > dfrm <- data.frame(x=rnorm(18), y=rep(c("a", "b", "c"), each=6))
>> > tapply(dfrm$x, dfrm$y, c)
>> $a
>> [1]  0.9711995  1.4018345 -1.4355713 -0.5106138 -0.8470171
>> [6]  1.1634586
>>
>> $b
>> [1] -0.8058164  0.4977112  1.1556391  0.8158588  0.2549273
>> [6]  3.0758260
>>
>> $c
>> [1]  0.437345128 -0.415874363  0.003230285 -0.737117910
>> [5]  1.247972964  0.903001077
>>
>>
>>>
>>> > data.frame(x=rnorm(18), y=rep(c("a", "b", "c"), each=6))
>>>               x y
>>> 1  -1.072152537 a
>>> 2   0.382985265 a
>>> 3   0.058877377 a
>>> 4  -0.006911939 a
>>> 5  -2.355269051 a
>>> 6  -0.303095553 a
>>> 7   0.484038422 b
>>> 8   0.733928931 b
>>> 9  -1.136014346 b
>>> 10  0.503552090 b
>>> 11  1.708609658 b
>>> 12 -0.294599403 b
>>> 13  1.239308497 c
>>> 14  0.754081946 c
>>> 15 -0.237346858 c
>>> 16 -0.051011439 c
>>> 17 -0.618675146 c
>>> 18  0.537612359 c
>>>
>>> From this data frame I want to create a "list" of length 3, where each
>>> element of this list will be a vector corresponding to the value of y.
>>> For example, 1st element will be all "x" values corresponding to the
>>> "y=a", and similarly the other elements of this list. Can somebody
>>> point me how to do this without having some "for" loop?
>>>
>>>
>>>
>>> Thanks and regards,
>>>
>>>
>>
>> David Winsemius, MD
>> West Hartford, CT
>>
>
> David Winsemius, MD
> West Hartford, CT
>

David Winsemius, MD
West Hartford, CT


