[R] How to quickly convert a data.frame into a structure of lists

Duncan Mackay mackay at northnet.com.au
Wed Aug 10 08:15:18 CEST 2011


Hi

Something to get you started
? as.list
A data.frame is stored as a list of column vectors of equal length, so
as.list() returns its columns as the elements of a plain list.

df = data.frame(a=1:2,b=2:1,c=4:5,d=9:10)
as.list(df[,1:3])
$a
[1] 1 2

$b
[1] 2 1

$c
[1] 4 5
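
Since each column is a list element, the usual list tools apply. A small
sketch that re-creates the same df (base R only; cols is just an
illustrative name):

```r
df <- data.frame(a = 1:2, b = 2:1, c = 4:5, d = 9:10)

is.list(df)                    # a data.frame is a list of its columns
cols <- as.list(df[, 1:3])     # plain list holding the first three columns
names(cols)                    # element names are the column names
identical(cols$a, df[["a"]])   # list element and column are the same vector
```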

see also
http://cran.ms.unimelb.edu.au/doc/contrib/Burns-unwilling_S.pdf
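
For the nested structure asked about below, one possible route is
split(), which groups the rows in a single pass rather than scanning the
whole data.frame once per pair of levels. This is only a sketch on
made-up toy data: DF, its columns A, B, C and the name nested are
stand-ins, and I have not timed it on a dataset with millions of rows.

```r
# Hypothetical toy data; A, B and C stand in for the three columns of interest
DF <- data.frame(A = c("x", "x", "y", "y", "x"),
                 B = c("p", "q", "p", "p", "p"),
                 C = c(1, 2, 3, 4, 5),
                 stringsAsFactors = TRUE)

# Split the rows once by A, then split each piece's C values by B.
# drop = TRUE on the inner split omits levels of B that never occur
# together with that level of A.
nested <- lapply(split(DF[c("B", "C")], DF$A),
                 function(d) split(d$C, d$B, drop = TRUE))

nested$x$p   # values of C on rows where A == "x" and B == "p"
```

Use drop = FALSE in the inner split() if every level of B should appear
under every level of A, with empty vectors where there are no matching
rows.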

Regards

Duncan


Duncan Mackay
Department of Agronomy and Soil Science
University of New England
ARMIDALE NSW 2351
Email: home mackay at northnet.com.au

At 10:58 10/08/2011, you wrote:
>Hello,
>
>This is my first project in R, so I'm trying to work 'the R way', but it
>still feels awkward sometimes.
>
>The problem that I'm facing right now is that I need to convert a data.frame
>into a structure of lists. The data.frame has columns on the order of tens
>(I need to focus on only three of them) and rows on the order of millions,
>so it's quite a big dataset.
>Let's say that the columns of interest are A, B and C. I need to take the
>data.frame and construct a structure of lists where I have a list for every
>level of A; each of those lists contains a list for every level of B, and the
>'b-lists' contain all the values of C that match the corresponding levels
>of A and B.
>So, I should be able to write something like this:
> > MyData@list_structure$x_level_of_A$y_level_of_B
>and get a vector of the values of C that were on rows where A=x_level_of_A
>and B=y_level_of_B.
>
>My first attempt was to use two nested "lapply" calls running
>something like this:
>
>list_structure <- lapply(setNames(nm = levels(A)), function(x) {
>   lapply(setNames(nm = levels(B)), function(y) {
>     C[A == x & B == y]   # values of C for this pair of levels
>   })
>})
>
>The real code was not quite as simple, but I managed to get it working, and
>it worked well on my first dataset (where A and B had only a few levels). I
>was quite happy... but the nested loops killed me on a second dataset where
>A had several thousand levels. So I tried something else.
>
>My second attempt was to go through every row of the data.frame and append
>the value to the appropriate vector.
>
>I first initialized a structure of lists ending with NULL vectors, then I
>did something like this:
>
>for (i in seq_len(nrow(DF))) {
>   a <- as.character(DF$A[i])
>   b <- as.character(DF$B[i])
>   # append() returns a new vector, so the result must be assigned back
>   MyData@list_structure[[a]][[b]] <-
>     append(MyData@list_structure[[a]][[b]], as.character(DF$C[i]))
>}
>
>This works... but way too slowly for my purpose.
>
>I would like to know if there is a better road to take to do this
>transformation. Or, if there is a way of speeding one of the two solutions
>that I have tried.
>
>Thank you very much for your help!
>
>(And in your replies, please remember that this is my first project in R, so
>don't hesitate to state the obvious if it seems like I am missing it!)
>
>Frederic
>


