[R] Dataframe from list of similar lists: not _a_ way, but _the best_ way
Brian Diggs
diggsb at ohsu.edu
Tue Dec 7 20:18:25 CET 2010
On 12/7/2010 1:03 AM, Nick Sabbe wrote:
> Hi All.
>
> I often find myself in this situation:
>
> . Based on some vector (or list) of values, I need to calculate a
> few new values for each of them, where some of the new values are numbers,
> but some are more of descriptive nature (so: character strings)
>
> . So I use e.g. sapply, passing a custom function that returns a
> list with all the calculated values
>
> . The result of this is: a list (=the return value of sapply) of
> lists, that all have the same kind of named values
>
> A silly example:
>
> list.of.lists<-sapply(1:10, function(nr){list(org=nr,
> chr=as.character(nr))})
Actually, this is not a list of lists: sapply simplifies the result into
a list with dimensions (a 2 x 10 matrix whose cells hold the individual
values). I didn't know such a thing existed, but obviously it does.
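A quick way to check what sapply returned, as a minimal sketch. The name
`lol` is only illustrative and is not from the original post; lapply is
used there because it never simplifies its result.

```r
# Rebuild the example from the question.
list.of.lists <- sapply(1:10, function(nr) {
  list(org = nr, chr = as.character(nr))
})

# Inspect the structure: still a list, but it carries a dim attribute.
is.list(list.of.lists)
dim(list.of.lists)
rownames(list.of.lists)

# lapply does not simplify, so this yields a plain list of ten lists.
lol <- lapply(1:10, function(nr) {
  list(org = nr, chr = as.character(nr))
})
is.null(dim(lol))
```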
> It seems rather obvious that the result would be better structured as a
> dataframe.
>
> Now I know a few ways to do this (using do.call), but I fear most of these
> are rather bad in performance: I suspect all the data is being repetitively
> copied which may be slow.
>
> So, my question to the specialists:
>
> . Is the above way of working reasonable for this kind of problem?
> Or would you suggest otherwise?
>
> . What would be the best (as in: quickest) way of transforming this
> list of lists to a dataframe? The answer to this is probably based upon
> knowledge of the inner workings of R? Or is there any way in which this
> depends on the specifics of my function (for nontrivial functions and list
> sizes)?
I don't know that this is best (in terms of fastest and/or least memory
usage), but to me the following is "best" in that it hands off the
problem to a package that is designed to handle such problems, so
presumably does a better job than any one-off approach.
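library("plyr")
# stringsAsFactors = FALSE keeps chr a character column rather than a factor
DF <- ldply(1:10, function(nr) data.frame(org = nr, chr = as.character(nr),
                                          stringsAsFactors = FALSE))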
Note that the function passed to ldply returns a data.frame rather than a
list, and the *dply functions automatically stitch the individual
data.frames together. Check out the documentation for the plyr package.
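For comparison, here is a base R sketch of the same stitching, in case
plyr is not an option. The names `make.row`, `DF.base`, `res` and
`DF.cols` are hypothetical. I have not timed either variant against
ldply, so treat them as alternatives to try rather than as the fastest
approach.

```r
# Hypothetical helper: one-row data.frame per input value.
make.row <- function(nr) {
  data.frame(org = nr, chr = as.character(nr), stringsAsFactors = FALSE)
}

# Variant 1: build the one-row data.frames, then row-bind them in one call.
DF.base <- do.call(rbind, lapply(1:10, make.row))

# Variant 2: keep the per-value results as plain lists, extract each field
# as a vector, and construct the data.frame only once.
res <- lapply(1:10, function(nr) list(org = nr, chr = as.character(nr)))
DF.cols <- data.frame(
  org = vapply(res, function(x) x$org, integer(1)),
  chr = vapply(res, function(x) x$chr, character(1)),
  stringsAsFactors = FALSE
)

str(DF.base)
str(DF.cols)
```

The second variant avoids creating many small data.frames, which is
usually where the time goes when the list is long.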
> Thanks!
>
> Nick Sabbe
>
> --
> ping: nick.sabbe at ugent.be
> link: http://biomath.ugent.be
> wink: A1.056, Coupure Links 653, 9000 Gent
> ring: 09/264.59.36
> -- Do Not Disapprove
--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University