[R] Fast nested List->data.frame
Greg Hirson
ghirson at ucdavis.edu
Tue Jan 5 09:40:37 CET 2010
Dieter,
I'd approach this by first making a matrix, then converting it to a data
frame with the appropriate types. I'm sure there is a way to do it with
structure() in one step. Operations on matrices are usually faster than on
data frames.
len <- 100000
d <- replicate(len, list(pH = 3, marker = TRUE, position = "A"), FALSE)
toDF <- function(alist){
  # unlist() coerces every field to character, giving a character matrix
  d.matrix <- matrix(unlist(alist), ncol = 3, byrow = TRUE)
  d.df <- as.data.frame(d.matrix)
  names(d.df) <- c('pH', 'marker', 'position')
  # go through as.character() so that a factor column yields its labels,
  # not its integer level codes
  d.df$pH <- as.numeric(as.character(d.df$pH))
  d.df$marker <- as.logical(as.character(d.df$marker))
  return(d.df)
}
on my system,
system.time(b<-toDF(d))
   user  system elapsed
  0.560   0.033   0.592
and
head(b)
  pH marker position
1  3   TRUE        A
2  3   TRUE        A
3  3   TRUE        A
4  3   TRUE        A
5  3   TRUE        A
6  3   TRUE        A
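A side note on the type-conversion step, as a minimal sketch rather than part of the timing above: when a column is a factor (the default for character data in as.data.frame() before R 4.0.0), as.numeric() returns the integer level codes rather than the labels. The explicit stringsAsFactors = TRUE below is an assumption added only to reproduce that older default on any R version; the names m, df, codes and values are illustrative.

```r
# Character matrix, similar to what unlist() yields for mixed-type records
m <- matrix(c("3", "TRUE", "A"), nrow = 4, ncol = 3, byrow = TRUE)

# Force factor columns to mimic the pre-4.0.0 default
df <- as.data.frame(m, stringsAsFactors = TRUE)
names(df) <- c("pH", "marker", "position")

# as.numeric() on a factor gives the level codes ...
codes <- as.numeric(df$pH)

# ... while going through as.character() recovers the original numbers
values <- as.numeric(as.character(df$pH))

print(codes)
print(values)
```

Here codes holds the level index of "3" (there is only one level), whereas values holds the pH values themselves.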
and
sapply(b, class)
       pH    marker  position
"numeric" "logical"  "factor"
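For comparison, a column-wise variant that never builds a character matrix might look like the sketch below. The helper name toDF2 is hypothetical and not from this thread, and the smaller len is only to keep the example quick. vapply() checks the type and length of each extracted element, so every column keeps its own type without a later conversion. Whether this is faster than the matrix route would need to be timed on the data at hand.

```r
# Small stand-in for the list-of-records input used in this thread
len <- 1000
d <- replicate(len, list(pH = 3, marker = TRUE, position = "A"), FALSE)

# Hypothetical helper: extract each field into a typed vector, then
# assemble the data frame once at the end
toDF2 <- function(alist) {
  data.frame(
    pH       = vapply(alist, function(x) x[["pH"]], numeric(1)),
    marker   = vapply(alist, function(x) x[["marker"]], logical(1)),
    position = vapply(alist, function(x) x[["position"]], character(1)),
    stringsAsFactors = FALSE
  )
}

b2 <- toDF2(d)
str(b2)
```

With stringsAsFactors = FALSE, position stays a character column; drop that argument or wrap the column in factor() if a factor is wanted.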
I hope this helps,
Greg
sessionInfo() ##old, I know.
R version 2.9.0 (2009-04-17)
i386-apple-darwin8.11.1
locale:
en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices datasets utils methods base
other attached packages:
[1] cimis_0.1-3 RLastFM_0.1-4 RCurl_0.98-1 bitops_1.0-4.1 XML_2.5-3
[6] lattice_0.17-22
loaded via a namespace (and not attached):
[1] grid_2.9.0
On 1/4/10 11:43 PM, Dieter Menne wrote:
> I have very large data sets given in a format similar to d below. Converting
> these to a data frame is a bottleneck in my application. My fastest version
> is given below, but it looks clumsy to me.
>
> Any ideas?
>
> Dieter
>
> # -----------------------
> len = 100000
> d = replicate(len, list(pH = 3,marker = TRUE,position = "A"),FALSE)
> # Data are given as d
>
> # preallocate vectors
> pH = numeric(len)
> marker = logical(len)
> position = character(len)
>
> system.time(
> {
> for (i in 1:len)
> {
> d1 = d[[i]]
> #Assign to vectors
> pH[i] = d1[[1]]
> marker[i] = d1[[2]]
> position[i] = d1[[3]]
> }
> # combine vectors
> pHAll = data.frame(pH,marker,position)
> }
> )
>
>
>
--
Greg Hirson
ghirson at ucdavis.edu
Graduate Student
Agricultural and Environmental Chemistry
1106 Robert Mondavi Institute North
One Shields Avenue
Davis, CA 95616