[R] Quicker way of combining vectors into a data.frame
Sebastian Weber
sebastian.weber at physik.tu-darmstadt.de
Thu Nov 30 18:29:53 CET 2006
Hi!
I don't know for sure, and I have not tried it yet, but how about
allocating a matrix to hold everything, filling it one column per vector,
and finally assigning some dimnames to it:
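# one column per vector (10 in your case), one row per element
data <- matrix(0, nrow=length(vec1), ncol=10)
data[,1] <- vec1   # fill a column, not a row
...
# NULL leaves the row names unset; the second element names the columns
dimnames(data) <- list(NULL, c("lc.ratio", "Q", "fNupt", "rho.n", "rho.s",
                               "net.Nimm", "net.Nden", "CLminN", "CLmaxN", "CLmaxS"))
as.data.frame(data)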
I forgot to say: this of course assumes that all of your vectors are numeric,
since a matrix can only hold a single type.
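
For what it is worth, here is a minimal, self-contained sketch of the matrix idea. The ten vectors are hypothetical stand-ins filled with random numbers (only their names are taken from the original post), and `vecs` and `nms` are helper names introduced purely for illustration. It is untested against the real function, so treat it as a starting point rather than a verified fix.

```r
# Hypothetical stand-ins for the ten computed vectors: the names come from
# the original post, the values are made up for illustration.
n <- 4471
set.seed(1)
nms <- c("lc.ratio", "Q", "fNupt", "rho.n", "rho.s",
         "net.Nimm", "net.Nden", "CLminN", "CLmaxN", "CLmaxS")
vecs <- lapply(seq_along(nms), function(i) runif(n))

# Pre-allocate a numeric matrix with one column per vector.
data <- matrix(0, nrow = n, ncol = length(vecs))

# Fill it column by column (data[, i], not data[i, ]).
for (i in seq_along(vecs)) {
  data[, i] <- vecs[[i]]
}

# Name the columns only; NULL leaves the row names unset.
dimnames(data) <- list(NULL, nms)

# Convert once at the end.
fab <- as.data.frame(data)
str(fab)
```

Whether this is actually quicker than calling data.frame() directly will depend on the R version and the data, so it is worth wrapping both variants in system.time() before settling on one.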
Hope that helps!
Greetings,
Sebastian
On Thu, 2006-11-30 at 17:00 +0000, Gavin Simpson wrote:
> Hi,
>
> In a function, I compute 10 (un-named) vectors of reasonable length
> (4471 in the particular example I have to hand) that I want to combine
> into a data frame object, that the function will return.
>
> This is very slow, so *I'm* doing something wrong if I want it to be
> quick and efficient, though I'm not sure what the best way to do this
> would be.
>
> I know it is the combining into data frame bit that is slow, because
> I've Rprof'ed it:
>
> $by.self
>                         self.time self.pct total.time total.pct
> "names<-.default"           16.58     52.8      16.58      52.8
> "unlist"                     7.22     23.0       7.26      23.1
> "data.frame"                 1.72      5.5      29.38      93.6
> "duplicated.default"         1.66      5.3       1.66       5.3
> "+"                          1.20      3.8       1.20       3.8
> "list"                       0.40      1.3       0.40       1.3
> "as.data.frame.numeric"      0.28      0.9       3.32      10.6
> "apply"                      0.26      0.8       1.70       5.4
> "pmatch"                     0.22      0.7       0.22       0.7
> "paste"                      0.20      0.6       0.90       2.9
> "deparse"                    0.14      0.4       0.70       2.2
> "eval"                       0.12      0.4      31.28      99.7
> "names<-"                    0.12      0.4      16.70      53.2
> "FUN"                        0.12      0.4       1.32       4.2
> "names"                      0.12      0.4       0.14       0.4
> "as.list.default"            0.12      0.4       0.12       0.4
> "duplicated"                 0.10      0.3       1.76       5.6
> "gc"                         0.10      0.3       0.10       0.3
>
> I also stepped through it under debug(): all the calculations before it
> are quick, and then this bit takes a little over 20 seconds to complete:
>
> fab <- data.frame(lc.ratio = lc.ratio, Q = Q,
>                   fNupt = fNupt,
>                   rho.n = rho.n, rho.s = rho.s,
>                   net.Nimm = net.Nimm,
>                   net.Nden = net.Nden,
>                   CLminN = CLminN,
>                   CLmaxN = CLmaxN,
>                   CLmaxS = CLmaxS)
>
> I can get it down to c. 5 seconds if I do (not Rprof'ed):
>
> fab <- data.frame(lc.ratio, Q,
>                   fNupt,
>                   rho.n, rho.s,
>                   net.Nimm,
>                   net.Nden,
>                   CLminN,
>                   CLmaxN,
>                   CLmaxS)
>
> But this still seems quite a long time, so I'm thinking that there must
> be a quicker way of doing what I want (end up with a data.frame with the
> 10 vectors in it).
>
> Can anyone enlighten me?
>
> > version
>                _
> platform       i686-pc-linux-gnu
> arch           i686
> os             linux-gnu
> system         i686, linux-gnu
> status         Patched
> major          2
> minor          4.0
> year           2006
> month          10
> day            03
> svn rev        39576
> language       R
> version.string R version 2.4.0 Patched (2006-10-03 r39576)
>
> > sessionInfo()
> R version 2.4.0 Patched (2006-10-03 r39576)
> i686-pc-linux-gnu
>
> locale:
> LC_CTYPE=en_GB.UTF-8;LC_NUMERIC=C;LC_TIME=en_GB.UTF-8;LC_COLLATE=en_GB.UTF-8;LC_MONETARY=en_GB.UTF-8;LC_MESSAGES=en_GB.UTF-8;LC_PAPER=en_GB.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_GB.UTF-8;LC_IDENTIFICATION=C
>
> attached base packages:
> [1] "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"
> [7] "base"
>
> Thanks in advance,
>
> G
> --
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> Gavin Simpson [t] +44 (0)20 7679 0522
> ECRC & ENSIS, UCL Geography, [f] +44 (0)20 7679 0565
> Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
> Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/
> UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%