[R] Slow reshape from 5x600000 to 6311 x 132
Christian Schulz
ozric at web.de
Fri Mar 5 08:28:43 CET 2004
Hi,
my reshape from ~1.4 million obs. to ~150.00 obs. & 50 attr. runs
surprisingly fast (1-2 minutes), but it is less complex than yours. Perhaps it
is faster if you have no character strings as values, if that is
possible for your data?
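
A minimal sketch of that idea, assuming the "N/A" strings shown in the quoted str() output are the only non-numeric entries (the toy vector below is a hypothetical stand-in for the real vih.value column):

```r
# Toy stand-in for the vih.value column from the quoted str() output.
vih.value <- c("0", "1989", "0", "N/A", "3163979")

# Mark the "N/A" placeholders as missing first, so that as.numeric()
# does not have to coerce a non-numeric string.
vih.value[vih.value == "N/A"] <- NA

# Convert the remaining strings to numbers; the NA entries stay NA.
vih.num <- as.numeric(vih.value)

str(vih.num)
```

Whether this actually speeds up the reshape on the full data is untested here; it only removes the character column from the picture.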
Reshaping in the database is possible with
inner selects, but I prefer reshape because it takes
a really long time in the db.
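
For reference, a small sketch of the kind of long-to-wide reshape meant here, on made-up toy data that keeps only three of the columns from the quoted str() output. The second half fills a matrix by index instead, which is an alternative technique and not what either of us ran; all object names (long, wide, runs, ids, m) are hypothetical:

```r
# Toy long-format data: one row per (run, item) pair.
long <- data.frame(
  vi.id     = rep(1:3, times = 2),
  vih.run.n = rep(1:2, each = 3),
  vih.value = c(0, 1989, NA, 0, 1990, NA)
)

# Base R reshape(): one row per run, one vih.value.<vi.id> column per item.
wide <- reshape(long, idvar = "vih.run.n", timevar = "vi.id",
                v.names = "vih.value", direction = "wide")

# Alternative sketch: pre-allocate a matrix and fill it with a two-column
# (row, column) index, which avoids building the data frame piece by piece.
runs <- sort(unique(long$vih.run.n))
ids  <- sort(unique(long$vi.id))
m <- matrix(NA_real_, nrow = length(runs), ncol = length(ids),
            dimnames = list(runs, paste0("vih.value.", ids)))
m[cbind(match(long$vih.run.n, runs), match(long$vi.id, ids))] <- long$vih.value
```

The matrix version needs all values to be of one type, which is another reason to convert the character column first if the data allow it.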
christian
On Friday, 5 March 2004 05:31, Christopher Austin-Lane wrote:
> I have a dataset that's a few hundred thousand rows from a database
> (read in via dbReadTable). The database is like:
>
> > str(measures)
> `data.frame': 609363 obs. of 5 variables:
> $ vih.id : int 1 2 3 4 5 6 7 8 9 10 ...
> $ vi.id : int 1 2 3 4 5 6 7 8 9 10 ...
> $ vih.value: chr "0" "1989" "0" "N/A" ...
> $ vih.date : chr "20040226012314" "20040226012315" "20040226012315" "20040226012315" ...
> $ vih.run.n: int 1 1 1 1 1 1 1 1 1 1 ...
>
> I'm reshaping it to be like
>
> > str(better)
> `data.frame': 132 obs. of 6311 variables:
> $ vih.run.n : int 1 2 4 5 6 7 8 9 10 11 ...
> $ vih.value.1 : chr "0" "0" "0" "0" ...
> $ vih.value.2 : chr "1989" "1989" "1989" "1989" ...
> $ vih.value.3 : chr "0" "0" "0" "0" ...
> $ vih.value.4 : chr "N/A" "N/A" "N/A" "N/A" ...
> $ vih.value.5 : chr "3163979" "3163979" "3163979" "3163979" ...
> $ vih.value.6 : chr "5500073" "5500073" "5500073" "5500073" ...
>
> (etc., etc.)
>
> This takes about 4-8 hours to accomplish. Should I
>
> a) try to put it into the wide format row by row as I get the data from
> the DB instead of using dbReadTable,
>
> or
>
> b) try to tune something in R? (I'm trying it now with R
> --min-vsize=600M --min-nsize=6M although it's not seeming fast; I won't
> know if it's faster for a while).
>
> (Using home-compiled R 1.8.1 on Mac OS X 10.3.2, under emacs/ESS,
> although my R 1.8.1 on Solaris 2.8 has been churning for a few hours as
> well, on a split of the data that is 630 variables by 1000 obs.)
>
> --Chris
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html