[R] Thoughts for faster indexing
carl at witthoft.com
Thu Nov 21 14:23:35 CET 2013
What the Data Munger Guru said.
Plus: this is almost certainly a job for ddply (from the plyr package) or data.table.
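To make that concrete, here is a minimal sketch of the idea, not a drop-in fix. The slow line in the quoted post rescans the whole id column once per person, so the usual remedy is to group once and then work per group. The toy data, the column names (id, date, value) and the process_person() helper are assumptions made up for illustration; the real processing step would go in its place. The data.table part only runs if that package is installed.

```r
# Toy data standing in for the poster's data.frame. The column names
# `id`, `date`, and `value` are assumptions for illustration only.
set.seed(1)
n <- 20000
d <- data.frame(
  id    = sample(1:1000, n, replace = TRUE),
  date  = as.Date("2013-01-01") + sample(0:364, n, replace = TRUE),
  value = runif(n)
)

# Slow pattern from the original post: one full scan of d$id per person.
#   for (person in unique(d$id)) { j <- which(d$id == person); ... }

# Base R alternative: build the row indices for every id in a single pass.
rows_by_id <- split(seq_len(nrow(d)), d$id)

# Hypothetical per-person processing step (placeholder for the real work).
process_person <- function(rows) sum(rows$value)

# Look up each person's rows from the precomputed index list.
res_split <- vapply(rows_by_id,
                    function(j) process_person(d[j, ]),
                    numeric(1))

# data.table alternative: let `by` do the grouping instead of looping.
if (requireNamespace("data.table", quietly = TRUE)) {
  library(data.table)
  dt <- as.data.table(d)
  setkey(dt, id)  # sorts the table by id and marks id as the key
  res_dt <- dt[, .(total = sum(value)), by = id]
}
```

With the index list built once, each lookup is just rows_by_id[[as.character(person)]] rather than a fresh comparison against every row. If the per-person work can be written as an expression on columns, the data.table form avoids the explicit loop altogether.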
Noah Silverman-2 wrote
> I have a fairly large data.frame. (About 150,000 rows of 100
> variables.) There are case IDs, and multiple entries for each ID, with a
> date stamp. (i.e., records of people's activity.)
> I need to iterate over each person (record ID) in the data set, and then
> process their data for each date. The processing part is fast, the date
> part is fast. Locating the records is slow. I've even tried using
> data.table, with ID set as the index, and it is still slow.
> The line with the slow process (according to Rprof) is:
> j <- which( d$id == person )
> (I then process all the records indexed by j, which seems fast enough.)
> where d is my data.frame or data.table
> I thought that using the data.table indexing would speed things up, but
> not in this case.
> Any ideas on how to speed this up?
> Noah Silverman, M.S., C.Phil
> UCLA Department of Statistics
> 8117 Math Sciences Building
> Los Angeles, CA 90095
View this message in context: http://r.789695.n4.nabble.com/Thoughts-for-faster-indexing-tp4680854p4680889.html