[R] Tip for performance improvement while handling huge data?
suresh.ghalsasi at gmail.com
Sun Feb 8 21:09:47 CET 2009
OK, thank you.
Vectorization looks feasible here; I was not sure how to approach it that
way. I will try it.
Philipp Pagel-5 wrote:
>> For certain calculations, I have to handle a data frame with, say, 10
>> rows and multiple columns of different data types.
>> When I try to perform calculations on certain elements in each row, the
>> program just stays in "busy" mode for a really long time.
>> To avoid this "busy" mode, I split the data frame into subsets of 10000
>> rows, and the calculation then finished within a reasonable time.
>> Is there any other tip to improve the performance?
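A minimal sketch of the chunked approach described above. The actual data
and per-row calculation are not shown in the post, so the column names and
the placeholder computation here are hypothetical; the point is the
preallocated result vector and fixed-size chunks of 10000 rows:

```r
# Hypothetical data frame standing in for the real one.
df <- data.frame(a = runif(1e5), b = runif(1e5))

# Process in chunks of 10000 rows, writing into a preallocated
# result vector instead of growing an object inside the loop.
chunk_size <- 10000
result <- numeric(nrow(df))
starts <- seq(1, nrow(df), by = chunk_size)
for (s in starts) {
  e <- min(s + chunk_size - 1, nrow(df))
  result[s:e] <- df$a[s:e] + df$b[s:e]  # placeholder calculation
}
```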
> Depending on what exactly you are doing and what causes the slowdown,
> there may be a number of useful strategies:
> - Buy RAM (lots of it) - it's cheap
> - Vectorize whatever you are doing
> - Don't use all the data you have but draw a random sample of reasonable
>   size
> - ...
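To illustrate the vectorization suggestion above: a sketch contrasting an
element-by-element loop (with the common mistake of growing the result
vector) against the same computation written as one vectorized expression.
The function names and the toy computation are illustrative only:

```r
set.seed(1)
x <- runif(1e4)
y <- runif(1e4)

# Slow: element-by-element loop that grows the result vector,
# forcing a reallocation and copy on every iteration.
slow_product <- function(x, y) {
  z <- numeric(0)
  for (i in seq_along(x)) z <- c(z, x[i] * y[i])
  z
}

# Fast: the same computation as a single vectorized operation.
fast_product <- function(x, y) x * y

# Compare for yourself, e.g.:
#   system.time(slow_product(x, y))
#   system.time(fast_product(x, y))
```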
> To be more helpful, we'd need to know:
> - what are the computations involved?
> - how are they implemented at the moment?
> -> example code
> - what is the range of "really long time"?
> Dr. Philipp Pagel
> Lehrstuhl für Genomorientierte Bioinformatik
> Technische Universität München
> Wissenschaftszentrum Weihenstephan
> 85350 Freising, Germany
> R-help at r-project.org mailing list
> PLEASE do read the posting guide
> and provide commented, minimal, self-contained, reproducible code.
View this message in context: http://www.nabble.com/Tip-for-performance-improvement-while-handling-huge-data--tp21901287p21902758.html
Sent from the R help mailing list archive at Nabble.com.