[R] memory management
Federico Calboli
f.calboli at imperial.ac.uk
Mon Oct 30 17:34:48 CET 2006
Hi All,
just a quick (?) question while I wait for my code to run...
I'm comparing the rows of a data frame for identity, doing all possible
pairwise comparisons. In doing so I use identical(), but that's by the way. I'm
doing a (not so) quick and dirty check, and subsetting the data as
data[row.numb,]
and
data[a different row,]
I suspect the problem there is that I load the whole data frame data[,] into
memory every time, which makes the whole business quite slow and wasteful. As
I'm idly waiting, I thought: had I stored every row of data[,] as an item of a
list, and then done my pairwise comparisons on the list, would I have had
better performance?
(do I win the prize for the most convoluted sentence sent to the R-help?)
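
A minimal sketch of the two approaches described above, assuming a small toy
data frame. The names dat, same_by_subset, rows, pairs and same are
hypothetical and only serve the illustration; this is not the code that was
actually running. One detail worth noting: rows taken with data[i,] keep their
row names, so identical() on two such rows compares those as well, and the
sketch drops them first.

```r
# Toy data frame (hypothetical); rows 1 and 3 hold the same values.
dat <- data.frame(a = c(1, 2, 1, 3),
                  b = c("x", "y", "x", "z"),
                  stringsAsFactors = FALSE)

# Approach 1: subset the data frame for every comparison.
same_by_subset <- function(d, i, j) {
  ri <- d[i, , drop = FALSE]
  rj <- d[j, , drop = FALSE]
  # The two subsets carry different row names; reset them before comparing.
  rownames(ri) <- NULL
  rownames(rj) <- NULL
  identical(ri, rj)
}

# Approach 2: split the rows into a plain list once, then compare list items.
# as.list() on a one-row data frame drops the class and the row names.
rows <- lapply(seq_len(nrow(dat)), function(i) as.list(dat[i, ]))

# All pairwise comparisons; each column of 'pairs' is one pair of row indices.
pairs <- combn(nrow(dat), 2)
same  <- apply(pairs, 2, function(p) identical(rows[[p[1]]], rows[[p[2]]]))

# Index pairs of identical rows.
pairs[, same, drop = FALSE]
```

Wrapping each approach in system.time() on the real data would show whether
the one-off split pays for itself.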
For the pedants: yes, I know I could kill the process and try it myself, but the
spirit of the question is: is there a way of dealing with big data *efficiently*?
Best,
Fede
--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG
Tel +44 (0)20 7594 1602 Fax (+44) 020 7594 3193
f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com