[Rd] Memory allocation in read.table

Hadley Wickham h.wickham at gmail.com
Wed Aug 28 19:59:07 CEST 2013


>> Why do those lines need any allocations? I thought class<- and attr<-
>> were primitives, and hence would modify in place.
>>
>
> ... but only if there is no other reference to the data (i.e. NAMED < 2). If there are two references, they have to copy, because modifying in place would also change the object seen through the other reference.
> Here, however, it already has NAMED=2 because of
>
> data <- data[keep]

Ah, got it - thanks!
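
For the archives, here is a minimal sketch of the behaviour described above. It is not the read.table() code itself: x, y and the "note" attribute are made-up names, and it assumes an R build with memory profiling enabled so that tracemem() is available (the capabilities("profmem") check guards against builds without it). Front ends that inspect the workspace may hold extra references, so the first assignment may also report a copy there.

```r
# Illustrative only: x, y and the "note" attribute are hypothetical names.
x <- c(1, 2, 3)
if (capabilities("profmem")) tracemem(x)  # ask R to report duplications of x

# Only one reference to the vector exists, so attr<- can usually modify it
# in place; no tracemem message is expected here.
attr(x, "note") <- "first"

# Create a second reference to the same vector.
y <- x

# The vector is now shared, so attr<- must duplicate it before modifying,
# otherwise y would change too. tracemem should report a copy here.
attr(x, "note") <- "second"

# y still carries the attribute value from before the second assignment.
attr(y, "note")

if (capabilities("profmem")) untracemem(x)
```

The point is only that the copy is triggered by the second reference, not by attr<- as such; the same reasoning applies to class<-.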

> PS: if you are loading any sizable data, the one thing you don't want to do is to use read.table() ;)

Yes ;)  Romain and I (mostly Romain) are working on some faster
alternatives at https://github.com/romainfrancois/fastread.

One surprising finding so far (to me at least) is that when loading a
file full of doubles, you pretty quickly get to the point where strtod
is the bottleneck.

Hadley

-- 
Chief Scientist, RStudio
http://had.co.nz/


