[Rd] names<- appears to copy 3 times?

Simon Urbanek simon.urbanek at r-project.org
Tue Jan 17 23:30:37 CET 2012


On Jan 17, 2012, at 4:50 PM, Thomas Lumley wrote:

> On Tue, Jan 17, 2012 at 9:11 PM, Matthew Dowle <mdowle at mdowle.plus.com> wrote:
>> Hi,
>> 
>> $ R --vanilla
>> R version 2.14.1 (2011-12-22)
>> Platform: i686-pc-linux-gnu (32-bit)
>>> DF = data.frame(a=1:3,b=4:6)
>>> DF
>>  a b
>> 1 1 4
>> 2 2 5
>> 3 3 6
>>> tracemem(DF)
>> [1] "<0x8898098>"
>>> names(DF)[2]="B"
>> tracemem[0x8898098 -> 0x8763e18]:
>> tracemem[0x8763e18 -> 0x8766be8]:
>> tracemem[0x8766be8 -> 0x8766b68]:
>>> DF
>>  a B
>> 1 1 4
>> 2 2 5
>> 3 3 6
>>> 
>> 
>> Are those 3 copies really taking place?
>> 
> 
> tracemem() isn't likely to give false positives.  Since you're on
> Linux, you could check by running under gdb and setting a breakpoint
> on memtrace_report, which is the function that prints the message.
> That would show where the duplicates are happening.
> 

My gut feeling is that the extra copies come from the additional recursion caused by the subset assignment, which requires DF to be passed one level deeper (I haven't actually checked, so this may be wrong). As expected, you get fewer copies if you set the names directly:

> DF = data.frame(a=1:3,b=4:6)
> tracemem(DF)
[1] "<0x100c82628>"
> n = names(DF)
> n[2]="B"
> names(DF) = n
tracemem[0x100c82628 -> 0x100c82778]: 
tracemem[0x100c82778 -> 0x100c712b0]: 

and, as we discussed here earlier, calling the replacement primitive `names<-` directly makes just one copy:

> DF = data.frame(a=1:3,b=4:6)
> tracemem(DF)
[1] "<0x1029a3c68>"
> n = names(DF)
> n[2]="B"
> DF = `names<-`(DF, n)
tracemem[0x1029a3c68 -> 0x1029a3b18]: 
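
For reference, here is a minimal sketch of the three forms side by side. The middle form spells out roughly how R evaluates a nested replacement such as names(DF)[2] = "B", following the expansion given in the R Language Definition for subset assignment. The variable names DF1, DF2 and DF3 are only for illustration. The sketch checks that the three forms give the same result; it does not count copies, since the number that tracemem() reports may differ between R versions and builds.

```r
# Three ways to rename the second column of the same data frame.
DF <- data.frame(a = 1:3, b = 4:6)

# 1. Nested replacement, as in the original question.
DF1 <- DF
names(DF1)[2] <- "B"

# 2. Roughly what R evaluates for form 1: the object is bound to `*tmp*`,
#    the inner `[<-` builds the new names vector, and the outer `names<-`
#    attaches it before the result is assigned back.
DF2 <- DF
`*tmp*` <- DF2
DF2 <- `names<-`(`*tmp*`, value = `[<-`(names(`*tmp*`), 2, value = "B"))
rm(`*tmp*`)

# 3. Build the names vector first, then call the replacement function directly.
n <- names(DF)
n[2] <- "B"
DF3 <- `names<-`(DF, n)

# All three forms should produce the same data frame.
print(identical(DF1, DF2) && identical(DF1, DF3))
```

Form 2 shows why the nested version has more places where a duplicate can be triggered: DF is referenced through `*tmp*` while both replacement functions run.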

Cheers,
Simon


