[Rd] suggestion how to use memcpy in duplicate.c
Simon Urbanek
simon.urbanek at r-project.org
Thu Apr 22 13:42:54 CEST 2010
On Apr 22, 2010, at 7:12 AM, Matthew Dowle wrote:
>
> Is this a thumbs up for memcpy for DUPLICATE_ATOMIC_VECTOR at least ?
>
> If there is further specific testing needed then let me know - happy to
> help - but you seem to have beaten me to it.
>
I was not volunteering to do anything - I was just looking at whether it makes sense to bother at all and pointing out the bugs in your code ;). I have a sufficiently long list of TODOs already :P
Cheers,
Simon
>
> "Simon Urbanek" <simon.urbanek at r-project.org> wrote in message
> news:65D21B93-A737-4A94-BDF4-AD7E90518AC0 at r-project.org...
>>
>> On Apr 21, 2010, at 2:15 PM, Seth Falcon wrote:
>>
>>> On 4/21/10 10:45 AM, Simon Urbanek wrote:
>>>> Won't that miss the last incomplete chunk? (and please don't use
>>>> DATAPTR on INTSXP even though the effect is currently the same)
>>>>
>>>> In general it seems that whether this is efficient or not depends
>>>> on nt, since calls to memcpy are expensive for short copies (very
>>>> small nt, that is).
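>>>>
>>>> For illustration, a minimal sketch of a chunked copy that does not
>>>> drop the trailing partial chunk (the chunk size and function name
>>>> here are made up, not taken from duplicate.c; for an INTSXP the
>>>> pointers would come from INTEGER() rather than DATAPTR()):
>>>>
>>>>   #include <string.h>
>>>>
>>>>   #define CHUNK 1024
>>>>
>>>>   static void copy_ints(int *dst, const int *src, size_t nt)
>>>>   {
>>>>       size_t i = 0;
>>>>       /* copy all full chunks */
>>>>       for (; i + CHUNK <= nt; i += CHUNK)
>>>>           memcpy(dst + i, src + i, CHUNK * sizeof(int));
>>>>       /* the trailing incomplete chunk is the part that is
>>>>          easy to miss */
>>>>       if (i < nt)
>>>>           memcpy(dst + i, src + i, (nt - i) * sizeof(int));
>>>>   }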
>>>>
>>>> I ran some empirical tests to compare memcpy vs for() (x86_64, OS X)
>>>> and the results were encouraging - depending on the size of the
>>>> copied block the difference could be quite big:
>>>> - tiny blocks (ca. n = 32 or less): for() is faster
>>>> - small blocks (n ~ 1k): memcpy is ca. 8x faster
>>>> - as the size increases the gap closes (presumably due to RAM
>>>>   bandwidth limitations), so for n = 512M memcpy is only ~30% faster
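>>>>
>>>> A rough sketch of this kind of micro-benchmark (the sizes, rep
>>>> counts and use of clock() are assumptions, not the actual test
>>>> harness; note that with optimization enabled the compiler may
>>>> vectorize or even elide the plain loop, so treat results with care):
>>>>
>>>>   #include <stdio.h>
>>>>   #include <stdlib.h>
>>>>   #include <string.h>
>>>>   #include <time.h>
>>>>
>>>>   int main(void)
>>>>   {
>>>>       size_t sizes[] = { 8, 32, 1024, 1u << 20 };
>>>>       for (int s = 0; s < 4; s++) {
>>>>           size_t n = sizes[s];
>>>>           int *src = malloc(n * sizeof(int));
>>>>           int *dst = malloc(n * sizeof(int));
>>>>           long reps = (long)((1L << 26) / n) + 1; /* ~constant work */
>>>>           memset(src, 1, n * sizeof(int));
>>>>
>>>>           /* element-wise for() copy */
>>>>           clock_t t0 = clock();
>>>>           for (long r = 0; r < reps; r++)
>>>>               for (size_t i = 0; i < n; i++)
>>>>                   dst[i] = src[i];
>>>>           double t_for = (double)(clock() - t0) / CLOCKS_PER_SEC;
>>>>
>>>>           /* memcpy of the whole block */
>>>>           t0 = clock();
>>>>           for (long r = 0; r < reps; r++)
>>>>               memcpy(dst, src, n * sizeof(int));
>>>>           double t_cpy = (double)(clock() - t0) / CLOCKS_PER_SEC;
>>>>
>>>>           printf("n = %8lu: for() %.3fs  memcpy %.3fs\n",
>>>>                  (unsigned long)n, t_for, t_cpy);
>>>>           free(src);
>>>>           free(dst);
>>>>       }
>>>>       return 0;
>>>>   }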
>>>>
>>>
>>>> Of course this is contingent on the implementation of memcpy,
>>>> compiler, architecture etc., and it will only matter if copying is
>>>> what you do most of the time ...
>>>
>>> Copying of vectors is something that I would expect to happen fairly
>>> often in many applications of R.
>>>
>>> Is for() faster on small blocks by enough that one would want to branch
>>> based on size?
>>>
>>
>> Good question. Given that the branching itself adds overhead, possibly
>> not. In the best case for() can be ~40% faster (for single-digit n), but
>> it would take billions of copies for that to make a difference (since the
>> operation itself is so fast). The break-even point on my test machine is
>> n = 32, and when I added the branching the copy took a 20% hit, so I
>> guess it's simply not worth it. The only case that may be worth branching
>> on is n = 1, since that is likely a fairly common use. (The branching
>> penalty in the actual copy routines is lower than in my memcpy-vs-for()
>> comparison, since there the branch can be done before the outer for loop,
>> so this may vary case by case.)
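>>
>> For concreteness, a hedged sketch of what such size-based dispatch
>> could look like (the threshold of 32 is the break-even point from the
>> tests above; the function is illustrative, not from R's sources - and
>> per the above, in practice only the n == 1 branch may pay for itself):
>>
>>   #include <string.h>
>>
>>   static void copy_dispatch(int *dst, const int *src, size_t n)
>>   {
>>       if (n == 1) {
>>           /* scalar copy: likely the most common case */
>>           *dst = *src;
>>       } else if (n < 32) {
>>           /* tiny blocks: the plain loop wins */
>>           for (size_t i = 0; i < n; i++)
>>               dst[i] = src[i];
>>       } else {
>>           /* larger blocks: memcpy wins */
>>           memcpy(dst, src, n * sizeof(int));
>>       }
>>   }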
>>
>> Cheers,
>> Simon
>>
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>