[Rd] suggestion how to use memcpy in duplicate.c

Wed Apr 21 22:13:24 CEST 2010

Le 21/04/10 21:39, Simon Urbanek a écrit :
>
>
> On Apr 21, 2010, at 3:32 PM, Romain Francois wrote:
>
>> Le 21/04/10 17:54, Matthew Dowle a écrit :
>>>
>>>>  From copyVector in duplicate.c :
>>>
>>> void copyVector(SEXP s, SEXP t)
>>> {
>>>      int i, ns, nt;
>>>      nt = LENGTH(t);
>>>      ns = LENGTH(s);
>>>      switch (TYPEOF(s)) {
>>> ...
>>>      case INTSXP:
>>>      for (i = 0; i<   ns; i++)
>>>          INTEGER(s)[i] = INTEGER(t)[i % nt];
>>>      break;
>>> ...
>>>
>>> could that be replaced with :
>>>
>>>      case INTSXP:
>>>      for (i=0; i<ns/nt; i++)
>>>          memcpy((char *)DATAPTR(s)+i*nt*sizeof(int), (char *)DATAPTR(t),
>>> nt*sizeof(int));
>>>      break;
>>
>> or at least with something like this:
>>
>> int* p_s = INTEGER(s) ;
>> int* p_t = INTEGER(t) ;
>> for( i=0 ; i<  ns ; i++){
>> 	p_s[i] = p_t[i % nt];
>> }
>>
>> since expanding the INTEGER macro over and over has a price.
>>
>
> ... in packages, yes, but not in core R.

Hmm. I was not talking about the overhead of the INTEGER function:

int *(INTEGER)(SEXP x) {
     if(TYPEOF(x) != INTSXP && TYPEOF(x) != LGLSXP)
	error("%s() can only be applied to a '%s', not a '%s'",
	      "INTEGER", "integer", type2char(TYPEOF(x)));
     return INTEGER(x);
}

but the one related to the macro.

#define INTEGER(x)	((int *) DATAPTR(x))
#define DATAPTR(x)	(((SEXPREC_ALIGN *) (x)) + 1)

so the loop expands to :

for (i = 0; i<   ns; i++)
           ((int *) (((SEXPREC_ALIGN *) (s)) + 1))[i] = ((int *) 
(((SEXPREC_ALIGN *) (t)) + 1))[i % nt];

I still believe grabbing the pointer just once for s and once for t is 
more efficient ...

> Cheers,
> Simon
>
>
>>> and similar for the other types in copyVector.  This won't help regular
>>> vector copies, since those seem to be done by the DUPLICATE_ATOMIC_VECTOR
>>> macro, see next suggestion below, but it should help copyMatrix which calls
>>> copyVector, scan.c which calls copyVector on three lines, dcf.c (once) and
>>> dounzip.c (once).
>>>
>>> For the DUPLICATE_ATOMIC_VECTOR macro there is already a comment next to it
>>> :
>>>
>>>      <FIXME>: surely memcpy would be faster here?
>>>
>>> which seems to refer to the for loop  :
>>>
>>>      else { \
>>>      int __i__; \
>>>      type *__fp__ = fun(from), *__tp__ = fun(to); \
>>>      for (__i__ = 0; __i__<   __n__; __i__++) \
>>>        __tp__[__i__] = __fp__[__i__]; \
>>>    } \
>>>
>>> Could that loop be replaced by the following ?
>>>
>>>     else { \
>>>     memcpy((char *)DATAPTR(to), (char *)DATAPTR(from), __n__*sizeof(type)); \
>>>     }\
>>>
>>> In the data.table package, dogroups.c uses this technique, so the principle
>>> is tested and works well so far.
>>>
>>> Are there any road blocks preventing this change, or is anyone already
>>> working on it ?  If not then I'll try and test it (on Ubuntu 32bit) and
>>> submit patch with timings, as before.  Comments/pointers much appreciated.
>>>
>>> Matthew

-- 
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/9aKDM9 : embed images in Rd documents
|- http://tr.im/OIXN : raster images and RImageJ
|- http://tr.im/OcQe : Rcpp 0.7.7