[Rd] suggestion how to use memcpy in duplicate.c
Romain Francois
romain at r-enthusiasts.com
Wed Apr 21 22:13:24 CEST 2010
Le 21/04/10 21:39, Simon Urbanek a écrit :
>
>
> On Apr 21, 2010, at 3:32 PM, Romain Francois wrote:
>
>> Le 21/04/10 17:54, Matthew Dowle a écrit :
>>>
>>>> From copyVector in duplicate.c :
>>>
>>> void copyVector(SEXP s, SEXP t)
>>> {
>>> int i, ns, nt;
>>> nt = LENGTH(t);
>>> ns = LENGTH(s);
>>> switch (TYPEOF(s)) {
>>> ...
>>> case INTSXP:
>>> for (i = 0; i< ns; i++)
>>> INTEGER(s)[i] = INTEGER(t)[i % nt];
>>> break;
>>> ...
>>>
>>> could that be replaced with :
>>>
>>> case INTSXP:
>>> for (i=0; i<ns/nt; i++)
>>> memcpy((char *)DATAPTR(s)+i*nt*sizeof(int), (char *)DATAPTR(t),
>>> nt*sizeof(int));
>>> break;
>>
>> or at least with something like this:
>>
>> int* p_s = INTEGER(s) ;
>> int* p_t = INTEGER(t) ;
>> for( i=0 ; i< ns ; i++){
>> p_s[i] = p_t[i % nt];
>> }
>>
>> since expanding the INTEGER macro over and over has a price.
>>
>
> ... in packages, yes, but not in core R.
Hmm. I was not talking about the overhead of the INTEGER function:
int *(INTEGER)(SEXP x) {
if(TYPEOF(x) != INTSXP && TYPEOF(x) != LGLSXP)
error("%s() can only be applied to a '%s', not a '%s'",
"INTEGER", "integer", type2char(TYPEOF(x)));
return INTEGER(x);
}
but the one related to the macro.
#define INTEGER(x) ((int *) DATAPTR(x))
#define DATAPTR(x) (((SEXPREC_ALIGN *) (x)) + 1)
so the loop expands to :
for (i = 0; i< ns; i++)
((int *) (((SEXPREC_ALIGN *) (s)) + 1))[i] = ((int *)
(((SEXPREC_ALIGN *) (t)) + 1))[i % nt];
I still believe grabbing the pointer just once for s and once for t is
more efficient ...
> Cheers,
> Simon
>
>
>>> and similar for the other types in copyVector. This won't help regular
>>> vector copies, since those seem to be done by the DUPLICATE_ATOMIC_VECTOR
>>> macro, see next suggestion below, but it should help copyMatrix which calls
>>> copyVector, scan.c which calls copyVector on three lines, dcf.c (once) and
>>> dounzip.c (once).
>>>
>>> For the DUPLICATE_ATOMIC_VECTOR macro there is already a comment next to it
>>> :
>>>
>>> <FIXME>: surely memcpy would be faster here?
>>>
>>> which seems to refer to the for loop :
>>>
>>> else { \
>>> int __i__; \
>>> type *__fp__ = fun(from), *__tp__ = fun(to); \
>>> for (__i__ = 0; __i__< __n__; __i__++) \
>>> __tp__[__i__] = __fp__[__i__]; \
>>> } \
>>>
>>> Could that loop be replaced by the following ?
>>>
>>> else { \
>>> memcpy((char *)DATAPTR(to), (char *)DATAPTR(from), __n__*sizeof(type)); \
>>> }\
>>>
>>> In the data.table package, dogroups.c uses this technique, so the principle
>>> is tested and works well so far.
>>>
>>> Are there any road blocks preventing this change, or is anyone already
>>> working on it ? If not then I'll try and test it (on Ubuntu 32bit) and
>>> submit patch with timings, as before. Comments/pointers much appreciated.
>>>
>>> Matthew
--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/9aKDM9 : embed images in Rd documents
|- http://tr.im/OIXN : raster images and RImageJ
|- http://tr.im/OcQe : Rcpp 0.7.7
More information about the R-devel
mailing list