[Rd] serialize() takes too long when serializing to a raw vector

Peter Dalgaard P.Dalgaard at biostat.ku.dk
Thu Jan 25 15:19:18 CET 2007


Ashish Kulkarni wrote:
> Hin-Tak Leung wrote:
>   
>> It might be interesting to get some details on your hardware.
>>
>>     
>
> It's a P4 2.66GHz with a standard Intel motherboard having 1GB RAM.
>
>   
>> On my box, linux native seems to be a little slower than
>> your quick.serialize times:
>>
>>  > system.time( serialize(matrix(0, 1000, 1000), NULL) )
>> [1] 0.372 0.288 0.692 0.000 0.000
>>  > system.time( serialize(matrix(0, 2000, 2000), NULL) )
>> [1] 1.237 1.195 2.501 0.000 0.000
>>
>> Running R 2.4.1 for Windows under Wine (same box) is a good deal
>> slower, but is nowhere near as slow as yours.
>>
>>  > system.time( serialize(matrix(0, 1000, 1000), NULL) )
>> [1] 0.00 0.00 6.08   NA   NA
>>  > system.time( serialize(matrix(0, 2000, 2000), NULL) )
>> [1]  0.01  0.01 78.00    NA    NA
>>  >
>>
>>     
>
> Well, the timings certainly differ quite a bit. What kind of
> hardware do you have? I suspect that most of the delay may be
> caused by memory reallocation since the size of the output 
> raw array is not known up front. I would imagine that WINE 
> would use the system memory allocator, not the one used by 
> the windows kernel.
>
>   
>> Since you mentioned that you are using Rmpi, there is a possibility
>> that you might be calling a different serialize() than base::serialize
>> altogether?
>>     
>
> Nope, I hit this problem while using Rmpi and tracked it down
> to .mpi.serialize, which in turn calls base::serialize. From Rmpi:
>
> .mpi.serialize <- function (obj) 
> {
>     trans_obj = serialize(obj, NULL)
>     if (getRversion() >= "2.4.0") 
>         return(trans_obj)
>     else return(charToRaw(trans_obj))
> }
>
>   
Just a little sanity check on Linux. This seems to indicate an
essentially constant run time per million matrix elements. So I suspect
that something Windows-specific is going on.

> x <- sapply(1:8, function(i) {print(i); system.time(
+     serialize(matrix(0, i*1000, i*1000), NULL) )})
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
> matplot(t(x)/(1:8)^2)
> t(x)
       [,1]   [,2]   [,3] [,4] [,5]
[1,]  0.292  0.660  0.952    0    0
[2,]  1.368  2.492  3.861    0    0
[3,]  2.853  5.861  8.713    0    0
[4,]  5.016 10.540 15.555    0    0
[5,]  8.136 16.425 24.875    0    0
[6,] 11.481 23.926 35.427    0    0
[7,] 15.729 32.810 48.544    0    0
[8,] 20.998 43.218 64.839    0    0
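
To make the "constant run time per million elements" reading concrete, here is a small sketch that divides the timings printed above by the number of matrix elements in millions. The column names are an assumption based on the usual order of system.time() output (user, system, elapsed), and the variable names are illustrative only.

```r
# Timings copied from the t(x) output above; row i is the i*1000 x i*1000 matrix.
# Column names are assumed from system.time()'s usual order: user, system, elapsed.
timings <- cbind(
    user    = c(0.292, 1.368, 2.853, 5.016, 8.136, 11.481, 15.729, 20.998),
    system  = c(0.660, 2.492, 5.861, 10.540, 16.425, 23.926, 32.810, 43.218),
    elapsed = c(0.952, 3.861, 8.713, 15.555, 24.875, 35.427, 48.544, 64.839)
)

# An i*1000 x i*1000 matrix holds i^2 million elements.
millions <- (1:8)^2

# Recycling runs down the columns, so row i is divided by millions[i].
per_million <- timings / millions
print(round(per_million, 3))
```

With these numbers the elapsed time per million elements stays between roughly 0.95 and 1.01 seconds across all eight sizes, which is what the matplot() call shows graphically.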


-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907


