[R] 'object.size' takes a long time to return a value

james.holtman at convergys.com
Sun Dec 12 23:03:31 CET 2004





I was using 'object.size' to see how much memory a list was taking up.
When I ran the command, I thought my computer had locked up.  Further
testing showed that 'object.size' was taking 241 seconds to return a
value.

I did notice in the release notes that 'object.size' takes longer when
the list contains character vectors.  Is a running time this long to be
expected for such a list?

Much better results were obtained when the character vectors were converted
to factors.
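
For readers who want to try the conversion themselves, here is a minimal sketch. The data is synthetic and the helper name 'to_factors' is made up for illustration; neither comes from the original session. On a recent R version the slowdown described here may not reproduce, so treat the timings as something to check rather than a given.

```r
# Synthetic stand-in for the data: a long character vector with few
# distinct values, plus one numeric vector (both are assumptions).
set.seed(1)
n <- 20000
x <- list(
  sample(c("sadc", "sar", "date", "ksh"), n, replace = TRUE),
  runif(n)
)

# Hypothetical helper: convert every character element to a factor and
# leave all other elements untouched.
to_factors <- function(lst) {
  lapply(lst, function(v) if (is.character(v)) factor(v) else v)
}

x.f <- to_factors(x)

# Compare the reported size and the elapsed time before and after.
size_chr <- object.size(x)
size_fac <- object.size(x.f)
time_chr <- system.time(object.size(x))[["elapsed"]]
time_fac <- system.time(object.size(x.f))[["elapsed"]]
print(size_chr)
print(size_fac)
print(c(character = time_chr, factor = time_fac))
```

The conversion is lossless in the sense that 'as.character' on the factor gives back the original strings.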


######  Results from the testing  ###################
> str(x.1)
List of 10
 $ : chr [1:227299] "sadc" "sar" "date" "ksh" ...
 $ : chr [1:227299] "aprperf" "aprperf" "aprperf" "aprperf" ...
 $ : num [1:227299] 23 23 0 23 23 0 0 0 0 23 ...
 $ : num [1:227299] 0 0 0 0 0 0 0 0 0 0 ...
 $ : num [1:227299] 3600 3600 0.01 3600 3600 0.01 0.01 0.01 0.01 3600 ...
 $ : num [1:227299] 0.01 0 0.01 0 0.01 0 0.01 0 0 0.01 ...
 $ : num [1:227299] 0 0 0 0 0 0 0 0 0 0 ...
 $ : num [1:227299] 0.01 0 0.01 0 0.01 0 0.01 0 0 0.01 ...
 $ : num [1:227299] 62608 67968    29 10208 13128 ...
 $ : num [1:227299] 0 1 0 0 1 0 0 0 0 0 ...

# takes a long time (241 seconds) to report the size
> gc();system.time(print(object.size(x.1)))
          used (Mb) gc trigger  (Mb)
Ncells  711007 19.0    2235810  59.8
Vcells 5191294 39.7   14409257 110.0
[1] 34154972
[1] 241.07   0.00 241.08     NA     NA
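
A note on reading the timing lines: the five numbers printed by 'system.time' are user CPU, system CPU, elapsed time, and the two child-process times (shown as NA in this session). The sketch below assumes a current R version, where the result carries names, so the elapsed time can be picked out by name rather than by position. The loop being timed is just a placeholder.

```r
# Time a cheap placeholder expression to show the shape of the result.
st <- system.time(for (i in 1:1e5) sqrt(i))

# The five components, in the same order as in the transcript.
print(names(st))

# Extract the wall-clock time by name instead of relying on position 3.
elapsed <- st[["elapsed"]]
print(elapsed)
```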

# trying list of 1000
> x.2 <- list.subset(x.1, 1:1000);gc();system.time(print(object.size(x.2)))
          used (Mb) gc trigger  (Mb)
Ncells  711006 19.0    2235810  59.8
Vcells 4300288 32.9   14409257 110.0
[1] 145860
[1] 0.01 0.00 0.01   NA   NA
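
'list.subset' is not a base R function and its definition is not shown in this post. Judging from the 'str' output further down, it appears to take the same index range from every element of the list. A minimal sketch under that assumption (the body below is a guess, not the original code):

```r
# Hypothetical reconstruction: apply the same index to each list element.
list.subset <- function(lst, idx) {
  lapply(lst, function(v) v[idx])
}

# Small demonstration on made-up data.
demo <- list(c("a", "b", "c", "d"), c(1, 2, 3, 4))
small <- list.subset(demo, 1:2)
str(small)
```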

# trying list of 10,000
> x.2 <- list.subset(x.1, 1:10000);gc();system.time(print(object.size(x.2)))
          used (Mb) gc trigger  (Mb)
Ncells  711006 19.0    2235810  59.8
Vcells 4381288 33.5   14409257 110.0
[1] 1491948
[1] 0.28 0.00 0.28   NA   NA

# list of 40,000
> x.2 <- list.subset(x.1, 1:40000);gc();system.time(print(object.size(x.2)))
          used (Mb) gc trigger  (Mb)
Ncells  711006 19.0    2235810  59.8
Vcells 4651288 35.5   14409257 110.0
[1] 5988460
[1] 7.15 0.00 7.15   NA   NA

# list of 60,000
> x.2 <- list.subset(x.1, 1:60000);gc();system.time(print(object.size(x.2)))
          used (Mb) gc trigger  (Mb)
Ncells  711006 19.0    2235810  59.8
Vcells 4831288 36.9   14409257 110.0
[1] 9001556
[1] 17.33  0.00 17.32    NA    NA

# list of 100,000
> x.2 <- list.subset(x.1, 1:100000);gc();system.time(print(object.size(x.2)))
          used (Mb) gc trigger  (Mb)
Ncells  711006 19.0    2235810  59.8
Vcells 5191288 39.7   14409257 110.0
[1] 15044780
[1] 51.85  0.00 51.86    NA    NA

# list structure of the last object
> str(x.2)
List of 10
 $ : chr [1:100000] "sadc" "sar" "date" "ksh" ...
 $ : chr [1:100000] "aprperf" "aprperf" "aprperf" "aprperf" ...
 $ : num [1:100000] 23 23 0 23 23 0 0 0 0 23 ...
 $ : num [1:100000] 0 0 0 0 0 0 0 0 0 0 ...
 $ : num [1:100000] 3600 3600 0.01 3600 3600 0.01 0.01 0.01 0.01 3600 ...
 $ : num [1:100000] 0.01 0 0.01 0 0.01 0 0.01 0 0 0.01 ...
 $ : num [1:100000] 0 0 0 0 0 0 0 0 0 0 ...
 $ : num [1:100000] 0.01 0 0.01 0 0.01 0 0.01 0 0 0.01 ...
 $ : num [1:100000] 62608 67968    29 10208 13128 ...
 $ : num [1:100000] 0 1 0 0 1 0 0 0 0 0 ...
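
The timings above grow much faster than the number of elements. One rough way to quantify that is a log-log fit of elapsed time against vector length, using only the numbers reported in this transcript. The 1,000-element run is left out because 0.01 seconds is at the resolution of the timer. This is a back-of-the-envelope sketch, not an analysis of the 'object.size' internals.

```r
# Vector lengths and elapsed seconds as reported in the transcript.
n       <- c(10000, 40000, 60000, 100000, 227299)
elapsed <- c(0.28, 7.15, 17.32, 51.86, 241.08)

# Fit log(elapsed) = a + k * log(n); k estimates the growth exponent.
fit <- lm(log(elapsed) ~ log(n))
k <- unname(coef(fit)[2])
print(round(k, 2))
```

An exponent near 1 would mean linear growth. The fitted value here comes out nearer 2, which suggests roughly quadratic behaviour in the length of the character vectors.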

# with the first two items on the list converted to factors,
#     'object.size' performs a lot better
> str(x.1)
List of 10
 $ : Factor w/ 175 levels "#bpbkar","#bpcd",..: 132 133 60 93 13 160 60 84 60 132 ...
 $ : Factor w/ 8 levels "apra3g","aprperf",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ : num [1:227299] 23 23 0 23 23 0 0 0 0 23 ...
 $ : num [1:227299] 0 0 0 0 0 0 0 0 0 0 ...
 $ : num [1:227299] 3600 3600 0.01 3600 3600 0.01 0.01 0.01 0.01 3600 ...
 $ : num [1:227299] 0.01 0 0.01 0 0.01 0 0.01 0 0 0.01 ...
 $ : num [1:227299] 0 0 0 0 0 0 0 0 0 0 ...
 $ : num [1:227299] 0.01 0 0.01 0 0.01 0 0.01 0 0 0.01 ...
 $ : num [1:227299] 62608 67968    29 10208 13128 ...
 $ : num [1:227299] 0 1 0 0 1 0 0 0 0 0 ...
> system.time(print(object.size(x.1)))  # now it is fast
[1] 16374176
[1]  0  0  0 NA NA

> version
         _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    2
minor    0.1
year     2004
month    11
day      15
language R
>
__________________________________________________________
James Holtman        "What is the problem you are trying to solve?"
Executive Technical Consultant  --  Office of Technology, Convergys
james.holtman at convergys.com
+1 (513) 723-2929



