[Rd] Consistency of serialize(): please enlighten me

Henrik Bengtsson hb at stat.berkeley.edu
Fri Aug 31 21:45:34 CEST 2007


Hi,

I am puzzled by serialize().  The issue comes down to generating
identical hash codes for (apparently) identical objects using
digest::digest(), which in turn relies on serialize().  Here is an
example illustrating the issue:

ser <- function(object, ...) {
  list(
    names = names(object),
    namesRaw = charToRaw(names(object)),
    ser = serialize(names(object), connection=NULL, ascii=FALSE)
  )
} # ser()

# Object to be serialized
key <- key0 <- list(abc="Hello");

# Store results
d <- list();

# 1. As is
d[[1]] <- ser(key);

# 2. Set names and redo (hardwired: identical to what's already there)
names(key) <- "abc";
d[[2]] <- ser(key);

# 3. Set names and redo (generic: char->raw->char)
key <- key0;
names(key) <- sapply(names(key), FUN=function(name) rawToChar(charToRaw(name)));
d[[3]] <- ser(key);

# All names are identical
for (kk in 2:length(d))
  stopifnot(identical(d[[1]]$names, d[[kk]]$names));

# All raw names are identical
for (kk in 2:length(d))
  stopifnot(identical(d[[1]]$namesRaw, d[[kk]]$namesRaw));

# But, the serialized names differ.
print(identical(d[[1]]$ser, d[[2]]$ser));
print(identical(d[[1]]$ser, d[[3]]$ser));
print(identical(d[[2]]$ser, d[[3]]$ser));

So, it seems like there is some extra information in the names
attribute that becomes part of the serialization.  Is it possible to
show at the R level that the names vectors differ?  What is that extra
information?  Promises...?
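
One way to narrow this down might be to compare two serialized raw
vectors byte by byte and report where they differ.  The following is
only a minimal sketch: diffBytes() is a hypothetical helper (not part of
R or digest), and the two strings are stand-ins chosen so that the
serializations are guaranteed to differ.

```r
# Hypothetical helper: positions at which two raw vectors differ.
diffBytes <- function(a, b) {
  n <- min(length(a), length(b))
  # Compare the common prefix element by element
  idx <- which(a[seq_len(n)] != b[seq_len(n)])
  # If the lengths differ, every trailing position counts as a difference
  if (length(a) != length(b))
    idx <- c(idx, seq(n + 1, max(length(a), length(b))))
  idx
}

# Stand-in inputs (assumption: any two objects to be compared would do)
x <- serialize("abc", connection=NULL, ascii=FALSE)
y <- serialize("abd", connection=NULL, ascii=FALSE)

pos <- diffBytes(x, y)
print(pos)      # indices of the differing bytes
print(x[pos])   # bytes in the first serialization
print(y[pos])   # bytes in the second serialization
```

Applied to the example above, diffBytes(d[[1]]$ser, d[[2]]$ser) would
show which bytes of the serialized names differ, which may hint at what
the extra information is.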

Please enlighten me.

Henrik


