[Rd] BOD causes error in 2.4.0
Martin Maechler
maechler at stat.math.ethz.ch
Fri Aug 11 09:11:42 CEST 2006
>>>>> "Gabor" == Gabor Grothendieck <ggrothendieck at gmail.com>
>>>>> on Thu, 10 Aug 2006 11:34:48 -0400 writes:
Gabor> Using "R version 2.4.0 Under development (unstable)
Gabor> (2006-08-08 r38825)" on Windows XP and starting in a
Gabor> fresh session we get an error if we type BOD. (There
Gabor> is no error in "Version 2.3.1 Patched (2006-06-04
Gabor> r38279)".)
>> BOD
Gabor> Error in data.frame(Time = c("1", "2"), demand = c(" 8.3", "10.3"),
Gabor> check.names = FALSE, :
Gabor> row names contain missing values
Gabor> In addition: Warning message:
Gabor> corrupt data frame: columns will be truncated or padded with NAs in:
Gabor> format.data.frame(x, digits = digits, na.encode = FALSE)
Yes, thank you Gabor.
At first, it's peculiar that our standard checks haven't
detected this bug themselves, since the help page of BOD uses
BOD without any error.
Indeed the error happens in format.data.frame() which is called
from print.data.frame.
Interestingly, good old str() "works" - and quite interestingly
> str(BOD)
'data.frame': 2 obs. of 2 variables:
$ Time : num 1 2 3 4 5 7
$ demand: num 8.3 10.3 19 16 15.6 19.8
- attr(*, "reference")= chr "A1.4, p. 270"
note the '2 obs.' part, when there obviously are 6 observations
...
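The mismatch that str() exposes (a row count that disagrees with the column lengths) can be tested for directly. The following is only a minimal sketch, assuming a current R release rather than the 2006 development snapshot; rows_consistent() is a hypothetical helper, not a function in R, and the "bad" object is corrupted on purpose for illustration.

```r
# Values as shown by str(); data.frame() builds a well-formed object.
bod <- data.frame(Time = c(1, 2, 3, 4, 5, 7),
                  demand = c(8.3, 10.3, 19, 16, 15.6, 19.8))

# Hypothetical helper (not part of R): TRUE when every column has exactly
# as many elements as the data frame claims to have rows.
rows_consistent <- function(df) {
  all(vapply(df, length, integer(1)) == nrow(df))
}

# Deliberately corrupt a copy: only two row names for six values per column.
bad <- bod
attr(bad, "row.names") <- c("1", "2")

ok_good <- rows_consistent(bod)   # well-formed object
ok_bad  <- rows_consistent(bad)   # row count and column lengths disagree
```

A check of this kind does not depend on printing the object, so it would not go through format.data.frame() at all.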
Now, if you really inspect the object,
> dput(BOD, control = "all")
structure(list(Time = c(1, 2, 3, 4, 5, 7), demand = c(8.3, 10.3,
19, 16, 15.6, 19.8)), .Names = c("Time", "demand"), row.names = c(NA,
6), class = "data.frame", reference = "A1.4, p. 270")
it becomes clearer: the row.names have really become a mess,
where they should have been (as in R <= 2.3.x)
the equivalent of
row.names = c("1", "2", "3", "4", "5", "6")
Now if you look at the source code,
<..R..>/src/library/datasets/data/BOD.R
you'll see that the `bug' is already in the source: it has
row.names = c(NA, 6),
explicitly there.
Of course this has something to do with the new R-devel feature
of storing row names in ``compressed'' form when they are equivalent to
as.character(1:n)
and I assume the c(NA, 6) used to be a trick for making the
row.names `compressed' - however the trick was not working correctly.
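For readers who want to look at the compressed representation themselves, here is a small sketch. It assumes a current R release, where .row_names_info() is available; it says nothing about how the r38825 snapshot behaved. The data frame df is made up for illustration.

```r
# A small data frame with default ("automatic") row names.
df <- data.frame(x = c(10, 20, 30))

# type = 1L (the default): the number of rows, negated when the
# row names are automatic.
info_signed <- .row_names_info(df)

# type = 2L: the number of rows, always non-negative.
info_n <- .row_names_info(df, 2L)

# type = 0L: the internal representation, without expansion.
internal <- .row_names_info(df, 0L)

# The user-visible row names are still the character strings "1", "2", "3".
rn <- rownames(df)
```

In such a release the internal value is an integer vector of length two starting with NA, while rownames() still reports as.character(1:n).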
I've temporarily fixed the problem by putting
row.names = as.character(1:6),
there.
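As an illustration only (this is not the actual content of BOD.R), the object can be rebuilt with explicit character row names and then checked. The values are copied from the dput() output; the name BOD_fixed is hypothetical, and the sketch assumes a current R release.

```r
# Rebuild the data frame with explicit character row names,
# mirroring the temporary fix described above.
BOD_fixed <- structure(
  list(Time = c(1, 2, 3, 4, 5, 7),
       demand = c(8.3, 10.3, 19, 16, 15.6, 19.8)),
  names = c("Time", "demand"),
  row.names = as.character(1:6),
  class = "data.frame",
  reference = "A1.4, p. 270"
)

n_rows    <- nrow(BOD_fixed)        # should agree with the column lengths
row_names <- rownames(BOD_fixed)    # should be "1", ..., "6"
formatted <- format(BOD_fixed)      # the step that failed in the report
```

With consistent row names, format() (and hence print()) has six names for six rows and nothing to truncate or pad.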
Thanks again for the report.
Martin Maechler, ETH Zurich