[Rd] Notes on bug reports 3229 and 3242 - as.matrix.data.frame

Gorjanc Gregor Gregor.Gorjanc at bfro.uni-lj.si
Fri Feb 11 15:41:31 CET 2005


Hello R developers.

I encountered the same problem with as.matrix.data.frame() as Uwe Ligges
did in bug reports 3229 and 3242, which are filed under the
not-reproducible section.

My example is:

> tmp
                             level 2100-D
1       biological_process unknown     NA
2                 cellular process  -5.88
3                      development  -8.42
4            physiological process  -6.55
5 regulation of biological process     NA
6                 viral life cycle     NA

> str(tmp)
`data.frame':   6 obs. of  2 variables:
 $ level      : Factor w/ 6 levels "biological_..",..: 1 2 3 4 5 6
 $ 2100-D_mean:`data.frame':    6 obs. of  1 variable:
  ..$ 2100-D: num  NA -5.88 -8.42 -6.55 NA NA

> as.matrix.data.frame(tmp)
Error in as.matrix.data.frame(tmp) : dim<- : dims [product 6] do not 
match the length of object [7]

The error is raised at the end of the function as.matrix.data.frame,
in the line:

    dim(X) <- c(n, length(X)/n)

?dim says
     'dim' has a method for 'data.frame's, which returns the length of
     the 'row.names' attribute of 'x' and the length of 'x' (the
     numbers of "rows" and "columns").

This part is OK. The problem is with X, which is heavily modified
throughout the function. Just before the dim(X) <- ... call,
X in my case is:

> x <- tmp
> "code from as.matrix.data.frame down to dim(X) <- ..."
> X
[[1]]
[1] "biological_process unknown"

[[2]]
[1] "cellular process"

[[3]]
[1] "development"

[[4]]
[1] "physiological process"

[[5]]
[1] "regulation of biological process"

[[6]]
[1] "viral life cycle"

[[7]]
[1]    NA -5.88 -8.42 -6.55    NA    NA

So we can see that X is somehow destroyed: the first column of tmp has
been split into six separate elements, while the second column is kept
as a single element of length 6. For the dim<- call, X should really be
one long vector. So the problem lies in the line

    X <- unlist(X, recursive = FALSE, use.names = FALSE)

where it should be 

    X <- unlist(X, recursive = TRUE, use.names = FALSE)
                               ^^^^
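
A minimal sketch of the difference between the two calls. The list X
below is rebuilt by hand from the printed output above (one character
vector plus one nested data frame); it is an assumption about the state
inside the function, not code taken from its source.

```r
# Hand-built stand-in for X just before the unlist() call (an assumption
# based on the printed output above, not the actual internal state).
n <- 6
X <- list(
  c("biological_process unknown", "cellular process", "development",
    "physiological process", "regulation of biological process",
    "viral life cycle"),
  data.frame(`2100-D` = c(NA, -5.88, -8.42, -6.55, NA, NA),
             check.names = FALSE)
)

flat1 <- unlist(X, recursive = FALSE, use.names = FALSE)
flat2 <- unlist(X, recursive = TRUE,  use.names = FALSE)

length(flat1)  # 7: six strings plus one numeric vector, still a list
length(flat2)  # 12: a single atomic (character) vector

# dim<- fails on the 7-element list but succeeds on the 12-element vector.
res <- try({ dim(flat1) <- c(n, length(flat1) / n) }, silent = TRUE)
inherits(res, "try-error")
dim(flat2) <- c(n, length(flat2) / n)
dim(flat2)
```

With recursive = TRUE the nested data frame is flattened as well, so the
length is a multiple of n and the dim<- assignment goes through.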

I have checked the source code for that function in the R release as
well as in the R-devel sources. I was not successful in reproducing the
above with the data frame below, though: the old as.matrix.data.frame
did not report any problems with it. There must be some trick with the
first column in my data. Still, I am quite sure my suggestion is
OK.

tmp1 <- data.frame(level=c("A A", "B B"), x=c(NA, -5.8))
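
One visible difference is that the str(tmp) output above shows the
second column of tmp as a data frame in its own right, whereas tmp1 has
two plain columns. A sketch of a frame with that nested shape follows;
the name tmp2 and the use of `$<-` to attach the inner data frame are
assumptions made for illustration, and whether as.matrix() fails on it
may depend on the R version.

```r
# Hypothetical reconstruction of the structure shown by str(tmp):
# a factor column plus a column that is itself a one-variable data frame.
tmp2 <- data.frame(
  level = factor(c("biological_process unknown", "cellular process",
                   "development", "physiological process",
                   "regulation of biological process",
                   "viral life cycle"))
)

# Assigning a data frame with `$<-` keeps it as a single nested column.
tmp2$`2100-D_mean` <- data.frame(
  `2100-D` = c(NA, -5.88, -8.42, -6.55, NA, NA),
  check.names = FALSE
)

str(tmp2)
is.data.frame(tmp2[["2100-D_mean"]])
```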

--
Lep pozdrav / With regards,
    Gregor GORJANC

---------------------------------------------------------------
University of Ljubljana
Biotechnical Faculty       URI: http://www.bfro.uni-lj.si
Zootechnical Department    email: gregor.gorjanc <at> bfro.uni-lj.si
Groblje 3                  tel: +386 (0)1 72 17 861
SI-1230 Domzale            fax: +386 (0)1 72 17 888
Slovenia


