[Rd] corrupt data frame: columns will be truncated or padded with NAs in: format.data.frame(x, digits = digits)

Gorjanc Gregor Gregor.Gorjanc at bfro.uni-lj.si
Mon Feb 14 02:54:53 CET 2005


Hello!

On Saturday I posted a message with the same subject to r-help, asking
for help with my work, but I now realize that this list is more
appropriate. I think I have found a bug. Below are comments
and reproducible examples:

# Create a data frame
(tmp <- data.frame(y1=1:4, f1=factor(c("A", "B", "C", "D"))))
  y1 f1
1  1  A
2  2  B
3  3  C
4  4  D

# Add a new column that is only partially filled (no values for the
# last records)
tmp[1:2, "y2"] <- 2
tmp
  y1 f1   y2
1  1  A    2
2  2  B    2
3  3  C <NA>
4  4  D <NA>
Warning message: 
corrupt data frame: columns will be truncated or padded with NAs
in: format.data.frame(x, digits = digits) 

# Why did I get a corrupt data frame?

# Add a new factor column that is only partially filled (no values for
# the last records)
tmp[1:2, "f2"] <- tmp[1:2, "f1"]
tmp
  y1 f1   y2   f2
1  1  A    2    1
2  2  B    2    2
3  3  C <NA> <NA>
4  4  D <NA> <NA>
Warning message: 
corrupt data frame: columns will be truncated or padded with NAs 
in: format.data.frame(x, digits = digits) 

# The new column should have class factor, but it was somehow converted to integer
class(tmp$f2)
[1] "integer"

# If the new column is completely filled, that column (f3) comes out fine
tmp$f3 <- tmp$f1
tmp
  y1 f1   y2   f2 f3
1  1  A    2    1  A
2  2  B    2    2  B
3  3  C <NA> <NA>  C
4  4  D <NA> <NA>  D
Warning message: 
corrupt data frame: columns will be truncated or padded with NAs 
in: format.data.frame(x, digits = digits) 

# Let's go further and try to convert one of the new numeric columns
# to a factor
tmp$y2 <- factor(tmp$y2, labels="x")
tmp
  y1 f1 y2   f2 f3
1  1  A  x    1  A
2  2  B  x    2  B
3  3  C  x <NA>  C
4  4  D  x <NA>  D
Warning message: 
corrupt data frame: columns will be truncated or padded with NAs 
in: format.data.frame(x, digits = digits)

# Why did the NAs also get converted to level x?

# Let's continue and add another column that is again only partially
# filled, this time with no values for the first records
tmp[3:4, "y3"] <- 1
tmp
  y1 f1 y2   f2 f3 y3
1  1  A  x    1  A NA
2  2  B  x    2  B NA
3  3  C  x <NA>  C  1
4  4  D  x <NA>  D  1
Warning message: 
corrupt data frame: columns will be truncated or padded with NAs
in: format.data.frame(x, digits = digits) 

# Notice the difference between <NA> in the previous example and
# NA in the current one.

# Try to convert this to a factor
tmp$y3 <- factor(tmp$y3, labels="y")
tmp
  y1 f1 y2   f2 f3   y3
1  1  A  x    1  A <NA>
2  2  B  x    2  B <NA>
3  3  C  x <NA>  C    y
4  4  D  x <NA>  D    y
Warning message: 
corrupt data frame: columns will be truncated or padded with NAs 
in: format.data.frame(x, digits = digits)

# Works as expected.
# My configuration:
Version:
 platform = i386-pc-mingw32
 arch = i386
 os = mingw32
 system = i386, mingw32
 status = 
 major = 2
 minor = 0.1
 year = 2004
 month = 11
 day = 15
 language = R

Windows XP Professional (build 2600) Service Pack 0.0

--
Lep pozdrav / With regards,
    Gregor GORJANC

---------------------------------------------------------------
University of Ljubljana
Biotechnical Faculty       URI: http://www.bfro.uni-lj.si
Zootechnical Department    email: gregor.gorjanc <at> bfro.uni-lj.si
Groblje 3                  tel: +386 (0)1 72 17 861
SI-1230 Domzale            fax: +386 (0)1 72 17 888
Slovenia


