[Rd] write.dcf/read.dcf cycle converts missing entry to "NA" (PR#9796)
bill at insightful.com
Tue Jul 17 18:58:10 CEST 2007
Full_Name: Bill Dunlap
Version: 2.5.0
OS: Red Hat Enterprise Linux WS release 3 (Taroon Update 6)
Submission from: (NULL) (24.17.60.30)
If you read a dcf file with read.dcf(file, fields=c("Field",...))
and a record in the file does not contain the requested field "Field",
read.dcf() puts a character NA for that entry in its output
matrix. If you then pass the output of read.dcf() to write.dcf(),
it writes the entry as "Field: NA". A subsequent
read.dcf() on write.dcf()'s output file then returns the string "NA",
not a character NA, in the entry for "Field". I think that
write.dcf() should not write a line to the output file for any
entry where the input matrix contains a character NA.
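To make the distinction concrete, here is a minimal sketch of the difference between a character NA and the string "NA". The variable names are only illustrative, and paste() is used here just to show how an NA turns into literal text once it is formatted as a string; I am not claiming this is the exact call path inside write.dcf().

```r
# A character NA (missing value) next to the two-character string "NA".
x <- c(NA_character_, "NA")

# Only the first element is actually missing.
flags <- is.na(x)

# Formatting a missing value as text yields the literal string "NA",
# which is indistinguishable from real content once written to a file.
line <- paste("Field:", NA_character_)
```

Once the entry has been written as text, a reader has no way to tell that it was originally missing.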
Here is a test function to demonstrate the problem. It returns
TRUE when a write.dcf/read.dcf cycle does not change the data.
test.write.dcf <- function () {
    origFile <- tempfile()
    copyFile <- tempfile()
    on.exit(unlink(c(origFile, copyFile)))
    writeLines(c("Package: testA", "Version: 0.1-1", "Depends:", "",
                 "Package: testB", "Version: 2.1" , "Suggests: testA", "",
                 "Package: testC", "Version: 1.3.1", ""),
               origFile)
    orig <- read.dcf(origFile,
                     fields=c("Package","Version","Depends","Suggests"))
    write.dcf(orig, copyFile, width = 72)
    copy <- read.dcf(copyFile,
                     fields=c("Package","Version","Depends","Suggests"))
    value <- all.equal(orig, copy)
    if (!identical(value, TRUE)) {
        attr(value, "orig") <- orig
        attr(value, "copy") <- copy
    }
    value
}
Currently we get
> test.write.dcf()
[1] "'is.NA' value mismatch: 0 in current 4 in target"
attr(,"orig")
     Package Version Depends Suggests
[1,] "testA" "0.1-1" ""      NA
[2,] "testB" "2.1"   NA      "testA"
[3,] "testC" "1.3.1" NA      NA
attr(,"copy")
     Package Version Depends Suggests
[1,] "testA" "0.1-1" ""      "NA"
[2,] "testB" "2.1"   "NA"    "testA"
[3,] "testC" "1.3.1" "NA"    "NA"
With the attached write.dcf() it returns TRUE.
The diff would be
19,22c19,24
< eor <- character(nr * nc)
< eor[seq.int(1, nr - 1) * nc] <- "\n"
< writeLines(paste(formatDL(rep.int(colnames(x), nr), c(t(x)),
< style = "list", width = width, indent = indent), eor,
---
> tx <- t(x)
> not.na <- c(!is.na(tx))
> eor <- character(sum(not.na))
> eor[ c(diff(c(col(tx))[not.na]),0)==1 ] <- "\n"
> writeLines(paste(formatDL(rep.int(colnames(x), nr), c(tx),
> style = "list", width = width, indent = indent)[not.na], eor,
and the entire function would be
`write.dcf` <-
function (x, file = "", append = FALSE, indent = 0.1 * getOption("width"),
    width = 0.9 * getOption("width"))
{
    if (!is.data.frame(x))
        x <- data.frame(x)
    x <- as.matrix(x)
    mode(x) <- "character"
    if (file == "")
        file <- stdout()
    else if (is.character(file)) {
        file <- file(file, ifelse(append, "a", "w"))
        on.exit(close(file))
    }
    if (!inherits(file, "connection"))
        stop("'file' must be a character string or connection")
    nr <- nrow(x)
    nc <- ncol(x)
    tx <- t(x)
    not.na <- c(!is.na(tx))
    eor <- character(sum(not.na))
    eor[ c(diff(c(col(tx))[not.na]),0)==1 ] <- "\n"
    writeLines(paste(formatDL(rep.int(colnames(x), nr), c(tx),
        style = "list", width = width, indent = indent)[not.na], eor,
        sep = ""), file)
}
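As a small worked illustration of how the not.na and eor lines pick the end-of-record positions, here is the same logic applied to a hypothetical two-record matrix. The matrix contents and the helper variable rec are made up for this sketch; only the expressions for tx, not.na, and eor come from the function above.

```r
# Hypothetical input: record 1 has no "Suggests" entry (character NA).
x <- matrix(c("testA", "0.1-1", NA,
              "testB", "2.1",   "testA"),
            nrow = 2, byrow = TRUE,
            dimnames = list(NULL, c("Package", "Version", "Suggests")))

tx <- t(x)                      # fields run down columns, one column per record
not.na <- c(!is.na(tx))         # TRUE for entries that should be written

# Record number of each entry that survives the NA filter.
rec <- c(col(tx))[not.na]

# Mark the last written entry of each record (except the final record)
# with an extra newline, which becomes the blank line between records.
eor <- character(sum(not.na))
eor[c(diff(rec), 0) == 1] <- "\n"
```

Here not.na drops the third entry, rec is 1 1 2 2 2, and only the second surviving entry ("Version" of record 1) gets the "\n" separator. Note that the == 1 test assumes every record has at least one non-NA field.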