[Rd] (PR#9796) write.dcf/read.dcf cycle converts missing entry
ripley at stats.ox.ac.uk
Wed Jul 18 09:43:25 CEST 2007
Bill,
Thanks.
I am seeing some problems here, for example when all the fields are
missing, or all the fields in a row are missing. I've fixes for those,
and will commit to R-devel shortly.
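
As a small illustration of the second case, here is a sketch that applies the end-of-record test from the patch quoted below to a matrix in which one whole row is missing. The example matrix and the names `rec`, `sep_patch` and `sep_any` are made up for this illustration, and the `!= 0` variant is only an assumed alternative, not necessarily what gets committed.

```r
# Hypothetical 3-record example; record 2 has every field missing.
x <- matrix(c("testA", NA, "testC",
              "0.1-1", NA, "1.3.1"),
            nrow = 3,
            dimnames = list(NULL, c("Package", "Version")))

tx <- t(x)                      # fields in rows, records in columns
not.na <- c(!is.na(tx))         # TRUE for fields that would be written
rec <- c(col(tx))[not.na]       # record number of each written field

# Separator test as in the quoted patch: fires only when the record
# number increases by exactly 1.
sep_patch <- c(diff(rec), 0) == 1

# Assumed alternative (not taken from the thread): fire on any change
# of record number.
sep_any <- c(diff(rec), 0) != 0

print(rec)        # 1 1 3 3
print(sep_patch)  # FALSE FALSE FALSE FALSE
print(sep_any)    # FALSE  TRUE FALSE FALSE
```

With `sep_patch` no record separator is flagged at all, so records 1 and 3 would be written without a blank line between them; the `!= 0` test flags the boundary after the last field of record 1.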
On Tue, 17 Jul 2007, bill at insightful.com wrote:
> Full_Name: Bill Dunlap
> Version: 2.5.0
> OS: Red Hat Enterprise Linux WS release 3 (Taroon Update 6)
> Submission from: (NULL) (24.17.60.30)
>
>
> If you read a dcf file with read.dcf(file,fields=c("Field",...))
> and the file does not contain the desired field "Field",
> read.dcf puts a character NA for that entry in its output
> matrix. If you then call write.dcf, passing it the output
> of read.dcf(), it will write the entry "Field: NA". A subsequent
> read.dcf() on write.dcf's output file will then have a "NA",
> not a character NA, in the entry for "Field". I think that
> write.dcf() should not write lines in the output file where
> the input matrix contains a character NA.
>
> Here is a test function to demonstrate the problem. It returns
> TRUE when a write.dcf/read.dcf cycle does not change the data.
>
> test.write.dcf <- function () {
>     origFile <- tempfile()
>     copyFile <- tempfile()
>     on.exit(unlink(c(origFile, copyFile)))
>     writeLines(c("Package: testA", "Version: 0.1-1", "Depends:", "",
>                  "Package: testB", "Version: 2.1" , "Suggests: testA", "",
>                  "Package: testC", "Version: 1.3.1", ""),
>                origFile)
>     orig <- read.dcf(origFile,
>                      fields=c("Package","Version","Depends","Suggests"))
>     write.dcf(orig, copyFile, width = 72)
>     copy <- read.dcf(copyFile,
>                      fields=c("Package","Version","Depends","Suggests"))
>     value <- all.equal(orig, copy)
>     if (!identical(value, TRUE)) {
>         attr(value, "orig") <- orig
>         attr(value, "copy") <- copy
>     }
>     value
> }
> Currently we get
> > test.write.dcf()
> [1] "'is.NA' value mismatch: 0 in current 4 in target"
> attr(,"orig")
>      Package Version Depends Suggests
> [1,] "testA" "0.1-1" ""      NA
> [2,] "testB" "2.1"   NA      "testA"
> [3,] "testC" "1.3.1" NA      NA
> attr(,"copy")
>      Package Version Depends Suggests
> [1,] "testA" "0.1-1" ""      "NA"
> [2,] "testB" "2.1"   "NA"    "testA"
> [3,] "testC" "1.3.1" "NA"    "NA"
> With the attached write.dcf() it returns TRUE.
>
> The diff would be
> 19,22c19,24
> < eor <- character(nr * nc)
> < eor[seq.int(1, nr - 1) * nc] <- "\n"
> < writeLines(paste(formatDL(rep.int(colnames(x), nr), c(t(x)),
> < style = "list", width = width, indent = indent), eor,
> ---
>> tx <- t(x)
>> not.na <- c(!is.na(tx))
>> eor <- character(sum(not.na))
>> eor[ c(diff(c(col(tx))[not.na]),0)==1 ] <- "\n"
>> writeLines(paste(formatDL(rep.int(colnames(x), nr), c(tx),
>> style = "list", width = width, indent = indent)[not.na], eor,
>
> and the entire function would be
>
> `write.dcf` <-
> function (x, file = "", append = FALSE, indent = 0.1 * getOption("width"),
>     width = 0.9 * getOption("width"))
> {
>     if (!is.data.frame(x))
>         x <- data.frame(x)
>     x <- as.matrix(x)
>     mode(x) <- "character"
>     if (file == "")
>         file <- stdout()
>     else if (is.character(file)) {
>         file <- file(file, ifelse(append, "a", "w"))
>         on.exit(close(file))
>     }
>     if (!inherits(file, "connection"))
>         stop("'file' must be a character string or connection")
>     nr <- nrow(x)
>     nc <- ncol(x)
>     tx <- t(x)
>     not.na <- c(!is.na(tx))
>     eor <- character(sum(not.na))
>     eor[ c(diff(c(col(tx))[not.na]),0)==1 ] <- "\n"
>     writeLines(paste(formatDL(rep.int(colnames(x), nr), c(tx),
>         style = "list", width = width, indent = indent)[not.na], eor,
>         sep = ""), file)
> }
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595