[Rd] bug: write.dcf converts hyphen in field name to period
Michael Chirico
m|ch@e|ch|r|co4 @end|ng |rom gm@||@com
Fri Aug 2 09:29:34 CEST 2019
write.dcf(list('my-field' = 1L), tmp <- tempfile())
cat(readLines(tmp))
# my.field: 1
However there's nothing wrong with hyphenated fields per the Debian
standard:
https://www.debian.org/doc/debian-policy/ch-controlfields.html
In fact, that page itself uses hyphenated fields, and read.dcf handles them
just fine:
writeLines(gsub('.', '-', readLines(tmp), fixed = TRUE), tmp)
read.dcf(tmp)
# my-field
# [1,] "1"
The guilty line in write.dcf is the as.data.frame call:
if(!is.data.frame(x)) x <- as.data.frame(x, stringsAsFactors = FALSE)
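
The renaming can be reproduced outside write.dcf. The snippet below is a
minimal sketch, assuming an R version whose as.data.frame method for lists
accepts check.names; the variable names are illustrative only:

```r
x <- list('my-field' = 1L)

# Default behaviour: names are passed through make.names(), so the
# hyphen is replaced by a period.
default_names <- names(as.data.frame(x, stringsAsFactors = FALSE))

# With check.names = FALSE the original name is kept as-is.
kept_names <- names(as.data.frame(x, stringsAsFactors = FALSE,
                                  check.names = FALSE))

print(default_names)
print(kept_names)
```
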
In my case, simply adding check.names=FALSE to this call would solve the
issue, but I think not in general. Here's what I see in the standard:
> The field name is composed of US-ASCII characters excluding control
> characters, space, and colon (i.e., characters in the ranges U+0021 (!)
> through U+0039 (9), and U+003B (;) through U+007E (~), inclusive). Field
> names must not begin with the comment character (U+0023 #), nor with the
> hyphen character (U+002D -).
This could be handled by an adjustment to the next line:
nmx <- names(x)
becomes
nmx <- gsub('^[#-]', '', gsub('[^\U{0021}-\U{0039}\U{003B}-\U{007E}]', '.',
names(x)))
(Or maybe write.dcf should raise an error when given invalid names.)
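
To make the two options concrete, here is a rough sketch of both as
standalone helpers. sanitize_dcf_names and check_dcf_names are hypothetical
names, not existing functions in base R or tools, and the rules are only my
reading of the quoted policy text. Unlike the one-liner above, the sanitizer
strips every leading '#' or '-', not just the first:

```r
# Hypothetical helper: coerce names into valid DCF field names.
sanitize_dcf_names <- function(nm) {
  # Replace anything outside U+0021..U+0039 and U+003B..U+007E with "."
  nm <- gsub("[^!-9;-~]", ".", nm, perl = TRUE)
  # Drop all leading "#" and "-" characters
  sub("^[#-]+", "", nm, perl = TRUE)
}

# Hypothetical helper: refuse invalid names instead of rewriting them.
check_dcf_names <- function(nm) {
  bad <- !nzchar(nm) |
    grepl("[^!-9;-~]", nm, perl = TRUE) |
    grepl("^[#-]", nm, perl = TRUE)
  if (any(bad))
    stop("invalid DCF field name(s): ",
         paste(sQuote(nm[bad]), collapse = ", "))
  invisible(nm)
}

cleaned <- sanitize_dcf_names(c("my-field", "my field", "--flag", "#note", "a:b"))
print(cleaned)
```

Note that the sanitizer can still produce an empty string (for example from
"-"), so an erroring check may be the safer default.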
Michael Chirico