[Rd] feature request: comment character in read.table?
Ben Bolker
bolker@zoo.ufl.edu
Thu, 13 Sep 2001 19:04:49 -0400 (EDT)
On Thu, 13 Sep 2001, Peter Kleiweg wrote:
[snip]
>
> That is not very robust. What about these:
>
> 1> # a comment
> 2> 1 2 3 # a comment
> 3> # a comment
> 4> "1" "2" "3 # not a comment"
> 5> "# not a comment" # a comment
>
> Comments don't have to start at the first column, and comments
> can also exist after real data. A comment char within a string
> should not be taken as the start of a comment, and you also have
> to take into account that the tokens delimiting a string can
> vary.
>
As Brian Ripley has pointed out, he hopes to do this at a lower level,
more robustly, later. In the meantime, in my defense: this code works for
lines 1, 2, and 3 (it's OK with comments that start after the first column
and that exist after real data -- that was part of my spec). It doesn't
deal with comment characters embedded in quoted strings, but I don't have
any problem with telling people that they're not allowed to have comment
characters in quoted strings in their data -- it seems to be a perfectly
reasonable restriction.
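
For readers without the snipped code, here is a minimal sketch of the kind of stripping described above. It is not the code from the thread: the helper name strip_comments and its comment.char argument are made up for illustration, and it simply cuts each line at the first comment character, quoted or not.

```r
# Hypothetical helper (not the original code from this thread):
# cut each line at the first comment character, then drop empty lines.
strip_comments <- function(lines, comment.char = "#") {
  # position of the first comment character, -1 where there is none
  pos <- as.integer(regexpr(comment.char, lines, fixed = TRUE))
  kept <- ifelse(pos > 0, substr(lines, 1, pos - 1), lines)
  kept <- sub("[ \t]+$", "", kept)   # trim trailing blanks
  kept[nzchar(kept)]                 # discard lines that are now empty
}

# The five test lines from the quoted message
lines <- c("# a comment",
           "1 2 3 # a comment",
           "   # a comment",
           "\"1\" \"2\" \"3 # not a comment\"",
           "\"# not a comment\" # a comment")

strip_comments(lines[1:3])   # only the data line "1 2 3" survives
strip_comments(lines[4:5])   # both lines are cut inside the quoted string
```

Lines 1 to 3 come out as intended; lines 4 and 5 are truncated inside the quotes, which is exactly the restriction discussed above.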
If I wanted to hack this further I would probably try to do a strsplit
on quotation characters, and look for comment characters only in the
odd-numbered pieces of the split (the ones that fall outside the quotes).
And if someone puts
"\"\\"\\\" ## " "#" "\\ \" \#"
in their data file, then they deserve what they get ... :-)
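
A rough sketch of that strsplit idea, under some loud assumptions: the function name strip_comments_quoted is hypothetical, only a single quote character is handled (so it does not cover varying string delimiters), and escaped quotes like the ones in the example above are not treated specially.

```r
# Hypothetical sketch: split on the quote character and look for the
# comment character only in the odd-numbered pieces (outside the quotes).
strip_comments_quoted <- function(line, comment.char = "#", quote = "\"") {
  # strsplit() drops a trailing empty piece, so pad with a blank to keep
  # a closing quote at the end of the line; the blank is trimmed again below
  pieces <- strsplit(paste0(line, " "), quote, fixed = TRUE)[[1]]
  out <- character(0)
  for (i in seq_along(pieces)) {
    if (i %% 2 == 1) {   # odd piece: not inside a quoted string
      pos <- as.integer(regexpr(comment.char, pieces[i], fixed = TRUE))
      if (pos > 0) {
        # keep the text before the comment and ignore the rest of the line
        out <- c(out, substr(pieces[i], 1, pos - 1))
        break
      }
    }
    out <- c(out, pieces[i])
  }
  sub("[ \t]+$", "", paste(out, collapse = quote))
}

# The five test lines from the quoted message
lines <- c("# a comment",
           "1 2 3 # a comment",
           "   # a comment",
           "\"1\" \"2\" \"3 # not a comment\"",
           "\"# not a comment\" # a comment")

cleaned <- vapply(lines, strip_comments_quoted, character(1), USE.NAMES = FALSE)
cleaned
```

With these inputs, lines 4 and 5 keep their quoted strings intact and only the trailing comment on line 5 is removed. Comment-only lines come back as empty strings, so they still need to be dropped before the data are parsed.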
Ben Bolker