[R] How to comment in R
Duncan Murdoch
murdoch at stats.uwo.ca
Wed Feb 11 22:37:03 CET 2009
On 2/11/2009 3:52 PM, Wacek Kusnierczyk wrote:
> Duncan Murdoch wrote:
>> On 2/11/2009 2:27 PM, Wacek Kusnierczyk wrote:
>>> Duncan Murdoch wrote:
>>>> On 2/11/2009 1:21 PM, Stavros Macrakis wrote:
>>>>> On Wed, Feb 11, 2009 at 12:32 PM, Greg Snow <Greg.Snow at imail.org>
>>>>> wrote:
>>>>>> ...The c-style of /* */ allows both types and you can comment out
>>>>>> part of a line, but it is not simple to match and has its own
>>>>>> restrictions. Friedl in his regular expressions book takes 10 pages
>>>>>> to develop a pattern to match these (and the final pattern is almost
>>>>>> 2 full lines of text in the book). And this is without allowing
>>>>>> nesting....
>>>>>
>>>>> Though there is a real debate about the value of multiline, possibly
>>>>> nested, comments, the regular expression argument is a red herring.
>>>>> Lexical analysis of multiline comments is a solved problem (and not a
>>>>> particularly difficult one!), and matters only to language and editor
>>>>> implementors. Emacs handles them with no problem.
>>>>
>>>> I agree about that. I think the lack of multiline comments comes from
>>>> design considerations rather than implementation ones. They're just
>>>> not needed, and having two types of comments would lead to weird
>>>> interactions, e.g. is the block comment closed in the lines below?
>>>>
>>>> /*
>>>> # */
>>>
>>> it's still a simple design decision which, once made, can be implemented
>>> coherently.
>>
>> I agree it's a design decision and once made implementation would be
>> easy, but I don't agree that it's a simple design decision. If it
>> was, you'd be able to tell me the obvious answer to my question. If
>> the answer isn't obvious, then it's a source of errors in programs
>> when people assume the wrong behaviour.
>
> these are things you shouldn't be willing to say. is it *obvious* that
> x[,1] should return a vector not a data frame or matrix (depending on
> what x is)? how *obvious* is it? how *obvious* will it be to a user
> who has little experience with r and will make best guesses? how
> *obvious* will it be to a user who comes from matlab or the like?
>
> i can tell you the obvious answer to your question: if tfm said that a
> # being the first non-whitespace character in a line appearing in an
> open comment block does not work as a single line comment tag, then the
> block comment above is closed. if tfm said otherwise, it would not be
> closed.
>
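
For concreteness, the two candidate rules can be sketched as a toy scanner. This is purely hypothetical: R has no /* */ comments, and the name block_closed and its hash_hides_close flag are invented here for illustration. The flag also generalizes the rule slightly, to any # inside an open block rather than only a leading one, and the scanner ignores string literals entirely.

```r
# Hypothetical sketch: R has no /* */ block comments.
# Returns TRUE if every block comment opened in `src` is closed by its end.
# hash_hides_close = TRUE  : a '#' inside an open block still comments out
#                            the rest of its line, so a later "*/" is ignored.
# hash_hides_close = FALSE : '#' has no meaning inside an open block.
# String literals are deliberately not handled.
block_closed <- function(src, hash_hides_close) {
  lines <- strsplit(src, "\n", fixed = TRUE)[[1]]
  in_block <- FALSE
  for (line in lines) {
    pos <- 1L
    while (pos <= nchar(line)) {
      one <- substr(line, pos, pos)
      two <- substr(line, pos, pos + 1L)
      if (!in_block && two == "/*") {
        in_block <- TRUE
        pos <- pos + 2L
      } else if (in_block && two == "*/") {
        in_block <- FALSE
        pos <- pos + 2L
      } else if (one == "#" && (!in_block || hash_hides_close)) {
        break  # rest of this line is a line comment
      } else {
        pos <- pos + 1L
      }
    }
  }
  !in_block
}

src <- "/*\n# */"
block_closed(src, hash_hides_close = TRUE)   # block stays open
block_closed(src, hash_hides_close = FALSE)  # block is closed
```

Either rule is a few lines to implement; the two calls give opposite answers for the same input, which is the design question under discussion.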
> again and again, one message pervasive on this list is: if you don't
> know, if something is not obvious*, rtfm carefully. here's your
> answer: decide how to treat overlapping multiline and single-line
> comments, put into a man page, and you're done. just like with x[,1].
> and it doesn't really have to be obvious or intuitive -- just like with
> x[,1].
>
> * i'd add: the more something is obvious, the more you're advised to rtfm.
>
>
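
For readers unfamiliar with the x[,1] behaviour referred to above, a minimal illustration follows. The objects m and df are made up for the example; the drop argument is the documented way to keep the original structure.

```r
# Made-up example objects
m  <- matrix(1:6, nrow = 2)
df <- data.frame(a = 1:3, b = letters[1:3])

# Selecting a single column drops to a plain vector by default
is.matrix(m[, 1])                      # FALSE
is.data.frame(df[, 1])                 # FALSE

# drop = FALSE keeps the matrix / data frame structure
is.matrix(m[, 1, drop = FALSE])        # TRUE
is.data.frame(df[, 1, drop = FALSE])   # TRUE
```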
>>
>> In another message, you pointed out the ugly construct
>>
>> x <- "
>> # not a comment
>> "
>>
>> which arises because R allows multi-line string literals, and also
>> gives priority to string content over # comments. This is a bad
>> design, in my opinion. (There shouldn't be multi-line string literals.)
>
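
A small check of the behaviour described above, written out as a runnable sketch. The variable names are arbitrary; the point is only that the # inside the quotes is string content, while the # after an expression starts a comment.

```r
# The '#' here is inside a string literal, so it is part of the value
x <- "
# not a comment
"
identical(x, "\n# not a comment\n")   # TRUE
grepl("#", x, fixed = TRUE)           # TRUE

# Outside a string, '#' comments out the rest of the line
y <- 1 # + 1
y                                     # 1
```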
> it's a design choice which can be considered bad or good, depending on
> the point of view. you're sort of saying things you should not be
> willing to say, again. when i criticized r, with concrete examples, for
> the design being bad because of *incoherence*, i was promptly accused of
> being dogmatic, and the issues were explained away as design
> 'features'. now you're saying 'bad design' -- well, it's just a choice,
> not even leading to any incoherence, why would it have to be bad? i
> find it more convenient and readable to write
>
> x =
> "foo
> bar
> dee"
>
> rather than
>
> x = "foo\nbar\ndee"
>
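
The two spellings above produce the same value, which can be confirmed directly. The names multi and escaped are chosen here only for the comparison, and the multi-line form must have no leading whitespace on the continuation lines for the check to hold.

```r
# Newlines typed inside the literal become part of the string
multi <- "foo
bar
dee"

# Same string written with escape sequences
escaped <- "foo\nbar\ndee"

identical(multi, escaped)   # TRUE
```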
>
> in perl you can write multiline string literals, whereby you include the
> newlines in the string, as in r. which is sort of redundant, since perl
> supports here documents. in python you can enter a string literal on
> multiple lines escaping the newlines, but the newlines do not belong in
> the string. which is good design, which is bad?
>
> btw. if you strike the tab key while writing a string literal in an r
> script, you get a tab character inside the string. is this bad design,
> too? how different are tabs from newlines? from spaces?
>
>
>> You probably haven't studied Rd syntax as closely as I have lately,
>
> fortunately, haven't at all.
>
>> but in an Rd file, # comments are treated as R treats them (i.e.
>> within a string they don't do anything), % comments are always in
>> effect. You should see how many people get confused by the handling
>> of % comments in Rd files. But that's not something we can change:
>> there are something like 100000 Rd files on CRAN.
>
> i can believe it; it's quite enough for me to see how many people get
> repeatedly confused by many other features in r.
>
>
>
>>
>> So if the behaviour isn't very obvious in cases like the one quoted
>> above, and if the new syntax doesn't add any new expressiveness, I
>> would be opposed to adding it.
>
> what about cases where the syntax is 'obvious' and yet r does the
> opposite of what one would expect?
We shouldn't add features like that either. In some cases we have (or S
did), and the thousands of CRAN packages mean we're stuck with most of them.
Duncan Murdoch