[R] How to comment in R

Duncan Murdoch murdoch at stats.uwo.ca
Wed Feb 11 22:37:03 CET 2009


On 2/11/2009 3:52 PM, Wacek Kusnierczyk wrote:
> Duncan Murdoch wrote:
>> On 2/11/2009 2:27 PM, Wacek Kusnierczyk wrote:
>>> Duncan Murdoch wrote:
>>>> On 2/11/2009 1:21 PM, Stavros Macrakis wrote:
>>>>> On Wed, Feb 11, 2009 at 12:32 PM, Greg Snow <Greg.Snow at imail.org>
>>>>> wrote:
>>>>>> ...The c-style of /* */ allows both types and you can comment out
>>>>>> part of a line, but it is not simple to match and has its own
>>>>>> restrictions.  Friedl in his regular expressions book takes 10 pages
>>>>>> to develop a pattern to match these (and the final pattern is almost
>>>>>> 2 full lines of text in the book).  And this is without allowing
>>>>>> nesting....
>>>>>
>>>>> Though there is a real debate about the value of multiline, possibly
>>>>> nested, comments, the regular expression argument is a red herring.
>>>>> Lexical analysis of multiline comments is a solved problem (and not a
>>>>> particularly difficult one!), and matters only to language and editor
>>>>> implementors.  Emacs handles them with no problem.
>>>>
>>>> I agree about that.  I think the lack of multiline comments comes from
>>>> design considerations rather than implementation ones.   They're just
>>>> not needed, and having two types of comments would lead to weird
>>>> interactions, e.g. is the block comment closed in the lines below?
>>>>
>>>>   /*
>>>> #  */
>>>
>>> it's still a simple design decision which, once made, can be implemented
>>> coherently.
>>
>> I agree it's a design decision and once made implementation would be
>> easy, but I don't agree that it's a simple design decision.  If it
>> was, you'd be able to tell me the obvious answer to my question.  If
>> the answer isn't obvious, then it's a source of errors in programs
>> when people assume the wrong behaviour.
> 
> these are things you shouldn't be willing to say.  is it *obvious* that
> x[,1] should return a vector not a data frame or matrix (depending on
> what x is)?  how *obvious* it is?  how *obvious* will it be to a user
> who has little experience with r and will make best guesses?  how
> *obvious* will it be to a user who comes from matlab or the like?
> 
> i can tell you the obvious answer to your question:  if tfm said that a
> # being the first non-whitespace character in a line appearing in an
> open comment block does not work as a single line comment tag, then the
> block comment above is closed.  if tfm said otherwise, it would not be
> closed.
> 
> again and again, one message pervasive on this list is:  if you don't
> know, if something is not obvious*, rtfm carefully.  here's your
> answer:  decide how to treat overlapping multiline and single-line
> comments, put into a man page, and you're done.  just like with x[,1]. 
> and it doesn't really have to be obvious or intuitive -- just like with
> x[,1].
> 
> * i'd add:  the more something is obvious, the more you're advised to rtfm.
> 
> 
>>
>> In another message, you pointed out the ugly construct
>>
>> x <- "
>> # not a comment
>> "
>>
>> which arises because R allows multi-line string literals, and also
>> gives priority to string content over # comments.  This is a bad
>> design, in my opinion.  (There shouldn't be multi-line string literals.) 
> 
> it's a design choice which can be considered bad or good, depending on
> the point of view.  you're sort of saying things you should not be
> willing to say, again.  when i criticized r, with concrete examples, for
> the design being bad because of *incoherence*, i was promptly accused of
> being dogmatic, and the issues were explained away as design
> 'features'.  now you're saying 'bad design' -- well, it's just a choice,
> not even leading to any incoherence, why would it have to be bad?  i
> find it more convenient and readable to write
> 
> x =
> "foo
> bar
> dee"
> 
> rather than
> 
> x = "foo\nbar\ndee"
> 
> 
> in perl you can write multiline string literals, whereby you include the
> newlines in the string, as in r.  which is sort of redundant, since perl
> supports here documents.  in python you can enter a string literal on
> multiple lines escaping the newlines, but the newlines do not belong in
> the string.  which is good design, which is bad? 
> 
> btw. if you strike the tab key while writing a string literal in an r
> script, you get a tab character inside the string.  is this bad design,
> too?  how different are tabs from newlines?  from spaces?
> 
> 
>> You probably haven't studied Rd syntax as closely as I have lately, 
> 
> fortunately, haven't at all.
> 
>> but in an Rd file, # comments are treated as R treats them (i.e.
>> within a string they don't do anything), % comments are always in
>> effect.  You should see how many people get confused by the handling
>> of % comments in Rd files.  But that's not something we can change: 
>> there are something like 100000 Rd files on CRAN.
> 
> i can believe, it's pretty enough for me to see how many people get
> repeatedly confused by many other features in r.
> 
> 
> 
>>
>> So if the behaviour isn't very obvious in cases like the one quoted
>> above, and if the new syntax doesn't add any new expressiveness, I
>> would be opposed to adding it.
> 
> what about cases where the syntax is 'obvious' and yet r does the
> opposite of what one would expect?

We shouldn't add features like that either.  In some cases we have (or S 
did), and the thousands of CRAN packages mean we're stuck with most of them.

Duncan Murdoch
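
For readers who want to try the two behaviours argued over in this thread, here is a minimal sketch in base R. It is only an illustration, not something posted by any of the participants; the names y, m and df are made up for the example. It shows that a # inside a string literal is string content rather than a comment, that a literal newline and the \n escape produce the same string, and that x[, 1] drops to a vector unless drop = FALSE is given.

```r
# A '#' inside a string literal is part of the string, not a comment.
x <- "
# not a comment
"
identical(x, "\n# not a comment\n")    # TRUE

# A literal newline in the source and the \n escape give the same string.
y <- "foo
bar
dee"
identical(y, "foo\nbar\ndee")          # TRUE

# Indexing a single column drops the dimensions by default.
m <- matrix(1:6, nrow = 3)
is.matrix(m[, 1])                      # FALSE: a plain integer vector
is.matrix(m[, 1, drop = FALSE])        # TRUE: still a 3 x 1 matrix

# The same default applies to data frames.
df <- data.frame(a = 1:3, b = 4:6)
is.data.frame(df[, 1])                 # FALSE: a plain integer vector
is.data.frame(df[, 1, drop = FALSE])   # TRUE: still a data frame
```

Passing drop = FALSE explicitly is the usual way to avoid depending on the default when the number of selected columns is not known in advance.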



