[Bioc-devel] All Package Maintainers Please README!

Fri Oct 26 19:33:43 CEST 2007

James W. MacDonald wrote:
> I'm not sure I like this idea, mainly because it is a passive
> filtering on the part of the package maintainer, and a pretty big
> assumption on my part.
>
> Something active like adding an INTERNAL or as Oleg mentioned, a # at
> the beginning of the line signals to me that the maintainer really
> wants this stuff filtered out, whereas only taking the first line
> requires the assumption that the maintainer really only wanted the
> first line to be public.
>
> Also note that a line in this instance is demarcated by a newline
> character, so as long as you don't hit the return key, any number of
> sentences will be filtered by a single INTERNAL or #, so it shouldn't
> be burdensome to filter things out.
>
> Anyway, I'm not sure we should be filtering this stuff out regardless.
> As an example here are some recent commit messages:
>
> First from Biobase:
>
> esApply for ExpressionSet
>
> * made esApply(X, MARGIN, FUN, ...) a generic
>
> * method for exprSet same as previous function esApply
>   (inappropriately modifies FUN environment, breaking lexical scope)
>
> * method for ExpressionSet equivalent to
>
>   with(pData(X), apply(exprs(X), MARGIN, FUN, ...))
>
> * Documentation to follow shortly
>
> Or from AnnotationDBI:
>
> Remove bad constraint from probes table on chip packages
>
> This change only affects packages for rodents and humans.  It drops
> the not null constraint from the accessions column which is
> problematic since lousy platforms will sometimes have probes that are
> measuring "who-knows what".  Users ought to have a right to know when
> they are using a probe like this...
>
>
> Do you really want to argue that the first line in these messages is
> all that should show up in the changelog? In both cases the first line
> is pretty cryptic, but the second line is actually quite useful.
>
> Jim
>
>
>
> Martin Morgan wrote:
>> Jim --
>>
>> I was actually writing identical sentiment (from across the hall), so
>> there's a second vote for just the first line (and filtering empty
>> first lines). Martin
>>
>> Marc Carlson <mcarlson at fhcrc.org> writes:
>>
>>> James W. MacDonald wrote:
>>>> Robert recently suggested that I make a stab at a blog-based
>>>> changelog rather than the current monthly postings, sort of similar
>>>> to what Duncan Murdoch has done with the R NEWS and windows CHANGELOG.
>>>>
>>>> The biggest difference between what is done for R and what I will
>>>> be doing for BioC is this: R-core does a really good job of writing
>>>> explanatory notes describing what the change was, and what it means
>>>> for the end user.
>>>>
>>>> On the other hand, the commit messages that people use range from
>>>> the ridiculous to the sublime. Since I will no longer be parsing
>>>> the commit messages by hand, I will not be able to remove the more
>>>> useless messages that people tend to use, and these things will go
>>>> straight to the changelog for all to see.
>>>>
>>>> So, first thing; if you don't want your section of the changelog to
>>>> be populated with things like 'WTF was I thinking?!@!?@!?' or
>>>> 'Oops', or the venerable 'commit' or better yet, the ever popular '
>>>> ', you will want to actually use a commit message that means
>>>> something with respect to the commit you just made.
>>>>
>>>> Now I know some of the commit messages are not intended for public
>>>> consumption, so there is a way out. If you prepend your commit
>>>> message with INTERNAL, then it will be scrubbed. Or at least I
>>>> think it will ;-D. I'm using Python for the first time to do the
>>>> parsing, so I am sure there are bugs aplenty. Note that this
>>>> INTERNAL thing is _by line_, so if you do something like:
>>>>
>>>> INTERNAL This is a commit message nobody should ever see.
>>>>
>>>> But they can see this one.
>>>>
>>>> Then the second part of the message _should_ get through. Note that
>>>> you need to use INTERNAL exactly, as it is always possible that
>>>> someone might use Internal at the beginning of a commit message
>>>> that they want published, so I am not doing any case-changing on
>>>> the test for this string.
>>>>
>>>> The changelog as it currently exists (with just one day of changes
>>>> so far) can be viewed here:
>>>>
>>>> http://fgc.lsi.umich.edu/cgi-bin/blosxom.cgi
>>>>
>>>> Please take a look and send me any suggestions.
>>>>
>>>> Best,
>>>>
>>>> Jim
>>>>
>>>>
>>>>
>>>>   
>>> Hmmm,
>>> Here in Seattle, Seth had most of us making commit messages where the
>>> 1st line was a brief title describing the major contents of the change
>>> and then there would be a line break followed by any of the gory
>>> details that might be needed to carefully describe what the title
>>> meant.  I like this format because part of a commit message is to say
>>> briefly what changes have taken place, and partly it's also a place to
>>> make personal notes so that later on you can remember what you were
>>> thinking at the time.  So my 1st point is that by habit some of us
>>> already separate these two with a linebreak.
>>>
>>> Partly because I adopted this habit already and partly because I don't
>>> want to live in constant fear of what might slip into my commit
>>> messages, it might be nice if you just captured the 1st section and
>>> then allowed us to tag any lines that fall below that linebreak with a
>>> character if we want them to also be in the public eye (with the rest
>>> remaining private by default)?  Of course I could also tag the lower
>>> stuff, but then I am typing lots of extra characters with each commit.
>>>
>>>     Marc
>>>
>>> _______________________________________________
>>> Bioc-devel at stat.math.ethz.ch mailing list
>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>
You have a point, Jim, but I think that we also have to consider how this
blog will change what we put into our commit messages.  Prior to this, I
have primarily used the 1st line as a "title" and then followed up with
a more detailed description.  But the existence of this blog means that
I will lean towards putting more information in the 1st section (rather
than just a title), and just shift any private information into the
second part.

I am really just suggesting that a standardized format separated by a
clean line break would require the least typing from everyone
involved.  I like this because at least at the fhcrc we are already
using this format, so it's very similar and the format seems to work
pretty well.
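
To make the "first section only" idea concrete, here is a minimal sketch
of how a parser might keep everything up to the first blank line and drop
the rest.  The function name public_section is made up for illustration;
this is not the code Jim is actually running, and it assumes that "line
break" means one empty (or whitespace-only) line.

```python
def public_section(message):
    """Return the text before the first blank line of a commit message.

    Hypothetical helper: leading blank lines are skipped so that an
    empty first line does not hide the whole message.
    """
    lines = message.splitlines()

    # Skip any blank lines at the very top of the message.
    start = 0
    while start < len(lines) and not lines[start].strip():
        start += 1

    # Collect lines until the first blank line ends the public section.
    section = []
    for line in lines[start:]:
        if not line.strip():
            break
        section.append(line.rstrip())
    return "\n".join(section)


if __name__ == "__main__":
    msg = (
        "Remove bad constraint from probes table on chip packages\n"
        "\n"
        "This change only affects packages for rodents and humans.\n"
    )
    print(public_section(msg))
```

A message with no blank line at all would be published in full under
this sketch, which matches the usual one-line commit.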

This won't change the fact that most people will still just type one
cryptic line for most commits.  But at least those of us who want to
put more personal data in there (purely for the sake of our personal
recollection) will not be penalized by having to type lots of comment
characters.
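
For comparison, the per-line scheme quoted above (INTERNAL, or a # at the
beginning of the line) could look roughly like the sketch below.  The
names MARKERS and scrub_marked_lines are invented here, and the match is
deliberately case-sensitive and anchored at the very start of the line,
following Jim's description; his real parser may well differ.

```python
# Markers that hide a line from the public changelog (assumed set,
# taken from the INTERNAL and # suggestions in this thread).
MARKERS = ("INTERNAL", "#")


def scrub_marked_lines(message):
    """Drop every line that starts with one of MARKERS.

    The test is case-sensitive, so a line starting with 'Internal'
    is kept.  Surrounding blank lines left over after scrubbing are
    trimmed from the result.
    """
    kept = [line for line in message.splitlines()
            if not line.startswith(MARKERS)]
    return "\n".join(kept).strip()


if __name__ == "__main__":
    msg = (
        "INTERNAL This is a commit message nobody should ever see.\n"
        "\n"
        "But they can see this one.\n"
    )
    print(scrub_marked_lines(msg))
```

With this approach every private line needs its own marker, which is the
extra typing I would like to avoid.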

Of course, I would also like to still have a title for my commits, so
perhaps we should really have 3 sections?  A title, a public
description, and then a private section which could all be separated by
two line breaks?
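
A rough sketch of the three-section idea, assuming that "two line breaks"
means one blank line between sections and that anything after the second
blank line stays private.  The function name split_sections and the
dictionary keys are hypothetical.

```python
import re


def split_sections(message):
    """Split a commit message into title, public and private parts.

    Sections are separated by a blank line.  Only the first two
    separators count, so the private part may itself contain blank
    lines.  Missing sections come back as empty strings.
    """
    parts = re.split(r"\n\s*\n", message.strip(), maxsplit=2)
    parts += [""] * (3 - len(parts))
    title, public, private = parts
    return {"title": title, "public": public, "private": private}


if __name__ == "__main__":
    msg = (
        "esApply for ExpressionSet\n"
        "\n"
        "made esApply(X, MARGIN, FUN, ...) a generic\n"
        "\n"
        "note to self: revisit the exprSet method\n"
    )
    sections = split_sections(msg)
    print(sections["title"])
    print(sections["public"])
```

A one-line commit would simply yield a title with empty public and
private parts, so nobody who types a single line has to change anything.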

    Marc