[Bioc-devel] All Package Maintainers Please README!

Florian Hahne fhahne at fhcrc.org
Fri Oct 26 20:02:07 CEST 2007

Marc Carlson wrote:
> James W. MacDonald wrote:
>> I'm not sure I like this idea, mainly because it is a passive
>> filtering on the part of the package maintainer, and a pretty big
>> assumption on my part.
>> Something active like adding an INTERNAL or as Oleg mentioned, a # at
>> the beginning of the line signals to me that the maintainer really
>> wants this stuff filtered out, whereas only taking the first line
>> requires the assumption that the maintainer really only wanted the
>> first line to be public.
>> Also note that a line in this instance is demarcated by a newline
>> character, so as long as you don't hit the return key, any number of
>> sentences will be filtered by a single INTERNAL or #, so it shouldn't
>> be burdensome to filter things out.
>> Anyway, I'm not sure we should be filtering this stuff out regardless.
>> As an example here are some recent commit messages:
>> First from Biobase:
>> esApply for ExpressionSet
>> * made esApply(X, MARGIN, FUN, ...) a generic
>> * method for exprSet same as previous function esApply
>>   (inappropriately modifies FUN environment, breaking lexical scope)
>> * method for ExpressionSet equivalent to
>>   with(pData(X), apply(exprs(X), MARGIN, FUN, ...))
>> * Documentation to follow shortly
>> Or from AnnotationDBI:
>> Remove bad constraint from probes table on chip packages
>> This change only affects packages for rodents and humans.  It drops
>> the not null constraint from the accessions column which is
>> problematic since lousy platforms will sometimes have probes that are
>> measuring "who-knows what".  Users ought to have a right to know when
>> they are using a probe like this...
>> Do you really want to argue that the first line in these messages is
>> all that should show up in the changelog? In both cases the first line
>> is pretty cryptic, but the second line is actually quite useful.
>> Jim
>> Martin Morgan wrote:
>>> Jim --
>>> I was actually writing identical sentiment (from across the hall), so
>>> there's a second vote for just the first line (and filtering empty
>>> first lines). Martin
>>> Marc Carlson <mcarlson at fhcrc.org> writes:
>>>> James W. MacDonald wrote:
>>>>> Robert recently suggested that I make a stab at a blog-based
>>>>> changelog rather than the current monthly postings, sort of similar
>>>>> to what Duncan Murdoch has done with the R NEWS and windows CHANGELOG.
>>>>> The biggest difference between what is done for R and what I will
>>>>> be doing for BioC is this; R-core does a really good job of writing
>>>>> explanatory notes describing what the change was, and what it means
>>>>> for the end user.
>>>>> On the other hand, the commit messages that people use range from
>>>>> the ridiculous to the sublime. Since I will no longer be parsing
>>>>> the commit messages by hand, I will not be able to remove the more
>>>>> useless messages that people tend to use, and these things will go
>>>>> straight to the changelog for all to see.
>>>>> So, first thing; if you don't want your section of the changelog to
>>>>> be populated with things like 'WTF was I thinking?!@!?@!?' or
>>>>> 'Oops', or the venerable 'commit' or better yet, the ever popular '
>>>>> ', you will want to actually use a commit message that means
>>>>> something with respect to the commit you just made.
>>>>> Now I know some of the commit messages are not intended for public
>>>>> consumption, so there is a way out. If you prepend your commit
>>>>> message with INTERNAL, then it will be scrubbed. Or at least I
>>>>> think it will ;-D. I'm using Python for the first time to do the
>>>>> parsing, so I am sure there are bugs aplenty. Note that this
>>>>> INTERNAL thing is _by line_, so if you do something like:
>>>>> INTERNAL This is a commit message nobody should ever see.
>>>>> But they can see this one.
>>>>> Then the second part of the message _should_ get through. Note that
>>>>> you need to use INTERNAL exactly, as it is always possible that
>>>>> someone might use Internal at the beginning of a commit message
>>>>> that they want published, so I am not doing any case-changing on
>>>>> the test for this string.
>>>>> The changelog as it currently exists (with just one day of changes
>>>>> so far) can be viewed here:
>>>>> http://fgc.lsi.umich.edu/cgi-bin/blosxom.cgi
>>>>> Please take a look and send me any suggestions.
>>>>> Best,
>>>>> Jim
>>>> Hmmm,
>>>> Here in Seattle, Seth had most of us making commit messages where the
>>>> 1st line was a brief title describing the major contents of the change
>>>> and then there would be a line break followed by any of the gory details
>>>> that might be needed to carefully describe what the title meant.  I like
>>>> this format because part of a commit message is to say briefly what
>>>> changes have taken place, and partly it's also a place to make personal
>>>> notes so that later on you can remember what you were thinking at the
>>>> time.  So my 1st point is that by habit some of us already separate
>>>> these two with a linebreak.
>>>> Partly because I adopted this habit already and partly because I don't
>>>> want to live in constant fear of what might slip into my commit
>>>> messages, it might be nice if you just captured the 1st section and then
>>>> allowed us to tag any lines that fall below that linebreak with a
>>>> character if we want them to also be in the public eye (with the rest
>>>> remaining private by default)?  Of course I could also tag the lower
>>>> stuff, but then I am typing lots of extra characters with each commit.
>>>>     Marc
>>>> _______________________________________________
>>>> Bioc-devel at stat.math.ethz.ch mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
> You have a point Jim, but I think that we also have to consider how this
> blog will change what we put into our commit messages.  Prior to this, I
> have primarily used the 1st line as a "title" and then followed up with
> a more detailed description.  But the existence of this blog means that
> I will lean towards putting more information in the 1st section (rather
> than just a title), and just shift any private information into the
> second part.
> I am really just suggesting that a standardized format separated by a
> clean line break would be the least amount of typing by everyone
> involved.  I like this because at least at the fhcrc we are already
> using this format, so it's very similar and the format seems to work
> pretty well.
> This won't change the fact that most people will still just type one
> cryptic line for most commits.  But at least those of us who want to
> put more personal data in there (purely for the sake of our personal
> recollection) will not be penalized by having to type lots of comment
> characters.
> Of course, I would also like to still have a title for my commits, so
> perhaps we should really have 3 sections?  A title, a public
> description, and then a private section which could all be separated by
> two line breaks?
>     Marc
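For reference, the three-section layout Marc proposes above (a title, a public description, and a private section, each separated by a blank line) could be parsed roughly as follows. This is only a sketch in Python, since that is what Jim says he is using. The function name split_sections and the private note in the example are made up for illustration, and none of this reflects the actual changelog script.

```python
import re


def split_sections(message):
    """Split a commit message into (title, public, private) on blank lines.

    Assumption: sections are separated by one or more blank lines, and
    everything after the second section is treated as private.
    """
    parts = [p.strip() for p in re.split(r"\n\s*\n", message.strip())]
    parts = [p for p in parts if p]
    title = parts[0] if parts else ""
    public = parts[1] if len(parts) > 1 else ""
    private = "\n\n".join(parts[2:])
    return title, public, private


# Title and public text taken from the AnnotationDBI message quoted in
# this thread; the private note is hypothetical.
example = """Remove bad constraint from probes table on chip packages

This change only affects packages for rodents and humans.

note to self: check the other chip packages again"""

title, public, private = split_sections(example)
print(title)
print(public)
```

Only the title and public parts would go to the changelog. A one-line commit still works and simply yields empty public and private sections.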
Splitting messages into fixed sections would make them less flexible
than using a comment character. I often check in changes for several
different things at once, and it is much easier to read (and type)
something like

  some change on foo
  # need to fix the docs for this guy
  bar can now handle....
  # but still can't...

This way I can comment on the individual parts directly, whereas if we
separate internal and public material into two sections, I have to repeat
in the internal part what each note refers to.
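
To make the per-line idea concrete, here is a minimal sketch of such a filter in Python. It assumes a line is private when it starts with INTERNAL (matched case-sensitively, as Jim described) or with #, and that leading whitespace and empty lines can be ignored. The name public_lines is made up, and this is not Jim's actual parser.

```python
def public_lines(message, markers=("INTERNAL", "#")):
    """Return the lines of a commit message that are not marked private.

    Assumptions: a line is private if, after stripping surrounding
    whitespace, it starts with one of the markers. The test is
    case-sensitive, so a line starting with "Internal" is kept.
    """
    kept = []
    for line in message.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # drop empty lines
        if stripped.startswith(markers):
            continue  # drop lines marked private
        kept.append(stripped)
    return kept


message = """some change on foo
# need to fix the docs for this guy
bar can now handle....
# but still can't..."""

print(public_lines(message))
# ['some change on foo', 'bar can now handle....']
```

Each public line keeps its private comment right next to it in the commit log, and nothing has to be repeated in a separate section.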
