[BioC] An ESR essay about software design, and how it applies to Bioconductor

Naomi Altman naomi at stat.psu.edu
Sat Feb 28 06:16:10 MET 2004


I would say Bioconductor is not extremely friendly, but neither is it 
extremely unfriendly to statisticians.

I have had a number of statistics graduate students, and others at the same 
level, get programs working for me with very little input from me.  None of 
them had much prior experience with R.

I do know R/Splus well enough to write primitive functions, so I can look 
at code when the programs do not work as expected.  However, I am not very 
comfortable working with the objects and "slots" created by the 
Bioconductor software.  Nevertheless, by using the Vignettes and on-line 
help, I have managed to write my own software that uses these objects.  I 
have had some problems, but by ploughing through this e-mail list I have 
solved most by myself.

Most of my biologist collaborators are struggling more than I am, but still 
getting things to work - sometimes with my help and often without.  All of 
us are finding the responsiveness of people on this list to be extremely 
helpful.

Yes, we could probably do some things faster with commercial software.  But 
then we would not get the most up-to-date methods, easy access to the 
functionality of R, and the benefit of interaction on this list.   I have 
heard of vendors charging in the 30K range for small parts of the 
functionality that is already built into Bioconductor.

In conclusion - my thanks to the developers and to people who answer 
questions posted to this list.  Of course, if anyone wants to improve the 
documentation, I am all for it.

--Naomi


At 04:00 PM 2/27/2004, Vincent Carey 525-2265 wrote:
> > friendly", and even "adequately documented" may be completely different
> > from the rest of humanity!  I immediately thought of certain BioC
> > packages I've recently bashed my head over (and over and over).
>
>the developers are fairly responsive to questions
>
> >
> > At the end of the essay ESR presents a checklist for telling whether
> > your software suffers from problems similar to the ones he describes.
> > For the benefit of any package developers/maintainers who may still be
> > reading this, here's my version of that checklist as revised
> > specifically for Bioconductor:
> >
> >    1. What does the package look like to a computer person who isn't a
> >       statistician or a statistician who isn't a computer person? What
> >       would be the most obvious thing someone unfamiliar with your
> >       package would try to use it for... and if they did, would they
> >       succeed after having done nothing more than read the manpage?
>
>we've taken care to develop a "vignette" protocol in addition
>to man pages so that the user may get a holistic view of a software
>component's roles.  all bioc packages have vignettes.  admittedly
>these are not perfect but they help to illustrate and test
>interoperability.
>
> >    2. Is there any dialogue in the Tcl widgets which is a dead end,
> >       without giving guidance on what the choices actually do? (although
> >       if you read ESR's essay you might conclude that there's no point
> >       to even having widgets, since a GUI does not automatically
> >       translate into user friendliness)
>
>some widgets are extremely useful.  no essay would convince
>me to eliminate them.  there is clearly scope for improvement
>with some of them.  we have taken care to provide widget-building
>tools so that user/developers dissatisfied with the behavior
>of a given widget can try to design one that is more effective.
>
> >    3. The requirement that end-users read documentation is NOT a sign of
> >       failure for a program such as R which mostly lacks a UI... but...
> >           * Is every argument, method, and slot of every non-private
> >             object documented in the manpage
> >             *for that object* (rather than referring to some other
> >             manpage which in turn refers to another manpage, ad nauseam)?
>
>that is the intention of the documentation validation protocol
>of R CMD check.  it can be subverted, and when it is, we try
>to remedy it.
>
> >           * Are the usage examples you give in the manpage simple,
> >             general, and comprehensible both to statisticians who aren't
> >             computer people and computer people who aren't
> >             statisticians? Hint: gratuitous use of functions that aren't
> >             from the package you're documenting reduces comprehensibility.
>
>perhaps not.  perhaps you have a better example to contribute.
>again the vignettes help to provide context.  there is also
>a browser for vignettes called vExplorer
>
> >           * Does the documentation rely on references to hardcopy
> >             publications to explain crucial portions of the object's
> >             functionality instead of using external references as
> >             supplementary/background material?
>
>perhaps.  we have limited resources for what we are doing and
>sometimes a demand must be made on the user or reader to
>obtain an explanatory resource.
>
> >           * If there is a significant number of usage scenarios where
> >             the default argument values will be inappropriate, is the
> >             user warned?
> >           * Are the manpages in sync with the current package version?
>
>they should be, and there are mechanisms for verifying this.
>
> >    4. Do you ever find yourself using any phrase resembling "The syntax
> >       is just like it is for the S-Plus version"?
>
>no.
>
> >    5. Does your project welcome and respond to usability feedback from
> >       non-expert users?
>
>yes.
>
> >    6. Do error messages give enough information to be able to
> >       distinguish between malformed input/arguments, platform
> >       limitations (memory, drive space, access permissions), problems in
> >       R itself, and other ("other" presumably being the real bugs)?
>
>in many cases, yes.  in other cases, no.  provide resources
>so that we can add programming effort to exception-handling
>features and this situation will improve.
>
> >
> > Thank you for your patience in reading this. I don't pretend to
> > understand the technical complexity of your work, nor your motivations
> > for doing it. However, if you do write open source software such as
> > Bioconductor packages, it would be logical to at least assume that you
> > want other people to use your software. Hopefully the above
> > considerations will assist in making that happen.
>
>it is happening.
>
>_______________________________________________
>Bioconductor mailing list
>Bioconductor at stat.math.ethz.ch
>https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor

Naomi S. Altman                                814-865-3791 (voice)
Associate Professor
Bioinformatics Consulting Center
Dept. of Statistics                              814-863-7114 (fax)
Penn State University                         814-865-1348 (Statistics)
University Park, PA 16802-2111


