[Bioc-devel] A new beginner

Martin Morgan mtmorgan at fhcrc.org
Thu Feb 17 16:47:38 CET 2011


On 02/17/2011 06:34 AM, Stefano Calza wrote:
> Ciao
> 
> you probably mean you have been programming using S3 methods not S4.
> 
> Using S4 methods is not compulsory though highly recommended. At
> least this is my understanding. There are packages in BioC not using
> S4, therefore I assume you can go ahead like this.

Actually, new package authors should really think of S4 as 'compulsory'.
Here are two reasons for this:

1. Classes provide a way to structure the complicated data that we
typically see in high throughput assays. For instance, a class can
coordinate sample descriptions with expression values, minimizing the
mix-ups that occur when the user subsets one but not the other.

2. Classes provide a way for users to combine different packages. For
instance the ExpressionSet returned by affy's justRMA can be used
directly by arrayQualityMetrics. This is both convenient for the user
and minimizes opportunities for error. For this reason it is often a
good strategy to re-use existing classes (like ExpressionSet in the
microarray world, or the classes in IRanges / Biostrings in the
sequencing world), rather than to invent new ones.
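
To make point 1 concrete, here is a minimal sketch using only the
methods package. The class 'PairedAssay' and its slot names are made up
for illustration; they are not taken from Biobase or any other package.
The validity function ties the columns of the expression matrix to the
rows of the sample table, and the '[' method subsets both at once, so
the two cannot drift apart.

```r
library(methods)

## Hypothetical class: one column of 'exprs' per row of 'samples'.
setClass("PairedAssay",
    representation(exprs = "matrix", samples = "data.frame"),
    validity = function(object) {
        if (ncol(object@exprs) != nrow(object@samples))
            return("ncol(exprs) must equal nrow(samples)")
        if (!identical(colnames(object@exprs), rownames(object@samples)))
            return("colnames(exprs) must match rownames(samples)")
        TRUE
    })

## Subsetting by sample (j) subsets the matrix columns and the
## sample rows together; subsetting by feature (i) touches exprs only.
setMethod("[", "PairedAssay", function(x, i, j, ..., drop = FALSE) {
    if (missing(i)) i <- seq_len(nrow(x@exprs))
    if (missing(j)) j <- seq_len(ncol(x@exprs))
    new("PairedAssay",
        exprs = x@exprs[i, j, drop = FALSE],
        samples = x@samples[j, , drop = FALSE])
})

m <- matrix(c(5.1, 6.2, 7.3, 8.4, 5.5, 6.6, 7.7, 8.8, 5.9, 6.0, 7.1, 8.2),
            nrow = 4,
            dimnames = list(c("g1", "g2", "g3", "g4"), c("s1", "s2", "s3")))
pd <- data.frame(group = c("ctl", "trt", "trt"),
                 row.names = c("s1", "s2", "s3"))

x <- new("PairedAssay", exprs = m, samples = pd)
y <- x[, c("s1", "s3")]   # both slots now describe s1 and s3 only
```

Constructing an object whose matrix and sample table disagree, e.g.
new("PairedAssay", exprs = m, samples = pd[1:2, , drop = FALSE]),
signals an error rather than silently producing a mismatched object.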

The S4 requirement is not meant to get in the way of high-quality
algorithms; a good strategy is to implement algorithms that operate on
basic data types (a matrix of expression values, for instance) but
expose these as methods on an S4 object.
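
A rough sketch of that strategy, assuming a toy class: the algorithm
('rowVariance') is an ordinary function on a matrix, and the S4 method
is a thin wrapper that extracts the matrix and calls it. The names
'ToyExprs', 'rowVariance' and 'featureVariance' are invented for this
example.

```r
library(methods)

## The algorithm: per-row sample variance of a plain numeric matrix.
## It knows nothing about classes, so it is easy to test and re-use.
rowVariance <- function(m) {
    centered <- m - rowMeans(m)          # recycles down columns, i.e. per row
    rowSums(centered^2) / (ncol(m) - 1)
}

## Hypothetical container holding the expression matrix.
setClass("ToyExprs", representation(exprs = "matrix"))

## The user-facing interface: a generic with a method on the class.
setGeneric("featureVariance",
    function(object, ...) standardGeneric("featureVariance"))

setMethod("featureVariance", "ToyExprs",
    function(object, ...) rowVariance(object@exprs))

m <- matrix(c(1, 4, 2, 8, 3, 9, 7, 5, 6, 2, 10, 1), nrow = 3,
            dimnames = list(c("g1", "g2", "g3"), NULL))
obj <- new("ToyExprs", exprs = m)
v <- featureVariance(obj)
```

The design choice is that the numerical code stays independent of the
class definition, while users only ever see the method.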

In terms of examples, one possibility is the 'StudentGWAS' package we'll
use in a course here at the Hutch in the next two days; it'll become
available at

http://bioconductor.org/help/course-materials/2011/

soon. It implements a single class with essential methods (constructor,
accessors, show) and a method for doing something a little more
substantial, so it's not too complicated. Next choices would be Biobase
(something like AnnotatedDataFrame might be a good start) or for a more
advanced example IRanges.
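
For orientation, the "essential methods" mentioned above might look
roughly like this. It is a generic sketch, not code from StudentGWAS or
Biobase; the class 'ToySet', its slots and its accessor names are
hypothetical.

```r
library(methods)

setClass("ToySet",
    representation(values = "numeric", label = "character"))

## Constructor: a plain function so users never call new() directly.
ToySet <- function(values, label = "unnamed") {
    new("ToySet", values = as.numeric(values), label = label)
}

## Accessors: generics plus methods, so users never touch slots with '@'.
setGeneric("values", function(object, ...) standardGeneric("values"))
setMethod("values", "ToySet", function(object, ...) object@values)

setGeneric("label", function(object, ...) standardGeneric("label"))
setMethod("label", "ToySet", function(object, ...) object@label)

## show: a compact summary instead of a dump of every slot.
setMethod("show", "ToySet", function(object) {
    cat("class: ", class(object), "\n", sep = "")
    cat("label: ", label(object), "\n", sep = "")
    cat("length: ", length(values(object)), "\n", sep = "")
})

obj <- ToySet(c(1.5, 2, 3), label = "demo")
obj   # printing at the prompt dispatches to the show method
```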

limma does use S4 classes, e.g., RGList. While these classes are a
little 'loose' for my taste, they represent for the authors a compromise
between structuring data and implementing foundational algorithms, and
the package was developed at a time when S4 was more in flux than it is
currently.

Martin

> 
> Most packages in BioC use S4 methods, so just pick one not too
> complex!
> 
> regards
> 
> Stefano
> 
> On Thu, Feb 17, 2011 at 02:17:31PM +0000, Stefano Berri wrote: 
> > Hi everybody.
> >
> > I am about to start assembling my code to make my first Bioconductor
> > package.
> >
> > I've read the instructions about "Package Guidelines" and "Package
> > Submission" and I will try to follow those instructions the best I can.
> >
> > I have a first question, however.
> > You seem to ask for your code to be in S4 Classes and Methods
> >
> > ( Packages should also conform to the following:
> > * Use S4 classes and methods.)
> >
> > At the moment I wrote my code in the form
> >
> > List <- myFunction(List, bar = bar, foo = foo)
> >
> > Using "plain functions" and Lists as input and output.
> > I was inspired by 'limma' that, as far as I understand, works this way.
> >
> > Can I submit using this interface or shall I really use S4 implementation? If
> > so, could you recommend a simple package that uses classes as you would
> > recommend that I can use as template/inspiration/guide for my code?
> >
> > Thank you very much
> >
> > Stefano Berri
> >
> > _______________________________________________
> > Bioc-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
> 


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793


