[Bioc-devel] A new beginner

Wolfgang Huber whuber at embl.de
Thu Feb 17 17:14:29 CET 2011


Hi

I find that S4 classes really help with writing robust, maintainable and 
elegant code, since they let you put related data into one object and 
can automate much of the validity checking that would be very tedious 
to do by hand, e.g. with lists.
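
A minimal sketch of the kind of validity checking I mean. The class
name 'ExprData' and its slots are made up for illustration and are not
taken from any package:

```r
library(methods)

# Hypothetical class: an expression matrix plus one annotation row per sample.
setClass("ExprData",
    representation(exprs = "matrix", samples = "data.frame"),
    validity = function(object) {
        # Return TRUE if valid, otherwise a string describing the problem.
        if (ncol(object@exprs) != nrow(object@samples))
            return("number of columns of 'exprs' must equal number of rows of 'samples'")
        TRUE
    })

# Consistent data: 2 columns in 'exprs', 2 rows in 'samples'.
ok <- new("ExprData",
          exprs = matrix(rnorm(6), nrow = 3, ncol = 2),
          samples = data.frame(group = c("a", "b")))

# Inconsistent data is rejected when the object is constructed;
# here the error message is captured instead of stopping the script.
bad <- tryCatch(
    new("ExprData",
        exprs = matrix(rnorm(6), nrow = 3, ncol = 2),
        samples = data.frame(group = c("a", "b", "c"))),
    error = function(e) conditionMessage(e))
```

With a plain list, the same check would have to be repeated by hand in
every function that receives the data.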

My mileage with S4 methods is more variable. If a certain method is only 
ever going to be used with one particular signature, one might as well 
implement it as a normal function, since the syntax for doing so and the 
debugging are simpler, and not much is lost in other respects.

There are, of course, also examples where S4 methods are very useful, 
like what Martin mentioned. Often there is one big, "substantial" 
function (S4 or not), which is wrapped by different S4-method 
definitions that do some datatype-specific pre- or postprocessing.
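
A rough sketch of that pattern, under the assumption that the
"substantial" function is a simple row-centering step. All names here
('.centerRows', 'centerRows', 'ExprSetLite') are hypothetical and only
serve the illustration:

```r
library(methods)

# The "substantial" function: a plain function on a basic data type.
.centerRows <- function(x) {
    stopifnot(is.matrix(x), is.numeric(x))
    x - rowMeans(x)  # subtract each row's mean from that row
}

# Hypothetical container class, defined here only for illustration.
setClass("ExprSetLite", representation(exprs = "matrix"))

setGeneric("centerRows", function(object, ...) standardGeneric("centerRows"))

# matrix: no pre- or postprocessing needed.
setMethod("centerRows", "matrix", function(object, ...) .centerRows(object))

# data.frame: convert on the way in, convert back on the way out.
setMethod("centerRows", "data.frame", function(object, ...) {
    as.data.frame(.centerRows(as.matrix(object)))
})

# S4 container: extract the matrix, then put the result back into the object.
setMethod("centerRows", "ExprSetLite", function(object, ...) {
    object@exprs <- .centerRows(object@exprs)
    object
})

m <- matrix(c(1, 2, 3, 4, 5, 9), nrow = 2)
res <- centerRows(new("ExprSetLite", exprs = m))
```

The plain function can be tested and debugged on its own, and the
methods stay short because they only handle the conversion.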

	Wolfgang

Martin Morgan wrote on 17/02/11 16:47:
> On 02/17/2011 06:34 AM, Stefano Calza wrote:
>> Ciao
>>
>> you probably mean you have been programming using S3 methods not S4.
>>
>> Using S4 methods is not compulsory though highly recommended. At
>> least this is my understanding. There are packages in BioC not using
>> S4, therefore I assume you can go ahead like this.
>
> Actually, new package authors should really think of S4 as 'compulsory'.
> Here are two reasons for this:
>
> 1. Classes provide a way to structure the complicated data that we
> typically see in high throughput assays. For instance, coordinating
> sample descriptions with expression values and thus minimizing mix-ups
> when the user subsets one but not the other.
>
> 2. Classes provide a way for users to combine different packages. For
> instance the ExpressionSet returned by affy's justRMA can be used
> directly by arrayQualityMetrics. This is both convenient for the user
> and minimizes opportunities for error. For this reason it is often a
> good strategy to re-use existing classes (like ExpressionSet in the
> microarray world, or the classes in IRanges / Biostrings in the
> sequencing world), rather than to invent new ones.
>
> The S4 requirement is not meant to get in the way of high-quality
> algorithms; a good strategy is to implement algorithms that operate on
> basic data types (a matrix of expression values, for instance) but
> expose these as methods on an S4 object.
>
> In terms of examples, one possibility is the 'StudentGWAS' package we'll
> use in a course here at the Hutch in the next two days; it'll become
> available at
>
> http://bioconductor.org/help/course-materials/2011/
>
> soon. It implements a single class with essential methods (constructor,
> accessors, show) and a method for doing something a little more
> substantial, so it's not too complicated. Next choices would be Biobase
> (something like AnnotatedDataFrame might be a good start) or for a more
> advanced example IRanges.
>
> limma does use S4 classes, e.g., RGList, etc. While these classes are a
> little 'loose' for my taste, they represent for the authors a compromise
> between structuring data and implementing foundational algorithms, and
> the package was developed at a time when S4 was more in flux than it is
> currently.
>
> Martin
>
>>
>> Most packages in BioC use S4 methods, so just pick one not too
>> complex!
>>
>> regards
>>
>> Stefano
>>
>> On Thu, Feb 17, 2011 at 02:17:31PM +0000, Stefano Berri wrote:
>>> Hi everybody.
>>>
>>> I am about to start assembling my code to make my first Bioconductor
>>> package.
>>>
>>> I've read the instructions about "Package Guidelines" and "Package
>>> Submission" and I will try to follow those instructions the best I can.
>>>
>>> I have a first question, however.
>>> You seem to ask for your code to be in S4 Classes and Methods
>>>
>>> ( Packages should also conform to the following:
>>> * Use S4 classes and methods.)
>>>
>>> At the moment I wrote my code in the form
>>>
>>> List <- myFunction(List, bar = bar, foo = foo)
>>>
>>> Using "plain functions" and Lists as input and output.
>>> I was inspired by 'limma' that, as far as I understand, works this way.
>>>
>>> Can I submit using this interface or shall I really use an S4
>>> implementation? If so, could you recommend a simple package that uses
>>> classes as you would recommend that I can use as
>>> template/inspiration/guide for my code?
>>>
>>> Thank you very much
>>>
>>> Stefano Berri
>>
>
>


-- 


Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber


