[Bioc-devel] Request to add 'normalize' to BiocGenerics

Schalkwyk, Leonard leonard.schalkwyk at kcl.ac.uk
Wed Feb 20 17:32:44 CET 2013



Is this not just an indication that normalize is now a poor choice of a function name?

Leo

On 20 Feb 2013, at 16:14, Wolfgang Huber wrote:

> Hi
> 
> Is it clear that all these different functions (methods) share similar semantics and, conceptually, enough of their interface?
> 
> Wouldn't the implication be that preemptively every possible string of characters should already be defined as a generic function in BiocGenerics?
> 
> 	Best wishes
> 	Wolfgang
> 
> On Feb 20, 2013, at 11:04 AM, Laurent Gatto <lg390 at cam.ac.uk> wrote:
> 
>> On 19 February 2013 22:44, Hervé Pagès <hpages at fhcrc.org> wrote:
>>> Hi Laurent, and maintainers of packages with a normalize() function,
>>> 
>>> 
>>> On 02/15/2013 04:28 AM, Laurent Gatto wrote:
>>>> 
>>>> A quick (and incomplete) manual search using
>>>> http://search.bioconductor.jp/ suggests the following usage of
>>>> normalize:
>>>> 
>>>> As a function:
>>>> xps::normalize
>>>> codelink::normalize
>>>> EBImage::normalize
>>>> diffGeneAnalysis::normalize
>>>> 
>>>> Defining a generic and methods:
>>>> oligo::normalize
>>>> flowCore::normalize
>>>> MSnbase::normalize
>>>> isobar::normalize
>>>> 
>>>> and
>>>> 
>>>> several normalize\.[*+] functions
>>>> 
>>>> Would it be reasonable to add a normalize generic definition to
>>>> BiocGenerics?  The generic definitions in the above packages differ,
>>>> however.
>>> 
>>> 
>>> Sounds good to me.
>>> 
>>> However, since the various normalize() functions have different
>>> signatures, we need to agree on what the signature of the generic
>>> in BiocGenerics should be.
>>> 
>>> Here is a summary of the situation:
>>> 
>>> ** xps package: normalize() is an ordinary function with the
>>>    following arg list:
>>> 
>>>      normalize(xps.data, filename=character(0), filedir=getwd(),
>>>                tmpdir="", update=FALSE, select="all", method="mean",
>>>                option="transcript:all", logbase="0", exonlevel="",
>>>                refindex=0, refmethod="mean", params=list(0.02, 0),
>>>                add.data=TRUE, verbose=TRUE)
>>> 
>>>    The package also defines normalize.constant(), normalize.lowess(),
>>>    normalize.quantiles(), normalize.supsmu(), which are also ordinary
>>>    functions (not S3 methods), and have similar but slightly
>>>    different arg lists.
>>> 
>>> ** codelink package: Ordinary function with the following args:
>>> 
>>>      normalize(object, method="quantiles", log.it=TRUE,
>>>                preserve=FALSE, weights=NULL, verbose=FALSE)
>>> 
>>> ** EBImage package: Ordinary function with the following args:
>>> 
>>>      normalize(x, separate=TRUE, ft=c(0, 1))
>>> 
>>> ** diffGeneAnalysis package: Ordinary function with the following
>>>    args:
>>> 
>>>      normalize(rawdata, numSlides, ctrl, expm, ctrlbg=0.30,
>>>                expmbg=0.30)
>>> 
>>> ** deepSNV package: S4 generic with the following args:
>>> 
>>>      normalize(test, control, ...)
>>> 
>>> ** isobar package: S4 generic with the following args:
>>> 
>>>      normalize(x, f=median, target="intensity", exclude.protein=NULL,
>>>                use.protein=NULL, f.doapply=TRUE, log=TRUE,
>>>                channels=NULL, na.rm=FALSE, per.file=TRUE, ...)
>>> 
>>> ** affy package: S4 generic with the following args:
>>> 
>>>      normalize(object, ...)
>>> 
>>> ** flowCore package: S4 generic with the following args:
>>> 
>>>      normalize(data, x, ...)
>>> 
>>> ** MSnbase package: S4 generic with the following args:
>>> 
>>>      normalize(object, method, ...)
>>> 
>>> ** oligo package: S4 generic with the following args:
>>> 
>>>      normalize(object, method=normalizationMethods(),
>>>                copy=TRUE, subset=NULL,
>>>                target='core', verbose=TRUE, ...)
>>> 
>>> So it looks like the greatest common factor is normalize(x, ...).
>>> Not too surprising for a generic that covers such a wide range of
>>> related but slightly different concepts / algorithms.
>>> 
>>> One technical difficulty though is that, even though almost all these
>>> functions seem to take an S4 object as their 1st arg, some of them
>>> don't:
>>> 
>>> (a) For EBImage::normalize(), 'x' can be an ordinary array in
>>>     addition to an Image object.
>>> 
>>> (b) For diffGeneAnalysis::normalize(), 'rawdata' is an ordinary
>>>     matrix.
>>> 
>>> (c) For deepSNV::normalize(), 'test' can be an ordinary matrix
>>>     in addition to a deepSNV object.
>>> 
>>> (d) For oligo::normalize(), 'object' can be an ordinary matrix
>>>     in addition to a FeatureSet object.
>>> 
>>> So how can we disambiguate when the first arg is an ordinary matrix?
>>> IMO normalize() should fail in that case, i.e. no package should define
>>> methods for ordinary arrays or matrices. Not ideal but better than the
>>> current situation where what is returned depends on which package was
>>> loaded last.
>>> 
>>> I could put normalize(x, ...) in BiocGenerics if nobody objects, but
>>> that's all. I don't have time to fix the 10 packages that this change
>>> will affect. However, I'd rather wait until the beginning of the Bioc 2.13
>>> devel cycle (April) for such a change. For some packages like
>>> diffGeneAnalysis (which doesn't use S4 at all), that will probably
>>> require a significant amount of changes since it will need to pass
>>> the data to normalize in an S4 container instead of an ordinary matrix.
>>> 
>>> Comments and suggestions are welcome.
>> 
>> Fine by me.
>> 
>> Laurent
>> 
>>> Thanks,
>>> H.
>>> 
>>>> 
>>>> Best wishes,
>>>> 
>>>> Laurent
>>>> 
>>>> _______________________________________________
>>>> Bioc-devel at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>> 
>>> 
>>> --
>>> Hervé Pagès
>>> 
>>> Program in Computational Biology
>>> Division of Public Health Sciences
>>> Fred Hutchinson Cancer Research Center
>>> 1100 Fairview Ave. N, M1-B514
>>> P.O. Box 19024
>>> Seattle, WA 98109-1024
>>> 
>>> E-mail: hpages at fhcrc.org
>>> Phone:  (206) 667-5791
>>> Fax:    (206) 667-1319
>> 
> 
> 
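For reference, the proposal quoted above (a generic with the minimal signature normalize(x, ...), methods only for S4 classes, and failure on ordinary matrices) could look roughly like the sketch below. This is only an illustration under stated assumptions: the class ToyIntensities, its slot, and the median/mean centring are hypothetical and not taken from any of the packages listed, and the definition that ends up in BiocGenerics may differ.

```r
library(methods)

## Minimal generic with the signature suggested in the thread.
## (Assumption: defined here locally, not the actual BiocGenerics code.)
setGeneric("normalize", function(x, ...) standardGeneric("normalize"))

## Hypothetical S4 container standing in for a package-specific class.
setClass("ToyIntensities", representation(values = "numeric"))

## A method may add its own arguments after 'x' because the generic has '...'.
setMethod("normalize", "ToyIntensities",
    function(x, method = "median", ...) {
        method <- match.arg(method, c("median", "mean"))
        centre <- switch(method,
                         median = median(x@values),
                         mean   = mean(x@values))
        x@values <- x@values - centre
        x
    })

obj <- new("ToyIntensities", values = c(1, 2, 6))
res <- normalize(obj, method = "median")

## No method is defined for ordinary matrices, so dispatch fails with an
## error instead of silently depending on which package was loaded last.
failed <- inherits(try(normalize(matrix(1:4, 2)), silent = TRUE), "try-error")
```

With this layout each package keeps its own extra arguments in its methods, and only the first argument name has to be agreed on.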


