[BioC] request for simple usage of probe level normalisations

Mon Oct 8 11:30:05 CEST 2007

Dear All,

a) I don't know whether sequence-based models like GCRMA (where, as I have
read, "GC" actually stands for "GeneChip (tm)", not GC content) can be
extended to other platforms.
I am currently looking at single-color Agilent chips, and there is a GC
content bias:
log2(intensity)~gc percentage:
Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 1.023391   0.105189   9.729   <2e-16 ***
gcp         0.121826   0.002503  48.666   <2e-16 ***
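
For readers who want to reproduce this kind of check, here is a minimal
sketch in Python (standard library only) of fitting log2(intensity) against
probe GC percentage by ordinary least squares. The function names, the probe
sequences and the intensities are all hypothetical and made up for
illustration; they are not the Agilent data behind the coefficients above.

```python
from math import fsum


def gc_percent(seq):
    """Percentage of G and C bases in a probe sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = fsum(x) / n
    mean_y = fsum(y) / n
    sxx = fsum((xi - mean_x) ** 2 for xi in x)
    sxy = fsum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Synthetic toy data (assumption): five short probes and invented
# log2 intensities that lie exactly on a straight line.
probes = ["ATATATATAT", "ATGCATATAT", "ATGCATGCAT", "GCGCATGCAT", "GCGCGCGCAT"]
log2_intensity = [1.0, 3.5, 6.0, 8.5, 11.0]

gcp = [gc_percent(p) for p in probes]
intercept, slope = fit_line(gcp, log2_intensity)
print(f"intercept={intercept:.3f} slope={slope:.3f}")
```

A positive slope, as in the output above, means that probes with higher GC
content tend to give higher intensities, which is the bias a sequence-based
background model would try to correct.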

b) I know one should not ask open source developers for things, but:

If GCRMA is applicable to other platforms, it would be nice if it could be
used with them in a simple way, since new oligo chip platforms are becoming
more and more common.

I read the documentation of the oligo package and of makePDpackage, which
apparently will be superseded by pdInfoBuilder in the future.

Would it be possible to make this simpler somehow? I don't know exactly what
information the downstream analysis with GCRMA actually needs, but wouldn't
it be sufficient if creating a new environment required just 2 simple
tab-delimited text files*? Then one could simply write a script that
converts one's own format (which is not .ndf or .cdf) to this _simple_
tab-delimited format, whose specification would be clearly outlined in the
package vignette.

Maybe I am underestimating the complexity (I am ignoring spatial information
on the chip), or it's already there (yes, CDF etc. files can be faked).

thank you very much,

ido

*e.g.:
file1: 
oligo name, sequence, gene name (for grouping multiple oligos)
file2: annotation
gene or oligo name (if not grouped), annotations....
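
To make the proposal concrete, here is a rough sketch in Python (standard
library only) of how such a pair of tab-delimited files might be read and
the oligos grouped by gene. The column order follows the layout above; the
function names, the sample rows and the rule that an oligo without a gene
name forms its own group are my assumptions, not part of any existing
Bioconductor package.

```python
import csv
import io
from collections import defaultdict


def read_probes(handle):
    """Read file1 rows (oligo name, sequence, gene name), grouped by gene."""
    groups = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue
        # Pad so that a missing third column is treated as "no gene name".
        oligo, sequence, gene = (row + [""])[:3]
        # Assumption: ungrouped oligos are keyed by their own name.
        groups[gene or oligo].append((oligo, sequence.upper()))
    return dict(groups)


def read_annotation(handle):
    """Read file2 rows: gene or oligo name, then any annotation columns."""
    return {row[0]: row[1:] for row in csv.reader(handle, delimiter="\t") if row}


# Hypothetical file contents, held in memory for the example.
file1 = (
    "oligo_1\tACGTACGT\tgeneA\n"
    "oligo_2\tGGCCTTAA\tgeneA\n"
    "oligo_3\tTTTTGGGG\t\n"
)
file2 = "geneA\texample annotation\noligo_3\tungrouped probe\n"

probes_by_gene = read_probes(io.StringIO(file1))
annotation = read_annotation(io.StringIO(file2))

for name, oligos in probes_by_gene.items():
    print(name, len(oligos), annotation.get(name, []))
```

Any in-house format could then be handled by a small converter that writes
these two files, without having to fake .ndf or .cdf files.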