[BioC] RMA in Bioconductor versus APT - missing probesets

Thu Mar 3 23:58:44 CET 2011

Hi Michal,
Yes, I think it's wrong to use rma/just.rma on ST data. Since working this out, I never do, except in existing array QC pipelines that rely on 'rma', where I don't care about a few missing or erroneous probesets.
So consider these excellent alternatives: oligo, xps, or apt-probeset-summarize (from Affymetrix Power Tools).
I currently use oligo because I can write pure R code with no dependency on ROOT, but I will probably switch to xps because, once it's installed, the same pipeline can handle both ST arrays and older genome arrays, and it can calculate DABG calls on the ST arrays.

my 2 cents

Mark
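
A minimal sketch of the probe-reuse behaviour described in the quoted message below, written in Python as a toy model rather than the actual rma/just.rma code. The function name, probe IDs, and intensities are hypothetical, and a plain mean stands in for RMA's median-polish summarization. It only illustrates how a "each probe is used at most once" rule can leave a later probeset with no probes, and hence a missing value.

```python
import math


def summarize_single_use(probesets, intensities):
    """Toy model (assumption): each probe may contribute to only one probeset.

    probesets   -- dict mapping probeset name -> list of probe IDs
    intensities -- dict mapping probe ID -> intensity value
    Returns a dict mapping probeset name -> mean intensity, or NaN when
    every probe in the probeset was already consumed by earlier probesets.
    """
    used = set()
    result = {}
    for name, probes in probesets.items():
        # Only probes not yet claimed by an earlier probeset are available.
        available = [p for p in probes if p not in used]
        if available:
            # A plain mean stands in for the real summarization step.
            result[name] = sum(intensities[p] for p in available) / len(available)
        else:
            result[name] = math.nan
        used.update(available)
    return result


# Hypothetical layout in which probes are shared between probesets,
# as can happen on ST arrays.
probesets = {
    "ps_A": ["p1", "p2"],
    "ps_B": ["p2", "p3"],
    "ps_C": ["p1", "p3"],
}
intensities = {"p1": 8.0, "p2": 10.0, "p3": 12.0}

summary = summarize_single_use(probesets, intensities)
missing = [name for name, value in summary.items() if math.isnan(value)]
print(missing)
```

In this toy layout ps_B is summarized from only one of its two probes and ps_C ends up with none, which mirrors both the erroneous and the missing probesets mentioned above.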

On 04/03/2011, at 2:24 AM, Michal Blazejczyk wrote:

> Dear Mark,
> 
> Thank you for your answer.
> 
> Please correct me if I'm getting the wrong impression, but doesn't this mean
> that just.rma() and rma() are simply wrong in this case?  And if that's the case
> then should they be used for ST data?  In previous versions of Bioconductor they
> simply did not work (there was no cdf environment) but now that they do users will
> be using them, generating results that are not complete...
> 
> Best,
> Michał
> 
> 
> 
> Mark Cowley <m.cowley at garvan.org.au> wrote:
>> Michal,
>> in just.rma and rma, it is assumed that each probe can be in at most one
>> probeset: once a probe has been used, it cannot be reused.
>> On the ST arrays, some probes can be in many probesets, so if you use rma,
>> eventually all the probes in a probeset have already been used by the time the
>> current probeset needs them, and you get NAs.
> 
>> Mark
> 
>> On 24/02/2011, at 8:40 AM, Michal Blazejczyk wrote:
> 
>>> Dear Christian,
>>> 
>>> I am aware of the existence of xps.  However, we can't use it for our purposes,
>>> largely because it is too complicated to set up (or at least, that was the case
>>> the last time we looked at it).  I would still like to know what's happening in
>>> just.rma()  :)
>>> 
>>> Best,
>>> Michał
>>> 
>>> 
>>> 
>>> cstrato <cstrato at aon.at> wrote:
>>>> Dear Michal,
>>> 
>>>> As an alternative to just.rma() you could use the Bioconductor package 
>>>> xps which uses the Affymetrix PGF-file as well as the Affymetrix 
>>>> annotations, and thus should contain all probesets. xps also has a 
>>>> vignette, "APTvsXPS.pdf", which compares the RMA results obtained 
>>>> from APT and from xps for the HuGene 1.0 ST array.
>>> 
>>>> Best regards
>>>> Christian
>>>> _._._._._._._._._._._._._._._._._._
>>>> C.h.r.i.s.t.i.a.n   S.t.r.a.t.o.w.a
>>>> V.i.e.n.n.a           A.u.s.t.r.i.a
>>>> e.m.a.i.l:        cstrato at aon.at
>>>> _._._._._._._._._._._._._._._._._._
>>> 
>>> 
>>>> On 2/23/11 7:06 PM, Michal Blazejczyk wrote:
>>>>> Dear group,
>>>>> 
>>>>> I have noticed that Bioconductor's just.rma() function returns fewer transcript-level
>>>>> probesets than RMA in APT for the Human Gene 1.0 ST array.  To be specific, 819 probesets
>>>>> are missing, and most of them seem to be "real", i.e. they are annotated when I run them
>>>>> through NetAffx.
>>>>> 
>>>>> I would like to know why this is happening, and whether it is expected behaviour
>>>>> or a bug.
>>>>> 
>>>>> Best regards,
>>>>> 
>>>>> Michał Błażejczyk
>>>>> FlexArray Lead Developer
>>>>> McGill University and Genome Quebec Innovation Centre
>>>>> http://www.gqinnovationcenter.com/services/bioinformatics/flexarray/index.aspx?l=e
>>>>> 
>>>>> _______________________________________________
>>>>> Bioconductor mailing list
>>>>> Bioconductor at r-project.org
>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>> 
>