[BioC] RMA vs gcRMA on 2 groups of samples

Robert Gentleman rgentlem at fhcrc.org
Fri Nov 2 19:14:18 CET 2007



Naomi Altman wrote:
> Dear Bogdan,
> Any normalization method that uses a set of arrays reduces the
> variability among those arrays.
> 
> So, if you have 2 sets of arrays and normalize them separately, you will
> find that the within-set variability is smaller than the between-set
> variability, i.e. you induce significant differential expression
> simply by the normalization.  To avoid this effect, when you are
> doing differential expression analysis (or sample clustering) you
> must either use methods that normalize each array separately (MAS) or
> normalize all arrays together.
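
The effect described in the quoted text can be illustrated with a small
simulation. This is only a sketch under stated assumptions: it uses plain
quantile normalization (the normalization step in RMA) rather than the full
RMA pipeline, and the data, the helper names quantile_normalize and
max_quantile_gap, and the intensity parameters are all made up for the
example.

```python
import numpy as np

def quantile_normalize(x):
    """Quantile-normalize the columns (arrays) of a probes-by-arrays matrix."""
    order = np.argsort(x, axis=0)                  # per-array ordering of probes
    ranks = np.argsort(order, axis=0)              # rank of each probe within its array
    reference = np.sort(x, axis=0).mean(axis=1)    # mean of the sorted columns
    return reference[ranks]                        # give every array the reference quantiles

def max_quantile_gap(m):
    """Largest difference between any array's quantiles and those of the first array."""
    s = np.sort(m, axis=0)
    return float(np.abs(s - s[:, [0]]).max())

rng = np.random.default_rng(0)
n_probes = 1000

# Hypothetical sets X and Y: they differ only in overall intensity
# distribution, not in any specific probe.
set_x = rng.normal(8.0, 1.0, size=(n_probes, 5))
set_y = rng.normal(9.0, 1.5, size=(n_probes, 5))

# Option A: normalize each set on its own.
sep_x = quantile_normalize(set_x)
sep_y = quantile_normalize(set_y)

# Option B: normalize all arrays together.
joint = quantile_normalize(np.hstack([set_x, set_y]))

within_gap_separate = max(max_quantile_gap(sep_x), max_quantile_gap(sep_y))
between_gap_separate = max_quantile_gap(np.hstack([sep_x, sep_y]))
gap_joint = max_quantile_gap(joint)

print("within sets, separate normalization: ", within_gap_separate)
print("between sets, separate normalization:", between_gap_separate)
print("all arrays, joint normalization:     ", gap_joint)
```

With separate normalization the arrays inside each set end up with identical
intensity distributions while the two sets still differ from each other;
with joint normalization every array shares one distribution.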

  An alternative (and the one that I prefer) is to do separate 
normalizations, and to then use some sort of batch effect term in the 
model used to assess differentially expressed genes.
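
As a rough illustration of what a batch effect term does, here is a minimal
sketch for a single gene, fitted by ordinary least squares with NumPy. It is
not the procedure of any particular Bioconductor package; the design, the
effect sizes and the helper name fit are all assumptions made up for the
example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical unbalanced design: two batches of 8 arrays, each batch
# normalized on its own, with treated and control arrays in both batches.
batch = np.repeat([0, 1], 8)
treated = np.array([0] * 6 + [1] * 2 + [0] * 2 + [1] * 6)

true_effect = 2.0   # assumed treatment effect for this gene (log scale)
batch_shift = 3.0   # assumed offset between the two batches

y = 5.0 + true_effect * treated + batch_shift * batch \
    + rng.normal(0.0, 0.1, size=batch.size)

def fit(columns):
    """Least-squares fit of y on an intercept plus the given columns."""
    design = np.column_stack([np.ones(y.size)] + columns)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Without a batch term the batch offset leaks into the treatment estimate.
naive_effect = fit([treated])[1]

# With a batch term the offset is absorbed by its own coefficient.
adjusted_effect = fit([treated, batch])[1]

print("treatment effect, no batch term:  ", round(float(naive_effect), 2))
print("treatment effect, with batch term:", round(float(adjusted_effect), 2))
```

Note that this only works when the factor of interest varies within each
batch. If batch and group coincide exactly, the two columns of the design
are identical and the effects cannot be separated.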

  Normalization is intended to clean up the relatively minor issues that 
arise due to slightly different conditions etc. for arrays that are 
essentially the same.  As far as I can see it is not intended to adjust 
for batch effects, and in my experience generally does a bad job of 
that.  Just because you can normalize (or fit any statistical model) 
does not mean that you should.

   best wishes
     Robert


> 
> --Naomi
> 
> At 12:01 PM 11/2/2007, Bogdan Tanasa wrote:
>> Greetings Naomi,
>>
>> thanks for the reply. To generalize my question: when dealing with 2 sets of
>> samples, let's say X1, X2, ..., Xn and Y1, Y2, ..., Yn,
>> I could run the normalization in 2 ways: A. X(1,n) and Y(1,n) separately, or
>> B. X(1,n) and Y(1,n) together. Are there any a priori statistical
>> criteria that favor one way over the other? If I take into
>> consideration biological criteria (the things I am interested in), the
>> results from A may sometimes look better than those from B, or vice versa.
>> Thanks!
>> Bogdan
>>
>>
>>
>> On 11/2/07, Naomi Altman <naomi at stat.psu.edu> wrote:
>>> Dear Bogdan,
>>> I do not have an opinion on gcRMA versus RMA.  But if you are doing
>>> differential expression analysis comparing the cell samples with the
>>> organ samples, you need to normalize
>>> all the samples together.
>>>
>>> --Naomi
>>>
>>> At 11:31 AM 11/1/2007, Bogdan Tanasa wrote:
>>>> Hi folks,
>>>>
>>>> I would like to ask for your opinions on the following:
>>>>
>>>> I have 60 expression profiles of 60 samples (cells and organs in
>>>> resting conditions).
>>>> I normalized these arrays in many ways, including RMA.
>>>>
>>>> Considering the biological arguments (cell samples vs organ
>>>> samples), I am planning to do the normalization separately, on the
>>>> group of cell samples, and on the group of organ samples.
>>>>
>>>> My questions are:
>>>>
>>>> - after RMA normalization on separate groups of samples (cells vs
>>>> organs), the results are different, but are they better? GO analysis
>>>> does not display major differences.
>>>>
>>>> - would gcRMA work better than RMA? The majority of opinions in SoCal
>>>> are pro-RMA.
>>>>
>>>> thanks,
>>>>
>>>> Bogdan
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at stat.math.ethz.ch
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>> Naomi S. Altman                                814-865-3791 (voice)
>>> Associate Professor
>>> Dept. of Statistics                              814-863-7114 (fax)
>>> Penn State University                         814-865-1348 (Statistics)
>>> University Park, PA 16802-2111
>>>
>>>
>>
> 
> 

-- 
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org


