[BioC] Advice with RemoveBatchEffect and Rank Product package

Osee Sanogo sanogo at life.illinois.edu
Mon Sep 10 13:23:04 CEST 2012


Dear Gordon,

From what you said, it seems that I am oversimplifying my experiment by
attempting to analyze it with RankProd, which doesn't offer the option for
complex modeling.

Could you please explain to me how I could analyze the experiment using
Limma? 
Please let me know if you'd like me to provide further details of the
experiment.

Thank you so much.

Osee
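
One way the log-ratio analysis asked about above might be set up is to add a
constant column to the +/-1 region design, so that a probe-specific dye effect
is estimated alongside the four within-region comparisons. This is only a
sketch under stated assumptions: the M-values below are simulated, the signs
are copied from design1 in the original post, and the pairing by fish number
that Gordon raises is not modelled, so it should not be read as his
recommended analysis. The limma calls are only run if the package is
installed.

```r
set.seed(1)

# Dye orientation per array, copied from design1 in the original post
# (arrays 1-6 = T, 7-12 = D, 13-18 = C, 19-24 = BS).
region <- factor(rep(c("T", "D", "C", "BS"), each = 6),
                 levels = c("BS", "C", "D", "T"))
sign <- c(rep(c(1, -1), 3),   # T
          rep(c(-1, 1), 3),   # D
          rep(c(1, -1), 3),   # C
          rep(c(-1, 1), 3))   # BS

# One +/-1 column per brain region, plus a constant column for the dye effect.
design <- model.matrix(~ 0 + region) * sign
colnames(design) <- levels(region)
design <- cbind(Dye = 1, design)

# Simulated M-values (hypothetical stand-in for Data_RP): probes x arrays.
n_probes <- 200
M <- matrix(rnorm(n_probes * 24, sd = 0.3), n_probes, 24)
M[1, ] <- M[1, ] + 1.5                 # probe 1: dye effect only
M[2, ] <- M[2, ] + 2 * design[, "T"]   # probe 2: real effect in region T

# Per-probe least squares with base R; lm.fit expects arrays in rows.
coefs <- t(lm.fit(design, t(M))$coefficients)
print(round(coefs[1:2, ], 2))

# The same design passed to limma, which adds moderated t-statistics.
if (requireNamespace("limma", quietly = TRUE)) {
  fit <- limma::eBayes(limma::lmFit(M, design))
  print(limma::topTable(fit, coef = "T", number = 3))
}
```

In this simulation the dye-only probe loads on the Dye column rather than on
any region column, which is the point of fitting the dye effect in the model
instead of trying to remove it beforehand.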


On 9/10/12 2:04 AM, "Gordon K Smyth" <smyth at wehi.EDU.AU> wrote:

> Dear Osee,
> 
> No, you can't use removeBatchEffect to control for dye bias.
> 
> Can you ignore the dye effect?  Not in general, but who knows?
> 
> Your experiment seems too complex to be properly analysed using RankProd.
> For one thing, it seems clear that you have obtained multiple parts of the
> brain from the same biological replicates, meaning that your samples are
> paired by fish number.
> 
> I could explain how to analyse this experiment using limma.  However, if
> you are determined that you will use RankProd, it might be best to email
> the authors of that package for advice.
> 
> Best wishes
> Gordon
> 
> ---------------------------------------------
> Professor Gordon K Smyth,
> Bioinformatics Division,
> Walter and Eliza Hall Institute of Medical Research,
> 1G Royal Parade, Parkville, Vic 3052, Australia.
> http://www.statsci.org/smyth
> 
> On Sun, 9 Sep 2012, Osee Sanogo wrote:
> 
>> Dear Gordon,
>> 
>> Thank you for getting back to me about my questions.
>> 
>> My experiment is trying to identify differentially expressed genes in four
>> regions of the brain in response to a stressor. I have 6 biological
>> replicates for each of the control and experimental groups in each brain
>> region, and the comparison is being done within brain region (i.e., T
>> control vs T exp, D ctrl vs D exp, C ctrl vs C exp, BS ctrl vs BS exp). The
>> samples were run on two-color Agilent arrays.
>> 
>> You're right that the design I sent was from the separate channel analysis,
>> in which I attempted to account for array and dye effect, and then run the
>> data in RankProd. Now I know that this is not right. Ok, I will use the
>> single channel analysis in Limma.
>> 
>> I still would like to run the two-channel data (ratios) in RankProd, as in
>> my previous experience I found this useful for my data (low replicate
>> numbers).
>> 
>> My questions:
>> 1) Could I use removeBatchEffect to control for dye bias before running the
>> two-channel data in RankProd? If yes, how should I do this using the
>> removeBatchEffect function?
>> 2) I found that about 3% of my probes have a dye effect. Can I omit
>> controlling for the dye effect without compromising the results of my
>> analysis?
>> 
>> The data were loess/scale normalized into an expression set (Data_RP).
>> 
>> Here is the design of the experiment:
>> 
>>    FileName Cy3 Cy5 Fish.Number Slide Brain.Part Weight Length
>> 1    1T.gpr   1  -1           1     2          T     39   0.63
>> 2    2T.gpr  -1   1           2     1          T     39   0.63
>> 3    3T.gpr   1  -1           3     4          T     39   0.63
>> 4    4T.gpr  -1   1           4     3          T     39   0.63
>> 5    5T.gpr   1  -1           5     6          T     39   0.63
>> 6    6T.gpr  -1   1           6     5          T     NA     NA
>> 7    1D.gpr  -1   1           1     5          D     47   1.21
>> 8    2D.gpr   1  -1           2     4          D     47   1.21
>> 9    3D.gpr  -1   1           3     1          D     47   1.21
>> 10   4D.gpr   1  -1           4     6          D     47   1.21
>> 11   5D.gpr  -1   1           5     3          D     47   1.21
>> 12   6D.gpr   1  -1           6     2          D     NA     NA
>> 13   1C.gpr   1  -1           1     4          C     47   1.31
>> 14   2C.gpr  -1   1           2     3          C     47   1.31
>> 15   3C.gpr   1  -1           3     6          C     47   1.31
>> 16   4C.gpr  -1   1           4     5          C     47   1.31
>> 17   5C.gpr   1  -1           5     2          C     47   1.31
>> 18   6C.gpr  -1   1           6     1          C     NA     NA
>> 19  1BS.gpr  -1   1           1     1         BS     89   1.44
>> 20  2BS.gpr   1  -1           2     2         BS     89   1.44
>> 21  3BS.gpr  -1   1           3     3         BS     89   1.44
>> 22  4BS.gpr   1  -1           4     4         BS     NA     NA
>> 23  5BS.gpr  -1   1           5     5         BS     NA     NA
>> 24  6BS.gpr   1  -1           6     6         BS     NA     NA
>> 
>> Thank you for your help and please let me know if you need further
>> explanation of the experiment.
>> 
>> Best regards,
>> 
>> Osee
>> 
>> 
>> On 9/9/12 7:24 PM, "Gordon K Smyth" <smyth at wehi.EDU.AU> wrote:
>> 
>>> Dear Osee,
>>> 
>>> You are attempting to do a number of things that don't seem correct to me.
>>> 
>>> First, you seem to be attempting a separate channel analysis of two-color
>>> microarray data, but ignoring the pairing of the red and green channels.
>>> It isn't correct to do this.  I don't see any way to use RankProd, or any
>>> other package designed for independent samples, correctly in this context.
>>> If you must do a separate channel analysis, you would be better off using
>>> the separate channel analysis facilities of the limma package.
>>> 
>>> Second, when you set batch=rep(1,24), you are defining a batch that
>>> consists of every array in your data set.  Obviously it doesn't make sense
>>> to remove batch effects unless there are at least two batches.
>>> 
>>> Third, I don't follow your design matrix.
>>> 
>>> It would be better to go back to the start, and describe in more basic
>>> terms what is the nature of your data and what comparison you are trying
>>> to make.
>>> 
>>> Best wishes
>>> Gordon
>>> 
>>>> Date: Sat, 8 Sep 2012 11:40:45 +0000
>>>> From: "Sanogo, Yibayiri O" <sanogo at illinois.edu>
>>>> To: "bioconductor at r-project.org" <bioconductor at r-project.org>
>>>> Subject: [BioC] Advice with RemoveBatchEffect and Rank Product package
>>>> 
>>>> Dear Members of the list,
>>>> 
>>>> (I apologize for posting this again. I sent it earlier to the list but
>>>> from another account and was listed as a non-member, although I have
>>>> been a member since 2008 :-)).
>>>> 
>>>> I have been using RankProd to analyze Agilent two-color data. However, I
>>>> would like to remove the dye effect prior to analysis. I read on the forum
>>>> that removeBatchEffect should be used with the limma linear model, a type
>>>> of modeling that is not available in RankProd.
>>>> 
>>>> I have two questions:
>>>> 
>>>> 1) Would it be appropriate to use removeBatchEffect to correct for dye
>>>> effect prior to running the expression data through RankProd?
>>>> 
>>>> 2) a) If no, what other function could I use to do this?
>>>>    b) If yes, I would like help with the correct design and how to
>>>> properly indicate the batch.
>>>> 
>>>> Here is my design indicating the two dyes (cy3=-1, cy5=1; T, D, C and BS
>>>> are different areas of the brain):
>>>> 
>>>> design1
>>>>    BS  C  D  T
>>>> 1   0  0  0  1
>>>> 2   0  0  0 -1
>>>> 3   0  0  0  1
>>>> 4   0  0  0 -1
>>>> 5   0  0  0  1
>>>> 6   0  0  0 -1
>>>> 7   0  0 -1  0
>>>> 8   0  0  1  0
>>>> 9   0  0 -1  0
>>>> 10  0  0  1  0
>>>> 11  0  0 -1  0
>>>> 12  0  0  1  0
>>>> 13  0  1  0  0
>>>> 14  0 -1  0  0
>>>> 15  0  1  0  0
>>>> 16  0 -1  0  0
>>>> 17  0  1  0  0
>>>> 18  0 -1  0  0
>>>> 19 -1  0  0  0
>>>> 20  1  0  0  0
>>>> 21 -1  0  0  0
>>>> 22  1  0  0  0
>>>> 23 -1  0  0  0
>>>> 24  1  0  0  0
>>>> 
>>>> attr(,"assign")
>>>> [1] 1 1 1 1
>>>> 
>>>> I've tried this (Data_RP are my data, the M values of the expression set):
>>>> 
>>>> DYE_RP<-removeBatchEffect(Data_RP, batch=rep(1,24), batch2=NULL,
>>>> design=design1)
>>>> 
>>>> but it is returning an error message
>>>> " Error in contr.sum(levels(batch)) :
>>>>  not enough degrees of freedom to define contrasts"
>>>> 
>>>> Please help me correct this code.
>>>> 
>>>> Thank you so much for your help.
>>>> 
>>>> Osee
>>>> 
>>>> -- -- --
>>>> Y. Osee Sanogo
>>>> Integrative Biology
>>>> Institute for Genomic Biology
>>>> University of Illinois at Urbana
>>>> 505 S. Goodwin Ave
>>>> Urbana, IL-61801
>>>> 
>>>> Tel: 217-333 2308 (Office)
>>>>     217-417 9593 (Cell)
> 
> ______________________________________________________________________
> The information in this email is confidential and inte...{{dropped:7}}
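
The error quoted in the original post can be reproduced with base R alone,
since the message itself names contr.sum(levels(batch)) as the failing call.
The snippet below is a minimal sketch of Gordon's second point: a batch factor
with one level leaves nothing to contrast, while two levels do. The object
names are hypothetical and no array data are needed.

```r
# A single batch, as in batch = rep(1, 24): the factor has only one level.
batch_one <- factor(rep(1, 24))
msg <- tryCatch(
  contr.sum(levels(batch_one)),
  error = function(e) conditionMessage(e)
)
print(msg)

# Two batches give a valid sum-to-zero contrast (a single +1 / -1 column).
batch_two <- factor(rep(1:2, times = 12))
print(contr.sum(levels(batch_two)))
```

So the message is not about the design matrix as such; it says that the batch
argument describes only one batch, and there is no batch effect to remove
unless at least two batches are defined.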


