[BioC] problems about cDNA vs genomic arrays normalization
yanju
yanju at liacs.nl
Mon Nov 20 18:52:09 CET 2006
Thanks Jenny,
After reading your explanation, I still have two questions.
1. Earlier, I also applied normalizeWithinArrays() to this
dataset. Do you think that is correct or necessary in my case?
2. You said "For the statistical analysis, you use the R values
directly." But normalizeBetweenArrays() returns an MAList. It
contains M and A values, etc., but no R values (red channel
intensities). I then fitted my MAList to the linear model using:
design<-modelMatrix(targets, ref="gDNA")
fit<-lmFit(ma.paq,design)
I think all my subsequent analysis is based on the M values. Finally, I
used the eBayes function to compute summary statistics in order to detect
the most differentially expressed genes.
cont.matrix<-makeContrasts( WTvsMU=wt-mu,levels=design)
fit2<-contrasts.fit(fit,cont.matrix)
fit2<-eBayes(fit2)
So I have no idea how to use the R values directly. Was my code wrong?
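On getting R values back out of an MAList: by definition M = log2(R) - log2(G) and A = (log2(R) + log2(G))/2, so the log2 red intensity is A + M/2 and the log2 green intensity is A - M/2. The sketch below only checks that identity in base R. The vectors R and G are made-up numbers for illustration, not values from the dataset discussed here.

```r
# Hypothetical raw intensities for five spots (illustration only).
R <- c(1200, 850, 4300, 95, 15000)   # red channel
G <- c(1000, 900, 1100, 980, 1050)   # green channel

# M and A as they are defined for two-colour arrays.
M <- log2(R) - log2(G)
A <- (log2(R) + log2(G)) / 2

# Recover the individual log2 channels from M and A.
logR <- A + M / 2
logG <- A - M / 2

# Both comparisons should report TRUE.
print(all.equal(logR, log2(R)))
print(all.equal(logG, log2(G)))
```

Assuming an MAList named ma.paq as in the code above, the same arithmetic would be ma.paq$A + ma.paq$M/2, giving a matrix of log2 red intensities. Whether fitting the linear model to that matrix instead of to the M values is appropriate depends on the design.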
I was not quite sure about my code or method, because in the end I got
some uninterpretable results that did not meet the biologists'
expectations. That is why I am now rechecking my code and methods. Thank
you again, and also Wolfgang, for your kind help.
Kind regards,
Yanju
Jenny Drnevich wrote:
> Hi Yanju,
>
> I have just been working with a couple of data sets similar to yours
> where a) one channel has the same reference and b) the assumptions of
> few differences between sample and reference are not necessarily
> upheld. In these cases I have been using the Rquantile or Gquantile
> methods of normalizeBetweenArrays() in limma. These methods will do a
> quantile normalization on the R or G channel indicated so they have
> the "same empirical distribution across arrays, leaving the M-values
> (log-ratios) unchanged." Say your reference is in the green channel -
> doing a Gquantile normalization would force all the reference values
> to have the same distribution, and then adjust the R channel values
> accordingly. For the statistical analysis, you use the R values
> directly because if you use the M values, it would be like you never
> did the normalization. If the reference is not all in the same
> channel, I manipulate the RGList so that they are all in the same
> channel, but then I also include 'dye' as a batch effect in the model.
>
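The idea described above, quantile-normalizing the reference channel and shifting the other channel by the same amount, can be sketched in base R as follows. This is a minimal illustration and not limma's implementation: quantile_normalize is a hypothetical helper that assumes no missing values, and the log2 intensities are invented. It also shows why the M values cannot reflect this normalization: they are identical before and after.

```r
# Hypothetical helper: force every column (array) of x to share one
# empirical distribution. Assumes a numeric matrix with no NAs.
quantile_normalize <- function(x) {
  sorted <- apply(x, 2, sort)          # sort each array separately
  target <- rowMeans(sorted)           # mean of each quantile across arrays
  apply(x, 2, function(col) target[rank(col, ties.method = "first")])
}

# Invented log2 intensities: 4 spots (rows) x 3 arrays (columns).
logG <- cbind(c(10.1, 9.8, 10.4, 9.9),
              c(10.6, 10.2, 10.9, 10.3),
              c(9.7, 9.5, 10.0, 9.6))     # reference channel
logR <- cbind(c(8.2, 12.5, 9.9, 11.0),
              c(8.9, 12.7, 10.1, 11.8),
              c(7.9, 12.0, 9.2, 10.7))    # sample channel
M <- logR - logG

# Normalize the reference channel, then move R by the same per-spot shift.
logG_norm <- quantile_normalize(logG)
logR_norm <- logG_norm + M

# The reference now has the same distribution on every array,
# and the log-ratios are unchanged.
print(all.equal(sort(logG_norm[, 1]), sort(logG_norm[, 3])))
print(all.equal(logR_norm - logG_norm, M))
```

In this sketch all of the adjustment ends up in logR_norm (and in A), which is consistent with the suggestion to analyze the R values rather than M after a Gquantile step.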
> HTH,
> Jenny
>
> At 10:32 AM 11/20/2006, yanju wrote:
>
>> Dear all,
>>
>> I have a microarray dataset derived from a common reference design.
>> The common reference is genomic DNA. In standard normalization, we assume
>> that a large fraction of genes is not differentially expressed, and the
>> adjustment strategies then force the log-ratios to have a median (or mean)
>> of 0. But in my case, every spot would have the same observed signal in
>> the genomic channel, while the signals in the cDNA channel vary greatly.
>> Therefore, the strategies I just mentioned are not suitable. I was
>> wondering how to normalize this kind of data. Are there any packages or
>> functions that already exist for this? I look forward to your reply.
>>
>> Regards,
>> Yanju
>>
>
>
> Jenny Drnevich, Ph.D.
>
> Functional Genomics Bioinformatics Specialist
> W.M. Keck Center for Comparative and Functional Genomics
> Roy J. Carver Biotechnology Center
> University of Illinois, Urbana-Champaign
>
> 330 ERML
> 1201 W. Gregory Dr.
> Urbana, IL 61801
> USA
>
> ph: 217-244-7355
> fax: 217-265-5066
> e-mail: drnevich at uiuc.edu