[BioC] median normalization
James W. MacDonald
jmacdon at med.umich.edu
Wed Aug 31 16:45:28 CEST 2011
Hi Viritha,
On 8/31/2011 9:36 AM, viritha kaza wrote:
> Hi James,
> Thanks for your quick reply.
>
> So my last step should be
> exprs<-eset-median+median1.
> Is it?
If your data are not log transformed, yes. This of course assumes that
this is what the original authors meant by 'median normalization'.
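
For reference, a minimal sketch of both variants on a toy matrix. The matrix and the names col_med and grand_med are made up for illustration and are not from the original code. One thing to watch: with a probes-by-arrays matrix, an expression like eset - median recycles the vector down the rows instead of matching one value to each column, so sweep() is used here to apply one median per array. Also note that subtracting medians on the log2 scale corresponds to dividing by them on the raw scale, since log2(x/m) = log2(x) - log2(m).

```r
# Toy stand-in for exprs(eset.mas5): rows are probe sets, columns are arrays.
set.seed(1)
eset <- matrix(rexp(20, rate = 1/250), nrow = 5,
               dimnames = list(paste0("probe", 1:5), paste0("array", 1:4)))

col_med   <- apply(eset, 2, median)  # one median per array (hypothetical name)
grand_med <- median(col_med)         # common target value (hypothetical name)

# Additive centering: shift each column so its median equals grand_med.
centered <- sweep(eset, 2, col_med, "-") + grand_med

# Multiplicative scaling: rescale each column so its median equals grand_med.
scaled <- sweep(eset, 2, col_med, "/") * grand_med
```

With either variant, apply(centered, 2, median) or apply(scaled, 2, median) should return grand_med for every array, which is a quick check that the operation was applied per column.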
>
> In the paper they say
> "RNA fluorescent labeling reaction and hybridization were performed
> using the Affymetrix Gene Chips HG-U133A and HG-U133B according to the
> manufacturer’s instructions (http://www.affymetrix.com/). The arrays
> consist of 22,283 (HG-U133A) and 22,645 (HG-U133B) probe sets, which
> together amount to 23,583 unique genes based on Unigene build 173.
> Microarray analysis was performed using Affymetrix Microarray Suite 5.0
> and in-house Visual Basic software MATRIX 1.26. Array data were median
> normalized, and replicate genes were combined by averaging. Samples (or
> averages of samples) were then compared against each other by
> calculating log ratios for each gene, and statistical significance was
> presented as a p value calculated by Student’s t test. The microarray
> data have been uploaded to GEO (Gene Expression Omnibus), and the
> accession number is GSE-4824."
>
> In GSE4824 in GEO:
> "Data processing: Data were analyzed with Microarray Suite version 5.0
> (MAS 5.0) using Affymetrix default analysis settings and global scaling
> as normalization method. The trimmed mean target intensity of each array
> was arbitrarily set to 250".
> Could you please suggest whether my steps are correct?
> There are 3 platforms: GPL570, which contains 6 arrays that are
> reference profiles, and the other 2, HG-U133A and HG-U133B, which
> contain 79 samples.
> Steps as suggested by the paper:
> 1) Combine the MAS5 intensities of both platforms for the 79 samples
> from the series matrix file (unlogged).
> * What about the probes common to HG-U133A and HG-U133B (168)?
> 2) Add annotation with Unigene and then combine with the reference
> profile, which is HG-U133 Plus 2.
> * Do I ignore the probes that are unique to HG-U133 Plus 2 (9921) and
> the probes of HG-U133A and HG-U133B (6)?
> 3) Then perform median normalization or median centering.
> 4) Then average the replicate genes.
> 5) Log ratios for each gene (fold change).
> 6) Then perform a Student's t-test.
> * During which step do I convert the expression to log2?
I would assume somewhere before step 5. But this is your project, so you
have to make that decision for yourself.
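
A rough sketch of steps 4 to 6 with the log2 transform placed just before step 5. The data, the probe-to-gene map, and the group labels are all hypothetical, and taking log2 after averaging is only one possible placement, not necessarily what the paper did. Note that t.test() defaults to the Welch test, so var.equal = TRUE is set to get the classical Student's t-test.

```r
set.seed(2)
# Hypothetical unlogged, normalized intensities: 6 probe sets x 6 samples.
x <- matrix(rexp(36, rate = 1/250) + 10, nrow = 6,
            dimnames = list(paste0("probe", 1:6), paste0("s", 1:6)))
gene  <- c("g1", "g1", "g2", "g3", "g3", "g3")  # hypothetical probe-to-gene map
group <- factor(rep(c("A", "B"), each = 3))     # hypothetical sample groups

# Step 4: average replicate probe sets per gene (sum per gene / count per gene).
counts  <- table(gene)
by_gene <- rowsum(x, gene)
by_gene <- by_gene / as.vector(counts[rownames(by_gene)])

# Convert to log2 before computing log ratios.
lg <- log2(by_gene)

# Step 5: log ratio per gene = difference of group means on the log2 scale.
in_a <- group == "A"
log_ratio <- rowMeans(lg[, !in_a]) - rowMeans(lg[, in_a])

# Step 6: Student's t-test per gene (equal variances assumed).
p_value <- apply(lg, 1, function(v)
  t.test(v[!in_a], v[in_a], var.equal = TRUE)$p.value)
```
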
Best,
Jim
> Waiting for your suggestions.
> Thanks,
> Viritha
> On Mon, Aug 29, 2011 at 1:27 PM, James W. MacDonald
> <jmacdon at med.umich.edu> wrote:
>
> Hi Viritha,
>
>
> On 8/29/2011 12:10 PM, viritha kaza wrote:
>
> Hi group,
> I am trying to replicate a dataset, GSE4824, from a paper.
> There are actually 3 platforms in it, but right now I am concentrating
> only on one platform, GPL570. This contains 6 arrays.
> I have written code to run Microarray Suite version 5.0 (MAS 5.0)
> using Affymetrix default analysis settings and global scaling as the
> normalization method, with the trimmed mean target intensity of each
> array arbitrarily set to 250, followed by median normalization.
>
> source("http://bioconductor.org/biocLite.R")
> biocLite("affy")
> library(affy)
> mydata<- ReadAffy()
> eset.mas5 = mas5(mydata,sc=250,normalize=TRUE)
> write.exprs(eset.mas5,"GSE4824_GPL570.txt",sep='\t')
> eset=exprs(eset.mas5)
> median = apply(eset, 2, median)
> median1=median(median)
> exprs<-eset/median*median1
>
> The output from mas5() isn't log transformed, so you should be
> subtracting and adding, not dividing and multiplying.
>
> This assumes that by 'median normalization' the original authors
> simply meant median centering.
>
> Best,
>
> Jim
>
>
> write.table(exprs,"GSE4824_GPL570_Median.txt",sep='\t')
>
>
> Please let me know if my code performs the above task correctly,
> especially whether the last few steps perform median normalization
> correctly. Also let me know if this is the right way to do median
> normalization.
> Thanks,
> Viritha
>
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> Douglas Lab
> University of Michigan
> Department of Human Genetics
> 5912 Buhl
> 1241 E. Catherine St.
> Ann Arbor MI 48109-5618
> 734-615-7826
> **********************************************************
> Electronic Mail is not secure, may not be read every day, and should
> not be used for urgent or sensitive issues
>
>
--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues