[BioC] two color arrays normalization
Jenny Drnevich
drnevich at illinois.edu
Fri Feb 20 19:52:12 CET 2009
Hi Giusy,
It looks like you are doing everything right. How are you
"outputting" your rev.Mval object? If you are simply typing rev.Mval
at the prompt, then it's probably spitting out the entire large
matrix; my guess is that your display can only show 5 columns at a
time, and the 6th column gets printed afterward. An easy way to
check how big the object is:
> dim(rev.Mval)
or to check just the first five rows:
> rev.Mval[1:5,]
You can write the object to a file (the file name below is only an
example) and open it in Excel using:
> write.csv(rev.Mval, file="rev.Mval.csv")
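Here is a minimal sketch of those checks. It assumes a small made-up matrix in place of your real rev.Mval; the values, the number of rows, and the output file name are all hypothetical and only there so the example runs on its own:

```r
# Hypothetical stand-in for the real data: 8 probes x 6 arrays of made-up M-values
set.seed(1)
M <- matrix(rnorm(48, sd = 0.5), nrow = 8, ncol = 6,
            dimnames = list(NULL, c("Control_1b", "Control_2b", "Control_3b",
                                    "dexa_dbz_lt_1", "dexa_dbz_lt_2",
                                    "dexa_dbz_lt_3")))
rev.Mval <- M * -1            # flip the sign of every M-value

dim(rev.Mval)                 # number of rows (probes) and columns (arrays)
rev.Mval[1:5, ]               # first five rows, all six columns

# Write to a file; without 'file=' write.csv() only prints to the console
out_file <- file.path(tempdir(), "rev_Mval.csv")   # example location only
write.csv(rev.Mval, file = out_file)

# Read it back to confirm that all six columns were saved
back <- read.csv(out_file, row.names = 1)
dim(back)
```

If dim() reports six columns, the object is fine and the "single column" is only the way the console wraps wide output.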
As for manually calculating the differential expression, why in the
world would you want to do that? The proper way to analyze microarray
data is not by doing a simple t-test for each gene; there are issues
with multiple hypothesis testing, plus limma implements an empirical
Bayes correction that helps to improve your power by borrowing
information across genes (if you get penalized for testing thousands
of genes, you should at least get the benefit of testing thousands of genes).
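To make that concrete, here is a rough sketch on simulated log-ratios, not your arrays. The gene count, effect size, noise level, and object names are all made up for illustration; the design matrix is the ctrl/treat one from this thread. It runs an ordinary t-test per gene with a Benjamini-Hochberg adjustment, then, if limma is installed, the moderated test from lmFit() and eBayes():

```r
set.seed(42)
n_genes <- 1000
design <- cbind(ctrl = c(1, 1, 1, 0, 0, 0), treat = c(0, 0, 0, 1, 1, 1))

# Simulated M-values: noise everywhere, first 50 genes shifted in the treated arrays
M <- matrix(rnorm(n_genes * 6, sd = 0.3), nrow = n_genes)
M[1:50, 4:6] <- M[1:50, 4:6] + 1.5

# Ordinary one-sample t-test per gene on the three treated arrays
p_raw <- apply(M[, 4:6], 1, function(x) t.test(x)$p.value)
# Multiple-testing correction (Benjamini-Hochberg false discovery rate)
p_bh <- p.adjust(p_raw, method = "BH")
n_ttest <- sum(p_bh < 0.05)     # genes called by the plain t-test

# limma: same linear model, but gene-wise variances are moderated by
# borrowing information across all genes (empirical Bayes)
if (requireNamespace("limma", quietly = TRUE)) {
  fit <- limma::lmFit(M, design)
  fit <- limma::eBayes(fit)
  top <- limma::topTable(fit, coef = "treat", number = 10,
                         adjust.method = "BH")
  n_limma <- sum(p.adjust(fit$p.value[, "treat"], method = "BH") < 0.05)
  print(c(ttest = n_ttest, limma = n_limma))
  print(top)
}
```

With only three arrays per group the per-gene variance estimates are very noisy, which is where the moderated statistics are expected to help most.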
Cheers,
Jenny
At 04:50 PM 2/19/2009, Giusy Della Gatta wrote:
>Hi Jenny,
>
>I modified the script again, but I am still stuck
>with this analysis.
>I am interested in having as output the M values
>of all my microarrays (3 control and 3 treatment). We
>will manually calculate the differentially expressed genes
>and the most significant ones. I followed your advice: I eliminated
>the contrast matrix and at the end of my analysis
>I reversed the signs of the M values by multiplying by -1, but why do I have
>only one output column
>when I display the results of this command: rev.Mval <- MA$M * -1?
>How can I extract the reversed M values for all the
>genes and all the samples?
>
>Below the script that I used:
>
>
> > library(limma)
> > # Read in data files
> > targets=readTargets("target_frame_ltbarc_pgedit.giusy")
> > targets
> Filename Cy3 Cy5
>1 Control_1b.txt ctrl ref
>2 Control_2b.txt ctrl ref
>3 Control_3b.txt ctrl ref
>4 dexa_dbz_lt_1.txt treat ref
>5 dexa_dbz_lt_2.txt treat ref
>6 dexa_dbz_lt_3.txt treat ref
> > RG<-read.maimages(targets$FileName, source="agilent", ext="txt")
>Read Control _1b.txt
>Read Control _2b.txt
>Read Control_3b.txt
>Read dexa_dbz_lt_1.txt
>Read dexa_dbz_lt_2.txt
>Read dexa_dbz_lt_3.txt
> > # convert RG object to an MAList using the MA.RG() function
> > MA<-MA.RG(RG, bc.method="none")
> > # perform background correction
> > RG<-backgroundCorrect (RG,method="none")
> > # perform within array normalization
> > MA.n<-normalizeWithinArrays(RG, method="loess")
> > # perform between array normalization
> > MA.bn<-normalizeBetweenArrays(MA.n, method="Rquantile")
> > MA.bn
>An object of class "MAList"
>$targets
> FileName
>Control _1b Control _1b
>Control _2b Control _2b
>Control_3b Control_3b
>dexa_dbz_lt_1 dexa_dbz_lt_1
>dexa_dbz_lt_2 dexa_dbz_lt_2
>dexa_dbz_lt_3 dexa_dbz_lt_3
>
>$genes
> Row Col Start Sequence ProbeUID
>1 1 1 0 0
>2 1 2 0 0
>3 1 3 0 0
>4 1 4 0 GGCGCAGGTTAATATGGGCCCTGGACTGATGGAGGGCGCTGGGTAGGGAG 2
>5 1 5 0 ACCAAGTACAAAGATAGTTATAACCAAGTACAAAGATAGTTATA 4
> ControlType ProbeName GeneName SystematicName Description
>1 1 DarkCorner DarkCorner DarkCorner
>2 1 DarkCorner DarkCorner DarkCorner
>3 1 DarkCorner DarkCorner DarkCorner
>4 0 P0156694 XM_113729 P0156694 BARCODE
>5 0 P0029969 NM_002288 P0029969 SENSE
>243499 more rows ...
>
>$source
>[1] "agilent"
>
>$printer
>$ngrid.r
>[1] 1
>
>$ngrid.c
>[1] 1
>
>$nspot.r
>[1] 534
>
>$nspot.c
>[1] 456
>
>
>$M
> Control _1b Control _2b Control_3b dexa_dbz_lt_1 dexa_dbz_lt_2
>[1,] 0.028631881 0.01576235 0.02857114 -0.04946816 -0.03046490
>[2,] 0.002111331 0.03122844 0.03317934 -0.08001562 -0.02675807
>[3,] 0.028839984 -0.03593778 0.02981069 -0.01823344 -0.08365155
>[4,] 0.540221183 0.14265895 -0.24933370 -0.78564142 -1.07674330
>[5,] -0.044016945 0.03051181 -0.02424689 -0.08100343 -0.07191502
> dexa_dbz_lt_3
>[1,] -0.01389899
>[2,] -0.03334981
>[3,] -0.05116126
>[4,] -0.43853242
>[5,] -0.11921854
>243499 more rows ...
>
>$A
> Control _1b Control _2b Control_3b dexa_dbz_lt_1 dexa_dbz_lt_2
>[1,] 5.742711 5.712913 5.718245 5.757256 5.763494
>[2,] 5.822337 5.733033 5.681233 5.793217 5.795643
>[3,] 5.761935 5.674629 5.759187 5.797976 5.788469
>[4,] 8.808693 9.526645 9.915676 10.210170 10.306222
>[5,] 5.788424 5.765753 5.790519 5.801103 5.814293
> dexa_dbz_lt_3
>[1,] 5.758196
>[2,] 5.807207
>[3,] 5.794197
>[4,] 9.836943
>[5,] 5.828283
>243499 more rows ...
>
> > # Create design matrix
> > design <- modelMatrix(targets, ref="ref")
>Found unique target names:
> ctrl ref treat
> > design
> ctrl treat
>[1,] -1 0
>[2,] -1 0
>[3,] -1 0
>[4,] 0 -1
>[5,] 0 -1
>[6,] 0 -1
> > #revert M signs
> > rev.Mval <- MA.bn$M * -1
>
> > rev.Mval (THIS IS ONLY AN EXTRACT OF ALL THE ENTIRE COLUMN THAT
> WAS IN OUTPUT)
>
> [9088,] -5.989322e-02
> [9089,] 4.548476e-02
> [9090,] -5.121079e-01
> [9091,] 8.276698e-01
> [9092,] -7.587297e-01
> [9093,] -1.804467e-02
> [9094,] -5.659472e-02
> [9095,] -6.725387e-02
>
>
>
>Thank you,
>Giusy
>
>
>
>
>-----Original Message-----
>From: Jenny Drnevich [mailto:drnevich at illinois.edu]
>Sent: Wed 2/18/2009 1:34 PM
>To: Giusy Della Gatta; Naomi Altman; bioconductor at stat.math.ethz.ch
>Subject: RE: [BioC] two color arrays normalization
>
>Hi Giusy,
>
>M-values are the ratios for each array individually. If you want to
>output these but in reversed form, all you have to do is multiply them by -1:
>
>rev.Mval <- MA$M * -1
>
>There are no M values in the fit2 object, because instead of the
>individual array ratios, the model has calculated the "average" ratio
>value for each column, which are called coefficients. So the
>fit2$coef contains the log2(FC) values for the ctrl-ref and treat-ref
>comparisons. You should spend some time reading through the limma
>User's Guide. It explains all of this in detail, along with functions
>that can be used to look at and output your data.
>
>limmaUsersGuide()
>
>
>As always, if something is not working for you, it's best to include
>the code that is not working instead of just saying "it doesn't work".
>
>Cheers,
>Jenny
>
>At 12:06 PM 2/18/2009, Giusy Della Gatta wrote:
> >Thank you Jenny!
> >
> >Not only the controls but all the
> >arrays are going the other way round!
> >
> >With your advice the values
> >are (correctly) switched:
> >
> > > fit$coef[1:5,]
> > ctrl treat
> >[1,] -0.024321790 0.03127735
> >[2,] -0.022173037 0.04670783
> >[3,] -0.007570963 0.05101542
> >[4,] -0.144515478 0.76697238
> >[5,] 0.012584011 0.09071233
> >
> > > fit2$coef[1:5,]
> > ctrl treat
> >[1,] 0.024321790 -0.03127735
> >[2,] 0.022173037 -0.04670783
> >[3,] 0.007570963 -0.05101542
> >[4,] 0.144515478 -0.76697238
> >[5,] -0.012584011 -0.09071233
> >
> >
> >but when I print out the M values
> >for all the genes from the MAList object, the values are still not switched,
> >and if I try to recover the M values from the
> >fit2 object I can't find them.
> >Could you please help me with this as well?
> >
> >Thank you very much
> >Giusy
> >
> >
> >-----Original Message-----
> >From: Jenny Drnevich [mailto:drnevich at illinois.edu]
> >Sent: Wed 2/18/2009 11:51 AM
> >To: Giusy Della Gatta; Naomi Altman; bioconductor at stat.math.ethz.ch
> >Subject: Re: [BioC] two color arrays normalization
> >
> >Hi Giusy,
> >
> >It shouldn't matter if you put the minus signs in the design matrix
> >or the contrast matrix, they will do the same thing. Actually, the
> >contrast matrix is completely unnecessary, the columns of the design
> >matrix already specify the differences between the ref and your two
> >other groups. Now, are you having trouble getting the results
> >switched, or is it just that the results for a few genes are the
> >opposite of what you expect to happen? Let me walk you through a way
> >to check that the directions of the M values are being reversed. Your
> >targets file is:
> >
> > >Filename Cy3 Cy5
> > >Control_1b.txt ctrl ref
> > >Control_2b.txt ctrl ref
> > >Control_3b.txt ctrl ref
> > >dexa_dbz_lt_1.txt treat ref
> > >dexa_dbz_lt_2.txt treat ref
> > >dexa_dbz_lt_3.txt treat ref
> >
> >Therefore, the M values in your MA object are log2(Cy5/Cy3), which is
> >either log2(ref/ctrl) or log2(ref/treat). A positive M value means up
> >in ref compared to the ctrl (or treat), but you really want the
> >opposite, that positive values mean up in ctrl (or treat) as compared
> >to the ref. Your design matrix as created by modelMatrix is:
> >
> > ctrl treat
> >[1,] 1 0
> >[2,] 1 0
> >[3,] 1 0
> >[4,] 0 1
> >[5,] 0 1
> >[6,] 0 1
> >
> >Even though the column names say "ctrl" and "treat", they actually
> >mean "ref-ctrl" and "ref-treat"; this is because the first column
> >indicates the M values from the first three arrays in the original
> >orientation, which is log2(Cy5/Cy3), or log2(ref) - log2(ctrl). If
> >you use lmFit with this design matrix:
> >
> >fit<-lmFit(MA,design)
> >
> >The fit$coef values will be positive or negative, "up" or "down" in
> >the ref as compared to the ctrl (or treat). There are many different
> >ways to flip these, your contrast matrix with -1s is one way, but a
> >quicker way is to just multiply the original design matrix by -1:
> >
> >fit2 <- lmFit(MA, design*-1)
> >
> >Now, compare the direction of change between the two fit objects:
> >
> >fit$coef[1:5,]
> >fit2$coef[1:5,]
> >
> >The magnitude of the values shouldn't change, but the direction
> >should be switched. If you are absolutely sure that the ref was in
> >Cy5 on your arrays, then the fit2 object should contain the correct
> >orientation of "up" or "down" in the ctrl as compared to the ref.
> >However, if the positive controls are going in the opposite direction
> >of the way you expect them to be, it's not because you are setting up
> >the contrasts incorrectly. Either you have somehow switched the
> >samples on the arrays, or the probes for the positive control are
> >measuring a different part of the transcript that gives a different
> >result than you expect.
> >
> >HTH,
> >Jenny
> >
> >
> >At 09:13 AM 2/18/2009, Giusy Della Gatta wrote:
> > >Hi Naomi,
> > >I don't know if I understood correctly. I switched the signs of the
> > >design and the
> > >contrast matrices, but I still have the same results: controls going
> > >the opposite way.
> > >
> > > > library(limma)
> > > > # Read in data files
> > > > targets=readTargets("target_frame_ltbarc_pgedit.giusy")
> > > > RG<-read.maimages(targets$FileName, source="agilent", ext="txt")
> > >Read Control _1b.txt
> > >Re
> > >Read Control_3b.txt
> > >Read dexa_dbz_lt_1.txt
> > >Read dexa_dbz_lt_2.txt
> > >Read dexa_dbz_lt_3.txt
> > > > # create MA list
> > > > MA<-MA.RG(RG, bc.method="none")
> > > > # perform background correction
> > > > RG<-backgroundCorrect (RG,method="none")
> > > > # perform within array normalization
> > > > MA<-normalizeWithinArrays(RG, method="loess")
> > > > # Create design matrix
> > > > design <- modelMatrix(targets, ref="ref")
> > >Found unique target names:
> > > ctrl ref treat
> > > > design<- cbind(ctrl= c(1,1,1,0,0,0), treat= c(0,0,0,1,1,1))
> > > > design
> > > ctrl treat
> > >[1,] 1 0
> > >[2,] 1 0
> > >[3,] 1 0
> > >[4,] 0 1
> > >[5,] 0 1
> > >[6,] 0 1
> > > > fit<-lmFit(MA,design)
> > > > cont.matrix<-cbind("ctrl-ref"=c(-1,0), "treat-ref"=c(0,-1))
> > > > cont.matrix
> > > ctrl-ref treat-ref
> > >[1,] -1 0
> > >[2,] 0 -1
> > > > fit2<-contrasts.fit(fit, cont.matrix)
> > > > d1 <- ebayes(fit2)
> > >
> > >
> > >
> > >
> > >Thank you
> > >Giusy
> > >
> > >
> > >-----Original Message-----
> > >From: Naomi Altman [mailto:naomi at stat.psu.edu]
> > >Sent: Tue 2/17/2009 7:50 PM
> > >To: Giusy Della Gatta; Naomi Altman; bioconductor at stat.math.ethz.ch
> > >Subject: RE: [BioC] two color arrays normalization
> > >
> > >Hi Giusy,
> > >Move the minus signs from the first design matrix to the 2nd and I
> > >think it will work fine.
> > >
> > >--Naomi
> > >
> > >At 06:07 PM 2/17/2009, Giusy Della Gatta wrote:
> > > >Hi Naomi,
> > > >
> > > > >I performed the analysis of my microarrays, but I still can't manage
> > > > >to reverse the channels!
> > > > >My experiment consisted of infecting cells with an adenovirus: an empty one
> > > > >and an adenovirus expressing a specific protein. Then I
> > > > >treated the same cells with a specific drug or simply with the
> > > > >vehicle (DMSO).
> > > > >I have 6 microarrays: 3 controls DMSO-treated and 3 samples drug-treated.
> > > > >In each microarray the green channel represents the levels of
> > > > >infected and treated
> > > > >cells while the red channel represents non-infected cells. For all the experiments
> > > > >I have the same RED CHANNEL reference.
> > > > >I composed the target file as follows:
> > > >
> > > >Filename Cy3 Cy5
> > > >Control_1b.txt ctrl ref
> > > >Control_2b.txt ctrl ref
> > > >Control_3b.txt ctrl ref
> > > >dexa_dbz_lt_1.txt treat ref
> > > >dexa_dbz_lt_2.txt treat ref
> > > >dexa_dbz_lt_3.txt treat ref
> > > >
> > > > >and then I used the following script:
> > > >
> > > > >targets=readTargets("target_frame_ltbarc_pgedit.giusy")
> > > > >RG<-read.maimages(targets$FileName, source="agilent", ext="txt")
> > > > >MA<-MA.RG(RG, bc.method="normexp")
> > > > >MA<-normalizeWithinArrays(RG, method="loess")
> > > > >design <- modelMatrix(targets, ref="ref")
> > > > > design
> > > > ctrl treat
> > > >[1,] -1 0
> > > >[2,] -1 0
> > > >[3,] -1 0
> > > >[4,] 0 -1
> > > >[5,] 0 -1
> > > >[6,] 0 -1
> > > > >fit<-lmFit(MA,design)
> > > > >cont.matrix<-cbind("ctrl-ref"=c(1,0), "treat-ref"=c(0,1))
> > > > >cont.matrix
> > > > ctrl-ref treat-ref
> > > >[1,] 1 0
> > > >[2,] 0 1
> > > >
> > > > >fit2<-contrasts.fit(fit, cont.matrix)
> > > > >d1 <- ebayes(fit2)
> > > > >toptable(fit2,adjust="fdr")
> > > >
> > > >I don't know if I am still omitting
> > > >something, because I have the positive
> > > >controls of this experiment that
> > > >are going exactly in the opposite way!!
> > > >
> > > > >Could you help me?
> > > >
> > > >Thank you in advance!
> > > >Giusy
> > > >
> > > >
> > > >-----Original Message-----
> > > >From: Naomi Altman [mailto:naomi at stat.psu.edu]
> > > >Sent: Mon 2/9/2009 9:56 PM
> > > >To: Giusy Della Gatta; bioconductor at stat.math.ethz.ch
> > > >Subject: Re: [BioC] two color arrays normalization
> > > >
> > > > >If there is no dye-swap, then what do you mean by "swapping of the colors"?
> > > >
> > > >--Naomi
> > > >
> > > >At 07:56 PM 2/9/2009, Giusy Della Gatta wrote:
> > > >
> > > > >Hi everybody,
> > > > >
> > > > > >I am analyzing two-color Agilent microarrays
> > > > > >using the limma package.
> > > > > >In my specific case the red channel represents
> > > > > >"the reference" while the green channel is "the treatment".
> > > > > >Is it enough to use the target file to specify the names
> > > > > >of the samples
> > > > > >and their corresponding channels? Or do I have to use other specific commands
> > > > > >to specify the "swapping" of the colors?
> > > > >
> > > > >Thank you in advance!
> > > > >Regards
> > > > >Giusy
> > > > >
> > > > >_______________________________________________
> > > > >Bioconductor mailing list
> > > > >Bioconductor at stat.math.ethz.ch
> > > > >https://stat.ethz.ch/mailman/listinfo/bioconductor
> > > > >Search the archives:
> > > > >http://news.gmane.org/gmane.science.biology.informatics.conductor
> > > >
> > > >Naomi S. Altman 814-865-3791 (voice)
> > > >Associate Professor
> > > >Dept. of Statistics 814-863-7114 (fax)
> > > >Penn State University 814-865-1348 (Statistics)
> > > >University Park, PA 16802-2111
> > >
> >
> >Jenny Drnevich, Ph.D.
> >
> >Functional Genomics Bioinformatics Specialist
> >W.M. Keck Center for Comparative and Functional Genomics
> >Roy J. Carver Biotechnology Center
> >University of Illinois, Urbana-Champaign
> >
> >330 ERML
> >1201 W. Gregory Dr.
> >Urbana, IL 61801
> >USA
> >
> >ph: 217-244-7355
> >fax: 217-265-5066
> >e-mail: drnevich at illinois.edu
>