[BioC] two color arrays normalization

Jenny Drnevich drnevich at illinois.edu
Fri Feb 20 19:52:12 CET 2009


Hi Giusy,

It looks like you are doing everything right. How are you 
"outputting" your rev.Mval object? If you are simply typing rev.Mval 
at the prompt, then it's probably spitting out the entire large 
matrix; my guess is that your display can only show 5 columns at a 
time, and the 6th column gets printed afterward. An easy way to 
check how big the object is:

 > dim(rev.Mval)

or to look at just the first few rows:

 > rev.Mval[1:5,]

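You can write the object to a file and open it in Excel using (the 
file name here is just an example; without a file argument, write.csv 
prints to the console):

 > write.csv(rev.Mval, file="rev_Mval.csv")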
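
As a rough illustration of the checks above, here is a minimal sketch in base R. It uses a small made-up matrix as a stand-in for rev.Mval (the numbers are random, not the values from this thread), and the temporary output file is only a placeholder path.

```r
# Stand-in for rev.Mval: 8 "genes" x 6 "arrays" of made-up numbers
# (hypothetical data, not the values from this thread).
set.seed(1)
rev.Mval <- matrix(rnorm(48), nrow = 8, ncol = 6,
                   dimnames = list(NULL,
                                   c("Control_1b", "Control_2b", "Control_3b",
                                     "dexa_dbz_lt_1", "dexa_dbz_lt_2",
                                     "dexa_dbz_lt_3")))

# Number of rows and columns; all six arrays should be counted here
dims <- dim(rev.Mval)
print(dims)

# First few rows only; a narrow console may still wrap the last column
# into a separate block, but it is all part of the same matrix
print(head(rev.Mval, n = 5))

# Write the matrix to a CSV file (temporary file used as a placeholder path)
out.file <- tempfile(fileext = ".csv")
write.csv(rev.Mval, file = out.file)

# Read the file back to confirm that every column was written
check <- read.csv(out.file, row.names = 1)
print(dim(check))
```

If dim() reports six columns, the single column seen at the prompt is only a display effect of the console width, not a missing part of the object.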

As for manually calculating the differential expression, why in the 
world would you want to do that? The proper way to analyze microarray 
data is not by doing a simple t-test for each gene; there are issues 
with multiple hypothesis testing, plus limma implements an empirical 
Bayes correction that helps to improve your power by borrowing 
information across genes (if you get penalized for testing thousands 
of genes, you should at least get the benefit of testing thousands of genes).
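
To make the multiple-testing point concrete, here is a small sketch in base R on simulated data. Everything in it is an assumption for illustration: the gene count, the random values, and the use of a plain t-test with Benjamini-Hochberg adjustment. It does not reproduce limma's empirical Bayes moderation, which comes from fitting with lmFit and then applying eBayes; it only shows why uncorrected per-gene t-tests are a problem.

```r
# Simulated data only: 2000 "genes", 3 control and 3 treatment arrays,
# generated with NO true differential expression at all.
set.seed(42)
n.genes <- 2000
M <- matrix(rnorm(n.genes * 6), nrow = n.genes)
group <- factor(rep(c("ctrl", "treat"), each = 3))

# Ordinary two-sample t-test, one gene at a time
p.raw <- apply(M, 1, function(x) {
  t.test(x[group == "treat"], x[group == "ctrl"], var.equal = TRUE)$p.value
})

# Genes called "significant" at p < 0.05 without any correction;
# since nothing is truly changed, every one of these is a false positive
n.raw <- sum(p.raw < 0.05)

# Benjamini-Hochberg adjustment, which controls the false discovery rate
p.bh <- p.adjust(p.raw, method = "BH")
n.bh <- sum(p.bh < 0.05)

cat("raw p < 0.05:", n.raw, "  BH-adjusted p < 0.05:", n.bh, "\n")
```

With only three arrays per group, each gene's own variance estimate is very noisy, which is where borrowing information across genes helps.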

Cheers,
Jenny

At 04:50 PM 2/19/2009, Giusy Della Gatta wrote:
>Hi Jenny,
>
>I modified the script again, but I am still stuck
>with this analysis.
>I am interested in getting as output the M values
>of all my microarrays (3 control and 3 treatment). We
>will manually calculate the differentially expressed genes
>and the most significant ones. I followed your advice: I eliminated
>the contrast matrix, and at the end of my analysis
>I flipped the signs of the M values by multiplying by -1. But why do I get
>only one output column
>when I display the result of this command: rev.Mval <- MA$M * -1?
>How can I extract the reversed M values for all the
>genes and all the samples?
>
>Below is the script that I used:
>
>
> > library(limma)
> > # Read in data files
> > targets=readTargets("target_frame_ltbarc_pgedit.giusy")
> > targets
>            Filename   Cy3 Cy5
>1    Control_1b.txt  ctrl ref
>2    Control_2b.txt  ctrl ref
>3    Control_3b.txt  ctrl ref
>4 dexa_dbz_lt_1.txt treat ref
>5 dexa_dbz_lt_2.txt treat ref
>6 dexa_dbz_lt_3.txt treat ref
> > RG<-read.maimages(targets$FileName, source="agilent", ext="txt")
>Read Control _1b.txt
>Read Control _2b.txt
>Read Control_3b.txt
>Read dexa_dbz_lt_1.txt
>Read dexa_dbz_lt_2.txt
>Read dexa_dbz_lt_3.txt
> > # convert the RG object to an MAList using the MA.RG() function
> > MA<-MA.RG(RG, bc.method="none")
> > # perform background correction
> > RG<-backgroundCorrect (RG,method="none")
> > # perform within array normalization
> > MA.n<-normalizeWithinArrays(RG, method="loess")
> > # perform between array normalization
> > MA.bn<-normalizeBetweenArrays(MA.n, method="Rquantile")
> > MA.bn
>An object of class "MAList"
>$targets
>                    FileName
>Control _1b     Control _1b
>Control _2b     Control _2b
>Control_3b       Control_3b
>dexa_dbz_lt_1 dexa_dbz_lt_1
>dexa_dbz_lt_2 dexa_dbz_lt_2
>dexa_dbz_lt_3 dexa_dbz_lt_3
>
>$genes
>   Row Col Start                                           Sequence ProbeUID
>1   1   1     0                                                           0
>2   1   2     0                                                           0
>3   1   3     0                                                           0
>4   1   4     0 GGCGCAGGTTAATATGGGCCCTGGACTGATGGAGGGCGCTGGGTAGGGAG        2
>5   1   5     0       ACCAAGTACAAAGATAGTTATAACCAAGTACAAAGATAGTTATA        4
>   ControlType  ProbeName   GeneName SystematicName Description
>1           1 DarkCorner DarkCorner     DarkCorner
>2           1 DarkCorner DarkCorner     DarkCorner
>3           1 DarkCorner DarkCorner     DarkCorner
>4           0   P0156694  XM_113729       P0156694     BARCODE
>5           0   P0029969  NM_002288       P0029969       SENSE
>243499 more rows ...
>
>$source
>[1] "agilent"
>
>$printer
>$ngrid.r
>[1] 1
>
>$ngrid.c
>[1] 1
>
>$nspot.r
>[1] 534
>
>$nspot.c
>[1] 456
>
>
>$M
>       Control _1b Control _2b  Control_3b dexa_dbz_lt_1 dexa_dbz_lt_2
>[1,]  0.028631881  0.01576235  0.02857114   -0.04946816   -0.03046490
>[2,]  0.002111331  0.03122844  0.03317934   -0.08001562   -0.02675807
>[3,]  0.028839984 -0.03593778  0.02981069   -0.01823344   -0.08365155
>[4,]  0.540221183  0.14265895 -0.24933370   -0.78564142   -1.07674330
>[5,] -0.044016945  0.03051181 -0.02424689   -0.08100343   -0.07191502
>      dexa_dbz_lt_3
>[1,]   -0.01389899
>[2,]   -0.03334981
>[3,]   -0.05116126
>[4,]   -0.43853242
>[5,]   -0.11921854
>243499 more rows ...
>
>$A
>      Control _1b Control _2b Control_3b dexa_dbz_lt_1 dexa_dbz_lt_2
>[1,]    5.742711    5.712913   5.718245      5.757256      5.763494
>[2,]    5.822337    5.733033   5.681233      5.793217      5.795643
>[3,]    5.761935    5.674629   5.759187      5.797976      5.788469
>[4,]    8.808693    9.526645   9.915676     10.210170     10.306222
>[5,]    5.788424    5.765753   5.790519      5.801103      5.814293
>      dexa_dbz_lt_3
>[1,]      5.758196
>[2,]      5.807207
>[3,]      5.794197
>[4,]      9.836943
>[5,]      5.828283
>243499 more rows ...
>
> > # Create design matrix
> > design <- modelMatrix(targets, ref="ref")
>Found unique target names:
>  ctrl ref treat
> > design
>       ctrl treat
>[1,]   -1     0
>[2,]   -1     0
>[3,]   -1     0
>[4,]    0    -1
>[5,]    0    -1
>[6,]    0    -1
> > #revert M signs
> > rev.Mval <- MA.bn$M * -1
>
> > rev.Mval (THIS IS ONLY AN EXTRACT OF THE ENTIRE COLUMN THAT WAS OUTPUT)
>
>   [9088,] -5.989322e-02
>   [9089,]  4.548476e-02
>   [9090,] -5.121079e-01
>   [9091,]  8.276698e-01
>   [9092,] -7.587297e-01
>   [9093,] -1.804467e-02
>   [9094,] -5.659472e-02
>   [9095,] -6.725387e-02
>
>
>
>Thank you,
>Giusy
>
>
>
>
>-----Original Message-----
>From: Jenny Drnevich [mailto:drnevich at illinois.edu]
>Sent: Wed 2/18/2009 1:34 PM
>To: Giusy Della Gatta; Naomi Altman; bioconductor at stat.math.ethz.ch
>Subject: RE: [BioC] two color arrays normalization
>
>Hi Giusy,
>
>M-values are the log2 ratios for each array individually. If you want to
>output these but in reversed form, all you have to do is multiply them by -1:
>
>rev.Mval <- MA$M * -1
>
>There are no M values in the fit2 object because, instead of the
>individual array ratios, the model has calculated the "average" log-ratio
>for each column of the design matrix; these are called coefficients. So
>fit2$coef contains the log2(FC) values for the ctrl-ref and treat-ref
>comparisons. You should spend some time reading through the limma
>User's Guide. It explains all of this in detail, along with functions
>that can be used to look at and output your data.
>
>limmaUsersGuide()
>
>
>As always, if something is not working for you, it's best to include
>the code that is not working instead of just saying "it doesn't work".
>
>Cheers,
>Jenny
>
>At 12:06 PM 2/18/2009, Giusy Della Gatta wrote:
> >Thank you Jenny!
> >
> >Not only the controls but all the
> >arrays are going the other way round!
> >
> >With your advice the values
> >are (correctly) switched:
> >
> > > fit$coef[1:5,]
> >              ctrl      treat
> >[1,] -0.024321790 0.03127735
> >[2,] -0.022173037 0.04670783
> >[3,] -0.007570963 0.05101542
> >[4,] -0.144515478 0.76697238
> >[5,]  0.012584011 0.09071233
> >
> > > fit2$coef[1:5,]
> >              ctrl       treat
> >[1,]  0.024321790 -0.03127735
> >[2,]  0.022173037 -0.04670783
> >[3,]  0.007570963 -0.05101542
> >[4,]  0.144515478 -0.76697238
> >[5,] -0.012584011 -0.09071233
> >
> >
> >but when I print out the M values
> >for all the genes from the MAList object, the values are still not switched,
> >and when I try to recover the M values from the
> >fit2 object I can't find them.
> >Could you please help me with this as well?
> >
> >Thank you very much
> >Giusy
> >
> >
> >-----Original Message-----
> >From: Jenny Drnevich [mailto:drnevich at illinois.edu]
> >Sent: Wed 2/18/2009 11:51 AM
> >To: Giusy Della Gatta; Naomi Altman; bioconductor at stat.math.ethz.ch
> >Subject: Re: [BioC] two color arrays normalization
> >
> >Hi Giusy,
> >
> >It shouldn't matter if you put the minus signs in the design matrix
> >or the contrast matrix, they will do the same thing. Actually, the
> >contrast matrix is completely unnecessary, the columns of the design
> >matrix already specify the differences between the ref and your two
> >other groups. Now, are you having trouble getting the results
> >switched, or is it just that the results for a few genes are the
> >opposite of what you expect to happen? Let me walk you through a way
> >to check that the directions of the M values are being reversed. Your
> >targets file is:
> >
> >  >Filename           Cy3    Cy5
> >  >Control_1b.txt     ctrl   ref
> >  >Control_2b.txt     ctrl   ref
> >  >Control_3b.txt     ctrl   ref
> >  >dexa_dbz_lt_1.txt  treat  ref
> >  >dexa_dbz_lt_2.txt  treat  ref
> >  >dexa_dbz_lt_3.txt  treat  ref
> >
> >Therefore, the M values in your MA object are log2(Cy5/Cy3), which is
> >either log2(ref/ctrl) or log2(ref/treat). A positive M value means up
> >in ref compared to the ctrl (or treat), but you really want the
> >opposite, that positive values mean up in ctrl (or treat) as compared
> >to the ref. Your design matrix, as you built it with cbind, is:
> >
> >      ctrl treat
> >[1,]    1     0
> >[2,]    1     0
> >[3,]    1     0
> >[4,]    0     1
> >[5,]    0     1
> >[6,]    0     1
> >
> >Even though the column names say "ctrl" and "treat", they actually
> >mean "ref-ctrl" and "ref-treat"; this is because the first column
> >indicates the M values from the first three arrays in the original
> >orientation, which is log2(Cy5/Cy3), or log2(ref) - log2(ctrl). If
> >you use lmFit with this design matrix:
> >
> >fit<-lmFit(MA,design)
> >
> >The fit$coef values will be positive or negative, "up" or "down" in
> >the ref as compared to the ctrl (or treat). There are many different
> >ways to flip these, your contrast matrix with -1s is one way, but a
> >quicker way is to just multiply the original design matrix by -1:
> >
> >fit2 <- lmFit(MA, design*-1)
> >
> >Now, compare the direction of change between the two fit objects:
> >
> >fit$coef[1:5,]
> >fit2$coef[1:5,]
> >
> >The magnitude of the values shouldn't change, but the direction
> >should be switched. If you are absolutely sure that the ref was in
> >Cy5 on your arrays, then the fit2 object should contain the correct
> >orientation of "up" or "down" in the ctrl as compared to the ref.
> >However, if the positive controls are going in the opposite direction
> >of the way you expect them to be, it's not because you are setting up
> >the contrasts incorrectly. Either you have somehow switched the
> >samples on the arrays, or the probes for the positive control are
> >measuring a different part of the transcript that gives a different
> >result than you expect.
> >
> >HTH,
> >Jenny
> >
> >
> >At 09:13 AM 2/18/2009, Giusy Della Gatta wrote:
> > >Hi Naomi,
> > >I don't know if I understood correctly. I switched the signs of the
> > >design and the
> > >contrast matrices, but I still get the same result: the controls go
> > >the opposite way.
> > >
> > > > library(limma)
> > > > # Read in data files
> > > > targets=readTargets("target_frame_ltbarc_pgedit.giusy")
> > > > RG<-read.maimages(targets$FileName, source="agilent", ext="txt")
> > >Read Control _1b.txt
> > >Re
> > >Read Control_3b.txt
> > >Read dexa_dbz_lt_1.txt
> > >Read dexa_dbz_lt_2.txt
> > >Read dexa_dbz_lt_3.txt
> > > > # create MA list
> > > > MA<-MA.RG(RG, bc.method="none")
> > > > # perform background correction
> > > > RG<-backgroundCorrect (RG,method="none")
> > > > # perform within array normalization
> > > > MA<-normalizeWithinArrays(RG, method="loess")
> > > > # Create design matrix
> > > > design <- modelMatrix(targets, ref="ref")
> > >Found unique target names:
> > >  ctrl ref treat
> > > > design<- cbind(ctrl= c(1,1,1,0,0,0), treat= c(0,0,0,1,1,1))
> > > > design
> > >      ctrl treat
> > >[1,]    1     0
> > >[2,]    1     0
> > >[3,]    1     0
> > >[4,]    0     1
> > >[5,]    0     1
> > >[6,]    0     1
> > > > fit<-lmFit(MA,design)
> > > > cont.matrix<-cbind("ctrl-ref"=c(-1,0), "treat-ref"=c(0,-1))
> > > > cont.matrix
> > >      ctrl-ref treat-ref
> > >[1,]       -1         0
> > >[2,]        0        -1
> > > > fit2<-contrasts.fit(fit, cont.matrix)
> > > > d1 <- ebayes(fit2)
> > >
> > >
> > >
> > >
> > >Thank you
> > >Giusy
> > >
> > >
> > >-----Original Message-----
> > >From: Naomi Altman [mailto:naomi at stat.psu.edu]
> > >Sent: Tue 2/17/2009 7:50 PM
> > >To: Giusy Della Gatta; Naomi Altman; bioconductor at stat.math.ethz.ch
> > >Subject: RE: [BioC] two color arrays normalization
> > >
> > >Hi Giusy,
> > >Move the minus signs from the first design matrix to the 2nd and I
> > >think it will work fine.
> > >
> > >--Naomi
> > >
> > >At 06:07 PM 2/17/2009, Giusy Della Gatta wrote:
> > > >Hi Naomi,
> > > >
> > > >I performed the analysis of my microarrays, but I still can't manage
> > > >to reverse the channels!
> > > >My experiment consisted of infecting cells with an adenovirus: an empty one
> > > >and an adenovirus expressing a specific protein. Then I
> > > >treated the same cells with a specific drug or simply with the
> > > >vehicle (DMSO).
> > > >I have 6 microarrays: 3 DMSO-treated controls and 3 drug-treated samples.
> > > >In each microarray the green channel measures the expression levels of
> > > >infected and treated
> > > >cells, while the red channel measures uninfected cells. For all the
> > > >experiments
> > > >I have the same RED CHANNEL reference.
> > > >I composed the target file as follows:
> > > >
> > > >Filename           Cy3    Cy5
> > > >Control_1b.txt     ctrl   ref
> > > >Control_2b.txt     ctrl   ref
> > > >Control_3b.txt     ctrl   ref
> > > >dexa_dbz_lt_1.txt  treat  ref
> > > >dexa_dbz_lt_2.txt  treat  ref
> > > >dexa_dbz_lt_3.txt  treat  ref
> > > >
> > > >and then I used the following script:
> > > >
> > > > >targets=readTargets("target_frame_ltbarc_pgedit.giusy")
> > > > >RG<-read.maimages(targets$FileName, source="agilent", ext="txt")
> > > > >MA<-MA.RG(RG, bc.method="normexp")
> > > > >MA<-normalizeWithinArrays(RG, method="loess")
> > > > >design <- modelMatrix(targets, ref="ref")
> > > > > design
> > > >  ctrl treat
> > > >[1,]   -1     0
> > > >[2,]   -1     0
> > > >[3,]   -1     0
> > > >[4,]    0    -1
> > > >[5,]    0    -1
> > > >[6,]    0    -1
> > > > >fit<-lmFit(MA,design)
> > > > >cont.matrix<-cbind("ctrl-ref"=c(1,0), "treat-ref"=c(0,1))
> > > > >cont.matrix
> > > >      ctrl-ref treat-ref
> > > >[1,]        1         0
> > > >[2,]        0         1
> > > >
> > > > >fit2<-contrasts.fit(fit, cont.matrix)
> > > > >d1 <- ebayes(fit2)
> > > > >toptable(fit2,adjust="fdr")
> > > >
> > > >I don't know if I am still omitting
> > > >something, because the positive
> > > >controls of this experiment
> > > >are going in exactly the opposite direction!!
> > > >
> > > >Can you help me?
> > > >
> > > >Thank you in advance!
> > > >Giusy
> > > >
> > > >
> > > >-----Original Message-----
> > > >From: Naomi Altman [mailto:naomi at stat.psu.edu]
> > > >Sent: Mon 2/9/2009 9:56 PM
> > > >To: Giusy Della Gatta; bioconductor at stat.math.ethz.ch
> > > >Subject: Re: [BioC] two color arrays normalization
> > > >
> > > >If there is no dye-swap, then what do you mean by "swapping of the colors"?
> > > >
> > > >--Naomi
> > > >
> > > >At 07:56 PM 2/9/2009, Giusy Della Gatta wrote:
> > > >
> > > > >Hi everybody,
> > > > >
> > > > >I am analyzing two-color Agilent microarrays
> > > > >using the limma package.
> > > > >In my specific case the red channel represents
> > > > >"the reference" while the green channel is "the treatment".
> > > > >Is it enough to use the targets file to specify the names
> > > > >of the samples
> > > > >and their corresponding channels? Or do I have to use other
> > > > >specific commands
> > > > >to specify the "swapping" of the colors?
> > > > >
> > > > >Thank you in advance!
> > > > >Regards
> > > > >Giusy
> > > > >
> > > > >_______________________________________________
> > > > >Bioconductor mailing list
> > > > >Bioconductor at stat.math.ethz.ch
> > > > >https://stat.ethz.ch/mailman/listinfo/bioconductor
> > > > >Search the archives:
> > > > >http://news.gmane.org/gmane.science.biology.informatics.conductor
> > > >
> > > >Naomi S. Altman                                814-865-3791 (voice)
> > > >Associate Professor
> > > >Dept. of Statistics                              814-863-7114 (fax)
> > > >Penn State University                         814-865-1348 (Statistics)
> > > >University Park, PA 16802-2111
> > >
> >
> >Jenny Drnevich, Ph.D.
> >
> >Functional Genomics Bioinformatics Specialist
> >W.M. Keck Center for Comparative and Functional Genomics
> >Roy J. Carver Biotechnology Center
> >University of Illinois, Urbana-Champaign
> >
> >330 ERML
> >1201 W. Gregory Dr.
> >Urbana, IL 61801
> >USA
> >
> >ph: 217-244-7355
> >fax: 217-265-5066
> >e-mail: drnevich at illinois.edu
>


