[BioC] difference between rmaPLM and default fitPLM?

Thu Aug 14 03:32:08 CEST 2008

In answer to your questions:

1. They are not intended to return the exact same numerical values.
There is probably details in one of the affyPLM that explains what
fitPLM does in general for fitting models. In particular, see appendix A
in:

http://www.bioconductor.org/packages/2.2/bioc/vignettes/affyPLM/inst/doc/AffyExtensions.pdf

more specifically it is fitting the model

y_ij = probe_i + chip_j + e_ij

for each probeset. This is the same model that RMA fits using the median
polish. Different fitting algorithms lead to numerically different
parameter estimates, though I would be surprised if there was any
significant difference in performance between the two.

All rmaPLM is meant to do is the standard RMA computation but return it
in a PLMset (meaning the model residuals also get returned in the
appropriate slot of the PLMset). It does not, as you have observed,
return SE estimates.

2. If you want the se.chip.coefs that correspond directly to the
expression values you get from rma() then you out of luck. You could do
what most people do, which is assume that for all intents and purposes
the values you get from fitPLM are equivalent to RMA (though since RMA
is a bit of a "brand name", they are not really quite the same as you
have seen, I don't make that exact claim).

Why is there not SE for RMA? Basically, the main reason is that as far
as I am aware there is not any asymptotic theory for the median polish
leading to good SE. I suppose that I could have cooked up something
ad-hoc for RMA, but the PLM approach superceded it instead.

Best,

Ben

On Wed, 2008-08-13 at 16:16 -0500, Jenny Drnevich wrote:
> Hi again,
> 
> I've gone through the codes of fitPLM and rmaPLM, and I think the 
> difference between them is in the modelparam; rmaPLM instead uses 
> md.param, which has only two of the list items of modelparam (I don't 
> pretend to understand what they are). Each function also calls a 
> different .Call code, and the one rmaPLM uses outputs NaNs for all 
> the se.*.coefs. So, my questions remain:
> 
> 1. Are rmaPLM and fitPLM with defaults supposed to return the same 
> expression values? If so, why aren't they, and if not, what is fitPLM 
> calculating?
> 
> 2. Is there anyway to get the se.chip.coefs for RMA expression values?
> 
> Thanks!
> Jenny
> 
> At 01:18 PM 8/13/2008, Jenny Drnevich wrote:
> >HI,
> >
> >I'm getting a difference in the output of rmaPLM() and the default 
> >fitPLM(). I thought the default settings of fitPLM was supposed to 
> >be the RMA model, at least the vignette calls it a RMA-style PLM. 
> >Besides differences in the coefs/RMA values produced, rmaPLM is not 
> >outputting the se.chip.coefs slot, which is the standard error 
> >estimates for the chip coefficients that I want. See code and 
> >sessionInfo() below. I'm stepping through the code of each right now 
> >to figure out where the differences are, but I thought maybe someone 
> >could help me.
> >
> >Thanks,
> >Jenny
> >
> > > raw <- ReadAffy()
> >
> > > raw
> >AffyBatch object
> >size of arrays=1002x1002 features (8 kb)
> >cdf=Mouse430_2 (45101 affyids)
> >number of samples=4
> >number of genes=45101
> >annotation=mouse4302
> >notes=
> >
> > > rma.plm1 <- fitPLM(raw)
> > > rma.plm2 <- rmaPLM(raw)
> > > coefs(rma.plm1)[1:5,]
> >                  ctrl      ipi     Dmp1 ipi+Dmp1
> >1415670_at   11.02547 11.94682 11.07018 11.97461
> >1415671_at   12.33057 11.79736 12.25891 11.85637
> >1415672_at   11.38321 11.27387 11.36435 11.27967
> >1415673_at   10.03639 10.30681 10.02795 10.50375
> >1415674_a_at 10.45065 10.39834 10.53324 10.41134
> > > coefs(rma.plm2)[1:5,]
> >                   ctrl      ipi      Dmp1 ipi+Dmp1
> >1415670_at   10.710135 11.64534 10.751418 11.64572
> >1415671_at   12.414390 11.86388 12.337369 11.92163
> >1415672_at   11.584338 11.46070 11.549982 11.51548
> >1415673_at    9.902446 10.20577  9.933505 10.41233
> >1415674_a_at 10.228274 10.13993 10.301835 10.18136
> > > rma.rma <- rma(raw)
> >Background correcting
> >Normalizing
> >Calculating Expression
> > > exprs(rma.rma)[1:5,]
> >                   ctrl      ipi      Dmp1 ipi+Dmp1
> >1415670_at   10.710135 11.64534 10.751418 11.64572
> >1415671_at   12.414390 11.86388 12.337369 11.92163
> >1415672_at   11.584338 11.46070 11.549982 11.51548
> >1415673_at    9.902446 10.20577  9.933505 10.41233
> >1415674_a_at 10.228274 10.13993 10.301835 10.18136
> > > se(rma.plm1)[1:5,]
> >                    ctrl        ipi       Dmp1   ipi+Dmp1
> >1415670_at   0.02452685 0.02395422 0.02431190 0.02582995
> >1415671_at   0.02061485 0.02152285 0.02013362 0.02067196
> >1415672_at   0.03053776 0.02985205 0.03011108 0.03238839
> >1415673_at   0.04038712 0.03987544 0.04036288 0.03978057
> >1415674_a_at 0.02810474 0.02876715 0.02745431 0.02860643
> > > se(rma.plm2)[1:5,]
> >              ctrl ipi Dmp1 ipi+Dmp1
> >1415670_at    NaN NaN  NaN      NaN
> >1415671_at    NaN NaN  NaN      NaN
> >1415672_at    NaN NaN  NaN      NaN
> >1415673_at    NaN NaN  NaN      NaN
> >1415674_a_at  NaN NaN  NaN      NaN
> > > sessionInfo()
> >R version 2.7.1 (2008-06-23)
> >i386-pc-mingw32
> >
> >locale:
> >LC_COLLATE=English_United States.1252;LC_CTYPE=English_United 
> >States.1252;LC_MONETARY=English_United 
> >States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
> >
> >attached base packages:
> >[1] splines   tools     stats     graphics  grDevices utils     datasets
> >[8] methods   base
> >
> >other attached packages:
> >  [1] mouse4302cdf_2.2.0   affyQCReport_1.18.0  geneplotter_1.18.0
> >  [4] lattice_0.17-8       RColorBrewer_1.0-2   simpleaffy_2.16.0
> >  [7] made4_1.14.0         ade4_1.4-9           affyPLM_1.16.0
> >[10] affycoretools_1.12.0 annaffy_1.12.1       KEGG.db_2.2.0
> >[13] gcrma_2.12.1         matchprobes_1.12.0   biomaRt_1.14.0
> >[16] RCurl_0.9-3          GOstats_2.6.0        Category_2.6.0
> >[19] genefilter_1.20.0    survival_2.34-1      RBGL_1.16.0
> >[22] annotate_1.18.0      xtable_1.5-2         GO.db_2.2.0
> >[25] AnnotationDbi_1.2.2  RSQLite_0.6-9        DBI_0.2-4
> >[28] graph_1.18.1         limma_2.14.5         affy_1.18.2
> >[31] preprocessCore_1.2.0 affyio_1.8.0         Biobase_2.0.1
> >[34] RWinEdt_1.8-0
> >
> >loaded via a namespace (and not attached):
> >[1] cluster_1.11.11    grid_2.7.1         KernSmooth_2.22-22 XML_1.94-0.1
> >
> >
> >
> >Jenny Drnevich, Ph.D.
> >
> >Functional Genomics Bioinformatics Specialist
> >W.M. Keck Center for Comparative and Functional Genomics
> >Roy J. Carver Biotechnology Center
> >University of Illinois, Urbana-Champaign
> >
> >330 ERML
> >1201 W. Gregory Dr.
> >Urbana, IL 61801
> >USA
> >
> >ph: 217-244-7355
> >fax: 217-265-5066
> >e-mail: drnevich at illinois.edu
> >
> >_______________________________________________
> >Bioconductor mailing list
> >Bioconductor at stat.math.ethz.ch
> >https://stat.ethz.ch/mailman/listinfo/bioconductor
> >Search the archives: 
> >http://news.gmane.org/gmane.science.biology.informatics.conductor
> 
> Jenny Drnevich, Ph.D.
> 
> Functional Genomics Bioinformatics Specialist
> W.M. Keck Center for Comparative and Functional Genomics
> Roy J. Carver Biotechnology Center
> University of Illinois, Urbana-Champaign
> 
> 330 ERML
> 1201 W. Gregory Dr.
> Urbana, IL 61801
> USA
> 
> ph: 217-244-7355
> fax: 217-265-5066
> e-mail: drnevich at illinois.edu
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor