[BioC] Analyzing arrays with no replicates
James W. MacDonald
jmacdon at med.umich.edu
Sun Feb 20 23:30:15 CET 2005
McGee, Monnie wrote:
>
> 1. I used rma(affy.object) to obtain normalized, background
> corrected expression levels. However, rma uses median polish to
> obtain expression values. I read in the achieves of this list that
> median polish was not a good idea when there are no replicates, and
> that rlm would be better. In order to use rlm, one needs an
> expression set, which means I have to specify some sort of summary
> method (using expresso) or take median polish (rma). If not median
> polish, which summary method? I have also read on the list that
> reviewers of some journals consider that median polish is the best
> method, and, if it is not used, the analysis is suspect (resulting in
> a reanalysis or a rejection). So, if I don't use median polish (or
> rma) how do I justify the method I use?
I think you are confused. RMA doesn't care if there are any replicates
or not. It is simply used to convert PM data to an expression measure
for each gene on each chip.
>
> 2. When I do my.eset <- rma(affy.object), I obtain an object of class
> "exprSet", but I think the phenoData attribute is incorrect;
>
>> print(my.eset)
>
> Expression Set (exprSet) with 54675 genes 4 samples phenoData object
> with 1 variables and 4 cases varLabels sample: arbitrary numbering
>
>> pData(my.eset)
>
> sample Control 1 Treat1 2 Treat2 3 Treat3 4
>
> It seems to me that the phenoData object should have 4 variables and
> 1 case. In other words, phenoData sees my treatments as cases (or
> patients) and says that I have one variable.
You can change the phenoData slot at any time. See ?exprSet for more
information.
>
>
> 3. Once I have expression levels (and a correct eset), what is the
> best way to do non-specific filtering? Clearly, I don't want to test
> all 54K genes. I tried the following, suggested in one of the R
> working papers:
>
>
>> library(genefilter) f1 <- pOverA(0.25, log2(100)) f2 <- function(x)
>> (IQR(x) > 0.5) ff <- filterfun(f1,f2) selected <-
>> genefilter(my.eset,ff) sum(selected)
>
> [1] 57
>
> Maybe 57 genes is a great number, but it seems much smaller than the
> examples I've seen in papers. Also, one of the AFFX genes is in this
> filtered sample. Since they are housekeeping genes, I would think
> that they aren't supposed to pass through the filter.
You will likely have to do some sort of pre-filtering before looking at
your fold change data (see answers from Sean Davis and Naomi Altman).
Without any replication there is no reason to expect that an AFFX gene
would be removed by a filter, because there is no reason to expect that
these probes are any less noisy than any other, and without any measure
of variability you will not be able to distinguish differentially
expressed genes from noisy genes.
Jim
--
James W. MacDonald
University of Michigan
Affymetrix and cDNA Microarray Core
1500 E Medical Center Drive
Ann Arbor MI 48109
734-647-5623
**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues.
More information about the Bioconductor
mailing list