[BioC] Re: Pooling in Microarray Studies (Eric Blalock)
emblal at uky.edu
Mon Oct 6 14:36:23 MEST 2003
To add to what's already been said about pooling by Christina and Nicholas,
keep in mind that it is going to be difficult to use a sub-pooling strategy
(e.g., two animals per chip) and get the same power as one animal per chip
would give you, unless you increase the number of animals in the study.
This should hold provided each individual animal yields enough mRNA for a
microarray measurement (otherwise the non-pooling strategy is moot).
Dr. Kendziorski's recent paper discusses this (I reiterate because some
others on the list might find it useful),
and our group also published on the subject earlier this year.
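
A rough sketch of the power argument above, for anyone who wants to plug in their own numbers. It assumes that a pool behaves like the equal-weight average of the animals in it and that biological and technical errors are independent. The function name and the variance values are hypothetical, chosen only for illustration; they are not taken from either paper.

```python
def var_group_mean(n_animals, animals_per_chip, sigma2_bio, sigma2_tech):
    """Variance of a group mean with equal-sized pools, one pool per chip.

    Assumption: pooled mRNA acts like the average of its contributors,
    so a chip has variance sigma2_bio / animals_per_chip + sigma2_tech.
    """
    if n_animals % animals_per_chip != 0:
        raise ValueError("n_animals must be a multiple of animals_per_chip")
    n_chips = n_animals // animals_per_chip
    var_per_chip = sigma2_bio / animals_per_chip + sigma2_tech
    return var_per_chip / n_chips


# Made-up variance components, for illustration only.
SIGMA2_BIO, SIGMA2_TECH = 1.0, 0.25

individual = var_group_mean(10, 1, SIGMA2_BIO, SIGMA2_TECH)    # 10 chips
subpooled = var_group_mean(10, 2, SIGMA2_BIO, SIGMA2_TECH)     # 5 chips
more_animals = var_group_mean(12, 2, SIGMA2_BIO, SIGMA2_TECH)  # 6 chips

print(individual, subpooled, more_animals)
```

With these made-up numbers, ten animals sub-pooled two per chip give a larger variance for the group mean (0.15) than ten animals on ten chips (0.125), and it takes twelve animals on six chips to get back to 0.125. The sketch ignores the loss of degrees of freedom that comes with fewer chips, which would hurt the pooled design further.
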
We've also gone back and actually compared results from pooled arrays vs.
individual arrays for the same set of mRNA. So far, these results support the
notion that pooling mRNAs really does give the averaged readings found in
the 1:1 study (n = 18/group). However, our sub-pooling strategy was 3:1
(subjects to arrays, 6 arrays per group) and it is possible that other
strategies, or a larger n (as Dr. Kendziorski has) might reveal some
serious problems with the 'average mixing' notion upon which our
sub-pooling assumptions were predicated.
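
For readers who want to run the same kind of check on their own data, here is a minimal sketch of the comparison described above: the reading from a pooled array against the gene-wise mean of the individual arrays for the subjects in that pool. The function names and the toy numbers are hypothetical, and the sketch assumes all readings are already normalized and on the scale at which averaging is expected to hold.

```python
from statistics import mean


def expected_pool_profile(individual_arrays):
    """Gene-wise mean across individual arrays (one list of values per array)."""
    return [mean(values) for values in zip(*individual_arrays)]


def pooling_discrepancy(individual_arrays, pooled_array):
    """Gene-wise difference: pooled reading minus mean of its contributors."""
    expected = expected_pool_profile(individual_arrays)
    return [pooled - exp for pooled, exp in zip(pooled_array, expected)]


# Toy data, made up for illustration: 3 subjects x 3 genes, and one array
# hybridized with the pool of those same 3 subjects (a 3:1 design).
subjects = [
    [8.0, 5.0, 10.0],
    [9.0, 5.5, 10.5],
    [10.0, 6.0, 11.0],
]
pooled = [9.5, 5.5, 10.0]

diffs = pooling_discrepancy(subjects, pooled)
print(diffs)
```

If the 'average mixing' notion holds, the differences should scatter around zero with roughly the spread expected from technical noise alone; a systematic trend across genes would be a warning sign.
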
>Date: Mon, 6 Oct 2003 10:11:49 -0500 (CDT)
>From: Christina Kendziorski <kendzior at biostat.wisc.edu>
>Subject: [BioC] Pooling in microarray studies
>To: bioconductor at stat.math.ethz.ch
> <Pine.GSO.4.44.0310060956470.24247-100000 at nova-8.biostat.wisc.edu>
>Content-Type: TEXT/PLAIN; charset=US-ASCII
>Pooling, as you have noted, is often done when sufficient
>mRNA is not available. I see that others have indicated that
>you probably have enough mRNA...but I will make a few
>comments anyway. Pooling is also done in an effort to
>reduce the effect of biological variability. There are
>many studies that have used pooled samples for this
>latter purpose. Provided certain assumptions hold
>(mRNAs average out when pooled and there are no outliers),
>pooling can be advantageous. You end up getting an estimate
>of the sum of pool to pool and technical variability....
>and provided you are making inferences about pools of subjects
>(as opposed to individuals - which is often the case
>when studying experimental populations), this estimate is
>the one you need. I discuss this in a recent Biostatistics paper.
>I'll send it under separate cover.
>We are doing studies to check these assumptions using ~60
>Affy chips. There is some evidence to think that the assumptions
>might NOT hold.
>Finally, I will note that if there is contamination
>(let's say an outlier animal), pooling might still be useful.
>As I argued at a talk at JSM (slides are
>at my website), there are different cases of contamination to
>consider. The most realistic one is that an animal is an outlier
>in, say, some collection of genes... and you don't know, of course, what
>that collection is. You'll either end up averaging at the
>mRNA level (again, provided mRNAs average) or you will end up
>averaging the individual measurements across arrays (after normalizing).
>I hope that helps.
>Department of Biostatistics and Medical Informatics
>University of Wisconsin - Madison
>Medical Sciences Center (6729)
>1300 University Avenue
>Madison, Wisconsin 53706
>Phone: (608) 262-3146
>Fax: (608) 265-7916
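
The point about averaging at the mRNA level versus averaging across arrays can be illustrated with a small sketch. It assumes equal-weight mixing within a pool and a single outlier animal whose expression for some gene is shifted by a fixed amount; the function name and the numbers are hypothetical.

```python
def outlier_shift_in_group_mean(n_animals, animals_per_chip, outlier_shift):
    """Shift in the group mean caused by one outlier animal.

    Assumption: a pool reads as the equal-weight average of its animals,
    and the group mean is the plain average over chips.
    """
    if n_animals % animals_per_chip != 0:
        raise ValueError("n_animals must be a multiple of animals_per_chip")
    n_chips = n_animals // animals_per_chip
    # Averaging at the mRNA level: the outlier is diluted within its pool.
    shift_on_chip = outlier_shift / animals_per_chip
    # Averaging across arrays: only one chip carries the diluted shift.
    return shift_on_chip / n_chips


# Made-up shift of 4 units in one of 10 animals.
for pool_size in (1, 2, 10):
    print(pool_size, outlier_shift_in_group_mean(10, pool_size, 4.0))
```

Under these assumptions the outlier moves the group mean by the same amount whether the averaging happens in the tube or across arrays. What changes with pool size is how many chips remain to reveal that one animal was unusual.
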
> > I have a question about the pooling of mRNA
> > samples. Someone approached me about the
> > following problem:
> > The study wants to use Affymetrix chips to study
> > changes in expression between a group of treated
> > mice and a group of untreated mice. There are 10 mice
> > in each group. It is only possible to extract
> > 8 ug of RNA from each mouse, not enough for one chip.
> > (According to the experimenters, they require 10 ug per
> > chip.) So it is not possible to use biological
> > replicate chips for each individual mouse. Now the issue
> > is whether to perhaps pool the RNA in each group
> > and carry out analysis on technical replicates from the
> > pooled samples.
> > As I understand it pooling may reduce the precision, with
> > the risk that one or few samples can dominate the outcome, and
> > that averaging over single sample hybridisations is perhaps
> > safer than using pooled samples. However in this case you cannot
> > do single sample hybridisations.
> > I was wondering if the following approach is an acceptable
> > compromise to retain at least some information on the between
> > sample variation in each group:
> > Mix the RNA from 2 different mice on a single chip to get 5
> > hybridisations, where the hybridisation on each chip is from the
> > mix of the RNA samples of two mice? I thought that this may
> > enable you to see, to some extent, whether all the mice are behaving
> > similarly. Of course, one would not be able to distinguish
> > between the behaviour of the two mice relating to the same
> > chip. Or is it better to accept that you do not have enough
> > RNA to hybridize the sample for each individual to a separate
> > chip and pool the samples and accept the risk that
> > one sample may dominate the outcome? The best
> > solution did not seem obvious (to me at least!)
> > Any comments will be much appreciated.
> > Wiesner
> > Wiesner J. Vos
> > Department of Statistics
> > University of Oxford
> > 1 South Parks Road
> > OX1 3TG
> > United Kingdom
>Date: Mon, 06 Oct 2003 23:36:25 +0800
>From: "Nicholas Lewin-Koh" <nikko at hailmail.net>
>Subject: [BioC] Re: Pooling in microarray studies
>To: bioconductor at stat.math.ethz.ch
>Message-ID: <20031006153625.6E3083D7A6 at www.fastmail.fm>
>Content-Type: text/plain; charset="ISO-8859-1"
>If you can get away without pooling, don't. However, if you do decide to
>pool, I would use more than two mice per chip. As another poster
>mentioned, C. M. Kendziorski at the University of Wisconsin has done some
>really nice work on the effects of pooling in terms of variance
>components. One key point is that there is the added variance of
>assigning mice to pools as well as the biological variance between mice.
>She did some nice work on the number of mice needed per pool to get the
>variance to an acceptable level as well. The only real cause for pooling
>is if you really can't get enough RNA and mice are cheap relative to