[BioC] Re: Proper Pooling Design
Eric
emblal at uky.edu
Wed Jan 21 01:59:31 MET 2004
I agree that the latter design (a different pool of individuals for each
replicate array) is of greater interest from a biological point of view,
and I recommend the following two articles.
Kendziorski, C. M., Y. Zhang, H. Lan and A. D. Attie (2003). "The
efficiency of pooling mRNA in microarray experiments." Biostatistics 4(3):
465-77.
In a microarray experiment, messenger RNA samples are oftentimes
pooled across subjects out of necessity, or in an effort to reduce the
effect of biological variation. A basic problem in such experiments is to
estimate the nominal expression levels of a large number of genes. Pooling
samples will affect expression estimation, but the exact effects are not
yet known as the approach has not been systematically studied in this
context. We consider how mRNA pooling affects expression estimates by
assessing the finite-sample performance of different estimators for designs
with and without pooling. Conditions under which it is advantageous to pool
mRNA are defined; and general properties of estimates from both pooled and
non-pooled designs are derived under these conditions. A formula is given
for the total number of subjects and arrays required in a pooled experiment
to obtain gene expression estimates and confidence intervals comparable to
those obtained from the no-pooling case. The formula demonstrates that by
pooling a perhaps increased number of subjects, one can decrease the number
of arrays required in an experiment without a loss of precision. The
assumptions that facilitate derivation of this formula are considered using
data from a quantitative real-time PCR experiment. The calculations are not
specific to one particular method of quantifying gene expression as they
assume only that a single, normalized, estimate of expression is obtained
for each gene. As such, the results should be generally applicable to a
number of technologies provided sufficient pre-processing and normalization
methods are available and applied.
Peng, X., C. L. Wood, E. M. Blalock, K. C. Chen, P. W. Landfield and A. J.
Stromberg (2003). "Statistical implications of pooling RNA samples for
microarray experiments." BMC Bioinformatics 4(1): 26.
BACKGROUND: Microarray technology has become a very important tool
for studying gene expression profiles under various conditions. Biologists
often pool RNA samples extracted from different subjects onto a single
microarray chip to help defray the cost of microarray experiments as well
as to correct for the technical difficulty in getting sufficient RNA from a
single subject. However, the statistical, technical and financial
implications of pooling have not been explicitly investigated. RESULTS:
Modeling the resulting gene expression from sample pooling as a mixture of
individual responses, we derived expressions for the experimental error and
provided both upper and lower bounds for its value in terms of the
variability among individuals and the number of RNA samples pooled. Using
"virtual" pooling of data from real experiments and computer simulations,
we investigated the statistical properties of RNA sample pooling. Our study
reveals that pooling biological samples appropriately is statistically
valid and efficient for microarray experiments. Furthermore, optimal
pooling design(s) can be found to meet statistical requirements while
minimizing total cost. CONCLUSIONS: Appropriate RNA pooling can provide
equivalent power and improve efficiency and cost-effectiveness for
microarray experiments with a modest increase in total number of subjects.
Pooling schemes in terms of replicates of subjects and arrays can be
compared before experiments are conducted.
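As a rough illustration of the point (my own sketch, not taken from either
paper), the two designs in the question quoted below can be compared by
simulating one gene in one group. The assumptions are mine: a pool's
expression is the mean of its animals on the measurement scale, biological
and technical errors are independent and normal, and both designs use the
same total number of animals. The function names, pool size and standard
deviations are hypothetical values chosen only for the example.

```python
import random
import statistics


def measure_arrays(design, n_arrays, pool_size, sigma_bio, sigma_tech, rng):
    """Simulate one gene's log-expression on n_arrays chips for one group."""
    def pool_mean(n_animals):
        # Assumption: a pool's expression is the mean of its animals.
        return sum(rng.gauss(0.0, sigma_bio) for _ in range(n_animals)) / n_animals

    if design == "split":
        # One big pool, divided into n_arrays aliquots.
        shared = pool_mean(n_arrays * pool_size)
        pools = [shared] * n_arrays
    elif design == "independent":
        # A different pool of animals for each array.
        pools = [pool_mean(pool_size) for _ in range(n_arrays)]
    else:
        raise ValueError(f"unknown design: {design!r}")
    # Each hybridization adds its own technical error.
    return [p + rng.gauss(0.0, sigma_tech) for p in pools]


def compare_designs(reps=5000, n_arrays=3, pool_size=10,
                    sigma_bio=1.0, sigma_tech=0.3, seed=1):
    """Compare the real spread of the group mean with what replicates suggest."""
    rng = random.Random(seed)
    summary = {}
    for design in ("split", "independent"):
        means, variances = [], []
        for _ in range(reps):
            y = measure_arrays(design, n_arrays, pool_size,
                               sigma_bio, sigma_tech, rng)
            means.append(statistics.fmean(y))
            variances.append(statistics.variance(y))
        summary[design] = {
            # Spread of the group mean across repeated experiments.
            "true_var_of_group_mean": statistics.pvariance(means),
            # What one experiment would claim: between-array variance / n.
            "var_claimed_from_replicates": statistics.fmean(variances) / n_arrays,
        }
    return summary


if __name__ == "__main__":
    for design, result in compare_designs().items():
        print(design, result)
```

Under these assumptions both designs estimate the group mean about equally
well, but only the design with a different pool per array yields a
between-array variance that reflects the biological variation. With one
split pool the replicates capture technical error alone, so a test built on
them would tend to overstate the evidence for a mutant vs normal difference.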
At 11:48 PM 1/20/2004 +0100, you wrote:
>Message: 7
>Date: Tue, 20 Jan 2004 17:47:04 -0500
>From: YUK FAI LEUNG <yfleung at mcb.harvard.edu>
>Subject: [BioC] Proper pooling design
>To: bioconductor at stat.math.ethz.ch
>Message-ID: <400DAFE8.5030709 at mcb.harvard.edu>
>Content-Type: text/plain; charset=us-ascii; format=flowed
>
>Hi there,
>
>I am designing a pilot microarray study on an embryonic developmental
>mutant using the Affy platform. The comparison itself is very simple: the mutant vs
>normal at one time point. Due to various reasons (mostly funding and
>limited amount of tissue), I can't start with the "ideal" approach in
>which each sample is hybridized to an individual chip.
>
>Since I can easily rear a lot of animals, it seems that pooling is the
>only choice for the pilot study. However, I am not sure what the best
>way is to allocate the pooled samples to each chip. For example, if I want
>to do 3 array replicates each for the mutant and control, is it better
>to pool enough samples for 3 arrays and then split the pooled sample
>into 3 portions for hybridization, or to pool different individual
>samples for each replicate?
>
>It seems to me that the first way is like getting a group expression
>average with an assessment of technical variation, while the second
>approach can also provide some sort of evaluation of biological
>variation, albeit one averaged by the pooling. I suspect the latter
>approach is better, and would love to hear your suggestions.
>
>Thanks!
>
>Fai
>
>--
>Yuk Fai Leung
>Department of Molecular and Cellular Biology
>Harvard University
>BL 2079, 16 Divinity Avenue
>Cambridge, MA 02138
>Tel: 617-495-2599
>Fax: 617-496-3321
>email: yfleung at mcb.harvard.edu; yfleung at genomicshome.com
>URL: http://genomicshome.com
Eric Blalock, PhD
Dept Pharmacology, UKMC
859 323-8033