[BioC] Re: Proper Pooling Design

Eric emblal at uky.edu
Wed Jan 21 01:59:31 MET 2004


I agree that the latter design (pooling different individuals for each 
replicate) would be of greater interest from a biological point of view, 
and I recommend the following two articles.
Kendziorski, C. M., Y. Zhang, H. Lan and A. D. Attie (2003). "The 
efficiency of pooling mRNA in microarray experiments." Biostatistics 4(3): 
465-77.
         In a microarray experiment, messenger RNA samples are oftentimes 
pooled across subjects out of necessity, or in an effort to reduce the 
effect of biological variation. A basic problem in such experiments is to 
estimate the nominal expression levels of a large number of genes. Pooling 
samples will affect expression estimation, but the exact effects are not 
yet known as the approach has not been systematically studied in this 
context. We consider how mRNA pooling affects expression estimates by 
assessing the finite-sample performance of different estimators for designs 
with and without pooling. Conditions under which it is advantageous to pool 
mRNA are defined; and general properties of estimates from both pooled and 
non-pooled designs are derived under these conditions. A formula is given 
for the total number of subjects and arrays required in a pooled experiment 
to obtain gene expression estimates and confidence intervals comparable to 
those obtained from the no-pooling case. The formula demonstrates that by 
pooling a perhaps increased number of subjects, one can decrease the number 
of arrays required in an experiment without a loss of precision. The 
assumptions that facilitate derivation of this formula are considered using 
data from a quantitative real-time PCR experiment. The calculations are not 
specific to one particular method of quantifying gene expression as they 
assume only that a single, normalized, estimate of expression is obtained 
for each gene. As such, the results should be generally applicable to a 
number of technologies provided sufficient pre-processing and normalization 
methods are available and applied.

Peng, X., C. L. Wood, E. M. Blalock, K. C. Chen, P. W. Landfield and A. J. 
Stromberg (2003). "Statistical implications of pooling RNA samples for 
microarray experiments." BMC Bioinformatics 4(1): 26.
         BACKGROUND: Microarray technology has become a very important tool 
for studying gene expression profiles under various conditions. Biologists 
often pool RNA samples extracted from different subjects onto a single 
microarray chip to help defray the cost of microarray experiments as well 
as to correct for the technical difficulty in getting sufficient RNA from a 
single subject. However, the statistical, technical and financial 
implications of pooling have not been explicitly investigated. RESULTS: 
Modeling the resulting gene expression from sample pooling as a mixture of 
individual responses, we derived expressions for the experimental error and 
provided both upper and lower bounds for its value in terms of the 
variability among individuals and the number of RNA samples pooled. Using 
"virtual" pooling of data from real experiments and computer simulations, 
we investigated the statistical properties of RNA sample pooling. Our study 
reveals that pooling biological samples appropriately is statistically 
valid and efficient for microarray experiments. Furthermore, optimal 
pooling design(s) can be found to meet statistical requirements while 
minimizing total cost. CONCLUSIONS: Appropriate RNA pooling can provide 
equivalent power and improve efficiency and cost-effectiveness for 
microarray experiments with a modest increase in total number of subjects. 
Pooling schemes in terms of replicates of subjects and arrays can be 
compared before experiments are conducted.
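
To make the difference between the two designs in the question concrete, 
here is a minimal simulation sketch in Python. It is not taken from either 
paper. It assumes that a pool's expression value is the plain average of 
its members' values on the measured scale, and that biological and 
technical noise are independent and normal. The names (simulate, sigma_bio, 
sigma_tech, pool_size) and all numbers are made up for illustration. Both 
designs use the same total number of animals per group.

```python
import random
import statistics


def simulate(design, n_arrays=3, pool_size=5, sigma_bio=1.0,
             sigma_tech=0.5, n_sim=5000, seed=0):
    """Simulate one group (e.g. mutant) for a single gene, many times over.

    Returns (variance of the group-mean estimate across simulations,
             average array-to-array sample variance within a group).
    All parameter values are hypothetical, chosen only for illustration.
    """
    rng = random.Random(seed)
    group_means = []
    array_vars = []
    for _ in range(n_sim):
        if design == "split_one_pool":
            # One big pool of all animals, split into n_arrays aliquots.
            animals = [rng.gauss(0.0, sigma_bio)
                       for _ in range(n_arrays * pool_size)]
            pool = statistics.fmean(animals)  # assumed: pool = plain average
            arrays = [pool + rng.gauss(0.0, sigma_tech)
                      for _ in range(n_arrays)]
        elif design == "independent_pools":
            # A separate pool of different animals for each array.
            arrays = []
            for _ in range(n_arrays):
                animals = [rng.gauss(0.0, sigma_bio)
                           for _ in range(pool_size)]
                arrays.append(statistics.fmean(animals)
                              + rng.gauss(0.0, sigma_tech))
        else:
            raise ValueError(f"unknown design: {design}")
        group_means.append(statistics.fmean(arrays))
        array_vars.append(statistics.variance(arrays))
    return statistics.pvariance(group_means), statistics.fmean(array_vars)


if __name__ == "__main__":
    for design in ("split_one_pool", "independent_pools"):
        var_mean, avg_array_var = simulate(design)
        # avg_array_var / 3 is the variance of the mean that the
        # replicate arrays alone would suggest (n_arrays = 3 by default).
        print(design, round(var_mean, 3), round(avg_array_var / 3, 3))
```

Under these assumptions both designs estimate the group mean equally well 
(true variance sigma_bio^2 / (n_arrays * pool_size) + sigma_tech^2 / 
n_arrays), but only the independent pools give an array-to-array variance 
that reflects it. Splitting one pool sees technical noise only, so a test 
built on those replicates would tend to be overconfident.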

At 11:48 PM 1/20/2004 +0100, you wrote:
>Message: 7
>Date: Tue, 20 Jan 2004 17:47:04 -0500
>From: YUK FAI LEUNG <yfleung at mcb.harvard.edu>
>Subject: [BioC] Proper pooling design
>To: bioconductor at stat.math.ethz.ch
>Message-ID: <400DAFE8.5030709 at mcb.harvard.edu>
>Content-Type: text/plain; charset=us-ascii; format=flowed
>
>Hi there,
>
>I am designing a pilot microarray study on an embryonic developmental mutant
>using the Affymetrix platform. The comparison itself is very simple: mutant
>vs. normal at one time point. For various reasons (mostly funding and a
>limited amount of tissue), I can't start with the "ideal" approach in
>which each sample is hybridized to an individual chip.
>
>Since I can easily rear a lot of animals, it seems that pooling is the
>only choice for the pilot study. However, I am not sure of the best
>way to allocate the pooled samples to each chip. For example, suppose I
>want to do 3 array replicates each for the mutant and the control. Is it
>better to pool enough samples for 3 arrays and then split the pooled sample
>into 3 portions for hybridization, or to pool different individual
>samples for each replicate?
>
>It seems to me that the first way is like getting a group expression
>average with an assessment of technical variation, while the second
>approach can also provide some evaluation of biological
>variation, albeit one averaged by the pooling. I suspect the latter
>approach is better, and would love to hear your suggestions.
>
>Thanks!
>
>Fai
>
>--
>Yuk Fai Leung
>Department of Molecular and Cellular Biology
>Harvard University
>BL 2079, 16 Divinity Avenue
>Cambridge, MA 02138
>Tel: 617-495-2599
>Fax: 617-496-3321
>email: yfleung at mcb.harvard.edu; yfleung at genomicshome.com
>URL: http://genomicshome.com

Eric Blalock, PhD
Dept Pharmacology, UKMC
859 323-8033