[BioC] computing SD using a list of gene expressions
Norman Pavelka
norman.pavelka at unimib.it
Thu Apr 21 17:31:31 CEST 2005
Dear Lana,
You might want to try a recently released BioC
package called 'plgem' (Power Law Global Error Model), which is
available from the development repository at
<http://www.bioconductor.org/repository/devel/package/html/plgem.html>.
Briefly, it implements a method that estimates the SD of single genes
based on the global behavior of all genes in a dataset of replicated
samples. It takes advantage of the fact that the SD of a given gene
depends on the average expression level of that gene, following a
power law.
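To make the power-law idea concrete, here is a minimal base-R sketch. It is not the plgem code itself: the data are simulated, and the names (n.genes, true.mean, the 0.5 and 0.8 constants) are assumptions chosen only for illustration. It fits a straight line to log(SD) versus log(mean) across all genes, which is one simple way to see such a relationship.

```r
# Illustration only: simulated data and plain lm(), not the plgem implementation.
set.seed(1)
n.genes <- 500
n.reps  <- 4

# Hypothetical expression levels and an assumed power law SD = 0.5 * mean^0.8
true.mean <- exp(runif(n.genes, log(50), log(5000)))
true.sd   <- 0.5 * true.mean^0.8

# Replicate matrix: genes in rows, replicates in columns
x <- matrix(rnorm(n.genes * n.reps, mean = true.mean, sd = true.sd),
            nrow = n.genes)

gene.mean <- rowMeans(x)
gene.sd   <- apply(x, 1, sd)

# A power law is a straight line on the log-log scale
keep  <- gene.mean > 0 & gene.sd > 0
fit   <- lm(log(gene.sd[keep]) ~ log(gene.mean[keep]))
slope <- unname(coef(fit)[2])
slope  # should be in the neighbourhood of the simulated exponent
```

The package fits its own model to real replicated data; the sketch only shows why a single global fit can stand in for per-gene SD estimates.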
After installing the package (which depends on MASS), a fairly
straightforward wrapper fits the model to the data, computes
model-based differential expression statistics and outputs a list of
significantly changing genes, based on a set of random resamplings of
the data used for fitting the model. If you do not have enough
replicates in the dataset to perform the resampling step, the first n
genes (default is 100) are selected instead.
You first need to create from your data an object of class exprSet
with a phenoData slot that contains a covariate called
conditionName, in which you provide some coding of your classes (e.g.
treated, ctrl). The only important thing here is that you
give the same value to samples you wish to be treated as replicates.
Other covariates in addition to conditionName are allowed, but will
be ignored.
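As a rough picture of what the covariate table could look like, here is a base-R sketch. The sample names (s1 to s5), the labels and the extra batch column are hypothetical, and this only builds the plain data frame; how it is wrapped into the exprSet depends on your Biobase version, so that step is left out.

```r
# Hypothetical sample annotation; only the conditionName column matters here.
pd <- data.frame(
  conditionName = c("ctrl", "ctrl", "ctrl", "treated", "treated"),
  batch         = c(1, 1, 2, 1, 2),  # extra covariate, allowed but not used
  row.names     = c("s1", "s2", "s3", "s4", "s5"),
  stringsAsFactors = FALSE
)

# Samples sharing a conditionName value are the ones treated as replicates
replicate.groups <- split(rownames(pd), pd$conditionName)
replicate.groups
```

Here s1, s2 and s3 would count as replicates of one condition and s4, s5 as replicates of the other.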
Then simply type:
> list.of.significant.genes <- run.plgem(esdata)
where esdata is an object of class exprSet as described above. By
default this assumes that your baseline samples are the first ones
encountered in your phenoData and that you want to perform the
selection at an overall significance level of 0.001. To change
these or other defaults, please refer to the help pages and to the
vignette provided in the package.
Of course you will need at least one condition with 3 replicates in
order to fit the model, but in the remaining experimental conditions
the SD can be estimated even from single samples.
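The reason a single sample is enough can be sketched in a few lines of base R. This is an illustration under assumptions, not the package's code: powerLawSD is a hypothetical helper, the slope and intercept stand for values obtained from a fit on the replicated condition, and the ratio at the end is just one simple signal-to-noise style statistic.

```r
# Hypothetical helper: SD predicted from a power law fitted on the log-log scale,
# i.e. log(SD) = intercept + slope * log(expression)
powerLawSD <- function(expr, slope, intercept) {
  exp(intercept + slope * log(expr))
}

# Assumed parameters, standing in for a fit on the replicated condition
slope     <- 0.8
intercept <- log(0.5)

# Unreplicated measurements of three genes in two conditions (made-up numbers)
ctrl    <- c(geneA = 100, geneB = 1000, geneC = 4000)
treated <- c(geneA = 150, geneB = 1050, geneC = 8000)

# Model-based SDs are available even though each condition has one sample
sd.ctrl    <- powerLawSD(ctrl, slope, intercept)
sd.treated <- powerLawSD(treated, slope, intercept)

# Illustrative signal-to-noise ratio: difference scaled by the modeled SDs
stn <- (treated - ctrl) / (sd.treated + sd.ctrl)
stn
```

The point of the design is that the noise level comes from the global fit rather than from the few values measured for each gene.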
Reference article: <http://www.biomedcentral.com/1471-2105/5/203>
I will be happy to help if you encounter any difficulties.
Good luck!
Norman
Norman Pavelka
Department of Biotechnology and Bioscience
University of Milano-Bicocca
Piazza della Scienza, 2
20126 Milan, Italy
Phone: +39 02 6448 3556
Fax: +39 02 6448 3552
> Date: Wed, 20 Apr 2005 09:34:50 -0700
> From: "Lana Schaffer" <schaffer at scripps.edu>
> Subject: [BioC] computing SD using a list of gene expressions
> To: <bioconductor at stat.math.ethz.ch>
> Message-ID: <002a01c545c6$e00089e0$54508389 at menton>
> Content-Type: text/plain
>
> Hi,
> I would like to know if there is a way to estimate standard deviations
> for all genes, using noise information of genes with similar intensity
> levels? This would be helpful when trying to obtain significant fold
> change from experiments without replicates.
> Thanks for your ideas.
> Lana
>