[BioC] DNA micro-array normalization
michael watson (IAH-C)
michael.watson at bbsrc.ac.uk
Tue Feb 16 22:00:15 CET 2010
Wolfgang: extra prizes if you can get the entire analysis into one line...
Avehna: setting up the data structures required for this analysis in limma should be fairly simple, but if you have problems, please ask.
It looks like you have data on E. coli, and we have R packages for displaying quantitative data on bacterial genomes. This can be useful for looking at operons, islands, etc. Again, if this seems helpful, I can provide more info.
________________________________________
From: Wolfgang Huber [whuber at embl.de]
Sent: 16 February 2010 20:54
To: michael watson (IAH-C)
Cc: avehna; bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] DNA micro-array normalization
Avehna:
I'd try with lmFit / eBayes from limma, since the moderated t-test
typically provides better power for such small sample sizes. Also,
I'd look at the output of "meanSdPlot" (from the vsn package) and
"multidensity" or "multiecdf" (from the geneplotter package) to see
whether the data need (i) transformation and (ii) between-array
normalisation. For both, "vsn2" from the vsn package is one possibility.
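
To make the "does the data need transformation" check concrete, here is a rough numerical analogue of what meanSdPlot displays: the rank correlation between each probe's mean and its standard deviation across arrays. This is only a sketch in plain Python rather than the vsn code itself; the simulated matrix (500 probes, 6 arrays, multiplicative noise) and all function names are made up for illustration. A strong positive correlation on the raw scale that disappears after a log2 transform suggests the variance depends on the mean and that a transformation is worthwhile.

```python
import math
import random
import statistics


def ranks(values):
    """Return 0-based ranks of the values (ties are not expected here)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, idx in enumerate(order):
        out[idx] = rank
    return out


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_sd_rank_correlation(matrix):
    """Spearman correlation between per-probe mean and per-probe SD."""
    means = [statistics.fmean(row) for row in matrix]
    sds = [statistics.stdev(row) for row in matrix]
    return pearson(ranks(means), ranks(sds))


# Hypothetical data: one row per probe, one column per array.
# Noise is multiplicative, so the SD grows with the signal level.
rng = random.Random(1)
signal = []
for _ in range(500):
    level = 2 ** rng.uniform(4, 13)
    signal.append([level * math.exp(rng.gauss(0, 0.2)) for _ in range(6)])

log_signal = [[math.log2(v) for v in row] for row in signal]

raw_corr = mean_sd_rank_correlation(signal)
log_corr = mean_sd_rank_correlation(log_signal)
print(f"mean-SD rank correlation, raw scale:  {raw_corr:.2f}")
print(f"mean-SD rank correlation, log2 scale: {log_corr:.2f}")
```

With real data, the rows would be the "Signal" values for each probe across the six arrays. A single summary number is cruder than the plot, so treat it as a quick screen only.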
Michael:
one more :) -- I guess fortune(117) and fortune(234) apply. Less opaquely,
- I don't know of a test that has power to reject Normality on a sample
of size 3 or 6.
- Normality of the data is a sufficient condition for some (important)
theoretical properties of the t-test, but it is not necessary for it to
provide good enough type I error control and power in applications.
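
The second point can be checked by simulation. The sketch below, in plain Python, draws 3 vs 3 samples from the same distribution (so every rejection is a false positive), runs a pooled two-sample t-test at the 5% level, and compares the false-positive rate for Normal and for skewed (log-normal) data. The log-normal choice, the number of simulations, and the function names are assumptions for illustration; the critical value 2.776 is the two-sided 5% point of Student's t with 4 degrees of freedom, hard-coded to avoid extra dependencies. It looks only at type I error for one non-Normal distribution, not at power.

```python
import math
import random
import statistics

T_CRIT_DF4 = 2.776  # two-sided 5% critical value, Student's t with 4 df


def pooled_t(x, y):
    """Classical equal-variance two-sample t statistic."""
    nx, ny = len(x), len(y)
    pooled_var = (
        (nx - 1) * statistics.variance(x) + (ny - 1) * statistics.variance(y)
    ) / (nx + ny - 2)
    diff = statistics.fmean(x) - statistics.fmean(y)
    return diff / math.sqrt(pooled_var * (1 / nx + 1 / ny))


def false_positive_rate(draw, n_sim=20000, n=3, seed=0):
    """Fraction of null simulations in which |t| exceeds the critical value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        x = [draw(rng) for _ in range(n)]
        y = [draw(rng) for _ in range(n)]
        if abs(pooled_t(x, y)) > T_CRIT_DF4:
            hits += 1
    return hits / n_sim


rate_normal = false_positive_rate(lambda r: r.gauss(0, 1))
rate_skewed = false_positive_rate(lambda r: r.lognormvariate(0, 1))
print(f"false-positive rate, Normal data:     {rate_normal:.3f}")
print(f"false-positive rate, log-normal data: {rate_skewed:.3f}")
```

If the rate for the skewed data stays at or below the nominal 5%, the test is still controlling type I error even though the Normality assumption is violated.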
Best wishes
Wolfgang
michael watson (IAH-C) scripsit 02/16/2010 09:34 PM:
> This is definitely processed data, and without access to the original data or a description of the analysis methodology, your options are limited.
>
> Personally, I'd do a test for normality on the "Signal" values, and if they turn out to be normal, I'd run a simple t-test (control vs treatment) on each gene and correct the p-values for multiple testing.
>
> Simple stuff, but it should work.
> ________________________________________
> From: bioconductor-bounces at stat.math.ethz.ch [bioconductor-bounces at stat.math.ethz.ch] On Behalf Of avehna [avhena at gmail.com]
> Sent: 16 February 2010 19:47
> To: bioconductor at stat.math.ethz.ch
> Subject: [BioC] DNA micro-array normalization
>
> Hi There,
>
> I've got a DNA microarray dataset that looks like this:
>
> Probe           Signal  Detection  Detection_p-value  Descriptions
> AFFX-BioB-5_at  181     P          0.00011            "E. coli GEN=bioB DB_XREF=gb:J04423.1"
> AFFX-BioB-M_at  227.3   P          0.000044           "E. coli GEN=bioB DB_XREF=gb:J04423.1"
> AFFX-BioC-5_at  499.2   P          0.000052           "E. coli GEN=bioC DB_XREF=gb:J04423.1"
>
> I have control and treatment samples, with 3 replicates of each.
>
> But I'm not sure whether these data have already been normalized, and on the
> other hand, this is not the typical Affymetrix format...
>
> Could you help me in this regard? What is the typical signal range for raw
> Affymetrix data? (These data range from 0 to 9000.)
>
> If the data have already been normalized, can I calculate the mean (for
> treatment and control) and then the differential expression of genes
> without taking into account the "Detection" column?
>
> (I guess I will need to build my ExpressionSet from scratch)
>
> Thanks a lot (I'm a newbie in Bioconductor and microarray analysis). I will
> appreciate your help!
>
> Avhena
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
--
Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber/contact