[BioC] DNA micro-array normalization
James W. MacDonald
jmacdon at med.umich.edu
Tue Feb 16 21:50:26 CET 2010
To add to this: these data are almost surely MAS5-processed, as I
don't know of any other algorithm that gives the detection p-value. In
addition, the range of 0 - 9000 indicates that these data are not logged
(which is the next step for you). People normally use log base 2 so that
a difference of 1 or -1 indicates two-fold up- or down-regulation.
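
To make the base-2 convention concrete, here is a minimal sketch (in Python
rather than R, purely for illustration). The treated value and the `floor`
argument are made up; the floor is only an assumed guard against signals at or
near 0, not something MAS5 prescribes.

```python
import math

def log2_signal(value, floor=1.0):
    """Return log2 of a signal value.

    `floor` is a hypothetical guard: values below it are clamped so that
    signals at or near 0 do not produce -inf or very large negative numbers.
    """
    return math.log2(max(value, floor))

control = 227.3   # Signal value taken from the posted example
treated = 454.6   # made-up value, exactly twice the control

# A doubling on the raw scale is a difference of 1 on the log2 scale.
diff = log2_signal(treated) - log2_signal(control)
print(round(diff, 3))
```

A halving gives -1 in the same way, which is why log2 differences are easy to
read as fold changes.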
MAS5 data are normalized after the fact, so you should log transform and
then look at plots of the densities to see whether they look as if they
have been normalized. The default is to do a scale normalization, so
you are just looking for the densities to be in the same general vicinity
rather than lying on top of each other.
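
Density plots are the natural tool for this check. As a rough numeric
stand-in, the sketch below (Python, for illustration only) compares the
quartiles of the log2 values per array. The array names, the signal values and
the `floor` argument are all hypothetical; the third array is deliberately
about twice as bright as the others to show what an unscaled array might look
like.

```python
import math
import statistics

def log2_array(signals, floor=1.0):
    """Log2-transform one array; `floor` is an assumed guard against zeros."""
    return [math.log2(max(s, floor)) for s in signals]

# Hypothetical signal values for three arrays (not real data).
arrays = {
    "control_1": [181.0, 227.3, 499.2, 35.0, 1200.0, 8800.0],
    "control_2": [175.0, 240.1, 510.7, 31.0, 1100.0, 9000.0],
    "treated_1": [360.0, 450.0, 1000.0, 70.0, 2400.0, 17000.0],
}

medians = {}
for name, signals in arrays.items():
    # Quartiles of the logged values summarize where the density sits.
    q1, med, q3 = statistics.quantiles(log2_array(signals), n=4)
    medians[name] = med
    print(f"{name}: Q1={q1:.2f} median={med:.2f} Q3={q3:.2f}")

# If the arrays were scaled to a common target, the medians should be close.
spread = max(medians.values()) - min(medians.values())
print(f"range of medians: {spread:.2f} log2 units")
```

Here the two control arrays sit close together while the third is shifted by
about one log2 unit, which is the kind of offset that would suggest the arrays
have not been scaled to a common level.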
If you could get the original CEL files, you would be much better off.
Best,
Jim
michael watson (IAH-C) wrote:
> This is definitely processed data, and without access to the original data or a description of the analysis methodology, your options are limited.
>
> Personally, I'd do a test for normality on the "Signal" values, and if they turn out to be normal, I'd run a simple t-test (control vs treatment) on each gene and correct the p-values for multiple testing.
>
> Simple stuff, but it should work.
> ________________________________________
> From: bioconductor-bounces at stat.math.ethz.ch [bioconductor-bounces at stat.math.ethz.ch] On Behalf Of avehna [avhena at gmail.com]
> Sent: 16 February 2010 19:47
> To: bioconductor at stat.math.ethz.ch
> Subject: [BioC] DNA micro-array normalization
>
> Hi There,
>
> I've got a DNA microarray dataset that looks like this:
>
> Probe           Signal  Detection  Detection_p-value  Descriptions
> AFFX-BioB-5_at  181     P          0.00011            "E. coli GEN=bioB DB_XREF=gb:J04423.1"
> AFFX-BioB-M_at  227.3   P          0.000044           "E. coli GEN=bioB DB_XREF=gb:J04423.1"
> AFFX-BioC-5_at  499.2   P          0.000052           "E. coli GEN=bioC DB_XREF=gb:J04423.1"
>
> I have control and treatment groups, with 3 replicates each.
>
> But I'm not sure whether these data have already been normalized, and on the
> other hand, this is not the typical Affymetrix format...
>
> Could you help me in this regard? What is the typical signal range for raw
> Affymetrix data? (These data range from 0 to 9000.)
>
> If the data have already been normalized, can I calculate the mean (for
> treatment and control) followed by the differential expression of genes
> without taking into account the "Detection" column?
>
> (I guess I will need to build my ExpressionSet from scratch)
>
> Thanks a lot (I'm a newbie in Bioconductor and microarray analysis). I will
> appreciate your help!
>
> Avhena
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826