[BioC] DNA micro-array normalization
James W. MacDonald
jmacdon at med.umich.edu
Wed Feb 17 17:42:19 CET 2010
Hi Avehna,
Please don't take conversations off list. The list is considered to be a
resource that people in your situation can use in the future to answer
questions themselves.
avehna wrote:
> On Tue, Feb 16, 2010 at 3:50 PM, James W. MacDonald
> <jmacdon at med.umich.edu> wrote:
>
> To add to this: these data are almost surely MAS5-processed data, as
> I don't know of any other algorithm that gives the detection
> p-value. In addition, the range of 0 - 9000 indicates that these
> data are not logged (which is the next step for you). People
> normally use log base 2 so that a difference of 1 or -1 indicates
> two-fold up or down regulation.
>
>
> OK. But in this case what would be the reference point? Wouldn't the
> up or down regulation be relative to the control? Before writing to the
> list I browsed several tutorials and I'm still missing this part. Should
> it be log2(treatment/control)? (It's not clear from what I have read.)
Yes. Or, once you have taken logs, it will be the difference of the group
means, mean(log2(treatment)) - mean(log2(control)), which you will notice
is the numerator of your t-statistic.
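
A minimal sketch of that arithmetic, in Python purely for illustration.
The Signal values, the function names, and the choice of an ordinary
pooled-variance two-sample t-statistic are all assumptions made for the
example, not anything taken from your data. It also assumes every Signal
is strictly positive, since log2(0) is undefined.

```python
import math
from statistics import mean, variance


def log2_values(signals):
    """Log2-transform raw (unlogged) Signal values; all must be > 0."""
    return [math.log2(s) for s in signals]


def log2_fold_change(treatment, control):
    """Difference of the group means on the log2 scale."""
    return mean(log2_values(treatment)) - mean(log2_values(control))


def t_statistic(treatment, control):
    """Pooled-variance two-sample t-statistic computed on log2 values."""
    t = log2_values(treatment)
    c = log2_values(control)
    n_t, n_c = len(t), len(c)
    pooled_var = ((n_t - 1) * variance(t) + (n_c - 1) * variance(c)) / (
        n_t + n_c - 2
    )
    std_err = math.sqrt(pooled_var * (1 / n_t + 1 / n_c))
    # The numerator is exactly the log2 fold change.
    return (mean(t) - mean(c)) / std_err


# Hypothetical Signal values for one probe set, 3 replicates per group.
control = [200.0, 220.0, 180.0]
treatment = [400.0, 440.0, 360.0]  # each replicate is twice its control

lfc = log2_fold_change(treatment, control)
t_stat = t_statistic(treatment, control)
```

Here every treatment value is double the matching control value, so the
log2 fold change comes out as 1 (up to floating-point error), i.e.
two-fold up regulation.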
>
>
> MAS5 data are normalized after the fact, so you should log transform
> and then look at plots of the densities to see if they look as if
> they have been normalized or not. The default is to do a scale
> normalization, so you are just looking for the densities to be in
> the same general vicinity rather than overlaying each other.
>
>
> OK. Could you send me some helpful references about that?
http://media.affymetrix.com/support/technical/whitepapers/sadd_whitepaper.pdf
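
To make the "same general vicinity" check a bit more concrete, here is a
rough sketch, again in Python purely for illustration. It assumes a scale
normalization in which each array is multiplied by a constant so that its
trimmed mean hits a common target. The 2% trim, the target of 500, the
function names, and the toy numbers are all assumptions for the example.
Comparing medians of the log2 values is only a crude stand-in for looking
at the density plots.

```python
import math
from statistics import mean, median


def trimmed_mean(values, trim=0.02):
    """Mean after dropping the lowest and highest `trim` fraction."""
    ordered = sorted(values)
    k = int(len(ordered) * trim)
    kept = ordered[k:len(ordered) - k] if k > 0 else ordered
    return mean(kept)


def scale_normalize(values, target=500.0, trim=0.02):
    """Multiply one array by a constant so its trimmed mean equals target."""
    factor = target / trimmed_mean(values, trim)
    return [v * factor for v in values]


def log2_median(values):
    """Median of the log2-transformed values (values must be > 0)."""
    return median([math.log2(v) for v in values])


# Two hypothetical arrays; array_b is array_a scanned three times brighter.
array_a = [50.0, 100.0, 200.0, 400.0, 800.0]
array_b = [3.0 * v for v in array_a]

# Gap between the centers of the two log2 distributions.
gap_before = abs(log2_median(array_a) - log2_median(array_b))
gap_after = abs(
    log2_median(scale_normalize(array_a))
    - log2_median(scale_normalize(array_b))
)
```

If the arrays have already been scaled to a common target, the gap
between their log2 medians should be small, as it is for `gap_after`
here. A large gap, as for `gap_before`, suggests they have not been
normalized.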
Best,
Jim
>
>
>
> If you could get the original celfiles, you would be much better off.
>
>
> I will try!
>
> Best and thank you so much for your help,
>
> Avhena.
>
> Best,
>
> Jim
>
>
>
>
> michael watson (IAH-C) wrote:
>
> This is definitely processed data, and without access to the
> original data or a description of the analysis methodology, your
> options are limited.
>
> Personally, I'd do a test for normality on the "Signal" values,
> and if they turn out to be normal, I'd run a simple t-test
> (control vs treatment) on each gene and correct the p-values for
> multiple testing.
>
> Simple stuff, but it should work.
> ________________________________________
> From: bioconductor-bounces at stat.math.ethz.ch
> [bioconductor-bounces at stat.math.ethz.ch] On Behalf Of
> avehna [avhena at gmail.com]
> Sent: 16 February 2010 19:47
> To: bioconductor at stat.math.ethz.ch
> Subject: [BioC] DNA micro-array normalization
>
> Hi There,
>
> I've got a DNA microarray dataset that looks like this:
>
> Probe            Signal  Detection  Detection_p-value  Descriptions
> AFFX-BioB-5_at   181     P          0.00011            "E. coli GEN=bioB DB_XREF=gb:J04423.1"
> AFFX-BioB-M_at   227.3   P          0.000044           "E. coli GEN=bioB DB_XREF=gb:J04423.1"
> AFFX-BioC-5_at   499.2   P          0.000052           "E. coli GEN=bioC DB_XREF=gb:J04423.1"
>
> I have control and treatment with 3 replicates for each one of them.
>
> But I'm not sure whether these data have already been normalized, and
> on the other hand, this is not the typical Affymetrix format...
>
> Could you help me in this regard? What is the typical signal range for
> raw Affymetrix data? (These data range from 0 to 9000.)
>
> If the data have already been normalized, can I calculate the mean (for
> treatment and control) followed by the differential expression of genes
> without taking into account the "Detection" column?
>
> (I guess I will need to build my ExpressionSet from scratch.)
>
> Thanks a lot (I'm a newbie in Bioconductor and microarray analysis). I
> will appreciate your help!
>
> Avhena
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> <mailto:Bioconductor at stat.math.ethz.ch>
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> Douglas Lab
> University of Michigan
> Department of Human Genetics
> 5912 Buhl
> 1241 E. Catherine St.
> Ann Arbor MI 48109-5618
> 734-615-7826
> **********************************************************
> Electronic Mail is not secure, may not be read every day, and should
> not be used for urgent or sensitive issues
>
>
--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues