[BioC] How to analyze Affy data, CEL files not available
James W. MacDonald
jmacdon at med.umich.edu
Wed Feb 7 19:02:52 CET 2007
Hi Bobby,
Bobby Prill wrote:
> I would like to analyze a set of 40 Affy experiments, but I do not
> have the CEL files. What I have is a spreadsheet of the MAS
> expression measures, one column per array. Each row corresponds to
> one gene.
>
> I load the data:
> eset = read.exprSet(exprs="mydata.txt", phenoData="phenoData.txt")
>
> My general question is, should/can I perform some sort of
> normalization so that the arrays are comparable from one to
> another? or is this what MAS has already done? (I'm not familiar
> with Affy MAS.)
>
> Other problems include:
>
> 1. MA plots indicate that the data cloud is skewed (not perfectly
> centered on M==0 line). Should I loess?
Almost certainly not. A loess normalization is almost always an
intra-array normalization for two-color spotted cDNA microarrays,
rather than something useful for single-channel Affy chips. I would
look at a boxplot of
the data to see if the samples tend to line up. MAS5.0 usually ends up
doing a scaling and centering of the data, so you will likely see boxes
with fairly equal medians and inter-quartile ranges.
I suppose you could do a quantile normalization at this point, but that
might be neither necessary nor a good idea.
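
A minimal sketch of the boxplot check described above, using base R
only. The matrix `mas` is simulated stand-in data (hypothetical), and
taking log2 before plotting is an assumption, since MAS 5.0 signals
are usually reported on the raw scale. The median-centering and
quantile-normalization steps are illustrative; the latter is a plain
base-R version, not the implementation from any Bioconductor package.

```r
# Simulated stand-in for a MAS 5.0 signal table: 1000 probesets x 4 arrays.
# (Hypothetical data; replace with the matrix read from your spreadsheet.)
set.seed(1)
mas <- matrix(2^rnorm(4000, mean = 8, sd = 2), nrow = 1000,
              dimnames = list(paste0("ps", 1:1000), paste0("array", 1:4)))

# Work on the log2 scale (assumes the values are raw-scale and positive).
lg <- log2(mas)

# One box per array: medians and IQRs should be roughly aligned.
boxplot(as.data.frame(lg), ylab = "log2 signal")

# Numerical version of the same check.
med <- apply(lg, 2, median)
iqr <- apply(lg, 2, IQR)
print(round(rbind(median = med, IQR = iqr), 2))

# If the medians differ noticeably, subtracting each array's median
# is a simple scaling step on the log scale.
centered <- sweep(lg, 2, med)

# Plain quantile normalization: give every array the same distribution,
# namely the mean of the sorted columns. Ties are broken by order of
# appearance.
quantile_normalize <- function(x) {
  ranks  <- apply(x, 2, rank, ties.method = "first")
  target <- rowMeans(apply(x, 2, sort))
  out <- apply(ranks, 2, function(r) target[r])
  dimnames(out) <- dimnames(x)
  out
}
qn <- quantile_normalize(lg)
```

If the boxes already line up, `centered` and `qn` will differ little
from `lg`, which is one way to judge whether the extra step is worth
doing at all.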
>
> 2. Also, the M values have high variance at low A, which I think is
> a byproduct of the MAS. Probably nothing I can do about this.
Nope.
>
> I think the typical advice would be to obtain CEL files and run
> rma(). But if I'm stuck with the MAS expression calls, what to do?
I would make sure the boxplots line up reasonably well, then go on to
higher level analyses. If you have the P/M/A calls you can filter out
the probesets that are called 'absent', or use one of the various
options in the genefilter package.
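
Filtering on the detection calls can be sketched in base R as below.
The matrices `expr` and `calls` are tiny made-up examples, and the
threshold (called 'P' in at least 2 arrays) is an arbitrary assumption
for illustration, not a recommendation. This is a base-R stand-in and
does not use the genefilter package.

```r
# Toy log2 expression values and matching MAS 5.0 detection calls
# (hypothetical data: 5 probesets x 4 arrays, same row/column order).
expr <- matrix(c(7.1, 7.3, 6.9, 7.0,
                 3.2, 2.9, 3.5, 3.1,
                 9.8, 9.6, 9.9, 9.7,
                 4.0, 6.2, 4.1, 6.5,
                 2.5, 2.7, 5.9, 2.6),
               nrow = 5, byrow = TRUE,
               dimnames = list(paste0("ps", 1:5), paste0("array", 1:4)))
calls <- matrix(c("P", "P", "P", "P",
                  "A", "A", "A", "A",
                  "P", "P", "P", "P",
                  "A", "P", "M", "P",
                  "A", "A", "P", "A"),
                nrow = 5, byrow = TRUE, dimnames = dimnames(expr))

# Keep a probeset if it is called present in at least min_present arrays.
min_present <- 2   # arbitrary threshold chosen for illustration
n_present <- rowSums(calls == "P")
keep <- n_present >= min_present

filtered <- expr[keep, , drop = FALSE]
print(rownames(filtered))
```

With 40 arrays a sensible `min_present` depends on the design, for
example the size of the smallest experimental group.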
HTH,
Jim
>
> Thanks.
>
> - Bobby
--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623