[BioC] Oligo package
bcarvalh at jhsph.edu
Mon Oct 5 16:30:42 CEST 2009
For the record, since I have just replied to this very same message, which
was also sent to me privately, here are some notes that refer back to our
previous communications:
1) This array is 1050 x 1050, therefore 1,102,500 features. The ~700K
distinct probes you refer to are properly annotated in the
pd.hugene.1.0.st.v1 packages (versions 2.4.1 and 3.0.0);
2) Summarization to the gene-level is possible using the devel-version
of the packages;
3) The summaries you're getting are at the probeset-level, as defined
by the PGF file;
I failed to mention that, on the chip, there are "things" other than
the experimental probes most people are interested in.
Hence the difference (700K vs 1M). The oligo package reads them
all, but that does not mean that all of them are used when preprocessing.
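
As a quick sanity check, the counts mentioned in this thread can be put side by side in base R. This is only arithmetic on the figures quoted here and in Thibault's message below; the variable names are made up for illustration, and nothing is read from the array or the annotation package.

```r
# Figures quoted in this thread (no data is read from any CEL or PGF file).
features_on_chip <- 1050 * 1050  # array layout: 1050 x 1050
documented_probes <- 764885      # distinct probes per the Affymetrix documentation
rma_features <- 253002           # features returned by rma() in the report below
documented_genes <- 28869        # genes per the Affymetrix documentation

# Features on the chip that are not among the documented distinct probes.
other_features <- features_on_chip - documented_probes

# Ratio behind the "roughly 4:1" observation in the report below.
features_per_rma_row <- features_on_chip / rma_features

# Documented probes per documented gene, for comparison with the
# median of 26 probes per probeset mentioned in the report below.
probes_per_gene <- documented_probes / documented_genes

cat("Features on chip:         ", features_on_chip, "\n")
cat("Other (non-probe) features:", other_features, "\n")
cat("Features per rma() row:   ", round(features_per_rma_row, 2), "\n")
cat("Probes per gene:          ", round(probes_per_gene, 2), "\n")
```
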
So, Thibault, from what you report, I understand you really want to
use the packages that are, right now, on the devel-branch (oligo and
friends, plus annotation package)... or just wait for the release,
which is coming up soon. By the way, when working on the code
currently available under devel, I checked the results against those
provided by the Affymetrix tool, and they were very consistent; so
please let me know if you find something that does not agree with
their tool (assuming the use of RMA) and I'll address that promptly.
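
For readers following along, here is a minimal sketch of what the two summarization levels might look like in code. It assumes a version of oligo in which rma() accepts a `target` argument for Gene ST arrays ("probeset" for PGF-defined probesets, "core" for gene-level summaries), as later releases do; whether the devel version discussed here uses exactly this signature is an assumption. The function name `summarize_hugene` and the `cel_dir` argument are hypothetical.

```r
# Sketch only: assumes oligo (which attaches oligoClasses) and the matching
# pd.hugene.1.0.st.v1 annotation package are installed, and that `cel_dir`
# (hypothetical) is a directory containing Human Gene 1.0 ST CEL files.
summarize_hugene <- function(cel_dir, target = c("core", "probeset")) {
  target <- match.arg(target)
  library(oligo)

  # Read every CEL file found in the directory.
  cel_files <- list.celfiles(cel_dir, full.names = TRUE)
  raw <- read.celfiles(cel_files)

  # target = "core" is assumed to give gene-level summaries;
  # target = "probeset" is assumed to give the PGF-defined probesets.
  rma(raw, target = target)
}
```

Calling `summarize_hugene("path/to/cel", "probeset")` and `summarize_hugene("path/to/cel", "core")` and comparing `dim()` of the two results would be one way to see which level corresponds to the 253,002 features reported below.
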
On Oct 5, 2009, at 10:50 AM, Thibault Helleputte wrote:
> I use R version 2.9.2 under MacOSX Tiger, and the oligo (1.8.3),
> oligoClasses (1.6.0) and pd.hugene.1.0.st.v1 (2.4.1) packages. I
> imported 20 Human Gene 1.0 ST CEL files into R, and I summarized them
> with rma(). I then have several concerns:
> Once the CEL files are read, I have an oligo object with 1,102,500
> features, and not the 764,885 distinct probes mentioned in the
> Affymetrix documentation.
> Once this R object is summarized via the rma() function, I get 253,002
> features, instead of the 28,869 genes mentioned by Affymetrix. That
> suggests that only 4 probes on average are included in each probeset
> (roughly a 4:1 ratio between probes and summarized probesets). The
> median number of probes per probeset is supposed to be 26 with that
> specific technology.
> Does someone have an explanation or a comment on this issue?
> Many thanks.