[BioC] Analysing Human Gene ST 1.0 Arrays with oligo and oneChannelGUI yield different number of probesets
bcarvalh at jhsph.edu
Thu Oct 29 23:49:41 CET 2009
That makes me think that I forgot one 'svn commit' sometime in the
past... Apologies for that.
In the meantime, please use the following description.
Until BioC 2.4, oligo summarized only to the probeset level (as
defined in the PGF file). Affymetrix made available meta-probeset
files (MPS) that define "new probesets", which allow summarization at
the gene level. For exon arrays, there are 3 MPSs (depending on the
quality): core (best), extended and full. For gene arrays, there's
only the "core" MPS.
Therefore, summaries to the gene level should use this additional
information.
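To make the difference in row counts concrete, here is a toy sketch in base R. Everything in it is made up for illustration: the probeset and meta-probeset IDs, the intensities, and the use of a plain median as a stand-in for the RMA summarization step. It only shows that pooling probesets into meta-probesets yields fewer rows than summarizing each probeset on its own.

```r
# Toy data only: the IDs and intensities below are made up for illustration.
probes <- data.frame(
  probeset  = c("ps1", "ps1", "ps2", "ps2", "ps3", "ps3"),
  intensity = c(7.1, 7.3, 6.8, 7.0, 9.2, 9.4)
)

# Hypothetical meta-probeset table: which probesets form each gene-level unit.
mps <- data.frame(
  meta_probeset = c("geneA", "geneA", "geneB"),
  probeset      = c("ps1", "ps2", "ps3")
)

# Probeset-level summary: one value per probeset.
# A plain median stands in for RMA's summarization step here.
by_probeset <- tapply(probes$intensity, probes$probeset, median)

# Gene-level summary: pool the probes of all probesets in a meta-probeset.
pooled  <- merge(probes, mps, by = "probeset")
by_gene <- tapply(pooled$intensity, pooled$meta_probeset, median)

length(by_probeset)  # number of probeset-level rows
length(by_gene)      # number of gene-level rows
```

The same data give one row per probeset in the first summary and one row per meta-probeset in the second, which is the kind of gap seen between the two counts reported in this thread.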
Using the 'target' argument, you can set the level at which the
summarization is done: "probeset", "core", "extended" and "full" are
the possible values (this is available starting with BioC 2.5).
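A minimal sketch of how the levels could be compared side by side. The helper names valid_targets() and summarize_levels() are hypothetical (they are not part of oligo), and 'raw' is assumed to be the Gene/Exon FeatureSet returned by read.celfiles(). The only oligo call used is rma() with the 'target' argument described above.

```r
# Values of 'target' per array type, as described in this message.
# (Hypothetical helper, not part of oligo.)
valid_targets <- function(array_type = c("gene", "exon")) {
  array_type <- match.arg(array_type)
  switch(array_type,
         gene = c("probeset", "core"),
         exon = c("probeset", "core", "extended", "full"))
}

# Run RMA once per summarization level and return a named list of results.
# 'raw' is assumed to come from oligo::read.celfiles().
summarize_levels <- function(raw, array_type = "gene") {
  stopifnot(requireNamespace("oligo", quietly = TRUE))
  targets <- valid_targets(array_type)
  results <- lapply(targets, function(tg) oligo::rma(raw, target = tg))
  names(results) <- targets
  results
}
```

With this, something like sapply(summarize_levels(raw), nrow) would report the number of rows at each level. For the Human Gene ST data discussed in this thread, the counts reported were 253002 at the probeset level and 33297 at the "core" level.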
I'll make sure the documentation is updated soon to reflect this change.
Once again, apologies.
On Oct 29, 2009, at 8:21 PM, Javier Pérez Florido wrote:
> Dear Benilton,
> Thanks for your quick reply. Now, it works with the target argument.
> However, I searched on the web for the meaning of this argument and
> couldn't find anything. What is "target" for?
> Why does oligo's manual say: "The ExpressionSet returned when either
> Exon/Gene-FeatureSet objects are passed contain extra annotation on
> featureData slot that the user should take into account for
> exon/gene-level analyses"?
> I haven't worked with Human Gene ST arrays before, so I am quite new to
> this.
> Thanks again,
> Benilton Carvalho escribió:
>> Dear Javier,
>> You have not provided the exact call to RMA you used nor your
>> sessionInfo() information.
>> If you're using the latest oligo (BioC 2.5), you can call:
>> results = rma(object, target="core")
>> to get the 33297 "probesets" you refer to...
>> Note that building the package yourself is a nice exercise, but you
>> could just download it via biocLite().
>> On Oct 29, 2009, at 5:42 PM, Javier Pérez Florido wrote:
>>> Dear list,
>>> Some time ago I analysed a set of Human Gene ST Arrays with
>>> oneChannelGUI. Now I'm trying to reproduce the results using the oligo
>>> package, but I am quite surprised by the results obtained. With the
>>> oligo package, after preprocessing with rma, the number of probesets is
>>> 253002, while with oneChannelGUI the number of probesets is 33297, and
>>> the CEL files are the same!!!
>>> For the oligo package, and prior to reading the CEL files, I had to
>>> build the annotation package using pdInfoBuilder, since the CDF file
>>> is not supported by Affymetrix. For this purpose, I first had to
>>> download the library files "Human Gene 1.0 ST Array, Analysis" from the
>>> Affymetrix website. The necessary files for building the package are:
>>> HuGene-1_0-st-v1.na29.hg18.probeset (CSV file)
>>> Then, I executed the following commands:
>>> library(pdInfoBuilder)
>>> baseDir <- "pathWhereTheFilesAre"
>>> # Locate the library files (escaped dots so the patterns match extensions)
>>> (pgf <- list.files(baseDir, pattern = "\\.pgf$", full.names = TRUE))
>>> (clf <- list.files(baseDir, pattern = "\\.clf$", full.names = TRUE))
>>> (prob <- list.files(baseDir, pattern = "\\.probeset\\.csv$", full.names = TRUE))
>>> # Describe the annotation package to build
>>> seed <- new("AffyGenePDInfoPkgSeed", pgfFile = pgf, clfFile = clf,
>>>             probeFile = prob, author = "Javier", email = "email",
>>>             biocViews = "AnnotationData", genomebuild = "NCBI Build 36",
>>>             organism = "Human", species = "Homo Sapiens", url = "")
>>> makePdInfoPackage(seed, destDir = ".")
>>> And I installed the package:
>>> R CMD INSTALL pd.hugene.1.0.st.v1\
>>> The package was installed OK and I read and preprocessed the CEL files
>>> using RMA, but the number of probesets is 253002!!!! So many
>>> compared to the ones given by oneChannelGUI.
>>> Any comments on such a big difference??