[Bioc-devel] idempotent identifier mapping with GSEABase::mapIdentifiers()
Robert Castelo
robert.castelo at upf.edu
Mon Feb 27 18:08:05 CET 2012
thanks Vincent,
what you suggest fixes the situation temporarily and hopefully, as
Martin said in his message before, this can have a more generic
solution.
your suggestion makes me think that, in fact, it could be of general
interest to add a *ENTREZID (identity) map for every entrez-based
organism-level annotation package. i think this could be useful in every
situation in which one would like to programmatically retrieve the
entrez id of a feature using any annotation package without knowing
whether the feature is already an entrez id.
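
a minimal sketch of that idea, using plain R environments as stand-ins for annotation maps. the names symbol2entrez, entrez2entrez and to_entrez() are hypothetical and only for illustration; they are not part of org.Hs.eg.db, AnnotationDbi or GSEABase, and the three genes are a toy subset:

```r
# Toy stand-in for a SYMBOL -> ENTREZID map (hypothetical, base R only).
symbol2entrez <- list2env(list(A1BG = "1", A2M = "2", CDH2 = "1000"))

# Identity map: every known Entrez ID maps to itself.
entrez_ids <- unlist(mget(ls(symbol2entrez), envir = symbol2entrez),
                     use.names = FALSE)
entrez2entrez <- new.env(hash = TRUE)
for (id in entrez_ids) assign(id, id, envir = entrez2entrez)

# Hypothetical helper: return the Entrez ID of each feature, whether the
# feature is a symbol or already an Entrez ID; unknown keys give NA.
to_entrez <- function(x) {
  vapply(x, function(key) {
    if (exists(key, envir = entrez2entrez, inherits = FALSE)) {
      get(key, envir = entrez2entrez, inherits = FALSE)
    } else if (exists(key, envir = symbol2entrez, inherits = FALSE)) {
      get(key, envir = symbol2entrez, inherits = FALSE)
    } else {
      NA_character_
    }
  }, character(1), USE.NAMES = FALSE)
}

to_entrez(c("CDH2", "1000"))
# [1] "1000" "1000"
```

with the identity map in place, applying the lookup twice gives the same result as applying it once, which is the idempotent behavior i was expecting.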
robert.
On Mon, 2012-02-27 at 07:45 -0500, Vincent Carey wrote:
> I have run into a very similar situation. Ultimately a uniformization
> of the annotation API will be called for.
> I wonder if a global short-term fixup would get you through this
> situation?
>
> > org.Hs.egENTREZID = new.env(hash=TRUE)
> > k = mappedkeys(org.Hs.egENSEMBL) # or any other good source of all keys
> > for (i in seq_along(k)) assign(k[i], k[i], envir = org.Hs.egENTREZID)
> > get("1000", org.Hs.egENTREZID)
> [1] "1000"
>
>
> On Mon, Feb 27, 2012 at 6:25 AM, Robert Castelo
> <robert.castelo at upf.edu> wrote:
> hi,
>
> i help maintain the packages GSVA and GSVAdata and i have a
> question about the function mapIdentifiers() from the GSEABase
> package, which i'm going to illustrate through an example.
>
>
> 1. let's first build an ExpressionSet object whose annotation slot
> points to the human organism-level annotation package org.Hs.eg.db:
>
> library(Biobase)
> library(org.Hs.eg.db)
>
> mapped_genes <- mappedkeys(org.Hs.egSYMBOL)
>
> exp <- matrix(rnorm(1000), nrow=100,
> dimnames=list(mapped_genes[1:100],
> paste("sample", 1:10, sep="")))
> eset <- new("ExpressionSet", exprs=exp,
> annotation="org.Hs.eg.db")
> ExpressionSet (storageMode: lockedEnvironment)
> assayData: 100 features, 10 samples
> element names: exprs
> protocolData: none
> phenoData: none
> featureData: none
> experimentData: use 'experimentData(object)'
> Annotation: org.Hs.eg.db
>
> 2. now i'm going to load the Broad gene sets stored as a
> GeneSetCollection object in the experimental data package
> GSVAdata:
>
> library(GSVAdata)
> data(c2BroadSets)
> c2BroadSets
> GeneSetCollection
> names: NAKAMURA_CANCER_MICROENVIRONMENT_UP,
> NAKAMURA_CANCER_MICROENVIRONMENT_DN, ...,
> ST_PHOSPHOINOSITIDE_3_KINASE_PATHWAY (3272 total)
> unique identifiers: 5167, 100288400, ..., 57191 (29340 total)
> types in collection:
> geneIdType: EntrezIdentifier (1 total)
> collectionType: BroadCollection (1 total)
>
>
> 3. finally, i'd like to obtain a new GeneSetCollection object whose
> identifiers have been mapped from the identifier class of the
> GeneSetCollection to that of the ExpressionSet.
>
> in this case both objects actually use the same class of identifiers
> (Entrez), so in fact i don't need to do that, but this operation is
> part of a piece of code in the package GSVA which i'd like to work
> regardless of the kind of annotation package referred to in the
> ExpressionSet object. i had expected that the function
> mapIdentifiers() would have some kind of idempotent behavior, but i
> get the following error:
>
> gsc <- mapIdentifiers(c2BroadSets,
> AnnotationIdentifier(annotation(eset)))
> Error in GeneSetCollection(lapply(what, mapIdentifiers, to, ..., verbose = verbose)) :
>   error in evaluating the argument 'object' in selecting a method for function 'GeneSetCollection': Error in get(mapName, envir = pkgEnv, inherits = FALSE) :
>   object 'org.Hs.egENTREZID' not found
>
>
> which does not occur if the feature names and annotation of the
> ExpressionSet correspond to a classical affy chip (e.g. "hgu95av2").
>
> i built the object c2BroadSets in the experiment data package
> GSVAdata by importing the entire xml file from the Broad sets, so i
> guess it is also possible that i did something wrong when i built
> this 'c2BroadSets' object and there's no problem, bug or missing
> feature in mapIdentifiers().
>
> i look forward to your diagnosis and suggestions in any of these
> possible directions.
>
>
> thanks,
> robert.
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>