[BioC] GSVA: using Entrez ID's as identifiers
Robert Castelo
robert.castelo at upf.edu
Tue Nov 15 08:28:53 CET 2011
hi Wendy,
i'm afraid you need to get a little bit acquainted with the way in which
annotations are handled in BioC. a good starting point could be looking
at the vignette "AnnotationDbi: How to use the .db annotation packages"
from the AnnotationDbi package.
the short answer to your problem is that hgu95a is not the only platform
for which annotations exist in BioC. basically there is an annotation
package for each platform supported by BioC (you can look all of them up
by going to http://www.bioconductor.org/packages/release/BiocViews.html
and clicking on "AnnotationData"), but in order to use one of these
annotation packages you need to:
1. install it once on your system via source() and biocLite(), just as
with every software package
2. load it via the library() function.
in order to use the human organism-level package i mentioned in my
previous email you need to install it first and then load it before doing
anything else with it.
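
the two steps above can be sketched roughly as follows. this is only a minimal sketch under some assumptions: annotation_ready() is a hypothetical helper written for this example (it is not part of any BioC package), and the install hint uses BiocManager::install(), which replaced the source()/biocLite() route in later BioC releases.

```r
# Hypothetical helper (not part of any BioC package): check that an
# annotation package is installed and, if so, attach it for this session.
annotation_ready <- function(pkg) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
        # step 1 has not been done yet: show how to install the package once
        message("package '", pkg, "' is not installed; install it once with:\n",
                "  BiocManager::install(\"", pkg, "\")")
        return(FALSE)
    }
    # step 2: attach the package so that its annotation maps can be found
    library(pkg, character.only = TRUE)
    TRUE
}

# the human organism-level package discussed in this thread
if (annotation_ready("org.Hs.eg.db")) {
    message("org.Hs.eg.db is loaded")
}
```

once this reports that org.Hs.eg.db is loaded, annotation(eset) <- "org.Hs.eg.db" followed by the gsva() call should be able to find the annotation maps.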
let me know if this still does not solve your problem.
cheers,
robert.
On Mon, 2011-11-14 at 18:40 -0500, Wendy Qiao wrote:
> Hi Robert,
>
> Thank you for your reply. I happened to convert all the genes to
> hgu95a probe IDs, as I found that this is the only platform that works
> with ExpressionSet. It would be great if we could make the Entrez IDs
> work. Below is the error that I got with your code.
>
>
> Thank you.
> Wendy
>
>
> > BcellSet
> ExpressionSet (storageMode: lockedEnvironment)
> assayData: 12148 features, 7 samples
> element names: exprs
> protocolData: none
> phenoData
> sampleNames: Illumi_PREBCEL_1 Illumi_PREBCEL_2 ... Affy_PREBCEL_4 (7
> total)
> varLabels: CellType Platform Replicates
> varMetadata: labelDescription
> featureData: none
> experimentData: use 'experimentData(object)'
> Annotation: org.Hs.eg.db
> > preBcell.KEGG<-gsva(BcellSet,KEGGc2BroadSets,abs.ranking=FALSE)$es.obs
> Mapping identifiers between gene sets and feature names
> Error in GeneSetCollection(lapply(what, mapIdentifiers, to, ...,
> verbose = verbose)) :
> error in evaluating the argument 'object' in selecting a method for
> function 'GeneSetCollection': Error in get(mapName, envir = pkgEnv,
> inherits = FALSE) :
> object 'org.Hs.egENTREZID' not found
>
>
>
>
> On 14 November 2011 12:27, Robert Castelo <robert.castelo at upf.edu>
> wrote:
> hi Wendy,
>
> sorry for my late answer. in principle there is no problem for the
> gsva() function to take Entrez IDs in your expression data matrix.
>
> if the expression data comes as a matrix, and rows are annotated with
> Entrez IDs and the gene sets are also annotated with Entrez IDs, there
> should be absolutely no problem.
>
> if the expression data comes as an ExpressionSet object where the
> 'features' are not Affy probe IDs but just Entrez IDs, just make sure
> that the annotation slot has the corresponding organism-level package.
> for instance, in the case of human:
>
> annotation(eset) <- "org.Hs.eg.db"
>
> let me know if you have any problem with this.
>
> cheers,
> robert.
>
> On Fri, 2011-11-11 at 14:44 -0500, Wendy Qiao wrote:
> > Hi all,
> >
> > I am using the GSVA package for some analysis. I found that the package
> > only takes the gene expression matrix annotated with Affymetrix probe IDs,
> > although the gene set collection is made of Entrez IDs. I imagine there is a
> > step in the package for converting the Affymetrix probe IDs to Entrez IDs.
> > As my data are from the Illumina platform, I am wondering if an expression
> > matrix annotated with Entrez IDs can be used directly.
> >
> > Thank you,
> > Wendy
> >
>
> > [[alternative HTML version deleted]]
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at r-project.org
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
> >
>
>
>
>