[BioC] GSVA: using Entrez ID's as identifiers
Robert Castelo
robert.castelo at upf.edu
Wed Nov 16 22:04:06 CET 2011
hi Som,
i'm cc'ing the BioC mailing list, please remember to do it when you
answer since this works as a knowledge base for everyone else.
i'd need two bits of information from you to find out what might be
happening: one, right after the error pops up, please write in the R shell:
traceback()
and paste here the output of this function.
two, please also paste here the output of
sessionInfo()
both steps are in fact recommended by the BioC mailing list posting guide:
http://www.bioconductor.org/help/mailing-list/posting-guide
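
for reference, a minimal sketch of both steps as they could be done in a non-interactive script. failing_step() is a hypothetical stand-in for the call that fails (it is not your gsva() call); it only triggers the same kind of match() error. in an interactive session it should be enough to type traceback() right after the error.

```r
# Hypothetical stand-in for the failing call: match() needs vector
# arguments, so passing a function as the table raises an error.
failing_step <- function(ids) match(ids, sum)

# Record the call stack at the moment of the error, so that it can be
# pasted into a reply even when the script runs non-interactively.
call_stack <- NULL
err_msg <- tryCatch(
  withCallingHandlers(
    failing_step(c("10", "100")),
    error = function(e) call_stack <<- sys.calls()
  ),
  error = function(e) conditionMessage(e)
)

print(err_msg)
print(call_stack)

# Session details: R version, platform, and attached package versions.
session_text <- capture.output(print(sessionInfo()))
cat(session_text, sep = "\n")
```

the printed call stack and the sessionInfo() text are the two pieces that would go into the reply.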
robert.
On 11/16/11 9:04 PM, somnath bandyopadhyay wrote:
> Hi Robert,
>
> I am trying to use GSVA on a microarray dataset and I am trying to use
> one of the Broad gene set collections for the enrichment purposes.
>
>
> library(GSEABase)
> library(Biobase)
> library(genefilter)
> library(limma)
> library(RColorBrewer)
> library(graph)
> library(GSVA)
>
> c3gsc2 <-
> getGmt("c2.cp.kegg.v3.0.entrez.gmt",collectionType=BroadCollection(category="c3"),geneIdType=EntrezIdentifier())
> class(c3gsc2)
> c3gsc2
>
> # the data matrix is filtered for low expressors etc., and I am
> # using Entrez Gene IDs as row identifiers.
> data <- read.table("gsva_infliximab_data.txt", header=TRUE, row.names=1,
> sep="\t")
> class(data)
> data.m <- as.matrix(data)
>
> new <- gsva(data.m,
> c3gsc2,abs.ranking=TRUE,min.sz=1,max.sz=Inf,no.bootstraps=0,bootstrap.percent
> = .632,parallel.sz=0,parallel.type="SOCK",verbose=TRUE,mx.diff=TRUE)
>
>
> I keep getting the following error at this step
> Error in match(x, y) : 'match' requires vector arguments
>
> Could you please tell me what I am doing wrong?
>
> Thanks so much,
> Som.
>
> > From: robert.castelo at upf.edu
> > To: kellert at ohsu.edu
> > Date: Wed, 16 Nov 2011 08:43:48 +0100
> > CC: wendy2.qiao at gmail.com; bioconductor at r-project.org
> > Subject: Re: [BioC] GSVA: using Entrez ID's as identifiers
> >
> > hi Tom,
> >
> > i'm a bit unsure what you are asking in relation to this thread,
> > but i guess you're interested in creating a custom annotation package.
> > For that purpose i'd recommend reading through the vignettes of the
> > AnnotationDbi package. i'm not an expert in creating custom annotation
> > packages, so if you run into problems i think you should
> > start a new thread with the specific question or problem you want to
> > solve.
> >
> > cheers,
> > robert.
> >
> > On Tue, 2011-11-15 at 14:24 -0800, Tom Keller wrote:
> > > Greetings,
> > > The annotation for the miRNA chip does not seem to have the same amount of
> > > information as the hgu95 db. Is there some help available for mapping miRNA
> > > probes to their target genes?
> > >
> > > thanks
> > > Thomas (Tom) Keller, PhD
> > > kellert at ohsu.edu
> > > 503.494.2442
> > > 6588 R Jones Hall (BSc/CROET)
> > > MMI DNA Services
> > > Member of OHSU Shared Resources
> > >
> > > On Nov 14, 2011, at 11:28 PM, Robert Castelo wrote:
> > >
> > > > hi Wendy,
> > > >
> > > > i'm afraid you need to get a little bit acquainted with the way in which
> > > > annotations are handled in BioC. a good starting point could be looking
> > > > at the vignette "AnnotationDbi: How to use the .db annotation packages"
> > > > from the AnnotationDbi package.
> > > >
> > > > the short answer to your problem is that hgu95a is not the only platform
> > > > for which annotations exist in BioC. basically there is an annotation
> > > > package for each platform supported by BioC (you can look all of them up
> > > > by going to http://www.bioconductor.org/packages/release/BiocViews.html
> > > > and clicking on "AnnotationData"), but in order to use one such annotation
> > > > package you need to
> > > >
> > > > 1. install it once in your system via source() and biocLite() just as
> > > > with every software package
> > > >
> > > > 2. load it via the library() function.
> > > >
> > > > in order to use the human organism-level package i mentioned in my
> > > > previous email you need to install it first and then load it before doing
> > > > anything else with it.
> > > >
> > > > let me know if this still does not solve your problem.
> > > >
> > > > cheers,
> > > > robert.
> > > >
> > > > On Mon, 2011-11-14 at 18:40 -0500, Wendy Qiao wrote:
> > > >> Hi Robert,
> > > >>
> > > > >> Thank you for your reply. I happened to convert all the genes to
> > > > >> hgu95a probe IDs as I found that this is the only platform that works
> > > > >> with ExpressionSet. It would be great if we could make the Entrez IDs
> > > > >> work. The following is the error that I got with your code.
> > > >>
> > > >>
> > > >> Thank you.
> > > >> Wendy
> > > >>
> > > >>
> > > >>> BcellSet
> > > >> ExpressionSet (storageMode: lockedEnvironment)
> > > >> assayData: 12148 features, 7 samples
> > > >> element names: exprs
> > > >> protocolData: none
> > > >> phenoData
> > > >> sampleNames: Illumi_PREBCEL_1 Illumi_PREBCEL_2 ... Affy_PREBCEL_4 (7
> > > >> total)
> > > >> varLabels: CellType Platform Replicates
> > > >> varMetadata: labelDescription
> > > >> featureData: none
> > > >> experimentData: use 'experimentData(object)'
> > > >> Annotation: org.Hs.eg.db
> > > >>>
> > > >>
> preBcell.KEGG<-gsva(BcellSet,KEGGc2BroadSets,abs.ranking=FALSE)$es.obs
> > > >> Mapping identifiers between gene sets and feature names
> > > >> Error in GeneSetCollection(lapply(what, mapIdentifiers, to, ...,
> > > >> verbose = verbose)) :
> > > >> error in evaluating the argument 'object' in selecting a method for
> > > >> function 'GeneSetCollection': Error in get(mapName, envir = pkgEnv,
> > > >> inherits = FALSE) :
> > > >> object 'org.Hs.egENTREZID' not found
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> On 14 November 2011 12:27, Robert Castelo <robert.castelo at upf.edu>
> > > >> wrote:
> > > >> hi Wendy,
> > > >>
> > > > >> sorry for my late answer. in principle there is no problem for the
> > > > >> gsva() function to take Entrez IDs in your expression data matrix.
> > > > >>
> > > > >> if the expression data comes as a matrix, and rows are annotated with
> > > > >> Entrez IDs and the gene sets are also annotated with Entrez IDs, there
> > > > >> should be absolutely no problem.
> > > > >>
> > > > >> if the expression data comes as an ExpressionSet object where the
> > > > >> 'features' are not Affy probe IDs but just Entrez IDs, just make sure
> > > > >> that the annotation slot has the corresponding organism-level package.
> > > > >> for instance, in the case of human:
> > > >>
> > > >> annotation(eset) <- "org.Hs.eg.db"
> > > >>
> > > >> let me know if you have any problem with this.
> > > >>
> > > >> cheers,
> > > >> robert.
> > > >>
> > > >> On Fri, 2011-11-11 at 14:44 -0500, Wendy Qiao wrote:
> > > >>> Hi all,
> > > >>>
> > > > >>> I am using the GSVA package for some analysis. I found that the package
> > > > >>> only takes the gene expression matrix annotated with Affymetrix probe IDs,
> > > > >>> although the gene set collection is made of Entrez IDs. I imagine there is a
> > > > >>> step in the package for converting the Affymetrix probe IDs to Entrez IDs.
> > > > >>> As my data are from the Illumina platform, I am wondering if an expression
> > > > >>> matrix annotated with Entrez IDs can be used directly.
> > > >>>
> > > >>> Thank you,
> > > >>> Wendy
> > > >>>
> > > >>
> > > >>> [[alternative HTML version deleted]]
> > > >>>
> > > >>> _______________________________________________
> > > >>> Bioconductor mailing list
> > > >>> Bioconductor at r-project.org
> > > >>> https://stat.ethz.ch/mailman/listinfo/bioconductor
> > > >>> Search the archives:
> > > >> http://news.gmane.org/gmane.science.biology.informatics.conductor
> > > >>>
> > > >>
> > > >>
> > > >>
> > > >>
> > > >
> > > > _______________________________________________
> > > > Bioconductor mailing list
> > > > Bioconductor at r-project.org
> > > > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > > > Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
> > >
> > >
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at r-project.org
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor