[BioC] GO.db redundance
Marc Carlson
mcarlson at fhcrc.org
Fri May 15 19:11:10 CEST 2009
Hi Giacomo,
Let's try to keep this on list so that others can benefit from your
questions.
The reason you cannot coerce this into a data.frame is that what you
have in the GO_CC_description object is actually a list of "GOTerms"
objects. The error message R is giving you is trying to tell you that
it does not know how to coerce that class into a data.frame. You can
see this for yourself if you use the str() function like this:
str(GO_CC_description)
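If you do not have the annotation packages to hand, the same situation can be reproduced with a small stand-in. This is only a sketch: the "DemoTerm" class and the demo_list object below are made up for illustration and are not the real GOTerms class or your data. They just show how to check what a list holds and why as.data.frame() gives up on it.

```r
library(methods)

## Hypothetical stand-in for a GOTerms-like S4 class (NOT the real class)
setClass("DemoTerm",
         representation(GOID = "character",
                        Term = "character",
                        Ontology = "character"))

## A named list of S4 objects, similar in shape to what mget() returns
demo_list <- list(
  "GO:0005634" = new("DemoTerm", GOID = "GO:0005634",
                     Term = "nucleus", Ontology = "CC"),
  "GO:0005737" = new("DemoTerm", GOID = "GO:0005737",
                     Term = "cytoplasm", Ontology = "CC")
)

## Check that every element really is an object of that class
all_terms <- all(vapply(demo_list, is, logical(1), "DemoTerm"))

## List the slots you could pull out of each object
slot_names <- slotNames("DemoTerm")

## as.data.frame() has no method for this class, so the coercion fails;
## tryCatch() captures the error instead of stopping the script
coerce_result <- tryCatch(as.data.frame(demo_list),
                          error = function(e) e)
```

On the real object, class(GO_CC_description[[1]]) and slotNames(GO_CC_description[[1]]) should give you the same kind of information.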
So I think you can see that if you want to get the individual
descriptions out of there you are going to have to be a bit more
specific. So to just continue your example:
##You started by just getting the GOTERM info for the 1st set of elements
GO_CC_description=mget(clnList[[1]],GOTERM,ifnotfound=NA)
##And as we discussed this gives you a list of GOTerms objects back
##So if you really want a data frame, then you can always
## just break the parts of this object out (the parts you want) and
##then reassemble those into a data frame like this:
##Get the terms out
GO_CC_terms = sapply(GO_CC_description, function(x) x@Term)
##Let's combine those with the GO IDs
GO_CC_IDs = clnList[[1]]
df = data.frame(cbind(GO_CC_IDs,GO_CC_terms))
df
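One thing to watch: because mget() was called with ifnotfound=NA, any ID that is not found comes back as a plain NA, and x@Term will fail on that element. Below is a minimal sketch of a guarded version that also writes the result to a tab-delimited file, since that was the end goal. The "DemoTerm" class, demo_hits, get_slot() and the "GO:9999999" ID are all invented for illustration so that the example runs without GO.db. With the real objects I believe AnnotationDbi also provides accessor functions such as Term() and Ontology(), which could replace the direct slot access.

```r
library(methods)

## Hypothetical stand-in for a GOTerms-like S4 class (NOT the real class)
setClass("DemoTerm",
         representation(GOID = "character",
                        Term = "character",
                        Ontology = "character"))

## Mimic mget(..., ifnotfound = NA): the second (made-up) ID was not found
demo_hits <- list(
  "GO:0005634" = new("DemoTerm", GOID = "GO:0005634",
                     Term = "nucleus", Ontology = "CC"),
  "GO:9999999" = NA
)

## Pull one slot per element; return NA for elements that are not objects
get_slot <- function(x, name) {
  if (is(x, "DemoTerm")) slot(x, name) else NA_character_
}

## Assemble the pieces into a data frame, one row per GO ID
demo_df <- data.frame(
  GO_ID    = names(demo_hits),
  Term     = vapply(demo_hits, get_slot, character(1), name = "Term"),
  Ontology = vapply(demo_hits, get_slot, character(1), name = "Ontology"),
  stringsAsFactors = FALSE,
  row.names = NULL
)

## Write a tab-delimited table file (a temporary file here)
out_file <- tempfile(fileext = ".txt")
write.table(demo_df, file = out_file, sep = "\t",
            quote = FALSE, row.names = FALSE)
```

For your data you would replace demo_hits with GO_CC_description, test against "GOTerms" instead of "DemoTerm", and give write.table() a real file name.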
Hope this helps.
Marc
giacomo.tuana at unimib.it wrote:
> Hi Marc,
>
> thanks a lot for your suggestions. Now I have another kind of problem. I
> want to coerce the GO terms found into a data.frame or list so that I can
> print out a table file, or to create it using some extract function for
> the GO terms data type. How can I do this?
>
> I used your previous code:
>
> library("mgu74av2.db")
> library("GO.db")
>
> ##Get the IDs you wanted
> all_probes_mgu <- ls(mgu74av2ENTREZID)
> ##Get the GO IDs for these IDs
> GOIDs = mget(all_probes_mgu, mgu74av2GO, ifnotfound=NA)
>
> ##You also wanted to remove things that were not part of the
> ##"CC" ontology. There is a good way to do this in ever so convenient
> ##annotate package...
> ##So for example, we can make use of the getOntology method like this:
> library("annotate")
> clnList = lapply(GOIDs, getOntology, "CC")
>
> so I added these lines:
> GO_CC_description=mget(clnList[[1]],GOTERM,ifnotfound=NA)
> GO_CC_description_df=as.data.frame(GO_CC_description)
> Error in as.data.frame.default(x[[i]], optional = TRUE) :
> cannot coerce class "GOTerms" into a data.frame
>
>
>
> Best Regards
>
>
> Giacomo
>
>
>
>
> --
>
>
> Dr. Giacomo Tuana Franguel
>
> Genopolis Consortium
> University of Milano-Bicocca
> Dept. of Biotechnology and Bioscience/ U4
> Piazza della Scienza 4 20126 Milano, Italy
> Tel +39 02 6448 3530
> Fax +39 02 4074 6210
>
>
> On Mon, 11 May 2009 11:49:03 -0700
> mcarlson at fhcrc.org wrote:
> > Hi Giacomo,
> >
> > The problem isn't with the databases or the annotation packages, but
> > with how you are using toTable(). I would not use toTable() like that
> > since this is not what it was designed to do. Instead, I would
> > recommend an approach more like this:
> >
> > library("mgu74av2.db")
> > library("GO.db")
> >
> > ##Get the IDs you wanted
> > all_probes_mgu <- ls(mgu74av2ENTREZID)
> > ##Get the GO IDs for these IDs
> > GOIDs = mget(all_probes_mgu, mgu74av2GO, ifnotfound=NA)
> >
> > ##You also wanted to remove things that were not part of the
> > ##"CC" ontology. There is a good way to do this in ever so convenient
> > ##annotate package...
> > ##So for example, we can make use of the getOntology method like this:
> > library("annotate")
> > clnList = lapply(GOIDs, getOntology, "CC")
> >
> > ##Finally if we want to get more details for each of these GOIDs, we
> > ##can use the GOTERM mapping in the usual way:
> >
> > ##So for the probe you used in your example:
> > clnList[1]
> > ##You can look up the details from the GOTERM table like this:
> > mget(clnList[[1]],GOTERM,ifnotfound=NA)
> >
> >
> > You weren't super clear about what exactly you were trying to do, so
> > I hope that this answers your questions. If not, please let us know.
> >
> >
> > Marc
> >
> >
> >
> >
> >
> > Quoting giacomo.tuana at unimib.it:
> >
> >>
> >> Hi,
> >> I found a redundance in GO annotation database trying to build a global
> >> table of annotation with probeset_ID, DB crossreferences (entrez_ID, gene
> >> name....) and GO annotation. For a single probe there are more GO terms
> >> very similar (use of synonymous) or equal (different punctuation) in GO
> >> term definitions; I think this could be a problem for functional
> >> annotation. Can someone suggest me how to deal with this situation?
> >> Or different way to build a global table of annotation?
> >> Here the code I used for CC category, example with "100001_at"
> >> probeset ID:
> >> library("mgu74av2.db")
> >> library("GO.db")
> >> go_mgu<-toTable(mgu74av2GO)
> >> go_term_description<-toTable(GOTERM)
> >> all_probes_mgu <- ls(mgu74av2ENTREZID)
> >> go_mgu_descr<-merge(go_mgu[,1:3],go_term_description,by.x=2,by.y=1)
> >> go_mgu_cc<-go_mgu_descr[which((go_mgu_descr[,6])=="CC"),]
> >> go_mgu_cc[which((go_mgu_cc[,2]=="100001_at")),]
> >> Thanks
> >> Giacomo
> >> --
> >> Dr. Giacomo Tuana Franguel
> >> Genopolis Consortium
> >> University of Milano-Bicocca
> >> Dept. of Biotechnology and Bioscience/ U4
> >> Piazza della Scienza 4 20126 Milano, Italy
> >> _______________________________________________
> >> Bioconductor mailing list
> >> Bioconductor at stat.math.ethz.ch
> >> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >> Search the archives:
> >> http://news.gmane.org/gmane.science.biology.informatics.conductor
> >>
> >
> >
> >
> >
> >
>