[BioC] error while using goProfiles package on arabidopsis entrez gene IDs
Marc Carlson
mcarlson at fhcrc.org
Sat Aug 24 01:07:50 CEST 2013
Hi,
Now I can see a little better what you are doing. The problem lies in
what happens inside goProfiles. This is not my package and I have never
used it much myself, so I did a little debugging to see what was going
on, and this is what I found:
The basicProfile() function expects you to give it the central ID for
the org package you name. It seems to assume that this will be an
entrez gene ID. But that is *not* what the arabidopsis community
usually uses. That community likes to use TAIR IDs, so the
org.At.tair.db package uses TAIR IDs as its central ID (this is why
TAIR is in the middle of the package name). You can get and use entrez
gene IDs with the org.At.tair.db package, but they are not the central
ID that many of the older methods like mget() etc. expect. These days,
we have moved away from that model and now use the select() method. We
feel it's less confusing, since there is no longer any need to pay
attention to which key type is the central one for a package. Instead,
the select() interface just asks you to state the kind of key that you
are using. We feel this is more transparent.
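As a rough illustration of why the key type matters, here is a toy sketch in base R. The environment go_by_tair, the lookup vector entrez_to_tair, and every ID and term in them are made up for the example. They only mimic a map keyed by one kind of ID; this is not how goProfiles or the org package is implemented.

```r
# Toy stand-in for a map keyed by TAIR IDs. All IDs and terms are invented.
go_by_tair <- new.env()
assign("TAIR_A", c("GO:placeholder1", "GO:placeholder2"), envir = go_by_tair)
assign("TAIR_B", "GO:placeholder3", envir = go_by_tair)

# Hypothetical Entrez-to-TAIR translation table (made-up pairs).
entrez_to_tair <- c("1001" = "TAIR_A", "1002" = "TAIR_B")
entrez_ids <- names(entrez_to_tair)

# Looking up with the wrong key type: no key is found, so every entry is NA.
miss <- mget(entrez_ids, envir = go_by_tair, ifnotfound = NA)

# Translating to the map's own key type first: every key is found.
tair_ids <- unname(entrez_to_tair[entrez_ids])
hit <- mget(tair_ids, envir = go_by_tair, ifnotfound = NA)
```

Here miss holds only NA values while hit holds the placeholder terms, which is the same shape of failure as handing entrez gene IDs to a TAIR-keyed map.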
Anyway, here is how I was able to make it run:
## Load the annotation package and goProfiles
library(org.At.tair.db)
library(goProfiles)
## 1st take some of your entrez gene IDs
egIDs <- c("839235", "838362", "838961", "837091", "837455", "837543")
## use select to quickly translate these into TAIR IDs, and then grab
## that column of IDs back out.
## (You may find it more convenient to just start with the TAIR IDs that
## you said were in your file, but I don't have those here)
## (older AnnotationDbi releases named the 'columns' argument 'cols')
tairIDs <- as.character(select(org.At.tair.db, keys=egIDs, columns="TAIR",
                               keytype="ENTREZID")[["TAIR"]])
## THEN call basicProfile function and pass in tair IDs instead...
## Now when it calls mget on the GO mapping, it will actually get some
## matches.
basicProfile(tairIDs, idType="Entrez", onto="ANY", level=2,
             orgPackage="org.At.tair.db", ord=FALSE)
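One thing that may be worth checking before the basicProfile() call: a translation table like the one select() returns can contain an NA where an ID has no match, and the same TAIR ID can show up more than once. The sketch below uses a hand-made data frame in place of real select() output (all rows are invented, and the names map, unmapped and tair_clean are just for illustration) to show one way to tidy the TAIR column first.

```r
# Hand-made stand-in for a translation table; every row is invented.
map <- data.frame(
  ENTREZID = c("1001", "1002", "1002", "1003"),
  TAIR     = c("TAIR_A", "TAIR_B", "TAIR_B", NA),
  stringsAsFactors = FALSE
)

# Input IDs that did not translate, kept so they can be reported.
unmapped <- map$ENTREZID[is.na(map$TAIR)]

# TAIR IDs with the NA entries and the repeats removed.
tair_clean <- unique(map$TAIR[!is.na(map$TAIR)])
```

With real data, tair_clean would take the place of tairIDs in the call above.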
I hope this helps you,
Marc
On 08/22/2013 02:07 AM, dd [guest] wrote:
> Hi all,
> I was using goProfiles package for functional analysis using a genelist of 316 Arabidopsis entrez gene IDs as shown below in the R command sessionInfo().
>
> - Read a file containing Entrez IDs and TAIR IDs.
> - Subset the Entrez IDs and converted to character vector.
> - Used the vector as genelist.
> - Used the goProfiles package function basicProfile on this genelist with the Arabidopsis organism package.
>
> OUTPUT: Error in GOtermslist[[i]] : subscript out of bounds.
>
> Can somebody please help me in finding any mistake I might have done?
>
> Thanks in advance.
>
> -- output of sessionInfo():
>
> Console output :
>
>>> a<-read.table("tair_ids to gene_ids.csv" ,header=TRUE,sep=",")
>>
>>> b<-as.character(a[,2])
>>> head(b)
>> [1] "839235" "838362" "838961" "837091" "837455" "837543"
>>
>>> h<-basicProfile(b,idType="Entrez",onto ="ANY",level=2,orgPackage="org.At.tair.db",ord=FALSE)
>> Error in GOtermslist[[i]] : subscript out of bounds
> --
> Sent via the guest posting facility at bioconductor.org.