[BioC] [devteam-bioc] Getting GO ids for genenames in plasmodium falciparum

Martin Morgan mtmorgan at fhcrc.org
Tue Oct 8 14:14:22 CEST 2013


On 10/08/2013 01:09 AM, Maintainer wrote:
>
> I have a list of gene names for Plasmodium falciparum, obtained from the PlasmoDB website.
>
> I am trying to get the associated GO IDs in order to bin the genes into housekeeping versus non-housekeeping genes,
> and also to group them by function and process.
>
> I have installed org.Pf.plasmo.db using biocLite.

I'm guessing you have keys like

 > ids <- head(keys(org.Pf.plasmo.db, "SYMBOL"))
 > ids
[1] "PF3D7_0100100" "PF3D7_0100200" "PF3D7_0100300" "PF3D7_0100400"
[5] "PF3D7_0100500" "PF3D7_0100600"

What you want to do is create your own vector 'ids' holding your gene names and then call

   select(org.Pf.plasmo.db, ids, "GO", keytype="SYMBOL")
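As a rough sketch of how the result might then be divided up: select() returns a data.frame, so base R's split() can group it by ontology and by gene. The helper name summarise_go and the vector my_genes are made up for illustration, and the column names SYMBOL, GO, and ONTOLOGY are an assumption about what select() returns for the "GO" column; check names(tbl) on your own result before relying on them.

```r
# Hypothetical helper: group GO ids by ontology, then by gene.
# 'tbl' is assumed to be a data.frame with columns SYMBOL, GO and ONTOLOGY,
# such as the one select() is expected to return for the "GO" column.
summarise_go <- function(tbl) {
    # Drop genes that have no GO annotation (GO is NA for those rows)
    tbl <- tbl[!is.na(tbl$GO), , drop = FALSE]
    # One list element per ontology; within each, gene -> unique GO ids
    lapply(split(tbl, tbl$ONTOLOGY), function(d) {
        lapply(split(d$GO, d$SYMBOL), unique)
    })
}

# Only runs if the annotation package is installed
if (requireNamespace("org.Pf.plasmo.db", quietly = TRUE)) {
    library(org.Pf.plasmo.db)
    # Placeholder ids; replace with your own PlasmoDB gene names
    my_genes <- c("PF3D7_0100100", "PF3D7_0100200")
    tbl <- select(org.Pf.plasmo.db, keys = my_genes,
                  columns = "GO", keytype = "SYMBOL")
    by_ontology <- summarise_go(tbl)
    print(names(by_ontology))
}
```

Each element of by_ontology is then a named list keyed by gene name, which should be easier to work with than iterating over the nested list from as.list().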

Martin

>
> I have tried to use this example:
>
>   x <- org.Pf.plasmoGO
>      # Get the ORF identifiers that are mapped to a GO ID
>      mapped_genes <- mappedkeys(x)
>      # Convert to a list
>      xx <- as.list(x[mapped_genes])
>      if(length(xx) > 0) {
>          # Try the first one
>          got <- xx[[1]]
>          got[[1]][["GOID"]]
>          got[[1]][["Ontology"]]
>          got[[1]][["Evidence"]]
>      }
>
> It doesn't provide an opportunity to create a column and enter my own gene names. It appears to be a premapped set of gene names. As a result, I decided to use the example to get all mappings in the list xx.
>
> Unfortunately, I am unable to iterate through the list to turn it into a data frame that I can meaningfully divide up.
>
> Secondly, is there a way to query a database directly from R to get the associated GO ID, where the input would be a gene name?
>
> Sorry to sound confused. I am pretty new to R and bioconductor.
>
>
>   -- output of sessionInfo():
>
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C                           LC_TIME=English_United States.1252
>
> attached base packages:
> [1] parallel  stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
>   [1] org.Pf.plasmo.db_2.9.0 BiocInstaller_1.10.3   GO.db_2.9.0            hgu95av2.db_2.9.0      org.Hs.eg.db_2.9.0
>   [6] RSQLite_0.11.4         DBI_0.2-7              AnnotationDbi_1.22.6   Biobase_2.20.1         BiocGenerics_0.6.0
>
> loaded via a namespace (and not attached):
>   [1] digest_0.6.3       grid_3.0.1         gtable_0.1.2       IRanges_1.18.4     plyr_1.8           proto_0.3-10
>   [7] RColorBrewer_1.0-5 reshape2_1.2.2     stats4_3.0.1       stringr_0.6.2      tools_3.0.1
>
> --
> Sent via the guest posting facility at bioconductor.org.


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


