[BioC] Gene Ontology Annotations from Gene Names

Marc Carlson mcarlson at fhcrc.org
Mon Feb 10 20:37:30 CET 2014


Hi Joseph,

Here is a newer tarball, built with all release packages, so it 
should install properly for you without modification.

And Jim is right, you should be able to proceed from here now that you 
have an org package.  Basically, we used to have far fewer tools 
for making org packages, so not being supported used to be a much more 
serious problem than it is today.

Hope this helps,


  Marc



On 02/10/2014 08:44 AM, James W. MacDonald wrote:
> Hi Joseph,
>
> Please don't take conversations off-list.
>
> On Friday, February 07, 2014 9:00:06 PM, Joseph Shaw wrote:
>> Hi Jim,
>>
>> Thanks for all your assistance. I really appreciate it!
>>
>> Unfortunately, when I attempt to run
>>
>>> install.packages("org.Cjejuni_0.0.1.tar.gz", repos = NULL, type = 
>>> "source")
>>
>> I get the error
>>
>>> Error : package 'AnnotationDbi' 1.24.0 was found, but >= 1.25.2 is 
>>> required by 'org.Cjejuni.eg.db'
>>
>> I have since attempted to reinstall and update the AnnotationDbi
>> package on my system to a compatible iteration, but the process
>> results in the same error.
>
> Hmm. Weird. I seem to have one iteration of a devel AnnotationDbi 
> package in my release BioC install.
>
> You could probably just untar and ungzip that file and then manually 
> change the DESCRIPTION file to require AnnotationDbi >= 1.24.0 and 
> then install using
>
> install.packages("org.Cjejuni.eg.db", type = "source", repos = NULL)
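A minimal sketch of the DESCRIPTION edit Jim describes, in base R. The helper
name relax_dependency() and the sample Depends line are hypothetical; the real
DESCRIPTION in the tarball may be worded differently. Note that lowering a
version requirement only helps if the package does not actually rely on
features of the newer AnnotationDbi.

```r
# Hypothetical helper: rewrite the minimum version required for one package
# in the lines of a DESCRIPTION file. Lines that do not mention the package
# with a ">=" constraint are returned unchanged.
relax_dependency <- function(desc_lines, pkg, version) {
  pattern <- paste0(pkg, "[[:space:]]*\\(>=[[:space:]]*[0-9.-]+\\)")
  replacement <- paste0(pkg, " (>= ", version, ")")
  sub(pattern, replacement, desc_lines)
}

# Illustrative stand-in for readLines("org.Cjejuni.eg.db/DESCRIPTION");
# the actual file contents are an assumption.
desc <- c("Package: org.Cjejuni.eg.db",
          "Depends: R (>= 2.7.0), methods, AnnotationDbi (>= 1.25.2)")

fixed <- relax_dependency(desc, "AnnotationDbi", "1.24.0")
print(fixed)

# With the real tarball the steps would look roughly like this (assuming it
# unpacks into a directory named org.Cjejuni.eg.db):
# untar("org.Cjejuni_0.0.1.tar.gz")
# path <- file.path("org.Cjejuni.eg.db", "DESCRIPTION")
# writeLines(relax_dependency(readLines(path), "AnnotationDbi", "1.24.0"), path)
# install.packages("org.Cjejuni.eg.db", type = "source", repos = NULL)
```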
>
>
>>
>> On a separate but related note, is it possible to restrict the list of
>> gene annotations from org.Cjejuni.eg.db used in the GO analysis (i.e.
>> the GSEAGOHyperGParams()* function) to simply include the probes used
>> in the experiment (i.e. create two subsets; a gene universe and a
>> collection of genes identified as differentially expressed)?
>>
>> (*The GSEAGOHyperGParams() function is used in the unsupported model
>> organisms vignette, but the author simply uses the entire gene mapping
>> as the gene universe and selects the first 500 genes as differentially
>> expressed; ideally, I would like to include genes in the universe
>> based on gene IDs, but this might not be the most efficient way.)
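The two subsets described above (a gene universe limited to what was measured,
and the differentially expressed genes within it) can be sketched in base R.
All IDs below are made up for illustration; in practice org_keys would come
from the org package (for example Lkeys(org.Cjejuni.egACCNUM)), and array_ids
and de_ids from the experiment.

```r
# Hypothetical gene IDs, for illustration only.
org_keys  <- c("1001", "1002", "1003", "1004", "1005", "1006")  # IDs known to the org package
array_ids <- c("1002", "1003", "1004", "1005", "9999")          # genes measured in the experiment
de_ids    <- c("1003", "1005", "8888")                          # genes called differentially expressed

# Universe: measured genes that the org package can annotate.
univ <- intersect(array_ids, org_keys)

# Selected set: differentially expressed genes that are in the universe.
gns <- intersect(de_ids, univ)

print(univ)
print(gns)
```

The resulting vectors would then be passed as universeGeneIds and geneIds
when constructing the GOHyperGParams object.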
>
> You are reading the wrong vignette. While this is technically an 
> 'unsupported organism', since you have an org package, you can just 
> use the regular infrastructure:
>
> > univ <- Lkeys(org.Cjejuni.egACCNUM)
> > gns <- univ[sample(1:1670, 100)] ## here I am just selecting genes at random
> > p <- new("GOHyperGParams", geneIds = gns, universeGeneIds = univ,
> +     ontology = "BP", annotation = "org.Cjejuni.eg.db", conditional = TRUE)
> > hyp <- hyperGTest(p)
> > summary(hyp)
>       GOBPID      Pvalue OddsRatio  ExpCount Count Size                  Term
> 1 GO:0012501 0.003677779       Inf 0.1221239     2    2 programmed cell death
> 2 GO:0016265 0.003677779       Inf 0.1221239     2    2                 death
>
> I get an infinite odds ratio here because I randomly selected the only 
> two apoptosis genes on the array. Yay for me!
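The numbers in that table can be reproduced with a plain hypergeometric
calculation in base R. This is only a sketch: the universe size (1130) and the
number of selected genes (69) are not stated in the thread. They are inferred
from the reported ExpCount and Pvalue, presumably the counts left after
GOstats drops genes without a BP annotation.

```r
# Inferred values (assumptions, not given in the thread).
universe_size <- 1130  # annotated genes in the universe
selected      <- 69    # annotated genes among the 100 sampled

# Values read from the summary table.
term_size <- 2         # "Size": universe genes annotated to the term
overlap   <- 2         # "Count": selected genes annotated to the term

# Expected number of selected genes carrying the term under random sampling.
exp_count <- selected * term_size / universe_size

# P(X >= overlap) for a hypergeometric draw of `selected` genes.
p_value <- phyper(overlap - 1, term_size, universe_size - term_size,
                  selected, lower.tail = FALSE)

# Odds ratio from the 2x2 table; no term gene is left outside the selection,
# so the denominator is zero and the ratio is infinite.
in_sel   <- overlap
out_sel  <- selected - overlap
in_rest  <- term_size - overlap
out_rest <- universe_size - selected - in_rest
odds_ratio <- (in_sel * out_rest) / (out_sel * in_rest)

print(c(ExpCount = exp_count, Pvalue = p_value, OddsRatio = odds_ratio))
```

Under these assumed counts the three values agree with the ExpCount, Pvalue
and OddsRatio columns above.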
>
> Best,
>
> Jim
>
>
>>
>> Relevant Vignette:
>> http://www.bioconductor.org/packages/devel/bioc/vignettes/GOstats/inst/doc/GOstatsForUnsupportedOrganisms.pdf 
>>
>>
>> Joseph
>>
>> On Fri, Feb 7, 2014 at 7:03 PM, James W. MacDonald <jmacdon at uw.edu> 
>> wrote:
>>> See attached.
>>>
>>>
>>> On 2/6/2014 8:32 PM, Joseph Shaw wrote:
>>>>
>>>> Hi Jim,
>>>>
>>>>> You can check to see if it is a viable option by just giving it a 
>>>>> shot.
>>>>
>>>> I have attempted to call makeOrgPackageFromNCBI() as described in
>>>> your previous mail (having provided my details for the author and
>>>> maintainer arguments); however, the function call doesn't fully
>>>> complete. In particular, the steps outlined below are completed, but it
>>>> appears to make it no further.
>>>>
>>>>> Loading required package: GO.db
>>>>>
>>>>> Getting data for gene2pubmed.gz
>>>>> Loading required package: RCurl
>>>>> Loading required package: bitops
>>>>> discarding data from other organisms
>>>>> Populating gene2pubmed table:
>>>>> table gene2pubmed filled
>>>>> Getting data for gene2accession.gz
>>>>
>>>> I'm not sure if the function has failed or if the function is still in
>>>> the process of completion. Could you tell me, approximately, how long
>>>> the function should take to complete? For reference, I'm currently
>>>> running OS X with 1.8 GHz processor and 4GB memory.
>>>>
>>>> Joseph
>>>
>>>
>>> -- 
>>> James W. MacDonald, M.S.
>>> Biostatistician
>>> University of Washington
>>> Environmental and Occupational Health Sciences
>>> 4225 Roosevelt Way NE, # 100
>>> Seattle WA 98105-6099
>>>
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: org.Cjejuni.eg.db_0.1.tar.gz
Type: application/x-gzip
Size: 5874006 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20140210/823b90c3/attachment-0001.gz>

