[BioC] Gene Ontology Annotations from Gene Names

Joseph Shaw josph.sh at gmail.com
Tue Feb 11 00:20:00 CET 2014


Hi all,

Thank you so much. Your assistance has been invaluable!

Joseph

On Mon, Feb 10, 2014 at 7:37 PM, Marc Carlson <mcarlson at fhcrc.org> wrote:
> Hi Joseph,
>
> Here is a newer tarball, built with all release packages so that it should
> be able to install properly for you without modification.
>
> And Jim is right, you should be able to proceed from here now that you have
> an org package.  Basically, we didn't use to have as many tools for making
> org packages, so not being supported used to be a much more serious problem
> than it is today.
>
> Hope this helps,
>
>
>  Marc
>
>
>
>
> On 02/10/2014 08:44 AM, James W. MacDonald wrote:
>>
>> Hi Joseph,
>>
>> Please don't take conversations off-list.
>>
>> On Friday, February 07, 2014 9:00:06 PM, Joseph Shaw wrote:
>>>
>>> Hi Jim,
>>>
>>> Thanks for all your assistance. I really appreciate it!
>>>
>>> Unfortunately, when I attempt to run
>>>
>>>> install.packages("org.Cjejuni_0.0.1.tar.gz", repos = NULL, type =
>>>> "source")
>>>
>>>
>>> I get the error
>>>
>>>> Error : package 'AnnotationDbi' 1.24.0 was found, but >= 1.25.2 is
>>>> required by 'org.Cjejuni.eg.db'
>>>
>>>
>>> I have since attempted to reinstall and update the AnnotationDbi
>>> package on my system to a compatible iteration, but the process
>>> results in the same error.
>>
>>
>> Hmm. Weird. I seem to have one iteration of a devel AnnotationDbi package
>> in my release BioC install.
>>
>> You could probably just untar and ungzip that file and then manually
>> change the DESCRIPTION file to require AnnotationDbi >= 1.24.0 and then
>> install using
>>
>> install.packages("org.Cjejuni.eg.db", type = "source", repos = NULL)
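For anyone following along, the manual DESCRIPTION edit suggested above could also be scripted. What follows is only a sketch in base R: relax_dependency() is a hypothetical helper (not part of any package), and the DESCRIPTION contents written to the temporary file are invented for illustration, since the real file inside the tarball is not shown in this thread.

```r
# Hypothetical helper: rewrite the minimum version of one dependency
# in a package DESCRIPTION file, using base R only.
relax_dependency <- function(desc_path, pkg, min_version) {
  desc <- read.dcf(desc_path)
  # Matches e.g. "AnnotationDbi (>= 1.25.2)"
  pattern <- paste0("\\b", pkg, "\\s*\\(>=\\s*[0-9.-]+\\)")
  replacement <- paste0(pkg, " (>= ", min_version, ")")
  # Only touch dependency fields that are actually present
  for (field in intersect(c("Depends", "Imports"), colnames(desc))) {
    desc[1, field] <- gsub(pattern, replacement, desc[1, field])
  }
  write.dcf(desc, file = desc_path)
  invisible(desc)
}

# Stand-in DESCRIPTION with invented contents (assumption, for illustration).
tmp <- tempfile()
writeLines(c("Package: org.Cjejuni.eg.db",
             "Version: 0.0.1",
             "Depends: R (>= 2.7.0), methods, AnnotationDbi (>= 1.25.2)"),
           tmp)

relax_dependency(tmp, "AnnotationDbi", "1.24.0")
read.dcf(tmp, fields = "Depends")
```

On a real package, one would presumably unpack the tarball with untar() first, run the helper on the unpacked DESCRIPTION, and then install from the directory as in the install.packages() call above. Relaxing a version requirement this way is only safe if the package really does work with the older AnnotationDbi, which is not guaranteed.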
>>
>>
>>>
>>> On a separate but related note, is it possible to restrict the list of
>>> gene annotations from org.Cjejuni.eg.db used in the GO analysis (i.e.
>>> the GSEAGOHyperGParams()* function) to simply include the probes used
>>> in the experiment (i.e. create two subsets; a gene universe and a
>>> collection of genes identified as differentially expressed)?
>>>
>>> (*The GSEAGOHyperGParams() function is used in the unsupported model
>>> organisms vignette, but the author simply uses the entire gene mapping
>>> as the gene universe and selects the first 500 genes as differentially
>>> expressed; ideally, I would like to include genes in the universe
>>> based on gene IDs, but this might not be the most efficient way.)
>>
>>
>> You are reading the wrong vignette. While this is technically an
>> 'unsupported organism', since you have an org package, you can just use the
>> regular infrastructure:
>>
>>> library(GOstats)
>>> library(org.Cjejuni.eg.db)
>>> univ <- Lkeys(org.Cjejuni.egACCNUM)
>>> ## here I am just selecting genes at random
>>> gns <- univ[sample(1:1670, 100)]
>>> p <- new("GOHyperGParams", geneIds = gns, universeGeneIds = univ,
>>> ontology = "BP", annotation = "org.Cjejuni.eg.db", conditional = TRUE)
>>> hyp <- hyperGTest(p)
>>> summary(hyp)
>>
>>       GOBPID      Pvalue OddsRatio  ExpCount Count Size                  Term
>> 1 GO:0012501 0.003677779       Inf 0.1221239     2    2 programmed cell death
>> 2 GO:0016265 0.003677779       Inf 0.1221239     2    2                 death
>>
>> I get an infinite odds ratio here because I randomly selected the only two
>> apoptosis genes on the array. Yay for me!
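To make the universe question concrete, here is a rough sketch of how the two gene sets might be built before they are handed to GOHyperGParams. It is base R only; make_gene_sets() is a hypothetical helper, and the identifiers are toy values invented for illustration. The assumption is that the universe should be the genes that are both on the array and known to the org package, and that the selected genes must be a subset of that universe.

```r
# Hypothetical helper: build the gene universe and the selected (DE) set.
make_gene_sets <- function(annotated_ids, array_ids, de_ids) {
  # Universe: genes measured on the array that the annotation package knows
  universe <- intersect(unique(array_ids), annotated_ids)
  # Selected: differentially expressed genes that fall inside the universe
  selected <- intersect(unique(de_ids), universe)
  list(universe = universe, selected = selected)
}

# Toy identifiers, invented for illustration only.
annotated <- c("g1", "g2", "g3", "g4", "g5")  # e.g. keys from the org package
on_array  <- c("g2", "g3", "g4", "g6")        # genes represented by probes
de_genes  <- c("g3", "g6")                    # genes called differentially expressed

sets <- make_gene_sets(annotated, on_array, de_genes)
sets$universe
sets$selected
```

In the call quoted above, sets$selected would take the place of gns (geneIds) and sets$universe the place of univ (universeGeneIds). Genes such as "g6" that are on the array but missing from the annotation are dropped from both sets, since they cannot contribute to the test.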
>>
>> Best,
>>
>> Jim
>>
>>
>>>
>>> Relevant Vignette:
>>>
>>> http://www.bioconductor.org/packages/devel/bioc/vignettes/GOstats/inst/doc/GOstatsForUnsupportedOrganisms.pdf
>>>
>>> Joseph
>>>
>>> On Fri, Feb 7, 2014 at 7:03 PM, James W. MacDonald <jmacdon at uw.edu>
>>> wrote:
>>>>
>>>> See attached.
>>>>
>>>>
>>>> On 2/6/2014 8:32 PM, Joseph Shaw wrote:
>>>>>
>>>>>
>>>>> Hi Jim,
>>>>>
>>>>>> You can check to see if it is a viable option by just giving it a
>>>>>> shot.
>>>>>
>>>>>
>>>>> I have attempted to call makeOrgPackageFromNCBI() as described in
>>>>> your previous mail (having provided my details for the author and
>>>>> maintainer arguments); however, the function call doesn't fully
>>>>> complete. In particular, the steps outlined below are completed, but it
>>>>> appears to make it no further.
>>>>>
>>>>>> Loading required package: GO.db
>>>>>>
>>>>>> Getting data for gene2pubmed.gz
>>>>>> Loading required package: RCurl
>>>>>> Loading required package: bitops
>>>>>> discarding data from other organisms
>>>>>> Populating gene2pubmed table:
>>>>>> table gene2pubmed filled
>>>>>> Getting data for gene2accession.gz
>>>>>
>>>>>
>>>>> I'm not sure if the function has failed or if the function is still in
>>>>> the process of completion. Could you tell me, approximately, how long
>>>>> the function should take to complete? For reference, I'm currently
>>>>> running OS X with a 1.8 GHz processor and 4 GB of memory.
>>>>>
>>>>> Joseph
>>>>
>>>>
>>>>
>>>> --
>>>> James W. MacDonald, M.S.
>>>> Biostatistician
>>>> University of Washington
>>>> Environmental and Occupational Health Sciences
>>>> 4225 Roosevelt Way NE, # 100
>>>> Seattle WA 98105-6099
>>>>
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>


