[BioC] making use of the Apis mellifera BeeBase assembly 4 data in goseq

Alicia Oshlack alicia.oshlack at mcri.edu.au
Sat Feb 25 10:38:40 CET 2012


Hi Vanessa,

In answer to your question number 2: you are right. To use a genome that
goseq does not support natively (if it's not in UCSC then it's not
supported in goseq), you will need to supply the annotation yourself,
namely the gene lengths and the mapping from gene IDs to GO terms. Having
the genome sequence alone is not enough to use goseq.
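
A minimal sketch of what that looks like in practice, assuming you have
already built the two inputs yourself. The gene IDs, lengths, DE calls and
GO assignments below are simulated placeholders, not real Apis mellifera
annotation. nullp() and goseq() are the goseq functions that take
user-supplied length data (bias.data) and category mappings (gene2cat):

```r
# Simulated placeholder inputs; replace with real data for your assembly.
set.seed(1)
n <- 300
gene_ids <- sprintf("GB%05d", seq_len(n))      # hypothetical gene IDs

# Named binary vector: 1 = differentially expressed, 0 = not.
de_genes <- rbinom(n, 1, 0.1)
names(de_genes) <- gene_ids

# Gene (or transcript) lengths in bp, in the same order as de_genes.
gene_lengths <- round(runif(n, min = 300, max = 8000))

# Gene-to-GO mapping: two columns, gene ID and GO category.
go_terms <- c("GO:0006412", "GO:0005840", "GO:0016020", "GO:0003677")
gene2go <- unique(data.frame(
  gene     = sample(gene_ids, 600, replace = TRUE),
  category = sample(go_terms, 600, replace = TRUE),
  stringsAsFactors = FALSE
))

# Sanity checks on the inputs before handing them to goseq.
stopifnot(length(gene_lengths) == length(de_genes),
          all(gene2go$gene %in% names(de_genes)))

# Only run the analysis if goseq is installed.
if (requireNamespace("goseq", quietly = TRUE)) {
  suppressPackageStartupMessages(library(goseq))
  # Fit the length-bias weighting from your own lengths (no genome/id needed).
  pwf <- nullp(de_genes, bias.data = gene_lengths, plot.fit = FALSE)
  # Test GO categories using your own gene-to-GO mapping.
  res <- goseq(pwf, gene2cat = gene2go)
  print(head(res))
}
```

The lengths and the GO mapping must use the same gene IDs as your
differential expression results, and both should come from annotation
built on the assembly you mapped your reads to.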

Cheers,
Alicia


On 25/02/12 6:32 PM, "bioconductor-request at r-project.org"
<bioconductor-request at r-project.org> wrote:

> 
> Message: 9
> Date: Fri, 24 Feb 2012 12:51:37 -0800
> From: Hervé Pagès <hpages at fhcrc.org>
> To: "Corby, Vanessa" <Vanessa.Corby at ARS.USDA.GOV>
> Cc: "myoung at wehi.edu.au" <myoung at wehi.edu.au>,
> "bioconductor at r-project.org" <Bioconductor at r-project.org>
> Subject: Re: [BioC] making use of the Apis mellifera BeeBase assembly
> 4 data in goseq
> Message-ID: <4F47F859.5080200 at fhcrc.org>
> Content-Type: text/plain; charset=windows-1252; format=flowed
> 
> Hello Vanessa,
> 
> On 02/24/2012 10:45 AM, Corby, Vanessa wrote:
>> Hello Herve and Matt,
>> 
>> After looking through the Bioconductor documentation for the BeeBase
>> assembly 4 package Hervé posted (information on the Apis 4 annotation
>> stored in Biostrings objects), the documentation for the org.Hs.eg.db
>> annotation database, the Bioconductor mailing list, the
>> BSgenome documentation, and the goseq documentation, I am still very
>> confused about whether I can use the assembly 4 package that Hervé
>> posted in goseq.
> 
> Just to clarify, goseq is not my package so I can't "post" anything
> in it, whatever that means. I assume you are talking about the
> BSgenome.Amellifera.BeeBase.assembly4 package that I made and that
> is part of Bioconductor.
> 
>> The reason that I want to use the assembly 4 data is
>> that I would presume that it will have more current information than the
>> natively supported (by goseq) Apis release 2.
> 
> It's a more recent assembly so I would expect it to be more accurate
> (i.e. closer to reality).
> 
>> 
>> So, here are my questions:
>> 
>> 1. Will release 4 offer much improvement over release 2? If this is not
>> the case, then the next two questions are moot.
> 
> It's just a more recent assembly, with all that that implies.
> 
>> 
>> 2. Do I need to get information on the transcript lengths and the
>> associations between the geneids and GO terms for the Apis 4 release and
>> build 2 new files of this information for goseq to use?
> 
> I'm not familiar with the goseq package so I'll let Matt answer this.
> 
>> Is that
>> information available (perhaps through UCSC or Baylor's site for the
>> honeybee projects)? Can I use Bioconductor for this if I have the
>> annotation database file Hervé posted?
> 
> The BSgenome.Amellifera.BeeBase.assembly4 package only contains the
> DNA sequences of Apis 4 release. It does *not* contain annotations
> for this assembly.
> 
> One advantage of using the BSgenome.Amellifera.UCSC.apiMel2 package
> instead is that you have easy access to a world of annotations for
> this genome through the UCSC genome browser. Too bad that the UCSC folks
> have no plans to support apiMel4:
> 
>    https://lists.soe.ucsc.edu/pipermail/genome/2007-October/014763.html
> 
> apiMel2 is 7 years old now!
> 
> Note that the GenomicFeatures and rtracklayer packages make it really
> simple to import and work with those annotations in R/Bioconductor.
> 
>> 
>> 3. Do I just have to rename the Apis 4 genome package that Hervé posted
>> in order to use it in goseq (I see that there are several naming
>> conventions on the Annotation Data packages)?
> 
> I'll let Matt answer this.
> 
>> 
>> You can see that some of these questions are more appropriate for Hervé
>> and some for Matt, so I decided to email both of you. Some of these
>> issues arise simply because I've only been successful with the example
>> in the goseq documentation (using the org.Hs.eg.db Annotation database).
>> Others arise because I am just very new to R and the Bioconductor packages.
> 
> For what it's worth, I don't think there is any org.* package for Bee
> (it would probably be named something like org.Am.eg.db if there was one).
> And if there was one, you would need to double-check that the
> annotations in it are actually compatible with whatever genome assembly
> you finally decide to use.
> 
>> 
>> Thanks for any help you can offer. And apologies if this is the 100th
>> time you've received an email about this from newbies such as myself.
> 
> No problem. Wish I could help more. I'm cc'ing the Bioconductor mailing
> list (hope you don't mind). It's a better place to ask questions like
> this as other people might be able to help and also the whole
> discussion will be archived and searchable for further reference.
> 
> Cheers,
> H.
> 
>> 
>> Vanessa Corby-Harris
>> 
>> Research Molecular Biologist
>> 
>> USDA-ARS
>> 
>> Carl Hayden Bee Research Center
>> 
>> 2000 E. Allen Rd., Tucson, AZ 85719
>> 
>> (520) 647-9269
>> 
>> This electronic message contains information generated by the USDA
>> solely for the intended recipients. Any unauthorized interception of
>> this message or the use or disclosure of the information it contains may
>> violate the law and subject the violator to civil or criminal penalties.
>> If you believe you have received this message in error, please notify
>> the sender and delete the email immediately.
> 
> 
> -- 
> Hervé Pagès
> 
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
> 
> E-mail: hpages at fhcrc.org
> Phone:  (206) 667-5791
> Fax:    (206) 667-1319
> 
> 
> 