[BioC] Where to get BAM files for easyRNASeq human use case ALSO ANNOTATION
Martin Morgan
mtmorgan at fhcrc.org
Thu Aug 16 19:34:52 CEST 2012
On 08/16/2012 10:29 AM, Richard Friedman wrote:
> Steve,
>
> Thanks. I use annaffy for microarrays and was hoping for an
> already-worked-out protocol. I will however look into the package
> you recommend if no more explicit protocol is available.
Not so much an already-worked-out protocol as an elaboration of Steve's bet.
An AnnotateSeq package would be a useful addition; the info in annaffy
is in the org packages, discoverable with 'cols', 'keytypes' (often
synonymous with 'cols'), and accessible via 'select'. The plans for the
next release are OrganismDb objects that make the merge that one would
do across, say, org*, TxDb*, and GO.db packages transparent.
> library(org.Dm.eg.db)
> cols(org.Dm.eg.db)
 [1] "ENTREZID"     "ACCNUM"       "ALIAS"        "CHR"          "CHRLOC"
 [6] "CHRLOCEND"    "ENZYME"       "MAP"          "PATH"         "PMID"
[11] "REFSEQ"       "SYMBOL"       "UNIGENE"      "ENSEMBL"      "ENSEMBLPROT"
[16] "ENSEMBLTRANS" "GENENAME"     "UNIPROT"      "GO"           "EVIDENCE"
[21] "ONTOLOGY"     "FLYBASE"      "FLYBASECG"    "FLYBASEPROT"
> select(org.Dm.eg.db, "FBtr0005009", c("GENENAME", "SYMBOL"),
+        "ENSEMBLTRANS")
  ENSEMBLTRANS          GENENAME SYMBOL
1  FBtr0005009 Muscle protein 20   Mp20
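
To get from there to the kind of spreadsheet Rich describes, one could
merge the 'select' result back onto the count table and write it out.
This is only a sketch: the 'counts' data frame and the output file name
are made up for illustration (they stand in for whatever the easyRNASeq
step returned), and the arguments to 'select' are passed by position, as
in the call above. In later AnnotationDbi releases 'cols' was renamed
'columns', so adjust to the installed version.

```r
library(org.Dm.eg.db)

## Hypothetical count table; in practice the ids and counts would come
## from the easyRNASeq output (e.g. transcriptCounts).
counts <- data.frame(ENSEMBLTRANS = "FBtr0005009", count = 42L,
                     stringsAsFactors = FALSE)

## Look up gene name and symbol for each transcript id
## (arguments by position: object, keys, columns, keytype).
anno <- select(org.Dm.eg.db, counts$ENSEMBLTRANS,
               c("GENENAME", "SYMBOL"), "ENSEMBLTRANS")

## Left join, so transcripts without annotation are kept (as NA).
out <- merge(counts, anno, by = "ENSEMBLTRANS", all.x = TRUE)

## Hypothetical file name; a CSV file opens directly in Excel.
write.csv(out, "annotated_counts.csv", row.names = FALSE)
```

Adding "GO" or "PATH" to the requested columns works the same way, but a
transcript usually maps to several GO terms, so expect several rows per
id in the result.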
Martin
>
> Best wishes,
> Rich
>
> On Aug 16, 2012, at 1:25 PM, Steve Lianoglou wrote:
>
>> Hi,
>>
>> On Thu, Aug 16, 2012 at 1:17 PM, Richard Friedman
>> <friedman at cancercenter.columbia.edu> wrote:
>> [snip]
>>> I would like then to ask a broader question - one that I was
>>> going to ask after I completed the vignette:
>>> Is it possible to obtain annotation for RNASeq data analogous
>>> to the kind obtained for microarrays?
>>> To be specific, when I analyze affymetrix microarrays I get, for
>>> each probeset the Entrez gene symbol and a description of the gene
>>> which could be several words long, as well as gene ontology categories
>>> and pathways. I can output this information as an Excel spreadsheet.
>>> When I work through the drosophila vignette with transcriptCounts or
>>> geneCounts I got accession numbers (e.g.,"FBtr0005009") but no gene
>>> symbols etc.
>>>
>>> Do you have any suggestions as to how to get Entrez Gene Symbols,
>>> descriptions, etc, for RNASeq output with easy RNASeq?
>> [/snip]
>>
>> Perhaps I'm missing something, but given accession numbers (or other
>> gene identifiers), it should be pretty straightforward to jimmy up
>> something using the org.*.eg.db packages, no?
>>
>> I suspect you won't get gene descriptions there -- but if I were a
>> gambling man, I would bet you can probably get that last piece of the
>> puzzle from biomaRt.
>>
>> HTH,
>> -steve
>>
>> --
>> Steve Lianoglou
>> Graduate Student: Computational Systems Biology
>> | Memorial Sloan-Kettering Cancer Center
>> | Weill Medical College of Cornell University
>> Contact Info: http://cbio.mskcc.org/~lianos/contact
>
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793