[BioC] Regarding extraction of 3' and 5'UTRs and exonic region of a gene.

Hervé Pagès hpages at fhcrc.org
Sat Jun 29 00:23:32 CEST 2013


Hi Abdul,

Good that you mention KEGG. I should probably have mentioned the
KEGG.db package for step 1 of the proposed workflow, even though I
have no direct experience with it. Unfortunately, my understanding is
that it's about to be deprecated (because of licensing issues). I heard
there are some alternatives though. Hopefully more knowledgeable people
will chime in with helpful suggestions.
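
In the meantime, here is a rough, untested sketch of steps 2 and 3 of
the workflow quoted below. The gene ids are placeholders (Entrez ids
for BRCA1 and BRCA2) standing in for whatever list step 1 gives you,
and the gene-to-transcript mapping via select() is just one possible
way to do the filtering, not something from the original workflow.
Newer GenomicFeatures releases provide extractTranscriptSeqs() in place
of extractTranscriptsFromGenome(), so the sketch uses whichever one is
available in your installation:

```r
library(GenomicFeatures)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
library(BSgenome.Hsapiens.UCSC.hg19)

txdb   <- TxDb.Hsapiens.UCSC.hg19.knownGene
genome <- BSgenome.Hsapiens.UCSC.hg19

## Step 1 (placeholder): Entrez gene ids, here BRCA1 and BRCA2.
## Replace with your own list of genes.
gene_ids <- c("672", "675")

## Map the gene ids to the transcript names used by the TxDb package.
tx_map <- AnnotationDbi::select(txdb, keys=gene_ids,
                                columns="TXNAME", keytype="GENEID")

## Step 2: UTR coordinates as GRangesList objects, one element per
## transcript, named by transcript name.
five_utrs  <- fiveUTRsByTranscript(txdb, use.names=TRUE)
three_utrs <- threeUTRsByTranscript(txdb, use.names=TRUE)

## Keep only the transcripts that belong to the genes of interest.
five_utrs  <- five_utrs[names(five_utrs) %in% tx_map$TXNAME]
three_utrs <- three_utrs[names(three_utrs) %in% tx_map$TXNAME]

## Step 3: extract the UTR sequences from the genome. Use the newer
## function if this GenomicFeatures version has it, else the older one.
extract_seqs <- if (exists("extractTranscriptSeqs")) {
    extractTranscriptSeqs
} else {
    extractTranscriptsFromGenome
}
five_utr_seqs  <- extract_seqs(genome, five_utrs)
three_utr_seqs <- extract_seqs(genome, three_utrs)
```

Each result should be a DNAStringSet with one sequence per transcript.
Exons by transcript can be obtained the same way with
exonsBy(txdb, by="tx", use.names=TRUE).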

Cheers,
H.


On 06/27/2013 09:57 PM, Abdul Rawoof wrote:
> Thanks for your kind suggestion. I will try to follow your suggested
> workflow, though obviously it will take time to learn all these
> packages as I have never used them.
>
> One more thing I want to ask: how can I download the list of all
> available human cancer genes for the Wnt signaling pathway from the
> KEGG database?
> Please forgive me if this is a senseless question, as I have not yet
> tried the packages you mentioned.
>
> Thanks,
> Abdul Rawoof
>
>
>
> On Thu, Jun 27, 2013 at 11:01 PM, Hervé Pagès <hpages at fhcrc.org> wrote:
>
>     Hi Abdul,
>
>     Suggested workflow:
>
>     1. Build the list of genes involved in the particular cancer you're
>         interested in. Could be a vector of gene ids or transcript ids (not
>         all transcripts are necessarily linked to a gene).
>
>         Suggested tools (not exhaustive): GO.db and org.Hs.eg.db packages,
>         maybe the DO.db package, etc... I'm not sure what would be the best
>         tool for this. But maybe you already have your list of genes?
>
>     2. Use the TxDb.Hsapiens.UCSC.hg19.knownGene + GenomicFeatures
>         packages to extract the coordinates of the 5'UTRs and 3'UTRs.
>         Use the fiveUTRsByTranscript() and threeUTRsByTranscript() functions
>         for this. They'll return the result in a GRangesList object (you'll
>         have to become a bit familiar with those objects first).
>
>     3. Use the BSgenome.Hsapiens.UCSC.hg19 package and the
>         extractTranscriptsFromGenome() function from the GenomicFeatures
>         package to extract the UTR sequences.
>         The name of the function is misleading but it can be used to extract
>         CDS or UTR sequences in addition to transcript sequences.
>
>     If you've never used those tools before, it will take you some time to
>     get familiarized with them. Your best friends are the man pages for the
>     individual functions/classes you're going to run into (don't miss the
>     examples section) and the vignettes in the GenomicRanges and
>     GenomicFeatures packages.
>
>     Let us know if you have specific questions or run into specific problems
>     (show us what you've done and explain the problem -- don't forget your
>     sessionInfo()).
>
>     Good luck,
>     H.
>
>
>     On 06/27/2013 01:58 AM, Abdul Rawoof wrote:
>
>         Hello everyone,
>
>
>         Could anyone show me how to extract the 3' and 5' UTRs and
>         exonic regions of all human genes from the Ensembl and KEGG
>         databases that are involved in a particular cancer, especially
>         breast cancer, using R/Bioconductor?
>
>         Thanks in advance.
>
>         Abdul Rawoof
>
>         _________________________________________________
>         Bioconductor mailing list
>         Bioconductor at r-project.org
>         https://stat.ethz.ch/mailman/listinfo/bioconductor
>         Search the archives:
>         http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
>     --
>     Hervé Pagès
>
>     Program in Computational Biology
>     Division of Public Health Sciences
>     Fred Hutchinson Cancer Research Center
>     1100 Fairview Ave. N, M1-B514
>     P.O. Box 19024
>     Seattle, WA 98109-1024
>
>     E-mail: hpages at fhcrc.org
>     Phone:  (206) 667-5791
>     Fax:    (206) 667-1319
>
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


