[BioC] Regarding extraction of 3' and 5'UTRs and exonic region of a gene.
Hervé Pagès
hpages at fhcrc.org
Thu Jun 27 19:31:03 CEST 2013
Hi Abdul,
Suggested workflow:
1. Build the list of genes involved in the particular cancer you're
interested in. Could be a vector of gene ids or transcript ids (not
all transcripts are necessarily linked to a gene).
Suggested tools (not exhaustive): GO.db and org.Hs.eg.db packages,
maybe the DO.db package, etc... I'm not sure what would be the best
tool for this. But maybe you already have your list of genes?
2. Use the TxDb.Hsapiens.UCSC.hg19.knownGene + GenomicFeatures packages
to extract the coordinates of the 5'UTRs and 3'UTRs.
Use the fiveUTRsByTranscript() and threeUTRsByTranscript() functions
for this. They'll return the result in a GRangesList object (you'll
have to become a bit familiar with those objects first).
3. Use the BSgenome.Hsapiens.UCSC.hg19 package and the
extractTranscriptsFromGenome() function from the GenomicFeatures
package to extract the UTR sequences.
The name of the function is misleading but it can be used to extract
CDS or UTR sequences in addition to transcript sequences.
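Steps 2 and 3 could look roughly like the sketch below. Treat it as a
starting point under a few assumptions, not as a tested recipe: the two
Entrez gene ids (672 and 675, i.e. BRCA1 and BRCA2) are only a
placeholder for your own list from step 1, and all the variable names
are made up for illustration. Also, in later GenomicFeatures releases
extractTranscriptsFromGenome() was superseded by
extractTranscriptSeqs(), so the sketch uses the newer function; if your
installation only has the older one, use that instead. The exonsBy()
call is not part of the workflow above; it is included only because the
original question also asks for exonic regions.

```r
library(GenomicFeatures)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
library(BSgenome.Hsapiens.UCSC.hg19)

txdb   <- TxDb.Hsapiens.UCSC.hg19.knownGene
genome <- BSgenome.Hsapiens.UCSC.hg19

## Placeholder gene list (Entrez ids for BRCA1 and BRCA2).
## Replace with the list you built in step 1.
gene_ids <- c("672", "675")

## Map the gene ids to the names of their transcripts.
tx_map <- AnnotationDbi::select(txdb, keys = gene_ids,
                                columns = "TXNAME", keytype = "GENEID")
tx_names <- unique(tx_map$TXNAME)

## Step 2: UTR coordinates for all transcripts, as GRangesList objects
## named by transcript name.
five_utrs  <- fiveUTRsByTranscript(txdb, use.names = TRUE)
three_utrs <- threeUTRsByTranscript(txdb, use.names = TRUE)

## Keep only the transcripts of interest. Non-coding transcripts have
## no UTRs, so some names in tx_names may be absent from these objects.
my_five_utrs  <- five_utrs[names(five_utrs) %in% tx_names]
my_three_utrs <- three_utrs[names(three_utrs) %in% tx_names]

## Exon coordinates grouped by transcript (addition to the workflow).
exons_by_tx <- exonsBy(txdb, by = "tx", use.names = TRUE)
my_exons <- exons_by_tx[names(exons_by_tx) %in% tx_names]

## Step 3: extract the UTR sequences (one DNA sequence per transcript).
five_utr_seqs  <- extractTranscriptSeqs(genome, my_five_utrs)
three_utr_seqs <- extractTranscriptSeqs(genome, my_three_utrs)
```

The results are DNAStringSet objects with one element per transcript,
in the same order as the GRangesList objects they were extracted from.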
If you've never used those tools before, it will take you some time to
become familiar with them. Your best friends are the man pages for the
individual functions/classes you're going to run into (don't miss the
examples section) and the vignettes in the GenomicRanges and
GenomicFeatures packages.
Let us know if you have specific questions or run into specific problems
(show us what you've done and explain the problem -- don't forget your
sessionInfo()).
Good luck,
H.
On 06/27/2013 01:58 AM, Abdul Rawoof wrote:
> Hello everyone,
>
>
> Could anyone show me the way how can I extract the *3' and 5' UTRs and
> exonic regions *of all *Human genes* from *Ensembl and Kegg database* that
> are involved in particular cancer specially *breast cancer *using
> R/Biocondutor.
>
> Thanks in advance.
>
> Abdul Rawoof
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319