[BioC] GenomicFeatures.Hsapiens.UCSC.hg19

Hervé Pagès hpages at fhcrc.org
Sat Apr 24 23:26:55 CEST 2010


Hi Burak,

Burak Kutlu wrote:
> HI
> Are there plans to release hg19 of GenomicFeatures.Hsapiens.UCSC soon?
> We would greatly appreciate it.

The GenomicFeatures *data* packages are deprecated. Use the
GenomicFeatures *software* package instead (make sure you use the
current BioC release, i.e. BioC 2.6, which requires R-2.11) and do:

   > txdb <- makeTranscriptDbFromUCSC("hg19")

   > txdb
   TranscriptDb object:
   | Db type: TranscriptDb
   | Data source: UCSC
   | Genome: hg19
   | UCSC Table: knownGene
   | Type of Gene ID: Entrez Gene ID
   | Full dataset: yes
   | transcript_nrow: 77614
   | exon_nrow: 281605
   | cds_nrow: 236664
   | Db created by: GenomicFeatures package from Bioconductor
   | Creation time: 2010-04-24 14:22:14 -0700 (Sat, 24 Apr 2010)
   | GenomicFeatures version at creation time: 1.0.0
   | RSQLite version at creation time: 0.8-4
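
Building the TranscriptDb fetches the tables from UCSC every time,
so it can be worth keeping a local copy. A minimal sketch, assuming
the saveFeatures()/loadFeatures() functions of the GenomicFeatures
1.0.x series (later releases may name them differently); the file
name is made up for illustration:

```r
library(GenomicFeatures)

# Build the annotation once (needs network access to UCSC).
# "knownGene" is the table reported in the printout above.
txdb <- makeTranscriptDbFromUCSC(genome = "hg19", tablename = "knownGene")

# Hypothetical file name: any writable path will do.
txdb_file <- "hg19_knownGene.sqlite"

# Write the underlying SQLite database to disk ...
saveFeatures(txdb, file = txdb_file)

# ... and reload it in a later session without hitting UCSC again.
txdb <- loadFeatures(txdb_file)
```

Note that a reloaded database reflects the UCSC tables as of its
creation time, so rebuild it when you need fresher annotations.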

Then have a look at the vignettes in GenomicFeatures to learn how
to query 'txdb' (a TranscriptDb object). For example, to extract
the exon structure of all the transcripts (or, in other words, to
extract the exons grouped by transcript):

 > exbytx <- exonsBy(txdb, "tx")
 > exbytx
GRangesList of length 77614
$1
GRanges with 3 ranges and 3 elementMetadata values
     seqnames         ranges strand |   exon_id   exon_name exon_rank
        <Rle>      <IRanges>  <Rle> | <integer> <character> <integer>
[1]     chr1 [11874, 12227]      + |         1          NA         1
[2]     chr1 [12613, 12721]      + |         2          NA         2
[3]     chr1 [13221, 14409]      + |         3          NA         3

$2
GRanges with 3 ranges and 3 elementMetadata values
     seqnames         ranges strand |   exon_id   exon_name exon_rank
        <Rle>      <IRanges>  <Rle> | <integer> <character> <integer>
[1]     chr1 [11874, 12227]      + |         1          NA         1
[2]     chr1 [12595, 12721]      + |         4          NA         2
[3]     chr1 [13403, 14409]      + |         5          NA         3

$3
GRanges with 3 ranges and 3 elementMetadata values
     seqnames         ranges strand |   exon_id   exon_name exon_rank
        <Rle>      <IRanges>  <Rle> | <integer> <character> <integer>
[1]     chr1 [11874, 12227]      + |         1          NA         1
[2]     chr1 [12646, 12697]      + |         6          NA         2
[3]     chr1 [13221, 14409]      + |         3          NA         3

...
<77611 more elements>


seqlengths
                   chr1                  chr2 ... chr18_gl000207_random
              249250621             243199373 ...                  4262
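
The result is a list-like object with one GRanges per transcript,
so the usual list accessors apply. A small sketch of a few follow-up
queries, assuming 'txdb' and 'exbytx' from the session above; the
variable names tx1, cdsbytx and txbygene are only illustrative:

```r
library(GenomicFeatures)

# Number of transcripts (77614 in the session above).
length(exbytx)

# Exons of the first transcript, as a single GRanges object.
tx1 <- exbytx[[1]]

# Length of each exon, and the spliced length of the transcript.
width(tx1)
sum(width(tx1))

# The same grouping idea works for other features and groupings:
cdsbytx  <- cdsBy(txdb, "tx")            # CDS parts grouped by transcript
txbygene <- transcriptsBy(txdb, "gene")  # transcripts grouped by gene
```

Since the gene IDs are Entrez Gene IDs for this table, the names of
the object grouped by gene should be Entrez Gene IDs as well.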

Cheers,
H.

> Thanks
> -burak
> 

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


