[Bioc-devel] seqnames missing in headerTabix()

Anita Lerch anita.lerch at fmi.ch
Tue Aug 23 16:02:40 CEST 2011


Hi,

I tried to stream a GTF file from Ensembl with the Tabix methods.
Creating the index file seems to work, but when I checked it
with headerTabix(tbx)$seqnames I got character(0).
Of course, scanTabix() didn't work then either.
I do not have this problem with the example file in the Rsamtools
package.
Does anybody have an explanation for this?

Thanks in advance,
Anita

> library(Rsamtools)
> url <- "ftp://ftp.ensembl.org/pub/release-62/gtf/drosophila_melanogaster/Drosophila_melanogaster.BDGP5.25.62.gtf.gz"
> gtfFn <- "Drosophila_melanogaster.BDGP5.25.62.gtf.gz"
> download.file(url, gtfFn, "wget")
> indexTabix(gtfFn, format="gff")
[1] "Drosophila_melanogaster.BDGP5.25.62.gtf.gz.tbi"
> tbx <- open(TabixFile(gtfFn))
> headerTabix(tbx)
$seqnames
character(0)

$indexColumns
  seq start   end 
    1     4     5 

$skip
[1] 0

$comment
[1] "#"

$header
character(0)

> seqnamesTabix(tbx)
character(0)
> cat(yieldTabix(tbx, yieldSize=1L))
> param <- GRanges(c("3L", "3R"), IRanges(c(1, 1), width=100000))
> scanTabix(tbx, param=param)
Error: scanTabix: '3L' not present in tabix index
  path: /home_fmi/01/lerchani/workspace/Drosophila_melanogaster.BDGP5.25.62.gtf.gz

> sessionInfo()
R Under development (unstable) (2011-08-23 r56776)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=C                 LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] Rsamtools_1.5.51     Biostrings_2.21.9    GenomicRanges_1.5.28 IRanges_1.11.24     

loaded via a namespace (and not attached):
[1] BSgenome_1.21.3     RCurl_1.6-9         rtracklayer_1.13.11 tools_2.14.0        XML_3.4-2           zlibbioc_0.1.7   

-- 
Anita Lerch
Friedrich Miescher Institute
Maulbeerstrasse 66
WRO-1066.P22
4058 Basel
Phone: +41 (0)61 697 5172
Email: anita.lerch at fmi.ch


