I just stumbled upon an oddity with rtracklayer and data generated from the NCBI/Ensembl annotation of the Drosophila genome. A bigWig file saved with the official NCBI/Ensembl/FlyBase mitochondrial chromosome name, dmel_mitochondrion_genome, cannot be imported back into R, although the same file loads fine in other tools such as the Broad IGV. Changing the chromosome name to something shorter fixes the issue. Is this a bug, or is it by design? It looks like a bug to me. Here is a sample script that reproduces the issue.

Thanks

library(GenomicFeatures)
library(rtracklayer)

## Retrieving the chromosome info from BioMart
chr.info <- getChromInfoFromBiomart(biomart="ensembl",
                                    dataset="dmelanogaster_gene_ensembl")

## Simulate some coverage along all the main chromosomes
## Randomly distribute 50 nt reads along the chromosomes in either orientation
density <- 0.05
nreads <- sapply(chr.info$length, function(length) round(rnorm(1,length*density,(length*density)*0.075)))
chrs <- Rle(rep(chr.info$chrom,nreads))
starts <- unlist(mapply(function(n,l) ceiling((runif(n,0,l))),nreads,chr.info$length))
strands <- Rle(unlist(sapply(nreads,function(n) sample(c('+','-'),n,replace=TRUE))))

## Create a GRanges object simulating the read positions
GR <- GRanges(chrs,IRanges(starts,width=50),strands)

## Computing the coverage along the chromosomes
cov <- coverage(GR)

## Saving as bigwig
export(cov,'cov.bw')

## Reading back the bigwig
bw <- import('cov.bw')

## This is the error I get:
## Error in .local(con, format, text, ...) : UCSC library operation failed
## In addition: Warning message:
## In .local(con, format, text, ...) :
##   Value size mismatch between bptFileFind (valSize=0) and cov.bw (valSize=8)

## Replacing the name dmel_mitochondrion_genome with MT
names(cov)[grep("mito",names(cov))] <- 'MT'

## Saving as bigwig
export(cov,'cov_MT.bw')

## Reading back the bigwig with the modified name now works
bw <- import('cov_MT.bw')
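
As a rough sketch of how one might spot long sequence names before exporting, the snippet below ranks names by character count and applies the rename through a lookup table so the original names can be restored afterwards. The vector `seq.names` is a hypothetical stand-in for `names(cov)`, and `rename.map` and `reverse.map` are illustrative helpers, not part of rtracklayer; only the mitochondrial name and its MT replacement come from the script above. It uses base R only and says nothing about why the import fails.

```r
## Hypothetical stand-in for names(cov); apart from the mitochondrial
## name, these values are assumed for illustration only.
seq.names <- c("2L", "2R", "3L", "3R", "4", "X", "dmel_mitochondrion_genome")

## Number of characters in each name, longest first
name.lengths <- sort(setNames(nchar(seq.names), seq.names), decreasing = TRUE)
print(name.lengths)

## Illustrative lookup table: original name -> shorter name
rename.map <- c(dmel_mitochondrion_genome = "MT")

## Replace only the names that appear in the lookup table
to.rename <- seq.names %in% names(rename.map)
short.names <- seq.names
short.names[to.rename] <- rename.map[seq.names[to.rename]]

## Reverse the renaming to recover the original names
reverse.map <- setNames(names(rename.map), rename.map)
is.short <- short.names %in% names(reverse.map)
restored <- short.names
restored[is.short] <- reverse.map[short.names[is.short]]
```

With the real data, `seq.names` would presumably be taken from `names(cov)`, and `short.names` assigned back to `names(cov)` before the export, in place of the grep-based replacement.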

---
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] rtracklayer_1.22.7     GenomicFeatures_1.14.5 AnnotationDbi_1.24.0
[4] Biobase_2.22.0         GenomicRanges_1.14.4   XVector_0.2.0
[7] IRanges_1.20.7         BiocGenerics_0.8.0

loaded via a namespace (and not attached):
 [1] biomaRt_2.18.0    Biostrings_2.30.1 bitops_1.0-6      BSgenome_1.30.0
 [5] compiler_3.0.2    DBI_0.2-7         RCurl_1.95-4.1    Rsamtools_1.14.3
 [9] RSQLite_0.11.4    stats4_3.0.2      tools_3.0.2       XML_3.98-1.1
[13] zlibbioc_1.8.0

--  Marco Blanchette, Ph.D.
Genomic Scientist
Stowers Institute for Medical Research
1000 East 50th Street
Kansas City MO 64110
www.stowers.ot

Tel: 816-926-4071
Cell: 816-726-8419
Fax: 816-926-2018


