[Bioc-devel] exonsBy dropping genes from TxDb

Leonard Goldstein goldstein.leonard at gene.com
Fri Oct 27 18:07:06 CEST 2017


Dear bioc-devel,

I noticed that exonsBy drops a large number of genes when run on a TxDb
object created with makeTxDbFromBiomart. In the session below,
transcriptsBy returns 63967 genes but exonsBy returns only 36751. Please
also see the related post on the Bioconductor support site:

https://support.bioconductor.org/p/101951/#102160

Thanks for your help.

Leonard

--
> library(GenomicFeatures)
> tx <- makeTxDbFromBiomart()
> txs_by_gene <- transcriptsBy(tx, "gene")
> exs_by_gene <- exonsBy(tx, "gene")
> length(txs_by_gene)
[1] 63967
> length(exs_by_gene)
[1] 36751
> subsetByOverlaps(txs_by_gene, GRanges("8", IRanges(127735434,127741434)))
GRangesList object of length 1:
$ENSG00000136997
GRanges object with 9 ranges and 2 metadata columns:
      seqnames                 ranges strand |     tx_id         tx_name
         <Rle>              <IRanges>  <Rle> | <integer>     <character>
  [1]        8 [127735434, 127740477]      + |    101876 ENST00000259523
  [2]        8 [127735473, 127735817]      + |    101877 ENST00000641252
  [3]        8 [127735519, 127738772]      + |    101878 ENST00000517291
  [4]        8 [127736046, 127736612]      + |    101879 ENST00000641036
  [5]        8 [127736069, 127741434]      + |    101880 ENST00000621592
  [6]        8 [127736084, 127741434]      + |    101881 ENST00000377970
  [7]        8 [127736220, 127741372]      + |    101882 ENST00000524013
  [8]        8 [127736231, 127738475]      + |    101883 ENST00000520751
  [9]        8 [127736594, 127740958]      + |    101884 ENST00000613283

-------
seqinfo: 555 sequences (1 circular) from an unspecified genome
> subsetByOverlaps(exs_by_gene, GRanges("8", IRanges(127735434,127741434)))
GRangesList object of length 0:
<0 elements>

-------
seqinfo: 555 sequences (1 circular) from an unspecified genome
> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: OS X El Capitan 10.11.6

Matrix products: default
BLAS:
/Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK:
/Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] GenomicFeatures_1.28.5 AnnotationDbi_1.38.2   Biobase_2.36.2
[4] GenomicRanges_1.28.6   GenomeInfoDb_1.12.3    IRanges_2.10.5
[7] S4Vectors_0.14.7       BiocGenerics_0.22.1

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.13               XVector_0.16.0
 [3] GenomicAlignments_1.12.2   zlibbioc_1.22.0
 [5] BiocParallel_1.10.1        bit_1.1-12
 [7] lattice_0.20-35            rlang_0.1.2
 [9] blob_1.1.0                 tools_3.4.1
[11] grid_3.4.1                 SummarizedExperiment_1.6.5
[13] DBI_0.7                    matrixStats_0.52.2
[15] bit64_0.9-7                digest_0.6.12
[17] tibble_1.3.4               Matrix_1.2-11
[19] GenomeInfoDbData_0.99.0    rtracklayer_1.36.6
[21] bitops_1.0-6               RCurl_1.95-4.8
[23] biomaRt_2.32.1             memoise_1.1.0
[25] RSQLite_2.0                DelayedArray_0.2.7
[27] compiler_3.4.1             Rsamtools_1.28.0
[29] Biostrings_2.44.2          XML_3.98-1.9
[31] pkgconfig_2.0.1
>
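
To list exactly which genes are missing, one option is to compare the names
of the two groupings. The following is a minimal sketch, not a fix:
dropped_genes is a hypothetical helper (not part of GenomicFeatures), and the
toy lists with the made-up labels GENE_A and GENE_B only stand in for the
GRangesList objects, since the helper uses nothing but names().

```r
# Hypothetical helper (not part of GenomicFeatures): return the names that
# are present in the transcript grouping but absent from the exon grouping.
dropped_genes <- function(txs_by_gene, exs_by_gene) {
  setdiff(names(txs_by_gene), names(exs_by_gene))
}

# Toy stand-ins for the GRangesList objects; GENE_A and GENE_B are made-up
# labels, and the list contents are irrelevant because only names() is used.
txs_toy <- list(ENSG00000136997 = 1:9, GENE_A = 1:2, GENE_B = 1L)
exs_toy <- list(GENE_A = 1:5)

dropped <- dropped_genes(txs_toy, exs_toy)
print(dropped)
```

On the session objects the call would be dropped_genes(txs_by_gene,
exs_by_gene). If every gene in exs_by_gene also appears in txs_by_gene, this
should return 63967 - 36751 = 27216 names, presumably including
ENSG00000136997.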
