[BioC] zero-row result breaks select() on PolyPhen.Hsapiens.* and SIFT.Hsapiens.*

Valerie Obenchain vobencha at fhcrc.org
Mon Sep 23 21:41:50 CEST 2013


Hi Robert,

Thanks for reporting this. Now fixed in VariantAnnotation 1.7.47.
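
For anyone curious about the first error message quoted below, here is a 
minimal base-R sketch of how that kind of failure can arise. It is not the 
actual patch applied to VariantAnnotation; the variable names (keys, 
matches, found) are made up for illustration. The assumption is only that 
one key was requested and no rows matched it.

```r
# Illustrative only: these names are hypothetical, not VariantAnnotation code.
keys <- "dummy"            # one key was requested
matches <- character(0)    # nothing in the database matched it

# Combining a length-1 column with a length-0 column fails in data.frame().
# In an English locale the message reads
# "arguments imply differing number of rows: 1, 0".
err_msg <- tryCatch({
  data.frame(RSID = keys, PROTEINID = matches)
  NA_character_
}, error = function(e) conditionMessage(e))
print(err_msg)

# One way to avoid this: keep only the keys that were actually matched,
# so that every column has the same (possibly zero) length.
found <- keys[keys %in% matches]
empty <- data.frame(RSID = found, PROTEINID = matches,
                    stringsAsFactors = FALSE)
print(dim(empty))
```

The second data.frame() call yields a 0-row result with both columns 
present, which is the behaviour Robert expected from select().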

Have you looked at the ensemblVEP package? It's a wrapper to Ensembl's 
Variant Effect Predictor tool. We encourage the use of ensemblVEP 
instead of the SIFT and PolyPhen databases because it accesses the most 
current information. As you know, the SIFT and PolyPhen dbs are becoming 
dated and we don't have plans to package newer versions.

ensemblVEP requires that you download and install the script located here:

http://uswest.ensembl.org/info/docs/tools/vep/script/vep_download.html

The variant_effect_predictor.pl executable must be in your path. Let us 
know if you have trouble with the install/setup.
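
A rough sketch of what a first run might look like, assuming the script is 
installed and that ensemblVEP() with a default VEPParam() behaves as 
described in the package vignette at the time of this thread. The choice of 
ex2.vcf (an example file shipped with VariantAnnotation) is only for 
illustration.

```r
# Sys.which() returns "" when the executable is not found on the PATH.
vep_script <- unname(Sys.which("variant_effect_predictor.pl"))
vep_found <- nzchar(vep_script)

if (vep_found && requireNamespace("ensemblVEP", quietly = TRUE)) {
  library(ensemblVEP)
  # Example VCF from VariantAnnotation, used here as illustrative input.
  vcf_file <- system.file("extdata", "ex2.vcf",
                          package = "VariantAnnotation")
  # Assumption: with default parameters the result is a GRanges object
  # holding the predicted consequences.
  gr <- ensemblVEP(vcf_file, param = VEPParam())
  print(head(gr))
} else {
  message("variant_effect_predictor.pl was not found on the PATH, ",
          "or ensemblVEP is not installed")
}
```

If the message branch is reached, check the PATH first before looking at 
the R side.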

Valerie

On 09/20/2013 05:25 PM, Robert Castelo wrote:
> Dear list,
>
> interrogating the TxDb.Hsapiens.UCSC.hg19.knownGene package with no
> result gives the following expected result:
>
> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
> select(TxDb.Hsapiens.UCSC.hg19.knownGene, keys="dummy",
> keytype="GENEID", cols="SYMBOL")
> [1] GENEID
> <0 rows> (or 0-length row.names)
>
> however, when i try the same with the annotation packages
> PolyPhen.Hsapiens.dbSNP131 and SIFT.Hsapiens.dbSNP132, the select
> instruction breaks into an error:
>
> library(SIFT.Hsapiens.dbSNP132)
> library(PolyPhen.Hsapiens.dbSNP131)
>
> select(SIFT.Hsapiens.dbSNP132, keys=c("dummy"))
> Error in data.frame(RSID = unlist(rsid), PROTEINID = unlist(protein_id),  :
>    arguments imply differing number of rows: 1, 0
>
> select(PolyPhen.Hsapiens.dbSNP131, keys="dummy")
> Error in `*tmp*`$RSID : $ operator is invalid for atomic vectors
>
> i guess these two annotation packages should work analogously to
> TxDb.Hsapiens.UCSC.hg19.knownGene, and give just a 0-row data.frame
> object, right?
>
> these errors reproduce also with the current devel version of BioC,
> please find below both sessionInfo() outputs.
>
> cheers,
> robert.
>
> =====RELEASE====
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] parallel  stats     graphics  grDevices utils     datasets methods base
>
> other attached packages:
>   [1] TxDb.Hsapiens.UCSC.hg19.knownGene_2.9.2 GenomicFeatures_1.12.3
>   [3] AnnotationDbi_1.22.6 Biobase_2.20.1
>   [5] PolyPhen.Hsapiens.dbSNP131_1.0.2 SIFT.Hsapiens.dbSNP132_1.0.2
>   [7] RSQLite_0.11.4 DBI_0.2-7
>   [9] VariantAnnotation_1.6.7 Rsamtools_1.12.4
> [11] Biostrings_2.28.0 GenomicRanges_1.12.5
> [13] IRanges_1.18.3 BiocGenerics_0.6.0
> [15] vimcom_0.9-8 setwidth_1.0-3
> [17] colorout_1.0-0
>
> loaded via a namespace (and not attached):
> [1] biomaRt_2.16.0     bitops_1.0-6       BSgenome_1.28.0    RCurl_1.95-4.1     rtracklayer_1.20.4
> [6] stats4_3.0.1       tools_3.0.1        XML_3.95-0.2       zlibbioc_1.6.0
>
>
>
> =====DEVEL=====
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] parallel  stats     graphics  grDevices utils     datasets methods base
>
> other attached packages:
>   [1] TxDb.Hsapiens.UCSC.hg19.knownGene_2.9.2 GenomicFeatures_1.13.40
>   [3] AnnotationDbi_1.23.23 Biobase_2.21.7
>   [5] PolyPhen.Hsapiens.dbSNP131_1.0.2 SIFT.Hsapiens.dbSNP132_1.0.2
>   [7] RSQLite_0.11.4 DBI_0.2-7
>   [9] VariantAnnotation_1.7.46 Rsamtools_1.13.41
> [11] Biostrings_2.29.19 GenomicRanges_1.13.44
> [13] XVector_0.1.4 IRanges_1.19.37
> [15] BiocGenerics_0.7.5 vimcom_0.9-8
> [17] setwidth_1.0-3 colorout_1.0-0
>
> loaded via a namespace (and not attached):
> [1] biomaRt_2.17.3      bitops_1.0-6        BSgenome_1.29.1     RCurl_1.95-4.1      rtracklayer_1.21.12
> [6] stats4_3.0.1        tools_3.0.1         XML_3.95-0.2        zlibbioc_1.7.0
>