[BioC] zero-row result breaks select() on PolyPhen.Hsapiens.* and SIFT.Hsapiens.*
Valerie Obenchain
vobencha at fhcrc.org
Mon Sep 23 21:41:50 CEST 2013
Hi Robert,
Thanks for reporting this. Now fixed in VariantAnnotation 1.7.47.
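For anyone curious about the failure mode: data.frame() refuses to combine a length-1 column with a length-0 column, which is what the SIFT error message describes. The sketch below is only an illustration of the general guard pattern in base R, not the actual change made in VariantAnnotation; the helper name safe_frame and its arguments are hypothetical.

```r
# Hypothetical helper (not VariantAnnotation code): build a two-column
# data.frame, returning a zero-row frame when either input is empty.
safe_frame <- function(rsid, protein_id) {
  rsid <- unlist(rsid)
  protein_id <- unlist(protein_id)
  # Guard: if there is nothing to report, return an empty, well-formed result
  if (length(rsid) == 0L || length(protein_id) == 0L) {
    return(data.frame(RSID = character(0), PROTEINID = character(0),
                      stringsAsFactors = FALSE))
  }
  data.frame(RSID = rsid, PROTEINID = protein_id, stringsAsFactors = FALSE)
}

# A key with no match yields a zero-row data.frame instead of an error
res <- safe_frame(list("dummy"), list())
```

With this kind of guard, a query that matches nothing gives back an empty data.frame with the expected column names, analogous to what the TxDb package returns.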
Have you looked at the ensemblVEP package? It's a wrapper to Ensembl's
Variant Effect Predictor tool. We encourage the use of ensemblVEP
instead of the SIFT and PolyPhen databases because it accesses the most
current information. As you know, the SIFT and PolyPhen dbs are becoming
dated and we don't have plans to package newer versions.
ensemblVEP requires that you download and install the script located here,
http://uswest.ensembl.org/info/docs/tools/vep/script/vep_download.html
The variant_effect_predictor.pl executable must be in your path. Let us
know if you have trouble with the install/setup.
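In case it helps, one quick way to check the path requirement from within R is sketched below. It uses only base R (Sys.which) and is just a convenience check I am assuming would be useful, not something the ensemblVEP package requires you to run.

```r
# Look up the VEP script on the PATH; on Unix-alikes Sys.which()
# returns an empty string when the executable is not found.
vep <- Sys.which("variant_effect_predictor.pl")

if (nzchar(vep)) {
  message("Found VEP script at: ", vep)
} else {
  message("variant_effect_predictor.pl is not on the PATH")
}
```

If the second message appears, add the directory containing the script to your PATH and restart R before trying again.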
Valerie
On 09/20/2013 05:25 PM, Robert Castelo wrote:
> Dear list,
>
> interrogating the TxDb.Hsapiens.UCSC.hg19.knownGene package with no
> result gives the following expected result:
>
> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
> select(TxDb.Hsapiens.UCSC.hg19.knownGene, keys="dummy",
> keytype="GENEID", cols="SYMBOL")
> [1] GENEID
> <0 rows> (or 0-length row.names)
>
> however, when i try the same with the annotation packages
> PolyPhen.Hsapiens.dbSNP131 and SIFT.Hsapiens.dbSNP132, the select
> instruction breaks into an error:
>
> library(SIFT.Hsapiens.dbSNP132)
> library(PolyPhen.Hsapiens.dbSNP131)
>
> select(SIFT.Hsapiens.dbSNP132, keys=c("dummy"))
> Error in data.frame(RSID = unlist(rsid), PROTEINID = unlist(protein_id), :
> arguments imply differing number of rows: 1, 0
>
> select(PolyPhen.Hsapiens.dbSNP131, keys="dummy")
> Error in `*tmp*`$RSID : $ operator is invalid for atomic vectors
>
> i guess these two annotation packages should work analogously to
> TxDb.Hsapiens.UCSC.hg19.knownGene, and give just a 0-row data.frame
> object, right?
>
> these errors reproduce also with the current devel version of BioC,
> please find below both sessionInfo() outputs.
>
> cheers,
> robert.
>
> =====RELEASE====
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] parallel stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] TxDb.Hsapiens.UCSC.hg19.knownGene_2.9.2 GenomicFeatures_1.12.3
> [3] AnnotationDbi_1.22.6 Biobase_2.20.1
> [5] PolyPhen.Hsapiens.dbSNP131_1.0.2 SIFT.Hsapiens.dbSNP132_1.0.2
> [7] RSQLite_0.11.4 DBI_0.2-7
> [9] VariantAnnotation_1.6.7 Rsamtools_1.12.4
> [11] Biostrings_2.28.0 GenomicRanges_1.12.5
> [13] IRanges_1.18.3 BiocGenerics_0.6.0
> [15] vimcom_0.9-8 setwidth_1.0-3
> [17] colorout_1.0-0
>
> loaded via a namespace (and not attached):
> [1] biomaRt_2.16.0 bitops_1.0-6 BSgenome_1.28.0 RCurl_1.95-4.1 rtracklayer_1.20.4
> [6] stats4_3.0.1 tools_3.0.1 XML_3.95-0.2 zlibbioc_1.6.0
>
>
>
> =====DEVEL=====
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] parallel stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] TxDb.Hsapiens.UCSC.hg19.knownGene_2.9.2 GenomicFeatures_1.13.40
> [3] AnnotationDbi_1.23.23 Biobase_2.21.7
> [5] PolyPhen.Hsapiens.dbSNP131_1.0.2 SIFT.Hsapiens.dbSNP132_1.0.2
> [7] RSQLite_0.11.4 DBI_0.2-7
> [9] VariantAnnotation_1.7.46 Rsamtools_1.13.41
> [11] Biostrings_2.29.19 GenomicRanges_1.13.44
> [13] XVector_0.1.4 IRanges_1.19.37
> [15] BiocGenerics_0.7.5 vimcom_0.9-8
> [17] setwidth_1.0-3 colorout_1.0-0
>
> loaded via a namespace (and not attached):
> [1] biomaRt_2.17.3 bitops_1.0-6 BSgenome_1.29.1 RCurl_1.95-4.1 rtracklayer_1.21.12
> [6] stats4_3.0.1 tools_3.0.1 XML_3.95-0.2 zlibbioc_1.7.0
>