[BioC] Multi-sample VCF file and filterVcf
Sigve Nakken [guest]
guest at bioconductor.org
Fri May 16 16:05:33 CEST 2014
Hi,
I have a multi-sample VCF file in which each record contains genotype information (e.g. GT, DP, AD, etc.) for a number of samples, as well as various annotation tags/values. I would like to know how to use VariantAnnotation together with its filterVcf function to parse the VCF and
a) subset/filter variants in samples that satisfy given criteria (e.g. DP >= 10)
b) subset/filter variants according to annotation criteria or the FILTER column
c) output the filtered variants in a sample-wise manner
I could not find any examples in the documentation that deal with multi-sample VCF files.
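To make a) to c) concrete, the intended logic could be sketched roughly as follows in plain Python. This is not VariantAnnotation or filterVcf code; it only illustrates the kind of per-sample result being asked for. The VCF records, the sample names S1 to S3, the helper name filter_per_sample and its parameters are all made up for illustration, and the DP >= 10 threshold is just the example from a).

```python
from collections import defaultdict

# Hypothetical three-sample VCF, invented purely for illustration.
VCF_TEXT = """\
##fileformat=VCFv4.1
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3
1\t100\t.\tA\tG\t50\tPASS\tDP=40\tGT:DP\t0/1:12\t0/0:30\t1/1:8
1\t200\t.\tC\tT\t20\tq10\tDP=35\tGT:DP\t0/1:15\t0/1:20\t0/0:11
1\t300\t.\tG\tA\t60\tPASS\tDP=25\tGT:DP\t0/0:9\t1/1:10\t0/1:.
"""


def filter_per_sample(vcf_text, min_dp=10, keep_filters=("PASS",)):
    """Return {sample: [(chrom, pos, ref, alt, gt, dp), ...]}.

    A call is kept when the record's FILTER value is in keep_filters (b)
    and the sample's own DP is at least min_dp (a). Grouping the kept
    calls by sample gives the sample-wise output (c).
    """
    per_sample = defaultdict(list)
    samples = []
    for line in vcf_text.splitlines():
        if line.startswith("##") or not line.strip():
            continue  # skip meta-information and blank lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        chrom, pos, _id, ref, alt, _qual, filt, _info, fmt = fields[:9]
        if filt not in keep_filters:
            continue  # (b) record-level filter on the FILTER column
        keys = fmt.split(":")
        for sample, call in zip(samples, fields[9:]):
            values = dict(zip(keys, call.split(":")))
            dp = values.get("DP", ".")
            if dp == "." or int(dp) < min_dp:
                continue  # (a) genotype-level filter; missing DP is dropped
            per_sample[sample].append(
                (chrom, int(pos), ref, alt, values.get("GT", "."), int(dp))
            )
    return dict(per_sample)


# (c) one block of output per sample
for sample, calls in filter_per_sample(VCF_TEXT).items():
    print(sample, calls)
```

Note that this sketch filters on DP only, so homozygous-reference calls (0/0) with enough depth are still reported for a sample; whether those should count as "variants in a sample" is a separate choice.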
Thanks,
Sigve
-- output of sessionInfo():
R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin13.1.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] ggplot2_0.9.3.1 stringr_0.6.2 xtable_1.7-3 reshape2_1.4 lattice_0.20-29
[6] dplyr_0.1.3 VariantAnnotation_1.10.1 Rsamtools_1.16.0 Biostrings_2.32.0 XVector_0.4.0
[11] GenomicRanges_1.16.3 GenomeInfoDb_1.0.2 IRanges_1.22.6 BiocGenerics_0.10.0
loaded via a namespace (and not attached):
[1] AnnotationDbi_1.26.0 assertthat_0.1 BatchJobs_1.2 BBmisc_1.6 Biobase_2.24.0
[6] BiocParallel_0.6.0 biomaRt_2.20.0 bitops_1.0-6 brew_1.0-6 BSgenome_1.32.0
[11] codetools_0.2-8 colorspace_1.2-4 DBI_0.2-7 digest_0.6.4 evaluate_0.5.5
[16] fail_1.2 foreach_1.4.2 formatR_0.10 GenomicAlignments_1.0.1 GenomicFeatures_1.16.0
[21] grid_3.1.0 gtable_0.1.2 iterators_1.0.7 knitr_1.5 labeling_0.2
[26] MASS_7.3-31 munsell_0.4.2 plyr_1.8.1 proto_0.3-10 Rcpp_0.11.1
[31] RCurl_1.95-4.1 RSQLite_0.11.4 rtracklayer_1.24.0 scales_0.2.4 sendmailR_1.1-2
[36] stats4_3.1.0 tools_3.1.0 XML_3.98-1.1 zlibbioc_1.10.0
--
Sent via the guest posting facility at bioconductor.org.