[BioC] Multi-sample VCF file and filterVcf

Sigve Nakken [guest] guest at bioconductor.org
Fri May 16 16:05:33 CEST 2014


Hi,

I have a multi-sample VCF file in which each record contains genotype information (e.g. GT, DP, AD) for a number of samples, as well as various annotation tags/values. I wonder how I can use VariantAnnotation, and its filterVcf function in particular, to parse the VCF and:

 a) subset/filter variants per sample according to genotype-level criteria (e.g. DP >= 10)
 b) subset/filter variants according to annotation criteria or the FILTER column
 c) output the filtered variants in a sample-wise manner
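
To make steps a) to c) concrete, here is a rough sketch of the kind of workflow I have in mind. It is only an illustration under assumptions, not a verified solution: the helper name filter_by_sample and the min_dp argument are made up for this example, it reads the whole file into memory with readVcf rather than streaming it through filterVcf, and it uses the small ex2.vcf file that ships with VariantAnnotation in place of my own data.

```r
library(VariantAnnotation)

# Hypothetical helper (not part of VariantAnnotation): returns one
# single-sample VCF object per sample, keeping only records that have
# FILTER == "PASS" and a per-sample depth (DP) of at least `min_dp`.
filter_by_sample <- function(vcf, min_dp = 10) {
  stopifnot("DP" %in% names(geno(vcf)))

  # (b) record-level criterion taken from the FILTER column
  pass <- filt(vcf) %in% "PASS"

  # (a) genotype-level criterion: DP is a variants x samples matrix
  dp <- geno(vcf)$DP
  depth_ok <- !is.na(dp) & dp >= min_dp

  # (c) one subset per sample, indexed by column position
  per_sample <- lapply(seq_len(ncol(vcf)), function(j) {
    vcf[pass & depth_ok[, j], j]
  })
  setNames(per_sample, colnames(vcf))
}

# Usage with the example file shipped with the package
fl <- system.file("extdata", "ex2.vcf", package = "VariantAnnotation")
vcf <- readVcf(fl, "hg19")
per_sample <- filter_by_sample(vcf, min_dp = 5)

# Write one file per sample, skipping samples with no remaining records
for (s in names(per_sample)) {
  if (nrow(per_sample[[s]]) > 0) {
    writeVcf(per_sample[[s]],
             file.path(tempdir(), paste0(s, ".filtered.vcf")))
  }
}
```

For large files I assume the same two predicates could instead be wrapped in FilterRules and passed to filterVcf through its filters argument, but I have not worked out how to get sample-wise output that way.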

I could not find any examples in the documentation that deal with multi-sample VCF files.

Thanks,
Sigve






 -- output of sessionInfo(): 

R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin13.1.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ggplot2_0.9.3.1          stringr_0.6.2            xtable_1.7-3             reshape2_1.4             lattice_0.20-29         
 [6] dplyr_0.1.3              VariantAnnotation_1.10.1 Rsamtools_1.16.0         Biostrings_2.32.0        XVector_0.4.0           
[11] GenomicRanges_1.16.3     GenomeInfoDb_1.0.2       IRanges_1.22.6           BiocGenerics_0.10.0     

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.26.0    assertthat_0.1          BatchJobs_1.2           BBmisc_1.6              Biobase_2.24.0         
 [6] BiocParallel_0.6.0      biomaRt_2.20.0          bitops_1.0-6            brew_1.0-6              BSgenome_1.32.0        
[11] codetools_0.2-8         colorspace_1.2-4        DBI_0.2-7               digest_0.6.4            evaluate_0.5.5         
[16] fail_1.2                foreach_1.4.2           formatR_0.10            GenomicAlignments_1.0.1 GenomicFeatures_1.16.0 
[21] grid_3.1.0              gtable_0.1.2            iterators_1.0.7         knitr_1.5               labeling_0.2           
[26] MASS_7.3-31             munsell_0.4.2           plyr_1.8.1              proto_0.3-10            Rcpp_0.11.1            
[31] RCurl_1.95-4.1          RSQLite_0.11.4          rtracklayer_1.24.0      scales_0.2.4            sendmailR_1.1-2        
[36] stats4_3.1.0            tools_3.1.0             XML_3.98-1.1            zlibbioc_1.10.0 

--
Sent via the guest posting facility at bioconductor.org.
