[BioC] [VariantAnnotation] subsetting VCF objects
Paul Theodor Pyl
paul.theodor.pyl at embl.de
Wed Nov 14 14:41:24 CET 2012
Hi all,
I am reading some .vcf files with the readVcf function and noticed
that I cannot subset the resulting VCF object when its info field is
empty (see the example below).
Is there a workaround other than loading at least part of the info field?
Thanks,
Paul
The Example:
> vcf_full = readVcf("test.vcf.gz", "hg19")
> vcf_no_info = readVcf("test.vcf.gz", "hg19",
+     param = ScanVcfParam(geno=c("GT","GQ"), fixed="ALT", info=NA))
> vcf_full
class: VCF
dim: 71128 2
genome: hg19
exptData(1): header
fixed(4): REF ALT QUAL FILTER
info(22): AC AF ... SB STR
geno(5): AD DP GQ GT PL
rownames(71128): rs62224610 rs141578542 ... 22:51243743 22:51244332
rowData values names(1): paramRangeID
colnames(2): sample_one sample_two
colData names(1): Samples
> vcf_no_info
class: VCF
dim: 71128 2
genome: hg19
exptData(1): header
fixed(2): REF ALT
info(0):
geno(2): GQ GT
rownames(71128): rs62224610 rs141578542 ... 22:51243743 22:51244332
rowData values names(1): paramRangeID
colnames(2): sample_one sample_two
colData names(1): Samples
> vcf_full[1:10]
class: VCF
dim: 10 2
genome: hg19
exptData(1): header
fixed(4): REF ALT QUAL FILTER
info(22): AC AF ... SB STR
geno(5): AD DP GQ GT PL
rownames(10): rs62224610 rs141578542 ... 22:16058463 rs149413786
rowData values names(1): paramRangeID
colnames(2): sample_one sample_two
colData names(1): Samples
> vcf_no_info[1:10]
Error in slot(x, "info")[i, , drop = FALSE] :
selecting rows: subscript contains NAs or out of bounds indices
> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] VariantAnnotation_1.4.3 Rsamtools_1.10.2 Biostrings_2.26.2
[4] GenomicRanges_1.10.5 IRanges_1.16.4 BiocGenerics_0.4.0
loaded via a namespace (and not attached):
[1] AnnotationDbi_1.20.2 Biobase_2.18.0 biomaRt_2.14.0
[4] bitops_1.0-5 BSgenome_1.26.1 compiler_2.15.2
[7] DBI_0.2-5 GenomicFeatures_1.10.0 parallel_2.15.2
[10] RCurl_1.95-3 RSQLite_0.11.2 rtracklayer_1.18.0
[13] stats4_2.15.2 tools_2.15.2 XML_3.95-0.1
[16] zlibbioc_1.4.0