[Bioc-devel] asVCF error coming from normalizeSingleBracketSubscript
Valerie Obenchain
vobencha at fhcrc.org
Mon Nov 25 23:08:42 CET 2013
Hi Stephanie,
The error is thrown from SeqArray:::.info at line 216 in the file and is
related to the handing of NA values.
x[x == ""] <- NA
Output from the == comparison can contain NAs and therefore can't be
used (consistently) in subsetting operations.
'x' is a NumericList.
Browse[2]> x
NumericList of length 5
[[1]] 0.5
[[2]] 0.017000000923872
[[3]] 0.333000004291534 0.666999995708466
[[4]] <NA>
[[5]] <NA> <NA>
Here we see NAs returned for the NA values,
Browse[2]> x == ""
LogicalList of length 5
[[1]] FALSE
[[2]] FALSE
[[3]] FALSE FALSE
[[4]] <NA>
[[5]] <NA> <NA>
which fail on subsetting.
Browse[2]> x[x == ""]
Error in normalizeSingleBracketSubscript(i, x) : subscript contains NAs
One solution is use %in% which does not return NAs.
Browse[2]> x %in% ""
LogicalList of length 5
[[1]] FALSE
[[2]] FALSE
[[3]] FALSE FALSE
[[4]] FALSE
[[5]] FALSE FALSE
Valerie
On 11/22/2013 03:11 PM, Stephanie M. Gogarten wrote:
> Hi Valerie,
>
> The asVCF method in SeqArray is failing as of today with a (to me)
> mysterious error. I get it for the test files chr22.vcf.gz, ex2.vcf,
> and gl_chr1.vcf in extdata of VariantAnnotation, but not for
> SeqArray/extdata/CEU_Exon.vcf. Do you have any suggestions of where I
> might look to figure out where this error is coming from?
>
> thanks,
> Stephanie
>
> > vcffile <- system.file("extdata", "ex2.vcf",
> package="VariantAnnotation")
> > gdsfile <- tempfile()
> > seqVCF2GDS(vcffile, gdsfile)
> > gdsobj <- seqOpen(gdsfile)
> > options(error=recover)
> > vcfg <- asVCF(gdsobj)
> Error in normalizeSingleBracketSubscript(i, x) : subscript contains NAs
>
> Enter a frame number, or 0 to exit
>
> 1: asVCF(gdsobj)
> 2: asVCF(gdsobj)
> 3: .local(x, ...)
> 4: VCF(rowData = .rowData(x), colData = .colData(x), exptData =
> SimpleList(hea
> 5: .info(x, info)
> 6: `[<-`(`*tmp*`, x == "", value = NA)
> 7: `[<-`(`*tmp*`, x == "", value = NA)
> 8: lsubset_List_by_List(x, i, value)
> 9: .fast_lsubset_List_by_List(x, i, value)
> 10: replaceROWS(unlisted_x, unlisted_i, unlisted_value)
> 11: replaceROWS(unlisted_x, unlisted_i, unlisted_value)
> 12: extractROWS(setNames(seq_along(x), names(x)), i)
> 13: extractROWS(setNames(seq_along(x), names(x)), i)
> 14: normalizeSingleBracketSubscript(i, x)
>
> > sessionInfo()
> R version 3.0.2 (2013-09-25)
> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] parallel stats graphics grDevices utils datasets methods
> [8] base
>
> other attached packages:
> [1] VariantAnnotation_1.8.6 Rsamtools_1.14.1 Biostrings_2.30.1
> [4] GenomicRanges_1.14.3 XVector_0.2.0 IRanges_1.20.6
> [7] BiocGenerics_0.8.0 SeqArray_1.2.0 gdsfmt_1.0.0
>
> loaded via a namespace (and not attached):
> [1] AnnotationDbi_1.24.0 Biobase_2.22.0 biomaRt_2.18.0
> [4] bitops_1.0-6 BSgenome_1.30.0 DBI_0.2-7
> [7] GenomicFeatures_1.14.2 RCurl_1.95-4.1 RSQLite_0.11.4
> [10] rtracklayer_1.22.0 stats4_3.0.2 tools_3.0.2
> [13] XML_3.95-0.2 zlibbioc_1.8.0
More information about the Bioc-devel
mailing list