[Bioc-devel] gds nodes dimensions are inconsistent

Xiuwen Zheng zhengx @end|ng |rom u@w@@h|ngton@edu
Tue Feb 11 21:58:10 CET 2020


Hi Qian,

I have modified the example GDS file in the SeqArray package.
"annotation/info/AA" is expected to have fewer values than "annotation/info/AC",
since it is stored as a variable-length vector in the new GDS file.
The VCF format allows variable-length data, so SeqArray allows it as well.

seqGetData() has been updated in SeqArray_1.27.8 with a new option,
'.padNA', for padding the array with NA where possible.
Please revise your package to use the updated function in SeqArray.
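
As a rough illustration of the padding idea (not SeqArray's actual
implementation), here is a base-R sketch. The helper name
pad_variable_length() and the toy data are made up for this example; it
assumes each variant carries either zero or one value, which is presumably
why AA has 1328 values for 1348 variants.

```r
# Hypothetical helper: expand a variable-length node to one value per variant.
# 'len'  = number of values stored for each variant (assumed to be 0 or 1)
# 'data' = the stored values, concatenated in variant order
pad_variable_length <- function(len, data) {
  stopifnot(all(len %in% c(0L, 1L)), sum(len) == length(data))
  # Start from an all-NA vector of the same type as 'data'
  out <- rep(data[NA_integer_], length(len))
  # Fill in only the variants that actually have a value
  out[len == 1L] <- data
  out
}

# Toy data: four variants, the second one has no ancestral allele
len  <- c(1L, 0L, 1L, 1L)
data <- c("T", "C", "G")

padded <- pad_variable_length(len, data)
print(padded)
# [1] "T" NA  "C" "G"
```

In a package, the corresponding call would be along the lines of
seqGetData(f, "annotation/info/AA", .padNA=TRUE), which should return one
value per variant whenever padding is possible.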

Best wishes,

Xiuwen



On Tue, Feb 11, 2020 at 12:45 PM Liu, Qian <Qian.Liu using roswellpark.org> wrote:

> Dear Dr. Zheng & SeqArray maintainer,
>
> I have a Bioconductor package called "GDSArray" that represents GDS file
> nodes as DelayedArray instances. In the new Bioc devel version (3.11),
> this package fails on all platforms. Debugging shows inconsistent
> dimensions reported by different SeqArray / gdsfmt functions. Below
> is some reproducible code showing that the "annotation/info/AA" node has
> a different dimension from "AC" and from the overall "num.variant" reported
> by seqSummary(). It works fine in Bioc 3.10 (the dimension of AA is 1348).
> Thanks!
>
> Best,
> Qian
>
>
> ```{r}
>
>
> library(SeqArray)
> file <- seqExampleFileName("gds")
> f <- seqOpen(file)
> objdesp.gdsn(index.gdsn(f, "annotation/info/AA"))$dim
> ## [1] 1328
>
>
> objdesp.gdsn(index.gdsn(f, "annotation/info/AC"))$dim
> ## [1] 1348
>
>
> seqSummary(f, verbose=FALSE)$num.variant
> ## [1] 1348
>
>
> seqClose(f)
>
> sessionInfo()
>  R Under development (unstable) (2020-01-07 r77631)
>  Platform: x86_64-pc-linux-gnu (64-bit)
>  Running under: Ubuntu 18.04.3 LTS
>
>  Matrix products: default
>  BLAS:   /home/qian/miniconda3/envs/r-devel/lib/R/lib/libRblas.so
>  LAPACK: /home/qian/miniconda3/envs/r-devel/lib/R/lib/libRlapack.so
>
>  locale:
>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>   [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
>  [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
>  attached base packages:
>  [1] stats     graphics  grDevices utils     datasets  methods   base
>
>  other attached packages:
>  [1] SeqArray_1.27.8 gdsfmt_1.23.5
>
>  loaded via a namespace (and not attached):
>   [1] IRanges_2.21.2         Biostrings_2.55.4      crayon_1.3.4
>   [4] bitops_1.0-6           GenomeInfoDb_1.23.1    stats4_4.0.0
>   [7] zlibbioc_1.33.1        XVector_0.27.0         S4Vectors_0.25.11
>  [10] tools_4.0.0            RCurl_1.95-4.12        parallel_4.0.0
>  [13] compiler_4.0.0         BiocGenerics_0.33.0    GenomicRanges_1.39.1
>  [16] GenomeInfoDbData_1.2.2
> ```
