[Bioc-devel] concat problem with CharacterList in mcols of GRanges
Hervé Pagès
hpages at fhcrc.org
Mon Mar 17 20:47:44 CET 2014
Hi Vince,
On 03/16/2014 06:11 PM, Vincent Carey wrote:
> It seems that there is diversity in the classes assigned for ALT in results
> of readVcf, and there was some discussion of this in 1/2013.
Was this discussion on the mailing list?. Can't find it.
If using diverse/unpredictable classes for ALT cannot be avoided, have
you considered using a BStringSetList instead of a CharacterList when
the variant are "structural"?
There is a big divide between DNAStringSetList and CharacterList in
terms of internal representation. But not so much between
DNAStringSetList and BStringSetList. So using BStringSetList instead
of CharacterList would help smoothing out the kind of issues you're
facing here. In particular, even though combining DNAStringSetList
and BStringSetList objects doesn't work right now, that's something
we should definitely support (it would be easy to add).
Cheers,
H.
> So it looks
> like this is predictable and solvable with some upstream work after the
> read.
>
>
> On Sun, Mar 16, 2014 at 7:43 PM, Vincent Carey
> <stvjc at channing.harvard.edu>wrote:
>
>>> c(x[[1]][1:3,1:2], x[[3]][1:3,1:2]) # this works
>> GRanges with 6 ranges and 2 metadata columns:
>> seqnames ranges strand | paramRangeID REF
>> <Rle> <IRanges> <Rle> | <factor> <DNAStringSet>
>> [1] 1 [ 10583, 10583] * | dhs_chr1_10402 G
>> [2] 1 [ 10611, 10611] * | dhs_chr1_10402 C
>> [3] 1 [ 10583, 10583] * | dhs_chr1_10502 G
>> [4] 1 [832178, 832178] * | dhs_chr1_833139 A
>> [5] 1 [832266, 832266] * | dhs_chr1_833139 G
>> [6] 1 [832297, 832299] * | dhs_chr1_833139 CTG
>> ---
>> seqlengths:
>> 1
>> NA
>>> x[[1]][1:3,1:3]
>> GRanges with 3 ranges and 3 metadata columns:
>> seqnames ranges strand | paramRangeID REF
>> <Rle> <IRanges> <Rle> | <factor> <DNAStringSet>
>> [1] 1 [10583, 10583] * | dhs_chr1_10402 G
>> [2] 1 [10611, 10611] * | dhs_chr1_10402 C
>> [3] 1 [10583, 10583] * | dhs_chr1_10502 G
>> ALT
>> <CharacterList>
>> [1] A
>> [2] G
>> [3] A
>> ---
>> seqlengths:
>> 1
>> NA
>>> c(x[[1]][1:3,1:3], x[[3]][1:3,1:3]) # if i try to concatenate while ALT
>> is included
>> Error in .Primitive("c")(<S4 object of class "CompressedCharacterList">,
>> :
>> all arguments in '...' must have an element class that extends that of
>> the first argument
>>
>> Enter a frame number, or 0 to exit
>>
>> 1: c(x[[1]][1:3, 1:3], x[[3]][1:3, 1:3])
>> 2: c(x[[1]][1:3, 1:3], x[[3]][1:3, 1:3])
>> 3: .local(x, ..., recursive = recursive)
>> 4: .unlist_list_of_GenomicRanges(args, ignore.mcols = ignore.mcols)
>> 5: do.call(rbind, lapply(x, mcols, FALSE))
>> 6: do.call(rbind, lapply(x, mcols, FALSE))
>> 7: (function (..., deparse.level = 1)
>> standardGeneric("rbind"))(<S4 object of
>> 8: standardGeneric("rbind")
>> 9: eval(.dotsCall, env)
>> 10: eval(.dotsCall, env)
>> 11: eval(expr, envir, enclos)
>> 12: .Method(..., deparse.level = deparse.level)
>> 13: lapply(seq_len(length(df)), function(i) {
>> cols <- lapply(args, `[[`, cn[
>> 14: lapply(seq_len(length(df)), function(i) {
>> cols <- lapply(args, `[[`, cn[
>> 15: FUN(1:3[[3]], ...)
>> 16: do.call(c, unname(cols))
>> 17: do.call(c, unname(cols))
>> 18: .Primitive("c")(<S4 object of class "CompressedCharacterList">, <S4
>> object
>> 19: .Primitive("c")(<S4 object of class "CompressedCharacterList">, <S4
>> object
>>
>>> sessionInfo()
>> R Under development (unstable) (2014-03-15 r65199)
>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>
>> locale:
>> [1] C
>>
>> attached base packages:
>> [1] parallel stats graphics grDevices datasets utils tools
>> [8] methods base
>>
>> other attached packages:
>> [1] Biostrings_2.31.14 XVector_0.3.7 GenomicRanges_1.15.39
>> [4] GenomeInfoDb_0.99.19 IRanges_1.21.34 BiocGenerics_0.9.3
>> [7] BatchJobs_1.2 BBmisc_1.5 weaver_1.29.1
>> [10] codetools_0.2-8 digest_0.6.4 BiocInstaller_1.13.3
>>
>> loaded via a namespace (and not attached):
>> [1] DBI_0.2-7 RSQLite_0.11.4 Rcpp_0.11.1 brew_1.0-6
>> [5] fail_1.2 plyr_1.8.1 sendmailR_1.1-2 stats4_3.2.0
>> [9] stringr_0.6.2
>>
>>
>>
>>
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
More information about the Bioc-devel
mailing list