[Bioc-devel] concat problem with CharacterList in mcols of GRanges
Valerie Obenchain
vobencha at fhcrc.org
Mon Mar 17 17:38:11 CET 2014
Hi Vince,
ALT is a CharacterList when the variants are 'structural', i.e., when
the ALT field has non-nucleotide characters. Otherwise, ALT is a
DNAStringSetList.
It looks like ALT is a CharacterList in x[[1]] and is probably a
DNAStringSetList in x[[3]]. So you are trying to combine GRanges with
these two different types in ALT:
gr1 <- GRanges("1", IRanges(1:3, width=1),
ALT = CharacterList(list("A", "G", "A")))
gr2 <- GRanges("1", IRanges(5:7, width=1),
ALT = DNAStringSetList(list("T", "A", "C")))
>> c(gr1, gr2)
> Error in .Primitive("c")(<S4 object of class "CompressedCharacterList">, :
> all arguments in '...' must have an element class that extends that of the first argument
The ALT values in the subset of x[[1]] are all nucleotides but not all
ALT values in x[[1]] are which is why readVcf() made it a CharacterList.
To combine these GRanges you need to either coerce all of x[[3]] ALT to
CharacterList or coerce the subset of x[[1]] to DNAStringSetList.
Coerce the subset of x[[1]] to DNAStringSetList:
>> gr1$ALT <- DNAStringSetList(gr1$ALT)
>> gr1
> GRanges with 3 ranges and 1 metadata column:
> seqnames ranges strand | ALT
> <Rle> <IRanges> <Rle> | <DNAStringSetList>
> [1] 1 [1, 1] * | A
> [2] 1 [2, 2] * | G
> [3] 1 [3, 3] * | A
> ---
Combine:
>> c(gr1, gr2)
> GRanges with 6 ranges and 1 metadata column:
> seqnames ranges strand | ALT
> <Rle> <IRanges> <Rle> | <DNAStringSetList>
> [1] 1 [1, 1] * | A
> [2] 1 [2, 2] * | G
> [3] 1 [3, 3] * | A
> [4] 1 [5, 5] * | T
> [5] 1 [6, 6] * | A
> [6] 1 [7, 7] * | C
Val
On 03/16/2014 06:11 PM, Vincent Carey wrote:
> It seems that there is diversity in the classes assigned for ALT in results
> of readVcf, and there was some discussion of this in 1/2013. So it looks
> like this is predictable and solvable with some upstream work after the
> read.
>
>
> On Sun, Mar 16, 2014 at 7:43 PM, Vincent Carey
> <stvjc at channing.harvard.edu>wrote:
>
>>> c(x[[1]][1:3,1:2], x[[3]][1:3,1:2]) # this works
>> GRanges with 6 ranges and 2 metadata columns:
>> seqnames ranges strand | paramRangeID REF
>> <Rle> <IRanges> <Rle> | <factor> <DNAStringSet>
>> [1] 1 [ 10583, 10583] * | dhs_chr1_10402 G
>> [2] 1 [ 10611, 10611] * | dhs_chr1_10402 C
>> [3] 1 [ 10583, 10583] * | dhs_chr1_10502 G
>> [4] 1 [832178, 832178] * | dhs_chr1_833139 A
>> [5] 1 [832266, 832266] * | dhs_chr1_833139 G
>> [6] 1 [832297, 832299] * | dhs_chr1_833139 CTG
>> ---
>> seqlengths:
>> 1
>> NA
>>> x[[1]][1:3,1:3]
>> GRanges with 3 ranges and 3 metadata columns:
>> seqnames ranges strand | paramRangeID REF
>> <Rle> <IRanges> <Rle> | <factor> <DNAStringSet>
>> [1] 1 [10583, 10583] * | dhs_chr1_10402 G
>> [2] 1 [10611, 10611] * | dhs_chr1_10402 C
>> [3] 1 [10583, 10583] * | dhs_chr1_10502 G
>> ALT
>> <CharacterList>
>> [1] A
>> [2] G
>> [3] A
>> ---
>> seqlengths:
>> 1
>> NA
>>> c(x[[1]][1:3,1:3], x[[3]][1:3,1:3]) # if i try to concatenate while ALT
>> is included
>> Error in .Primitive("c")(<S4 object of class "CompressedCharacterList">,
>> :
>> all arguments in '...' must have an element class that extends that of
>> the first argument
>>
>> Enter a frame number, or 0 to exit
>>
>> 1: c(x[[1]][1:3, 1:3], x[[3]][1:3, 1:3])
>> 2: c(x[[1]][1:3, 1:3], x[[3]][1:3, 1:3])
>> 3: .local(x, ..., recursive = recursive)
>> 4: .unlist_list_of_GenomicRanges(args, ignore.mcols = ignore.mcols)
>> 5: do.call(rbind, lapply(x, mcols, FALSE))
>> 6: do.call(rbind, lapply(x, mcols, FALSE))
>> 7: (function (..., deparse.level = 1)
>> standardGeneric("rbind"))(<S4 object of
>> 8: standardGeneric("rbind")
>> 9: eval(.dotsCall, env)
>> 10: eval(.dotsCall, env)
>> 11: eval(expr, envir, enclos)
>> 12: .Method(..., deparse.level = deparse.level)
>> 13: lapply(seq_len(length(df)), function(i) {
>> cols <- lapply(args, `[[`, cn[
>> 14: lapply(seq_len(length(df)), function(i) {
>> cols <- lapply(args, `[[`, cn[
>> 15: FUN(1:3[[3]], ...)
>> 16: do.call(c, unname(cols))
>> 17: do.call(c, unname(cols))
>> 18: .Primitive("c")(<S4 object of class "CompressedCharacterList">, <S4
>> object
>> 19: .Primitive("c")(<S4 object of class "CompressedCharacterList">, <S4
>> object
>>
>>> sessionInfo()
>> R Under development (unstable) (2014-03-15 r65199)
>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>
>> locale:
>> [1] C
>>
>> attached base packages:
>> [1] parallel stats graphics grDevices datasets utils tools
>> [8] methods base
>>
>> other attached packages:
>> [1] Biostrings_2.31.14 XVector_0.3.7 GenomicRanges_1.15.39
>> [4] GenomeInfoDb_0.99.19 IRanges_1.21.34 BiocGenerics_0.9.3
>> [7] BatchJobs_1.2 BBmisc_1.5 weaver_1.29.1
>> [10] codetools_0.2-8 digest_0.6.4 BiocInstaller_1.13.3
>>
>> loaded via a namespace (and not attached):
>> [1] DBI_0.2-7 RSQLite_0.11.4 Rcpp_0.11.1 brew_1.0-6
>> [5] fail_1.2 plyr_1.8.1 sendmailR_1.1-2 stats4_3.2.0
>> [9] stringr_0.6.2
>>
>>
>>
>>
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
--
Valerie Obenchain
Program in Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B155
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: vobencha at fhcrc.org
Phone: (206) 667-3158
Fax: (206) 667-1319
More information about the Bioc-devel
mailing list