[BioC] Comparing DNAStringSetLists

Steve Lianoglou lianoglou.steve at gene.com
Wed Oct 16 07:12:07 CEST 2013


Hi Vince,

On Tue, Oct 15, 2013 at 4:16 PM, Vince S. Buffalo <vsbuffalo at gmail.com> wrote:
> Hi All,
>
> I have two vectors of alleles stored as DNAStringSetLists. For each element
> in both lists, I need to find the length of the intersecting set. Using
> mapply() and intersect() takes too long, as does sapply(dna.set.list,
> as.character) (and then using mclapply or lapply to find intersect on
> characters). Is there a fast way to do this? I have vectors ~12 million
> rows long.

Perhaps data.table can help here, but I'm having a hard time
understanding the data you have and the output you want.

Could you provide a small test dataset that we can poke at?
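
For instance, a toy input along the following lines would be enough to
pin down the expected output. This is only a sketch under assumptions:
plain lists of character vectors stand in for the two DNAStringSetLists,
the names x, y and flat_keys are made up for illustration, and the
flattened comparison is one possible base R way to avoid the per-element
loop, not a tested solution for 12 million rows.

```r
# Toy stand-ins for the two allele lists (assumed shape: one character
# vector of alleles per row, same number of rows in both lists).
x <- list(c("A", "AT"), c("G"), c("C", "T"))
y <- list(c("A", "G"), c("G", "GA"), c("A"))

# Baseline described in the question: one intersect() call per row.
slow <- mapply(function(a, b) length(intersect(a, b)), x, y)

# Flattened alternative: tag every allele with its row index, so a single
# vectorized %in% compares all rows at once.
flat_keys <- function(l) {
  unique(paste(rep(seq_along(l), lengths(l)), unlist(l), sep = ":"))
}
kx <- flat_keys(x)
ky <- flat_keys(y)
shared <- kx[kx %in% ky]

# Recover the row index from each shared key and count shared alleles per row.
row_idx <- as.integer(sub(":.*", "", shared))
fast <- tabulate(row_idx, nbins = length(x))
```

Both slow and fast hold the per-row size of the intersection, so
comparing them on a small input is a quick sanity check before trying
anything on the full data.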

Thanks,
-steve

-- 
Steve Lianoglou
Computational Biologist
Bioinformatics and Computational Biology
Genentech
