[Bioc-devel] From Biostring matching to short read mapping
Bhagwat, Aditya
Ad|ty@@Bh@gw@t @end|ng |rom mp|-bn@mpg@de
Thu Nov 7 11:11:10 CET 2019
Dear bioc-devel,
multicrispr <https://gitlab.gwdg.de/loosolab/software/multicrispr> provides functions for CRISPR/Cas9 gRNA design (and is being prepared for Bioconductor). One task involves finding all genomic matches, exact or with mismatches, of a 23-bp candidate Cas9 sequence. Currently this is done with `Biostrings::vcountPDict`, an approach that works but is slow. An alternative would be to switch from (Bio)string matching to short read mapping, which requires a one-time genome indexing step but makes subsequent alignment fast.
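To make the current approach concrete, here is a minimal sketch of mismatch-tolerant counting with `vcountPDict`. The toy "genome" and the two 23-bp candidates are made up for illustration, and `max.mismatch = 1` is an arbitrary choice; this is not the actual multicrispr code. The dictionary is passed as a plain `DNAStringSet` (not a preprocessed `PDict`), and only the given strand is searched.

```r
library(Biostrings)

# Two made-up 23-bp candidates (20-nt spacer followed by a TGG PAM).
candidates <- DNAStringSet(c(
    cand1 = "GATTACAGATTACCAGGTCATGG",
    cand2 = "CCATGCAAGTCTTGACCGTATGG"
))

# Toy "genome": chrA holds an exact copy of cand1 plus a copy with one
# substitution; chrB holds an exact copy of cand2.
genome <- DNAStringSet(c(
    chrA = paste0("AAAA", "GATTACAGATTACCAGGTCATGG",
                  "TTTT", "GATTACAGAGTACCAGGTCATGG", "CC"),
    chrB = paste0("GG", "CCATGCAAGTCTTGACCGTATGG", "AAAA")
))

# Rows are candidates, columns are chromosomes; each cell is the number
# of hits with at most one mismatch on the forward strand.
counts <- vcountPDict(candidates, genome, max.mismatch = 1)

# Total number of hits per candidate across the whole toy genome.
total_hits <- rowSums(counts)
print(counts)
print(total_hits)
```

On a real genome the same call has to scan every chromosome for every candidate, which is where the time goes.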
`Rsubread::align` seems to be limited to a maximum of 16 for `nBestLocations`, whereas I know from `vcountPDict` that some Cas9 candidates have hundreds of genomic matches.
`QuasR::qAlign` (connecting to Bowtie) does not mention an upper limit on `maxHits`.
Feedback request...
Michael, would QuasR/(R)bowtie be a good approach to do this?
Wei, did I overlook a way to do this with Rsubread?
Herve, is there an elegant way to speed up `vcountPDict` (e.g. by parallelizing)?
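As a point of reference for the last question, a naive way to parallelize would be to split the work by chromosome. The sketch below is an assumption, not a tested or benchmarked solution: `count_hits_by_chr` is a hypothetical helper, the sequences are made up, and `parallel::mclapply` only runs on more than one core on non-Windows systems (hence the default `cores = 1L`).

```r
library(Biostrings)
library(parallel)

# Hypothetical helper: count hits on each chromosome in a separate
# task, then bind the per-chromosome columns back into one matrix.
count_hits_by_chr <- function(candidates, genome, max.mismatch = 1,
                              cores = 1L) {
    per_chr <- mclapply(seq_along(genome), function(i) {
        # One-column matrix: hits of every candidate on chromosome i.
        vcountPDict(candidates, genome[i], max.mismatch = max.mismatch)
    }, mc.cores = cores)
    do.call(cbind, per_chr)
}

# Made-up 23-bp candidates and toy chromosomes for illustration.
candidates <- DNAStringSet(c(
    cand1 = "GATTACAGATTACCAGGTCATGG",
    cand2 = "CCATGCAAGTCTTGACCGTATGG"
))
genome <- DNAStringSet(c(
    chrA = paste0("AAAA", "GATTACAGATTACCAGGTCATGG", "TTTT"),
    chrB = paste0("GG", "CCATGCAAGTCTTGACCGTATGG", "AAAA")
))

hits <- count_hits_by_chr(candidates, genome, max.mismatch = 1)
print(hits)
```

Splitting by chromosome keeps each task independent, but the chromosomes differ in length, so the load would be uneven; splitting the candidates into chunks instead would be the other obvious option.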
Thank you :)
Aditya