[Bioc-devel] Regarding the function: oligonucleotideFrequency for k-mers > 11 bps
Rodrigo Bertollo de Alexandre
rodrigodealexandre at hotmail.com
Fri Nov 28 00:38:39 CET 2014
I've found that it is almost impossible to work with k-mers as long as 13 with this function. This is mainly because the function doesn't build its list of k-mers from the sequence itself, but from all possible combinations of bases.
This is basically a bug: in a sequence of 1000 bp, the maximum number of 13-mers is L-k+1 = 988, while the number of possible 13-mers is 4^k = 4^13 = 67108864. This means that the code is tracking at least 67107876 k-mers that do not occur in the sequence.
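To make the arithmetic concrete, here is a minimal sketch of counting only the k-mers that actually occur. It is written in Python rather than R purely to keep it self-contained, and it is neither the code from the Stack Overflow answer nor how oligonucleotideFrequency is implemented. The helper name `count_observed_kmers` and the toy sequence are made up for illustration.

```python
from collections import Counter


def count_observed_kmers(seq, k):
    """Count only the k-mers that occur in seq (hypothetical helper)."""
    # Slide a window of width k along the sequence; there are
    # len(seq) - k + 1 windows, so the table can never hold more keys than that.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# Toy 1000 bp sequence, built by repetition purely for illustration.
seq = "ACGTTGCAAC" * 100
k = 13

counts = count_observed_kmers(seq, k)

n_windows = len(seq) - k + 1   # L - k + 1 windows in the sequence
n_possible = 4 ** k            # size of a table indexed by every possible k-mer

print("windows in sequence:", n_windows)
print("distinct k-mers seen:", len(counts))
print("possible k-mers:", n_possible)
```

The table here grows with the sequence length rather than with 4^k, which is the behaviour I am asking about. The trade-off is that k-mers with zero counts are simply absent instead of being reported as 0.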
I'm wondering whether the package could be modified to address this issue.
I wrote my own function for this, which runs fine. However, having everything you need in a single package would be even better. (I posted my code on Stack Overflow: http://stackoverflow.com/a/27178731/4004499)
Sincerely,
Rodrigo