[Bioc-devel] Regarding the function: oligonucleotideFrequency for k-mers > 11 bps

Rodrigo Bertollo de Alexandre rodrigodealexandre at hotmail.com
Fri Nov 28 00:38:39 CET 2014


I've found that it is almost impossible to work with k-mers as long as 13 with this function. This is mainly because the function doesn't build its list of k-mers from the sequence itself but from all possible combinations.
This is basically a bug, since in a sequence of 1000 bp the maximum number of distinct 13-mers is L-k+1 = 1000-13+1 = 988, while the number of possible 13-mers is 4^k = 4^13 = 67,108,864. This means that the code is tracking at least 67,107,876 k-mers that never occur in the sequence.
I'm wondering whether the package could be modified to address this issue.
I wrote my own function for this, which runs fine. However, having everything you need in a single package would be even better. (I posted my code on Stack Overflow: http://stackoverflow.com/a/27178731/4004499)
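
The idea of counting only the k-mers that actually occur in the sequence can be sketched roughly as follows. This is a minimal illustration in Python, not the code from the Stack Overflow answer and not a description of how oligonucleotideFrequency works internally; the function name observed_kmer_counts is hypothetical.

```python
from collections import Counter


def observed_kmer_counts(sequence: str, k: int) -> Counter:
    """Count only the k-mers that actually occur in `sequence`.

    Hypothetical helper for illustration; k-mers that never occur
    are simply absent from the result instead of being stored as 0.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # A sequence of length L has L - k + 1 overlapping windows of width k
    # (none at all if the sequence is shorter than k).
    n_windows = max(len(sequence) - k + 1, 0)
    return Counter(sequence[i:i + k] for i in range(n_windows))


if __name__ == "__main__":
    import random

    random.seed(0)
    seq = "".join(random.choice("ACGT") for _ in range(1000))
    k = 13

    counts = observed_kmer_counts(seq, k)
    print("windows scanned:    ", sum(counts.values()))  # L - k + 1 = 988
    print("distinct k-mers:    ", len(counts))           # never more than 988
    print("possible k-mers 4^k:", 4 ** k)                # 67108864
```

With this approach the memory needed grows with the sequence length (at most L-k+1 entries) rather than with 4^k, which is what makes k = 13 and larger manageable for short sequences.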
Sincerely,
Rodrigo


