[BioC] Group millions of the same DNA sequences?

Tue Nov 16 11:46:13 CET 2010

Hi all,

I have about 100 million (100M) DNA reads, each ~150 nt long, and some of them are duplicates. Is there a way to collapse identical sequences into one and count how many times each occurs? I am looking for something like the unique() function in R, but one that also reports the number of occurrences of each read and is more efficient.
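
To make the first question concrete, here is a toy sketch of the result I am after. It is written in Python rather than R only to show the intended output; count_reads is a made-up name, and I assume an in-memory hash table like this would not be practical as-is for 100M reads.

```python
from collections import Counter

def count_reads(reads):
    """Collapse identical reads and return (sequence, count) pairs,
    most frequent first (hypothetical helper, for illustration only)."""
    counts = Counter(reads)
    return counts.most_common()

# Toy input standing in for the real reads.
reads = ["ACGT", "ACGT", "TTGA", "ACGT", "TTGA", "GGCC"]
for seq, n in count_reads(reads):
    print(seq, n)
```

So each distinct sequence should appear once, together with the number of reads that had exactly that sequence.
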
Also, if I want to cluster these 100M reads by similarity, for example edit distance (or some other distance) <= 2, is there a function or package that can be used?
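
By "edit distance <= 2" I mean something like the sketch below (again Python, only as an illustration; edit_distance and similar are made-up names). It uses the standard Levenshtein dynamic programme on a single pair of reads. Comparing all pairs of 100M reads this way is clearly not feasible, which is why I am asking for a proper tool.

```python
def edit_distance(a, b):
    """Levenshtein distance via the standard two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def similar(a, b, max_dist=2):
    """True if the two reads are within max_dist edits of each other."""
    # A length difference larger than max_dist already rules the pair out.
    if abs(len(a) - len(b)) > max_dist:
        return False
    return edit_distance(a, b) <= max_dist

# Two substitutions apart, so this pair would fall in the same cluster.
print(similar("ACGTACGT", "ACGAACGA"))
```
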
Thank you!

Xiaohui