[R] Removing 99% similar sequence help

Nick Jeffery nick.w.jeffery3 at gmail.com
Wed Feb 4 20:42:50 CET 2015


Dear R users,

I am having trouble finding a package and function that removes DNA sequences
from a FASTA file that are >99% similar to one another and/or writes out the
remaining "unique" sequences. I found the uniquefasta function in phytools,
but R can't find this function, and it also doesn't allow me to set the 99%
threshold. I need this because I'm building a phylogeny with tons of nearly
identical sequences, so I want to reduce the number of individuals.
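
To make the goal concrete, the kind of filtering described above could be
sketched in base R roughly as follows. This is only an illustration under
stated assumptions: the sequences are already aligned to equal length and held
in a named character vector with unique names, and similarity means the
fraction of identical aligned positions. The names pairwise_identity and
drop_near_duplicates are hypothetical helpers, not functions from phytools or
any other package, and the greedy, order-dependent pass is just one possible
way to define the "unique" set.

```r
# Hypothetical helper: fraction of identical positions between two aligned
# sequences of equal length (gaps are compared like any other character).
pairwise_identity <- function(a, b) {
  a_chars <- strsplit(a, "")[[1]]
  b_chars <- strsplit(b, "")[[1]]
  stopifnot(length(a_chars) == length(b_chars))
  mean(a_chars == b_chars)
}

# Hypothetical helper: greedy filter. A sequence is kept only if its identity
# to every sequence kept so far is at or below `threshold`.
# Assumes `seqs` is a named character vector with unique names.
drop_near_duplicates <- function(seqs, threshold = 0.99) {
  kept <- character(0)
  for (nm in names(seqs)) {
    too_similar <- FALSE
    for (k in kept) {
      if (pairwise_identity(seqs[[nm]], seqs[[k]]) > threshold) {
        too_similar <- TRUE
        break
      }
    }
    if (!too_similar) kept <- c(kept, nm)
  }
  seqs[kept]
}

# Toy data (made up for illustration): three aligned 200-base sequences.
base_seq <- paste(rep("ACGT", 50), collapse = "")
seqs <- c(
  seq1 = base_seq,
  seq2 = paste0("T", substr(base_seq, 2, 200)),    # 1 mismatch, 99.5% identical
  seq3 = paste0("CATG", substr(base_seq, 5, 200))  # 4 mismatches, 98% identical
)

reduced <- drop_near_duplicates(seqs, threshold = 0.99)
names(reduced)  # "seq1" "seq3"
```

Because the pass is greedy, which member of a near-identical group survives
depends on the input order. The all-against-all comparison would also be slow
for a very large file.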


Thanks for any help and suggestions,
Nick

-- 
Nick Jeffery, PhD Candidate
Integrative Biology
SCIE 1453
University of Guelph
Guelph, Ontario, Canada
