Christian,

Have you looked at the tm (text mining) package? There is also Quantitative Corpus Linguistics with R by Stefan Gries, which you may find of interest. The tau, ReadMe and rattle packages might also be worth looking into.

See also: http://ses.telecom-paristech.fr/lebart/

Regards,
Bob