[R] Essay identification
Greg Snow
greg.snow at ihc.com
Mon Jun 13 18:02:25 CEST 2005
This topic is sometimes called wordprinting or stylometry. The spring
2003 issue of Chance magazine had several articles on the topic.
A colleague and I, along with various graduate students, have been
working on a Perl program that extracts many of the common statistics
used in wordprinting (counts and percentages of non-contextual words,
word pattern ratios, vocabulary richness). The data can then be loaded
into R (or any other stats package) for analysis.
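To give a feel for the kind of statistics meant here, the following is a minimal Python sketch. It is not the Perl program mentioned above. The function-word list, the feature names, and the helpers `tokenize`, `stylometric_features` and `write_feature_table` are all made up for illustration, and word pattern ratios are left out. It computes relative frequencies of a few non-contextual words plus two simple vocabulary richness measures (type-token ratio and the share of words used exactly once), then writes one row per essay as CSV.

```python
import csv
import re
from collections import Counter

# Hypothetical, deliberately short list of non-contextual (function) words.
# A real study would use a much longer, carefully chosen list.
FUNCTION_WORDS = ("the", "of", "and", "to", "a", "in", "that", "it", "but", "with")

# A word is a run of letters, optionally followed by an apostrophe part ("don't").
WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")


def tokenize(text):
    """Lowercase the text and return its word tokens."""
    return WORD_RE.findall(text.lower())


def stylometric_features(text):
    """Return a dict of simple wordprint statistics for one essay."""
    tokens = tokenize(text)
    n = len(tokens)
    if n == 0:
        raise ValueError("text contains no word tokens")
    counts = Counter(tokens)
    # Relative frequency of each function word.
    features = {"fw_" + w: counts[w] / n for w in FUNCTION_WORDS}
    # Vocabulary richness: distinct words per token, and the proportion
    # of tokens that belong to words occurring exactly once.
    features["type_token_ratio"] = len(counts) / n
    features["hapax_ratio"] = sum(1 for c in counts.values() if c == 1) / n
    return features


def write_feature_table(essays, handle):
    """Write one CSV row per (author, text) pair to an open text handle."""
    fieldnames = (["author"]
                  + ["fw_" + w for w in FUNCTION_WORDS]
                  + ["type_token_ratio", "hapax_ratio"])
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    for author, text in essays:
        row = {"author": author}
        row.update(stylometric_features(text))
        writer.writerow(row)
```

A table written this way could be read into R with `read.csv()` and passed to whatever classification method seems appropriate. The essay of unknown authorship would be one more row, with the author column left blank or marked as unknown.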
The program is currently in beta (usable, though we may still add
features and documentation), but I can send a copy to anyone who is
interested. Please specify whether you have Perl or need a standalone
copy (Windows only).
Hope this helps,
Greg Snow, Ph.D.
Statistical Data Center, LDS Hospital
Intermountain Health Care
greg.snow at ihc.com
(801) 408-8111
>>> Werner Bier <aliscla at yahoo.com> 06/12/05 01:29PM >>>
Hi R-help,
I have a database of 10 students who have written a total of 78
essays.
The challenge? I would like to identify who wrote the 79th essay.
Has anybody used R in this context?
Even if not, could you suggest which pattern recognition technique I
might apply?
Thanks a lot and regards,
Tom