[R] help: about lemmatization with treetagger tool

Ranjana Girish ranjanagirish30 at gmail.com
Tue Aug 16 09:16:33 CEST 2016

To do lemmatization in R, I ran the code below:

tagged.results <- treetag(c("run", "ran", "running"), treetagger="manual",
                          TT.tknz=FALSE , lang="en",
Files/TreeTagger", preset="en"))
tagged.results@TT.res

and got the following error:

Assuming 'UTF-8' as encoding for the input file. If the results turn out to
be erroneous, check the file for invalid characters, e.g. em.dashes or
fancy quotes, and/or consider setting 'encoding' manually.
Error in matrix(unlist(strsplit(tagged.text, "\t")), ncol = 3, byrow =
TRUE,  :
'data' must be of a vector type, was 'NULL'
In addition: Warning message:
running command 'C:\WINDOWS\system32\cmd.exe /c type
 C:\Users\SULOCH~1\AppData\Local\Temp\Rtmp2De3bl\tokenize5044ef851b9.txt |
 grep -v '^$' | C:\Program Files\TreeTagger\bin\tree-tagger.exe C:\Program
Files\TreeTagger\lib\english-utf8.par -token -lemma -sgml -pt-with-lemma
-quiet | perl -pe 's\\tV[BDHV]\\tVB\;s\IN\\that\\tIN\;'' had status 255
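The warning above shows that the tagging command is run through cmd.exe and pipes its data through grep and perl, and that the TreeTagger path contains a space. As a minimal sketch (not a confirmed diagnosis), the base R checks below test whether those external programs are on the PATH and whether the two TreeTagger files named in the command exist. The variable names are hypothetical, and the install path is an assumption copied from the warning message.

```r
# Hypothetical diagnostic sketch; tt.path is assumed from the warning message.
tt.path <- "C:/Program Files/TreeTagger"

# Sys.which() returns "" for any program it cannot find on the PATH.
tools <- Sys.which(c("grep", "perl"))
missing.tools <- names(tools)[tools == ""]

# Check the tagger binary and the parameter file named in the failing command.
tt.files <- file.path(tt.path, c("bin/tree-tagger.exe", "lib/english-utf8.par"))
missing.files <- tt.files[!file.exists(tt.files)]

# The command shown in the warning passes this path unquoted, so note any space.
path.has.space <- grepl(" ", tt.path, fixed = TRUE)

print(list(missing.tools  = missing.tools,
           missing.files  = missing.files,
           path.has.space = path.has.space))
```

If anything is listed as missing, or the path contains a space, that is a plausible place to start looking. The output shown here does not confirm which of these, if any, causes status 255.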

I do not understand what this error means. Could someone explain it?

Also, please send any other code for lemmatization that will give correct
