[R] help: about lemmatization with treetagger tool
Ranjana Girish
ranjanagirish30 at gmail.com
Tue Aug 16 09:16:33 CEST 2016
To do lemmatization in R, I executed the code below:
library("koRpus")
tagged.results <- treetag(c("run", "ran", "running"), treetagger="manual",
                          format="obj",
                          TT.tknz=FALSE, lang="en",
                          TT.options=list(path="C:/Program Files/TreeTagger", preset="en"))
tagged.results@TT.res
and got the following error:
>source('D:/Rprograms/lemanew.R')
Assuming 'UTF-8' as encoding for the input file. If the results turn out to
be erroneous, check the file for invalid characters, e.g. em.dashes or
fancy quotes, and/or consider setting 'encoding' manually.
Error in matrix(unlist(strsplit(tagged.text, "\t")), ncol = 3, byrow =
TRUE, :
'data' must be of a vector type, was 'NULL'
In addition: Warning message:
running command 'C:\WINDOWS\system32\cmd.exe /c type
C:\Users\SULOCH~1\AppData\Local\Temp\Rtmp2De3bl\tokenize5044ef851b9.txt |
grep -v '^$' | C:\Program Files\TreeTagger\bin\tree-tagger.exe C:\Program
Files\TreeTagger\lib\english-utf8.par -token -lemma -sgml -pt-with-lemma
-quiet | perl -pe 's\\tV[BDHV]\\tVB\;s\IN\\that\\tIN\;'' had status 255
I do not understand what this error means. Could someone explain it?
Also, please suggest any other code for lemmatization that gives correct
output.
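
A minimal diagnostic sketch, not a confirmed fix: the command in the warning
passes the TreeTagger path unquoted and pipes through grep and perl, so it may
be worth checking whether the path contains a space and whether those two
tools can be found on the PATH. The helper name check_treetagger_setup is
hypothetical (it is not part of koRpus), and the idea that either condition
causes the status 255 is an assumption.

```r
# Hypothetical helper (not part of koRpus): gathers facts about the
# environment that the failing shell command appears to depend on.
check_treetagger_setup <- function(tt_path) {
  list(
    # Does the TreeTagger directory exist at all?
    path_exists    = dir.exists(tt_path),
    # A space in an unquoted path can split it into several shell arguments.
    path_has_space = grepl(" ", tt_path, fixed = TRUE),
    # The command pipes through grep and perl, so both must be on the PATH.
    grep_found     = unname(nzchar(Sys.which("grep"))),
    perl_found     = unname(nzchar(Sys.which("perl")))
  )
}

# Inspect the setup used in the call above.
str(check_treetagger_setup("C:/Program Files/TreeTagger"))
```

If path_has_space is TRUE or one of the tools is not found, that would be
consistent with the warning, but it does not prove the cause.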