[R] Lemmatization with WORDNET package
BHANUMATHI H M
bhanumathihm6 at gmail.com
Mon Aug 22 12:28:42 CEST 2016
Sir, I am working on a classification project, and before that I have to
do feature selection. I am very interested in applying lemmatization rather
than stemming, so I executed the code below. According to the definition of
lemmatization, it should give root words such as "run" for "running" and
"ran", "think" for "thought", etc., but my code is not giving the correct
output. Could you please help me find out where I went wrong?
CODE:
library("tm")
library("NLP")
library("wordnet")
setDict("C:/Program Files/WordNet/2.1/dict")
vector.documents <- c("The children something to the playground The cars %s
down the avenue")
corpus.documents <- Corpus(VectorSource(vector.documents))
initDict("C:/Program Files/WordNet/2.1/dict")
lapply(corpus.documents, function(x) {
  sapply(unlist(strsplit(as.character(x), "[[:space:]]+")), function(word) {
    x.filter <- getTermFilter("StartsWithFilter", word, TRUE)
    x.filter
    x
    terms <- getIndexTerms("NOUN", 1, x.filter)
    terms
    if (!is.null(terms)) sapply(terms, getLemma)
  })
})
OUTPUT:
$`1`
$`1`$The
[1] "the absurd"
$`1`$children
NULL
$`1`$playing
[1] "playing"
$`1`$playground
[1] "playground"
$`1`$The
[1] "the absurd"
$`1`$cars
[1] "carson"
$`1`$landing
[1] "landing"
$`1`$avenue
[1] "avenue"
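
One possible reading of this output, sketched in base R only: a
"StartsWithFilter" appears to select index terms that begin with the given
string, which is a prefix search rather than a reduction to a base form.
That would explain "carson" for "cars" and NULL for "children". The
vocabulary, the lemma table, and the helper names prefix_match and
lookup_lemma below are all hypothetical stand-ins for illustration; they are
not part of the wordnet package and do not touch the WordNet dictionary.

```r
# Toy vocabulary standing in for a noun index (hypothetical data).
index_terms <- c("car", "carson", "child", "playground", "playing",
                 "the absurd")

# Prefix search: keep every term that begins with the (lowercased) word.
# This is an assumption about what a starts-with filter does.
prefix_match <- function(word, vocab) {
  vocab[startsWith(vocab, tolower(word))]
}

# Toy lemma table (hypothetical data): inflected form -> base form.
lemma_table <- c(cars = "car", children = "child", ran = "run",
                 running = "run", thought = "think")

# Dictionary-style lemmatization: map a known inflected form to its base
# form, and return the lowercased word unchanged when it is not in the table.
lookup_lemma <- function(word, table) {
  key <- tolower(word)
  if (key %in% names(table)) unname(table[key]) else key
}

prefix_match("cars", index_terms)      # "carson": a longer term, not a lemma
prefix_match("children", index_terms)  # character(0): nothing starts with it
lookup_lemma("cars", lemma_table)      # "car"
lookup_lemma("thought", lemma_table)   # "think"
```

If this reading is right, the filter is returning longer dictionary entries
that happen to start with the token, so a lemmatizer would need some mapping
from inflected forms to base forms instead of a prefix search.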
I also tried other parts of speech and other filter types such as
"ContainsFilter", but that did not work either. Please help me.
Thanks in advance.
With regards,
BHANUMATHI H M