[R] PROBLEM USING DICTIONARY WITH TM PACKAGE
Patrick Casimir
patrcasi at nova.edu
Fri May 19 16:12:45 CEST 2017
Dear Members & Experts,
Since the Dictionary() function is no longer available in the tm package, how can I use other functions to do the same as below? I want to capture a list of specific terms from a corpus. For example, if my corpus has 102 files, I want to see a list with the occurrences of prostatic, adenocarcinoma, and grade in all 102 files. When I call Dictionary(), I get the error: Error: could not find function "Dictionary"
> d <- Dictionary(c("prostatic", "adenocarcinoma", "grade"))
> inspect(DocumentTermMatrix(docs, list(dictionary = d)))
If I instead pass the terms directly and call inspect(), as in the code below, the output shows the terms for only 10 files instead of 102. I need a way to get the counts of those terms (or whatever other terms I select) for all 102 files. I know I am close, but inspect() does not seem to be the right function.
> myTerms <- c("prostatic", "adenocarcinoma", "grade")
> inspect(DocumentTermMatrix(docs, list(dictionary = myTerms)))
<<DocumentTermMatrix (documents: 102, terms: 3)>>
Non-/sparse entries: 292/14
Sparsity           : 5%
Maximal term length: 14
Weighting          : term frequency (tf)
Sample             :
               Terms
Docs            adenocarcinoma grade prostatic
  Patient14.txt             11     6         3
  Patient15.txt              7    12         2
  Patient16.txt             13    16         4
  Patient19.txt              5    13         2
  Patient24.txt             11    12         4
  Patient25.txt              8     9         4
  Patient41.txt              8    10         4
  Patient46.txt              8    10         3
  Patient8.txt               9    12         2
  Patient9.txt               8    23         2
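The "Sample :" line in the output above suggests that inspect() prints only a subset of the matrix, even though the header reports all 102 documents. One possible way to see every row, offered as a minimal sketch rather than a definitive answer, is to convert the DocumentTermMatrix to an ordinary matrix with as.matrix(). The toy corpus below is hypothetical and only stands in for docs; the object names texts, dtm, and counts are assumptions for illustration.

```r
library(tm)

# Hypothetical toy corpus standing in for `docs` (12 short documents)
texts <- rep(c("prostatic adenocarcinoma grade grade",
               "adenocarcinoma of high grade",
               "no relevant terms here"), times = 4)
docs <- VCorpus(VectorSource(texts))

myTerms <- c("prostatic", "adenocarcinoma", "grade")

# Restrict the matrix to the terms of interest via the control list
dtm <- DocumentTermMatrix(docs, control = list(dictionary = myTerms))

# inspect(dtm) prints only a sample; as.matrix() keeps every document
counts <- as.matrix(dtm)
print(dim(counts))
print(counts)
```

With the real corpus, as.matrix() should return one row per file, and the result can be written out with write.csv(counts, "term_counts.csv") if a file is more convenient than console output (the file name here is only an example).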
Thanks
Patrick Casimir, PhD
Health Analytics, Data Science, Big Data Expert & Independent Consultant
C: 954.614.1178