[R] Why is DocumentTermMatrix showing 0 term?
Patrick Casimir
patrcasi at nova.edu
Tue Dec 6 15:29:23 CET 2016
Thanks Ista. See the code below. I am not sure why the DTM is showing 0 terms. I have 4 documents in the corpus, and I was able to apply transformations
to the documents inside the corpus.
> cname <- file.path("C:\\Users\\Desktop\\Text Mining\\Cases\\MyCorpus")
> dir(cname)
[1] "case1.txt" "case2.txt" "case3.txt" "case4.txt"
> library(tm)
> docs <- Corpus(DirSource(cname))
> install.packages("magrittr" ,dependencies=TRUE)
> viewDocs <- function(d, n) {d %>% extract2(n) %>% as.character() %>% writeLines()}
> viewDocs(docs, 1)
> toSpace <- content_transformer(function(x, pattern) gsub(pattern, " ", x))
> docs <- tm_map(docs, toSpace, "/|@|nn|")
> inspect(docs[1])
> docs <- tm_map(docs, removePunctuation)
> docs <- tm_map(docs, removeWords, stopwords("english"))
> inspect(docs[1])
> docs <- tm_map(docs, stripWhitespace)
> docs <- tm_map(docs, stemDocument)
> dtm <- DocumentTermMatrix(docs)
> dtm
<<DocumentTermMatrix (documents: 4, terms: 0)>>
Non-/sparse entries: 0/0
Sparsity : 100%
Maximal term length: 0
Weighting : term frequency (tf)
>
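One way to rule out the simplest explanation (input files that are empty or unreadable) is to summarise the files on disk before building the corpus. The following is a minimal base-R sketch and was not part of the session above; `check_files` is a hypothetical helper name, and the whitespace-based word count is only a rough approximation of what tm would tokenize.

```r
# Hypothetical helper (not from the original session): summarise the text
# files in a directory so that empty or unreadable input is visible before
# any tm transformation is applied.
check_files <- function(dir_path) {
  files <- list.files(dir_path, full.names = TRUE)
  data.frame(
    file  = basename(files),
    bytes = file.size(files),
    # Number of lines readLines() can actually read from each file
    lines = vapply(files, function(f) {
      length(readLines(f, warn = FALSE))
    }, integer(1)),
    # Rough word count: split on whitespace and drop empty tokens
    words = vapply(files, function(f) {
      txt <- paste(readLines(f, warn = FALSE), collapse = " ")
      tokens <- strsplit(txt, "[[:space:]]+")[[1]]
      sum(nzchar(tokens))
    }, integer(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}
```

Calling `check_files(cname)` with the directory defined above would return one row per file. If every file reports a non-zero word count, the input is probably fine and the terms are more likely being lost in one of the `tm_map` steps, which can be checked by running `inspect(docs[1])` after each step.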
________________________________
From: Ista Zahn <istazahn at gmail.com>
Sent: Tuesday, December 6, 2016 9:09:37 AM
To: Patrick Casimir
Cc: r-help at r-project.org
Subject: Re: [R] Why is DocumentTermMatrix showing 0 term?
Hi Patrick,
How could anyone possibly answer this question with only the information you've provided? It's like showing me an empty cup and asking why it's empty. Maybe you didn't put anything in it. Maybe you did and then your dog drank it or your cat knocked it over or your girlfriend drank it. How would I possibly know?
Bottom line, you need to show exactly what you did to produce that result, preferably in the form of a few lines of code that we can run to reproduce your problem.
Finally, you may find it helpful to take some time to learn how to ask questions the smart way. http://catb.org/~esr/faqs/smart-questions.html is a good place to learn this important skill.
Best,
Ista
On Dec 6, 2016 7:58 AM, "Patrick Casimir" <patrcasi at nova.edu> wrote:
<<DocumentTermMatrix (documents: 4, terms: 0)>>
Non-/sparse entries: 0/0
Sparsity : 100%
Maximal term length: 0
Weighting : term frequency (tf)
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.