[R] how to build TermDocMatrix in tm text mining package of R

Kum-Hoe Hwang phdhwang at gmail.com
Mon Jan 12 07:25:30 CET 2009


Thank you very much for your comments.

Thanks to your help, I now understand the flow of a text analysis.

However, I could not run the R script above because the tm package does
not work on my PC, which is a critical problem for me.
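
For anyone in the same position, here is a minimal base-R sketch of what a
term-document matrix contains. It does not use the tm API and is not a
replacement for it. The sample texts in `docs` and the helper `tokenize`
are made up for illustration, and the tokenizer only handles plain
lowercase ASCII letters. Rows are documents and columns are terms, to
match the `my.tdm[1, ]` indexing in the quoted script.

```r
# Base-R sketch: count how often each term occurs in each document.
# `docs` is a made-up example; in practice it could be filled by calling
# readLines() on the plain text files in a folder.
docs <- c(doc1 = "the cat sat on the mat",
          doc2 = "the dog sat")

# Hypothetical helper: lower-case a text and split it on runs of
# non-letter characters, dropping empty tokens.
tokenize <- function(text) {
  tokens <- strsplit(tolower(text), "[^a-z]+")[[1]]
  tokens[nzchar(tokens)]
}

tokens <- lapply(docs, tokenize)
terms  <- sort(unique(unlist(tokens)))

# One row per document, one column per term.
tdm <- t(vapply(tokens,
                function(tok) tabulate(match(tok, terms),
                                       nbins = length(terms)),
                integer(length(terms))))
colnames(tdm) <- terms

# Term counts for the first document.
tdm[1, ]
```

This only illustrates the data structure. Later releases of tm provide
TermDocumentMatrix() and DocumentTermMatrix() in place of TermDocMatrix(),
so the function name to use depends on the installed version.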

Kum Hwang Ph.D.


On Sat, Jan 10, 2009 at 12:39 AM, Tony Breyal
<tony.breyal at googlemail.com> wrote:
> Hi there, I think something like the following is what you want:
>
> ### R start...
> # if you put your plain text files in a folder like this
> my.path <- 'C:\\Documents and Settings\\tony\\Desktop\\texts\\'
>
> # then you can construct a simple tdm like this
> library(tm)
> my.corpus <- Corpus(DirSource(my.path),
>                     readerControl = list(reader = readPlain))
> my.tdm <- TermDocMatrix(my.corpus)
>
> # this should show how words are distributed in the first text document
> my.tdm[1, ]
> ### R end.
>
> by the way, there are some nice examples of using the tm package in
> the last Rnews letter (Volume 8/2, October 2008), under the section
> 'An Introduction to Text Mining in R':
> http://cran.r-project.org/doc/Rnews/Rnews_2008-2.pdf
>
> Hope that helps a little bit,
> Tony Breyal
>
> On 9 Jan, 14:21, "Kum-Hoe Hwang" <phdhw... at gmail.com> wrote:
>> Howdy Gurus
>>
>> I'd like to ask a question about how to build a TermDocMatrix in the tm text
>> mining package.
>>
>> It is not clear to me how to import a plain text file, then convert that
>> text file into a TermDocMatrix, and so on.
>> How can I build a TermDocMatrix from a plain text document file for text
>> association?
>> Or are there any good manuals?
>>
>> Thank you in advance,
>>
>> --
>> Kum-Hoe Hwang, Ph.D.
>>
>> Phone : 82-31-250-3516
>> Email : phdhw... at gmail.com
>>
>> ______________________________________________
>> R-h... at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Kum-Hoe Hwang, Ph.D.

Phone : 82-31-250-3516
Email : phdhwang at gmail.com



