[R] How do I use R to build a dictionary of proper nouns?

Fri May 5 10:39:25 CEST 2017

Did you try using the table() function, possibly in combination with sort() or rank()?

Consider:

myNouns <- c("proper", "nouns", "domain", "ontology", "dictionary",
             "dictionary", "corpus", "patent", "files", "proper", "nouns",
             "word", "frequency", "file", "preprocess", "corpus", "proper",
             "nouns", "domain", "ontology", "idea", "nouns", "dictionary",
             "dictionary", "corpus", "attachments", "texts", "corpus",
             "preprocesses", "proper", "nouns")

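# Count how often each noun occurs
myNounFrequencies <- table(myNouns)
myNounFrequencies

# Order the counts from most to least frequent
myNounFrequencies <- sort(myNounFrequencies, decreasing = TRUE)
myNounFrequencies

# Position of "corpus" in the sorted table
which(names(myNounFrequencies) == "corpus")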
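
To get counts per file rather than one overall table, the same table() idea can be applied to each document in turn. The following is only a minimal sketch in base R. It assumes each file has already been split into a character vector of tokens; the names dictionary, docs and countTerms, and the toy data, are made up for illustration and are not taken from your attachments.

```r
# Hypothetical dictionary of proper nouns (assumed to come from the ontology)
dictionary <- c("corpus", "ontology", "patent")

# Hypothetical tokenised documents, one character vector per file
docs <- list(
  file1 = c("patent", "corpus", "patent", "idea"),
  file2 = c("ontology", "word", "corpus")
)

# Count only the dictionary terms in one document.
# factor() with fixed levels turns tokens outside the dictionary into NA,
# which table() drops by default; terms that never occur get a count of 0.
countTerms <- function(tokens, dictionary) {
  counts <- table(factor(tokens, levels = dictionary))
  setNames(as.integer(counts), dictionary)
}

# One row per file, one column per dictionary term
termCounts <- t(vapply(docs, countTerms,
                       integer(length(dictionary)),
                       dictionary = dictionary))
termCounts
```

Fixing the factor levels to the dictionary keeps the columns identical across files, so the per-file results line up in a single matrix. Multi-word proper nouns would need a different matching step than this token-by-token comparison.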

> On May 5, 2017, at 1:58 AM, θ ＂ <yarmi1224 at hotmail.com> wrote:
> 
> θ ＂ has shared OneDrive files with you. To view the files, click the links below.
> 
> 2.corpus_patent text.PNG <https://1drv.ms/u/s!Aq27nOPOP5izgVRRxXomVBv0YV0j>
> 3ontology_proper nouns keywords.PNG <https://1drv.ms/u/s!Aq27nOPOP5izgVURiS7MbYH6hJzo>
> 1.patents.PNG <https://1drv.ms/u/s!Aq27nOPOP5izgVYuRVxM1OyzIPzF>
> 
> Hi,
> 
> I want to do text mining on patents in R.
> I need to build a dictionary from the proper nouns of a domain ontology,
> then use that dictionary to analyze my corpus of patent files.
> I want to count the proper nouns and get the frequency of each word in each file.
> 
> So far I have preprocessed the corpus and extracted the proper nouns from the domain ontology,
> but I have no idea how to build a proper-noun dictionary and use it to analyze my corpus.
> 
> The attachments are my texts, the preprocessed corpus, and the proper nouns.
> 
> Thanks.