[R] Parsing and counting expressions in .txt-files

Bert Gunter bgunter.4567 at gmail.com
Wed Apr 20 23:35:29 CEST 2016


Also check out this CRAN task view:

https://cran.r-project.org/web/views/NaturalLanguageProcessing.html
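
For the task described in the quoted message below, a minimal base-R sketch
might look like the following. It is only one possible approach and rests on
several assumptions: keywords are matched literally and case-insensitively (so
a synonym such as "exceeded" would not count as "surpassed"), the ID is the
first run of ten digits after the label "Personnel number:", and that label
contains no regular-expression metacharacters. The names scan_text,
scan_folder, keywords and id_label are illustrative and do not come from the
original post.

```r
# Hypothetical inputs; adjust to the real keywords and label.
keywords <- c("surpassed", "achieved", "very motivated")
id_label <- "Personnel number:"  # assumed label preceding the ten-digit ID

# Analyse one document, given as a single character string.
scan_text <- function(txt, keywords, id_label) {
  # 1 if the keyword occurs literally (ignoring case), otherwise 0
  flags <- vapply(keywords, function(k) {
    as.integer(grepl(tolower(k), tolower(txt), fixed = TRUE))
  }, integer(1))

  # Find the label followed by optional whitespace and ten digits
  pattern <- paste0(id_label, "[[:space:]]*[0-9]{10}")
  m <- regmatches(txt, regexpr(pattern, txt))
  # Keep the ID as text so that a leading zero is not lost
  id <- if (length(m) == 1) substring(m, nchar(m) - 9) else NA_character_

  data.frame(id = id, t(flags), check.names = FALSE,
             stringsAsFactors = FALSE)
}

# Analyse every .txt file in a folder and write one line per file.
scan_folder <- function(dir, keywords, id_label, out_file) {
  files <- list.files(dir, pattern = "\\.txt$", full.names = TRUE)
  rows <- lapply(files, function(f) {
    # Join lines with a space so a keyword split across lines still matches
    txt <- paste(readLines(f, warn = FALSE), collapse = " ")
    scan_text(txt, keywords, id_label)
  })
  result <- do.call(rbind, rows)
  # Semicolon-separated, no header row, no quoting
  write.table(result, out_file, sep = ";", quote = FALSE,
              row.names = FALSE, col.names = FALSE)
  invisible(result)
}

# Small self-contained demonstration using temporary files
demo_dir <- tempfile("texts")
dir.create(demo_dir)
writeLines(c("Personnel number: 0123456789", "",
             "The employee has surpassed the set targets and was very motivated."),
           file.path(demo_dir, "text1.txt"))
out <- tempfile(fileext = ".txt")
res <- scan_folder(demo_dir, keywords, id_label, out)
readLines(out)
# "0123456789;1;0;1"
```

Writing the output file outside the input folder, as the demonstration does,
keeps a second run from picking up the results file as if it were another
input document.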

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Apr 20, 2016 at 9:07 AM, Alexander Nikles <24790 at novasbe.pt> wrote:
> Dear Community,
>
>
>
> I hope I have chosen the right category, because I am relatively new to
> the "R" world. I have a fairly challenging problem. I would like "R" to
> read text files sequentially (there are several hundred in my folder) and
> screen them for specific terms. If a term is found, the program should
> write a 1; if not, a 0. Another task is to extract a ten-digit number that
> follows a particular keyword in each file, so that I can map the results.
> Ideally, the program should write its output to a .txt file.
>
>
>
> A brief example:
>
>
>
> Keywords: "surpassed", "achieved", "very motivated"
>
> Text1:
>
> "Personnel number: 0123456789
>
>
>
> The employee has exceeded the set targets and was also otherwise always
> motivated (...) "
>
>
>
> For this case, I would ideally like my program to produce the following
> (in rows and columns):
>
>
>
> Personnel number;surpassed;achieved;very motivated (do not write)
> 0123456789;1;0;1
>
>
> For the remaining files, the output should continue analogously in lines
> 2, 3, 4, and so on.
>
>
>
> Could you give me a brief assessment of how to implement something like
> this? How should I best get started, and have you perhaps already come
> across something similar in R? I am grateful for any suggestions.
>
>
>
> Thank you in advance,
>
>
>
> Alex


