[R] fgrep with caret (^) meta-character in system() call
Ken
vicvoncastle at gmail.com
Wed Oct 5 07:07:42 CEST 2011
man awk?
I've used awk for similar tasks (if I am reading the post correctly). Google-Fu should turn up some useful examples.
Also, awk should already be on your Linux installation in some form or another.
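As a rough sketch of what I mean (the helper name match_prefixes and the file names patterns.txt and textFile.txt are made up for illustration, and I am assuming one pattern per line in the pattern file): note that fgrep takes its pattern as a fixed string, so a leading ^ is matched literally rather than as an anchor. awk's index() also compares literally, but it returns the position of the match, so you can test for position 1 yourself.

```shell
#!/bin/sh
# match_prefixes PATTERN_FILE TEXT_FILE  (hypothetical helper name)
# Prints every line of TEXT_FILE that begins with one of the literal
# strings listed, one per line, in PATTERN_FILE.
match_prefixes() {
  awk '
    NR == FNR { pats[$0]; next }      # first file: remember each pattern
    {
      for (p in pats)                 # second file: test each pattern
        if (index($0, p) == 1) {      # literal match starting at column 1
          print
          break
        }
    }
  ' "$1" "$2"
}

# Demo with the three example lines from the original post.
tmp=$(mktemp -d)
printf '%s\n' aaa ccc > "$tmp/patterns.txt"
printf '%s\n' aaaooooooooooo bbbiiiiiiiiiii aacttttttttaaa > "$tmp/textFile.txt"
match_prefixes "$tmp/patterns.txt" "$tmp/textFile.txt"
rm -r "$tmp"
```

The demo prints only aaaooooooooooo. The inner loop tries every pattern on every line, so with thousands of patterns it may still be slow; if all patterns happen to have the same length, a single lookup such as substr($0, 1, n) in pats would avoid the loop. From R, the output of such a command can be captured with system(..., intern = TRUE).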
Regards,
Ken Hutchison
On Oct 4, 2554 BE, at 10:52 PM, "Tom D. Harray" <tomdharray at gmail.com> wrote:
> Hi there,
>
> I would like to use my Linux system's fgrep to search for a text pattern
> in a file. Calling system with
>
> system("fgrep \"SearchPattern\" /path/to/the/textFile.txt")
>
> works in general, but I need to search for the search pattern at the
> beginning of the line.
>
> The corresponding shell command
>
> fgrep "^SearchPattern" /path/to/the/textFile.txt
>        |
>        |___ here's my problem
>
> does exactly what I want. I tried various combinations of ", \", \^, but
> failed to make system() work.
>
> How can I call the working shell command including the caret
> meta-character with system()?
>
> Thanks and regards,
>
> dirk
>
>
> P.S.: Actually I have to search for about 5,000 patterns, stored in an R
> list, in a text file with about 30,000,000 lines. Each pattern may appear
> in one or more lines of the text file. A line has to be extracted only if
> a pattern occurs at the beginning of that line.
>
> Example with matching line 1, non-matching line 2, and non-matching line 3
> (line 3 contains aaa, but not at the beginning of the line):
>
> SearchPattern = "^aaa"
>
> Text file: aaaooooooooooo
>            bbbiiiiiiiiiii
>            aacttttttttaaa
>
> Going line by line through the file in R is too slow, and I cannot
> program it in C or C++. Hence I use the fgrep command. I would
> appreciate it if anyone could suggest a fast alternative that works with
> R on both Linux and Windows systems.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.