[R] Search within a file
Seth Falcon
sfalcon at fhcrc.org
Fri Nov 4 07:43:40 CET 2005
On 3 Nov 2005, JAROSLAW.W.TUSZYNSKI at saic.com wrote:
> I am looking for a way to search a file for position of some
> expression, from within R. My current code:
>
> sha1Pos = gregexpr("<sha1>", readChar(filename,
> file.info(filename)$size))[[1]]
>
> Works fine for small files, but text files I will be working with
> might get up to Gb range, so I was trying to accomplish the same
> without loading the whole file into R.
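I would think you could use readLines to read in a batch of lines, run
(g)regexpr on it, and keep track of the matches and their positions.
Create a connection to the file using file() and open it first (for
example with open = "r"); subsequent calls to readLines will then
start where the previous call left off. If the connection is not
open, each readLines call opens it afresh and starts from the top.
You will need to adjust the position indices returned by gregexpr
by how far into the file you are. readLines drops the line
terminators, so count those too. Seems very doable.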
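A minimal sketch of that idea, not tested on Gb-sized files. The
function name find_in_file and its arguments batch_size and eol_bytes
are made up for illustration. It assumes the pattern is a fixed string
that never spans a line break, and that every line ends in a
terminator of eol_bytes bytes (1 for "\n", 2 for "\r\n"). Positions
are 1-based byte offsets, which agree with the character positions
from the readChar approach only for single-byte text.

```r
# Hypothetical helper: report 1-based byte offsets of a fixed pattern,
# reading the file in batches of lines instead of all at once.
find_in_file <- function(filename, pattern, batch_size = 10000L,
                         eol_bytes = 1L) {
  con <- file(filename, open = "r")   # opened, so reads resume in place
  on.exit(close(con))
  offset <- 0                         # bytes consumed before this batch
  hits <- numeric(0)
  repeat {
    lines <- readLines(con, n = batch_size, warn = FALSE)
    if (length(lines) == 0L) break
    # Bytes per line, plus the terminator that readLines stripped.
    # Doubles are used so offsets do not overflow past 2^31 - 1.
    widths <- as.numeric(nchar(lines, type = "bytes")) + eol_bytes
    # Number of bytes in the file before each line of this batch.
    starts <- offset + cumsum(c(0, widths[-length(widths)]))
    m <- gregexpr(pattern, lines, fixed = TRUE, useBytes = TRUE)
    for (i in seq_along(m)) {
      pos <- as.vector(m[[i]])        # -1 means no match in this line
      if (pos[1L] != -1L) hits <- c(hits, starts[i] + pos)
    }
    offset <- offset + sum(widths)
  }
  hits
}
```

Usage would be along the lines of
sha1Pos <- find_in_file(filename, "<sha1>"). A larger batch_size
means fewer readLines calls at the cost of more memory per batch.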
+ seth