[R] Using readLines on a file without resetting internal file offset parameter?

William Dunlap wdunlap at tibco.com
Wed Oct 29 17:22:50 CET 2014


Open your file connection before calling readLines, and close it when you
are done with the sequence of calls to readLines.

  > tf <- tempfile()
  > cat(sep="\n", letters[1:10], file=tf)
  > f <- file(tf)
  > open(f)
  > # or f <- file(tf, "r") instead of previous 2 lines
  > readLines(f, n=1)
  [1] "a"
  > readLines(f, n=1)
  [1] "b"
  > readLines(f, n=2)
  [1] "c" "d"
  > close(f)
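
The same pattern extends to a loop that handles one line at a time, so
the whole file never has to sit in memory. This is only a minimal sketch:
process_lines is a hypothetical helper name, not part of R, and the
example uses writeLines rather than cat to create the test file so that
the last line ends with a newline.

```r
# Hypothetical helper: call fun() on each line of `path`, one line at a time.
process_lines <- function(path, fun) {
  con <- file(path, "r")          # open the connection for reading
  on.exit(close(con))             # close it even if fun() raises an error
  n <- 0L
  repeat {
    line <- readLines(con, n = 1)
    if (length(line) == 0) break  # character(0) means end of file
    fun(line)
    n <- n + 1L
  }
  invisible(n)                    # number of lines processed
}

tf <- tempfile()
writeLines(letters[1:10], tf)

seen <- character(0)
count <- process_lines(tf, function(x) seen <<- c(seen, toupper(x)))
```

Because the connection stays open for the whole loop, each readLines call
continues from where the previous one stopped.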

I/O operations on an unopened connection generally open it, do the operation,
then close it, so each readLines call on an unopened connection starts again
at the beginning of the file.
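
To see the difference side by side, here is a small sketch that reads the
same two-line file through an unopened and an opened connection. The
variable names are only illustrative.

```r
tf <- tempfile()
writeLines(c("1", "2"), tf)

# Unopened connection: every readLines call opens it, reads, and closes it.
f <- file(tf)
f_open <- isOpen(f)               # FALSE: created but not yet opened
first  <- readLines(f, n = 1)
second <- readLines(f, n = 1)     # reopened, so it reads the first line again
close(f)                          # destroy the connection object

# Opened connection: the read position is kept between calls.
g <- file(tf, "r")
g1 <- readLines(g, n = 1)
g2 <- readLines(g, n = 1)
close(g)

print(c(first, second))           # the unopened case repeats line 1
print(c(g1, g2))                  # the opened case advances to line 2
```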

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Wed, Oct 29, 2014 at 8:23 AM, Thomas Nyberg <tomnyberg at gmail.com> wrote:
> Hi everyone,
>
> I would like to read a file line by line, but I would rather not load all
> lines into memory first. I've tried using readLines with n = 1, but that
> seems to reset the internal file descriptor's file offset after each call.
> I.e. this is the current behavior:
>
> -------
>
> bash $ echo 1 > testfile
> bash $ echo 2 >> testfile
> bash $ cat testfile
> 1
> 2
>
> bash $ R
> R > f <- file('testfile')
> R > readLines(f, n = 1)
> [1] "1"
> R > readLines(f, n = 1)
> [1] "1"
>
> -------
>
> I would like the behavior to be:
>
> -------
>
> bash $ R
> R > f <- file('testfile')
> R > readLines(f, n = 1)
> [1] "1"
> R > readLines(f, n = 1)
> [1] "2"
>
> -------
>
> I'm coming to R from a python background, where the default behavior is
> exactly the opposite. I.e. when you read a line from a file it is your
> responsibility to use seek explicitly to get back to the original position
> in the file (this is rarely necessary though). Is there some flag to turn
> off the default behavior of resetting the file offset in R?
>
> Cheers,
> Thomas


