[R] How to read last (incomplete) line from gzfile()?
Prof Brian Ripley
ripley at stats.ox.ac.uk
Sun Apr 20 20:04:33 CEST 2008
On Sun, 20 Apr 2008, Stephen Eglen wrote:
> Hi,
>
> I have some text files that do not have trailing \n on the last
> (incomplete) line; how do I read in the last line?
>
> e.g. here is a test case: [linux + R version 2.6.1 (2007-11-26) ]
>
>
> z <- gzfile("short.gz", open="w")
> cat("7\n5\n3", file=z)
> close(z)
>
> z <- gzfile('short.gz')
> readLines(z)
>
> [1] "7" "5"
>
> The help for readLines indicates that for blocking connections (which
> I assume this is), the last line (containing 3) is read in and a
> warning is generated. I get the desired behaviour if I use file() on
> a file that is not compressed:
>
> y <- file("short", open="w")
> cat("7\n5\n3", file=y)
> close(y)
>
> y <- file('short')
> readLines(y)
>
> [1] "7" "5" "3"
> Warning message:
> In readLines(y) : incomplete final line found on 'short'
>
> What am I missing?
gzfile() is not a text-mode connection, and the help page for readLines
is wrong for such connections.
E.g.
> readLines(pipe("gzip -dc short.gz"))
[1] "7" "5"
does the same. Only text-mode blocking connections return the incomplete
final line (with a warning).
(I just looked at the code, and you might have found that illuminating.)
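
One possible workaround, sketched here rather than taken from the R sources, is to avoid readLines altogether: read the decompressed bytes with readBin() and split them on "\n". The helper name read_all_lines and the chunk size are made up for illustration, and the sketch assumes the file fits in memory, uses "\n" line endings and contains no embedded nuls.

```r
# Hypothetical helper: read every line of a gzip-compressed text file,
# including a final line that lacks a trailing "\n".
read_all_lines <- function(path, chunk = 65536L) {
  con <- gzfile(path, open = "rb")   # binary mode, so no line handling is applied
  on.exit(close(con))
  bytes <- raw(0)
  repeat {
    buf <- readBin(con, what = "raw", n = chunk)
    if (length(buf) == 0L) break     # nothing left to decompress
    bytes <- c(bytes, buf)           # simple accumulation; fine for small files
  }
  # strsplit() drops the empty string after a trailing "\n", so complete and
  # incomplete final lines are both returned exactly once.
  strsplit(rawToChar(bytes), "\n", fixed = TRUE)[[1]]
}
```

With the short.gz test file from the question, read_all_lines("short.gz") should give all three lines, whatever readLines does with the final one.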
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595