[R] Possible bug in gzcon() (6161:src/main/connections.c)

André Wildberg
Fri Apr 25 15:41:35 CEST 2025


Hi developers,


I originally sent this as a bug report request, but it was re-routed here.

Problem:
Connections established via gzcon() (also used by packages such as vroom and readr) may stop reading, or close the connection, prematurely. See https://stackoverflow.com/questions/79587028/read-csv-only-reads-a-fraction-of-rows-from-a-zipped-file-when-reading-from-url#comment140365314_79587028

Reproducible example:

addr <- "https://www1.ncdc.noaa.gov/pub/data/ghcn/daily/by_station/USW00014839.csv.gz"

# online/stream
nrow(read.csv(gzcon(url(addr), text = TRUE), header = FALSE))
# [1] 1798

# local
download.file(addr, destfile = basename(addr))
nrow(read.csv(gzcon(file(basename(addr), "r"), text = TRUE), header = FALSE))
# [1] 429498

# or
nrow(read.csv(gzfile(basename(addr), "r"), header = FALSE))
# [1] 429498

closeAllConnections()
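
Since reading the downloaded copy with gzfile() returned the full row count above, one possible workaround is to fetch the file to a temporary path first and avoid streaming through gzcon(url(...)). This is only a minimal sketch: read_csv_gz_via_tempfile is a hypothetical helper name, not part of base R or any package, and it does not address the underlying cause.

```r
# Hypothetical helper (assumption, not an existing API): download a gzipped
# CSV to a temporary file, then read it with gzfile() instead of gzcon(url()).
read_csv_gz_via_tempfile <- function(src, ...) {
  tmp <- tempfile(fileext = ".csv.gz")
  # Remove the temporary copy once the data has been read
  on.exit(unlink(tmp), add = TRUE)
  # mode = "wb" keeps the compressed bytes intact on every platform
  download.file(src, destfile = tmp, mode = "wb", quiet = TRUE)
  # read.csv() opens and closes the gzfile connection itself
  read.csv(gzfile(tmp), ...)
}

# Usage with the URL from the example above (requires network access):
# nrow(read_csv_gz_via_tempfile(addr, header = FALSE))
```

The trade-off is extra disk I/O for the temporary copy, in exchange for not depending on the streaming path that appears to stop early.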

> sessionInfo()
R version 4.5.0 (2025-04-11)
Platform: aarch64-apple-darwin24.4.0
Running under: macOS Sequoia 15.4.1

The problem is likely to be platform independent.
