[R] string size limits in RCurl
Elmore, Ryan
Ryan.Elmore at nrel.gov
Wed Apr 24 18:45:58 CEST 2013
Hi All,
I am running into what appears to be a character size limit on a JSON string when trying to retrieve data with either `curlPerform()` or `getURL()`. The code below is not reproducible [1], but it should shed some light on the problem.
# Note that .base.url is the basic url for the API, q is a query, user
# is specified, etc.
session <- getCurlHandle()
curl.opts <- list(userpwd = paste(user, ":", key, sep = ""),
                  httpheader = "Content-Type: application/json")
request <- paste(.base.url, q, sep = "")
txt <- getURL(url = request, curl = session, .opts = curl.opts,
              write = basicTextGatherer())
or
r <- dynCurlReader()
curlPerform(url = request, writefunction = r$update, curl = session,
            .opts = curl.opts)
My guess is that the `update` or `value` functions in the `basicTextGatherer` or `dynCurlReader` text handler objects are having trouble with large strings. In this example, `r$value()` returns a string that is truncated at approximately 2 MB. The code given above works fine when the response is smaller than about 2 MB.
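One way to narrow this down is to count the bytes handed to the write callback and compare that count with the size of the assembled string. The following is a minimal sketch in base R, not RCurl's own implementation: `make_gatherer` is a hypothetical helper, and it assumes that a write callback receives each chunk as a character string and returns the number of bytes it consumed. The chunks here are synthetic, so it runs without network access.

```r
# Hypothetical helper (not part of RCurl): collects chunks and counts bytes.
make_gatherer <- function() {
  chunks <- character()
  bytes <- 0
  update <- function(str) {
    n <- nchar(str, type = "bytes")
    chunks <<- c(chunks, str)   # keep every chunk as received
    bytes <<- bytes + n         # running total of bytes seen
    n                           # assumed contract: report bytes consumed
  }
  list(
    update = update,
    value  = function() paste(chunks, collapse = ""),
    bytes  = function() bytes
  )
}

# Offline check with three synthetic 1000-byte chunks.
g <- make_gatherer()
for (i in 1:3) g$update(strrep("x", 1000))
stopifnot(nchar(g$value(), type = "bytes") == g$bytes())
```

If `g$update` were passed as the `writefunction` in the `curlPerform()` call above, `g$bytes()` could then be compared with the transfer size reported by `getCurlInfo(session)`. A mismatch would suggest the data is lost before it reaches the callback; a match would point at how the string is assembled or printed afterwards.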
Note that I can easily do the following from the command line (or using `system()` in R), but writing to disk seems like a waste if I am doing the subsequent analysis in R.
curl -v --header "Content-Type: application/json" --user username:register:passwd https://base.url.for.api/getdata/select+*+from+sometable > stream.json
where `stream.json` is a roughly 14 MB JSON string. I can read the string into R using either
con <- file(paste(.project.path, "data/stream.json", sep = ""), "r")
string <- readLines(con)
close(con)  # release the connection once the file has been read
or directly into a list with
tmp <- fromJSON(file = paste(.project.path, "data/stream.json", sep = ""))
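To confirm that nothing is lost on the file route, the size on disk can be compared with the size of the string once it is in R. This is only a sketch under stated assumptions: the temporary file and the synthetic payload stand in for `stream.json`, and `readChar()` is used here instead of `readLines()` so that the file comes back as a single string.

```r
# Write a synthetic JSON-like payload to a temporary file (stand-in for stream.json).
path <- tempfile(fileext = ".json")
payload <- paste0("[", paste(rep("{\"a\":1}", 1000), collapse = ","), "]")
writeLines(payload, path, sep = "")

# Read the whole file back as one string and compare byte counts.
size_on_disk <- file.info(path)$size
string <- readChar(path, nchars = size_on_disk, useBytes = TRUE)
stopifnot(nchar(string, type = "bytes") == size_on_disk)

unlink(path)  # clean up the temporary file
```

The same byte count can then be checked against whatever `getURL()` or `r$value()` returns for the identical query.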
Any thoughts are very much appreciated. Note that I posted this same question to Stack Overflow and will happily pass along any helpful suggestions there as well.
Ryan
[1] - Sorry for not providing reproducible code, but I'm dealing with a govt firewall.