[R] RCurl and cookies in POST requests
Christian M.
chr+r-help at tx0.org
Sun Nov 14 15:30:30 CET 2010
Hello.
I know that it's usually possible to write cookies to a cookie
file by removing the curl handle and doing a gc() call. I can do
this with getURL(), but I just can't obtain the same results with
postForm().
If I use:
curlHandle <- getCurlHandle(cookiefile=FILE, cookiejar=FILE)
and then do:
getURL("http://example.com/script.cgi", curl=curlHandle)
rm(curlHandle)
gc()
it's OK, the cookie is there. But, if I do (same handle; the
parameter is a dummy):
postForm(site, .params=list(par="cookie"), curl=curlHandle,
style="POST")
rm(curlHandle)
gc()
no cookie is written.
I'm probably doing something wrong, but I don't know what.
Is it possible to store cookies read from the output of a
postForm() call? How?
Thanks.
Christian
PS.: I'm attaching a script that can be sourced (and its .txt
version). It contains an example. The expected result is a file
(cookies.txt) with two cookies. The script currently uses
getURL(), and two cookies are stored. If postForm() is used
instead (currently commented out), only one cookie is written.
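
To check which cookies actually reached the file after a run, something like the following might help. It is only a sketch in base R: read_cookie_jar() is a hypothetical helper (not part of RCurl), and it assumes the file uses the tab-separated Netscape cookie format that libcurl writes. The path, flag and expiry fields in the sample lines are made up for illustration; only the cookie names and values match the example.

```r
# Hypothetical helper (not part of RCurl): parse a Netscape-format
# cookie file into a data frame with one row per cookie.
read_cookie_jar <- function(path) {
    lines <- readLines(path, warn = FALSE)
    # libcurl prefixes HttpOnly cookies with "#HttpOnly_"; keep those lines.
    lines <- sub("^#HttpOnly_", "", lines)
    # Drop blank lines and ordinary comment lines.
    lines <- lines[nzchar(lines) & !grepl("^#", lines)]
    fields <- strsplit(lines, "\t", fixed = TRUE)
    # A cookie line has seven fields; strsplit() drops a trailing empty
    # value, so accept six fields too and pad the missing value with "".
    fields <- fields[lengths(fields) %in% 6:7]
    fields <- lapply(fields, function(f) c(f, rep("", 7 - length(f))))
    m <- do.call(rbind, fields)
    if (is.null(m)) m <- matrix(character(0), ncol = 7)
    colnames(m) <- c("domain", "subdomains", "path", "secure",
                     "expires", "name", "value")
    as.data.frame(m, stringsAsFactors = FALSE)
}

# Self-contained demonstration with a temporary file standing in for
# cookies.txt (all fields other than name and value are illustrative).
jar <- tempfile(fileext = ".txt")
writeLines(c(
    "# Netscape HTTP Cookie File",
    "",
    "chr.tx0.org\tFALSE\t/\tFALSE\t0\tXXX\tcookie1",
    "chr.tx0.org\tFALSE\t/\tFALSE\t0\tZZZ\tcookie2"
), jar)
cookies <- read_cookie_jar(jar)
print(cookies[, c("name", "value")])
both_present <- all(c("XXX", "ZZZ") %in% cookies$name)
```

Running read_cookie_jar("cookies.txt") after the getURL() and postForm() variants would show directly which of the two cookies is missing.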
--
SDF Public Access UNIX System - http://sdf.lonestar.org
-------------- next part --------------
# The script will call no_cookie() if no cookies file exists.
# It will then read the cookie XXX (its value is "cookie1").
# When XXX has been read, it will be written to "c_file" with
# "rm(curlHandle) ; gc()". Finally the script will test whether
# cookie ZZZ (its value is "cookie2") exists. If it doesn't,
# then the script will retrieve the same URL as before, which
# will reply with ZZZ if XXX was sent.
#
# If XXX and ZZZ are in "cookies.txt", this should be removed
# to test again (otherwise the script will find both cookies
# and won't do anything).
library(RCurl)
site <- "http://chr.tx0.org/arch/ml/r/cookie-20101114.cgi"
c_file <- "cookies.txt"
no_cookie <- function() {
    curlHandle <- getCurlHandle(cookiefile=c_file, cookiejar=c_file)
    getURL(site, curl=curlHandle)
    rm(curlHandle)
    gc()
}
if ( !file.exists(c_file) ) {
    file.create(c_file)
    no_cookie()
}
cookie_1 <- sub(".*(XXX)\t", "", grep("XXX", readLines(c_file), value=TRUE))
if ( length(grep("ZZZ", readLines(c_file))) == 0 ) {
    curlHandle <- getCurlHandle(cookiefile=c_file, cookiejar=c_file)
    # Either getURL OR postForm:
    aaa <- getURL(site, curl=curlHandle)
    #aaa <- postForm(site, .params=list(par=cookie_1), curl=curlHandle, style="POST")
    # Debug POST:
    #d = debugGatherer()
    #postForm(site, .params=list(par=cookie_1), .opts=list(httpheader = c(
    #    header = "foo: bar"), debugfunction = d$update, verbose=TRUE),
    #    curl=curlHandle, style="POST")
    #write.table(d$value()[["headerIn"]], file="debug_in.txt")
    #write.table(d$value()[["headerOut"]], file="debug_out.txt")
    rm(curlHandle)
    gc()
}