[R] read.csv fails to read a CSV file from google docs
Duncan Temple Lang
duncan at wald.ucdavis.edu
Sat Apr 30 00:00:32 CEST 2011
Hi Tal
You can add
ssl.verifypeer = FALSE
in the .opts list so that the certificate is accepted without being verified.
Alternatively, you can tell libcurl where to find the certificate
authority (CA) file containing the trusted certificates. This can be done via the cainfo
option, e.g.
cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl"),
Often such a collection of certificates is installed with the ssl library.
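
To make the two options concrete, here is a minimal sketch that has not been run against the live spreadsheet. The helper names sheet_opts() and fetch_sheet() are hypothetical and introduced only for illustration; getForm(), its .opts argument, the option names and the cacert.pem path are the ones used in this thread.

```r
# Build the curl options for the request.
# sheet_opts() is a hypothetical helper, not part of RCurl.
sheet_opts <- function(verify = TRUE) {
  opts <- list(followlocation = TRUE)
  if (verify) {
    # Tell libcurl which CA bundle to check the server certificate against.
    opts$cainfo <- system.file("CurlSSL", "cacert.pem", package = "RCurl")
  } else {
    # Skip verification: the certificate is accepted unchecked.
    opts$ssl.verifypeer <- FALSE
  }
  opts
}

# fetch_sheet() is also hypothetical; calling it needs RCurl and network access.
fetch_sheet <- function(key, verify = TRUE) {
  tt <- RCurl::getForm("http://spreadsheets0.google.com/spreadsheet/pub",
                       hl = "en", key = key, single = "true", gid = "0",
                       output = "csv", .opts = sheet_opts(verify))
  # tt holds the CSV text; parse it into a data frame.
  read.csv(textConnection(tt))
}
```

Calling fetch_sheet() with the spreadsheet key should then give the data frame, assuming the sheet is still published at that address. The cainfo route is the safer default, since turning off ssl.verifypeer also removes the protection against a spoofed server.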
D.
On 4/29/11 2:42 PM, Tal Galili wrote:
> Hello Duncan,
> Thank you for having a look at this.
>
> I tried the code you provided, but it failed at the getForm stage. Running this:
>
> > tt = getForm("http://spreadsheets0.google.com/spreadsheet/pub",
> + hl ="en", key = "0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE",
> + single = "true", gid ="0",
> + output = "csv",
> + .opts = list(followlocation = TRUE, verbose = TRUE))
>
> Resulted in the following error:
>
> Error in curlPerform(url = url, headerfunction = header$update, curl = curl, :
> SSL certificate problem, verify that the CA cert is OK. Details:
> error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
>
>
> Did I miss some step?
>
> On Fri, Apr 29, 2011 at 9:18 PM, Duncan Temple Lang <duncan at wald.ucdavis.edu <mailto:duncan at wald.ucdavis.edu>> wrote:
>
>
> Thanks David for fixing the early issues.
>
> The reason for the failure is that the response
> from the Web server is a redirect sending the requester
> to another page, specifically
>
> https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
>
> Note that this is https, not http, and the built-in URL reading facilities in R don't support https.
>
>
> One way to see this is to look at the headers in your browser (e.g. Live HTTP Headers),
> or to use curl, or the RCurl package
>
> tt = getForm("http://spreadsheets0.google.com/spreadsheet/pub",
> hl ="en", key = "0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE",
> single = "true", gid ="0",
> output = "csv",
> .opts = list(followlocation = TRUE, verbose = TRUE))
>
>
> The verbose option shows the entire dialog, and tt contains the
> text of the CSV document.
>
> read.csv(textConnection(tt))
>
> then yields the data frame
>
> D.
>
>
> On 4/29/11 10:36 AM, David Winsemius wrote:
> >
> > On Apr 29, 2011, at 11:19 AM, Tal Galili wrote:
> >
> >> Hello all,
> >> I wish to use read.csv to read a google doc spreadsheet.
> >>
> >> I try using the following code:
> >>
> >> data_url <- "
> >>
> http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
> >>
> >> "
> >> read.csv(data_url)
> >>
> >> Which results in the following error:
> >>
> >> Error in file(file, "rt") : cannot open the connection
> >>
> >>
> >> I'm on Windows 7, and the code was tried on R 2.12 and 2.13.
> >>
> >> I remember trying this a few months ago and it worked fine.
> >
> > I am always amused at such claims. Occasionally they are correct, but more often a crucial step has been omitted. In
> > this case you have at a minimum embedded line-feeds in your URL string and have not established a connection, so it
> > could not possibly have succeeded as presented.
> >
> > But now it's time to admit I do not know why it is not succeeding when I correct those flaws.
> >
> >> closeAllConnections()
> >> data_url <-
> >
> url("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv")
> >
> >> read.csv(data_url)
> > Error in open.connection(file, "rt") : cannot open the connection
> >
> >> closeAllConnections()
> >> dd <- read.csv(con <-
> >
> url("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv"))
> >
> > Error in open.connection(file, "rt") : cannot open the connection
> >
> >
> > So, I guess I'm not reading the help pages for `url` and `read.csv` as well as I thought I was.
> >
> >
> >> Any suggestion what might be causing this or how to solve it?
> >
> >
>