[R] read.csv fails to read a CSV file from google docs
Duncan Temple Lang
duncan at wald.ucdavis.edu
Fri Apr 29 20:18:13 CEST 2011
Thanks David for fixing the early issues.
The reason for the failure is that the response
from the Web server is a redirect, sending the requester
to another page, specifically
https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
Note that this is https, not http, and the built-in URL reading facilities in R don't support https.
One way to see this is to look at the headers in your browser (e.g. Live HTTP Headers),
or to use curl, or the RCurl package:
library(RCurl)
# followlocation = TRUE makes curl follow the redirect to the https page
tt = getForm("http://spreadsheets0.google.com/spreadsheet/pub",
             hl = "en", key = "0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE",
             single = "true", gid = "0",
             output = "csv",
             .opts = list(followlocation = TRUE, verbose = TRUE))
The verbose option shows the entire dialog, and tt contains the
text of the CSV document.
read.csv(textConnection(tt))
then yields the data frame
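
For anyone who wants to try that last step without network access, here is a minimal sketch. The string tt_example and its contents are made up for illustration (they are not the spreadsheet's data); it simply stands in for the text that getForm() returns.

```r
# Hypothetical stand-in for the CSV text returned by getForm();
# the column names and values are invented for this example.
tt_example <- "id,score\n1,3.5\n2,4.25\n"

# textConnection() lets read.csv treat a character string like a file.
con <- textConnection(tt_example)
dd <- read.csv(con)

# A text connection is created already open, so close it explicitly.
close(con)

str(dd)
```

Later versions of R also accept read.csv(text = tt_example), which avoids managing the connection by hand.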
D.
On 4/29/11 10:36 AM, David Winsemius wrote:
>
> On Apr 29, 2011, at 11:19 AM, Tal Galili wrote:
>
>> Hello all,
>> I wish to use read.csv to read a google doc spreadsheet.
>>
>> I try using the following code:
>>
>> data_url <- "
>> http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
>>
>> "
>> read.csv(data_url)
>>
>> Which results in the following error:
>>
>> Error in file(file, "rt") : cannot open the connection
>>
>>
>> I'm on windows 7. And the code was tried on R 2.12 and 2.13
>>
>> I remember trying this a few months ago and it worked fine.
>
> I am always amused at such claims. Occasionally they are correct, but more often a crucial step has been omitted. In
> this case you have at a minimum embedded line-feeds in your URL string and have not established a connection, so it
> could not possibly have succeeded as presented.
>
> But now it's time to admit I do not know why it is not succeeding when I correct those flaws.
>
>> closeAllConnections()
>> data_url <-
> url("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv")
>
>> read.csv(data_url)
> Error in open.connection(file, "rt") : cannot open the connection
>
>> closeAllConnections()
>> dd <- read.csv(con <-
> url("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv"))
>
> Error in open.connection(file, "rt") : cannot open the connection
>
>
> So, I guess I'm not reading the help pages for `url` and `read.csv` as well as I thought I was.
>
>
>> Any suggestion what might be causing this or how to solve it?
>
>