[R] read.csv fails to read a CSV file from google docs
Philipp Pagel
p.pagel at wzw.tum.de
Fri Apr 29 20:27:45 CEST 2011
On Fri, Apr 29, 2011 at 06:19:24PM +0300, Tal Galili wrote:
>
> data_url <- "
> http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
> "
> read.csv(data_url)
> Error in file(file, "rt") : cannot open the connection
I get the same error (R 2.11.1, Debian LINUX) and don't have a
solution. But I did some tests and found the origin of the problem
I can download the file from google with wget but get some interesting
´information in the process:
------------------------------------------------------------
$ wget -v 'http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv'
--2011-04-29 20:07:40-- http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
Resolving spreadsheets0.google.com... 209.85.148.139, 209.85.148.113, 209.85.148.138, ...
Connecting to spreadsheets0.google.com|209.85.148.139|:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv [following]
--2011-04-29 20:07:41-- https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
Connecting to spreadsheets0.google.com|209.85.148.139|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: “pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv.1”
[ <=> ] 41 --.-K/s in 0s
2011-04-29 20:07:42 (342 KB/s) - “pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv.1” saved [41]
------------------------------------------------------------
The message that caught my attention was the http redirection: "302 Moved
Temporarily".
If you try again with the new url you get this:
> read.csv(url("https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&g"))
Error in open.connection(file, "rt") : cannot open the connection
In addition: Warning message:
In open.connection(file, "rt") : unsupported URL scheme
?url told me "Note that ‘https://’ connections are not supported."
Case closed, problem unsolved...
Dirty workaround: use system() and wget or whatever command is available on
Windows for this.
cu
Philipp
--
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/
More information about the R-help
mailing list