[R] read.*: How to read from a URL?
Martin Morgan
mtmorgan at fhcrc.org
Thu Dec 11 00:08:25 CET 2008
Prof Brian Ripley wrote:
> On Wed, 10 Dec 2008, hadley wickham wrote:
>
>> Hi Michael,
>>
>> In general, I think you should be able to do:
>>
>> gimage <- read.jpeg(url(gimageloc))
>
> Note that would not be really correct: it would need to be
>
> gimage <- read.jpeg(con <- url(gimageloc))
> close(con)
>
> since it otherwise leaks a connection (which would eventually be closed).
>
> However, from ?read.jpeg
>
> Arguments:
>
> filename: filename of JPEG image
>
> so it does not accept a connection (and the source code will confirm
> that). In fact virtually all functions that accept a 'file name or
> connection' will work with URLs, as file() accepts URLs as well as file
> names (see ?file).
>
> The issue is that writers of third-party readers should be encouraged to
> support connections (which have been around for ca 7 years in R).
> It is amazing how people take such innovations for granted.
Perhaps the discussion belongs on R-devel, but is there an example of a
user-contributed package that uses R's connections, either for parsing a
URL or, for instance, a compressed file?
Martin
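
As a minimal sketch of what supporting connections in a reader could look like, the pattern below mirrors the idiom used by base R readers such as read.table(): accept either a file name or a connection, open it if needed, and close only what the function itself opened. The function name read_first_bytes and its n argument are hypothetical, chosen only for illustration. Because file() also accepts URLs and gzfile() returns a connection, the same reader should handle a local file, a compressed file or a URL.

```r
# Hypothetical reader: returns the first `n` bytes of `file`, which may be
# a file name, a URL, or any connection (url(), gzfile(), ...).
read_first_bytes <- function(file, n = 4L) {
    if (is.character(file)) {
        # file() accepts URLs as well as file names (see ?file)
        file <- file(file, "rb")
        on.exit(close(file))          # we opened it, so we close it
    } else if (!inherits(file, "connection")) {
        stop("'file' must be a character string or a connection")
    } else if (!isOpen(file)) {
        open(file, "rb")
        on.exit(close(file))          # opened here, so closed here
    }
    readBin(file, what = "raw", n = n)
}

# Usage without network access: a plain file and a gzip-compressed copy
tmp <- tempfile()
writeBin(charToRaw("JPEG-ish bytes"), tmp)
gz <- paste0(tmp, ".gz")
con <- gzfile(gz, "wb")
writeBin(charToRaw("JPEG-ish bytes"), con)
close(con)

plain_bytes <- read_first_bytes(tmp)          # file name
gz_bytes    <- read_first_bytes(gzfile(gz))   # unopened connection
unlink(c(tmp, gz))
```

The on.exit(close(...)) calls address the leaked-connection point raised above: a connection opened inside the function is closed when the function returns, even on error, while a connection the caller passed in already open is left for the caller to close.
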
>
>> or alternatively use the EBImage from bioconductor which will read
>> from a url automatically (it also opens a much wider range of file
>> types)
>>
>> library(EBImage)
>> img <- readImage(gimageloc, TrueColor)
>>
>> Hadley
>>
>>
>> On Wed, Dec 10, 2008 at 1:17 PM, Michael Friendly <friendly at yorku.ca>
>> wrote:
>>> The question is how to use a URL in place of a file= argument for
>>> read.* functions that do not support this internally.
>>>
>>> e.g., utils::read.table() and her family all support a file= argument
>>> that can take a URL equally well as a local file. So, if I have a file
>>> on the web, I can equally well do
>>>
>>>> langren <- read.csv("langrens.csv", header=TRUE)
>>>> langren <- read.csv("http://euclid.psych.yorku.ca/SCS/Gallery/Private/langrens.csv",
>>>>                     header=TRUE)
>>>
>>> where the latter is more convenient for posts to this list or
>>> distributed examples.
>>> rimage::read.jpeg() doesn't support URLs, and the only way I've found
>>> is to download the image file from a URL to a temp file, in several
>>> steps. This is probably a more general problem than just read.jpeg,
>>> so maybe there is a general idiom for this case, or better, other
>>> read.* functions could be encouraged to support URLs.
>>>
>>>> library(rimage)
>>>> # local file: OK
>>>> gimage <- read.jpeg("C:/Documents/milestone/images/vanLangren/google-toledo-rome3.jpg")
>>>>
>>>> gimageloc <- "http://euclid.psych.yorku.ca/SCS/Gallery/images/Private/Langren/google-toledo-rome3.jpg"
>>>> dest <- paste(tempfile(),'.jpg', sep='')
>>>> download.file(gimageloc, dest, mode="wb")
>>> trying URL 'http://euclid.psych.yorku.ca/SCS/Gallery/images/Private/Langren/google-toledo-rome3.jpg'
>>> Content type 'image/jpeg' length 35349 bytes (34 Kb)
>>> opened URL
>>> downloaded 34 Kb
>>>
>>>> dest
>>> [1] "C:\\DOCUME~1\\default\\LOCALS~1\\Temp\\Rtmp9nNTdV\\file5f906952.jpg"
>>>> # Is there something simpler??
>>>> gimage <- read.jpeg(dest)
>>>
>>>> # I thought file() might work, but evidently not.
>>>> gimage <- read.jpeg(file(gimageloc))
>>> Error in read.jpeg(file(gimageloc)) : Can't open file.
>>>>
>>>
>>> --
>>> Michael Friendly     Email: friendly AT yorku DOT ca
>>> Professor, Psychology Dept.
>>> York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
>>> 4700 Keele Street    http://www.math.yorku.ca/SCS/friendly.html
>>> Toronto, ONT  M3J 1P3 CANADA
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>>
>> --
>> http://had.co.nz/
>>
>
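
For readers that accept only a file name, the download-then-read steps in the original question quoted above can be wrapped once and reused. What follows is a minimal sketch, not an established idiom: with_local_copy is a hypothetical helper name that does not come from any package, and it assumes the reader takes the file path as its first argument.

```r
# Hypothetical helper: fetch `url` to a temporary file, pass that file's
# path to `reader`, and remove the temporary file afterwards.
with_local_copy <- function(url, reader, ..., fileext = "") {
    dest <- paste0(tempfile(), fileext)
    on.exit(unlink(dest))                       # clean up even on error
    # mode = "wb" keeps binary files such as JPEGs intact on Windows
    status <- download.file(url, dest, mode = "wb", quiet = TRUE)
    if (status != 0) stop("download failed for ", url)
    reader(dest, ...)
}

# Usage with the objects from the original question (not run here):
# library(rimage)
# gimage <- with_local_copy(gimageloc, read.jpeg, fileext = ".jpg")
```

The fileext argument is there because some readers decide the file type from the extension; any further arguments in ... are passed through to the reader unchanged.
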
--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M2 B169
Phone: (206) 667-2793