[R] multiple downloads of data when evaluating plot() vs. xyplot()
Uwe Ligges
ligges at statistik.tu-dortmund.de
Sun Aug 23 20:18:31 CEST 2009
Greg Hirson wrote:
> I have noticed an interesting behavior when comparing how the base
> plot() function deals with a data argument that downloads data from the
> internet vs. how xyplot() in lattice performs the same task.
>
> The goal is to plot hourly temperature data. The data is downloaded and
> formatted for R using the function cimishourly() in the package cimis.
> There is a line within the function that outputs the name of the file
> being downloaded using cat().
>
> When using plot() to plot the data, the following is written to the
> console:
>
> library(cimis)
> plot(air_temp ~ datetime, data = cimishourly("006"))
> Downloading: ftp://ftpcimis.water.ca.gov/pub/hourly/hourly006.csv
> Downloading: ftp://ftpcimis.water.ca.gov/pub/hourly/hourly006.csv
>
> When using xyplot() to perform the same plot, the data is only
> downloaded once:
>
> library(lattice)
> xyplot(air_temp ~ datetime, data = cimishourly("006"))
> Downloading: ftp://ftpcimis.water.ca.gov/pub/hourly/hourly006.csv
>
> Is this caused by a difference in how the two functions evaluate the
> data argument?
It looks like nobody has answered so far, so:
Yes, the two functions evaluate the data argument differently.
In any case, I would not put a downloading function inside the plotting
call. Download the data once, assign the result to an object, and then
work on that object.
In plot.formula, the data argument is evaluated in two places:
    if (is.matrix(eval(m$data, parent.frame())))
    mf <- eval(m, parent.frame())
Generally the double evaluation is not a big issue, but with your
function each evaluation triggers another download. That performance
penalty is easily avoided by downloading in advance.
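A minimal sketch of the mechanism, assuming nothing beyond base R. The
names fake_download() and eval_twice() are hypothetical stand-ins and are
not the actual cimis or graphics code; eval_twice() only imitates the
pattern of evaluating the captured data expression twice.

```r
# Hypothetical stand-in for a downloading function: it counts its calls
# instead of fetching anything from the network.
n_calls <- 0
fake_download <- function() {
  n_calls <<- n_calls + 1
  data.frame(x = 1:3, y = c(2, 4, 6))
}

# Hypothetical function imitating the pattern quoted above: the data
# argument is captured unevaluated via match.call() and evaluated twice.
eval_twice <- function(formula, data) {
  m <- match.call()
  # First evaluation: only to check whether the data is a matrix
  is_mat <- is.matrix(eval(m$data, parent.frame()))
  # Second evaluation: to actually obtain the data
  dat <- eval(m$data, parent.frame())
  invisible(dat)
}

# Passing the call inline: the expression fake_download() is run twice.
eval_twice(y ~ x, data = fake_download())
calls_inline <- n_calls

# Downloading in advance: m$data is just the symbol d, so evaluating it
# again only looks up the existing object.
n_calls <- 0
d <- fake_download()
eval_twice(y ~ x, data = d)
calls_upfront <- n_calls
```

Here calls_inline ends up as 2 and calls_upfront as 1. For the original
example, the same idea means assigning the result of cimishourly("006")
to an object first and passing that object as data to plot().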
Best,
Uwe Ligges
> Even more interesting, when adding a type = "l" argument to plot, the
> data is downloaded 3 times.
>
> Thank you for your time,
>
> Greg
>