Greg Hirson wrote:
I have noticed an interesting behavior when comparing how the base
plot() function deals with a data argument that downloads data from the
internet vs. how xyplot() in lattice performs the same task.
The goal is to plot hourly temperature data. The data is downloaded and
formatted for R using the function cimishourly() in the package cimis.
There is a line within the function that outputs the name of the file
being downloaded using cat().
When using plot() to plot the data, the following is written to the
console:
    library(cimis)
    plot(air_temp ~ datetime, data = cimishourly("006"))
    Downloading: ftp://ftpcimis.water.ca.gov/pub/hourly/hourly006.csv
    Downloading: ftp://ftpcimis.water.ca.gov/pub/hourly/hourly006.csv
When using xyplot() to perform the same plot, the data is only
downloaded once:
    library(lattice)
    xyplot(air_temp ~ datetime, data = cimishourly("006"))
    Downloading: ftp://ftpcimis.water.ca.gov/pub/hourly/hourly006.csv
Is this caused by a difference in how the two functions evaluate the
data argument?
Looks like nobody answered so far:
Yes, there are several differences.
In any case, I think you should not nest a download function inside
another call. Download the data once, before anything else, and then
start working on it.
The data argument is evaluated in plot.formula at two separate places:
    if (is.matrix(eval(m$data, parent.frame())))   # first evaluation
    mf <- eval(m, parent.frame())                  # second evaluation
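To illustrate the general mechanism (this is only a sketch, not the plot.formula source): a function that captures its call and evaluates the argument expression itself re-runs that expression every time, whereas an argument used as an ordinary promise is evaluated at most once. The names noisy(), eval_twice() and use_promise() are hypothetical and exist only for this example.

```r
# Counter kept in an environment so the helper can update it by reference.
counter <- new.env()
counter$n <- 0

# Hypothetical stand-in for an expensive call such as a download.
noisy <- function() {
  counter$n <- counter$n + 1
  data.frame(x = 1:3)
}

# Re-evaluates the captured expression: the argument's code runs each time.
eval_twice <- function(data) {
  m <- match.call()
  first  <- eval(m$data, parent.frame())
  second <- eval(m$data, parent.frame())
  invisible(second)
}

# Uses the argument as an ordinary promise: its code runs at most once.
use_promise <- function(data) {
  first  <- data
  second <- data
  invisible(second)
}

eval_twice(noisy())
n_twice <- counter$n   # 2

counter$n <- 0
use_promise(noisy())
n_once <- counter$n    # 1
```

How many times a given plotting function evaluates its data argument depends on how it is written, and possibly on the R version, so it is safer not to rely on it.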
Generally this is not a big issue, but for your function it incurs a
noticeable performance penalty, which is easily avoided by downloading
in advance.
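A minimal sketch of downloading in advance might look like the following. Here fake_download() is a hypothetical stand-in for cimishourly("006"), with made-up values, so that the example runs without the cimis package or network access. The call counter is there only to show how often the "download" happens.

```r
# Counter kept in an environment so the helper can update it by reference.
downloads <- new.env()
downloads$n <- 0

# Hypothetical stand-in for cimishourly("006"); the values are made up.
fake_download <- function() {
  downloads$n <- downloads$n + 1
  cat("Downloading: (simulated)\n")
  data.frame(datetime = 1:5, air_temp = c(10, 12, 15, 14, 11))
}

pdf(NULL)  # draw to a null device so no plot file is written

# Download once, store the result, then plot from the stored data frame.
temps <- fake_download()
plot(air_temp ~ datetime, data = temps)
plot(air_temp ~ datetime, data = temps, type = "l")
n_stored <- downloads$n   # 1: both plots reuse the same data frame

# For comparison: the call nested in the data argument. The count
# may be larger than 1, depending on how plot.formula evaluates it.
downloads$n <- 0
plot(air_temp ~ datetime, data = fake_download())
n_inline <- downloads$n

invisible(dev.off())
cat("stored:", n_stored, " inline:", n_inline, "\n")
```

With the real function, the same pattern would be temps <- cimishourly("006") followed by plot(air_temp ~ datetime, data = temps).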
Best,
Uwe Ligges
Even more interesting, when adding a type = "l" argument to plot, the
data is downloaded 3 times.
Thank you for your time,
Greg
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help