On 13-09-06 9:21 PM, Karl Millar wrote:
Hi Duncan,
I like the interface of this version a lot better, but there's still a
bunch of implementation details that need fixing:
* As previously mentioned, there are important cases where the mtime
values change in ways that this code doesn't detect.
* If the timestamp file (which is usually in the temp directory) gets
deleted (which can happen after a moderate period of inactivity on
some systems), then file_test('-nt', ...) will always return FALSE,
even if the file has changed.
If that happened without user intervention, I think it would break other
things in R -- the temp directory is supposed to last for the whole
session. But I should be checking anyway.
* If files get added or deleted between the two calls to list.files in
fileSnapshot, it will fail with an error.
Yours won't work if path contains more than one directory. This is
probably a reasonable restriction, but it's inconsistent with
list.files, so I'd like to avoid it if I can find a way.
Duncan Murdoch
* If the path is on a remote file system, tempdir is local, and
there's significant clock skew, then you can get incorrect results.
Unfortunately, these aren't just theoretical scenarios -- I've had the
misfortune to run up against all of them in the past.
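[Illustrative sketch, not the code attached to this message and not the testpkg interface.] One way to sidestep the timestamp-file, clock-skew, and double-listing issues listed above is to store each file's own mtime and size in the snapshot and later compare for inequality. The helper names take_snapshot and compare_snapshots are hypothetical, and the sketch assumes a single directory.

```r
# Hypothetical helpers for illustration only.
take_snapshot <- function(path = ".") {
  # One listing only, so names and info always refer to the same set of files.
  files <- list.files(path, full.names = TRUE)
  info <- file.info(files)
  # A file deleted between list.files() and file.info() appears as an
  # all-NA row; drop it instead of failing.
  info[!is.na(info$mtime), c("size", "mtime"), drop = FALSE]
}

compare_snapshots <- function(before, after) {
  old <- rownames(before)
  new <- rownames(after)
  common <- intersect(old, new)
  # Compare stored values for inequality rather than "newer than" a
  # reference file: no timestamp file and no local clock are involved.
  differs <- before[common, "mtime"] != after[common, "mtime"] |
    before[common, "size"] != after[common, "size"]
  list(added   = setdiff(new, old),
       deleted = setdiff(old, new),
       changed = common[differs])
}
```

Usage would be along the lines of s1 <- take_snapshot(dir), then later compare_snapshots(s1, take_snapshot(dir)). A rewrite that keeps both the size and the mtime unchanged would still go undetected; only a content checksum catches that case.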
I've attached code that's loosely based on your implementation that
solves these problems AFAICT. Alternatively, Hadley's code handles
all of these correctly, with the exception that compare_state doesn't
handle the case where safe_digest returns NA very well.
Regards,
Karl
On Fri, Sep 6, 2013 at 5:40 PM, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:
On 13-09-06 7:40 PM, Scott Kostyshak wrote:
On Fri, Sep 6, 2013 at 3:46 PM, Duncan Murdoch <murdoch.dun...@gmail.com>
wrote:
On 06/09/2013 2:20 PM, Duncan Murdoch wrote:
I have now put the code into a temporary package for testing; if anyone
is interested, for a few days it will be downloadable from
fisher.stats.uwo.ca/faculty/murdoch/temp/testpkg_1.0.tar.gz
Sorry, error in the URL. It should be
http://www.stats.uwo.ca/faculty/murdoch/temp/testpkg_1.0.tar.gz
Works well. A couple of things I noticed:
(1)
md5sum is being called on directories, which causes warnings. (If this
is not viewed as undesirable, please ignore the rest of this comment.)
Should this be the responsibility of the user (by passing arguments to
list.files)? In the example, changing
fileSnapshot(dir, file.info=TRUE, md5sum=TRUE)
to
fileSnapshot(dir, file.info=TRUE, md5sum=TRUE, include.dirs=FALSE,
recursive=TRUE)
gets rid of the warnings. But perhaps the user just wants to exclude
directories for the md5sum calculations. This can't be controlled from
fileSnapshot.
I don't see the warnings, I just get NA values. I'll try to see why there's
a difference. (One possibility is my platform (Windows); another is that
I'm generally testing in R-patched and R-devel rather than the 3.0.1 release
version.) I would rather suppress the warnings than make the user avoid
them.
Or, should the "if (md5sum)" chunk subset "fullnames" using file_test
or file.info to exclude directories (and then fill in the directories
with NA)?
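[Illustrative sketch of the suggestion above, not the actual fileSnapshot code.] The idea of subsetting before calling md5sum and filling directories with NA might look roughly like this. The function name md5_skip_dirs is hypothetical; "fullnames" is assumed to be a character vector of paths such as list.files(dir, full.names=TRUE) would return.

```r
# Hypothetical helper: checksum regular files only, NA for everything else.
md5_skip_dirs <- function(fullnames) {
  result <- rep(NA_character_, length(fullnames))
  names(result) <- fullnames
  is_dir <- file.info(fullnames)$isdir
  # isdir is NA for paths that do not exist; %in% maps those to FALSE,
  # so only existing non-directories are passed to md5sum().
  regular <- is_dir %in% FALSE
  result[regular] <- tools::md5sum(fullnames[regular])
  result
}
```

Because md5sum is never called on a directory, any platform-dependent warning for directories should not arise, and the result has the same length and order as the input.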
(2)
If I run example(changedFiles) several times, sometimes I get:
chngdF> changedFiles(snapshot)
File changes:
mtime md5sum
file2 TRUE TRUE
and other times I get:
chngdF> changedFiles(snapshot)
File changes:
md5sum
file2 TRUE
I wonder why.
Sometimes the example runs so quickly that the new version has exactly the
same modification time as the original. That's the risk of the mtime check.
If you put a delay between, you'll get consistent results.
Duncan Murdoch
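[Illustrative sketch, not part of the original message.] The mtime risk described above can be reproduced with a quick rewrite of a file. The variable names are hypothetical; the two strings are deliberately the same length so that the file size does not give the change away either.

```r
f <- tempfile()

# Record the mtime and MD5 checksum of a file.
file_state <- function(path) {
  list(mtime = file.info(path)$mtime,
       md5   = unname(tools::md5sum(path)))
}

writeLines("original", f)
before <- file_state(f)

# Rewrite immediately. On a file system with coarse timestamps the new
# mtime may equal the old one; adding a delay here (for example
# Sys.sleep(2)) makes an mtime difference far more likely.
writeLines("modified", f)
after <- file_state(f)

mtime_changed <- before$mtime != after$mtime  # may be TRUE or FALSE
md5_changed   <- before$md5 != after$md5      # the content did change
```

Whether mtime_changed comes out TRUE depends on timing and on the timestamp resolution of the file system, which is consistent with the example's output varying between runs.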
Scott
sessionInfo()
R Under development (unstable) (2013-08-31 r63780)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] testpkg_1.0
loaded via a namespace (and not attached):
[1] tools_3.1.0
--
Scott Kostyshak
Economics PhD Candidate
Princeton University
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel