On Wed, Apr 14, 2010 at 08:32:39AM +0200, Jim Meyering wrote:
> In some sense, the behavior you've noticed is inevitable.
> Imagine that after copying, mv were to go back and check again:
> then it spots the new file (your "latestfile") and copies it.
> Do we continue iterating and looking for new files in each
> and every directory being copied? At some point we have to
> stop and then begin the removal process (which requires removal
> of each entire tree/argument). Between when we stop looking for
> new files and when the removal gets to any given directory, there
> will always be an interval during which someone can create a file/dir
> there that will silently be removed.
>
> Also consider this: what if a file we've already copied is removed before
> the copy completes? Should mv perform another iteration to detect that,
> and then remove it also in the destination tree?

i am aware that it is impossible to atomically move all files and remove
the directory under posix semantics; that is why i would rather suggest
leaving left-over files where they are and not removing the directory.

for the sake of completeness, there is also the problem of open file
handles: assume a process has just written a file that is now being moved
and still holds a file handle. when the move completes, the file is
unlinked, leaving the program with a write handle on a deleted file, to
which it can, to my knowledge, continue writing, but on close(), all is
lost -- in the typical case originally described, this does not apply,
though, and people who operate on files that are still being written to
usually know that there can be issues.

> If we were to try to make mv remove source files only if we've copied
> them, not only would that introduce a significant amount of overhead,
> but [...]

i've now had a look at the implementation -- current coreutils really does
the equivalent of 'rm -r' if there were no errors when copying. removing
only the files that were moved would mean tracking all of them, while the
current theoretical memory requirement amounts to the maximum path depth.

> [...] overhead,
> but it would change mv's semantics.
>
> If you want to pursue this, I suggest that you bring it up with the
> Austin Group (they define the POSIX standard).
> http://www.opengroup.org/austin/

from what i looked up in the posix specs, there are no statements about
what to do in case of EXDEV (rename didn't work) [1]. do you think the
austin group would bother to specify previously unspecified behavior?

[1] http://www.opengroup.org/onlinepubs/9699919799/utilities/mv.html

a solution that goes even deeper into the semantics but has no memory
overhead issues would be to delete files immediately after moving them.
this has a deeper effect on the semantics because it is not limited to the
case described above, but also affects cases in which some files can't be
read: those would be the only files left behind with this solution (while
originally, in that case, there would be a copy of the readable files in
the destination, but all unreadable files would be left untouched).

in case we stick to the current semantics (or implement others but leave
the old ones as default), i suggest inserting the following section into
the man page:

------------------------------------------------------------------------
CAVEATS
       When directories are moved across file systems, the source is
       removed completely, with the equivalent of `rm -r`, once all
       files have been copied successfully to the destination,
       regardless of files written to the source while mv was running.
------------------------------------------------------------------------
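
to make the difference between the two behaviors discussed above concrete,
here is a rough python sketch. it is not coreutils code: the function names
and the `after_copy` hook are made up for illustration, and symlinks,
permissions and error handling are ignored. the first function models the
current copy-then-'rm -r' behavior, the second one deletes each file right
after copying it and only removes directories that are empty at the end.
the hook stands in for another process writing into the source while the
move is running.

```python
import os
import shutil
import tempfile


def move_copy_then_remove(src, dst, after_copy=None):
    """Model of the current behavior: copy everything, then 'rm -r' src."""
    shutil.copytree(src, dst)
    if after_copy is not None:
        after_copy()  # hypothetical hook: another process writes into src
    shutil.rmtree(src)  # removes late arrivals as well


def move_delete_as_you_go(src, dst, after_copy=None):
    """Alternative: unlink each file right after copying it."""
    # first pass: copy every regular file, then unlink the source copy
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target, exist_ok=True)
        for name in files:
            source_file = os.path.join(root, name)
            shutil.copy2(source_file, os.path.join(target, name))
            os.unlink(source_file)
    if after_copy is not None:
        after_copy()  # hypothetical hook: another process writes into src
    # second pass, bottom-up: rmdir only succeeds on empty directories
    for root, _dirs, _files in os.walk(src, topdown=False):
        try:
            os.rmdir(root)
        except OSError:
            pass  # not empty: leave left-over files where they are


def demo(move):
    """Run one move with a file appearing late; return (copied, survived)."""
    base = tempfile.mkdtemp()
    src = os.path.join(base, "src")
    dst = os.path.join(base, "dst")
    os.makedirs(os.path.join(src, "sub"))
    with open(os.path.join(src, "sub", "old"), "w") as f:
        f.write("old")
    late = os.path.join(src, "sub", "latestfile")

    def writer():
        with open(late, "w") as f:
            f.write("new")

    move(src, dst, after_copy=writer)
    copied = os.path.exists(os.path.join(dst, "sub", "old"))
    survived = os.path.exists(late)
    shutil.rmtree(base)
    return copied, survived


if __name__ == "__main__":
    print(demo(move_copy_then_remove))
    print(demo(move_delete_as_you_go))
```

in this sketch demo(move_copy_then_remove) returns (True, False), i.e. the
late file is silently removed, while demo(move_delete_as_you_go) returns
(True, True): the late file stays in the source directory. the second
variant needs no per-file bookkeeping, but it changes what is left behind
when a copy fails halfway, as described above.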