chrysn wrote:
> On Wed, Apr 14, 2010 at 08:32:39AM +0200, Jim Meyering wrote:
>> In some sense, the behavior you've noticed is inevitable.
>> Imagine that after copying, mv were to go back and check again:
>> then it spots the new file (your "latestfile") and copies it.
>> Do we continue iterating and looking for new files in each
>> and every directory being copied? At some point we have to
>> stop and then begin the removal process (which requires removal
>> of each entire tree/argument). Between when we stop looking for
>> new files and when the removal gets to any given directory, there
>> will always be an interval during which someone can create a file/dir
>> there that will silently be removed.
>>
>> Also consider this: what if a file we've already copied is removed before
>> the copy completes? Should mv perform another iteration to detect that,
>> and then remove it also in the destination tree?
>
> i am aware that it is impossible to atomically move all files and remove
> the directory on posix semantics, that's why i rather suggest leaving
> left-over files where they are and not removing the directory.

What if I'm manually doing "cp -a dir/ dest/", then run "rm -rf dir"?
The same thing can arise if someone copies a file into "dir" while
cp is running. Depending on the timing, it may or may not be copied,
and then my subsequent "rm" will delete it.

> for sake of completeness, there is even the problem with open file
> handles: assume a process has just written a file that is now being
> moved and still has a file handle. when move completes, the file is
> unlinked, leaving the program with a write handle on a deleted file, to
> which it can, to my knowledge, continue writing, but on close(), all is
> lost -- in the typical case originally described, this is not the
> case, though, and people who operate on files currently being written to
> usually know that there can be issues.
>
>
>> If we were to try to make mv remove source files only if we've copied
>> them, not only would that introduce a significant amount of overhead,
>> but [...]
>
> i've now had a look at the implementation -- current coreutils really
> does the equivalent of 'rm -r' if there were no errors when copying.
> only removing the files moved would mean tracking all of them, while the
> current theoretical memory requirement amounts to the maximum path
> depth.
>
>> [...] overhead,
>> but it would change mv's semantics.
>>
>> If you want to pursue this, I suggest that you bring it up with the
>> Austin Group (they define the POSIX standard).
>> http://www.opengroup.org/austin/
>
> for what i looked up on posix specs, there are no statements about what
> to do in case of EXDEV (rename didn't work) [1]. do you think the austin
> group would bother to specify previously unspecified behavior?
>
> [1] http://www.opengroup.org/onlinepubs/9699919799/utilities/mv.html

I think your scenario is unlikely enough that we can compare it
to the classical "Doctor, it hurts when I do this..." one.
Well, then don't do that.

However, if you find that some other implementation of rm
(*BSD, opensolaris, etc.)
handles this in a better manner either by default, or via an option,
please let us know.

> a solution that goes even deeper into the semantics but has no memory
> overhead issues would be to delete files immediately after moving them.
> this has a deeper effect on the semantics because its effect is not
> limited to the case described above, but also affects cases in which
> some files can't be read, which would be the only files left in this
> solution (while originally, in that case there would be a copy of
> readable files in the destination, but all unreadable files would be
> left untouched).
>
>
> in case we stick to the current semantics (or implement others but leave
> the old as default), i suggest the following section to be inserted in
> the man page:
>
> ------------------------------------------------------------------------
>
> CAVEATS
>        When directories are moved across file systems, the source is
>        removed completely after successfully having copied all files to
>        the destination with the equivalent of `rm -r`, regardless of
>        files written while mv was running.
> ------------------------------------------------------------------------

Thanks for the suggestion.
The man page is generated from mv --help, so a note like that
belongs in the more thorough "info" documentation.
Would you like to reword that so it doesn't sound like we're using "rm"
to copy, and present it as a patch to doc/coreutils.texi,
per the contribution guidelines?

    http://git.sv.gnu.org/cgit/coreutils.git/tree/HACKING

Consider whether cp would need a similar note.
Maybe rm, too. It may or may not delete something you write
into a tree that is in the process of being removed.
And chown, chmod, chgrp (when using -R) and du.

Perhaps this is something that is too basic to be attached to
any particular tool. The behavior of hierarchy-traversing tools
is usually well-specified only when they operate on a static file system.
The moment you give them a moving source hierarchy, you're usually
increasing the risk of unspecified behavior.
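
To make the trade-off discussed above concrete, here is a minimal Python sketch of the "remove only what was copied" alternative. It is not how coreutils implements mv; the function names copy_tree_tracked and remove_tracked are hypothetical, and the sketch assumes a tree of regular files and directories only (no symlinks, devices, or permission problems). It shows both points from the thread: the list of copied paths grows with the number of files, and a file created after the copy pass is left in place instead of being silently removed.

```python
import os
import shutil
import tempfile


def copy_tree_tracked(src, dst):
    """Copy src into dst and return (copied_files, visited_dirs).

    Hypothetical helper: the returned lists are the bookkeeping that
    "remove only what we copied" would require. Their size grows with
    the number of entries, not with the maximum path depth.
    """
    copied_files = []
    visited_dirs = []
    for root, _dirs, files in os.walk(src):  # top-down: parents first
        rel = os.path.relpath(root, src)
        target = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target, exist_ok=True)
        visited_dirs.append(root)
        for name in files:
            path = os.path.join(root, name)
            shutil.copy2(path, os.path.join(target, name))
            copied_files.append(path)
    return copied_files, visited_dirs


def remove_tracked(copied_files, visited_dirs):
    """Unlink only the files that were copied, then remove empty dirs.

    Returns the source directories that could not be removed because
    something (for example a late-arriving file) is still inside them.
    """
    for path in copied_files:
        os.unlink(path)
    leftovers = []
    for d in reversed(visited_dirs):  # children before parents
        try:
            os.rmdir(d)  # fails if the directory is not empty
        except OSError:
            leftovers.append(d)
    return leftovers


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    src, dst = os.path.join(base, "src"), os.path.join(base, "dst")
    os.makedirs(os.path.join(src, "sub"))
    with open(os.path.join(src, "sub", "a.txt"), "w") as f:
        f.write("data")
    files, dirs = copy_tree_tracked(src, dst)
    # Simulate another process writing into the source after the copy.
    with open(os.path.join(src, "sub", "latestfile"), "w") as f:
        f.write("late")
    print(remove_tracked(files, dirs))  # directories left behind
    shutil.rmtree(base)
```

With the current "copy everything, then the equivalent of rm -r" behavior, the late file in this demo would be deleted along with the rest of the source tree. The sketch still has a window between copying a file and unlinking it in which the file can be modified, so it narrows the race without closing it.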