chrysn wrote:
> On Wed, Apr 14, 2010 at 08:32:39AM +0200, Jim Meyering wrote:
>> In some sense, the behavior you've noticed is inevitable.
>> Imagine that after copying, mv were to go back and check again:
>> then it spots the new file (your "latestfile") and copies it.
>> Do we continue iterating and looking for new files in each
>> and every directory being copied?  At some point we have to
>> stop and then begin the removal process (which requires removal
>> of each entire tree/argument).  Between when we stop looking for
>> new files and when the removal gets to any given directory, there
>> will always be an interval during which someone can create a file/dir
>> there that will silently be removed.
>>
>> Also consider this: what if a file we've already copied is removed before
>> the copy completes?  Should mv perform another iteration to detect that,
>> and then remove it also in the destination tree?
>
> i am aware that it is impossible to atomically move all files and remove
> the directory on posix semantics, that's why i rather suggest leaving
> left-over files where they are and not removing the directory.

What if I manually run "cp -a dir/ dest/"
and then run "rm -rf dir"?
The same thing can arise if someone copies a file
into "dir" while cp is running.  Depending on the
timing, it may or may not be copied, and then
my subsequent "rm" will delete it.
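
To make that timing concrete, here is a minimal Python sketch of the
copy-then-remove sequence. It is not coreutils code: copy_then_remove
and its "between" hook are hypothetical names, and the hook merely
stands in for another process writing into "dir" after the copy has
finished but before the removal starts.

```python
import os
import shutil
import tempfile


def copy_then_remove(src, dst, between=None):
    """Copy src to dst, then remove src (roughly cp -a followed by rm -rf)."""
    shutil.copytree(src, dst)
    if between is not None:
        between()  # hypothetical hook: simulates a concurrent writer
    shutil.rmtree(src)


def demo():
    with tempfile.TemporaryDirectory() as base:
        src = os.path.join(base, "dir")
        dst = os.path.join(base, "dest")
        os.mkdir(src)
        with open(os.path.join(src, "old"), "w") as f:
            f.write("present before the copy\n")

        def late_writer():
            # Created after the copy, so it never reaches dst.
            with open(os.path.join(src, "latestfile"), "w") as f:
                f.write("written while the move was in progress\n")

        copy_then_remove(src, dst, late_writer)
        return {
            "src_exists": os.path.exists(src),
            "dest": sorted(os.listdir(dst)),
        }


if __name__ == "__main__":
    print(demo())
```

With this forced interleaving, "latestfile" ends up in neither tree:
it was created too late to be copied and is then removed with the
source.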

> for sake of completeness, there is even the problem with open file
> handles: assume a process has just written a file that is now being
> moved and still has a file handle. when move completes, the file is
> unlinked, leaving the program with a write handle on a deleted file, to
> which it can, to my knowledge, continue writing, but on close(), all is
> lost -- in the typical case originally described, this is not the
> case, though, and people who operate on files currently being written to
> usually know that there can be issues.
>
>
>> If we were to try to make mv remove source files only if we've copied
>> them, not only would that introduce a significant amount of overhead,
>> but [...]
>
> i've now had a look at the implementation -- current coreutils really
> does the equivalent of 'rm -r' if there were no errors when copying.
> only removing the files moved would mean tracking all of them, while the
> current theoretical memory requirement amounts to the maximum path
> depth.
>
>>                                                       [...] overhead,
>> but it would change mv's semantics.
>>
>> If you want to pursue this, I suggest that you bring it up with the
>> Austin Group (they define the POSIX standard).
>> http://www.opengroup.org/austin/
>
> from what i looked up in the posix specs, there are no statements about
> what to do in case of EXDEV (rename didn't work) [1]. do you think the
> austin group would bother to specify previously unspecified behavior?
>
> [1] http://www.opengroup.org/onlinepubs/9699919799/utilities/mv.html

I think your scenario is unlikely enough that we can compare
it to the classical "Doctor, it hurts when I do this..." one.
Well, then don't do that.

However, if you find that some other implementation of mv
(*BSD, opensolaris, etc.) handles this in a better manner,
either by default or via an option, please let us know.

> a solution that goes even deeper into the semantics but has no memory
> overhead issues would be to delete files immediately after moving them.
> this has a deeper effect on the semantics because its effect is not
> limited to the case described above, but also affects cases in which
> some files can't be read, which would be the only files left in this
> solution (while originally, in that case there would be a copy of
> readable files in the destination, but all unreadable files would be
> left untouched).
>
>
> in case we stick to the current semantics (or implement others but leave
> the old as default), i suggest the following section to be inserted in
> the man page:
>
> ------------------------------------------------------------------------
>
> CAVEATS
>        When directories are moved across file systems, the source is
>        removed completely after successfully having copied all files to
>        the destination with the equivalent of `rm -r`, regardless of
>        files written while mv was running.
> ------------------------------------------------------------------------

Thanks for the suggestion.
The man page is generated from mv --help, so a note like that belongs
in the more thorough "info" documentation.

Would you like to reword that so it doesn't sound like we're using "rm"
to copy, and present it as a patch to doc/coreutils.texi, per the
contribution guidelines?

  http://git.sv.gnu.org/cgit/coreutils.git/tree/HACKING

Consider whether cp would need a similar note.

Maybe rm, too.  It may or may not delete something you write into a
tree that is in the process of being removed.
And chown, chmod, chgrp (when using -R) and du.

Perhaps this is something that is too basic to be attached to
any particular tool.  The behavior of hierarchy-traversing tools
is usually well-specified only when they operate on a static
file system.  The moment you give them a moving source hierarchy,
you're usually increasing the risk of unspecified behavior.
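
As a rough illustration of that last point, the following Python
sketch walks a small tree and changes it in mid-traversal. It assumes
nothing about how coreutils traverses hierarchies; walk_while_mutating
and the file names are made up for the example, and the in-loop writes
stand in for a concurrent process.

```python
import os
import tempfile


def walk_while_mutating(root):
    """Walk root top-down, adding files to the tree while directory 'a' is visited."""
    seen = []
    mutated = False
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a fixed order: a, then b
        seen.extend(filenames)
        if not mutated and os.path.basename(dirpath) == "a":
            # "a" has already been listed, "b" has not been read yet.
            for sub, name in (("a", "late_in_visited"), ("b", "late_in_pending")):
                open(os.path.join(root, sub, name), "w").close()
            mutated = True
    return sorted(seen)


def demo_walk():
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "a"))
        os.mkdir(os.path.join(root, "b"))
        return walk_while_mutating(root)


if __name__ == "__main__":
    print(demo_walk())  # ['late_in_pending']
```

Both files are created at the same moment, yet only the one placed in
the not-yet-visited directory is seen. Whether a traversing tool acts
on a late arrival depends purely on where the traversal happens to be.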


