Hi Bob,
On 2014-03-12 02:09, Bob Proulx wrote:
Hi Filipus,
Filipus Klutiero wrote:
Right. It could be
cp [-R|-a|-r] [OPTION]... FILE... DIRECTORY
cp [OPTION]... REGULAR-FILE... DIRECTORY
But then, there are more file types than regular files and
directories, so that probably wouldn't be perfect.
I think keeping it generic with simply SOURCE or FILE is better.
(Noting that directories are files too.)
The behavior of cp on whatever source file is going to be whatever is
allowed on the original file. Whether that is a "regular" file, a
pipe, a character device, a block device or whatever. Directories are
the most special because you can't open(2) them. (And the same for
unix domain sockets.)
If cp can open the file, it will read the contents and then write
them out. Trying to individually document what happens with various
input file types means that if the kernel adds a new type of file,
the cp documentation is immediately out of date.
Better to avoid that and simply say it copies from the source file to
the destination.
Whatever the source file type, if the kernel allows me to open the
file and read it, then I expect cp to do so and to copy it to the
destination file. Whether this is successful depends completely upon
the file type and the kernel, not on cp.
I'm not convinced cp is merely a frontend to the kernel. I can see several
options (--copy-contents, -L and -P) besides -R which seem to indicate that cp
tries to behave as the user wants, even if that requires extra complexity.
However, I never read coreutils's source and I'm far from knowledgeable in this
area, so this is a very humble opinion.
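
To illustrate what I mean by -L and -P, here is a minimal sketch. The scratch directory and the file names are made up for the example, and I combine both options with -R since that is the combination POSIX specifies; I have not checked every implementation.

```shell
# Sketch: how -P and -L change what cp does with a symbolic link.
demo_links=$(mktemp -d)          # hypothetical scratch directory
cd "$demo_links" || exit 1

printf 'hello\n' > target.txt
ln -s target.txt link            # link -> target.txt

cp -R -P link copy_P             # -P: do not follow; copy the link itself
cp -R -L link copy_L             # -L: follow the link; copy the file it points to

ls -l copy_P copy_L              # copy_P is a symlink, copy_L a regular file
```

The same source argument yields two different kinds of destination file, which is more than simply reading whatever open(2) returns.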
In any case, I'm not sure where this brings us. cp's manpage can't say it
behaves like open(2) behaves without losing the vast majority of readers. I'm
certainly not arguing that documentation should cover each type individually.
What I was trying to say in my last message is that even though the SYNOPSIS I
suggested is not much more complex and seems to solve the problem, it brings
more issues, so I'm not convinced it's the right solution (since indeed, we
don't want to end up with one form per file type).
I didn't mean to blame the synopsis; it was just one of the
relevant parts of the manpage. The problematic parts are the NAME
section ("copy files *and directories*") and the description "Copy
SOURCE to DEST, or multiple SOURCE(s) to DIRECTORY."
Hmm... Interesting. So if it didn't say "and directories" it would
be better? I note that BSD and SunOS don't say that. But HP-UX
does. And the old V7 just said "copy" and nothing more. I think the
additions are trying to make things better, but perhaps they say too
much.
Not an easy call. For reference, rm has an equivalent issue:
rm - remove files or directories
I don't know if this is the best. It could be understood as meaning that you
have to choose between removing files and removing directories.
"copy" is exact, but certainly vague if you're looking at man section 1 as a
whole.
"copy files" may leave some wondering how to copy directories.
"copy files or directories" is redundant in POSIX terminology, and one may be
surprised that directories are not copied by default.
Which leads me to consider more options:
"copy files and, optionally, directories" - again redundant in POSIX terminology
"copy files, optionally including directories"
The last 2 are of course more complex than would be hoped.
I think I'd go with either "copy files" or "copy files, optionally including
directories". That being said, NAME is only part of the problem. It's the combination of
misleading NAME and inexact DESCRIPTION which makes the manpage deceptive. Even the current NAME
would be a lot less problematic if there was a fat warning in DESCRIPTION.
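
For the record, this is the behavior such a warning would have to cover. It is only a sketch: the scratch directory and the names "project" and "backup" are invented, and the exact diagnostic differs between implementations, so I discard it.

```shell
# Sketch: cp declines a directory source unless -R is given.
demo_dir=$(mktemp -d)            # hypothetical scratch directory
mkdir "$demo_dir/project"
printf 'int main(void) { return 0; }\n' > "$demo_dir/project/main.c"

# Without -R: cp prints a diagnostic, exits non-zero and creates nothing.
if cp "$demo_dir/project" "$demo_dir/backup" 2>/dev/null; then
    plain_status=copied
else
    plain_status=refused
fi

# With -R: the directory and its contents are copied.
cp -R "$demo_dir/project" "$demo_dir/backup"

echo "without -R: $plain_status"
ls "$demo_dir/backup"
```

A reader who only saw "copy files and directories" and "Copy SOURCE to DEST" would have no reason to expect the first call to fail.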
[...]
I have no strong opinion on which approach is preferable: either 2
forms and 2 descriptions, or keeping 1 form with a different
description. As long as either 1 - we don't say that cp copies SOURCE
to DEST when SOURCE can't be a directory without options, or 2 - we
add, right after, a big warning about directories like the one in the
info documentation.
I think it would be defensible to ask upstream to change to:
-Copy SOURCE to DEST, or multiple SOURCE(s) to DIRECTORY.\n\
+Copy SOURCE to DEST, or multiple SOURCE(s) to DIRECTORY,\n\
+or optionally recursively directory SOURCES(s) to DIRECTORY.\n\
These formulations are incorrect, or at least highly misleading. The reader should be
able to predict cp's behavior for a given call, or at least not legitimately mispredict
cp's behavior. But if we just look at the current documentation, ignoring bug
741390, the latter requirement is not met. Given "Copy SOURCE to DEST, or multiple
SOURCE(s) to DIRECTORY", I would conclude that:
* Given a single source, cp will copy it to DEST.
* Given multiple sources, cp will copy them to DIRECTORY.
So $ cp regular.c src/ would be of the first form. But whether a call is of the first or
second form doesn't depend on the number of sources; it depends on the last argument (unless
-T is specified). And in any case, we describe the behavior identically ("cp will copy it to
DEST" vs "cp will copy them to DIRECTORY"), unless we are using the same "to" with two
different meanings ("over" vs "into").
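
A minimal sketch of the two meanings, with a single source in both calls. The scratch directory and the file name "existing.c" are made up for the example.

```shell
# Sketch: the form is chosen by the last argument, not by the number of sources.
demo_forms=$(mktemp -d)          # hypothetical scratch directory
cd "$demo_forms" || exit 1

mkdir src
printf 'new\n' > regular.c
printf 'old\n' > existing.c

# Last argument is an existing directory: regular.c is copied INTO it.
cp regular.c src/

# Last argument is a regular file: regular.c is copied OVER it.
cp regular.c existing.c

ls src                           # lists regular.c
cat existing.c                   # prints "new"
```

Both calls have exactly one source, yet only the second one matches what "Copy SOURCE to DEST" suggests.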
The predictability issue is at least as problematic with the proposed version. The behavior is
complex, and I think explaining these 3 forms needs to be done in several sentences, as we see in
the base specifications, although the language describing the effect of each form there may be too
technical. "into" and "over" are much simpler, and I don't see much room for
confusion.
By the way, again we face the issue of duplicating documentation efforts. It comes to mind that if
we really want to point readers to info doc, putting the reference at the end of the manpage may
not be smart. Assuming a sequential reading, we're letting users read the "bad"
documentation, and only then telling them that... they should have read the "good"
documentation instead.
Just a thought.
--
Filipus Klutiero
http://www.philippecloutier.com