shawn wrote:
> > shawn wrote:
> > >        comm -3 file1 file2
> > >               Print lines in file1 not in file2, and vice versa.
> > > seemingly conflicts with the documentation given above:
> > >        -3     suppress column 3 (lines that appear in both files)
> 
> If you suppressed column 3, wouldn't you end up with all
> non-common/unique lines in BOTH files, according to the lower
> documentation?

Yes.  In column one and column two.

The comm documentation says:

  Column one contains lines unique to FILE1, column two contains lines
  unique to FILE2, and column three contains lines common to both files.

Suppressing column three would leave column one and column two.
Column one contains lines unique to FILE1.  Column two contains lines
unique to FILE2.  Or said another way, it prints lines in file1 not in
file2, and vice versa.
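
To make that concrete, here is a small sketch.  The file names and
contents are made up for illustration, and LC_ALL=C is set only so that
the collating order is predictable.

```shell
# Hypothetical sample data; both files are already sorted (C locale).
tmpdir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$tmpdir/file1"
printf 'banana\ncherry\ndate\n' > "$tmpdir/file2"

# -3 suppresses column three (lines common to both files), leaving
# column one (unique to file1) and column two (unique to file2,
# indented by one tab).
unique_lines=$(LC_ALL=C comm -3 "$tmpdir/file1" "$tmpdir/file2")
printf '%s\n' "$unique_lines"

rm -rf "$tmpdir"
```

With this data the output has apple in column one and date, preceded by
a single tab, in column two.  The lines banana and cherry are common to
both files and are suppressed.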

I think the main point of your objection is that while "print" is
correct, it doesn't specify the column each line is printed in.  The
language there is less precise than the behavior description, which
describes more explicitly what is happening.  However, this is an
example, which is additional information.  I think it is good for
examples to use alternate language.  Many people find it useful to
have alternate descriptions since different people read things
differently.  Simply repeating the previous behavior description
verbatim would be less useful to a large number of people even if it
is the most precise.

> Also, needs a warning in man page about needing to sort files first,
> which was the real problem I was having. (found doc here:
> http://pubs.opengroup.org/onlinepubs/009604599/utilities/comm.html )

Well...  The info documentation is the primary documentation for all
of the GNU software.  In the comm documentation it says:

  Before `comm' can be used, the input files must be sorted using the
  collating sequence specified by the `LC_COLLATE' locale.

The man page is generated, but it also says:

       Compare sorted files FILE1 and FILE2 line by line.

There is also the --check-order option.  So there is already
pretty strong wording in all of the documentation that the input must
be sorted.
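
As a rough sketch of how --check-order might be used in a script, and
assuming GNU comm (the option is not part of POSIX), the exit status
can be tested directly.  The file names and contents are hypothetical.

```shell
# Hypothetical sample data: file1 is deliberately out of order.
tmpdir=$(mktemp -d)
printf 'two\none\n' > "$tmpdir/file1"
printf 'one\ntwo\n' > "$tmpdir/file2"

# --check-order (GNU comm) makes unsorted input a fatal error, so the
# exit status can be tested instead of reading the warning on stderr.
if LC_ALL=C comm --check-order "$tmpdir/file1" "$tmpdir/file2" >/dev/null 2>&1
then
  order_status="sorted"
else
  order_status="not sorted"
fi
printf 'input is %s\n' "$order_status"

rm -rf "$tmpdir"
```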

Additionally the comm program itself will generate a warning if it
finds that the input is not sorted.  Example:

  $ comm <(printf "one\ntwo\nthree") <(printf "three\ntwo\none\n")
  one
          three
                  two
  comm: file 1 is not in sorted order
  comm: file 2 is not in sorted order
          one
  three
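
For comparison, sorting both inputs first avoids the warning.  This is
only a sketch using the same three words as above; the temporary file
names are made up, and LC_ALL=C is an assumption so that sort and comm
agree on the collating sequence.

```shell
# Same data as the example above, written to hypothetical temp files.
tmpdir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$tmpdir/a"
printf 'three\ntwo\none\n' > "$tmpdir/b"

# Sort both inputs with the same collating sequence that comm will use.
LC_ALL=C sort "$tmpdir/a" > "$tmpdir/a.sorted"
LC_ALL=C sort "$tmpdir/b" > "$tmpdir/b.sorted"

# Both files now hold the same lines in the same order, so every line
# lands in column three (prefixed by two tabs).
sorted_result=$(LC_ALL=C comm "$tmpdir/a.sorted" "$tmpdir/b.sorted")
printf '%s\n' "$sorted_result"

rm -rf "$tmpdir"
```

Once sorted, all three lines appear in column three and nothing is
reported as unique to either file.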

In the end there is only so much that can be done.

Bob
