Package: coreutils
Version: 8.13-3.5
Severity: normal

I was attempting to use uniq to categorise my data based on the first N
characters of each line, and I discovered that:

a) it is currently impossible to use uniq to output all lines, with lines
  grouped by initial prefix (-w N) and separated by an empty line
  (--all-repeated=separate), because there is no way to specify outputting
  all lines
b) the behaviour of -u, -d and -D in combination is odd and suboptimal.

Here is an example with a small dataset:

:; cat > uniq-test
AAA
AAB
ABA
ABC
ACA
ADA
ADD
ADE
:; uniq -w 2 -u uniq-test
ACA
:; uniq -w 2 -d uniq-test
AAA
ABA
ADA
:; uniq -w 2 -D uniq-test
AAA
AAB
ABA
ABC
ADA
ADD
ADE
:; uniq -w 2 -ud uniq-test
:; uniq -w 2 -du uniq-test
:; uniq -w 2 --all-repeated=separate uniq-test
AAA
AAB

ABA
ABC

ADA
ADD
ADE
:; uniq -w 2 -u --all-repeated=separate uniq-test
AAA

ABA

ADA
ADD
:; uniq -w 2 -c -D uniq-test
uniq: printing all duplicated lines and repeat counts is meaningless
Try `uniq --help' for more information.
:;

 So in summary:

 -ud or -du produces no output, but doesn't produce an error (whereas -c -D does)

 -u -D produces unexpected output (all the repeated lines except the last one 
   for each set).

 There is no way to output all lines with separators between the groups.

 There are a number of ways to address this issue, but I think the best one
would be to correct the behaviour and documentation such that:

 -u outputs any lines which are unique
 -d outputs the first of any lines which are duplicated
 -D outputs all lines which are duplicated
 --all-repeated=METHOD separates any groups of lines in the specified manner

 The output I wanted would then be produced with:

:; uniq -w 2 -u --all-repeated=separate uniq-test
AAA
AAB

ABA
ABC

ACA

ADA
ADD
ADE
:;
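
 Until uniq can do this itself, the grouping can be approximated outside
uniq. The following is only a minimal sketch, assuming a POSIX awk is
available; group_by_prefix is a hypothetical helper name, not part of
coreutils. Like uniq, it only compares adjacent lines, so the input is
assumed to be sorted already.

```shell
# Hypothetical helper (not part of coreutils): print every input line,
# grouped by its first N characters, with an empty line between groups.
# usage: group_by_prefix N [FILE...]   (reads stdin if no FILE is given)
group_by_prefix() {
    n=$1
    shift
    awk -v n="$n" '
        # The comparison key is the first n characters of the line.
        { key = substr($0, 1, n) }
        # A changed key starts a new group: emit the separator first.
        NR > 1 && key != prev { print "" }
        # Every line is printed, whether unique or repeated.
        { print; prev = key }
    ' "$@"
}
```

 With the test file above, "group_by_prefix 2 uniq-test" should print the
same grouped listing as the output I wanted from uniq.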

 Thanks,

J.

-- System Information:
Debian Release: 7.6
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: i386 (i686)

Kernel: Linux 3.2.0-4-686-pae (SMP w/2 CPU cores)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages coreutils depends on:
ii  dpkg          1.16.15
ii  install-info  4.13a.dfsg.1-10
ii  libacl1       2.2.51-8
ii  libattr1      1:2.4.46-8
ii  libc6         2.13-38+deb7u4
ii  libselinux1   2.1.9-5

coreutils recommends no packages.

coreutils suggests no packages.

-- no debconf information

