Package: coreutils
Version: 6.10-6
Followup-For: Bug #435519

I am experiencing a similar problem, and I think it is the same issue.
On Debian, uniq treats different UTF-8 encoded Korean names as identical:

w...@thwomp:~$ cat hangul | recode utf8..html
안영관
오경완
황규진
w...@thwomp:~$ cat hangul | uniq | wc -l
1

Whereas the uniq on an old FreeBSD machine behaves as expected:

f...@xs2:~% cat hangul | uniq | wc -l
3

I suspect this has something to do with the locale settings. These are
the settings I used on both machines:

w...@thwomp:~$ locale
LANG=en_GB.UTF-8
LANGUAGE=fy_NL:en_GB
LC_CTYPE=en_GB.UTF-8
LC_NUMERIC="en_GB.UTF-8"
LC_TIME="en_GB.UTF-8"
LC_COLLATE="en_GB.UTF-8"
LC_MONETARY="en_GB.UTF-8"
LC_MESSAGES="en_GB.UTF-8"
LC_PAPER="en_GB.UTF-8"
LC_NAME="en_GB.UTF-8"
LC_ADDRESS="en_GB.UTF-8"
LC_TELEPHONE="en_GB.UTF-8"
LC_MEASUREMENT="en_GB.UTF-8"
LC_IDENTIFICATION="en_GB.UTF-8"
LC_ALL=

Thank you for your time.

-- Kuno Woudt.

-- System Information:
Debian Release: 5.0.1
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: i386 (i686)

Kernel: Linux 2.6.26-2-686 (SMP w/2 CPU cores)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages coreutils depends on:
ii  libacl1        2.2.47-2    Access control list shared library
ii  libc6          2.7-18      GNU C Library: Shared libraries
ii  libselinux1    2.0.65-5    SELinux shared libraries

coreutils recommends no packages.

coreutils suggests no packages.

-- no debconf information
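
One way to check whether locale collation is involved is to rerun the same pipeline with LC_ALL=C, which makes uniq compare lines byte by byte. This is only a sketch and was not run on either machine above. It assumes the input is the three names shown earlier, and the temporary file stands in for the original "hangul" file:

```shell
# Recreate the three-line input in a temporary file (stand-in for "hangul").
tmp=$(mktemp)
printf '%s\n' '안영관' '오경완' '황규진' > "$tmp"

# Byte-wise comparison: LC_ALL=C overrides LC_COLLATE and every other LC_* setting.
count_c=$(LC_ALL=C uniq "$tmp" | wc -l | tr -d ' ')
echo "C locale:       $count_c"

# Same pipeline under whatever locale the shell currently uses.
count_cur=$(uniq "$tmp" | wc -l | tr -d ' ')
echo "current locale: $count_cur"

rm -f "$tmp"
```

The C-locale count should be 3, since the three lines differ byte-wise. If the current-locale count is lower, the lines are being collated as equal under that locale, which would point at the locale data or at how uniq compares lines rather than at the input file.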