Hi, I saw this old bug report yesterday whilst looking for the convention for bug subjects used in dpkg.

* Changwoo Ryu [2004-07-04 06:57 +0900]:
> In my ko_KR.UTF-8 locale and with the new Korean translation added
> (#254590), dpkg-query --list prints each field in the wrong column
> (see attached dpkg-list-before.png file).
>
> This is because dpkg assumes that a string's length (strlen) is the
> width it occupies in text terminals, which is not true in UTF-8
> locales. (A Korean hangul character is 3 bytes long but width is 2.)
> I fixed this problem by adjusting the string lengths in *printf()
> formats according to their actual widths in text terminal (see the
> attached patch).
>
> With attached patch applied, it displays each field with Hangul
> characters correctly. (attached dpkg-list-after.png file)

Just for the record, I wrote a function similar to the one used in the
patch, to align the columns of (non-dpkg) --help output. I wonder if
the rest of the patch can be simplified. Maybe I'll have a look at this
during the next release cycle.

The code of my function is:

static size_t
mbswidth(const char *s)
{
#if ENABLE_NLS
	size_t rv, len, wlen;
	wchar_t *wstr;
	int wcsw;

	rv = len = strlen(s);
	if ((wstr = (wchar_t *)malloc((len + 1) * sizeof(wchar_t))) != NULL
	    && (wlen = mbstowcs(wstr, s, len + 1)) != (size_t)-1
	    && (wcsw = wcswidth(wstr, wlen)) >= 0)
		rv = (size_t)wcsw;
	free(wstr);
	return rv;
#else
	return strlen(s);
#endif
}

Unlike the function of the same name in the patch, mine returns
strlen(s) as a fallback on errors (for example ENOMEM). Neither is
thread-safe, nor fast when used in a loop. Mine assumes that
strlen(s) >= mbstowcs(NULL,s,0). Both require POSIX 2001 + XSI
extensions. Instead of ENABLE_NLS, dpkg would of course use
HAVE_MBSTOWCS or similar.

Regards
Carsten
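
P.S. For anyone reading this in the archive later: below is a minimal
sketch of how such a width function could be used to pad a column with
printf(). It is not the code from the attached patch. pad_width() is a
hypothetical helper name of mine, and the sketch assumes the caller has
already called setlocale(LC_ALL, "") so that mbstowcs() and wcswidth()
follow the user's locale.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* wcswidth() is an XSI extension */
#endif
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Terminal columns occupied by s in the current locale.
 * Falls back to strlen(s) on any error (allocation failure,
 * invalid multibyte sequence, non-printable character). */
static size_t
mbswidth(const char *s)
{
	size_t len = strlen(s);
	size_t rv = len;
	size_t wlen;
	int cols;
	wchar_t *wstr = malloc((len + 1) * sizeof(wchar_t));

	if (wstr != NULL
	    && (wlen = mbstowcs(wstr, s, len + 1)) != (size_t)-1
	    && (cols = wcswidth(wstr, wlen)) >= 0)
		rv = (size_t)cols;
	free(wstr);	/* free(NULL) is a no-op */
	return rv;
}

/* Hypothetical helper: the field width to hand to "%-*s" so that s
 * ends up occupying `cols` terminal columns. printf() counts bytes,
 * so the bytes that take up no column of their own are added. */
static int
pad_width(const char *s, int cols)
{
	assert(cols >= 0);
	return cols + (int)strlen(s) - (int)mbswidth(s);
}
```

A caller would then write something like
printf("%-*s %s\n", pad_width(name, 20), name, desc);
instead of using a fixed "%-20s". For plain ASCII the two are the same.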