Package: bsdextrautils
Version: 2.38.1-4
Severity: normal
X-Debbugs-Cc: bugs.debian....@wongs.net

Dear Maintainer,

The -l (--table-columns-limit) option of the "column" utility does not
work correctly when the input has two or more consecutive spaces
between fields. The option is supposed to set a maximum number of
columns, with the last column containing all remaining line data.

An example of how it is supposed to work can be seen when the input is
delimited by a single space. For example:

    $ printf '1 2 3 4 5 6\nOne Two Three Four Five Six\n' \
            | column -t -l5

    1    2    3      4     5 6
    One  Two  Three  Four  Five Six

Note how column 5 is the maximum column, so the data for column six is
simply appended. That is the correct behavior.
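
The intended rule can be stated as a small C sketch. This is only an
illustration: the helper name last_column_expected is made up here, it
is not code from util-linux, and it assumes plain char strings with
space as the only separator.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from util-linux: return a pointer to the
 * start of field number `maxncols` (1-based) in `line`, treating any run
 * of spaces as one separator, or NULL if the line has fewer fields.
 * Everything from that pointer to the end of the line is the last column.
 */
const char *last_column_expected(const char *line, size_t maxncols)
{
    const char *p = line;
    size_t field = 0;

    assert(maxncols >= 1);

    for (;;) {
        while (*p == ' ')                /* skip a run of separators */
            p++;
        if (*p == '\0')
            return NULL;                 /* ran out of fields */
        if (++field == maxncols)
            return p;                    /* remainder of the line starts here */
        while (*p != '\0' && *p != ' ')  /* step over the current field */
            p++;
    }
}
```

With a limit of 5, the helper points at "5 6" and "Five Six" for the
two input lines above, regardless of how many spaces separate the
earlier fields.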

However, the problem is easy to trigger: pipe the output of column
back into column with the same options. This should be a no-op, but
instead it mangles the data:

    $ printf '1 2 3 4 5 6\nOne Two Three Four Five Six\n' \
            | column -t -l5 \
            | column -t -l5

    1    2    3      4       3
    One  Two  Three  Four  ur

As you can see, the fifth column has been overwritten by data from
previous columns. (Perhaps a pointer problem?)

Any input with runs of consecutive spaces will trigger the bug. For
example, the output from 'ls -lh':

    $ ls -lh | column -t -l7
    total       500K                         
    drwxr-xr-x  2     ben  ben    4.0K  Jan  an
    -rwxr-xr-x  1     ben  ben    2.7K  Jul  ul
    drwxr-xr-x  5     ben  ben    4.0K  Dec  ec
    -rw-r--r--  1     ben  ben    116K  Nov  ov
    -rw-r--r--  1     ben  ben    31K   Nov  Nov
    drwxr-xr-x  2     ben  ben    4.0K  Mar  ar
    -rw-r--r--  1     ben  ben    225   Oct  Oct
    drwxr-xr-x  2     ben  ben    12K   Jan  Jan
    drwxr-xr-x  12    ben  ben    260K  Jan  n



            *    *    *    *    *

This may be irrelevant, but I noticed in the source that there is some
code which seems suspicious at lines 459 and 470:

   457          if (ctl->maxncols && n + 1 == ctl->maxncols) {
   458                  if (nchars + skip < len)
-> 459                          wcdata = wcs0 + (nchars + skip);
   460                  else
   461                          wcdata = NULL;
   462          } else {
   463                  wcdata = local_wcstok(ctl, wcs, &sv);
   464
   465                  /* For the default separator ('greedy' mode) it uses
   466                   * strtok() and it skips leading white chars. In this
   467                   * case we need to remember size of the ignored white
   468                   * chars due to wcdata calculation in maxncols case */
   469                  if (wcdata && ctl->greedy
-> 470                      && n == 0 && nchars == 0 && wcdata > wcs)
   471                          skip = wcdata - wcs;
   472          }

At line 459, pointer arithmetic computes where the last column starts
in the original string. However, the result lands a few characters
short, perhaps because `skip` is always zero: in my experiments, the
test at lines 469-470 always failed, so `skip` was never changed.
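
The arithmetic can be modelled in isolation. The following is a
minimal sketch, not the util-linux code: it uses plain char strings
and strtok_r() in place of wchar_t and local_wcstok(), and it assumes
(from code not quoted above) that nchars grows by the field length
plus one for every field read. The function name last_column_model is
hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* for strtok_r() */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Simplified model of the offset arithmetic around line 459.
 * Splits `line` in place on runs of spaces and returns the position the
 * model computes for the start of column number `maxncols`, or NULL.
 */
char *last_column_model(char *line, size_t maxncols)
{
    char *save = NULL, *tok, *src = line;
    size_t n = 0, nchars = 0, skip = 0;
    size_t len = strlen(line);  /* measured before strtok_r() writes NULs */

    assert(maxncols >= 1);

    for (;;) {
        /* Last column: index into the original buffer (cf. line 459). */
        if (n + 1 == maxncols)
            return (nchars + skip < len) ? line + nchars + skip : NULL;

        tok = strtok_r(src, " ", &save);
        if (!tok)
            return NULL;        /* fewer fields than maxncols */

        /* Only blanks before the very first field are remembered. */
        if (n == 0 && nchars == 0 && tok > line)
            skip = (size_t)(tok - line);

        /* Assumption: exactly one separator is counted per field. */
        nchars += strlen(tok) + 1;
        n++;
        src = NULL;
    }
}
```

Called with a limit of 5 on the two aligned lines produced by the
first column run, the model returns "  3" and "ur", the same fragments
seen in the mangled output above. On the single-spaced input it
returns "5 6". If the model is faithful, the shortfall comes from
counting one separator per field rather than from `skip` alone.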

The reference to wide characters made me wonder if that was the issue,
but neither setting LANG=C nor recompiling with HAVE_WIDECHAR=0 helped.


-- System Information:
Debian Release: bookworm/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 6.1.0-1-amd64 (SMP w/8 CPU threads; PREEMPT)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages bsdextrautils depends on:
ii  libc6          2.36-8
ii  libsmartcols1  2.38.1-4
ii  libtinfo6      6.4-1

bsdextrautils recommends no packages.

bsdextrautils suggests no packages.

-- no debconf information
