Package: pcregrep
Version: 6.4-1.1
Severity: normal

When pcregrep prints the input lines that match the regexp, it
truncates each output line at the first null character.  AFAICT,
the bug affects only the output: the regexp engine correctly finds
characters that follow a null character in an input line, and it is
even possible to search for the null character itself.

$ printf "pot\0kettle\nblack\n" | hd
00000000  70 6f 74 00 6b 65 74 74  6c 65 0a 62 6c 61 63 6b  |pot.kettle.black|
00000010  0a                                                |.|
00000011
$ printf "pot\0kettle\nblack\n" | pcregrep k | hd
00000000  70 6f 74 0a 62 6c 61 63  6b 0a                    |pot.black.|
0000000a
$ printf "pot\0kettle\nblack\n" | pcregrep '\0' | hd
00000000  70 6f 74 0a                                       |pot.|
00000004

The bug may be in this line in the pcregrep function in pcregrep.c:

      fprintf(stdout, "%.*s\n", linelength, ptr);

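Changing this to fwrite(ptr, 1, linelength, stdout), followed by a
separate write of the newline that the format string currently
supplies, might fix the bug, but I haven't tested it.  Since fwrite
copies bytes verbatim, I believe this change would not affect the
handling of multibyte characters.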
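
A minimal sketch of the difference between the two calls, written
against an arbitrary FILE * so it can be tried outside pcregrep.  It
assumes that linelength does not include the trailing newline (which
the "\n" in the format string suggests, but which I have not checked).
The helper names print_line_printf and print_line_fwrite are
hypothetical and do not appear in pcregrep.c.

```c
#include <assert.h>
#include <stdio.h>

/* Mimics the suspected call: the precision in "%.*s" is only an upper
   bound, so output still stops at the first NUL byte in ptr. */
void print_line_printf(FILE *out, const char *ptr, int linelength)
{
    assert(out != NULL && ptr != NULL);
    fprintf(out, "%.*s\n", linelength, ptr);
}

/* Possible replacement: fwrite copies exactly linelength bytes, NUL
   bytes included.  The newline must then be written separately
   (assumption: linelength excludes it). */
void print_line_fwrite(FILE *out, const char *ptr, size_t linelength)
{
    assert(out != NULL && ptr != NULL);
    fwrite(ptr, 1, linelength, out);
    fputc('\n', out);
}
```

Called with the 10-byte buffer "pot\0kettle", the first helper writes
4 bytes ("pot" plus the newline), while the second writes all 10 bytes
plus the newline.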

There may also be similar bugs in the output of neighbouring lines
(the --after-context=number and --before-context=number options).

-- System Information:
Debian Release: testing/unstable
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'testing'), (500, 'stable')
Architecture: i386 (i686)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.12-1-k7
Locale: LANG=fi_FI.UTF-8, LC_CTYPE=fi_FI.UTF-8 (charmap=UTF-8)

Versions of packages pcregrep depends on:
ii  libc6                         2.3.5-6    GNU C Library: Shared libraries an
ii  libpcre3                      6.4-1.1    Perl 5 Compatible Regular Expressi

pcregrep recommends no packages.

-- no debconf information
