Package: sed
Version: 4.1.5-6
Severity: normal

Hello,
I think the --posix option might be too aggressive with respect to the \
character in the replacement string of the 's' command.

Some examples:

$ echo foobar | sed --posix -e 's/\(o\)/\1/g'
foobar

This works OK (using a newline too -- hence it is POSIX-compliant). But the
following behaviour might not be:

Exhibit 1:

$ echo foobar | sed --posix -e 's/\(o\)/\&/g'
fbar

Exhibit 2:
$ echo foobar | sed --posix -e 's/\(o\)/\\1/g'
f11bar

Exhibit 3:
$ echo foobar | sed --posix -e 's/\(o\)/\\/g'
fbar

Here is the relevant part of the POSIX spec for sed, with [x] markers added
to pinpoint where the spec seems not to be respected:

"
The replacement string shall be scanned from beginning to end. An ampersand
( '&' ) appearing in the replacement shall be replaced by the string
matching the BRE. The special meaning of '&' in this context can be
suppressed by preceding it by a backslash. The characters "\n", where n is a
digit, shall be replaced by the text matched by the corresponding
backreference expression. The special meaning of "\n" where n is a digit in
this context, can be suppressed by preceding it by a backslash[2]. For each
other backslash ( '\' ) encountered, the following character shall lose its
special meaning (if any). The meaning of a '\' immediately followed by any
character other than '&'[1], '\'[3], a digit, or the delimiter character
used for this command, is unspecified.
"

If my reading of the spec is correct, the tests above should return:
Exhibit 1: f&&bar (special meaning of & is suppressed, hence actual &s)
Exhibit 2: f\1\1bar (special meaning of \1 suppressed)
Exhibit 3: f\\bar (special meaning of \ suppressed)
I admit, though, that the expected behaviour for \\ is only implicit in the
spec, as it is deduced from the sentence marked [2].
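
To make this reading concrete, here is a minimal Python sketch of the
replacement-string scan as the quoted paragraph describes it. It is only an
illustration, not sed's code: the names expand_replacement and sed_s_global
are hypothetical, Python's re module stands in for the BRE engine (so the
pattern is written (o) rather than \(o\)), only \1 to \9 are treated as
backreferences, and a backslash before any other character simply yields
that character, which is one possible choice for the "unspecified" case.

```python
import re


def expand_replacement(repl, whole, groups):
    """Expand an 's' replacement string following the quoted POSIX text.

    whole  -- the text matched by the whole regex (substituted for '&')
    groups -- list of backreference texts; groups[0] corresponds to \\1
    """
    out = []
    i = 0
    while i < len(repl):
        c = repl[i]
        if c == "&":
            # Unescaped '&' stands for the whole match.
            out.append(whole)
        elif c == "\\" and i + 1 < len(repl):
            nxt = repl[i + 1]
            if nxt in "123456789":
                # "\n" with n a digit: text of the n-th backreference.
                out.append(groups[int(nxt) - 1])
            else:
                # Escaped '&', '\' or delimiter lose their special meaning.
                # Other characters are unspecified; assumed literal here.
                out.append(nxt)
            i += 1  # skip the escaped character
        else:
            out.append(c)
        i += 1
    return "".join(out)


def sed_s_global(text, pattern, repl):
    """Rough equivalent of s/pattern/repl/g, with a Python regex."""
    return re.sub(
        pattern,
        lambda m: expand_replacement(repl, m.group(0), list(m.groups())),
        text,
    )


if __name__ == "__main__":
    # The four replacement strings used in the report.
    for repl in (r"\1", r"\&", r"\\1", r"\\"):
        print(repl, "->", sed_s_global("foobar", r"(o)", repl))
```

Under these assumptions the sketch gives foobar, f&&bar, f\1\1bar and f\\bar
for the four commands, which matches the expected results listed above.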

I said at the beginning that --posix is too aggressive with \ because, as
far as I remember, in the sources any character other than a digit falls
into the default case of a switch statement, without \ or & being taken
into account.
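
As a rough illustration of that suspicion, the toy Python model below
reproduces the three exhibits. It is not taken from the sed sources; the
function name observed_expand and the rule it encodes (a backslash followed
by anything other than a digit 1 to 9 is dropped together with that
character) are assumptions inferred only from the outputs shown above.

```python
def observed_expand(repl, whole, groups):
    """Toy model of the behaviour seen with --posix in the exhibits.

    whole  -- the text matched by the whole regex (substituted for '&')
    groups -- list of backreference texts; groups[0] corresponds to \\1
    """
    out = []
    i = 0
    while i < len(repl):
        c = repl[i]
        if c == "&":
            out.append(whole)
        elif c == "\\":
            nxt = repl[i + 1] if i + 1 < len(repl) else ""
            if nxt and nxt in "123456789":
                # Backreferences still work, as in the first example.
                out.append(groups[int(nxt) - 1])
            # Assumed rule: any other escaped character is discarded
            # along with its backslash.
            i += 1
        else:
            out.append(c)
        i += 1
    return "".join(out)


if __name__ == "__main__":
    # Each 'o' in foobar is matched, with \1 = "o".
    for repl in (r"\1", r"\&", r"\\1", r"\\"):
        expanded = observed_expand(repl, "o", ["o"])
        print(repl, "->", "f" + expanded * 2 + "bar")
```

With this rule the model yields foobar, fbar, f11bar and fbar, the same
outputs as in the exhibits.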

Is my understanding of the spec correct, and is this indeed a bug in the
--posix option?

Regards,
Samuel Colin

-- System Information:
Debian Release: lenny/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: i386 (i686)

Kernel: Linux 2.6.24 (PREEMPT)
Locale: LANG=C, LC_CTYPE=fr_FR.UTF8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages sed depends on:
ii  libc6                         2.7-10     GNU C Library: Shared libraries

sed recommends no packages.

-- no debconf information


