Re: ksh: documented substitution behavior contradicts actual behavior

Kastus Shchuka Sat, 15 Oct 2022 23:37:28 -0700

On Sat, Oct 15, 2022 at 11:42:17PM -0300, Lucas de Sena wrote:
> Hi,
> 
> After trying to split a string into fields delimited with colons and
> spaces, I found this bug in how ksh(1) does substitution.  The actual
> behavior contradicts what other shells like bash and mksh do and also
> contradicts its own manual.
> 
> Running the following on other shells (say, bash) prints "/foo/bar/".
> This command splits the string " foo : bar " into two fields: "foo"
> and "bar", considering colon and space as delimiters.
> 
>       echo " foo : bar " | {
>               IFS=": "
>               read -r a b
>               printf -- "/%s/%s/\n" "$a" "$b"
>       }
> 
> However, running the same command in OpenBSD ksh(1) (or sh(1)) splits
> the string into "foo" and ": bar".


This is because the last parameter (b) receives whatever is left of the
line once the earlier parameters are filled, so it ends up as a
concatenation of two fields.  Parsing is done properly if you add c to
the read command:

+ echo  foo : bar 
+ IFS=: 
+ read -r a b c
+ printf -- /%s/%s/%s/\n foo  bar
/foo//bar/
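
For comparison, the same input can be split with an unquoted
substitution instead of read.  This is only a minimal sketch: it
assumes the shell splits substitutions the way the manual describes,
and the variable names (line, nfields) are made up for illustration.
I have not compared its output across shells here.

```shell
#!/bin/sh
# Sketch: split " foo : bar " via an unquoted substitution, not read.
line=" foo : bar "
IFS=": "

set -f              # no globbing, so only field splitting is in play
set -- $line        # unquoted on purpose: the result is field split
nfields=$#
a=$1
b=$2

printf -- 'fields: %d\n' "$nfields"
printf -- '/%s/%s/\n' "$a" "$b"
```

With read, by contrast, the last variable also collects the remainder
of the line, which is why adding c changes the result.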


> 
> The manual ksh(1) provides the following, similar example:
> 
> > Example: If IFS is set to “<space>:”, and VAR is set to
> > “<space>A<space>:<space><space>B::D”, the substitution for $VAR
> > results in four fields: ‘A’, ‘B’, ‘’ (an empty field), and ‘D’.
> > Note that if the IFS parameter is set to the NULL string, no field
> > splitting is done; if the parameter is unset, the default value of
> > space, tab, and newline is used.
> 
> Let's try it:
> 
>       echo " A :  B::D" | {
>               IFS=" :"
>               read -r arg1 arg2 arg3 arg4
>               printf -- '1st: "%s"\n' "$arg1"
>               printf -- '2nd: "%s"\n' "$arg2"
>               printf -- '3rd: "%s"\n' "$arg3"
>               printf -- '4th: "%s"\n' "$arg4"
>       }
> 
> bash(1) splits the line into the following fields:
> 
>       1st: "A"
>       2nd: "B"
>       3rd: ""
>       4th: "D"
> 
> This is actually the expected output, as described in the manual.
> 
> However, running the same command in OpenBSD ksh, prints this:
> 
>       1st: "A"
>       2nd: ""
>       3rd: "B"
>       4th: ":D"
> 
> A completely different thing.
> The same occurs with OpenBSD sh(1).

What you observe is the result of the next paragraph in the man page
after the example you quoted:

     Also, note that the field splitting applies only to the immediate result
     of the substitution.  Using the previous example, the substitution for
     $VAR:E results in the fields: `A', `B', `', and `D:E', not `A', `B', `',
     `D', and `E'.  This behavior is POSIX compliant, but incompatible with
     some other shell implementations which do field splitting on the word
     which contained the substitution or use IFS as a general whitespace
     delimiter.
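
The $VAR:E case from that paragraph can be tried directly.  A minimal
sketch, assuming a shell that splits only the result of the
substitution, as the paragraph describes; the variable names
(count_plain, count_suffix, third, last) are hypothetical:

```shell
#!/bin/sh
# Sketch: field splitting applies to the substitution result only,
# so the literal ":E" after $VAR is not treated as a delimiter.
IFS=" :"
VAR=" A :  B::D"

set -f                      # no globbing, only field splitting

set -- $VAR                 # unquoted on purpose
count_plain=$#
third=$3                    # the empty field between the two colons

set -- $VAR:E               # ":E" is part of the word, not of $VAR
count_suffix=$#
for last in "$@"; do :; done    # keep the final field

printf -- 'plain:  %d fields, 3rd: "%s"\n' "$count_plain" "$third"
printf -- 'suffix: %d fields, last: "%s"\n' "$count_suffix" "$last"
```

If the shell follows the rule quoted above, both substitutions give
four fields and the last field of the second one is `D:E'.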

> 
> I could not understand how OpenBSD does the splitting, but the way it
> does it is clearly a bug: it not only contradicts its own manual,
> but also differs from other implementations.

I do not see a contradiction in the man page. It does say that this
behavior is incompatible with some other shell implementations.

> 
> Thank you,
> Lucas de Sena.
>
