Kastus Shchuka writes:
> On Sun, Oct 16, 2022 at 11:48:35AM +0100, [email protected] wrote:
> > So given $X:
> >
> > $ X=' A : B::D'
> >
> > Parameter substitution:
> >
> > $ ( IFS=' :'; dump $X )
> > $VAR1 = 'A';
> > $VAR2 = 'B';
> > $VAR3 = '';
> > $VAR4 = 'D';
> >
> > read substitution:
> >
> > $ echo "$X" | ( IFS=' :'; read a1 a2 a3 a4; dump "$a1" "$a2" "$a3" "$a4" )
> > $VAR1 = 'A';
> > $VAR2 = '';
> > $VAR3 = 'B';
> > $VAR4 = ':D';
> >
> > It does look like read, which uses its own expansion routine, has
> > a bug: a2/VAR2 should be 'B' (or 'B::D') not ''.
>
> Not sure if it is a bug or a feature.

I can't think of a good reason why the output from these commands
should be different. The manpage section describing read states
clearly that it should be using the same splitting algorithm:
"separates the line into fields using the IFS parameter (see
Substitution above)".
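
For reference, the splitting rule both code paths are expected to follow
can be modelled roughly as below. This is only a standalone sketch and not
the ksh code: split_fields(), is_ifs(), is_ifs_ws() and FIELD_MAX are names
made up for illustration, and it ignores backslash handling as well as
read's habit of assigning the remainder of the line to the last variable.

```c
#include <stdio.h>
#include <string.h>

#define FIELD_MAX 64	/* hypothetical per-field buffer size */

/* Is c one of the characters listed in ifs? */
static int
is_ifs(char c, const char *ifs)
{
	return c != '\0' && strchr(ifs, c) != NULL;
}

/* Is c an IFS character that is also whitespace? */
static int
is_ifs_ws(char c, const char *ifs)
{
	return is_ifs(c, ifs) && (c == ' ' || c == '\t' || c == '\n');
}

/*
 * Split s into at most max fields. A delimiter is either a run of IFS
 * whitespace, or a single non-whitespace IFS character together with any
 * IFS whitespace around it. Returns the number of fields stored in out.
 */
static int
split_fields(const char *s, const char *ifs, char out[][FIELD_MAX], int max)
{
	int n = 0;

	/* Leading IFS whitespace never starts a field. */
	while (is_ifs_ws(*s, ifs))
		s++;

	while (*s != '\0' && n < max) {
		size_t len = 0;

		/* Copy characters up to the next IFS character. */
		while (*s != '\0' && !is_ifs(*s, ifs)) {
			if (len < FIELD_MAX - 1)
				out[n][len++] = *s;
			s++;
		}
		out[n][len] = '\0';
		n++;

		/* Consume one delimiter: ws* [non-ws IFS char] ws* */
		while (is_ifs_ws(*s, ifs))
			s++;
		if (is_ifs(*s, ifs) && !is_ifs_ws(*s, ifs)) {
			s++;
			while (is_ifs_ws(*s, ifs))
				s++;
		}
	}
	return n;
}
```

Calling split_fields(" A : B::D", " :", fields, 8) on a
char fields[8][FIELD_MAX] array yields the four fields 'A', 'B', '' and
'D', which is what the parameter substitution case gives.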

The diff brings them into line with each other and I think accounts
for all the edge cases.

Matthew

Index: c_sh.c
===================================================================
RCS file: /src/datum/openbsd/cvs/src/bin/ksh/c_sh.c,v
retrieving revision 1.64
diff -u -p -r1.64 c_sh.c
--- c_sh.c 22 May 2020 07:50:07 -0000 1.64
+++ c_sh.c 17 Oct 2022 09:59:21 -0000
@@ -253,6 +253,7 @@ c_read(char **wp)
 	int expand = 1, savehist = 0;
 	int expanding;
 	int ecode = 0;
+	int hardws = 0;
 	char *cp;
 	int fd = 0;
 	struct shf *shf;
@@ -376,9 +377,21 @@ c_read(char **wp)
 				break;
 			if (ctype(c, C_IFS)) {
 				if (Xlength(cs, cp) == 0 && ctype(c, C_IFSWS))
-					continue;
+					continue; /* Trim leading space. */
+				if (!ctype(c, C_IFSWS)) {
+					/* Do not finish this variable
+					 * on non IFS whitespace if the
+					 * previous variable has
+					 * trailing IFS whitespace.
+					 */
+					if (hardws) {
+						hardws = false;
+						continue;
+					}
+				} else
+					hardws = true;
 				if (wp[1])
-					break;
+					break; /* Finish scanning this variable. */
 			}
 			Xput(cs, cp, c);
 		}