A year ago, dash had this fix:
commit 1f1e555aba99808a82cb5090b5ef980714dea09c
Author: Herbert Xu <[email protected]>
Date: Wed May 1 17:12:27 2024 +0800
expand: Fix naked backslah leakage
Naked backslashes in patterns may incorrectly unquote subsequent
wild characters that are themselves quoted. Fix this by adding
an extra backslash when necessary.
Test case:
a="\\*bc"; b="\\"; c="*"; echo "<${a##$b"$c"}>"
Old result:
<>
New result:
<bc>
I started writing a testcase for it, covering a few more possibilities,
and discovered that bash itself is probably buggy here:
A naked (not double-quoted) $b in the pattern part misbehaves:
- the $b* combination matches a literal '*' character
- the $b"*" combination matches ... I didn't manage to find out what:
  it matches neither '*' nor '\*', so I can't work out what glob() pattern
  is produced internally when ${var#$b"*"} is evaluated.
If it's not a bug, can you specify the rules for how it works?
(so that I can have meaningful comments in dash and busybox ash/hush)
The full testcase script is below, with parts where bash is not working
as I expect commented accordingly:
a='\*bc'
b='\'
c='*'
echo "a is '$a'"
echo "b is '$b'"
echo "c is '$c'"
echo '${a##?*} removes everything: '"|${a##?*}|"
echo '${a##?"*"} removes \*: '"|${a##?"*"}|"' - matches one char, then *'
echo '${a##\*} removes nothing: '"|${a##\*}|"' - first char is not *'
echo '${a##\\*} removes everything: '"|${a##\\*}|"' - matches \, then all'
echo '${a##\\\*} removes \*: '"|${a##\\\*}|"' - matches \, then *'
echo '${a##?$c} removes everything: '"|${a##?$c}|"' - matches one char, then all'
echo '${a##?"$c"} removes \*: '"|${a##?"$c"}|"' - matches one char, then *'
echo '${a##\\$c} removes everything: '"|${a##\\$c}|"' - matches \, then all'
echo '${a##\\"$c"} removes \*: '"|${a##\\"$c"}|"' - matches \, then *'
echo '${a##$b} removes \: '"|${a##$b}|"' - matches \'
echo '${a##"$b"} removes \: '"|${a##"$b"}|"' - matches \'
echo
# This isn't working in bash as expected
echo '${a##$b?} removes \*: '"|${a##$b?}|"' - matches \, then one char'
# bash prints |\*bc|
echo '${a##$b*} removes everything: '"|${a##$b*}|"' - matches \, then all'
# bash prints |\*bc|
echo '${a##$b$c} removes everything: '"|${a##$b$c}|"' - matches \, then all'
# bash prints |\*bc|
echo '${a##$b"$c"} removes \*: '"|${a##$b"$c"}|"' - matches \, then *'
# bash prints |\*bc|
# the cause seems to be that $b emits a backslash that "glues" onto the next
# character if there is one:
# a='\*bc'; b='\'; c='*'; echo "|${a##?$b*}|" # bash prints |bc| - the $b*
# works as \* (matches literal *)
# a='\*bc'; b='\'; c='*'; echo "|${a##\\$b*}|" # bash prints |bc|
# a='*bc'; b='\'; c='*'; echo "|${a##$b*}|" # bash prints |bc|
echo
echo '${a##"$b"?} removes \*: '"|${a##"$b"?}|"' - matches \, then one char'
echo '${a##"$b"*} removes everything: '"|${a##"$b"*}|"' - matches \, then all'
echo '${a##"$b""?"} removes nothing: '"|${a##"$b""?"}|"' - second char is not ?'
# bash prints |bc|
echo '${a##"$b""*"} removes \*: '"|${a##"$b""*"}|"' - matches \, then *'
echo '${a##"$b"\*} removes \*: '"|${a##"$b"\*}|"' - matches \, then *'
echo '${a##"$b"$c} removes everything: '"|${a##"$b"$c}|"' - matches \, then all'
echo '${a##"$b""$c"} removes \*: '"|${a##"$b""$c"}|"' - matches \, then *'
echo '${a##"$b?"} removes nothing: '"|${a##"$b?"}|"' - second char is not ?'
# bash prints |bc|
echo '${a##"$b*"} removes \*: '"|${a##"$b*"}|"' - matches \, then *'
# bash prints ||
echo '${a##"$b$c"} removes \*: '"|${a##"$b$c"}|"' - matches \, then *'
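
For contrast, here is a minimal sketch of the baseline cases the script above relies on, with no backslash coming from a variable. It uses plain assignments instead of nested double quotes so that only the pattern rules are in play. The helper name strip_prefix_demo and the r1..r3 variables are made up for illustration and are not part of the testcase.

```shell
# Baseline only: no backslash is produced by a variable expansion here.
# strip_prefix_demo is a hypothetical helper name, not from the testcase.
strip_prefix_demo() {
    a='\*bc'
    c='*'
    r1=${a##?$c}      # unquoted $c stays a wildcard: ?* matches everything
    r2=${a##?"$c"}    # quoted "$c" is a literal *: matches one char, then *
    r3=${a##\*}       # \* is a literal *; a starts with \, so nothing matches
    printf '|%s|\n' "$r1" "$r2" "$r3"
}
strip_prefix_demo
```

These three lines should come out the same in any POSIX shell, which is why the naked-$b cases stand out: they are the only ones where the result depends on how the shell re-quotes an expanded backslash.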