Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -march=x86-64 -mtune=generic -O2 -pipe -fno-plt
-fexceptions -Wp,-D_FORTIFY_SOURCE=3 -Wformat -Werror=format-security
-fstack-clash-protection -fcf-protection -fno-omit-frame-pointer
-mno-omit-leaf-frame-pointer -flto=auto
-DDEFAULT_PATH_VALUE='/usr/local/sbin:/usr/local/bin:/usr/bin'
-DSTANDARD_UTILS_PATH='/usr/bin' -DSYS_BASHRC='/etc/bash.bashrc'
-DSYS_BASH_LOGOUT='/etc/bash.bash_logout' -DNON_INTERACTIVE_LOGIN_SHELLS
-std=gnu17
uname output: Linux vbox-virtualbox 6.12.28-1-MANJARO #1 SMP PREEMPT_DYNAMIC
Fri, 09 May 2025 10:53:27 +0000 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu
Bash Version: 5.2
Patch Level: 37
Release Status: release
Bash options: autocd off
assoc_expand_once off
cdable_vars off
cdspell off
checkhash off
checkjobs off
checkwinsize on
cmdhist on
compat31 off
compat32 off
compat40 off
compat41 off
compat42 off
compat43 off
compat44 off
complete_fullquote on
direxpand off
dirspell off
dotglob off
execfail off
expand_aliases on
extdebug off
extglob on
extquote on
failglob off
force_fignore on
globasciiranges on
globskipdots on
globstar off
gnu_errfmt off
histappend on
histreedit off
histverify off
hostcomplete off
huponexit off
inherit_errexit off
interactive_comments on
lastpipe off
lithist off
localvar_inherit off
localvar_unset off
login_shell off
mailwarn off
no_empty_cmd_completion off
nocaseglob off
nocasematch off
noexpand_translation off
nullglob off
patsub_replacement on
progcomp on
progcomp_alias off
promptvars on
restricted_shell off
shift_verbose off
sourcepath on
varredir_close off
xpg_echo off
(I also tried turning off extglob and extquote, but it made no
difference)
Description:
Bash's '=~' operator (POSIX extended regex) seems to behave very differently
from the way grep's -E flag handles regular expressions.
I repeatedly failed to get the results I expected from grep when using just
the [a-z] and [a-z]+ patterns: I expected multiple results in $BASH_REMATCH,
but it only picks up the first match (a single character for [a-z]), while
grep -E picks up all of them (which is odd, since the pattern [a-z]+$ gives
the same results in both).
So I was wondering whether this is a bug, or whether it is intended and I'm
just misinterpreting how bash handles regular expressions. I read the section
of the bash manual on the '=~' operator,
-> https://www.gnu.org/software/bash/manual/bash.html#index-_005b_005b,
but as far as I can tell (and to the extent of my knowledge of how regular
expressions work), this looks like unintended behavior.
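For comparison, a minimal sketch of what $BASH_REMATCH holds when the regex
contains parenthesized subexpressions: per the manual, index 0 is the portion
of the string matching the whole regex, and index n is the portion matching
the nth parenthesized subexpression. The input string and the grouping below
are illustrative assumptions, not part of the test case in this report.

```bash
#!/usr/bin/env bash
# Illustrative input; the parenthesized groups exist only for this sketch.
input='test-tesst'

if [[ $input =~ ([a-z]+)-([a-z]+) ]]; then
    whole=${BASH_REMATCH[0]}    # text matched by the entire regex
    first=${BASH_REMATCH[1]}    # first parenthesized subexpression
    second=${BASH_REMATCH[2]}   # second parenthesized subexpression
    printf '0: %s\n1: %s\n2: %s\n' "$whole" "$first" "$second"
fi
```

Without any parentheses in the regex, only index 0 is populated.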
Repeat-By:
grep:
`$ echo test-test | POSIXLY_CORRECT=1 grep -E '[a-z]'`
`^test^-^test^`
`$ echo test-tesst | POSIXLY_CORRECT=1 grep -E '[a-z]+'`
`^test^-^tesst^`
bash's '=~' and $BASH_REMATCH:
```
$ if [[ test-test =~ [a-z] ]]; then
    for i in "${!BASH_REMATCH[@]}"; do
      echo "$i: ${BASH_REMATCH[$i]}"
    done
  fi
```
`0: t`
```
$ if [[ test-tesst =~ [a-z]+ ]]; then
    for i in "${!BASH_REMATCH[@]}"; do
      echo "$i: ${BASH_REMATCH[$i]}"
    done
  fi
```
`0: test`
(The results are the same when test-test/test-tesst is single- or
double-quoted, or when the regex is stored in a single-quoted variable.)
Fix:
In both cases, ${BASH_REMATCH[1]} should also have results stored.
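A possible workaround in the meantime, as a minimal sketch: match repeatedly
and strip the input up to the end of each match, collecting
${BASH_REMATCH[0]} on every pass. The function name all_matches and the
global array matches are hypothetical names chosen for this example, and the
sketch assumes the regex contains no ^ or $ anchors and never matches the
empty string.

```bash
#!/usr/bin/env bash
# Hypothetical helper: collect every non-overlapping match of an ERE,
# one array element per match, into the global array "matches".
# Assumes the regex has no ^/$ anchors.
all_matches() {
    local rest=$1 re=$2 m
    matches=()
    while [[ -n $rest && $rest =~ $re ]]; do
        m=${BASH_REMATCH[0]}
        [[ -n $m ]] || break        # guard against empty matches looping forever
        matches+=("$m")
        rest=${rest#*"$m"}          # drop everything up to and including the match
    done
}

all_matches 'test-tesst' '[a-z]+'
printf '%s\n' "${matches[@]}"
```

With the inputs from this report, the call above prints test and tesst on
separate lines, and all_matches 'test-test' '[a-z]' yields the eight letters
one by one.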