Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu' -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I../. -I.././include -I.././lib -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wall -no-pie -Wno-parentheses -Wno-format-security
uname output: Linux dilbert 4.10.0-41-generic #45~16.04.1-Ubuntu SMP Fri Nov 24 15:06:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu
Bash Version: 4.4
Patch Level: 12
Release Status: release

Description:
    I'm sanitising URLs from advertisement crap. As described below, I'm
    getting the wrong result for parenthesised subexpressions that use the
    non-greedy operator '?'.

    The test url is:
    http://toolbox.contentspread.net/container/medimops/track/xxxxxxxxxx.dyn?csRdu=https://www.medimops.de/?anid=M9999999999&cl=details&wdm=M9999999999&utm_source=CRM&utm_medium=email&utm_campaign=OS

    The regular expression is:
    https?:\/\/toolbox.contentspread.net\/(.*?)=(.+?)&.*

    As I understand the specification, and as I verified with 'visual regexp'
    and https://regex101.com/, the result should be:
    1 → container/medimops/track/xxxxxxxxxx.dyn?csRdu
    2 → https://www.medimops.de/?anid=M9999999999

    Running the script below, I got instead:
    1 → container/medimops/track/xxxxxxxxxx.dyn?csRdu=https://www.medimops.de/?anid=M9999999999&cl=details&wdm=M9999999999&utm_source=CRM&utm_medium
    2 → email

Repeat-By:
    Test script:

    #!/bin/bash
    url='http://toolbox.contentspread.net/container/medimops/track/xxxxxxxxxx.dyn?csRdu=https://www.medimops.de/?anid=M9999999999&cl=details&wdm=M9999999999&utm_source=CRM&utm_medium=email&utm_campaign=OS'
    re='https?:\/\/toolbox.contentspread.net\/(.*?)=(.+?)&.*'
    if [[ ${url} =~ ${re} ]]
    then
        echo "0 → ${BASH_REMATCH[0]}"
        echo "1 → ${BASH_REMATCH[1]}"
        echo "2 → ${BASH_REMATCH[2]}"
    fi
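
For comparison, here is a minimal sketch of the same extraction written without the non-greedy operator. It assumes that [[ ... =~ ... ]] matches with POSIX extended regular expressions, which define no non-greedy quantifiers, so a '?' placed after '.*' or '.+' does not shorten the match. The negated bracket expressions [^=] and [^&] are a substitution introduced here and are not part of the original script; they stop the first group at the first '=' and the second group at the first '&'.

```shell
#!/bin/bash
# Same test url as in the report above.
url='http://toolbox.contentspread.net/container/medimops/track/xxxxxxxxxx.dyn?csRdu=https://www.medimops.de/?anid=M9999999999&cl=details&wdm=M9999999999&utm_source=CRM&utm_medium=email&utm_campaign=OS'

# Assumption: POSIX ERE has no lazy quantifiers, so each group is bounded
# explicitly instead:
#   ([^=]*)  everything up to, but not including, the first '='
#   ([^&]+)  everything up to, but not including, the first '&'
re='https?://toolbox\.contentspread\.net/([^=]*)=([^&]+)&.*'

if [[ ${url} =~ ${re} ]]
then
    echo "1 → ${BASH_REMATCH[1]}"
    echo "2 → ${BASH_REMATCH[2]}"
fi
```

With the test url above, this should print the two values listed as the expected result in the description. The approach only works because the delimiters are single characters; a multi-character delimiter would need a different pattern.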