Backslash mysteriously disappears in command expansion when unescaping would reference an existing file
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -g -O2 -fdebug-prefix-map=/build/bash-Dl674z/bash-5.0=. -fstack-protector-strong -Wformat -Werror=format-security -Wall -Wno-parentheses -Wno-format-security
uname output: Linux d-us6a-ubuntu-03 5.0.0-13-generic #14-Ubuntu SMP Mon Apr 15 14:59:14 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 5.0
Patch Level: 3
Release Status: release

Description:
        Backslash mysteriously disappears in command expansion when
        unescaping would reference an existing file

Repeat-By:
        > touch a\$.class
        > for i in $(echo "a\\\$.class"); do echo "$i"; done
        a$.class
        > rm a\$.class
        > for i in $(echo "a\\\$.class"); do echo "$i"; done
        a\$.class

        The existence or not of the file should not have any effect.
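
The Repeat-By steps can be run in isolation with a script along these lines. This is a minimal sketch, not part of the original report: the scratch directory created with mktemp and the helper name "show" are assumptions made for illustration, and printf is used in place of echo only so that the result does not depend on the shell's echo implementation. No expected output is stated for the "file present" case, since it depends on the shell and its version.

```shell
# Sketch of the Repeat-By steps, run in a scratch directory (an assumption,
# so that no real files are touched).
tmpdir=$(mktemp -d) || exit 1
cd "$tmpdir" || exit 1

# Hypothetical helper: the unquoted command substitution produces the
# string a\$.class, which is word-split and subjected to pathname
# expansion; each resulting word is then printed.
show() {
    for i in $(printf '%s\n' 'a\$.class'); do
        printf '%s\n' "$i"
    done
}

touch 'a$.class'
with_file=$(show)       # the report shows a$.class here on bash 5.0.3
rm 'a$.class'
without_file=$(show)    # nothing can match, so the word is left unchanged

printf 'file present: %s\n' "$with_file"
printf 'file absent:  %s\n' "$without_file"

cd / && rm -rf "$tmpdir"
```

Comparing the two printed lines shows whether the shell in use lets the presence of the file change the result.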
Re: Backslash mysteriously disappears in command expansion when unescaping would reference an existing file
On 5/22/19 5:43 AM, Greg Wooledge wrote:
> On Wed, May 22, 2019 at 05:25:43PM +0700, Robert Elz wrote:
>> Date: Tue, 21 May 2019 22:11:20 +
>> From: Charles-Henri Gros
>> Message-ID:
>>
>> | The existence or not of the file should not have any effect.
>>
>> But it does, and is intended to. If the pattern matches a file
>> (when pathname expanded as a result of the unquoted command substitution)
>> you get the file name produced. If it does not match a file,
>> the pattern is left untouched. That is the way that things are
>> supposed to work.
> With glob metacharacters, sure. But none of the characters in his
> variable are glob metacharacters.
>
> There is definitely something weird happening here.
>
> wooledg:/tmp/x$ echo "$BASH_VERSION"
> 5.0.3(1)-release
> wooledg:/tmp/x$ touch 'a$.class'
> wooledg:/tmp/x$ i='a\$.class'; echo {$i} "{$i}"
> {a\$.class} {a\$.class}
> wooledg:/tmp/x$ i='a\$.class'; echo $i "{$i}"
> a$.class {a\$.class}
>
> Other versions of bash, plus ksh and dash, don't behave this way.
>
> wooledg:/tmp/x$ bash-2.05b
> wooledg:/tmp/x$ i='a\$.class'; echo $i "{$i}"
> a\$.class {a\$.class}
>
> wooledg:/tmp/x$ bash-4.4
> wooledg:/tmp/x$ i='a\$.class'; echo $i "{$i}"
> a\$.class {a\$.class}
>
> wooledg:/tmp/x$ ksh
> $ i='a\$.class'; echo $i "{$i}"
> a\$.class {a\$.class}
>
> wooledg:/tmp/x$ dash
> $ i='a\$.class'; echo $i "{$i}"
> a\$.class {a\$.class}
>
> It seems to be unique to bash 5. If it's a bug fix, then I'm not
> understanding the rationale. Backslashes shouldn't be consumed during
> glob expansion.
>
> This is also not limited to $ alone. It happens with letters too.
>
> wooledg:/tmp/x$ touch i
> wooledg:/tmp/x$ i='\i' j='\j'
> wooledg:/tmp/x$ echo $i $j
> i \j
>
> Standard disclaimers apply. Stop using unquoted variables and these
> bugs will stop affecting you. Nevertheless, Chet may want to take a
> peek.

What unquoted variables? Are you talking about the "$()" expansion?

The problem I'm trying to solve is to iterate over regex-escaped file names obtained from a "find" command. I don't know how to make this work. It works with other versions of bash and with other shells.

The original is closer to something like this:

for file in $(find ... | sed 's/\$/\\$/g'); do grep -e "$file" someinput; done

It used to work. Now it doesn't. I do not know how to make it work again.

--
Charles-Henri Gros
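
One way to keep word splitting while avoiding pathname expansion on the result of an unquoted command substitution is to turn globbing off with "set -f" around the loop. The following is a minimal sketch under stated assumptions, not a fix proposed in the thread: the scratch directory, the two file names, and the contents of "someinput" are made up for illustration, and file names are assumed to contain no whitespace.

```shell
# Scratch setup (all names here are assumptions for illustration).
tmpdir=$(mktemp -d) || exit 1
cd "$tmpdir" || exit 1
touch 'a$.class' 'b.class'
printf '%s\n' 'refers to a$.class' > someinput

set -f    # disable pathname expansion; word splitting still happens
matches=
for file in $(find . -name '*.class' | sed 's|^\./||; s/\$/\\$/g'); do
    # "$file" is now a regex such as a\$.class; record it if it matches.
    if grep -q -e "$file" someinput; then
        matches="$matches$file "
    fi
done
set +f    # restore pathname expansion

printf 'patterns that matched: %s\n' "$matches"
cd / && rm -rf "$tmpdir"
```

With globbing disabled, each word produced by the command substitution reaches the loop body exactly as sed printed it, whether or not a file of a similar name exists.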
Re: Backslash mysteriously disappears in command expansion when unescaping would reference an existing file
On 5/22/19 10:47 AM, Greg Wooledge wrote:
> On Wed, May 22, 2019 at 05:34:22PM +0000, Charles-Henri Gros wrote:
>> On 5/22/19 5:43 AM, Greg Wooledge wrote:
>>> Standard disclaimers apply. Stop using unquoted variables and these
>>> bugs will stop affecting you. Nevertheless, Chet may want to take a
>>> peek.
>> What unquoted variables? Are you talking about the "$()" expansion?
> Yes. I used a variable instead of a command substitution to make it
> easier to reproduce the problem. Both have the same behavior in this
> case.

That's what I find a bit surprising (but shells are complicated, so maybe this is right; all I know is that the code used to work). I didn't think glob expansions applied to command expansions. All I want here is word splitting (which is why I can't use quotes).

>
>> The problem I'm trying to solve is to iterate over regex-escaped file
>> names obtained from a "find" command. I don't know how to make this
>> work. It works with other versions of bash and with other shells.
> First step: do not "regex-escape" them, whatever that means. Just use
> the actual filenames as printed by find -print0.
>
>> The original is closer to something like this:
>>
>> for file in $(find ... | sed 's/\$/\\$/g'); do grep -e "$file"
>> someinput; done
> Yeah, that's just the wrong approach. It's also the first thing on
> the BashPitfalls page[1] (for a good reason).
>
> You have two choices here:
>
> 1) Use find -exec.
>
>    find ... -exec grep -e someinput /dev/null {} +
>
> 2) Use find -print0 and a bash while read loop. (NOT a for loop.)
>
>    find ... -print0 |
>    while IFS= read -rd '' file; do
>      something "$file"
>    done
>
>    (A variant of this uses < <() instead of a pipeline, so that the while
>    loop runs in the main shell and variable assignments can persist.)
>
> Since you only show a simple grep as your action, find -exec is a better
> choice for this problem. (Assuming you didn't fatally misrepresent the
> problem.) Calling grep once for every file would be inefficient.

I don't think I fatally misrepresented the problem; however, I do think that you fatally misunderstood it. (FWIW, I know about -print0 and xargs -0.)

The file name is the regex (the argument to "-e"), not the file "grep" reads. I want to check that some text file contains a reference to a file.

But it looks like this would work:

for file in $(find ...); do grep -e "$(echo -n "$file" | sed 's/\$/\\$/g')" someinput; done

--
Charles-Henri Gros
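
The escaping step above can also be combined with a read loop, so that the file name is never expanded unquoted. This is a minimal sketch under stated assumptions: it reads newline-delimited names rather than using the -print0 and read -d '' form quoted earlier (so it assumes no newlines in file names), and the scratch directory, file names, and contents of "someinput" are made up for illustration.

```shell
# Scratch setup (all names here are assumptions for illustration).
tmpdir=$(mktemp -d) || exit 1
cd "$tmpdir" || exit 1
touch 'a$.class' 'b.class'
printf '%s\n' 'refers to a$.class' > someinput

# Read names one line at a time. "$file" is always quoted, so it undergoes
# neither word splitting nor pathname expansion.
found=$(
    find . -name '*.class' |
    while IFS= read -r file; do
        name=${file#./}                                   # drop leading ./
        regex=$(printf '%s' "$name" | sed 's/\$/\\$/g')   # escape "$" only
        if grep -q -e "$regex" someinput; then
            printf '%s\n' "$name"
        fi
    done
)

printf 'referenced: %s\n' "$found"
cd / && rm -rf "$tmpdir"
```

Here the file name is the pattern given to grep -e and "someinput" is the file being searched, matching the use case described in this message.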
Re: Backslash mysteriously disappears in command expansion when unescaping would reference an existing file
On 5/22/19 3:13 PM, Robert Elz wrote:
> Date: Wed, 22 May 2019 17:34:22 +
> From: Charles-Henri Gros
> Message-ID:
>
> | The problem I'm trying to solve is to iterate over regex-escaped file
> | names obtained from a "find" command. I don't know how to make this
> | work. It works with other versions of bash and with other shells.
>
> You were relying upon a common bug, which has been fixed in bash, but
> your technique is all wrong, you don't need any kind of loop at all, not
> a for loop, and not the while read loop that Greg suggested.
>
> find -print produces a list of names, one per line. Those are simple
> strings, which fgrep (or grep -F as Andreas suggested) can handle finding.
>
> What I'd do is
>
> fgrep "$(find -print)" wherever

Interesting, I didn't realize you could pass newline-separated patterns to "grep" on the command line. Good to know for the future.

But unfortunately, grep was just illustrative. I'm using another tool that takes a regex but has no "-F" option (though admittedly, with some effort, I could add one; I wrote the tool in question).

>
> (You can use grep -F if you have an aversion to using its traditional name,
> but fgrep was once a different program to grep / egrep).
>
> This version will have a problem with filenames with embedded newlines,
> but so did your original, so I am simply assuming that you have none of
> those (using any variant of grep to search for strings containing newlines
> tends to be "difficult" as grep is a line at a time tool).

Yes, I'm not expecting any special characters except "$".

>
> If your version of grep cannot handle the pattern list not having a
> terminating \n (the $() removes it) then you can add it back
>
> fgrep "$(find ... -print)"$'\n' wherever.
>
> You're probably still going to need a | into sed inside the command
> substitution, as I doubt that you actually want to look for filenames
> in the format that find prints them (you have never shown your actual
> command) and I suspect that you want to delete the pathname component
> (a leading "./" or whatever) and it isn't clear what you want to
> happen with filenames in subdirectories. But none of those manipulations
> will affect anything.
>
> The other difference between this method and the one that you were
> using, is that this one will mix up the output for all of the different
> file names (it reads the target files just once, looking for all of the
> filenames simultaneously) whereas your original scheme looked for each
> file name in the target sequentially (re-reading the target file(s) over
> and over again for each new file name). That would group output lines
> for each file name together, whereas the technique above does not.

--
Charles-Henri Gros
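
The loop-free approach quoted above, with the suggested sed step to strip the directory part, might look like the following. This is a minimal sketch under stated assumptions: the "classes" directory, the file names, and the contents of "someinput" are made up for illustration, only base names are searched for, and no file name contains a newline.

```shell
# Scratch setup (all names here are assumptions for illustration).
tmpdir=$(mktemp -d) || exit 1
cd "$tmpdir" || exit 1
mkdir classes
touch 'classes/a$.class' 'classes/b.class'
printf '%s\n' 'refers to a$.class' 'refers to c.class' > someinput

# One fixed string per line; sed strips everything up to the last "/".
patterns=$(find classes -type f | sed 's|.*/||')

# -F treats each line of the pattern list as a literal string, so "$" and
# "." need no escaping. The input is read once for all patterns.
hits=$(grep -F -e "$patterns" someinput)

printf '%s\n' "$hits"
cd / && rm -rf "$tmpdir"
```

Because grep reads the input once for the whole pattern list, matching lines come out in input order rather than grouped by file name, as noted in the quoted message.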