Re: Including \t, \T or \A at beginning of PS1 causes line wrap issues

2014-01-17 Thread David C. Rankin

On 01/17/2014 12:17 AM, Chris Down wrote:
> +Cc: bug-bash
> 
> Please do not take discussions off-list, it decreases the value of
> conversations for future readers.

Chris, I apologize, I should have caught the reply-address was yours and not the
list's.

> 
> On 2014-01-16 23:13:35 -0600, David C. Rankin wrote:
>> A sanitized version w/o color.
> 
> That detail matters -- please include the full scope of the issue in
> future when reporting issues. My first thought when hearing about
> wrapping issues is always that there are improperly stated zero-width
> sequences in the prompt.
> 
>> The full prompt exported in .bashrc is:
>>
>> PS1="\[\e[0;37m\]\A\[\e[1;34m\] \h:\w> \[\e[0m\]"
>>
>>   also tested with:
>>
>> PS1="\[\e[0;37m\]\D{%R}\[\e[1;34m\] \h:\w> \[\e[0m\]"
> 
> Those actually look okay to me (and I can't seem to reproduce your
> issue, unless I misunderstood the steps), but maybe I'm missing
> something. Either way, just use "tput", it will save you from a world of
> pain, and it actually uses the terminfo database:
> 
> PS1='\[$(tput setaf 7)\]\A\[$(tput bold)$(tput setaf 4)\] \h:\w> \[$(tput sgr0)\]'
> 
> You can also put those in variables to avoid calling "tput" every time
> the prompt is drawn.
> 

  Yes, I pick through the zero-width sequences and ensure they always begin with
\[ and are terminated with \]. I have seen/read about using tput and assigning
color names to the tput calls, but had not tried it yet. I suspect it would not
have made any difference in this case. I believe we were dealing with a kernel
bug that was fixed upon updating to a new kernel today.
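
  For reference, a minimal sketch of the "tput in variables" approach suggested
above, assuming a terminfo-aware terminal. The variable names c_white, c_blue
and c_reset are hypothetical and not from the original messages; each falls
back to an empty string if tput or the capability is unavailable.

```shell
#!/usr/bin/env bash
# Capture each terminfo sequence once, e.g. in .bashrc, instead of running
# tput every time the prompt is drawn. Names are illustrative only.
c_white=$(tput setaf 7 2>/dev/null || true)
c_blue=$(tput bold 2>/dev/null || true)$(tput setaf 4 2>/dev/null || true)
c_reset=$(tput sgr0 2>/dev/null || true)

# Each zero-width sequence is still wrapped in \[ ... \] so that readline
# does not count it toward the visible width of the prompt.
PS1="\[${c_white}\]\A\[${c_blue}\] \h:\w> \[${c_reset}\]"
```

  Because the sequences are expanded once at assignment time, drawing the
prompt no longer spawns any subprocesses.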

  I believe the problem started when I upgraded linux (3.12.5-1 -> 3.12.6-1).
Today I upgraded linux (3.12.6-1 -> 3.12.7-2) and the problem seems to have
resolved itself. Even using:

  PS1="\[\e[0;37m\]\D{%R}\[\e[1;34m\] \h:\w> \[\e[0m\]"

  As mentioned earlier, I have experienced these line-wrap issues intermittently
over the past decade. When they are reported (usually to the Linux distro bug
list), they have generally been resolved at that level. This time, I had not
seen a line-wrap issue on Archlinux for at least the past 4 years. Looking at
the logs I saw the bash 4.2.045-5 install on 11/27 and concluded it was likely
connected. I still cannot confirm the kernel was 100% the problem, but
regardless, I am not seeing the issue after the update.

  Thank you for your time and the tput suggestion. You guys have a good weekend.

-- 
David C. Rankin, J.D.,P.E.



Re: DEL character treated specially when preceded by a backslash when used in the RHS of the regex operator ([[ $'\177' =~ $'\\\177' ]])

2014-01-17 Thread Greg Wooledge
On Thu, Jan 16, 2014 at 03:46:38PM -0800, Eduardo A. Bustamante López wrote:
> [[ $'\a' =~ $'\a' ]] -> 0
> [[ $'\a' =~ $'\\\a' ]] -> 0
> [[ $'\a' =~ $'\\[\a]' ]] -> 1
> ---
> [[ $'\177' =~ $'\177' ]] -> 0
> [[ $'\177' =~ $'\\\177' ]] -> 1
> [[ $'\177' =~ $'\\[\177]' ]] -> 1

> Notice that only $'\177' seems to fail in the middle case, while the
> others work just fine.

Maybe I'm missing something, but why do you consider that to be wrong?
I would expect [[ x =~ yx ]] to fail (return 1) every time.  In fact
I'd consider this result:

> [[ $'\a' =~ $'\\\a' ]] -> 0

to be wrong.  And in fact, on my version of bash I get a different result:

imadev:~$ [[ $'\a' =~ $'\\\a' ]] ; echo $?
1
imadev:~$ echo $BASH_VERSION
4.3.0(1)-beta2

I get a return value of 1 in every version of bash I tried except
bash 3.1, which returns 0, and bash 2.05b, which gives a syntax
error because it doesn't have =~ yet.
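
The expectation stated above for an ordinary prefix character can be checked
with a short sketch. The variable names r1 and r2 are illustrative only.

```shell
#!/usr/bin/env bash
# The regex "yx" requires a literal "y" immediately before "x",
# so it cannot match the one-character string "x".
[[ x  =~ yx ]] && r1=0 || r1=1
[[ yx =~ yx ]] && r2=0 || r2=1
printf '%d %d\n' "$r1" "$r2"   # prints "1 0"
```

This covers only the case where the prefix is an ordinary character; it says
nothing about a backslash prefix, which is the case under discussion.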



Re: DEL character treated specially when preceded by a backslash when used in the RHS of the regex operator ([[ $'\177' =~ $'\\\177' ]])

2014-01-17 Thread Chet Ramey
On 1/16/14 6:46 PM, Eduardo A. Bustamante López wrote:
> The DEL ($'\177') character does not behave like the other control
> characters when used with the regex operator inside the test keyword.

This has to do with the expansion of $r and that $r includes a backslash.
When combined with the internal quoting bash does, and the fact that the
backslash is special to pattern matching, we end up with this problem.
I've only thought about it a little so far, but I don't know if there's a
quick or simple fix.  This may have to wait until after bash-4.3 is
released.

Chet

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: DEL character treated specially when preceded by a backslash when used in the RHS of the regex operator ([[ $'\177' =~ $'\\\177' ]])

2014-01-17 Thread Chet Ramey
On 1/17/14 8:01 AM, Greg Wooledge wrote:
> On Thu, Jan 16, 2014 at 03:46:38PM -0800, Eduardo A. Bustamante López wrote:
>> [[ $'\a' =~ $'\a' ]] -> 0
>> [[ $'\a' =~ $'\\\a' ]] -> 0
>> [[ $'\a' =~ $'\\[\a]' ]] -> 1
>> ---
>> [[ $'\177' =~ $'\177' ]] -> 0
>> [[ $'\177' =~ $'\\\177' ]] -> 1
>> [[ $'\177' =~ $'\\[\177]' ]] -> 1
> 
>> Notice that only $'\177' seems to fail in the middle case, while the
>> others work just fine.
> 
> Maybe I'm missing something, but why do you consider that to be wrong?
> I would expect [[ x =~ yx ]] to fail (return 1) every time. 

There is a question about the correct behavior when y == '\', since the
backslash is special to pattern matching.  When matching a pattern or a
regexp, do you think x =~ \x should succeed, because the backslash acts
as an escape?

Chet

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: DEL character treated specially when preceded by a backslash when used in the RHS of the regex operator ([[ $'\177' =~ $'\\\177' ]])

2014-01-17 Thread Greg Wooledge
On Fri, Jan 17, 2014 at 08:53:07AM -0500, Chet Ramey wrote:
> On 1/17/14 8:01 AM, Greg Wooledge wrote:
> > I would expect [[ x =~ yx ]] to fail (return 1) every time. 
> 
> There is a question about the correct behavior when y == '\', since the
> backslash is special to pattern matching.  When matching a pattern or a
> regexp, do you think x =~ \x should succeed, because the backslash acts
> as an escape?

OK, I see your point.  Here are some more experiments:

imadev:~$ [[ x =~ \x ]] ; echo $?
0
imadev:~$ bs='\'
imadev:~$ [[ x =~ ${bs}x ]] ; echo $?
0
imadev:~$ [[ x =~ $'\\'x ]] ; echo $?
1

You get to decide which one(s) are bugs. ;-)

I chose the last one because $'...' is a form of quoting, and quoting
on the right hand side of =~ removes the specialness of things, which
muddies the waters greatly.  The use of $'\\...' in the original question
led me to this, which may or may not be a tangential issue.
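
A minimal sketch of the quoting effect described above, assuming bash 3.2 or
later with default compatibility settings. The pattern 'a.c' and the variable
names are hypothetical, chosen only to make the difference visible.

```shell
#!/usr/bin/env bash
re='a.c'   # "." is a regex metacharacter when the expansion is unquoted

# Unquoted expansion: the value is interpreted as a regular expression.
[[ abc =~ $re ]]   && unquoted=0       || unquoted=1

# Quoted expansion: every character of the value is matched literally.
[[ abc =~ "$re" ]] && quoted=0         || quoted=1
[[ a.c =~ "$re" ]] && quoted_literal=0 || quoted_literal=1

printf 'unquoted=%d quoted=%d quoted_literal=%d\n' \
    "$unquoted" "$quoted" "$quoted_literal"
# prints: unquoted=0 quoted=1 quoted_literal=0
```

The same loss of specialness applies to $'...' on the right-hand side, which
is why the third experiment differs from the first two.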

In order to remove that issue, I would have written the original question
this way:

imadev:~$ x=x; bs='\'; [[ $x = $bs$x ]] ; echo $?
0
imadev:~$ x=$'\177'; bs='\'; [[ $x = $bs$x ]] ; echo $?
1



Re: DEL character treated specially when preceded by a backslash when used in the RHS of the regex operator ([[ $'\177' =~ $'\\\177' ]])

2014-01-17 Thread Dan Douglas
On Fri, Jan 17, 2014 at 8:07 AM, Greg Wooledge wrote:
> On Fri, Jan 17, 2014 at 08:53:07AM -0500, Chet Ramey wrote:
>> On 1/17/14 8:01 AM, Greg Wooledge wrote:
>> > I would expect [[ x =~ yx ]] to fail (return 1) every time.
>>
>> There is a question about the correct behavior when y == '\', since the
>> backslash is special to pattern matching.  When matching a pattern or a
>> regexp, do you think x =~ \x should succeed, because the backslash acts
>> as an escape?
>
> OK, I see your point.  Here are some more experiments:
>
> imadev:~$ [[ x =~ \x ]] ; echo $?
> 0
> imadev:~$ bs='\'
> imadev:~$ [[ x =~ ${bs}x ]] ; echo $?
> 0
> imadev:~$ [[ x =~ $'\\'x ]] ; echo $?
> 1
>
> You get to decide which one(s) are bugs. ;-)

Looks a lot like that other issue in which the function of escapes and quotes
is ambiguous, depending on whether they're literal, during which expansion
phase they are processed, and in which context they appear.

I covered a lot of possibilities with this test but there could be more:
http://permalink.gmane.org/gmane.comp.shells.bash.bugs/20564

IIRC ksh eventually changed some of their behavior and bash might have too. It
applies to both pattern matching and pathname expansion.



Re: DEL character treated specially when preceded by a backslash when used in the RHS of the regex operator ([[ $'\177' =~ $'\\\177' ]])

2014-01-17 Thread Eduardo A . Bustamante López
On Fri, Jan 17, 2014 at 08:43:46AM -0500, Chet Ramey wrote:
> On 1/16/14 6:46 PM, Eduardo A. Bustamante López wrote:
> > The DEL ($'\177') character does not behave like the other control
> > characters when used with the regex operator inside the test keyword.
> 
> This has to do with the expansion of $r and that $r includes a backslash.
> When combined with the internal quoting bash does, and the fact that the
> backslash is special to pattern matching, we end up with this problem.
> I've only thought about it a little so far, but I don't know if there's a
> quick or simple fix.  This may have to wait until after bash-4.3 is
> released.
> 
> Chet
I understand that the backslash preceding a character *could* make it
not match, though $'\177' is the *only* non-graphic character that
has this behavior.

This should make it more clear:

ubuntu@ubuntu:~$ for c in $'\001' $'\r' $'\177' $'\277' $'\377'; do
> r="\\$c"; [[ $c =~ $r ]]; printf 'c=%q r=%q %d\n' "$c" "$r" "$?"
> done
c=$'\001' r=$'\\\001' 0
c=$'\r' r=$'\\\r' 0
c=$'\177' r=$'\\\177' 1
c=$'\277' r=$'\\\277' 0
c=$'\377' r=$'\\\377' 0

My question is more about why $'\177' behaves differently from the other
characters than about whether the preceding backslash should make it match
or not.

That is, I would expect either of these two outputs:

O1:
c=$'\001' r=$'\\\001' 1
c=$'\r' r=$'\\\r' 1
c=$'\177' r=$'\\\177' 1
c=$'\277' r=$'\\\277' 1
c=$'\377' r=$'\\\377' 1

O2:
c=$'\001' r=$'\\\001' 0
c=$'\r' r=$'\\\r' 0
c=$'\177' r=$'\\\177' 0
c=$'\277' r=$'\\\277' 0
c=$'\377' r=$'\\\377' 0

But the real output shows that the case for c=$'\177' is treated
specially:

c=$'\001' r=$'\\\001' 0
c=$'\r' r=$'\\\r' 0
c=$'\177' r=$'\\\177' 1 <-- this one behaves differently.
c=$'\277' r=$'\\\277' 0
c=$'\377' r=$'\\\377' 0
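
For anyone who wants to repeat the comparison on another bash version, the
interactive loop above can be wrapped in a small script. This is only a
sketch: the function name check_escaped_match is hypothetical, and no output
is shown because the results depend on the bash version and locale.

```shell
#!/usr/bin/env bash
# check_escaped_match (hypothetical name): succeed if the character in $1
# matches the regex formed by a backslash followed by that same character.
check_escaped_match() {
    local c=$1 r
    r="\\$c"
    [[ $c =~ $r ]]
}

for c in $'\001' $'\r' $'\177' $'\277' $'\377'; do
    # Record the exit status without aborting if the match fails.
    if check_escaped_match "$c"; then status=0; else status=$?; fi
    printf 'c=%q r=%q %d\n' "$c" "\\$c" "$status"
done
```

A consistent result would show the same status on every line, as in O1 or O2.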


---
Now, regarding the issue of whether the backslash should be treated
in a special way, or treated literally, the only thing I can
contribute is the behavior of GNU sed, which handles non-graphic
characters preceded by a backslash the same as the individual
character:

ubuntu@ubuntu:~$ cat sed
mapfile -t chars < <(
printf '\\x%x\n' {1..255} | while read -r c; do printf "$c"'\n'; done
);

for sed in sed 'sed -r'; do
printf -- '--- sed: %s ---\n' "$sed"
for c in "${chars[@]}"; do
printf '%q > %q\n' "$c" "$(printf %s\\n "$c" | $sed "s/\\$c//" 2>&1)"
done | grep -v "''\$"
done
ubuntu@ubuntu:~$ bash sed
--- sed: sed ---
'' > sed:\ -e\ expression\ #1\,\ char\ 5:\ unterminated\ \`s\'\ command
'' > sed:\ -e\ expression\ #1\,\ char\ 5:\ unterminated\ \`s\'\ command
\' > \'
\( > sed:\ -e\ expression\ #1\,\ char\ 6:\ Unmatched\ \(\ or\ \\\(
\) > sed:\ -e\ expression\ #1\,\ char\ 6:\ Unmatched\ \)\ or\ \\\)
1 > sed:\ -e\ expression\ #1\,\ char\ 6:\ Invalid\ back\ reference
2 > sed:\ -e\ expression\ #1\,\ char\ 6:\ Invalid\ back\ reference
3 > sed:\ -e\ expression\ #1\,\ char\ 6:\ Invalid\ back\ reference
4 > sed:\ -e\ expression\ #1\,\ char\ 6:\ Invalid\ back\ reference
5 > sed:\ -e\ expression\ #1\,\ char\ 6:\ Invalid\ back\ reference
6 > sed:\ -e\ expression\ #1\,\ char\ 6:\ Invalid\ back\ reference
7 > sed:\ -e\ expression\ #1\,\ char\ 6:\ Invalid\ back\ reference
8 > sed:\ -e\ expression\ #1\,\ char\ 6:\ Invalid\ back\ reference
9 > sed:\ -e\ expression\ #1\,\ char\ 6:\ Invalid\ back\ reference
\< > \<
\> > \>
B > B
W > W
\` > \`
a > a
b > b
c > sed:\ -e\ expression\ #1\,\ char\ 6:\ Trailing\ backslash
f > f
n > n
r > r
s > s
t > t
v > v
\{ > sed:\ -e\ expression\ #1\,\ char\ 6:\ Invalid\ preceding\ regular\ expression
\| > \|
--- sed: sed -r ---
'' > sed:\ -e\ expression\ #1\,\ char\ 5:\ unterminated\ \`s\'\ command
'' > sed:\ -e\ expression\ #1\,\ char\ 5:\ unterminated\ \`s\'\ command
\' > \'
1 > sed:\ -e\ expression\ #1\,\ char\ 6:\ Invalid\ back\ reference
2 > sed:\ -e\ expression\ #1\,\ char\ 6:\ Invalid\ back\ reference
3 > sed:\ -e\ expression\ #1\,\ char\ 6:\ Invalid\ back\ reference
4 > sed:\ -e\ expression\ #1\,\ char\ 6:\ Invalid\ back\ reference
5 > sed:\ -e\ expression\ #1\,\ char\ 6:\ Invalid\ back\ reference
6 > sed:\ -e\ expression\ #1\,\ char\ 6:\ Invalid\ back\ reference
7 > sed:\ -e\ expression\ #1\,\ char\ 6:\ Invalid\ back\ reference
8 > sed:\ -e\ expression\ #1\,\ char\ 6:\ Invalid\ back\ reference
9 > sed:\ -e\ expression\ #1\,\ char\ 6:\ Invalid\ back\ reference
\< > \<
\> > \>
B > B
W > W
\` > \`
a > a
b > b

As you can see, all non-graphic characters are treated the same as the
non-graphic character preceded by a backslash. I do not know how
other regex engines treat this case.



---
In case you're interested in why I care about this issue ($'\177'),
see the special case I had to make in the ''requote2'' function for
it to work in this case:

https://github.com/lhunath/scripts/issues/3#issuecomment-32551132



---
Regarding the Cygwin issue:

$ for c in $'\177' $'\200' $'\277' $'\376' $'\377'; do
> r=$c; [[ $c =~ $r ]]; printf 'c=%q r=%q %d\n' "$c" "$r" "$?";
> done;
$ echo "$BASH_VERSION $OS"
c=$'\177' r=$'\177'

Re: DEL character treated specially when preceded by a backslash when used in the RHS of the regex operator ([[ $'\177' =~ $'\\\177' ]])

2014-01-17 Thread Chet Ramey
On 1/17/14 2:30 PM, Eduardo A. Bustamante López wrote:
> On Fri, Jan 17, 2014 at 08:43:46AM -0500, Chet Ramey wrote:
>> On 1/16/14 6:46 PM, Eduardo A. Bustamante López wrote:
>>> The DEL ($'\177') character does not behave like the other control
>>> characters when used with the regex operator inside the test keyword.
>>
>> This has to do with the expansion of $r and that $r includes a backslash.
>> When combined with the internal quoting bash does, and the fact that the
>> backslash is special to pattern matching, we end up with this problem.
>> I've only thought about it a little so far, but I don't know if there's a
>> quick or simple fix.  This may have to wait until after bash-4.3 is
>> released.
>>
>> Chet
> I understand that the backslash preceding a character *could* make it
> to not match, though $'\177' is the *only* non-graphic character that
> has this behavior.

No, I understand the problem.  The DEL is special to bash's internal
expansion and quoting (this just came up in another message earlier
this week).  That, combined with the slightly inconsistent word expansion
bash does for the operands in the [[ command and the fact that there is a
backslash in the expansion, results in a problem that seemingly affects
only DEL.  There are other characters that aren't handled exactly right,
including \001, but fortunate circumstances make those cases work.  And
it's not that the backslash gets passed to the regexp engine, but that its
presence causes bash to do the wrong thing when there is a DEL in the
string.

It's not understanding the problem, or the combination of things that
causes it, but figuring out the right solution.

Chet
-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: DEL character treated specially when preceded by a backslash when used in the RHS of the regex operator ([[ $'\177' =~ $'\\\177' ]])

2014-01-17 Thread Eduardo A . Bustamante López
> It's not understanding the problem, or the combination of things that
> causes it, but figuring out the right solution.
> 
> Chet
I see, my mistake. I thought I explained it wrong so I wanted to make
it clear. 

Thanks!

-- 
Eduardo Alan Bustamante López