bug report

2021-04-23 Thread john

From: john
To: bug-bash@gnu.org
Subject: ls dumps bash

Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -march=x86-64 -mtune=generic -O2 -pipe -fno-plt
-DDEFAULT_PATH_VALUE='/usr/local/sbin:/usr/local/bin:/usr/bin'
-DSTANDARD_UTILS_PATH='/usr/bin' -DSYS_BASHRC='/etc/bash.bashrc'
-DSYS_BASH_LOGOUT='/etc/bash.bash_logout' -DNON_INTERACTIVE_LOGIN_SHELLS
uname output: Linux john-arch 5.11.16-arch1-1 #1 SMP PREEMPT Wed, 21 Apr 
2021 17:22:13 + x86_64 GNU/Linux

Machine Type: x86_64-pc-linux-gnu

Bash Version: 5.1
Patch Level: 4
Release Status: release

Description:
    After the two commands as specified, the bash session ends unexpectedly.


Repeat-By:
    set -e extglob
    ls ?(0)9,v
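For readers trying the repro: extglob is a shopt option, not a `set` option, so the first line does not actually enable extended globs. A minimal sketch of enabling extglob and what the `?(0)` pattern matches (editor's illustration, not from the report):

```shell
#!/bin/bash
# `set -e extglob` enables errexit and leaves "extglob" as a positional
# parameter; extended globs are enabled with shopt instead:
shopt -s extglob
# ?(0) matches zero or one "0", so both strings match the reported pattern:
[[ "09,v" == ?(0)9,v ]] && echo match-with-zero
[[ "9,v"  == ?(0)9,v ]] && echo match-without-zero
```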




Issue with parameter and command substitution in case statement

2008-08-22 Thread John
Configuration Information [Automatically generated, do not change]:
Machine: i686
OS: linux-gnu
Compiler: i686-pc-linux-gnu-gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='i686' 
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='i686-pc-linux-gnu' 
-DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL 
-DHAVE_CONFIG_H   -I.  -I. -I./include -I./lib   -msse3 -O2 -march=athlon64 
-pipe
uname output: Linux strythbox 2.6.24.3 #2 Thu Mar 13 10:21:15 EDT 2008 i686 AMD 
Athlon(tm) 64 Processor 3800+ AuthenticAMD GNU/Linux
Machine Type: i686-pc-linux-gnu

Bash Version: 3.2
Patch Level: 39
Release Status: release

Description:
the gnu bash manual indicates:
"The syntax of the case command is:
   case word in [ [(] pattern [| pattern]...) command-list ;;]... esac"
and further that:
"Each pattern undergoes tilde expansion, parameter expansion, command 
substitution, and arithmetic expansion."

While command substitution and parameter expansion occur, the resulting 
strings are not parsed in a fashion consistent with the normal behavior.

Take, for example, the string "1|2".  When it appears literally, as in
case 2 in 1|2) ..., the function execute_case_command reaches the strmatch
line twice: once as strmatch(2,1) and once as strmatch(2,2).  Obviously
strmatch(2,2) is the match.

On the other hand, when a variable VAR='1|2' is used (case 2 in $VAR ...),
the single comparison performed is strmatch(2,1|2), which fails; "1|2" is
never broken into its component patterns.
Repeat-By:

#! /bin/bash

VAR='1|2'
echo test is 'case 2 in $VAR=1|2'
case 2 in
  $VAR) echo good ;;
  *) echo fail ;;
esac

echo
echo test is 'case 2 in 1|2'
case 2 in
  1|2) echo good ;;
  *) echo fail ;;
esac

echo
VAR='2'
echo test is 'case 2 in $VAR=2'
case 2 in
  $VAR) echo good ;;
  *) echo fail ;;
esac

echo
echo test is 'case 2 in 2'
case 2 in
  2) echo good ;;
  *) echo fail ;;
esac

echo
echo test is "case 2 in \`echo '1|2'\`"
case 2 in
  `echo '1|2'`) echo good ;;
  *) echo fail ;;
esac

echo
echo test is 'case 2 in 1|2'
case 2 in
  1|2) echo good ;;
  *) echo fail ;;
esac
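Not part of the original report, but a workaround often suggested for this behavior is to store an extglob alternation in the variable, which survives expansion as a single pattern (sketch, assumes bash with extglob enabled):

```shell
#!/bin/bash
shopt -s extglob
VAR='@(1|2)'   # an extglob alternation is one pattern after expansion
case 2 in
  $VAR) echo good ;;
  *)    echo fail ;;
esac
```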




  




Bash history weirdness

2009-10-03 Thread John
Hi,

Bash's history command is behaving oddly.

If I do "history -w" then it writes the current history to
~/.bash_history as expected.

But if I do "history -a" then ~/.bash_history doesn't get changed, and
from the modification time it hasn't been touched at all.

Any ideas for what might be causing this? Any obvious shell options or
suchlike that I should check?

Cheers,
John



Re: Bash history weirdness

2009-10-04 Thread John
On  3 Oct 2009 at 20:19, Wanna-Be Sys Admin wrote:
> John wrote:
>> If I do "history -w" then it writes the current history to
>> ~/.bash_history as expected.
>> 
>> But if I do "history -a" then ~/.bash_history doesn't get changed, and
>> from the modification time it hasn't been touched at all.
>
> Were they "new" history lines, or did you run that right after -w?

Yes, there was history to write!

Actually what happened was that I noticed "history -a" wasn't working,
then tried "history -w" and was surprised to find that it *did* work.
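For context (editor's sketch, not a diagnosis from the thread): `history -a` appends only lines added since the last append in the session, while `history -w` rewrites the whole file. The appending behavior can be observed even in a non-interactive shell:

```shell
#!/bin/bash
export HISTFILE=$(mktemp)
set -o history            # enable history in a non-interactive shell
history -s 'echo one'     # record an entry in the in-memory history list
history -a                # append the new entry to $HISTFILE
grep -c 'echo one' "$HISTFILE"
rm -f "$HISTFILE"
```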



Possible Typo for "Set" Section of Bash Reference Manual

2024-05-06 Thread John

Hi!

I believe the Bash Reference Manual is missing a key note for using "set 
-o".


On the man page for "bash" 
(https://tiswww.case.edu/php/chet/bash/bash.html), the following line is 
present


 * If *-o* is supplied with no /option-name/, the values of the current
   options are printed. If *+o* is supplied with no /option-name/, a
   series of *set* commands to recreate the current option settings is
   displayed on the standard output.

On the corresponding section for "Set" in the Bash Reference Manual 
(https://www.gnu.org/software/bash/manual/html_node/The-Set-Builtin.html), 
that line of text is *not present*.
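Both behaviors the man page describes are easy to confirm in any bash session (editor's sketch):

```shell
#!/bin/bash
# `set -o` with no option name lists current settings; `set +o` emits
# re-runnable `set` commands that recreate them:
set -o | grep noclobber   # listing form, e.g. "noclobber      off"
set +o | grep noclobber   # command form, e.g. "set +o noclobber"
```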


Best regards!


Re: Prefer non-gender specific pronouns

2021-06-05 Thread John Passaro
I can see a couple of reasons why it would be a good thing, and in the con
column only "I personally don't have time to go through the manual and make
these changes". But I'd happily upvote a patch from somebody who does.


On Sat, Jun 5, 2021, 09:24 Vipul Kumar 
wrote:

> Hi,
>
> Isn't it a good idea to prefer non-gender specific pronoun (like "their"
> instead of "his") at following places in the reference manual?
>
>  > leaving the user in a non-writable directory other than his home
> directory after login, [1]
>
>  > his home directory. The name of this file is taken from the value of
> the shell variable [2]
>
> [1]:
>
> https://www.gnu.org/software/bash/manual/html_node/The-Restricted-Shell.html#The-Restricted-Shell
> [2]:
>
> https://www.gnu.org/software/bash/manual/html_node/Readline-Init-File.html#Readline-Init-File
>
>
> Cheers,
> Vipul
>


Re: Prefer non-gender specific pronouns

2021-06-06 Thread John Passaro
Léa, I see that in the section Ilkka quoted you were using it in the
plural. However Ilkka is exactly right; despite "they" being technically
plural, using it for somebody of undetermined gender has been in the
mainstream since long before inclusive language. "Someone left *their*
book, there's no name and I don't know who to call."

The AP and Chicago style guides, hardly reckless proponents of any
progressive vanguard, endorse this usage, though they recommend working
around it if possible ("Somebody left *a* book"). However they do
unequivocally endorse using it for somebody who declares "they" to be their
pronoun (though for now that may not have much bearing on the manual).

On Sun, Jun 6, 2021, 07:49 Léa Gris  wrote:

> Le 06/06/2021 à 11:33, Ilkka Virta écrivait :
> > In fact, that generic 'they' is so common and accepted, that you just
> used
> > it yourself
> > in the part I quoted above.
>
> Either you're acting in bad faith, or you're so confused by your
> gender-neutral delusion that you don't remember that in normal people's
> grammar, "they" is a plural pronoun.
>
> --
> Léa Gris
>
>
>


Re: Light weight support for JSON

2022-08-28 Thread John Passaro
interfacing with an external tool absolutely seems like the correct answer
to me. a fact worth mentioning to back that up is that `jq` exists. billed
as a sed/awk for json, it fills all the functions you'd expect such an
external tool to have and many many more. interfacing from curl to jq to
bash is something i do on a near daily basis.

https://stedolan.github.io/jq/

On Sun, Aug 28, 2022, 09:25 Yair Lenga  wrote:

> Hi,
>
> Over the last few years, JSON data has become an integral part of processing.
> In many cases, I find myself having to automate tasks that require
> inspection of JSON responses, and in a few cases, construction of JSON. So
> far, I've taken one of two approaches:
> * For simple parsing, using 'jq' to extract elements of the JSON
> * For more complex tasks, switching to python or Javascript.
>
> Wanted to get feedback about the following "extensions" to bash that will
> make it easier to work with simple JSON object. To emphasize, the goal is
> NOT to "compete" with Python/Javascript (and other full scale language) -
> just to make it easier to build bash scripts that cover the very common use
> case of submitting REST requests with curl (checking results, etc), and to
> perform simple processing of JSON files.
>
> Proposal:
> * Minimal - Lightweight "json parser" that will convert JSON files to bash
> associative array (see below)
> * Convert bash associative array to JSON
>
> To the extent possible, prefer to borrow from jsonpath syntax.
>
> Parsing JSON into an associative array.
>
> Consider the following, showing all possible JSON values (boolean, number,
> string, object and array).
> {
> "b": false,
> "n": 10.2,
> "s": "foobar",
>  "x": null,
> "o" : { "n": 10.2,  "s": "xyz" },
>  "a": [
>  { "n": 10.2,  "s": "abc", "x": false },
>  {  "n": 10.2,  "s": "def", "x": true }
>  ]
> }
>
> This should be converted into the following array:
>
> -
>
> # Top level
> [_length] = 6# Number of keys in object/array
> [_keys] = b n s x o a# Direct keys
> [b] = false
> [n] = 10.2
> [s] = foobar
> [x] = null
>
> # This is object 'o'
> [o._length] = 2
> [o._keys] = n s
> [o.n] = 10.2
> [o.s] = xyz
>
> # Array 'a'
> [a._count] =  2   # Number of elements in array
>
> # Element a[0] (object)
> [a.0._length] = 3
> [a.0._keys] = n s x
> [a.0.n] = 10.2
> [a.0.s] = abc
> [a.0.x] = false
>
> -
>
> I hope that example above is sufficient. There are a few other items that
> are worth exploring - e.g., how to store the type (specifically, separate
> the quoted strings vs. values so that "5.2" is different than 5.2, and
> "null" is different from null).
>
> I will leave the second part to a different post, once I have some
> feedback. I have a prototype that I've written in python - a POC - that
> makes it possible to write things like
>
> declare -A foo
> curl http://www.api.com/weather/US/10013 | readjson foo
>
> printf "temperature(F) : %.1f Wind(MPH)=%d" ${foo[temp_f]}, ${foo[wind]}
>
> Yair
>
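A tiny flavor of the second half of the proposal can already be sketched in plain bash (editor's sketch, not the proposed built-in): serializing a flat associative array to a JSON object, treating every value as a string. Requires bash 4.3+ for namerefs; keys are sorted for deterministic output.

```shell
#!/bin/bash
# Serialize a flat associative array to a JSON object (all values quoted
# as strings; nested structures are out of scope for this sketch).
to_json() {
  local -n a=$1
  local out='' k
  for k in $(printf '%s\n' "${!a[@]}" | sort); do
    out+=",\"$k\":\"${a[$k]}\""
  done
  printf '{%s}\n' "${out:1}"   # strip the leading comma
}

declare -A foo=([temp_f]=71.5 [wind]=10)
to_json foo   # → {"temp_f":"71.5","wind":"10"}
```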


possible bash bug bringing job. to foreground

2024-02-17 Thread John Larew
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -g -O2 -flto=auto -ffat-lto-objects -flto=auto
-ffat-lto-objects -fstack-protector-strong -Wformat -Werror=format-security -Wall
uname output: Linux HP-ProBook-6450b-500GB 5.15.133.1-microsoft-standard-WSL2
#1 SMP Thu Oct 5 21:02:42 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 5.1
Patch Level: 16
Release Status: release

Description: This is an attempted portion of a larger script I am writing. All
these attempts appear valid; none works.
 BTW, also fails on termux 0.118.0/bash v5.22.6 (HOSTTYPE=aarch64,
MACHTYPE=aarch64-unknown-linux-android)
Repeat-By:
  1: (sleep 15s; set -m; fg %%; exit) &
  2: (sleep 15s; set -m; fg %+; exit) &
Fix: (sleep 15s; set -m; kill $PPID) &
  Not a preferred solution; I prefer a smaller hammer.



Re: possible bash bug bringing job. to foreground

2024-02-17 Thread John Larew
After further examination, the examples with "fg $$" and "fg $!" clearly do not
bring the subshell into the foreground, as they are evaluated prior to the
subshell's background execution.
I'm trying to bring the subshell to the foreground to perform an exit, after a 
delay.
Ultimately, it will be used as part of a terminal emulator inactivity timeout.
I suspected there are advantages to exiting the emulator vs. killing the 
process.
Clearly, I misunderstood. Thanks again.


Re: possible bash bug bringing job. to foreground

2024-02-18 Thread John Larew
I was unaware of TMOUT. Now I have a backup as well. Thanks for tolerating my 
inexperience.
 
 
On Sat, Feb 17, 2024 at 2:54 PM, Greg Wooledge wrote:
On Sat, Feb 17, 2024 at 07:41:43PM +, John Larew wrote:
> After further examination, the examples with "fg $$" and "fg $!" clearly do 
> not bring the subshell into the foreground, as they are evaluated prior to 
> the subshells background execution.
> I'm trying to bring the subshell to the foreground to perform an exit, after 
> a delay.
> Ultimately, it will be used as part of a terminal emulator inactivity timeout.

Bash already has a TMOUT variable which will cause an interactive shell
to exit after a specified length of inactivity.  Is that sufficient?
If not, how does your desired solution need to differ from TMOUT?
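A sketch of the TMOUT approach Greg mentions (editor's note; the value and the `readonly` line are illustrative, not from the thread):

```shell
# In ~/.bashrc of the interactive shell:
TMOUT=600        # exit an interactive shell after 600s with no input
readonly TMOUT   # optionally prevent the timeout from being unset
```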
  


Re: Potential Bash Script Vulnerability

2024-04-07 Thread John Passaro
if you wanted this for your script - read all then start semantics, as
opposed to read-as-you-execute - would it work to rewrite yourself inside a
function?

function main() { ... } ; main
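Expanded slightly (editor's sketch): bash parses the entire function body before `main` runs, so the whole script is read before anything executes, and rewriting the file mid-run cannot corrupt this invocation.

```shell
#!/bin/bash
# Wrap-the-script-in-a-function idiom: the body below is fully parsed
# before the final line invokes it.
main() {
  echo "step 1"
  echo "step 2"
}
main "$@"
```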

On Sun, Apr 7, 2024, 22:58 Robert Elz  wrote:

> Date:Mon, 8 Apr 2024 02:50:29 +0100
> From:Kerin Millar 
> Message-ID:  <20240408025029.e7585f2f52fe510d2a686...@plushkava.net>
>
>   | which is to read scripts in their entirety before trying to execute
>   | the resulting program. To go about it that way is not typical of sh
>   | implementations, for whatever reason.
>
> Shells interpret their input in much the same way, regardless of
> from where it comes.   Would you really want your login shell to
> just collect commands that you type (possibly objecting to those
> with syntax errors) but executing none of them (including "exit")
> until you log out (send EOF) ?
>
> kre
>
>
>
>


Permission denied to execute script that is world executable

2011-06-18 Thread John Williams
I find that I cannot execute world-executable scripts when they are in
a directory which is mounted on a drive on an HBA (host bus adapter
card) in pass-thru mode, but the exact same scripts are executable
when they are in a directory on my boot drive (connected to
motherboard SATA).

Is this a bash bug, or intentional behavior?

The drives connected through the HBA work well for all other purposes
I have tried, except executing scripts. The only oddity I have noticed
is that the identification info stored by the kernel driver
(megaraid_sas) during boot cannot be read by hdparm -i, although hdparm -I
has no problem reading the info directly. Apparently, the megaraid_sas
kernel driver is not perfect, but why should this affect bash?

In the output below, the behavior is identical if the first line calls
bash, sh (symlink to bash), or zsh. I left it at zsh (below) to
eliminate ambiguity between the active shell (bash) and the script
shell.

What causes this behavior?

***

cmd$ pwd
/tmp

cmd$ ls -al
total 16
drwxrwxrwt  3 rootroot  4096 Jun 18 13:13 .
drwxr-xr-x 25 rootroot  4096 May  1 21:26 ..
-rwxr-xr-x  1 joe users   23 Jun 18 13:12 okay
drwx------  3 joe users 4096 Jun 17 19:13 spool_joe

cmd$ cat okay
#!/bin/zsh
echo "okay"

cmd$ ./okay
okay

cmd$ /tmp/okay
okay

cmd$ cp okay /mnt/r6c4/

cmd$ /mnt/r6c4/okay
bash: /mnt/r6c4/okay: Permission denied

cmd$ cd /mnt/r6c4

cmd$ ls -al
total 8
drwxr-xr-x  2 joe users8 Jun 18 13:15 .
drwxr-xr-x 26 joe users 4096 Jun 16 22:44 ..
-rwxr-xr-x  1 joe users   23 Jun 18 13:15 okay

cmd$ ./okay
bash: ./okay: Permission denied

cmd$ /bin/bash okay
okay

cmd$ . okay
okay

cmd$ df /mnt/r6c4
Filesystem  Size  Used Avail Use% Mounted on
/dev/sdh1   2.8T  350M  2.8T   1% /mnt/r6c4

$ sudo hdparm -i /dev/sdh

/dev/sdh:
 HDIO_GET_IDENTITY failed: Invalid argument

$ sudo hdparm -I /dev/sdh

/dev/sdh:

ATA device, with non-removable media
Model Number:   Hitachi HDS5C3030ALA630
Serial Number:  MJ1321YNG1ABJA
Firmware Revision:  MEAOA580
Transport:  Serial, ATA8-AST, SATA 1.0a, SATA II Extensions,
SATA Rev 2.5, SATA Rev 2.6; Revision: ATA8-AST T13 Project D1697
Revision 0b
Standards:
Used: unknown (minor revision code 0x0029)
Supported: 8 7 6 5
Likely used: 8
Configuration:
Logical max current
cylinders   16383   16383
heads   16  16
sectors/track   63  63
--
CHS current addressable sectors:   16514064
LBAuser addressable sectors:  268435455
LBA48  user addressable sectors: 5860533168
Logical  Sector size:   512 bytes
Physical Sector size:   512 bytes
device size with M = 1024*1024: 2861588 MBytes
device size with M = 1000*1000: 3000592 MBytes (3000 GB)
cache/buffer size  = 26129 KBytes (type=DualPortCache)
Form Factor: 3.5 inch
Nominal Media Rotation Rate: 5700
Capabilities:
LBA, IORDY(can be disabled)
Queue depth: 32
Standby timer values: spec'd by Standard, no device specific minimum
R/W multiple sector transfer: Max = 16  Current = 0
Advanced power management level: disabled
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
 Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4
 Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
Enabled Supported:
   *SMART feature set
Security Mode feature set
   *Power Management feature set
   *Write cache
   *Look-ahead
   *Host Protected Area feature set
   *WRITE_BUFFER command
   *READ_BUFFER command
   *NOP cmd
   *DOWNLOAD_MICROCODE
Advanced Power Management feature set
Power-Up In Standby feature set
   *SET_FEATURES required to spinup after power up
SET_MAX security extension
   *48-bit Address feature set
   *Device Configuration Overlay feature set
   *Mandatory FLUSH_CACHE
   *FLUSH_CACHE_EXT
   *SMART error logging
   *SMART self-test
Media Card Pass-Through
   *General Purpose Logging feature set
   *WRITE_{DMA|MULTIPLE}_FUA_EXT
   *64-bit World wide name
   *URG for READ_STREAM[_DMA]_EXT
   *URG for WRITE_STREAM[_DMA]_EXT
   *WRITE_UNCORRECTABLE_EXT command
   *{READ,WRITE}_DMA_EXT_GPL commands
   *Segmented DOWNLOAD_MICROCODE
unknown 119[7]
   *Gen1 signaling speed (1.5Gb/s)
   *Gen2 signaling speed (3.0Gb/s)
   *Gen3 signaling speed (6.0Gb/s)
   *Native Command Queueing (NCQ)
 

Re: Permission denied to execute script that is world executable

2011-06-18 Thread John Williams
On Sat, Jun 18, 2011 at 1:51 PM, Jan Schampera  wrote:

> Can you show the mount options of the filesystem?

Good call, I should have thought of that. I had all the filesystems on
the non-boot drives mounted with the "user" option, which I just
learned from the mount man page also activates "noexec". I changed the
mount option to "user,exec" and now all is fine. Sorry for the false
alarm!

***

cmd$ mount -l
/dev/sdh1 on /mnt/r6c4 type jfs (rw,noexec,nosuid,nodev,noatime) [R6C4]
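For reference (editor's sketch; the device and mountpoint are from the thread, the exact commands are illustrative): the "user" fstab option implies noexec, nosuid, and nodev, and a remount can restore exec without unmounting.

```shell
# /etc/fstab sketch re-enabling direct script execution on the mount:
#   /dev/sdh1  /mnt/r6c4  jfs  user,exec,noatime  0  2
# Or fix a live mount in place:
mount -o remount,exec /mnt/r6c4
```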



RFE: indirection_level_string() preserves 'x' flag [PATCH]

2011-09-08 Thread John Reiser
Please enhance the function indirection_level_string() in print_cmd.c
so that the value of the 'x' flag is preserved over the call to
decode_prompt_string().  This will permit the use of function
indirection_level_string() as a tool to track usage of system calls
by shell scripts.

For example, this code prints the script location of every fork().
-
// bash-syspose.c - interpose so that bash shell prints $PS4 upon syscall
// Copyright 2011 John Reiser, BitWagon Software LLC.  All rights reserved.
// Licensed under GNU General Public License, version 3 (GPLv3).
//
// Requires: change indirection_level_string() in bash/print_cmd.c
//to preserve 'x' flag around decode_prompt_string():
/*
--- bash-4.2/print_cmd.c2011-09-08 13:27:53.877244584 -0700
+++ new/print_cmd.c 2011-09-08 13:29:34.088991764 -0700
@@ -426,9 +426,9 @@
   if (ps4 == 0 || *ps4 == '\0')
 return (indirection_string);

-  change_flag ('x', FLAG_OFF);
+  int const old = change_flag ('x', FLAG_OFF);
   ps4 = decode_prompt_string (ps4);
-  change_flag ('x', FLAG_ON);
+  if (old) change_flag ('x', FLAG_ON);

   if (ps4 == 0 || *ps4 == '\0')
 return (indirection_string);
*/
// Build:  gcc -shared -o bash-syspose.so -g -O -fPIC bash-syspose.c -ldl
// Run:  LD_PRELOAD=./bash-syspose.so  bash  ...  # "unset LD_PRELOAD" ASAP!
// EXAMPLE: (insert near beginning of shell script)
//   unset LD_PRELOAD
//   export PS4=' ${FUNCNAME[0]}@$LINENO < 
${BASH_SOURCE[1]##*/}:${BASH_LINENO[0]} :'
//
// Obviously other syscalls could be handled, too.

#define _GNU_SOURCE 1  /* need RTLD_NEXT from dlfcn.h */
#include <dlfcn.h>
#include <string.h>
#include <unistd.h>

// weak: Test to prevent crash if no such function, such as when
// LD_PRELOAD remains set and a child process of bash invokes a
// random executable that does a fork().
extern char const *indirection_level_string(void) __attribute__((weak));

int fork(void)
{
static int (*real_fork)(void);

if (!real_fork)
real_fork = (int (*)(void)) dlsym(RTLD_NEXT, "fork");

// Avoid infinite recursion in case evaluating PS4 does a fork().
static int recur;
char const *ils = 0;
if (!recur++ && indirection_level_string) {
ils = indirection_level_string();  // evaluate $PS4
}
if (!--recur && ils) {
write(2, ils, strlen(ils));
write(2, "fork\n", 5);
}

// The child might want to remove us (bash-syspose) from LD_PRELOAD.
// But it is ugly, so rely on 'weak' for now.
return (*real_fork)();
}

// end-of-file bash-syspose.c
-

-- 



Re: How to match regex in bash? (any character)

2011-09-26 Thread John Reiser
Peng Yu wrote:
> I know that I should use =~ to match regex (bash version 4).
> 
> However, the man page is not very clear. I don't find how to match
> (matching any single character). For example, the following regex
> doesn't match txt. Does anybody know how to match any character
> (should be '.' in perl) in bash.
> 
> [[ "$1" =~ "xxx.txt" ]]

The manual page for bash says that the rules of regex(3) apply:

  An additional binary operator, =~, is available, with the same
  precedence as == and !=.  When it is used, the string to the right
  of the operator is considered an extended regular expression and
  matched accordingly (as in regex(3)).  The return value is 0 if the
  string matches the pattern, and 1 otherwise.

and also:

  Any part of the pattern may be quoted to force it to be matched as a
  string.

Thus in the expression   [[ "$1" =~ "xxx.txt" ]]   the fact that the pattern
is quoted [here the whole pattern appears within double quotes] has turned the
dot '.' into a plain literal character, instead of a meta-character which
matches any single character.

The usual method of avoiding quotes in the pattern is to omit them:
  [[ $1 =~ xxx.txt ]]     # the dot '.' in the pattern is a meta-character
or to use a variable:
  pattern="xxx.txt"       # a 7-character string
  [[ $1 =~ $pattern ]]    # the dot '.' in $pattern is a meta-character
Example: using all literals in an instance of bash:
   $ [[ txt =~ xxx.txt ]] && echo true
   true
   $

Also notice that quotes are not needed around the left-hand side  $1 :

  Word splitting and pathname expansion are not performed on the words
  between the [[ and ]] ...

Thus there is no need to use quotation marks to suppress word splitting
inside double brackets [[ ... ]].
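Putting the three quoting rules together with a concrete string (editor's illustration; the string is hypothetical):

```shell
#!/bin/bash
s="prefix-xxxAtxt"
[[ $s =~ xxx.txt ]]   && echo unquoted-matches    # '.' matches the 'A'
pat="xxx.txt"
[[ $s =~ $pat ]]      && echo variable-matches    # '.' is still a metachar
[[ $s =~ "xxx.txt" ]] || echo quoted-is-literal   # no literal "xxx.txt" in s
```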

-- 



UTF-8 Encode problems with \u \U

2012-02-18 Thread John Kearney
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64'
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu'
-DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash'
-DSHELL -DHAVE_CONFIG_H   -I.  -I../bash -I../bash/include
-I../bash/lib   -g -O2 -Wall
uname output: Linux DETH00 3.0.0-15-generic #26-Ubuntu SMP Fri Jan 20
17:23:00 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.2
Patch Level: 10
Release Status: release

Description:
\u and \U incorrectly encode values between \u80 and \uff

Repeat-By:
  printf '%q\n' "$(printf '\uff')"
  printf '%q\n' $'\uff'
  # outputs $'\377' instead of $'\303\277'

Fix:
diff --git a/builtins/printf.def b/builtins/printf.def
index 9eca215..b155160 100644
--- a/builtins/printf.def
+++ b/builtins/printf.def
@@ -859,7 +859,7 @@ tescape (estart, cp, lenp, sawc)
 *cp = '\\';
 return 0;
   }
-if (uvalue <= UCHAR_MAX)
+if (uvalue <= CHAR_MAX)
   *cp = uvalue;
 else
   {
diff --git a/lib/sh/strtrans.c b/lib/sh/strtrans.c
index 2265782..2e6e37b 100644
--- a/lib/sh/strtrans.c
+++ b/lib/sh/strtrans.c
@@ -144,7 +144,7 @@ ansicstr (string, len, flags, sawc, rlen)
   *r++ = '\\';/* c remains unchanged */
   break;
 }
-  else if (v <= UCHAR_MAX)
+  else if (v <= CHAR_MAX)
 {
   c = v;
   break;




Re: UTF-8 Encode problems with \u \U

2012-02-18 Thread John Kearney
I know.
To be honest I get a bad feeling with that code; I'm guessing it was
done for performance reasons. Personally I'd just remove the special
handling of any values and always call the encoding function, but I was
trying for a minimalist solution.
I mean you could do something like

#define MAX_SINGLE_BYTE_UTF8 0x7F
if (uvalue <= MAX_SINGLE_BYTE_UTF8)

I'm guessing the code was done originally for UTF-2 encoding.

what I suggest will fix the UTF-8 case and not affect the UTF-2 case.


On 02/18/2012 11:11 AM, Andreas Schwab wrote:
> John Kearney  writes:
> 
>> Fix:
>> diff --git a/builtins/printf.def b/builtins/printf.def
>> index 9eca215..b155160 100644
>> --- a/builtins/printf.def
>> +++ b/builtins/printf.def
>> @@ -859,7 +859,7 @@ tescape (estart, cp, lenp, sawc)
>>      *cp = '\\';
>>      return 0;
>>    }
>> -if (uvalue <= UCHAR_MAX)
>> +if (uvalue <= CHAR_MAX)
> 
> CHAR_MAX has nothing at all to do with UTF-8.
> 
> Andreas.
> 




Re: UTF-8 Encode problems with \u \U

2012-02-18 Thread John Kearney
On 02/18/2012 11:29 AM, Andreas Schwab wrote:
> John Kearney  writes:
> 
>> what I suggest will fix the UTF-8 case
> 
> No, it won't.
> 
>> and not affect the UTF-2 case.
> 
> That is impossible.
> 
> Andreas.
> 

Current code
if (uvalue <= UCHAR_MAX)
  *cp = uvalue;
else
  {
temp = u32cconv (uvalue, cp);
cp[temp] = '\0';
if (lenp)
  *lenp = temp;
  }

Robust Code
temp = u32cconv (uvalue, cp);
cp[temp] = '\0';
if (lenp)
  *lenp = temp;

Compromise solution
if (uvalue <= 0x7f)
  *cp = uvalue;
else
  {
temp = u32cconv (uvalue, cp);
cp[temp] = '\0';
if (lenp)
  *lenp = temp;
  }

How can doing the direct assignment in fewer cases break anything? If it
does, then u32cconv is broken.

And it does work for me, so "impossible" seems to be overstating it.




Fix u32toutf8 so it encodes values > 0xFFFF correctly.

2012-02-18 Thread John Kearney
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64'
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu'
-DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash'
-DSHELL -DHAVE_CONFIG_H   -I.  -I../bash -I../bash/include
-I../bash/lib   -g -O2 -Wall
uname output: Linux DETH00 3.0.0-15-generic #26-Ubuntu SMP Fri Jan 20
17:23:00 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.2
Patch Level: 10
Release Status: release

Description:
Current u32toutf8 only encodes values up to 0xFFFF correctly.
wchar_t can be of ambiguous size; better, in my opinion, to use
unsigned long, or uint32_t, or something clearer.
Repeat-By:
  ---'

Fix:
diff --git a/lib/sh/unicode.c b/lib/sh/unicode.c
index d34fa08..3f7d378 100644
--- a/lib/sh/unicode.c
+++ b/lib/sh/unicode.c
@@ -54,7 +54,7 @@ extern const char *locale_charset __P((void));
 extern char *get_locale_var __P((char *));
 #endif
 
-static int u32init = 0;
+static int u32init = 0;
 static int utf8locale = 0;
 #if defined (HAVE_ICONV)
 static iconv_t localconv;
@@ -115,26 +115,61 @@ u32tochar (wc, s)
 }
 
 int
-u32toutf8 (wc, s)
- wchar_t wc;
+u32toutf8 (c, s)
+ unsigned long c;
  char *s;
 {
   int l;
 
-  l = (wc < 0x0080) ? 1 : ((wc < 0x0800) ? 2 : 3);
-
-  if (wc < 0x0080)
-    s[0] = (unsigned char)wc;
-  else if (wc < 0x0800)
+  if (c <= 0x7F)
+    {
+      s[0] = (char)c;
+      l = 1;
+    }
+  else if (c <= 0x7FF)
+    {
+      s[0] = (c >>   6)        | 0xc0; /* 110xxxxx */
+      s[1] = (c        & 0x3f) | 0x80; /* 10xxxxxx */
+      l = 2;
+    }
+  else if (c <= 0xFFFF)
+    {
+      s[0] =  (c >> 12)         | 0xe0; /* 1110xxxx */
+      s[1] = ((c >>  6) & 0x3f) | 0x80; /* 10xxxxxx */
+      s[2] =  (c        & 0x3f) | 0x80; /* 10xxxxxx */
+      l = 3;
+    }
+  else if (c <= 0x1FFFFF)
     {
-      s[0] = (wc >> 6) | 0xc0;
-      s[1] = (wc & 0x3f) | 0x80;
+      s[0] =  (c >> 18)         | 0xf0; /* 11110xxx */
+      s[1] = ((c >> 12) & 0x3f) | 0x80; /* 10xxxxxx */
+      s[2] = ((c >>  6) & 0x3f) | 0x80; /* 10xxxxxx */
+      s[3] = ( c        & 0x3f) | 0x80; /* 10xxxxxx */
+      l = 4;
+    }
+  else if (c <= 0x3FFFFFF)
+    {
+      s[0] =  (c >> 24)         | 0xf8; /* 111110xx */
+      s[1] = ((c >> 18) & 0x3f) | 0x80; /* 10xxxxxx */
+      s[2] = ((c >> 12) & 0x3f) | 0x80; /* 10xxxxxx */
+      s[3] = ((c >>  6) & 0x3f) | 0x80; /* 10xxxxxx */
+      s[4] = ( c        & 0x3f) | 0x80; /* 10xxxxxx */
+      l = 5;
+    }
+  else if (c <= 0x7FFFFFFF)
+    {
+      s[0] =  (c >> 30)         | 0xfc; /* 1111110x */
+      s[1] = ((c >> 24) & 0x3f) | 0x80; /* 10xxxxxx */
+      s[2] = ((c >> 18) & 0x3f) | 0x80; /* 10xxxxxx */
+      s[3] = ((c >> 12) & 0x3f) | 0x80; /* 10xxxxxx */
+      s[4] = ((c >>  6) & 0x3f) | 0x80; /* 10xxxxxx */
+      s[5] = ( c        & 0x3f) | 0x80; /* 10xxxxxx */
+      l = 6;
+    }
   else
     {
-      s[0] = (wc >> 12) | 0xe0;
-      s[1] = ((wc >> 6) & 0x3f) | 0x80;
-      s[2] = (wc & 0x3f) | 0x80;
+      /* Error Invalid UTF-8 */
+      l = 0;
     }
   s[l] = '\0';
   return l;
@@ -150,7 +185,7 @@ u32cconv (c, s)



Questionable code behavior in u32cconv?

2012-02-18 Thread John Kearney
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64'
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu'
-DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash'
-DSHELL -DHAVE_CONFIG_H   -I.  -I../bash -I../bash/include
-I../bash/lib   -g -O2 -Wall
uname output: Linux DETH00 3.0.0-15-generic #26-Ubuntu SMP Fri Jan 20
17:23:00 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.2
Patch Level: 10
Release Status: release

Description:
Now I may be misreading the code, but it looks like the code relating
to iconv checks the destination charset only the first time the code
is executed,

as such breaking the following functionality:
LC_CTYPE=C printf '\uff'
LC_CTYPE=C.UTF-8 printf '\uff'

Repeat-By:
haven't seen the problem.

Fix:
  Not so much a fix as a modification that should hopefully clarify my
concern.



diff --git a/lib/sh/unicode.c b/lib/sh/unicode.c
index d34fa08..3f7d378 100644
- --- a/lib/sh/unicode.c
+++ b/lib/sh/unicode.c
@@ -54,7 +54,7 @@ extern const char *locale_charset __P((void));
 extern char *get_locale_var __P((char *));
 #endif

- -static int u32init = 0;
+const char *charset;
 static int utf8locale = 0;
 #if defined (HAVE_ICONV)
 static iconv_t localconv;
@@ -115,26 +115,61 @@ u32tochar (wc, s)
 }

@@ -150,7 +185,7 @@ u32cconv (c, s)
   wchar_t wc;
   int n;
 #if HAVE_ICONV
-  const char *charset;
+  const char *ncharset;
   char obuf[25], *optr;
   size_t obytesleft;
   const char *iptr;
@@ -171,20 +206,22 @@ u32cconv (c, s)
   codeset = nl_langinfo (CODESET);
   if (STREQ (codeset, "UTF-8"))
 {
   n = u32toutf8 (wc, s);
   return n;
 }
 #endif

 #if HAVE_ICONV
-  /* this is mostly from coreutils-8.5/lib/unicodeio.c */
-  if (u32init == 0)
-{
 #  if HAVE_LOCALE_CHARSET
-  charset = locale_charset ();   /* XXX - fix later */
+  ncharset = locale_charset ();/* XXX - fix later */
 #  else
-  charset = stub_charset ();
+  ncharset = stub_charset ();
 #  endif
+  /* this is mostly from coreutils-8.5/lib/unicodeio.c */
+  if (STREQ (charset, ncharset))
+{
+  /* Free Old charset str ? */
+  charset=ncharset;
   if (STREQ (charset, "UTF-8"))
utf8locale = 1;
   else



Can somebody explain to me what u32tochar in /lib/sh/unicode.c is trying to do?

2012-02-19 Thread John Kearney

Can somebody explain to me what u32tochar is trying to do?

It seems like dangerous code.

From the context I'm guessing it is trying to make a hail-mary pass at
converting UTF-32 to a multibyte encoding (though not to UTF-8 multibyte).


int
u32tochar (x, s)
     unsigned long x;
     char *s;
{
  int l;

  l = (x <= UCHAR_MAX) ? 1 : ((x <= USHORT_MAX) ? 2 : 4);

  if (x <= UCHAR_MAX)
    s[0] = x & 0xFF;
  else if (x <= USHORT_MAX) /* assume unsigned short = 16 bits */
    {
      s[0] = (x >> 8) & 0xFF;
      s[1] = x & 0xFF;
    }
  else
    {
      s[0] = (x >> 24) & 0xFF;
      s[1] = (x >> 16) & 0xFF;
      s[2] = (x >> 8) & 0xFF;
      s[3] = x & 0xFF;
    }
  /* s[l] = '\0';  Buffer overwrite? */
  return l;
}

A couple of problems with that, though:
Firstly, UTF-32 doesn't map directly to non-UTF multibyte locales, so
you need a translation mechanism.
Secondly, normal CJK systems are state-based, so multibyte sequences
need to be escaped. Extended Unix Code would need encoding somewhat
like UTF-8; in fact, any variable-width multibyte encoding is going to
need some context to recover the information, so this output is
unparsable.

What the function is actually doing is taking UTF-32 and, depending on
the size of the value, encoding it as UTF-32 big-endian, UTF-16
big-endian, UTF-8/ASCII, or an extended-ASCII byte (values between
0x80 and 0xFF). Choosing between these should, however, depend on
LC_CTYPE, not on an arbitrary size check.

So this function just seems plain crazy.
I think that all it can safely do is this:
int
utf32tomb (x, s)
     unsigned long x;
     char *s;
{
  if (x <= 0x7f)  /* x >= 0x80 is locale specific */
    {
      s[0] = x & 0xFF;
      return 1;
    }
  else
    return 0;
}



Regarding the naming convention: u32 reads as "unsigned 32-bit". It
might be a good idea to rename all the u32 functions to utf32; that
would, I think, save a lot of confusion in the code as to what is
going on.




bug in stub_charset rollup diff of changes to unicode code.

2012-02-20 Thread John Kearney

Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64'
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu'
-DCONF_VENDOR='pc' -DLOCALEDIR='/usr/s
uname output: Linux DETH00 3.0.0-15-generic #26-Ubuntu SMP Fri Jan 20
17:23:00 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.2
Patch Level: 10
Release Status: release

Description:
  stub_charset currently behaves like:
  if locale == ""
    return "ASCII"
  else if locale =~ m/.*\.(.*)(@.*)?/
    return $1
  else if locale == "UTF-8"
    return "UTF-8"
  else
    return "ASCII"

It should be:
  if locale == ""
    return "ASCII"
  else if locale =~ m/.*\.(.*)(@.*)?/
    return $1
  else
    return locale
because its output is only used by iconv, so let iconv decide whether
the locale name makes sense.
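The proposed behavior can be sketched standalone in C (a sketch only:
the function name and the caller-supplied buffer are my assumptions,
not bash's actual stub_charset, which uses internal helpers):

```c
#include <string.h>

/* Sketch of the proposed behavior: return the part after the last '.',
   strip any "@modifier", and otherwise return the locale name itself,
   leaving it to iconv to reject names it does not recognize. */
static const char *
guess_charset (char *buf, size_t bufsize, const char *locale)
{
  const char *dot;
  char *at;

  if (locale == NULL || *locale == '\0' || strlen (locale) >= bufsize)
    return "ASCII";
  dot = strrchr (locale, '.');
  strcpy (buf, dot ? dot + 1 : locale);   /* charset part, or whole name */
  at = strchr (buf, '@');
  if (at)
    *at = '\0';                           /* drop "@modifier" */
  return buf;
}
```

With this shape, "en_US.UTF-8" yields "UTF-8", "de_DE.ISO-8859-1@euro"
yields "ISO-8859-1", and a plain "POSIX" is passed through for iconv
to judge.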




   I've attached a diff of all my changes to the unicode code,
   including:
   - renamed u32cconv to utf32tomb
   - moved the special handling of ASCII characters to the start of the
function and removed the related call-wrapper code
   - tried to rationalize the code in utf32tomb so it is easier to read
and understand what is happening
   - added utf32toutf16
   - use utf32toutf16 together with wctomb in the case where wchar_t is
2 bytes
   - removed dangerous code that used iconv_open (charset, "ASCII") as
a fallback; it was pointless anyway, as we already assign an ASCII
value when possible
   - added a warning message if encoding fails
   - always terminate the multibyte output string


I haven't started to test these changes yet; firstly I would like to
know whether these changes are acceptable, and to hear any
observations. I'm still reviewing it myself for consistency.

Also, can somebody tell me how this was tested originally? I've got
some ideas myself but would like to know what has already been done in
that direction.

Repeat-By:
  .

Fix:
diff --git a/builtins/printf.def b/builtins/printf.def
index 9eca215..3680419 100644
--- a/builtins/printf.def
+++ b/builtins/printf.def
@@ -859,15 +859,9 @@ tescape (estart, cp, lenp, sawc)
*cp = '\\';
return 0;
  }
- if (uvalue <= UCHAR_MAX)
-   *cp = uvalue;
- else
-   {
- temp = u32cconv (uvalue, cp);
- cp[temp] = '\0';
- if (lenp)
-   *lenp = temp;
-   }
+   temp = utf32tomb (uvalue, cp);
+   if (lenp)
+ *lenp = temp;
break;
 #endif

diff --git a/externs.h b/externs.h
index 09244fa..8868b55 100644
--- a/externs.h
+++ b/externs.h
@@ -460,7 +460,7 @@ extern unsigned int falarm __P((unsigned int,
unsigned int));
 extern unsigned int fsleep __P((unsigned int, unsigned int));

 /* declarations for functions defined in lib/sh/unicode.c */
-extern int u32cconv __P((unsigned long, char *));
+extern int utf32tomb __P((unsigned long, char *));

 /* declarations for functions defined in lib/sh/winsize.c */
 extern void get_new_window_size __P((int, int *, int *));
diff --git a/lib/sh/strtrans.c b/lib/sh/strtrans.c
index 2265782..495d9c4 100644
--- a/lib/sh/strtrans.c
+++ b/lib/sh/strtrans.c
@@ -144,16 +144,10 @@ ansicstr (string, len, flags, sawc, rlen)
  *r++ = '\\';  /* c remains unchanged */
  break;
}
-   else if (v <= UCHAR_MAX)
- {
-   c = v;
-   break;
- }
  else
{
-   temp = u32cconv (v, r);
-   r += temp;
-   continue;
+ r += utf32tomb (v, r);
+ break;
}
 #endif
case '\\':
diff --git a/lib/sh/unicode.c b/lib/sh/unicode.c
index d34fa08..9a557a9 100644
--- a/lib/sh/unicode.c
+++ b/lib/sh/unicode.c
@@ -36,14 +36,6 @@

 #include 

-#ifndef USHORT_MAX
-#  ifdef USHRT_MAX
-#define USHORT_MAX USHRT_MAX
-#  else
-#define USHORT_MAX ((unsigned short) ~(unsigned short)0)
-#  endif
-#endif
-
 #if !defined (STREQ)
 #  define STREQ(a, b) ((a)[0] == (b)[0] && strcmp ((a), (b)) == 0)
 #endif /* !STREQ */
@@ -54,13 +46,14 @@ extern const char *locale_charset __P((void));
 extern char *get_locale_var __P((char *));
 #endif

-static int u32init = 0;
+const char *charset;
 static int utf8locale = 0;
 #if defined (HAVE_ICONV)
 static iconv_t localconv;
 #endif

 #ifndef HAVE_LOCALE_CHARSET
+static char CType[40]={0};
 static char *
 stub_charset ()
 {
@@ -69,6 +62,7 @@ stub_charset ()
   locale = get_locale_var ("LC_CTYPE");
   if (locale == 0 || *locale == 0)
 return "ASCII";
+  strcpy(CType, locale);
   s = strrchr (locale, '.');
   if (s)
 {
@@ -77,159 +71,230 @@ stub_charset ()
*t = 0;
   return ++s;
 }
-  else if (STREQ (locale, "UTF-8"))
-return "UTF-8";
   else
-return "ASCII";
+return CType;
 }
 #endif

-/* u32toascii ? */
 int
-u32tochar (wc, s)
- wchar_t wc;
+utf32_2_utf8 (c, s)
+ unsigned lon

Bug? in bash setlocale implementation

2012-02-21 Thread John Kearney

Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64'
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu'
-DCONF_VENDOR='pc' -DLOCALEDIR='/usr/s
uname output: Linux DETH00 3.0.0-15-generic #26-Ubuntu SMP Fri Jan 20
17:23:00 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.2
Patch Level: 10
Release Status: release

Description:
  Basically, if setting the locale fails, the variable should not be changed.

 Consider


export LC_CTYPE=

bash -c 'LC_CTYPE=ISO-8859-1 eval printf "\${LC_CTYPE:-unset}"'
bash: warning: setlocale: LC_CTYPE: cannot change locale (ISO-8859-1):
No such file or directory
ISO-8859-1

ksh93 -c 'LC_CTYPE=ISO-8859-1 eval printf "\${LC_CTYPE:-unset}"'
ISO-8859-1: unknown locale
unset
ksh93 -c 'LC_CTYPE=C.UTF-8 eval printf "\${LC_CTYPE:-unset}"'
C.UTF-8

  The advantage being that you can check in the script whether the
locale change worked, e.g.:
  LC_CTYPE=ISO-8859-1
  [ "${LC_CTYPE:-}" = "ISO-8859-1" ] || exit 1



Re: Fix u32toutf8 so it encodes values > 0xFFFF correctly.

2012-02-21 Thread John Kearney

On 02/21/2012 01:34 PM, Eric Blake wrote:
> On 02/20/2012 07:42 PM, Chet Ramey wrote:
>> On 2/18/12 5:39 AM, John Kearney wrote:
>> 
>>> Bash Version: 4.2 Patch Level: 10 Release Status: release
>>> 
>>> Description: Current u32toutf8 only encode values below 0x
>>> correctly. wchar_t can be ambiguous size better in my opinion
>>> to use unsigned long, or uint32_t, or something clearer.
>> 
>> Thanks for the patch.  It's good to have a complete
>> implementation, though as a practical matter you won't see UTF-8
>> characters longer than four bytes.  I agree with you about the
>> unsigned 32-bit int type; wchar_t is signed, even if it's 32
>> bits, on several systems I use.
> 
> Not only can wchar_t can be either signed or unsigned, you also
> have to worry about platforms where it is only 16 bits, such as
> cygwin; on the other hand, wint_t is always 32 bits, but you still
> have the issue that it can be either signed or unsigned.
> 
Signed/unsigned isn't really the problem anyway: UTF-8 only encodes
code points up to 0x7FFFFFFF, and UTF-16 only encodes up to 0x0010FFFF.

In my latest version I've pretty much removed all reference to wchar_t
in unicode.c. It was unnecessary.

However, I would be interested in something like utf16_t or uint16_t;
currently I'm using unsigned short, which is inelegant but works.
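The kind of full-range encoder this thread is about can be sketched as
follows (a sketch with an assumed name, not the submitted patch
itself): one to four bytes for code points up to 0x10FFFF.

```c
/* Sketch: encode a UTF-32 code point as UTF-8, covering the full
   Unicode range (1-4 bytes).  Returns the byte count written, or 0
   for values beyond 0x10FFFF. */
static int
utf32_to_utf8 (unsigned char *s, unsigned long c)
{
  if (c <= 0x7F)
    {
      s[0] = (unsigned char) c;                 /* plain ASCII */
      return 1;
    }
  else if (c <= 0x7FF)
    {
      s[0] = 0xC0 | (c >> 6);
      s[1] = 0x80 | (c & 0x3F);
      return 2;
    }
  else if (c <= 0xFFFF)
    {
      s[0] = 0xE0 | (c >> 12);
      s[1] = 0x80 | ((c >> 6) & 0x3F);
      s[2] = 0x80 | (c & 0x3F);
      return 3;
    }
  else if (c <= 0x10FFFF)
    {
      s[0] = 0xF0 | (c >> 18);                  /* the case the old code missed */
      s[1] = 0x80 | ((c >> 12) & 0x3F);
      s[2] = 0x80 | ((c >> 6) & 0x3F);
      s[3] = 0x80 | (c & 0x3F);
      return 4;
    }
  return 0;
}
```

Values above 0x10FFFF are rejected here; the pre-2003 UTF-8 definition
went up to 0x7FFFFFFF with 5- and 6-byte forms, but those are no
longer valid Unicode.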
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJPQ593AAoJEKUDtR0WmS05g0wH/RPQMl1mfUdJBfzv5QkUtVSG
ibezTe3/b7/9h8SG3LLrv2FiPS+FtcCbE4n8tUror3V1BHomsQHZdlj/Zshi8W/n
YDl5ac5nc0rrOlw+SJxyCAJl9vHeEAXavjGw8m0KUv/vn0tZyWNM0RYXc7tRxJU2
uqY7G5sGLUt8uGuswCmSmucKjoB7guiUbsmTR+OzgDgKxuuSeQBr6/oIImo721pk
nI5TYdqerPGCIMJoYPeZChCBAZ/WhK9i3C3/SxKme4zWnjySaDw3NH0yfqFHl4Ts
IIOT4fYpm0h62U76+NJSPGWfadTd8UL4A/Jy4I3IwUS+mflwdU0Pu2zmwb8I+Xk=
=pkAF
-END PGP SIGNATURE-



Initial test code for \U

2012-02-21 Thread John Kearney
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Initial code for testing the \u functionality.
Basically, it uses arrays that look like this:
jp_JP_SHIFT_JIS=(
  #Unicode="expected bmstring"
  [0x0001]=$'\x01' #  START OF HEADING
  [0x0002]=$'\x02' #  START OF TEXT
...
)
TestCodePage ja_JP.SHIFT_JIS jp_JP_SHIFT_JIS

On error, the output looks like this:
Error Encoding U+00FB to C.UTF-8 [ "$'\303\273'" != "$'\373'" ]
Error Encoding U+00FC to C.UTF-8 [ "$'\303\274'" != "$'\374'" ]
Error Encoding U+00FD to C.UTF-8 [ "$'\303\275'" != "$'\375'" ]
Error Encoding U+00FE to C.UTF-8 [ "$'\303\276'" != "$'\376'" ]
Error Encoding U+00FF to C.UTF-8 [ "$'\303\277'" != "$'\377'" ]
Failed 128 of 1378 Unicode tests

or, if all is OK, like this:
Passed all 1378 Unicode tests


This should make it relatively easy to verify the functionality on
different targets, etc.
ErrorCnt=0
TestCnt=0

  function check_valid_var_name {
case "${1:?Missing Variable Name}" in
  [!a-zA-Z_]* | *[!a-zA-Z_0-9]* ) return 3;;
esac
  }
  # get_array_element VariableName ArrayName ArrayElement
  function get_array_element {
check_valid_var_name "${1:?Missing Variable Name}" || return $?
check_valid_var_name "${2:?Missing Array Name}" || return $?
eval "${1}"'="${'"${2}"'["${3:?Missing Array Index}"]}"'
  }
  # get_array_element_cnt VarName ArrayName
  function get_array_element_cnt {
check_valid_var_name "${1:?Missing Variable Name}" || return $?
check_valid_var_name "${2:?Missing Array Name}" || return $?
eval "${1}"'="${#'"${2}"'[@]}"'
  }


function TestCodePage {
local TargetCharset="${1:?Missing Test charset}"
local EChar RChar TCnt
get_array_element_cnt TCnt "${2:?Missing Array Name}"
for (( x=1 ; x<${TCnt} ; x++ )); do
  get_array_element EChar "${2}"  ${x}
  if [ -n "${EChar}" ]; then
	let TestCnt+=1
	printf -v UVal '\\U%08x' "${x}"
	LC_CTYPE=${TargetCharset} printf -v RChar "${UVal}"
	if [ "${EChar}" != "${RChar}" ]; then
	  let ErrorCnt+=1
	  printf "Error Encoding U+%08X to ${TL} [ \"%q\" != \"%q\" ]\n" "${x}" "${EChar}" "${RChar}"
	fi
  fi
done
}


#for ((x=1;x<255;x++)); do printf ' [0x%04x]=$'\''\%03o'\' $x $x ; [ $(($x%5)) = 0 ] && echo; done
fr_FR_ISO_8859_1=(
 [0x0001]=$'\001' [0x0002]=$'\002' [0x0003]=$'\003' [0x0004]=$'\004' [0x0005]=$'\005'
 [0x0006]=$'\006' [0x0007]=$'\007' [0x0008]=$'\010' [0x0009]=$'\011' [0x000a]=$'\012'
 [0x000b]=$'\013' [0x000c]=$'\014' [0x000d]=$'\015' [0x000e]=$'\016' [0x000f]=$'\017'
 [0x0010]=$'\020' [0x0011]=$'\021' [0x0012]=$'\022' [0x0013]=$'\023' [0x0014]=$'\024'
 [0x0015]=$'\025' [0x0016]=$'\026' [0x0017]=$'\027' [0x0018]=$'\030' [0x0019]=$'\031'
 [0x001a]=$'\032' [0x001b]=$'\033' [0x001c]=$'\034' [0x001d]=$'\035' [0x001e]=$'\036'
 [0x001f]=$'\037' [0x0020]=$'\040' [0x0021]=$'\041' [0x0022]=$'\042' [0x0023]=$'\043'
 [0x0024]=$'\044' [0x0025]=$'\045' [0x0026]=$'\046' [0x0027]=$'\047' [0x0028]=$'\050'
 [0x0029]=$'\051' [0x002a]=$'\052' [0x002b]=$'\053' [0x002c]=$'\054' [0x002d]=$'\055'
 [0x002e]=$'\056' [0x002f]=$'\057' [0x0030]=$'\060' [0x0031]=$'\061' [0x0032]=$'\062'
 [0x0033]=$'\063' [0x0034]=$'\064' [0x0035]=$'\065' [0x0036]=$'\066' [0x0037]=$'\067'
 [0x0038]=$'\070' [0x0039]=$'\071' [0x003a]=$'\072' [0x003b]=$'\073' [0x003c]=$'\074'
 [0x003d]=$'\075' [0x003e]=$'\076' [0x003f]=$'\077' [0x0040]=$'\100' [0x0041]=$'\101'
 [0x0042]=$'\102' [0x0043]=$'\103' [0x0044]=$'\104' [0x0045]=$'\105' [0x0046]=$'\106'
 [0x0047]=$'\107' [0x0048]=$'\110' [0x0049]=$'\111' [0x004a]=$'\112' [0x004b]=$'\113'
 [0x004c]=$'\114' [0x004d]=$'\115' [0x004e]=$'\116' [0x004f]=$'\117' [0x0050]=$'\120'
 [0x0051]=$'\121' [0x0052]=$'\122' [0x0053]=$'\123' [0x0054]=$'\124' [0x0055]=$'\125'
 [0x0056]=$'\126' [0x0057]=$'\127' [0x0058]=$'\130' [0x0059]=$'\131' [0x005a]=$'\132'
 [0x005b]=$'\133' [0x005c]=$'\134' [0x005d]=$'\135' [0x005e]=$'\136' [0x005f]=$'\137'
 [0x0060]=$'\140' [0x0061]=$'\141' [0x0062]=$'\142' [0x0063]=$'\143' [0x0064]=$'\144'
 [0x0065]=$'\145' [0x0066]=$'\146' [0x0067]=$'\147' [0x0068]=$'\150' [0x0069]=$'\151'
 [0x006a]=$'\152' [0x006b]=$'\153' [0x006c]=$'\154' [0x006d]=$'\155' [0x006e]=$'\156'
 [0x006f]=$'\157' [0x0070]=$'\160' [0x0071]=$'\161' [0x0072]=$'\162' [0x0073]=$'\163'
 [0x0074]=$'\164' [0x0075]=$'\165' [0x0076]=$'\166' [0x0077]=$'\167' [0x0078]=$'\170'
 [0x0079]=$'\171' [0x007a]=$'\172' [0x007b]=$'\173' [0x007c]=$'\174' [0x007d]=$'\175'
 [0x007e]=$'\176' [0x007f]=$'\177' [0x0080]=$'\200' [0x0081]=$'\

Here is a diff of all the changes to the unicode code

2012-02-21 Thread John Kearney


Here is a diff of all the changes to the unicode code.

This seems to work OK for me, but it still needs further testing.

My major goal was to make the code easier to follow and clearer,
but I also generally fixed and improved it.

Added a warning message:
./bash -c 'printf "string 1\\U8fffStromg 2"'
./bash: line 0: printf: warning: U+8fff unsupported in destination
charset ".UTF-8"
string 1Stromg 2


Added utf32toutf16 and utf32towchar to allow the use of wcstombs
whether wchar_t is 2 or 4 bytes.

Generally reworked the code to be consistent with the usual
function-argument convention, i.e. destination, then source.
diff --git a/builtins/printf.def b/builtins/printf.def
index 9eca215..77a8159 100644
--- a/builtins/printf.def
+++ b/builtins/printf.def
@@ -859,15 +859,9 @@ tescape (estart, cp, lenp, sawc)
 	*cp = '\\';
 	return 0;
 	  }
-	if (uvalue <= UCHAR_MAX)
-	  *cp = uvalue;
-	else
-	  {
-	temp = u32cconv (uvalue, cp);
-	cp[temp] = '\0';
-	if (lenp)
-	  *lenp = temp;
-	  }
+	temp = utf32tomb (cp, uvalue);
+	if (lenp)
+	  *lenp = temp;
 	break;
 #endif
 	
diff --git a/externs.h b/externs.h
index 09244fa..ff3f344 100644
--- a/externs.h
+++ b/externs.h
@@ -460,7 +460,7 @@ extern unsigned int falarm __P((unsigned int, unsigned int));
 extern unsigned int fsleep __P((unsigned int, unsigned int));
 
 /* declarations for functions defined in lib/sh/unicode.c */
-extern int u32cconv __P((unsigned long, char *));
+extern int utf32tomb __P((char *, unsigned long));
 
 /* declarations for functions defined in lib/sh/winsize.c */
 extern void get_new_window_size __P((int, int *, int *));
diff --git a/lib/sh/strtrans.c b/lib/sh/strtrans.c
index 2265782..e410cff 100644
--- a/lib/sh/strtrans.c
+++ b/lib/sh/strtrans.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 
+#include 
 #include "shell.h"
 
 #ifdef ESC
@@ -140,21 +141,10 @@ ansicstr (string, len, flags, sawc, rlen)
 	  for (v = 0; ISXDIGIT ((unsigned char)*s) && temp--; s++)
 		v = (v * 16) + HEXVALUE (*s);
 	  if (temp == ((c == 'u') ? 4 : 8))
-		{
 		  *r++ = '\\';	/* c remains unchanged */
-		  break;
-		}
-	  else if (v <= UCHAR_MAX)
-		{
-		  c = v;
-		  break;
-		}
 	  else
-		{
-		  temp = u32cconv (v, r);
-		  r += temp;
-		  continue;
-		}
+		  r += utf32tomb (r, v);
+	  break;
 #endif
 	case '\\':
 	  break;
diff --git a/lib/sh/unicode.c b/lib/sh/unicode.c
index d34fa08..5cc96bf 100644
--- a/lib/sh/unicode.c
+++ b/lib/sh/unicode.c
@@ -36,13 +36,7 @@
 
 #include 
 
-#ifndef USHORT_MAX
-#  ifdef USHRT_MAX
-#define USHORT_MAX USHRT_MAX
-#  else
-#define USHORT_MAX ((unsigned short) ~(unsigned short)0)
-#  endif
-#endif
+#include "bashintl.h"
 
 #if !defined (STREQ)
 #  define STREQ(a, b) ((a)[0] == (b)[0] && strcmp ((a), (b)) == 0)
@@ -54,13 +48,14 @@ extern const char *locale_charset __P((void));
 extern char *get_locale_var __P((char *));
 #endif
 
-static int u32init = 0;
+const char *charset;
 static int utf8locale = 0;
 #if defined (HAVE_ICONV)
 static iconv_t localconv;
 #endif
 
 #ifndef HAVE_LOCALE_CHARSET
+static char charset_buffer[40]={0};
 static char *
 stub_charset ()
 {
@@ -68,168 +63,267 @@ stub_charset ()
 
   locale = get_locale_var ("LC_CTYPE");
   if (locale == 0 || *locale == 0)
-return "ASCII";
-  s = strrchr (locale, '.');
-  if (s)
 {
-  t = strchr (s, '@');
-  if (t)
-	*t = 0;
-  return ++s;
+  strcpy(charset_buffer, "ASCII");
 }
-  else if (STREQ (locale, "UTF-8"))
-return "UTF-8";
   else
-return "ASCII";
+{
+  s = strrchr (locale, '.');
+  if (s)
+	{
+	  t = strchr (s, '@');
+	  if (t)
+	*t = 0;
+	  strcpy(charset_buffer, s);
+	}
+  else
+	{
+	  strcpy(charset_buffer, locale);
+	}
+  /* free(locale)  If we can Modify the buffer surely we need to free it?*/
+}
+  return charset_buffer;
 }
 #endif
 
-/* u32toascii ? */
+
+#if 0
 int
-u32tochar (wc, s)
- wchar_t wc;
+utf32tobig5 (s, c)
  char *s;
+ unsigned long c;
 {
-  unsigned long x;
   int l;
 
-  x = wc;
-  l = (x <= UCHAR_MAX) ? 1 : ((x <= USHORT_MAX) ? 2 : 4);
-
-  if (x <= UCHAR_MAX)
-s[0] = x & 0xFF;
-  else if (x <= USHORT_MAX)	/* assume unsigned short = 16 bits */
+  if (c <= 0x7F)
 {
-  s[0] = (x >> 8) & 0xFF;
-  s[1] = x & 0xFF;
+  s[0] = (char)c;
+  l = 1;
+}
+  else if ((c >= 0x8000) && (c <= 0x))
+{
+  s[0] = (char)(c>>8);
+  s[1] = (char)(c  &0xFF);
+  l = 2;
 }
   else
 {
-  s[0] = (x >> 24) & 0xFF;
-  s[1] = (x >> 16) & 0xFF;
-  s[2] = (x >> 8) & 0xFF;
-  s[3] = x & 0xFF;
+  /* Error Invalid UTF-8 */
+  l = 0;
 }
   s[l] = '\0';
-  return l;  
+  return l;
 }
-
+#endif
 int
-u32toutf8 (wc, s)
- wchar_t wc;
+utf32toutf8 (s, c)
  char *s;
+ unsigned long c;
 {
   int l;
 
-  l = (wc < 0x0080) ? 1 : ((wc < 0x0800) ? 2 : 3);
-
-  if (wc < 0x0080)
-s[0] = (unsigned char)wc;
-  else if (wc < 0x0800)
+  if (c <= 0x7F)
 {
-  s[0] = (wc >> 6) | 0xc0;
-  s[1] = (

Re: Bug? in bash setlocale implementation

2012-02-21 Thread John Kearney
On 02/22/2012 01:52 AM, Chet Ramey wrote:
> On 2/21/12 3:51 AM, John Kearney wrote:
> 
>> Bash Version: 4.2 Patch Level: 10 Release Status: release
>> 
>> Description: Basically if setting the locale fails variable
>> should not be changed.
> 
> I disagree.  The assignment was performed correctly and as the
> user specified.  The fact that a side effect of the assignment
> failed should not mean that the assignment should be undone.
> 
> I got enough bug reports when I added the warning.  I'd get at
> least as many if I undid a perfectly good assignment statement.
> 
> I could see setting $? to a non-zero value if the setlocale() call
> fails, but not when the shell is in posix mode.
> 
> Chet
> 
OK, I guess that makes sense. The ksh93 behavior also makes sense,
though; I guess I can just use some command to check that the charset
is present before I assign it.



printf "%q" "~" not escaped?

2012-02-21 Thread John Kearney
Bash Version: 4.2
Patch Level: 10
Release Status: release

Description:
printf "%q" "~" not escaped?

which means that this
eval echo $(printf "%q" "~")
results in your home path not a ~
unlike
eval echo $(printf "%q" "*")

as far as I can see its the only character that isn't treated as I
expected.



Re: Fix u32toutf8 so it encodes values > 0xFFFF correctly.

2012-02-22 Thread John Kearney
On 02/22/2012 01:59 PM, Eric Blake wrote:
> On 02/22/2012 05:19 AM, Linda Walsh wrote:
>>
>>
>> Eric Blake wrote:
>>
>>
>>> Not only can wchar_t can be either signed or unsigned, you also have to
>>> worry about platforms where it is only 16 bits, such as cygwin; on the
>>> other hand, wint_t is always 32 bits, but you still have the issue that
>>> it can be either signed or unsigned.
>>
>>
>>
>> What platform uses unsigned wide ints?  Is that even posix compat?
> 
> Yes, it is posix compatible to have wint_t be unsigned.  Not only that,
> but both glibc (32-bit wchar_t) and cygwin (16-bit wchar_t) use a 32-bit
> unsigned int for wint_t.  Any code that expects WEOF to be less than 0
> is broken.
> 
But if what you want is a uint32, use a uint32_t ;)



Re: Fix u32toutf8 so it encodes values > 0xFFFF correctly.

2012-02-22 Thread John Kearney
^ Caveat: you can represent the full range up to 0x10FFFF in UTF-16,
you just need two UTF-16 code units. Check out the latest version of
unicode.c for an example of how.
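The surrogate-pair construction referred to here can be sketched as
follows (assumed function name; not the actual unicode.c code):

```c
/* Sketch: encode a code point as one or two UTF-16 code units.
   Code points above 0xFFFF become a high/low surrogate pair. */
static int
utf32_to_utf16 (unsigned short *u, unsigned long c)
{
  if (c < 0xD800 || (c > 0xDFFF && c <= 0xFFFF))
    {
      u[0] = (unsigned short) c;        /* fits in one unit */
      return 1;
    }
  else if (c >= 0x10000 && c <= 0x10FFFF)
    {
      c -= 0x10000;                     /* 20 bits remain */
      u[0] = 0xD800 | (c >> 10);        /* high surrogate: top 10 bits */
      u[1] = 0xDC00 | (c & 0x3FF);      /* low surrogate: bottom 10 bits */
      return 2;
    }
  return 0;   /* lone surrogates and out-of-range values are invalid */
}
```

This is the loophole cygwin exploits: the surrogate range is carved
out of the single-unit space so that pairs can cover the full range.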

On 02/22/2012 11:32 PM, Eric Blake wrote:
> On 02/22/2012 03:01 PM, Linda Walsh wrote:
>> My question had to do with an unqualified wint_t not
>> unsigned wint_t and what platform existed where an 'int' type or
>> wide-int_t, was, without qualifiers, unsigned.  I still would like
>> to know -- and posix allows int/wide-ints to be unsigned without
>> the unsigned keyword?
> 
> 'int' is signed, and at least 16 bits (these days, it's usually 32).  It
> can also be written 'signed int'.
> 
> 'unsigned int' is unsigned, and at least 16 bits (these days, it's
> usually 32).
> 
> 'wchar_t' is an arbitrary integral type, either signed or unsigned, and
> capable of holding the value of all valid wide characters.   It is
> possible to define a system where wchar_t and char are identical
> (limiting yourself to 256 valid characters), but that is not done in
> practice.  More common are platforms that use 65536 characters (only the
> basic plane of Unicode) for 16 bits, or full Unicode (0 to 0x10FFFF) for
> 32 bits.  Platforms that use 65536 characters and 16-bit wchar_t must
> have wchar_t be unsigned; whereas platforms that have wchar_t wider than
> the largest valid character can choose signed or unsigned with no impact.
> 
> 'wint_t' is an arbitrary integral type, either signed or unsigned, at
> least as wide as wchar_t, and capable of holding the value of all valid
> wide characters and the sentinel WEOF.  Like wchar_t, it may hold values
> that are neither WEOF or valid characters; and in fact, it is more
> likely to do so, since either wchar_t is saturated (all bit values are
> valid characters) and thus wint_t is a wider type, or wchar_t is sparse
> (as is the case with 32-bit wchar_t encoding Unicode), and the addition
> of WEOF to the set does not plug in the remaining sparse values; but
> using such values has unspecified results on any interface that takes a
> wint_t.  WEOF only has to be distinct, it does not have to be negative.
> 
> Don't think of it as 'wide-int', rather, think of it as 'the integral
> type that both contains wchar_t and WEOF'.  You cannot write 'signed
> wint_t' nor 'unsigned 'wint_t'.
> 




Re: Fix u32toutf8 so it encodes values > 0xFFFF correctly.

2012-02-22 Thread John Kearney
And on the up side if they do ever give in and allow registration of
family name characters we may get a wchar_t, schar_t lwchar_t and a
llwchar_t
:)
just imagine a variable length 64bit char system.

Everything from Sumerian to Klingon in Unicode; though I think they
already are, if not officially, or are being worked on.

Oh god, what I really want now is bash in Klingon.

:))
just imagine a black background with glaring green text.
Now I know what I'm doing tonight.

check out ( shakes head in disbelief, while chuckling )
Ubuntu Klingon Translators https://launchpad.net/~ubuntu-l10n-tlh
Expansion: Ubuntu Font should support pIqaD (Klingon)
https://bugs.launchpad.net/ubuntu/+source/ubuntu-font-family-sources/+bug/650729



On 02/23/2012 04:54 AM, Eric Blake wrote:
> On 02/22/2012 07:43 PM, John Kearney wrote:
>> ^ Caveat: you can represent the full 0x10FFFF in UTF-16, you just
>> need 2 UTF-16 code units. check out the latest version of
>> unicode.c for an example how.
> 
> Yes, and Cygwin actually does this.
> 
> A strict reading of POSIX states that wchar_t must be wide enough
> for all supported characters, technically limiting things to just
> the basic plane if you have 16-bit wchar_t and a POSIX-compliant
> app.  But cygwin has exploited a loophole in the POSIX wording -
> POSIX does not require that all bit patterns are valid characters.
> So the actual Cygwin implementation is that on paper, rather than
> representing all 65536 patterns as valid characters, the values
> used in surrogate halves (0xd800 to 0xdfff) are listed as
> non-characters (so the use of them triggers undefined behavior per
> POSIX), but actually using them treats them as surrogate pairs
> (leading to the full Unicode character set, but reintroducing the
> headaches that multibyte characters had with 'char', but now with
> wchar_t, where you are back to dealing with variable-sized 
> character elements).
> 
> Furthermore, the mess of 16-bit vs. 32-bit wchar_t is one of the
> reasons why C11 has introduced two new character types, 16-bit and
> 32-bit characters, designed to fully map to the full Unicode set,
> regardless of what size wchar_t is.  It will be interesting to see
> how the next version of POSIX takes the additions of C11 and
> retrofits the other wide-character functions in POSIX but not C99
> to handle the new character types.
> 




Re: shopt can't set extglob in a sub-shell?

2012-02-26 Thread John Kearney
On 02/25/2012 09:42 PM, Davide Baldini wrote:
> Configuration Information [Automatically generated, do not
> change]: Machine: i486 OS: linux-gnu Compiler: gcc Compilation
> CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='i486'
> -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='i486-pc-linux-gnu'
> -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash'
> -DSHELL -DHAVE_CONFIG_H   -I.  -I../bash -I../bash/include
> -I../bash/lib   -g -O2 -Wall uname output: Linux debianBunker
> 2.6.26-2-686 #1 SMP Wed Sep 21 04:35:47 UTC 2011 i686 GNU/Linux 
> Machine Type: i486-pc-linux-gnu
> 
> Bash Version: 4.1 Patch Level: 5 Release Status: release


OK, so I had a play around with it. It's not specific to subshells; it
applies to compound commands in general.

So the following also doesn't work:

shopt -u extglob
if true; then
  shopt -s extglob
  echo !(x)
fi

This is because bash treats the entire if statement as a single
command, so the second shopt isn't evaluated before the !(x) is
parsed; hence the error message. The error message is a parsing error,
not an expansion error, I think.

So bash sees the above as:
shopt -u extglob
if true; then   shopt -s extglob;   echo !(x); fi

As a workaround you could use the following; it delays parsing the
!(x) until after the shopt has been evaluated:

(
shopt -s extglob
eval 'echo !(x)'
)


Not sure if this is expected behavior.
hth
deth.



Re: shopt can't set extglob in a sub-shell?

2012-02-26 Thread John Kearney
I updated that wiki page.
Hopefully it's clearer now.
http://mywiki.wooledge.org/glob#extglob


On 02/26/2012 12:06 PM, Dan Douglas wrote:
> On Saturday, February 25, 2012 09:42:29 PM Davide Baldini wrote:
> 
>> Description: A 'test.sh` script file composed exclusively of the
>> following text fails execution: #!/bin/bash ( shopt -s extglob 
>> echo !(x) ) giving the output: $ ./test.sh ./test.sh: line 4:
>> syntax error near unexpected token `(' ./test.sh: line 4: `
>> echo !(x)' Moving the shopt line above the sub-shell parenthesis
>> makes the script work.
>> 
>> The debian man pages give no explanation.
>> 
>> Thank you.
> 
> Non-eval workaround if you're desperate:
> 
> #!/usr/bin/env bash ( shopt -s extglob declare -a a='( !(x) )' echo
> "${a[@]}" )
> 
> You may be aware extglob is special and affects parsing in other
> ways. Quoting Greg's wiki (http://mywiki.wooledge.org/glob):
> 
>> Likewise, you cannot put shopt -s extglob inside a function that
>> uses extended globs, because the function as a whole must be
>> parsed when it's defined; the shopt command won't take effect
>> until the function is called, at which point it's too late.
> 
> This appears to be a similar situation. Since parentheses are
> "metacharacters" they act strongly as word boundaries without a
> special exception for extglobs.
> 
> I just tested a bunch of permutations. I was a bit surprised to see
> this one fail:
> 
> f() if [[ $FUNCNAME != ${FUNCNAME[1]} ]]; then trap 'shopt -u
> extglob' RETURN shopt -s extglob f else f()( shopt -s extglob echo
> !(x) ) f fi
> 
> f
> 
> I was thinking there might be a general solution via the RETURN
> trap where you could just set "trace" on functions where you want
> it, but looks like even "redefinitions" break recursively, so
> you're stuck. Fortunately, there aren't a lot of good reasons to
> have extglob disabled to begin with (if any).




Re: Initial test code for \U

2012-02-26 Thread John Kearney
On 02/22/2012 08:59 PM, Eric Blake wrote:
> On 02/22/2012 12:55 PM, Chet Ramey wrote:
>> On 2/21/12 5:07 PM, John Kearney wrote:
>>> 
>>> Initial code for testing \u functionality.
>> 
>> Thanks; this is really good work.  In the limited testing I've
>> done, ja_JP.SHIFT_JIS is rare and C.UTF-8 doesn't exist
>> anywhere.
> 
> C.UTF-8 exists on Cygwin.  But you are correct that...
> 
>> en_US.UTF-8 seems to perform acceptably for the latter.
> 
Also on Ubuntu. I only really started using it because it is
consistent with C
i.e.
LC_CTYPE=C
LC_CTYPE=C.UTF-8

Actually, this was the reason I made the comment about not being able
to detect setlocale errors in bash; I wanted to use a fallback list of
the locale synonyms.

The primary problem with this test is you need the locales installed.

Theoretical plan:
1. Compile a list of destination code sets.
2. Find some method to auto-install the code sets.
3. Get Unicode mappings for said code sets.
4. Use iconv to generate bash test tables.
5. Start crying at all the error messages ;(

Now, locale -m gives charsets.
Any ideas about finding Unicode mappings for said charsets?
I've been looking through the iconv code, but it all seems a bit laborious.
What charsets would make sense to test?




Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney
Actually this is something that still really confuses me as well.
In the end I gave up and just did this.

local LName="'\\''"
echo -n "'${1//"'"/${LName}}' "


I still don't really understand why this won't work:
echo -n "'${1//"'"/"'\''"}' "
echo -n "'${1//\'/\'\\\'\'}' "


The standard workaround you see is
echo -n \'${1//\'/\'\\\'\'}\'" "
but it's not the same thing.

I guess what I don't understand is why quoting the variable affects the
substitution string. I mean, I can see how it could happen, but it does
seem inconsistent; in fact, it feels like a bug.

And even where it does affect it, the effect seems weird.
i.e.

given
test="weferfds'dsfsdf"


# why does this work? This list was born of frustration; I tried
# everything I could think of.
echo \'${test//"'"/\'\\\'\'}\'" "
'weferfds'\''dsfsdf'

#but none of the following
echo "'${test//'/}'"   # hangs waiting for '

echo "'${test//"'"/}'"
'weferfdsdsfsdf'

echo "'${test//"'"/"'\\''"}'"
'weferfds"'\''"dsfsdf'

echo "'${test//"'"/'\\''}'" # hangs waiting for '

echo "'${test//"'"/\'\\'\'}'"
'weferfds\'\'\'dsfsdf'


leaving me doing something like
local LName="'\\''"
echo -n "'${1//"'"/${LName}}' "
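Wrapped up as a function, the temp-variable approach looks like this (squote is a hypothetical name; the replacement text '\'' closes the quote, emits an escaped quote, and reopens it):

```shell
#!/usr/bin/env bash
# Sketch of the temp-variable workaround: single-quote a string so it
# survives a later eval, replacing every ' with '\''.
squote() {
    local rep="'\\''"                  # the literal text '\''
    printf "'%s'" "${1//"'"/${rep}}"
}
squote "weferfds'dsfsdf"; echo         # prints: 'weferfds'\''dsfsdf'
```

Passing the result back through eval reproduces the original string.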



I mean, it's a silly thing, but it confuses me.


On 02/28/2012 03:47 PM, Roman Rakus wrote:
> On 02/28/2012 02:36 PM, Chet Ramey wrote:
>> On 2/28/12 4:17 AM, lhun...@lyndir.com wrote:
>>> Configuration Information [Automatically generated, do not
>>> change]: Machine: i386 OS: darwin11.2.0 Compiler:
>>> /Developer/usr/bin/clang Compilation CFLAGS:  -DPROGRAM='bash'
>>> -DCONF_HOSTTYPE='i386' -DCONF_OSTYPE='darwin11.2.0' 
>>> -DCONF_MACHTYPE='i386-apple-darwin11.2.0'
>>> -DCONF_VENDOR='apple' -DLOCALEDIR='/opt/local/share/locale'
>>> -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -DMACOSX   -I.  -I.
>>> -I./include -I./lib -I/opt/local/include -pipe -O2 -arch
>>> x86_64 uname output: Darwin mbillemo.lin-k.net 11.3.0 Darwin
>>> Kernel Version 11.3.0: Thu Jan 12 18:47:41 PST 2012; 
>>> root:xnu-1699.24.23~1/RELEASE_X86_64 x86_64 Machine Type:
>>> i386-apple-darwin11.2.0
>>> 
>>> Bash Version: 4.2 Patch Level: 20 Release Status: release
>>> 
>>> Description: The handling of backslash and quotes is completely
>>> inconsistent, counter-intuitive and in violation of how the
>>> syntax works elsewhere in bash.
>>> 
>>> ' appears to introduce a single-quoted context and \ appears
>>> to escape special characters.  That's good. A substitution
>>> pattern of ' causes bash to be unable to find the closing
>>> quote.  That's good. A substitution pattern of '' SHOULD equal
>>> an empty quoted string.  The result, however, is ''.  That's
>>> NOT good.  Suddenly the quotes are literal? A substitution
>>> pattern of '$var' SHOULD disable expansion inside the quotes.
>>> The result, however, is '[contents-of-var]'.  That's NOT good.
>>> In fact, it looks like quoting doesn't work here at all. \\ is
>>> a disabled backslash, and the syntactical backslash is removed.
>>> The result is \.  That's good. \' is a disabled single quote,
>>> but the syntactical backslash is NOT removed.  The result is
>>> \'.  That's NOT good.
>>> 
>>> It mostly looks like all the rules for handling quoting and 
>>> escaping are out the window and some random and utterly
>>> inconsistent set of rules is being applied instead.
>>> 
>>> Fix: Change parsing of the substitution pattern so that it
>>> abides by all the standard documented rules regarding quotes
>>> and escaping.
>> It would go better if you gave some examples of what you
>> consider incorrect behavior.  This description isn't helpful as
>> it stands.
>> 
> Maybe something like this:
> 
> # ttt=ggg # ggg="asd'ddd'g" # echo "'${!ttt//\'/'\''}'"
>> ^C
> # echo "'${!ttt//\'/\'\\\'\'}'" 'asd\'\\'\'ddd\'\\'\'g'
> 
> 
> 
> Anyway, I thought that single quote retains its special meaning in 
> double quotes. $ echo "'a'" 'a'
> 
> RR
> 




Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney
This all started with a wish to single-quote a variable. It doesn't
matter why; I have multiple solutions to that now.

But it is an interesting problem for exploring how escaping works in
variable expansion.

So for the test case, the goal is to take a string like
kljlksdjflsd'jkjkljl
wrap it in single quotes, and globally replace all single quotes in the
string with '\''


It's a workaround because it doesn't work all the time; you would need
something more like this:
IFS= echo \'${test//"'"/\'\\\'\'}\'" "
'weferfds'\''dsfsdf'
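The reason the unquoted form needs the IFS juggling is word splitting; a small sketch of the difference (the sample string is illustrative):

```shell
#!/usr/bin/env bash
# Unquoted expansions undergo word splitting; double-quoted ones don't.
test="a b'c"
unquoted=( ${test//"'"/X} )     # splits on whitespace -> two words
quoted=( "${test//"'"/X}" )     # one word, space preserved
echo "${#unquoted[@]} ${#quoted[@]}"    # prints: 2 1
```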



On 02/28/2012 05:01 PM, Greg Wooledge wrote:
> On Tue, Feb 28, 2012 at 04:52:48PM +0100, John Kearney wrote:
>> The standard work around you see is
>>  echo -n \'${1//\'/\'\\\'\'}\'" "
>>  but its not the same thing
> 
> Workaround for what?  Not the same thing as what?  What is this pile
> of punctuation attempting to do?
> 
>> # why does this work, this list was born of frustration, I tried
>> everything I could think of.
>> echo \'${test//"'"/\'\\\'\'}\'" "
>> 'weferfds'\''dsfsdf'
> 
> Are you trying to produce "safely usable" strings that can be fed to
> eval later?  Use printf %q for that.
> 
> imadev:~$ input="ain't it * a \"pickle\"?"
> imadev:~$ printf '%q\n' "$input"
> ain\'t\ it\ \*\ a\ \"pickle\"\?
> 
> printf -v evalable_input %q "$input"
> 
> Or, y'know, avoid eval.
> 
> Or is this something to do with sed?  Feeding strings to sed when you
> can't choose a safe delimiter?  That would involve an entirely different
> solution.  It would be nice to know what the problem is.




Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney
On 02/28/2012 05:22 PM, Roman Rakus wrote:
> On 02/28/2012 05:10 PM, John Kearney wrote:
>> wrap it with single quotes and globally replace all single quotes
>> in the string with '\''
> single quote and slash have special meaning so they have to be
> escaped, that's it. \'${var//\'/\\\'}\' it is not quoted, so it
> undergoes word splitting. To avoid it quote it in double quotes,
> however it changes how slash and single quote is treated. 
> "'${var//\'/\'}'"
> 
> Wasn't it already discussed on the list?
> 
> RR
> 
It was discussed but not answered in a way that helped.


Look, consider this:
test=teststring


echo "${test//str/""}"
test""ing

echo ${test//str/""}
testing


echo ${test//str/"'"}
test'ing

echo "${test//str/"'"}"
test"'"ing

echo "${test//str/'}"   # hangs


now consider this case

test=test\'string

echo "${test//"'"/"'"}"
test"'"string


The match string and the replacement string exhibit two different
behaviors.



Now, I'm not looking for a workaround; I want to understand it.
You say they are treated specially: what does that mean, and how can I
escape that specialness?

Or show me how to do this without using variables:
test=test\'string

[ "${test}" = "${test//"'"/"'"}" ] || exit 999




Note: this isn't the answer:
[ "${test}" = "${test//'/'}" ] || exit 999
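For reference, the temp-variable form that does pass the check (a sketch only; it doesn't answer the no-variables challenge):

```shell
#!/usr/bin/env bash
# With the pattern and replacement both supplied via a variable, the
# substitution is a no-op and the comparison succeeds.
test=test\'string
q="'"
[ "${test}" = "${test//$q/$q}" ] && echo ok    # prints: ok
```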









Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney
On 02/28/2012 06:05 PM, Steven W. Orr wrote:
> On 2/28/2012 11:54 AM, John Kearney wrote:
>> On 02/28/2012 05:22 PM, Roman Rakus wrote:
>>> On 02/28/2012 05:10 PM, John Kearney wrote:
>>>> wrap it with single quotes and globally replace all single
>>>> quotes in the string with '\''
>>> single quote and slash have special meaning so they have to be 
>>> escaped, that's it. \'${var//\'/\\\'}\' it is not quoted, so
>>> it undergoes word splitting. To avoid it quote it in double
>>> quotes, however it changes how slash and single quote is
>>> treated. "'${var//\'/\'}'"
>>> 
>>> Wasn't it already discussed on the list?
>>> 
>>> RR
>>> 
>> It was discussed but not answered in a way that helped.
>> 
>> 
>> Look consider this test=teststring
>> 
>> 
>> echo "${test//str/""}"
> 
> This makes no sense.
> 
> "${test//str/" is a string.  is anudder string "}" is a 3rd
> string
> 
> echo "${test//str/\"\"}"
> 
> is perfectly legal.
> 
> 
But that isn't how it behaves.
"${test//str/""}"

because str is replaced with '""', it is treating the double quotes as
string literals.

However, at the same time these literal double quotes escape/quote a
single quote placed between them.
They are thus treated both as literals and as quotes, which is
inconsistent.








Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney
On 02/28/2012 06:16 PM, Eric Blake wrote:
> On 02/28/2012 09:54 AM, John Kearney wrote:
>> On 02/28/2012 05:22 PM, Roman Rakus wrote:
>>> On 02/28/2012 05:10 PM, John Kearney wrote:
>>>> wrap it with single quotes and globally replace all single
>>>> quotes in the string with '\''
>>> single quote and slash have special meaning so they have to be 
>>> escaped, that's it. \'${var//\'/\\\'}\' it is not quoted, so
>>> it undergoes word splitting. To avoid it quote it in double
>>> quotes, however it changes how slash and single quote is
>>> treated. "'${var//\'/\'}'"
>>> 
>>> Wasn't it already discussed on the list?
>>> 
>>> RR
>>> 
>> It was discussed but not answered in a way that helped.
> 
> POSIX already says that using " inside ${var+value} is
> non-portable; you've just proven that using " inside the bash
> extension of ${var//pat/sub} is likewise not useful.
I'm just going for understandable/predictable right now.


> 
>> 
>> Now I'm not looking foe a workaround, I want to understand it. 
>> Now you say they are treated special what does that mean and how
>> can I escape that specialness.
> 
> By using temporary variables.  That's the only sane approach.
I do; it's just always bugged me.

> 
>> 
>> Or show me how without using variables to do this 
>> test=test\'string
>> 
>> [ "${test}" = "${test//"'"/"'"}" ] || exit 999
> 
> exit 999 is pointless.  It is the same as exit 231 on some shells,
> and according to POSIX, it is allowed to be a syntax error in other
> shells.
> 
I was going for || exit "Doomsday", i.e. 666 = 999 = Apocalypse.



Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney
On 02/28/2012 06:31 PM, Dan Douglas wrote:
> On Tuesday, February 28, 2012 05:53:32 PM Roman Rakus wrote:
>> On 02/28/2012 05:49 PM, Greg Wooledge wrote:
>>> On Tue, Feb 28, 2012 at 05:36:47PM +0100, Roman Rakus wrote:
 And that means, there isn't way to substitute "something" to
 ' (single quote) when you want to not perform word splitting.
 I would consider it as a bug.
>>> 
>>> imadev:~$ q=\' imadev:~$ input="foosomethingbar" imadev:~$ echo
>>> "${input//something/$q}" foo'bar
>> 
>> I meant without temporary variable.
>> 
>> RR
> ormaaj@ormaajbox ~ $ ( x=abc; echo ${x/b/$'\''} ) a'c

( x=abc; echo "${x/b/$'\''}" )
-bash: bad substitution: no closing `}' in "${x/b/'}"


you forgot the double quotes ;)


I really did spend an hour or two one day trying to figure it out,
and gave up.



Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney
On 02/28/2012 06:43 PM, Dan Douglas wrote:
> On Tuesday, February 28, 2012 06:38:22 PM John Kearney wrote:
>> On 02/28/2012 06:31 PM, Dan Douglas wrote:
>>> On Tuesday, February 28, 2012 05:53:32 PM Roman Rakus wrote:
>>>> On 02/28/2012 05:49 PM, Greg Wooledge wrote:
>>>>> On Tue, Feb 28, 2012 at 05:36:47PM +0100, Roman Rakus
>>>>> wrote:
>>>>>> And that means, there isn't way to substitute "something"
>>>>>> to ' (single quote) when you want to not perform word
>>>>>> splitting. I would consider it as a bug.
>>>>> 
>>>>> imadev:~$ q=\' imadev:~$ input="foosomethingbar" imadev:~$
>>>>> echo "${input//something/$q}" foo'bar
>>>> 
>>>> I meant without temporary variable.
>>>> 
>>>> RR
>>> 
>>> ormaaj@ormaajbox ~ $ ( x=abc; echo ${x/b/$'\''} ) a'c
>> 
>> ( x=abc; echo "${x/b/$'\''}" ) -bash: bad substitution: no
>> closing `}' in "${x/b/'}"
>> 
>> 
>> you forgot the double quotes ;)
>> 
>> 
>> I really did spend like an hour or 2 one day trying to figure it
>> out and gave up.
> 
> Hm good catch. Thought there might be a new quoting context over
> there.
I think we can all agree it's inconsistent; I'm just not so sure we
care, i.e. we know workarounds (variables, etc.) that aren't so bad.







Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney
On 02/28/2012 06:52 PM, John Kearney wrote:
> On 02/28/2012 06:43 PM, Dan Douglas wrote:
>> On Tuesday, February 28, 2012 06:38:22 PM John Kearney wrote:
>>> On 02/28/2012 06:31 PM, Dan Douglas wrote:
>>>> On Tuesday, February 28, 2012 05:53:32 PM Roman Rakus wrote:
>>>>> On 02/28/2012 05:49 PM, Greg Wooledge wrote:
>>>>>> On Tue, Feb 28, 2012 at 05:36:47PM +0100, Roman Rakus
>>>>>> wrote:
>>>>>>> And that means, there isn't way to substitute "something"
>>>>>>> to ' (single quote) when you want to not perform word
>>>>>>> splitting. I would consider it as a bug.
>>>>>>
>>>>>> imadev:~$ q=\' imadev:~$ input="foosomethingbar" imadev:~$
>>>>>> echo "${input//something/$q}" foo'bar
>>>>>
>>>>> I meant without temporary variable.
>>>>>
>>>>> RR
>>>>
>>>> ormaaj@ormaajbox ~ $ ( x=abc; echo ${x/b/$'\''} ) a'c
>>>
>>> ( x=abc; echo "${x/b/$'\''}" ) -bash: bad substitution: no
>>> closing `}' in "${x/b/'}"
>>>
>>>
>>> you forgot the double quotes ;)
>>>
>>>
>>> I really did spend like an hour or 2 one day trying to figure it
>>> out and gave up.
>>
>> Hm good catch. Thought there might be a new quoting context over
>> there.
> I think we can all agree its inconsistent, just not so sure we care??
> i.e. we know workarounds that aren't so bad variables etc.
> 
> 
> 
> 
> 
To sum up


bash treats replacement strings inconsistently in double-quoted variable
expansion.

For example, the double quote is treated both as a literal and as a quote
character:
( test=test123test ; echo "${test/123/"'"}" )
test"'"test
vs
( test=test123test ; echo "${test/123/'}" )  # hangs waiting for '

It is treated as a literal because it is printed; it is treated as a
quote character because otherwise the first example should also hang
waiting for a closing '.

Now, the single quote and backslash characters all seem to exhibit this
dual nature in the replacement string. The search string behaves
consistently, i.e. it treats characters either as special or as literal,
not as both at the same time.

This has got to be a bug, guys.











Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney
On 02/28/2012 07:00 PM, Dan Douglas wrote:
> On Tuesday, February 28, 2012 06:52:13 PM John Kearney wrote:
>> On 02/28/2012 06:43 PM, Dan Douglas wrote:
>>> On Tuesday, February 28, 2012 06:38:22 PM John Kearney wrote:
>>>> On 02/28/2012 06:31 PM, Dan Douglas wrote:
>>>>> On Tuesday, February 28, 2012 05:53:32 PM Roman Rakus
>>>>> wrote:
>>>>>> On 02/28/2012 05:49 PM, Greg Wooledge wrote:
>>>>>>> On Tue, Feb 28, 2012 at 05:36:47PM +0100, Roman Rakus
>>>>>>> 
>>>>>>> wrote:
>>>>>>>> And that means, there isn't way to substitute
>>>>>>>> "something" to ' (single quote) when you want to not
>>>>>>>> perform word splitting. I would consider it as a
>>>>>>>> bug.
>>>>>>> 
>>>>>>> imadev:~$ q=\' imadev:~$ input="foosomethingbar"
>>>>>>> imadev:~$ echo "${input//something/$q}" foo'bar
>>>>>> 
>>>>>> I meant without temporary variable.
>>>>>> 
>>>>>> RR
>>>>> 
>>>>> ormaaj@ormaajbox ~ $ ( x=abc; echo ${x/b/$'\''} ) a'c
>>>> 
>>>> ( x=abc; echo "${x/b/$'\''}" ) -bash: bad substitution: no 
>>>> closing `}' in "${x/b/'}"
>>>> 
>>>> 
>>>> you forgot the double quotes ;)
>>>> 
>>>> 
>>>> I really did spend like an hour or 2 one day trying to figure
>>>> it out and gave up.
>>> 
>>> Hm good catch. Thought there might be a new quoting context
>>> over there.
>> 
>> I think we can all agree its inconsistent, just not so sure we
>> care?? i.e. we know workarounds that aren't so bad variables
>> etc.
> 
> Eh, it's sort of consistent. e.g. this doesn't work either:
> 
> unset x; echo "${x:-$'\''}"
> 
> and likewise a backslash escape alone won't do the trick. I'd
> assume this applies to just about every expansion.
> 
> I didn't think too hard before posting that. :)


My favorite type of bug: one that's consistently inconsistent :)




Now that I have a better idea of what's weird, I'll take a look later,
after the gym.



Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney

On 02/28/2012 10:05 PM, Chet Ramey wrote:
> On 2/28/12 12:26 PM, John Kearney wrote:
> 
>> But that isn't how it behaves.
>> "${test//str/""}"
>>
>> because str is replaced with '""' as such it is treating the double
>> quotes as string literals.
>>
>> however at the same time these literal double quotes escape/quote a
>> single quote between them.
>> As such they are treated both as literals and as quotes as such
>> inconsistently.
> 
> I don't have a lot of time today, but I'm going to try and answer bits
> and pieces of this discussion.
> 
> Yes, bash opens a new `quoting context' (for lack of a better term) inside
> ${}.  Posix used to require it, though after lively discussion it turned
> into "well, we said that but it's clearly not what we meant."
> 
> There are a couple of places in the currently-published version of the
> standard, minus any corregendia, that specify this.  The description of
> ${parameter} reads, in part,
> 
> "The matching closing brace shall be determined by counting brace levels,
> skipping over enclosed quoted strings, and command substitutions."
> 
> The section on double quotes reads, in part:
> 
> "Within the string of characters from an enclosed "${" to the matching
> '}', an even number of unescaped double-quotes or single-quotes, if any,
> shall occur."
> 
> Chet

Yeah, but I think the point is that the current behavior is useless.
Is there any case where I want a " to be printed and also to start a
double-quoted string? Yet that's the current behavior.


It's not so important how you treat it; you just need to pick one
behavior, and then you can at least work with it. As it is, you have to
use a temp variable.


As a side note, ksh93 is pretty good and intuitive:
ksh93 -c 'test=teststrtest ; echo "${test//str/"dd dd"}"'
testdd ddtest
ksh93 -c '( test=teststrtest ; echo ${test//str/"dd '\''dd"} )'
testdd 'ddtest




Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney
On 02/28/2012 11:07 PM, Chet Ramey wrote:
> On 2/28/12 4:28 PM, John Kearney wrote:
>> 
>> On 02/28/2012 10:05 PM, Chet Ramey wrote:
>>> On 2/28/12 12:26 PM, John Kearney wrote:
>>> 
>>>> But that isn't how it behaves. "${test//str/""}"
>>>> 
>>>> because str is replaced with '""' as such it is treating
>>>> the double quotes as string literals.
>>>> 
>>>> however at the same time these literal double quotes
>>>> escape/quote a single quote between them. As such they are
>>>> treated both as literals and as quotes as such 
>>>> inconsistently.
>>> 
>>> I don't have a lot of time today, but I'm going to try and
>>> answer bits and pieces of this discussion.
>>> 
>>> Yes, bash opens a new `quoting context' (for lack of a better
>>> term) inside ${}.  Posix used to require it, though after
>>> lively discussion it turned into "well, we said that but it's
>>> clearly not what we meant."
>>> 
>>> There are a couple of places in the currently-published version
>>> of the standard, minus any corregendia, that specify this.  The
>>> description of ${parameter} reads, in part,
>>> 
>>> "The matching closing brace shall be determined by counting
>>> brace levels, skipping over enclosed quoted strings, and
>>> command substitutions."
>>> 
>>> The section on double quotes reads, in part:
>>> 
>>> "Within the string of characters from an enclosed "${" to the
>>> matching '}', an even number of unescaped double-quotes or
>>> single-quotes, if any, shall occur."
>>> 
>>> Chet
>> 
>> yhea but I think the point is that the current behavior is
>> useless. there is no case where I want a " to be printed and
>> start a double quoted string? and thats the current behavior.
>> 
>> 
>> Not so important how you treat it just need to pick 1. then you
>> can at least work with it. Now you have to use a temp variable.
>> 
>> 
>> as a side note ksh93 is pretty good, intuitive ksh93 -c
>> 'test=teststrtest ; echo "${test//str/"dd dd"}"' testdd ddtest 
>> ksh93 -c '( test=teststrtest ; echo ${test//str/"dd '\''dd"} )' 
>> testdd 'ddtest
> 
> The real question is whether or not you do quote removal on the
> stuff inside the braces when they're enclosed in double quotes.
> Double quotes usually inhibit quote removal.
> 
> The Posix "solution" to this is to require quote removal if a
> quote character (backslash, single quote, double quote) is used to
> escape or quote another character.  Somewhere I have the reference
> to the Austin group discussion on this.
> 

1${A:-B}2

Logically, for consistency, having double quotes at positions 1 and 2
should have no effect on how you treat string B.


Or consider this:
1${A/B/C}2

In this case it's even weirder: double quotes at 1 and 2 have no effect
on A or B but modify how string C behaves.





Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney
On 02/28/2012 11:15 PM, Chet Ramey wrote:
> On 2/28/12 5:07 PM, Chet Ramey wrote:
> 
>>> yhea but I think the point is that the current behavior is useless.
>>> there is no case where I want a " to be printed and start a double
>>> quoted string? and thats the current behavior.
>>>
>>>
>>> Not so important how you treat it just need to pick 1. then you can at
>>> least work with it. Now you have to use a temp variable.
>>>
>>>
>>> as a side note ksh93 is pretty good, intuitive
>>> ksh93 -c 'test=teststrtest ; echo "${test//str/"dd dd"}"'
>>> testdd ddtest
>>> ksh93 -c '( test=teststrtest ; echo ${test//str/"dd '\''dd"} )'
>>> testdd 'ddtest
>>
>> The real question is whether or not you do quote removal on the stuff
>> inside the braces when they're enclosed in double quotes.  Double
>> quotes usually inhibit quote removal.
>>
>> The Posix "solution" to this is to require quote removal if a quote
>> character (backslash, single quote, double quote) is used to escape
>> or quote another character.  Somewhere I have the reference to the
>> Austin group discussion on this.
> 
> http://austingroupbugs.net/view.php?id=221
> 
> Chet

This, however, makes no reference to changing that behavior if you
enclose the entire thing in double quotes.

${a//a/"a"} should behave the same as "${a//a/"a"}"

I mean the search and replacement strings should behave the same.
Currently they don't.






Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney
On 02/28/2012 11:23 PM, Chet Ramey wrote:
> On 2/28/12 5:18 PM, John Kearney wrote:
>> On 02/28/2012 11:07 PM, Chet Ramey wrote:
>>> On 2/28/12 4:28 PM, John Kearney wrote:
>>>>
>>>> On 02/28/2012 10:05 PM, Chet Ramey wrote:
>>>>> On 2/28/12 12:26 PM, John Kearney wrote:
>>>>>
>>>>>> But that isn't how it behaves. "${test//str/""}"
>>>>>>
>>>>>> because str is replaced with '""' as such it is treating
>>>>>> the double quotes as string literals.
>>>>>>
>>>>>> however at the same time these literal double quotes
>>>>>> escape/quote a single quote between them. As such they are
>>>>>> treated both as literals and as quotes as such 
>>>>>> inconsistently.
>>>>>
>>>>> I don't have a lot of time today, but I'm going to try and
>>>>> answer bits and pieces of this discussion.
>>>>>
>>>>> Yes, bash opens a new `quoting context' (for lack of a better
>>>>> term) inside ${}.  Posix used to require it, though after
>>>>> lively discussion it turned into "well, we said that but it's
>>>>> clearly not what we meant."
>>>>>
>>>>> There are a couple of places in the currently-published version
>>>>> of the standard, minus any corregendia, that specify this.  The
>>>>> description of ${parameter} reads, in part,
>>>>>
>>>>> "The matching closing brace shall be determined by counting
>>>>> brace levels, skipping over enclosed quoted strings, and
>>>>> command substitutions."
>>>>>
>>>>> The section on double quotes reads, in part:
>>>>>
>>>>> "Within the string of characters from an enclosed "${" to the
>>>>> matching '}', an even number of unescaped double-quotes or
>>>>> single-quotes, if any, shall occur."
>>>>>
>>>>> Chet
>>>>
>>>> yhea but I think the point is that the current behavior is
>>>> useless. there is no case where I want a " to be printed and
>>>> start a double quoted string? and thats the current behavior.
>>>>
>>>>
>>>> Not so important how you treat it just need to pick 1. then you
>>>> can at least work with it. Now you have to use a temp variable.
>>>>
>>>>
>>>> as a side note ksh93 is pretty good, intuitive ksh93 -c
>>>> 'test=teststrtest ; echo "${test//str/"dd dd"}"' testdd ddtest 
>>>> ksh93 -c '( test=teststrtest ; echo ${test//str/"dd '\''dd"} )' 
>>>> testdd 'ddtest
>>>
>>> The real question is whether or not you do quote removal on the
>>> stuff inside the braces when they're enclosed in double quotes.
>>> Double quotes usually inhibit quote removal.
>>>
>>> The Posix "solution" to this is to require quote removal if a
>>> quote character (backslash, single quote, double quote) is used to
>>> escape or quote another character.  Somewhere I have the reference
>>> to the Austin group discussion on this.
>>>
>>
>> 1${A:-B}2
>>
>> Logically for consistancy having double quotes at position 1 and 2
>> should have no effect on how you treat string B.
> 
> Maybe, but that's not how things work in practice.  Should the following
> expansions output the same thing?  What should they output?
> 
> bar=abc
> echo ${foo:-'$bar'}
> echo "${foo:-'$bar'}"
> 
> Chet
My first intuition on this whole thing was
§(varname arg1 arg2)

i.e. conceptually treat it like a function whose options are arguments.
That is then consistent and intuitive; don't get confused by the syntax.
If I want 'as', I'll type \'as\' or some such.


The outermost quotes only affect how the final value is handled,
same as with §().


Having a special behaviour model for that string makes it impossible to
work with, really.

This should actually make it easier for the parser.








Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney
On 02/28/2012 11:44 PM, Chet Ramey wrote:
> echo "$(echo '$bar')"

Actually, these both output the same thing in bash:
echo "$(echo '$bar')"
echo $(echo '$bar')



Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-28 Thread John Kearney
On 02/28/2012 11:23 PM, Chet Ramey wrote:
> On 2/28/12 5:18 PM, John Kearney wrote:
>> On 02/28/2012 11:07 PM, Chet Ramey wrote:
>>> On 2/28/12 4:28 PM, John Kearney wrote:
>>>>
>>>> On 02/28/2012 10:05 PM, Chet Ramey wrote:
>>>>> On 2/28/12 12:26 PM, John Kearney wrote:
>>>>>
>>>>>> But that isn't how it behaves. "${test//str/""}"
>>>>>>
>>>>>> because str is replaced with '""' as such it is treating
>>>>>> the double quotes as string literals.
>>>>>>
>>>>>> however at the same time these literal double quotes
>>>>>> escape/quote a single quote between them. As such they are
>>>>>> treated both as literals and as quotes as such 
>>>>>> inconsistently.
>>>>>
>>>>> I don't have a lot of time today, but I'm going to try and
>>>>> answer bits and pieces of this discussion.
>>>>>
>>>>> Yes, bash opens a new `quoting context' (for lack of a better
>>>>> term) inside ${}.  Posix used to require it, though after
>>>>> lively discussion it turned into "well, we said that but it's
>>>>> clearly not what we meant."
>>>>>
>>>>> There are a couple of places in the currently-published version
>>>>> of the standard, minus any corregendia, that specify this.  The
>>>>> description of ${parameter} reads, in part,
>>>>>
>>>>> "The matching closing brace shall be determined by counting
>>>>> brace levels, skipping over enclosed quoted strings, and
>>>>> command substitutions."
>>>>>
>>>>> The section on double quotes reads, in part:
>>>>>
>>>>> "Within the string of characters from an enclosed "${" to the
>>>>> matching '}', an even number of unescaped double-quotes or
>>>>> single-quotes, if any, shall occur."
>>>>>
>>>>> Chet
>>>>
>>>> yhea but I think the point is that the current behavior is
>>>> useless. there is no case where I want a " to be printed and
>>>> start a double quoted string? and thats the current behavior.
>>>>
>>>>
>>>> Not so important how you treat it just need to pick 1. then you
>>>> can at least work with it. Now you have to use a temp variable.
>>>>
>>>>
>>>> as a side note ksh93 is pretty good, intuitive ksh93 -c
>>>> 'test=teststrtest ; echo "${test//str/"dd dd"}"' testdd ddtest 
>>>> ksh93 -c '( test=teststrtest ; echo ${test//str/"dd '\''dd"} )' 
>>>> testdd 'ddtest
>>>
>>> The real question is whether or not you do quote removal on the
>>> stuff inside the braces when they're enclosed in double quotes.
>>> Double quotes usually inhibit quote removal.
>>>
>>> The Posix "solution" to this is to require quote removal if a
>>> quote character (backslash, single quote, double quote) is used to
>>> escape or quote another character.  Somewhere I have the reference
>>> to the Austin group discussion on this.
>>>
>>
>> 1${A:-B}2
>>
>> Logically for consistancy having double quotes at position 1 and 2
>> should have no effect on how you treat string B.
> 
> Maybe, but that's not how things work in practice.  Should the following
> expansions output the same thing?  What should they output?
> 
> bar=abc
> echo ${foo:-'$bar'}
> echo "${foo:-'$bar'}"
> 
> Chet


And truthfully, with the current behavior, I'd almost expect this output:

$bar
'$bar'

But to be honest, without trying it out I have no idea, and that is the
problem.



Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-29 Thread John Kearney

It isn't just the quote removal that is confusing.

The escape character is also not removed and has its special meaning.


And this also confuses me.
Take the following 2 cases:
echo ${a:-$'\''}
'
echo "${a:-$'\''}"
bash: bad substitution: no closing `}' in "${a:-'}"


And take the following 3 cases:
echo "${a:-$(echo $'\'')}"
bash: command substitution: line 38: unexpected EOF while looking for
matching `''
bash: command substitution: line 39: syntax error: unexpected end of file

echo ${a:-$(echo $'\'')}
'
echo "${a:-$(echo \')}"
'

This cannot be logical behavior.
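The temp-variable workaround mentioned earlier in the thread is the only form that behaves the same whether or not the expansion is double-quoted — a minimal sketch:

```shell
# Workaround: put the awkward character into a variable first, so no
# quote characters appear inside the ${...} at all.
q=\'                # a single quote
unset a
echo ${a:-$q}       # prints '
echo "${a:-$q}"     # prints ' as well
```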

On 02/29/2012 11:26 PM, Chet Ramey wrote:
> On 2/28/12 10:52 AM, John Kearney wrote:
>> Actually this is something that still really confuses me as
>> well.
> 
> The key is that bash doesn't do quote removal on the `string' part
> of the "${param/pat/string}" expansion.  The double quotes are key;
> quote removal happens when the expansion is unquoted.
> 
> Double quotes are supposed to inhibit quote removal, but bash's
> hybrid behavior of allowing quotes to escape characters but not
> removing them is biting us here.
> 




Re: RFE: allow bash to have libraries (was bash 4.2 breaks source finding libs in lib/filename...)

2012-02-29 Thread John Kearney
On 02/29/2012 11:53 PM, Linda Walsh wrote:
> 
> 
> Eric Blake wrote:
> 
>> On 02/29/2012 12:26 PM, Linda Walsh wrote:
>> 
 Any pathname that contains a / should not be subject to PATH
 searching.
>> 
>> Agreed - as this behavior is _mandated_ by POSIX, for both sh(1)
>> and for execlp(2) and friends.
> 
> 
> Is it that you don't read English as a first language, or are you
> just trying to be argumentative?
> 
> I said:  Original Message  Subject: bash 4.2 breaks
> source finding libs in lib/filename... Date: Tue, 28 Feb 2012
> 17:34:21 -0800 From: Linda Walsh To: bug-bash
> 
> Why was this functionality removed in non-posix mode?
> 
> So, your arguments are all groundless and pointless, as your entire
> arguments stem from posix .. which I specifically said I'm NOT
> specifying.   If I want posix behavior, I can flick a switch and
> have such compatibility.
> 
> however, Bash was designed to EXceeed the limitations and features
> of POSIX, so the fact that posix is restrained in this area, is a
> perfect reason to allow it -- as it makes it
> 
> 
>> 
>>> Pathnames that *start* with '/' are called an "absolute"
>>> pathnames,
>>> 
>>> while paths not starting with '/' are relative.
>> 
>> And among the set of relative pathnames, there are two further 
>> divisions: anchored (contains at least one '/') and unanchored
>> (no '/'). PATH lookup is defined as happening _only_ for
>> unanchored names.
>> 
>>> Try 'C', if you include a include file with "/", it scans for
>>> it in each .h root.
>> 
>> The C compiler _isn't_ doing a PATH search, so it follows
>> different rules.
>> 
>>> Almost all normal utils take their 'paths to be the 'roots' of 
>>> trees that contain files.  Why should bash be different?
>> 
>> Because that's what POSIX says.
> 
> --- Posix says to ground paths with "/" in them at the root's of
> their paths?   But it says differently for BASH?   you aren't
> making sense.
> 
> All the utils.
> 
> What does man do?... it looks for a "/" separated hierarchy under 
> EACH entry of MANPATH.
> 
> What does Perl do?  It looks for a "/" separated hierarchy under
> each entry in lib.
> 
> What does vim do?  It looks for a vim-hierarchy under each entry
> of it's list of vim-runtimes.
> 
> what does ld do?  What does C do?  What does C++ do?   They all
> look for "/" separated hierarchies under a PATH-like root.
> 
> 
> You claim that behavior is mandated by posix?   I didn't know
> posix specified perl standards.  or vim... but say they do 
> then why wouldn't you also look for a "/" separated hierarchy under
> PATH?
> 
> What does X do?   -- a "/" separated hierarchy?
> 
> 
> What does Microsoft do for registry locations?   a "\" separated 
> hierarchy under 64 or 32-bit registry areas.
> 
> Where do demons look for files?  Under a "/" separated hierarchy
> that may be root or a pseudo-root...
> 
> All of these utils use "/" separated hierarchies -- none of them
> refuse to do a path lookup with "/" is in the file name.   The
> entire concept of libraries would fail -- as they are organized
> hierarchically.   but you may not know the library location until
> runtime, so you have a path and a hierarchical lookup.
> 
> So why shouldn't Bash be able to look for 'library' functions in a 
> hierarchy?
> 
> Note -- as we are talking about non-posix mode of BASH, you can't
> use POSIX as a justification.
> 
> 
> As for making another switch -- there is already a switch --
> 'posix' for posix behavior.
> 
> I'm not asking for a change in posix behavior, so you can continue
> using posix mode ...
> 
> 
> 
> 
>> 
>>> It goes against 'common sense' and least surprise -- given it's
>>> the norm in so many other applications.
>> 
>> About the best we can do is accept a patch (are you willing to
>> write it? if not, quit complaining)
> 
> 
>> that would add a new shopt, off by default,
> 
> 
> ---
> 
> I would agree to it being off in posix mode, by default, and on,
> by default when not in posix mode...
> 
> 
> 
>> allow your desired alternate behavior.  But I won't write such a
>> patch, and if such a patch is written, I won't use it, because
>> I'm already used to the POSIX behavior.
> 
> --- How do you use the current behavior that doesn't do a path
> lookup if you include a / in the path (not at the beginning), that
> you would be able to make use of if you added "." to the beginning
> of your path (either temporarily or permanently...)?
> 
> 
> How do you organize your hierarchical libraries with bash so they
> don't have hard coded paths?
> 
> 
> 
why not just do something like this?
# FindInPathVarExt <ResultVar> <PathValue> <FileName> [<test-op> ...]
function FindInPathVarExt {
  local -a PathList
  local CPath CTest
  IFS=":" read -a PathList <<< "${2}"
  for CPath in "${PathList[@]}" ; do
    for CTest in "${@:4}"; do
      test "${CTest}" "${CPath}/${3}" || continue 2
    done
    printf -v "${1}" "${CPath}/${3}"
    return 0
  done
  printf -v "${1}" "Not Found"
  return 1
}
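The same idea can be sketched self-contained, for anyone who wants to try it without the thread's function (the names `find_in_path` and `p` are illustrative, not from the thread): search a colon-separated path list for a file that passes every given test(1) operator.

```shell
# Search a colon-separated path list for <name>, requiring it to pass
# each given test(1) operator; prints the first match.
# Note: glob characters in path components are not handled here.
find_in_path() {
  local IFS=':' dir
  for dir in $2; do                  # word-split $2 on ':'
    local ok=1 op
    for op in "${@:3}"; do
      test "$op" "$dir/$1" || { ok=0; break; }
    done
    [ "$ok" = 1 ] && { printf '%s\n' "$dir/$1"; return 0; }
  done
  return 1
}

find_in_path sh "/usr/bin:/bin" -x -f   # prints the first executable sh found
```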

Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-29 Thread John Kearney
On 03/01/2012 12:12 AM, Andreas Schwab wrote:
> John Kearney  writes:
> 
>> It isn't just the quote removal that is confusing.
>> 
>> The escape character is also not removed and has its special
>> meaning.
> 
> The esacape character is also a quote character, thus also subject
> to quote removal.
> 
> Andreas.
> 
Oh, I wasn't aware of that distinction. Thanks.



Re: Inconsistent quote and escape handling in substitution part of parameter expansions.

2012-02-29 Thread John Kearney
On 02/29/2012 11:55 PM, Chet Ramey wrote:
> On 2/28/12 4:28 PM, John Kearney wrote:
>>
>> On 02/28/2012 10:05 PM, Chet Ramey wrote:
>>> On 2/28/12 12:26 PM, John Kearney wrote:
>>>
>>>> But that isn't how it behaves.
>>>> "${test//str/""}"
>>>>
>>>> because str is replaced with '""' as such it is treating the double
>>>> quotes as string literals.
>>>>
>>>> however at the same time these literal double quotes escape/quote a
>>>> single quote between them.
>>>> As such they are treated both as literals and as quotes as such
>>>> inconsistently.
>>>
>>> I don't have a lot of time today, but I'm going to try and answer bits
>>> and pieces of this discussion.
>>>
>>> Yes, bash opens a new `quoting context' (for lack of a better term) inside
>>> ${}.  Posix used to require it, though after lively discussion it turned
>>> into "well, we said that but it's clearly not what we meant."
>>>
>>> There are a couple of places in the currently-published version of the
>>> standard, minus any corregendia, that specify this.  The description of
>>> ${parameter} reads, in part,
>>>
>>> "The matching closing brace shall be determined by counting brace levels,
>>> skipping over enclosed quoted strings, and command substitutions."
>>>
>>> The section on double quotes reads, in part:
>>>
>>> "Within the string of characters from an enclosed "${" to the matching
>>> '}', an even number of unescaped double-quotes or single-quotes, if any,
>>> shall occur."
>>>
>>> Chet
>>
>> Yeah, but I think the point is that the current behavior is useless.
>> There is no case where I want a " to both be printed and start a
>> double-quoted string, and that's the current behavior.
> 
> Maybe you don't, but there are several cases in the test suite that do
> exactly that, derived from an old bug report.
> 
> We don't have to keep the bash-4.2 behavior, but we need to acknowledge
> that it's not backwards-compatible.
> 
> 

Personally, I vote for ksh93-like behavior; it was more intuitive for me. Not
that I've tested it all that much, but the first impression was a good
one. Seriously, try it out and see which behavior you want to use.

As for backward compatibility: to be honest, I think that anybody who
relied on this behavior should be shot ;) Like someone already said, the
only sane way to use it now is with a temp variable.




Re: RFE: allow bash to have libraries

2012-03-01 Thread John Kearney
:) :))
Personal best: wrote about 1 line of code, which finally became
about 200ish, to implement a readkey function.

Actually ended up with 2 solutions: 1 based on a full bash-script
vt100 parser weighing in at about 500 lines including state tables, and
a s00 line hack.

Check out http://mywiki.wooledge.org/ReadingFunctionKeysInBash


Personally I'd have to say using PATH to source a module is a massive
security risk, but that's just me.
I actually have a pretty complex bash module hierarchy solution.
If anybody's interested I guess I could upload it somewhere; it's just
a plaything for me really, but it's a couple 1000 lines of code,
probably more like 1+.
It's kinda why I started updating Greg's wiki; I noticed I'd found
different/better ways of dealing with a lot of problems.

Things like secured copy/move functions. Task servers.
A generic approach to user interface interactions, i.e. supporting both
GUI and console input in my scripts.
I even started a bash-based ncurses-type system :); like I say, some
fun, still got some performance issues with that one.

Or an improved select function that supports arrow keys and mouse
selection, written in bash.

Anybody interested in this sort of thing?








On 03/01/2012 11:48 PM, Linda Walsh wrote:
> John Kearney wrote: ... [large repetitive included text elided...]
> 
>> why not just do something like this?
>> 
>  <26 line suggested 'header' elided...>
>> gives you more control anyway, pretty quick and simple.
>> 
>> 
> At least 30% of the point of this is to take large amounts of
> common initialization code that ends up at the front of  many or
> most of my scripts and have it hidden in a side file where it can 
> just be 'included'...
> 
> Having to add 26 lines of code just to include 20 common lines
> doesn't sound like a net-gain...
> 
> 
> I thought of doing something similar until I realized I'd end up 
> with some path-search routine written in shell at the beginning of
> each program just to enable bash to have structured & hierarchical
> libraries like any other programming language except maybe BASIC
> (or other shells)
> 
> My problem is I keep thinking problems can be solvable in a few
> lines of shell code.   Then they grow...   *sigh*...
> 
> 




Re: RFE: allow bash to have libraries

2012-03-01 Thread John Kearney
https://github.com/dethrophes/Experimental-Bash-Module-System/blob/master/bash/template.sh
So, can't repeat this enough: !!play code!!
However, suggestions are welcome. If this sort of thing is of
interest I could maintain it online, I guess.

Basically I was kinda thinking of a perl/python module library when I started.


So, what I like:

trap errors etc. and print error messages
set nounset
Try to keep the files in 2 parts: a source part and a run part.

Have a common args-handler routine.

Ridiculously complex log output: timestamped, line/file/function, etc.

Stack trace on errors.

Color output, red for errors etc.

Silly complex user-interface routines :)

I guess just have a look-see and try it out.


Also note I think a lot of the files are empty or silly files that
should actually be deleted; I don't have time to go through them now though.

I'd also advise using ctags, tagging it and navigating that way; it's what I do.


On 03/02/2012 03:54 AM, Clark J. Wang wrote:
> On Fri, Mar 2, 2012 at 08:20, John Kearney 
> wrote:
> 
>> :) :)) Personal best wrote about 1 lines of code  which
>> finally became about 200ish to implement a readkey function.
>> 
>> Actually ended up with 2 solutions 1 basted on a full bash
>> script vt100 parser weighing in a about 500 lines including state
>> tables and a s00 line hack.
>> 
>> Check out http://mywiki.wooledge.org/ReadingFunctionKeysInBash
>> 
>> 
>> Personally I'd have to say using path to source a moduel is a
>> massive securtiy risk but thats just me. I actually have a pretty
>> complex bash modules hierarchy solution. If anybodys interested I
>> guess I could upload it somewhere if anybodys interested,
> 
> 
> I just found https://gist.github.com/ a few days ago :)
> 
> Gist is a simple way to share snippets and pastes with others. All
> gists are git repositories, so they are automatically versioned,
> forkable and usable as a git repository.
> 
> 
>> its just a play thing for me really but its couple 1000 lines of
>> code proabely more like 1+. Its kinda why I started updating
>> Gregs wiwi I noticed I'd found different/better ways of dealing
>> with a lot of problems.
>> 
>> Thiing like secured copy/move funtions. Task Servers. Generic
>> approach to user interface interactions. i.e. supports both gui
>> and console input in my scripts. Or I even started a bash based
>> ncurses type system :), like I say some fune still got some
>> performance issues with that one.
>> 
>> Or improves select function that supports arrow keys and mouse 
>> selection, written in bash.
>> 
>> Anybody interested in this sort of thing?
>> 
> 
> I'm interested.




Re: bash 4.2 breaks source finding libs in lib/filename...

2012-03-03 Thread John Kearney
On 03/03/2012 09:43 AM, Stefano Lattarini wrote:
> On 03/03/2012 08:28 AM, Pierre Gaston wrote:
>> On Fri, Mar 2, 2012 at 9:54 AM, Stefano Lattarini wrote:
>>
>>> Or here is a what it sounds as a marginally better idea to me: Bash could
>>> start supporting a new environment variable like "BASHLIB" (a' la'
>>> PERL5LIB)
>>> or "BASHPATH" (a' la' PYTHONPATH) holding a colon separated (or semicolon
>>> separated on Windows) list of directories where bash will look for sourced
>>> non-absolute files (even if they contain a pathname separator) before
>>> (possibly) performing a lookup in $PATH and then in the current directory.
>>> Does this sounds sensible, or would it add too much complexity and/or
>>> confusion?
>>
>> It could be even furthermore separated from the traditional "source" and a
>> new keyword introduced like "require"
>>
> This might be a slightly better interface, yes.
Agreed, though "include" might be a better name than "require". And while
you're at it, why not include <> and include "" (as in C)?

> 
>> a la lisp which would be able to do things like:
>>
>> 1) load the file, searching in the BASH_LIB_PATH (or other variables) for a
>> file with optionally the extension .sh or .bash
>> 2) only load the file if the "feature" as not been provided, eg only load
>> the file once
>>
> These sound good :-)
No, I don't like that. If you want something like that, just use inclusion
protection like every other language:

if [ -z "${__file_sh__:-}" ]; then
  __file_sh__=1

  # file body here

fi

And my source wrapper function actually checks for that variable before
sourcing the file. Off the top of my head, something like this:

[ -n "${!__$(basename "${sourceFile}" .sh)_sh__}" ] || source "${sourceFile}"
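A runnable variant of that wrapper (the one-liner above mixes a command substitution into an indirect expansion, which bash does not accept) — a sketch with illustrative names, computing the guard variable name first:

```shell
# include-once wrapper sketch, following the __<name>_sh__ guard
# convention above; assumes the file's basename is a valid identifier.
include_once() {
    local file=$1
    local guard="__$(basename "$file" .sh)_sh__"
    [ -n "${!guard:-}" ] && return 0   # already sourced once
    printf -v "$guard" 1
    . "$file"
}
```

For example, `include_once lib/log.sh` sources the file the first time and is a no-op afterwards.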

> 
>> 3) maybe optionally only load the definition and not execute commands
>> (something I've seen people asking for on several occasions on IRC), for
>> instance that would allow to have test code inside the lib file or maybe
>> print a warning that it's a library not to be executed. (No so important
>> imo)
>>
> ... and even python don't do that!  If people care about making the test
> code in the module "automatically executable" when the module is run as
> a script, they could use an idiom similar to the python one:
> 
>   # For python.
>   if __name__ == "__main__":
> test code ...
> 
> i.e.:
> 
>   # For bash.
>   if [[ -n $BASH_SOURCE ]]; then
> test code ...
>   fi
> 
That only works if you source it from the command line, not execute it.
What you actually have to do is something like this:
  # For bash.
  if [[ "$(basename "${0}")" = scriptname.sh ]]; then
test code ...
  fi
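Another common idiom compares the two values directly — in bash, BASH_SOURCE[0] equals $0 only when the file is executed rather than sourced (bash-specific; the message text is illustrative):

```shell
# Run test code only when the file is executed, not when it is sourced.
if [ "${BASH_SOURCE[0]}" = "$0" ]; then
    echo "running as a script"
fi
```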


>> I think this would benefit the bash_completion project and help them to
>> split the script so that the completion are only loaded on demand.
>> (one of the goal mentionned at http://bash-completion.alioth.debian.org/ is
>> "make bash-completion dynamically load completions")
>> My understanding is that the
>> http://code.google.com/p/bash-completion-lib/project did something
>> like this but that it was not  working entirely as
>> they wanted.
>> (I hope some of the devs reads this list)
>>
>> On the other hand, there is the possibility to add FPATH and autoload like
>> in ksh93 ...
>> I haven't think to much about it but my guess is that it would really be
>> easy to implement a module system with that.
>>
>> my 2 centsas I don't have piles of bash lib.
>>
> Same here -- it was more of a "theoretical suggestion", in the category of
> "hey, you know what would be really cool to have?" :-)  But I don't deeply
> care about it, personally.

What would be really useful (dreamy eyes) would be namespace support :)

something like this
 { # codeblock
   namespace namespace1
   testvar=s

   { # codeblock
 namespace namespace2
 testvar=s

   }
 }

 treated like this
 namespace1.testvar=s
 namespace1.namespace2.testvar=s



 Although non-POSIX, this is already kinda supported, because you can do
 function test1.ert.3 {
 }

 I mean, all you would do is treat the namespace as a variable preamble,
 so you'd have something like this (pseudocode) to find the function etc.:
 if   type "${varname}"             ; then use it
 elif type "${namespace}${varname}" ; then use that
 else error not found
 fi

 Wouldn't actually break anything afaik.
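The "dots in function names" part is checkable today — a quick sketch (bash-specific, non-POSIX; the names are purely illustrative):

```shell
# bash accepts '.' in function names when they are defined with the
# 'function' keyword, which is enough to fake namespaced naming.
function namespace1.namespace2.hello {
    echo "hello from ${FUNCNAME[0]}"
}
namespace1.namespace2.hello
```

POSIX sh rejects such names, so this only flies in bash (and ksh-family shells).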


> 
> Regards,
>   Stefano
> 




Re: Can somebody explain to me what u32tochar in /lib/sh/unicode.c is trying to do?

2012-03-06 Thread John Kearney
You really should stop using this function. It is just plain wrong, and
it is not predictable.

It may encode BIG5 and SJIS, but that is more by accident than by intent.

If you want to do something like this, then do it properly.

Basically, all of the multibyte systems have to have a detection method
for multibyte characters; most of them rely on bit 7 to indicate a
multibyte sequence, or use vt100 SS3 escape sequences. You really can't
just inject random data into a text buffer. Even returning UTF-8 as a
fallback is a bug. The most that should be done is to return ASCII in the
error case, and I mean U+0-U+7F only, and ignore or warn about any
unsupported characters.

Using this function is dangerous and pointless.

I mean, seriously, in what world does it make sense to inject UTF-8 into a
BIG5 string? Or indeed into an ASCII string. Code should behave like an
adult, not like a frightened kid. By which I mean it shouldn't pretend
it knows what it's doing when it doesn't; it should admit the problem so
that the problem can be fixed.









On 02/21/2012 04:28 AM, Chet Ramey wrote:
> On 2/19/12 5:07 PM, John Kearney wrote:
>> Can somebody explain to me what u32tochar is trying to do?
>>
>> It seems like dangerous code?
>>
>> from the context i'm guessing it trying to make a hail mary pass at
>> converting utf-32 to mb (not utf-8 mb)
> 
> Pretty much.  It's a big-endian representation of a 32-bit integer
> as a character string.  It's what you get when you don't have iconv
> or iconv fails and the locale isn't UTF-8.  It may not be useful,
> but it's predictable.  If we have a locale the system doesn't know
> about or can't translate, there's not a lot we can do.
> 
> Chet




Please remove iconv_open (charset, "ASCII"); from unicode.c

2012-03-07 Thread John Kearney

Hi Chet, can you please remove the following from the unicode.c file:

localconv = iconv_open (charset, "ASCII");

This is an invalid fallback. This creates a translation config: the
primary attempt is UTF-8 to the destination codeset. If that conversion
fails, this tries selecting ASCII to the codeset. But the code still
feeds UTF-8 as input to iconv. This means it is less likely to
successfully encode than a simple assignment. Consider: U+80 becomes
UTF-8 "\xc2\x80", which, because we tell iconv this is ASCII, is passed
through as the "ASCII" bytes "\xc2\x80".

So this line takes a U+80 and turns it into a U+C2 and a U+80.

The way I rewrote the iconv code made it cleaner, safer and quicker;
please consider using it. I avoided the need for the strcpy, among
other things.
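The mislabeling effect is easy to reproduce from the shell. glibc's iconv rejects bytes above 0x7F when the input is declared ASCII, so this sketch (which assumes iconv(1) and od(1) are installed) uses ISO-8859-1 to show the same double-encoding:

```shell
# U+0080 encoded as UTF-8 is the byte pair C2 80.  Re-converting those
# bytes while claiming they are ISO-8859-1 encodes each byte as its own
# code point, yielding C3 82 C2 80 (U+00C2 U+0080) -- classic mojibake.
printf '\302\200' | iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1
```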






On 02/21/2012 03:42 AM, Chet Ramey wrote:
> On 2/18/12 5:39 AM, John Kearney wrote:
> 
>> Bash Version: 4.2 Patch Level: 10 Release Status: release
>> 
>> Description: Current u32toutf8 only encode values below 0x
>> correctly. wchar_t can be ambiguous size better in my opinion to
>> use unsigned long, or uint32_t, or something clearer.
> 
> Thanks for the patch.  It's good to have a complete
> implementation, though as a practical matter you won't see UTF-8
> characters longer than four bytes.  I agree with you about the
> unsigned 32-bit int type; wchar_t is signed, even if it's 32 bits,
> on several systems I use.
> 
> Chet
> 




Re: Is it possible or RFE to expand ranges of *arrays*

2012-04-25 Thread John Kearney

Am 26.04.2012 06:26, schrieb Linda Walsh:

I know I can get
a="abcdef" echo "${a[2:4]}" = cde

 how do I do:
typeset -a a=(apple berry cherry date); then get:

echo ${a[1:2]} = "berry" "cherry"  ( non-grouped args)

I tried to do it in a function and hurt myself.




echo ${a[@]:1:2}



Re: Is it possible or RFE to expand ranges of *arrays*

2012-04-28 Thread John Kearney

Am 28.04.2012 05:05, schrieb Linda Walsh:



Maarten Billemont wrote:


On 26 Apr 2012, at 06:30, John Kearney wrote:

Am 26.04.2012 06:26, schrieb Linda Walsh:

I know I can get
a="abcdef" echo "${a[2:4]}" = cde

how do I do:
typeset -a a=(apple berry cherry date); then get:

echo ${a[1:2]} = "berry" "cherry"  ( non-grouped args)

I tried to do it in a function and hurt myself.




echo ${a[@]:1:2}



I see little reason to ask bash to wordsplit the elements after 
expanding them. 



You ought to quote that expansion.

---
Good point.

Since if you do:

> a=( 'apple pie' 'berry pie' 'cherry cake' 'dates divine')
> b=( ${a[@]:1:2} )
> echo ${#b[*]}
4
#yikes!
> b=( "${a[@]:1:2}" )
2
#woo!

I'd guess the original poster probably figured, I'd figure out the 
correct

form pretty quickly in usage.  but thanks for your insight.

( (to all)*sigh*)



I "always" quote; not sure why I didn't that time.
It was just a quick response to a simple question.
But of course you're right.



Re: Fwd: Bash bug interpolating delete characters

2012-05-03 Thread John Kearney
Am 03.05.2012 15:01, schrieb Greg Wooledge:
>> Yours, Rüdiger.
>> a=x
>> del="$(echo -e "\\x7f")"
>>
>> echo "$del${a#x}" | od -ta
>> echo "$del ${a#x}" | od -ta
>> echo " $del${a#x}" | od -ta
> Yup, confirmed that it breaks here, and only when the # parameter expansion
> is included.
>
> imadev:~$ del=$'\x7f' a=x b=
> imadev:~$ echo " $del$b" | od -ta
> 000   sp del  nl
> 003
> imadev:~$ echo " $del${b}" | od -ta
> 000   sp del  nl
> 003
> imadev:~$ echo " $del${b#x}" | od -ta
> 000   sp del  nl
> 003
> imadev:~$ echo " $del${a#x}" | od -ta
> 000   sp  nl
> 002
>
> Bash 4.2.24.
>
Also confirmed, but my output is a bit wackier.
printf %q seems to get confused and does invalid things as well.

the \x7f becomes a \

function printTests {
    while [ $# -gt 0 ]; do
        printf "%-20s=[%q]\n" "${1}" "$(eval echo "${1}")"
        shift
    done
}

a=x
del=$'\x7f'
printTests '"$del${a#x}"' '"$del ${a#x}"' '" $del${a#x}"' '" $del${a%x}"'
printTests '" $del${a:0:0}"' '" $del"${a:0:0}' '" $del""${a:0:0}"'
printTests '" $del${a}"' '" $del"' '" ${del}${a:0:0}"' '"
${del:0:1}${a:0:0}"'
printTests '" ${del:0:1}${a}"' '"${del:0:1}${a#d}"' '"${del:0:1}${a#x}"'
printTests '" ${del:0:1} ${a}"' '"${del:0:1} ${a#d}"' '"${del:0:1} ${a#x}"'

output
"$del${a#x}"=[$'\177']
"$del ${a#x}"   =[\ ]
" $del${a#x}"   =[\ ]
" $del${a%x}"   =[\ ]
" $del${a:0:0}" =[\ ]
" $del"${a:0:0} =[$' \177']
" $del""${a:0:0}"   =[$' \177']
" $del${a}" =[$' \177x']
" $del" =[$' \177']
" ${del}${a:0:0}"   =[\ ]
" ${del:0:1}${a:0:0}"=[\ ]
" ${del:0:1}${a}"   =[$' \177x']
"${del:0:1}${a#d}"  =[$'\177x']
"${del:0:1}${a#x}"  =[$'\177']
" ${del:0:1} ${a}"  =[$' \177 x']
"${del:0:1} ${a#d}" =[$'\177 x']
"${del:0:1} ${a#x}" =[\ ]








Re: Fwd: Bash bug interpolating delete characters

2012-05-03 Thread John Kearney
Am 03.05.2012 19:41, schrieb John Kearney:
> Am 03.05.2012 15:01, schrieb Greg Wooledge:
>>> Yours, Rüdiger.
>>> a=x
>>> del="$(echo -e "\\x7f")"
>>>
>>> echo "$del${a#x}" | od -ta
>>> echo "$del ${a#x}" | od -ta
>>> echo " $del${a#x}" | od -ta
>> Yup, confirmed that it breaks here, and only when the # parameter expansion
>> is included.
>>
>> imadev:~$ del=$'\x7f' a=x b=
>> imadev:~$ echo " $del$b" | od -ta
>> 000   sp del  nl
>> 003
>> imadev:~$ echo " $del${b}" | od -ta
>> 000   sp del  nl
>> 003
>> imadev:~$ echo " $del${b#x}" | od -ta
>> 000   sp del  nl
>> 003
>> imadev:~$ echo " $del${a#x}" | od -ta
>> 000   sp  nl
>> 002
>>
>> Bash 4.2.24.
>>
> Also Confirmed, but my output is a bit wackier.
> printf %q seems to get confused, and do invalid things as well.
>
> the \x7f becomes a \
Disregard the comment about printf; it's just escaping the space.
>
> function printTests {
>     while [ $# -gt 0 ]; do
>         printf "%-20s=[%q]\n" "${1}" "$(eval echo "${1}")"
>         shift
>     done
> }
>
> a=x
> del=$'\x7f'
> printTests '"$del${a#x}"' '"$del ${a#x}"' '" $del${a#x}"' '" $del${a%x}"'
> printTests '" $del${a:0:0}"' '" $del"${a:0:0}' '" $del""${a:0:0}"'
> printTests '" $del${a}"' '" $del"' '" ${del}${a:0:0}"' '"
> ${del:0:1}${a:0:0}"'
> printTests '" ${del:0:1}${a}"' '"${del:0:1}${a#d}"' '"${del:0:1}${a#x}"'
> printTests '" ${del:0:1} ${a}"' '"${del:0:1} ${a#d}"' '"${del:0:1} ${a#x}"'
>
> output
> "$del${a#x}"=[$'\177']
> "$del ${a#x}"   =[\ ]
> " $del${a#x}"   =[\ ]
> " $del${a%x}"   =[\ ]
> " $del${a:0:0}" =[\ ]
> " $del"${a:0:0} =[$' \177']
> " $del""${a:0:0}"   =[$' \177']
> " $del${a}" =[$' \177x']
> " $del" =[$' \177']
> " ${del}${a:0:0}"   =[\ ]
> " ${del:0:1}${a:0:0}"=[\ ]
> " ${del:0:1}${a}"   =[$' \177x']
> "${del:0:1}${a#d}"  =[$'\177x']
> "${del:0:1}${a#x}"  =[$'\177']
> " ${del:0:1} ${a}"  =[$' \177 x']
> "${del:0:1} ${a#d}" =[$'\177 x']
> "${del:0:1} ${a#x}" =[\ ]
>
>
>
>
>
>




Re: Parallelism a la make -j / GNU parallel

2012-05-03 Thread John Kearney
I tend to do something more like this


function runJobParrell {
local mjobCnt=${1} && shift
jcnt=0
function WrapJob {
"${@}"
kill -s USR2 $$
}
function JobFinised {
jcnt=$((${jcnt}-1))
}
trap JobFinised USR2
while [ $# -gt 0 ] ; do
while [ ${jcnt} -lt ${mjobCnt} ]; do
jcnt=$((${jcnt}+1))
echo WrapJob "${1}" "${2}"
WrapJob "${1}" "${2}" &
shift 2
done
sleep 1
done
}
function testProcess {
echo "${*}"
sleep 1
}
runJobParrell 2  testProcess "jiji#" testProcess "jiji#" testProcess
"jiji#"

Tends to work well enough.
It gets a bit more complex if you want to recover the output, but not too much.
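As a side note, on newer bash (4.3+) the USR2 signal bookkeeping can be avoided entirely with `wait -n`, which blocks until any one background job exits — a minimal throttling sketch along the same lines (function and variable names are illustrative, not from the thread):

```shell
# Throttle background jobs to at most $max at a time (bash >= 4.3).
max=2
run_throttled() {
    # bash special-cases $(jobs ...) to report the parent shell's jobs
    while [ "$(jobs -pr | wc -l)" -ge "$max" ]; do
        wait -n                     # block until some job exits
    done
    "$@" &
}

for i in 1 2 3 4 5; do
    run_throttled sleep 0.2
done
wait                                # drain the remaining jobs
```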

Am 03.05.2012 21:21, schrieb Elliott Forney:
> Here is a construct that I use sometimes... although you might wind up
> waiting for the slowest job in each iteration of the loop:
>
>
> maxiter=100
> ncore=8
>
> for iter in $(seq 1 $maxiter)
> do
>   startjob $iter &
>
>   if (( (iter % $ncore) == 0 ))
>   then
> wait
>   fi
> done
>
>
> On Thu, May 3, 2012 at 12:49 PM, Colin McEwan  wrote:
>> Hi there,
>>
>> I don't know if this is anything that has ever been discussed or
>> considered, but would be interested in any thoughts.
>>
>> I frequently find myself these days writing shell scripts, to run on
>> multi-core machines, which could easily exploit lots of parallelism (eg. a
>> batch of a hundred independent simulations).
>>
>> The basic parallelism construct of '&' for async execution is highly
>> expressive, but it's not useful for this sort of use-case: starting up 100
>> jobs at once will leave them competing, and lead to excessive context
>> switching and paging.
>>
>> So for practical purposes, I find myself reaching for 'make -j' or GNU
>> parallel, both of which destroy the expressiveness of the shell script as I
>> have to redirect commands and parameters to Makefiles or stdout, and
>> wrestle with appropriate levels of quoting.
>>
>> What I would really *like* would be an extension to the shell which
>> implements the same sort of parallelism-limiting / 'process pooling' found
>> in make or 'parallel' via an operator in the shell language, similar to '&'
>> which has semantics of *possibly* continuing asynchronously (like '&') if
>> system resources allow, or waiting for the process to complete (';').
>>
>> Any thoughts, anyone?
>>
>> Thanks!
>>
>> --
>> C.
>>
>> https://plus.google.com/109211294311109803299
>> https://www.facebook.com/mcewanca




Re: Parallelism a la make -j / GNU parallel

2012-05-03 Thread John Kearney
Am 03.05.2012 22:30, schrieb Greg Wooledge:
> On Thu, May 03, 2012 at 10:12:17PM +0200, John Kearney wrote:
>> function runJobParrell {
>> local mjobCnt=${1} && shift
>> jcnt=0
>> function WrapJob {
>> "${@}"
>> kill -s USR2 $$
>> }
>> function JobFinised {
>> jcnt=$((${jcnt}-1))
>> }
>> trap JobFinised USR2
>> while [ $# -gt 0 ] ; do
>> while [ ${jcnt} -lt ${mjobCnt} ]; do
>> jcnt=$((${jcnt}+1))
>> echo WrapJob "${1}" "${2}"
>> WrapJob "${1}" "${2}" &
>> shift 2
>> done
>> sleep 1
>> done
>> }
>> function testProcess {
>> echo "${*}"
>> sleep 1
>> }
>> runJobParrell 2  testProcess "jiji#" testProcess "jiji#" testProcess
>> "jiji#"
>>
>> tends to work well enough.
>> it gets a bit more complex if you want to recover output but not too much.
> The real issue here is that there is no generalizable way to store an
> arbitrary command for later execution.  Your example assumes that each
> pair of arguments constitutes one simple command, which is fine if that's
> all you need it to do.  But the next guy asking for this will want to
> schedule arbitrarily complex shell pipelines and complex commands with
> here documents and brace expansions and 
>


:)
A more complex/flexible example. More like what I actually use.




  CNiceLevel=$(nice)
declare -a JobArray
function PushAdvancedCmd {
local IFS=$'\v'
JobArray+=("${*}")
}
function PushSimpleCmd {
PushAdvancedCmd  WrapJob ${CNiceLevel} "${@}"
}
function PushNiceCmd {
PushAdvancedCmd  WrapJob "${@}"
}
function UnpackCmd {
local IFS=$'\v'
set -o noglob
_RETURN=( .${1}. )  
set +o noglob
_RETURN[0]="${_RETURN[0]#.}"
local -i le=${#_RETURN[@]}-1
_RETURN[${le}]="${_RETURN[${le}]%.}"
}
function runJobParrell {
local mjobCnt=${1} && shift
jcnt=0
function WrapJob {
[ ${1} -le ${CNiceLevel} ] || renice -n ${1}
local Buffer=$("${@:2}")
echo "${Buffer}"
kill -s USR2 $$
}
function JobFinised {
jcnt=$((${jcnt}-1))
}
trap JobFinised USR2
while [ $# -gt 0 ] ; do
while [ ${jcnt} -lt ${mjobCnt} ]; do
jcnt=$((${jcnt}+1))
UnpackCmd "${1}"
"${_RETURN[@]}" &
shift
done
sleep 1
done
}



function testProcess {
echo "${*}"
sleep 1
}
#  So standard variable args can be handled in 2 ways 1
#  encode them as such
PushSimpleCmd testProcess "jiji#" dfds dfds dsfsd
PushSimpleCmd testProcess "jiji#" dfds dfds
PushNiceCmd 20 testProcess "jiji#" dfds
PushSimpleCmd testProcess "jiji#"
PushSimpleCmd testProcess "jiji#" "*" s
# more complex things just wrap them in a function and call it
function DoComplexMagicStuff1 {
echo "${@}" >&2
}
# Or more normally just do a hybrid of both.
PushSimpleCmd DoComplexMagicStuff1 "jiji#"

#
   
runJobParrell 1 "${JobArray[@]}"



Note there is another level of complexity where I start a JobQueue
process and issue it commands using a FIFO.





Re: Parallelism a la make -j / GNU parallel

2012-05-03 Thread John Kearney
This version might be easier to follow. The last version was more for
being able to issue commands via a fifo to a job queue server.

function check_valid_var_name {
    case "${1:?Missing Variable Name}" in
        [!a-zA-Z_]* | *[!a-zA-Z_0-9]* ) return 3;;
    esac
}

CNiceLevel=$(nice)
declare -a JobArray
function PushAdvancedCmd {
    local le="tmp_array${#JobArray[@]}"
    JobArray+=("${le}")
    eval "${le}"'=("${@}")'
}
function PushSimpleCmd {
    PushAdvancedCmd WrapJob ${CNiceLevel} "${@}"
}
function PushNiceCmd {
    PushAdvancedCmd WrapJob "${@}"
}
function UnpackCmd {
    check_valid_var_name ${1} || return $?
    eval _RETURN=('"${'"${1}"'[@]}"')
    unset "${1}[@]"
}
function runJobParrell {
    local mjobCnt=${1} && shift
    jcnt=0
    function WrapJob {
        [ ${1} -le ${CNiceLevel} ] || renice -n ${1}
        local Buffer=$("${@:2}")
        echo "${Buffer}"
        kill -s USR2 $$
    }
    function JobFinised {
        jcnt=$((${jcnt}-1))
    }
    trap JobFinised USR2
    while [ $# -gt 0 ] ; do
        while [ ${jcnt} -lt ${mjobCnt} ]; do
            jcnt=$((${jcnt}+1))
            if UnpackCmd "${1}" ; then
                "${_RETURN[@]}" &
            else
                continue
            fi
            shift
        done
        sleep 1
    done
}
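For what it's worth, on bash 4.3 and later both evals (the store and the unpack) can be replaced with namerefs (`declare -n`). A hedged sketch of the same store-and-unpack idea; the variable names are made up, not from the posted script:

```shell
declare -a JobArray=()
declare -i JobId=0

PushCmd() {
    local name="job_$(( JobId++ ))"
    declare -g -a "$name"
    local -n slot="$name"      # nameref to the per-job global array
    slot=( "$@" )
    JobArray+=( "$name" )
}

UnpackCmd() {
    local -n src="$1"
    _RETURN=( "${src[@]}" )    # copy the stored argv out, then free the slot
    unset "$1"
}

PushCmd echo hello "two words"
UnpackCmd "${JobArray[0]}"
"${_RETURN[@]}"
```

This keeps arbitrary argument vectors intact (including whitespace and glob characters) without ever passing user data through eval.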





Am 03.05.2012 23:23, schrieb John Kearney:
> Am 03.05.2012 22:30, schrieb Greg Wooledge:
>> On Thu, May 03, 2012 at 10:12:17PM +0200, John Kearney wrote:
>>> function runJobParrell {
>>> local mjobCnt=${1} && shift
>>> jcnt=0
>>> function WrapJob {
>>> "${@}"
>>> kill -s USR2 $$
>>> }
>>> function JobFinised {
>>> jcnt=$((${jcnt}-1))
>>> }
>>> trap JobFinised USR2
>>> while [ $# -gt 0 ] ; do
>>> while [ ${jcnt} -lt ${mjobCnt} ]; do
>>> jcnt=$((${jcnt}+1))
>>> echo WrapJob "${1}" "${2}"
>>> WrapJob "${1}" "${2}" &
>>> shift 2
>>> done
>>> sleep 1
>>> done
>>> }
>>> function testProcess {
>>> echo "${*}"
>>> sleep 1
>>> }
>>> runJobParrell 2  testProcess "jiji#" testProcess "jiji#" testProcess
>>> "jiji#"
>>>
>>> tends to work well enough.
>>> it gets a bit more complex if you want to recover output but not too much.
>> The real issue here is that there is no generalizable way to store an
>> arbitrary command for later execution.  Your example assumes that each
>> pair of arguments constitutes one simple command, which is fine if that's
>> all you need it to do.  But the next guy asking for this will want to
>> schedule arbitrarily complex shell pipelines and complex commands with
>> here documents and brace expansions and 
>>
>
> :)
> A more complex/flexible example. More like what I actually use.
>
>
>
>
>   CNiceLevel=$(nice)
> declare -a JobArray
> function PushAdvancedCmd {
> local IFS=$'\v'
> JobArray+=("${*}")
> }
> function PushSimpleCmd {
> PushAdvancedCmd  WrapJob ${CNiceLevel} "${@}"
> }
> function PushNiceCmd {
> PushAdvancedCmd  WrapJob "${@}"
> }
> function UnpackCmd {
> local IFS=$'\v'
> set -o noglob
> _RETURN=( .${1}. )  
> set +o noglob
> _RETURN[0]="${_RETURN[0]#.}"
> local -i le=${#_RETURN[@]}-1
> _RETURN[${le}]="${_RETURN[${le}]%.}"
> }
> function runJobParrell {
> local mjobCnt=${1} && shift
> jcnt=0
> function WrapJob {
> [ ${1} -le ${CNiceLevel} ] || renice -n ${1}
> local Buffer=$("${@:2}")
> echo "${Buffer}"
> kill -s USR2 $$
> }
> function JobFinised {
> jcnt=$((${jcnt}-1))
> }
> trap JobFinised USR2
> while [ $# -gt 0 ] ; do
> while [ ${jcnt} -lt ${mjobCnt} ]; do
> jcnt=$((${jcnt}+1))

Re: Parallelism a la make -j / GNU parallel

2012-05-04 Thread John Kearney
Am 04.05.2012 20:53, schrieb Mike Frysinger:
> On Friday 04 May 2012 13:46:32 Andreas Schwab wrote:
>> Mike Frysinger  writes:
>>> i wish there was a way to use `wait` that didn't block until all the pids
>>> returned.  maybe a dedicated option, or a shopt to enable this, or a new
>>> command.
>>>
>>> for example, if i launched 10 jobs in the background, i usually want to
>>> wait for the first one to exit so i can queue up another one, not wait
>>> for all of them.
>> If you set -m you can trap on SIGCHLD while waiting.
> awesome, that's a good mitigation
>
> #!/bin/bash
> set -m
> cnt=0
> trap ': $(( --cnt ))' SIGCHLD
> for n in {0..20} ; do
>   (
>   d=$(( RANDOM % 10 ))
>   echo $n sleeping $d
>   sleep $d
>   ) &
>   : $(( ++cnt ))
>   if [[ ${cnt} -ge 10 ]] ; then
>   echo going to wait
>   wait
>   fi
> done
> trap - SIGCHLD
> wait
>
> it might be a little racy (wrt checking cnt >= 10 and then doing a wait), but 
> this is good enough for some things.  it does lose visibility into which pids 
> are live vs reaped, and their exit status, but i more often don't care about 
> that ...
> -mike
That won't work I don't think.
I think you meant something more like this?

set -m
cnt=0
trap ': $(( --cnt ))' SIGCHLD
set -- {0..20}
while [ $# -gt 0 ]; do
    if [[ ${cnt} -lt 10 ]] ; then
        (
            d=$(( RANDOM % 10 ))
            echo $n sleeping $d
            sleep $d
        ) &
        : $(( ++cnt ))
        shift
    fi
    echo going to wait
    sleep 1
done


which is basically what I did in my earlier example except I used USR2
instead of SIGCHLD and put it in a function to make it easier to use.
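For reference, bash 4.3 (released after this thread) added `wait -n`, which blocks only until a single job exits; it removes the need for both the one-second polling loop and the USR2 trap. A minimal sketch of the same throttle; the job bodies are placeholders:

```shell
#!/bin/bash
maxjobs=3
running=0
results=$(mktemp)
for n in 0 1 2 3 4 5 6 7 8 9; do
    if [ "$running" -ge "$maxjobs" ]; then
        wait -n                     # returns as soon as any ONE child exits
        running=$(( running - 1 ))
    fi
    ( sleep 0.1; echo "job $n done" >> "$results" ) &
    running=$(( running + 1 ))
done
wait                                # drain the remaining jobs
wc -l < "$results"
```

Each `wait -n` reaps exactly one child, so the simple counter stays accurate without any signal-delivery assumptions.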





Re: Parallelism a la make -j / GNU parallel

2012-05-04 Thread John Kearney
Am 04.05.2012 21:13, schrieb Mike Frysinger:
> On Friday 04 May 2012 15:02:27 John Kearney wrote:
>> Am 04.05.2012 20:53, schrieb Mike Frysinger:
>>> On Friday 04 May 2012 13:46:32 Andreas Schwab wrote:
>>>> Mike Frysinger  writes:
>>>>> i wish there was a way to use `wait` that didn't block until all the
>>>>> pids returned.  maybe a dedicated option, or a shopt to enable this,
>>>>> or a new command.
>>>>>
>>>>> for example, if i launched 10 jobs in the background, i usually want to
>>>>> wait for the first one to exit so i can queue up another one, not wait
>>>>> for all of them.
>>>> If you set -m you can trap on SIGCHLD while waiting.
>>> awesome, that's a good mitigation
>>>
>>> #!/bin/bash
>>> set -m
>>> cnt=0
>>> trap ': $(( --cnt ))' SIGCHLD
>>> for n in {0..20} ; do
>>>
>>> (
>>> 
>>> d=$(( RANDOM % 10 ))
>>> echo $n sleeping $d
>>> sleep $d
>>> 
>>> ) &
>>> 
>>> : $(( ++cnt ))
>>> 
>>> if [[ ${cnt} -ge 10 ]] ; then
>>> 
>>> echo going to wait
>>> wait
>>> 
>>> fi
>>>
>>> done
>>> trap - SIGCHLD
>>> wait
>>>
>>> it might be a little racy (wrt checking cnt >= 10 and then doing a wait),
>>> but this is good enough for some things.  it does lose visibility into
>>> which pids are live vs reaped, and their exit status, but i more often
>>> don't care about that ...
>> That won't work I don't think.
> seemed to work fine for me
>
>> I think you meant something more like this?
> no.  i want to sleep the parent indefinitely and fork a child asap (hence the 
> `wait`), not busy wait with a one second delay.  the `set -m` + SIGCHLD 
> interrupted the `wait` and allowed it to return.
> -mike
The functionality of the code doesn't need SIGCHLD, it still waits till
all the 10 processes are finished before starting the next lot.

it only interrupts the wait to decrement the counter.

To do what you're talking about, you would have to start the new
subprocess in the SIGCHLD trap.


try this out it might make it clearer what I mean

set -m
cnt=0
trap ': $(( --cnt )); echo SIGCHLD' SIGCHLD
for n in {0..20} ; do
    (
        d=$(( RANDOM % 10 ))
        echo $n sleeping $d
        sleep $d
        echo $n exiting $d
    ) &
    : $(( ++cnt ))
    if [[ ${cnt} -ge 10 ]] ; then
        echo going to wait
        wait
    fi
done
trap - SIGCHLD
wait



Re: Parallelism a la make -j / GNU parallel

2012-05-04 Thread John Kearney
Am 04.05.2012 21:11, schrieb Greg Wooledge:
> On Fri, May 04, 2012 at 09:02:27PM +0200, John Kearney wrote:
>> set -m
>> cnt=0
>> trap ': $(( --cnt ))' SIGCHLD
>> set -- {0..20}
>> while [ $# -gt 0 ]; do
>>  if [[ ${cnt} -lt 10 ]] ; then
>>
>>  (
>>  d=$(( RANDOM % 10 ))
>>  echo $n sleeping $d
>>  sleep $d
>>  ) &
>>  : $(( ++cnt ))
>>  shift
>>  fi
>>  echo going to wait
>>  sleep 1
>> done
> You're busy-looping with a 1-second sleep instead of using wait and the
> signal handler, which was the whole purpose of the previous example (and
> of the set -m that you kept in yours).  And $n should probably be $1 there.
>
see my response to mike.


what you are thinking about is either what I suggested or something like
this

function TestProcess_22 {
    local d=$(( RANDOM % 10 ))
    echo $1 sleeping $d
    sleep $d
    echo $1 exiting $d
}
function trap_SIGCHLD {
    echo "SIGCHLD";
    if [ $cnt -gt 0 ]; then
        : $(( --cnt ))
        TestProcess_22 $cnt &
    fi
}
set -m
cnt=20
maxJobCnt=10
trap 'trap_SIGCHLD' SIGCHLD
for (( x=0; x

Re: Parallelism a la make -j / GNU parallel

2012-05-05 Thread John Kearney
Am 05.05.2012 06:35, schrieb Mike Frysinger:
> On Friday 04 May 2012 15:25:25 John Kearney wrote:
>> Am 04.05.2012 21:13, schrieb Mike Frysinger:
>>> On Friday 04 May 2012 15:02:27 John Kearney wrote:
>>>> Am 04.05.2012 20:53, schrieb Mike Frysinger:
>>>>> On Friday 04 May 2012 13:46:32 Andreas Schwab wrote:
>>>>>> Mike Frysinger  writes:
>>>>>>> i wish there was a way to use `wait` that didn't block until all the
>>>>>>> pids returned.  maybe a dedicated option, or a shopt to enable this,
>>>>>>> or a new command.
>>>>>>>
>>>>>>> for example, if i launched 10 jobs in the background, i usually want
>>>>>>> to wait for the first one to exit so i can queue up another one, not
>>>>>>> wait for all of them.
>>>>>> If you set -m you can trap on SIGCHLD while waiting.
>>>>> awesome, that's a good mitigation
>>>>>
>>>>> #!/bin/bash
>>>>> set -m
>>>>> cnt=0
>>>>> trap ': $(( --cnt ))' SIGCHLD
>>>>> for n in {0..20} ; do
>>>>>   (
>>>>>   d=$(( RANDOM % 10 ))
>>>>>   echo $n sleeping $d
>>>>>   sleep $d
>>>>>   ) &
>>>>>   : $(( ++cnt ))
>>>>>   if [[ ${cnt} -ge 10 ]] ; then
>>>>>   echo going to wait
>>>>>   wait
>>>>>   fi
>>>>> done
>>>>> trap - SIGCHLD
>>>>> wait
>>>>>
>>>>> it might be a little racy (wrt checking cnt >= 10 and then doing a
>>>>> wait), but this is good enough for some things.  it does lose
>>>>> visibility into which pids are live vs reaped, and their exit status,
>>>>> but i more often don't care about that ...
>>>> That won't work I don't think.
>>> seemed to work fine for me
>>>
>>>> I think you meant something more like this?
>>> no.  i want to sleep the parent indefinitely and fork a child asap (hence
>>> the `wait`), not busy wait with a one second delay.  the `set -m` +
>>> SIGCHLD interrupted the `wait` and allowed it to return.
>> The functionality of the code doesn't need SIGCHLD, it still waits till
>> all the 10 processes are finished before starting the next lot.
> not on my system it doesn't.  maybe a difference in bash versions.  as soon 
> as 
> one process quits, the `wait` is interrupted, a new one is forked, and the 
> parent goes back to sleep until another child exits.  if i don't `set -m`, 
> then i see what you describe -- the wait doesn't return until all 10 children 
> exit.
> -mike
Just to clarify what I see with your code, with some extra echoes from me
and fewer threads so it's shorter:
set -m
cnt=0
trap ': $(( --cnt )); echo "SIGCHLD"' SIGCHLD
for n in {0..10} ; do
    (
        d=$(( RANDOM % 10 ))
        echo $n sleeping $d
        sleep $d
        echo $n exiting $d
    ) &
    : $(( ++cnt ))
    if [[ ${cnt} -ge 5 ]] ; then
        echo going to wait
        wait
        echo Back from wait
    fi
done
trap - SIGCHLD
wait

gives
0 sleeping 9
2 sleeping 4
going to wait
4 sleeping 7
3 sleeping 4
1 sleeping 6
2 exiting 4
SIGCHLD
3 exiting 4
SIGCHLD
1 exiting 6
SIGCHLD
4 exiting 7
SIGCHLD
0 exiting 9
SIGCHLD
Back from wait
5 sleeping 5
6 sleeping 5
going to wait
8 sleeping 1
9 sleeping 1
7 sleeping 3
9 exiting 1
8 exiting 1
SIGCHLD
SIGCHLD
7 exiting 3
SIGCHLD
6 exiting 5
SIGCHLD
5 exiting 5




now
this code
function TestProcess_22 {
    local d=$(( RANDOM % 10 ))
    echo $1 sleeping $d
    sleep $d
    echo $1 exiting $d
}
function trap_SIGCHLD {
    echo "SIGCHLD";
    if [ $cnt -gt 0 ]; then
        : $(( --cnt ))
        TestProcess_22 $cnt &
    fi
}
set -m
cnt=10
maxJobCnt=5
trap 'trap_SIGCHLD' SIGCHLD
for (( x=0; x

<http://gnu.org/licenses/gpl.html>

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

uname -a
Linux DETH00 3.2.0-23-generic #36-Ubuntu SMP Tue Apr 10 20:39:51 UTC
2012 x86_64 x86_64 x86_64 GNU/Linux



Re: Parallelism a la make -j / GNU parallel

2012-05-05 Thread John Kearney
Am 05.05.2012 06:28, schrieb Mike Frysinger:
> On Friday 04 May 2012 16:17:02 Chet Ramey wrote:
>> On 5/4/12 2:53 PM, Mike Frysinger wrote:
>>> it might be a little racy (wrt checking cnt >= 10 and then doing a wait),
>>> but this is good enough for some things.  it does lose visibility into
>>> which pids are live vs reaped, and their exit status, but i more often
>>> don't care about that ...
>> What version of bash did you test this on?  Bash-4.0 is a little different
>> in how it treats the SIGCHLD trap.
> bash-4.2_p28.  wait returns 145 (which is SIGCHLD).
>
>> Would it be useful for bash to set a shell variable to the PID of the just-
>> reaped process that caused the SIGCHLD trap?  That way you could keep an
>> array of PIDs and, if you wanted, use that variable to keep track of live
>> and dead children.
> we've got associative arrays now ... we could have one which contains all the 
> relevant info:
>   declare -A BASH_CHILD_STATUS=(
>   ["pid"]=1234
>   ["status"]=1# WEXITSTATUS()
>   ["signal"]=13   # WTERMSIG()
>   )
> makes it easy to add any other fields people might care about ...
> -mike
Is there actually a guarantee that there will be 1 SIGCHLD for every
exited process.
Isn't it actually a race condition?
What happens if two subprocesses exit simultaneously, or if a process
exits while we are already in the SIGCHLD trap?
My normal interpretation of an interrupt/event/trap is that it is just a
notification that I need to check what has happened: that there was an
event, not the extent of the event.
I keep feeling that the following is bad practice

trap ': $(( --cnt ))' SIGCHLD

and would be better something like this

trap 'cnt=$(jobs -p | wc -w)' SIGCHLD


As such, you would need something more like:
declare -a BASH_CHILD_STATUS=([1234]=1 [1235]=1 [1236]=1)

declare -a BASH_CHILD_STATUS_SIGNAL=([1234]=13 [1235]=13 [1236]=13)







Re: Parallelism a la make -j / GNU parallel

2012-05-06 Thread John Kearney
Am 06.05.2012 08:28, schrieb Mike Frysinger:
> On Saturday 05 May 2012 23:25:26 John Kearney wrote:
>> Am 05.05.2012 06:28, schrieb Mike Frysinger:
>>> On Friday 04 May 2012 16:17:02 Chet Ramey wrote:
>>>> On 5/4/12 2:53 PM, Mike Frysinger wrote:
>>>>> it might be a little racy (wrt checking cnt >= 10 and then doing a
>>>>> wait), but this is good enough for some things.  it does lose
>>>>> visibility into which pids are live vs reaped, and their exit status,
>>>>> but i more often don't care about that ...
>>>> What version of bash did you test this on?  Bash-4.0 is a little
>>>> different in how it treats the SIGCHLD trap.
>>> bash-4.2_p28.  wait returns 145 (which is SIGCHLD).
>>>
>>>> Would it be useful for bash to set a shell variable to the PID of the
>>>> just- reaped process that caused the SIGCHLD trap?  That way you could
>>>> keep an array of PIDs and, if you wanted, use that variable to keep
>>>> track of live and dead children.
>>> we've got associative arrays now ... we could have one which contains all
>>> the relevant info:
>>> declare -A BASH_CHILD_STATUS=(
>>> ["pid"]=1234
>>> ["status"]=1# WEXITSTATUS()
>>> ["signal"]=13   # WTERMSIG()
>>> )
>>>
>>> makes it easy to add any other fields people might care about ...
>> Is there actually a guarantee that there will be 1 SIGCHLD for every
>> exited process.
>> Isn't it actually a race condition?
> when SIGCHLD is delivered doesn't matter.  the child stays in a zombie state 
> until the parent calls wait() on it and gets its status.  so you can have 
> `wait` return one child's status at a time.
> -mike
But I think my point still stands:
trap ': $(( cnt-- ))' SIGCHLD
is a bad idea; you actually need to verify how many jobs are running, not
just arbitrarily decrement a counter, because you're not guaranteed a trap
for each process. Sure, it will normally work, but it's not guaranteed to.

Also, I think the question is: is there any point in forcing bash to
issue one status at a time? It seems to make more sense to issue them in
bulk, so bash could populate an array of all reaped processes in one trap
rather than having to execute multiple traps. Isn't that what bash does
internally anyway?




Re: Parallelism a la make -j / GNU parallel

2012-05-06 Thread John Kearney
Am 06.05.2012 08:28, schrieb Mike Frysinger:
> On Saturday 05 May 2012 04:28:50 John Kearney wrote:
>> Am 05.05.2012 06:35, schrieb Mike Frysinger:
>>> On Friday 04 May 2012 15:25:25 John Kearney wrote:
>>>> Am 04.05.2012 21:13, schrieb Mike Frysinger:
>>>>> On Friday 04 May 2012 15:02:27 John Kearney wrote:
>>>>>> Am 04.05.2012 20:53, schrieb Mike Frysinger:
>>>>>>> On Friday 04 May 2012 13:46:32 Andreas Schwab wrote:
>>>>>>>> Mike Frysinger writes:
>>>>>>>>> i wish there was a way to use `wait` that didn't block until all
>>>>>>>>> the pids returned.  maybe a dedicated option, or a shopt to enable
>>>>>>>>> this, or a new command.
>>>>>>>>>
>>>>>>>>> for example, if i launched 10 jobs in the background, i usually
>>>>>>>>> want to wait for the first one to exit so i can queue up another
>>>>>>>>> one, not wait for all of them.
>>>>>>>> If you set -m you can trap on SIGCHLD while waiting.
>>>>>>> awesome, that's a good mitigation
>>>>>>>
>>>>>>> #!/bin/bash
>>>>>>> set -m
>>>>>>> cnt=0
>>>>>>> trap ': $(( --cnt ))' SIGCHLD
>>>>>>> for n in {0..20} ; do
>>>>>>>
>>>>>>> (
>>>>>>> 
>>>>>>> d=$(( RANDOM % 10 ))
>>>>>>> echo $n sleeping $d
>>>>>>> sleep $d
>>>>>>> 
>>>>>>> ) &
>>>>>>> 
>>>>>>> : $(( ++cnt ))
>>>>>>> 
>>>>>>> if [[ ${cnt} -ge 10 ]] ; then
>>>>>>> 
>>>>>>> echo going to wait
>>>>>>> wait
>>>>>>> 
>>>>>>> fi
>>>>>>>
>>>>>>> done
>>>>>>> trap - SIGCHLD
>>>>>>> wait
>>>>>>>
>>>>>>> it might be a little racy (wrt checking cnt >= 10 and then doing a
>>>>>>> wait), but this is good enough for some things.  it does lose
>>>>>>> visibility into which pids are live vs reaped, and their exit status,
>>>>>>> but i more often don't care about that ...
>>>>>> That won't work I don't think.
>>>>> seemed to work fine for me
>>>>>
>>>>>> I think you meant something more like this?
>>>>> no.  i want to sleep the parent indefinitely and fork a child asap
>>>>> (hence the `wait`), not busy wait with a one second delay.  the `set
>>>>> -m` + SIGCHLD interrupted the `wait` and allowed it to return.
>>>> The functionality of the code doesn't need SIGCHLD, it still waits till
>>>> all the 10 processes are finished before starting the next lot.
>>> not on my system it doesn't.  maybe a difference in bash versions.  as
>>> soon as one process quits, the `wait` is interrupted, a new one is
>>> forked, and the parent goes back to sleep until another child exits.  if
>>> i don't `set -m`, then i see what you describe -- the wait doesn't
>>> return until all 10 children exit.
>> Just to clarify what I see with your code, with the extra echos from me
>> and less threads so its shorter.
> that's not what i was getting.  as soon as i saw the echo of SIGCHLD, a new 
> "sleeping" would get launched.
> -mike
OK then, that's weird, because it doesn't really make sense to me why a
SIGCHLD would interrupt the wait command. Oh well.



Re: Bash bug interpolating delete characters

2012-05-07 Thread John Kearney
Am 07.05.2012 22:46, schrieb Chet Ramey:
> On 5/3/12 5:53 AM, Ruediger Kuhlmann wrote:
>> Hi,
>>
>> please try the following bash script:
>>
>> a=x
>> del="$(echo -e "\\x7f")"
>>
>> echo "$del${a#x}" | od -ta
>> echo "$del ${a#x}" | od -ta
>> echo " $del${a#x}" | od -ta
>>
>> Using bash 3.2, the output is:
>>
>> 000 del  nl
>> 002
>> 000 del  sp  nl
>> 003
>> 000  sp del  nl
>> 003
>>
>> however with bash 4.1 and bash 4.2.20, the output is only:
>>
>> 000 del  nl
>> 002
>> 000  sp  nl
>> 002
>> 000  sp  nl
>> 002
>>
>> ... so in the second and third line, the delete character magically
>> disappears. Neither OS nor locale seem to influence this. Using a delete
>> character directly in the script instead of $del also has no impact, either.
> It's a case of one part of the code violating assumptions made by (and
> conditions imposed by) another.  Try the attached patch; it fixes the
> issue for me.
>
> Chet
>
It also works for me.

  "$del${a#x}" =[$'\177']
  " $del${a%x}"=[$' \177']
  " $del""${a:0:0}"=[$' \177']
  " ${del}${a:0:0}"=[$' \177']
  "${del:0:1}${a#d}"   =[$'\177x']
  "${del:0:1} ${a#d}"  =[$'\177 x']
  "${del:0:1} ${a:+}"  =[$'\177 ']
  "$del ${a#x}"=[$'\177 ']
  " $del${a:0:0}"  =[$' \177']
  " $del${a}"  =[$' \177x']
  " ${del:0:1}${a:0:0}"=[$' \177']
  "${del:0:1}${a#x}"   =[$'\177']
  "${del:0:1} ${a#x}"  =[$'\177 ']
  " $del${a#x}"=[$' \177']
  " $del"${a:0:0}  =[$' \177']
  " $del"  =[$' \177']
  " ${del:0:1}${a}"=[$' \177x']
  "${del:0:1} ${a}"=[$'\177 x']
  "${del:0:1} ${a:-}"  =[$'\177 x']




lib/sh/mktime.c VMS specific code is not needed.

2012-06-02 Thread John Malmberg
The lib/sh/mktime.c module has a VMS-specific include of  
to pick up time_t.


On VMS, the time_t type is defined in the  module.

So this VMS-specific include can be removed.

Regards,
-John



Re: bash tab variable expansion question?

2012-06-11 Thread John Embretsen

On 27 Feb 2011 18:18:24 -0500, Chet Ramey wrote:
>> On Sat, Feb 26, 2011 at 10:49 PM, gnu.bash.bug wrote:
>> A workaround is fine but is the 4.2 behavior bug or not?
>
>It's a more-or-less unintended consequence of the requested change Eric
>Blake referred to earlier in the thread.

http://lists.gnu.org/archive/html/bug-bash/2011-02/msg00275.html

(...)

> The question is how to tell readline that the `$' should be quoted
> under some circumstances but not others.  There's no hard-and-fast
> rule that works all the time, though I suppose a call to stat(2)
> before quoting would solve part of the problem.  I will have to give
> it more thought.
>
> Chet

Any updates on this issue?

The workarounds for $PWD and $OLDPWD are not enough, and the "workaround" 
of going back on the command line and removing escape characters is not 
acceptable, in my humble opinion.


I often use environment variable-based tab completion to navigate to the 
correct directory on my system, but with bash 4.2 this is no longer an 
option. For example,


with CODE=/path/to/dir and /path/to/dir/ containing test1/ and test2/,

cd $CODE/test should give a list of $CODE/test1 $CODE/test2 
when those directories exist, not "cd \$CODE/test".


If there is no fix in sight for this issue, can someone point me to a 
guide for downgrading bash in recent popular Linux distros?



thanks,

--
John



Re: bash tab variable expansion question?

2012-06-11 Thread John Embretsen

On 06/11/2012 10:10 AM, Pierre Gaston wrote:

On Mon, Jun 11, 2012 at 10:59 AM, John Embretsen  wrote:

On 27 Feb 2011 18:18:24 -0500, Chet Ramey wrote:

On Sat, Feb 26, 2011 at 10:49 PM, gnu.bash.bug wrote:
A workaround is fine but is the 4.2 behavior bug or not?


It's a more-or-less unintended consequence of the requested change Eric
Blake referred to earlier in the thread.


http://lists.gnu.org/archive/html/bug-bash/2011-02/msg00275.html

(...)


The question is how to tell readline that the `$' should be quoted
under some circumstances but not others.  There's no hard-and-fast
rule that works all the time, though I suppose a call to stat(2)
before quoting would solve part of the problem.  I will have to give
it more thought.

Chet


Any updates on this issue?


(...)



There have been many updates on this. There has been a "fix"
available for some time now on this list.

It is now available as an official patch
ftp://ftp.gnu.org/gnu/bash/bash-4.2-patches/bash42-029


Thank you. For someone like me who is not active in this project it 
seems it is not very easy to keep track of bugs, patches and releases. 
Is there an issue number/bug number of some sort for this particular 
issue that I could use for this purpose?


I hope the patch works well enough and that it will end up in Linux 
distros shortly.



--
John



Bash 4.1 doesn't behave as I think it should: arrays and the environment

2012-08-17 Thread John Summerfield

GNU bash, version 4.1.2(1)-release (x86_64-redhat-linux-gnu)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
<http://gnu.org/licenses/gpl.html>


This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

I am running on 64-bit CentOS 6.

I have been writing a script that reads a text file and munges it to 
create a shell command to run a script.


In two cases I wish to pass an array in the environment, like so:
14:28 john@Boomer$ STUFF[1]=one STUFFX=stuffx env | grep ^ST
STUFFX=stuffx
STUFF[1]=one
14:28 john@Boomer$

Of course, my script is not called "env," but this shows the sort of 
thing that is clearly visible within my script.


The symptom I observed is that the elements of STUFF are visible in the 
environment, but expand to NULL in my script. However, things like 
STUFFX are visible in the environment and their values are accessible in 
the script.


The man page for bash contains a para entitled ENVIRONMENT which doesn't 
mention arrays, leaving the reader to assume they are not different from 
other shell variables.


It was some time (hours) before I tried this:
14:28 john@Boomer$ export STUFF[1]=one
bash: export: `STUFF[1]': not a valid identifier
14:33 john@Boomer$

I have found no documentation (and I have searched 
http://www.gnu.org/software/bash/manual/bashref.html ) that clarifies 
what is going on.


Now, passing arrays via the environment is useful, I found a use for it.

It is possible (more or less) to get elements of the array into the 
environment. It's unclear to me, but I suppose the above example may 
have created a variable "STUFF[1]."


I note that IEEE Std 1003.1, 2004 Edition 6 allows implementations to 
expand the base rules regarding variable names so that "STUFF[1]" is 
permissible in the environment.


I suggest bash be enhanced to allow the export statement, or something 
analogous applying to arrays, so that array elements can be passed in 
the same way as ordinary variables.


If there exists  a way to do what I have been trying, then it needs to 
be documented somewhere and mentioned in the ENVIRONMENT para and in 
documentation of the set command and arrays, and maybe other places too.
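Until something like that exists, one workaround is to serialize the array with `declare -p` into an ordinary exported variable and re-create it in the child. A sketch (STUFF_DEF is a made-up name, not a bash feature):

```shell
# Arrays cannot be exported, but their textual definition can be.
STUFF=( one "two words" three )
export STUFF_DEF="$(declare -p STUFF)"

# The child shell rebuilds the array from its definition.
bash -c 'eval "$STUFF_DEF"; echo "${#STUFF[@]}:${STUFF[1]}"'
```

Because `declare -p` emits properly quoted shell syntax, elements containing whitespace or special characters round-trip intact; the child just has to trust the variable enough to eval it.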


--
John Summerfield



Re: Regular expression matching fails with string RE

2012-10-16 Thread John Kearney
Am 17.10.2012 03:13, schrieb Clark WANG:
> On Wed, Oct 17, 2012 at 5:18 AM,  wrote:
>
>> Bash Version: 4.2
>> Patch Level: 37
>>
>> Description:
>>
>> bash -c 're=".*([0-9])"; if [[ "foo1" =~ ".*([0-9])" ]]; then echo
>> ${BASH_REMATCH[0]}; elif [[ "bar2" =~ $re ]]; then echo ${BASH_REMATCH[0]};
>> fi'
>>
>> This should output foo1. It instead outputs bar2, as the first match fails.
>>
>>
>> From bash's man page:
>[[ expression ]]
>   ... ...
>   An additional binary operator, =~, is available, with  the
> same
>   ... ...
>   alphabetic characters.  Any part of the pattern may be quoted
> to
>   force  it  to  be  matched  as  a string.  Substrings matched
> by
>   ... ...
Drop the quotes on the regex

bash -c 're=".*([0-9])"; if [[ "foo1" =~ .*([0-9]) ]]; then echo
${BASH_REMATCH[0]}; elif [[ "bar2" =~ $re ]]; then echo ${BASH_REMATCH[0]};
fi'

outputs foo1
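Since bash 3.2, any quoted part of the pattern on the right of =~ is matched literally, so the two idioms that work as regexes are an unquoted pattern or an unquoted variable expansion. A quick demonstration:

```shell
s=foo1
re='.*([0-9])'

# Unquoted pattern: treated as an ERE, matches.
[[ $s =~ .*([0-9]) ]] && echo "unquoted: ${BASH_REMATCH[1]}"

# Pattern in a variable, expanded unquoted: also an ERE (portable idiom).
[[ $s =~ $re ]] && echo "variable: ${BASH_REMATCH[1]}"

# Quoted pattern: every character is matched literally, so this fails.
[[ $s =~ ".*([0-9])" ]] || echo "quoted: no match"
```

Putting the regex in a variable is the most robust form: it survives bash version differences in how unquoted metacharacters are parsed inside [[ ]].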









Re: output of `export -p' seems misleading

2012-11-10 Thread John Kearney
Am 09.11.2012 17:21, schrieb Greg Wooledge:
> On Fri, Nov 09, 2012 at 11:18:24AM -0500, Greg Wooledge wrote:
>> restore_environment() {
>>   set -o posix
>>   eval "$saved_output_of_export_dash_p"
>>   set +o posix
>> }
> Err, what I meant was:
>
> save_environment() {
>   set -o posix
>   saved_env=$(export -p)
>   set +o posix
> }
>
> restore_environment() {
>   eval "$saved_env"
> }
>
or I guess you could also do something like

save_environment() {
  saved_env=$(export -p)
}

restore_environment() {
  echo "${saved_env//declare -x /declare -g -x }"
}


or

save_environment() {
  saved_env=$(set -o posix; export -p)
}
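A quick round trip of that last variant, under the assumption that everything `export -p` emits in posix mode can be safely re-eval'd in the same shell (DEMO_VAR is an illustrative name):

```shell
save_environment() {
    # posix mode makes export -p print re-evaluable "export VAR=..." lines
    saved_env=$(set -o posix; export -p)
}
restore_environment() {
    eval "$saved_env"
}

export DEMO_VAR=before
save_environment
export DEMO_VAR=after
restore_environment
echo "$DEMO_VAR"
```

After restore_environment, DEMO_VAR is back to its saved value; variables exported only after the snapshot are re-exported but not removed, so this is a value restore, not a full environment reset.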





Is direxpand available yet (to fix dirspell)?

2013-01-08 Thread John Caruso
In bash 4.1, if you do "shopt +s dirspell" and type "ls /ect/passwd"
it's corrected to "ls /etc/passwd".  In bash 4.2 with dirspell enabled,
the correction doesn't happen.

Some searching shows that the bash 4.1 behavior can apparently be enabled
again in bash 4.2 with a patch that adds a new "direxpand" option.  Is this
the case?  If so, is direxpand actually available in any released version
of bash yet (and if so which version)?

- John


Re: Is direxpand available yet (to fix dirspell)?

2013-01-08 Thread John Caruso
In article , John 
Caruso wrote:
> In bash 4.1, if you do "shopt +s dirspell" and type "ls /ect/passwd"
> it's corrected to "ls /etc/passwd".  In bash 4.2 with dirspell enabled,
> the correction doesn't happen.

I forgot to mention that I've tested this with bash 4.2.10 and 4.2.24,
and neither of them appears to have the direxpand option.  I checked the
bash source but couldn't suss out (in a brief look) how minor bash
versions are accounted for: there's no 4.2.10 or 4.2.24 source, just 4.2
source plus a bunch of patches, and it's not clear whether those patches
have made it into an official bash release, or which release number that
would be.

- John


Re: Is direxpand available yet (to fix dirspell)?

2013-01-08 Thread John Caruso
In article , Chet Ramey wrote:
> On 1/8/13 5:18 PM, John Caruso wrote:
>> In bash 4.1, if you do "shopt +s dirspell" and type "ls /ect/passwd"
>> it's corrected to "ls /etc/passwd".  In bash 4.2 with dirspell enabled,
>> the correction doesn't happen.
>>[...] 
> 
> That functionality came in as part of bash-4.2 patch 29.  Please try it
> and let me know what it's missing.

Thanks for the quick reply. I just tested that patch, and it does indeed
fix the bug.  The distribution I'm using is only up to version 4.2.24,
unfortunately, so I'll have to wait to get a working dirspell option
again under bash 4.2.

So just to verify: there's no way in bash 4.2.0 through 4.2.28 to make
dirspell work correctly?  The only fix is the direxpand option?

- John


Re: Is direxpand available yet (to fix dirspell)?

2013-01-09 Thread John Caruso
In article , Chet Ramey wrote:
> On 1/8/13 5:38 PM, John Caruso wrote:
>> So just to verify: there's no way in bash 4.2.0 through 4.2.28 to make
>> dirspell work correctly?  The only fix is the direxpand option?
> 
> Yes.  Through 4.2.28, the dirspell option will cause the filename to be
> rewritten with spelling correction internally, but the corrected filename
> will not be rewritten on the command line.

Huh--that's not what I'm seeing.  This is what I get from stock bash 4.2
with no patches (and bash 4.2.10 and 4.2.24 behaved the same way):

   bash-4.2$ shopt -s dirspell
   bash-4.2$ ls /ect/passwd
   /ect/passwd: No such file or directory

The TAB there produces a space (and no bell) as though dirspell is in
fact acting on the filename, but executing the command shows that bash
is still using the misspelled filename internally.  So as far as I can
tell (and unless there's some other use case I'm missing), dirspell no
longer worked from bash 4.2 through 4.2.28.  And direxpand does indeed
make dirspell work again (thanks for that)--in fact it seems like it
doesn't do anything other than make dirspell work again.

Having dirspell/direxpand brings bash almost up to par with tcsh's
killer spelling correction.  I think all that's missing now is command
spelling correction like the following in tcsh:

   tcsh-6.12$ set correct=cmd
   tcsh-6.12$ set prompt3="Don't you mean: %R? "
   tcsh-6.12$ chmd 666 beelzebub

   Don't you mean: chmod 666 beelzebub? yes
   tcsh-6.12$ 

Part of the reason I was asking about dirspell in the first place is that
path and command spelling correction are the places where tcsh still has
an edge over other shells.

- John


Re: Is direxpand available yet (to fix dirspell)?

2013-01-10 Thread John Caruso
In article , Chet Ramey wrote:
> On 1/9/13 1:27 PM, John Caruso wrote:
>> In article , Chet Ramey wrote:
>>> Yes.  Through 4.2.28, the dirspell option will cause the filename to be
>>> rewritten with spelling correction internally, but the corrected filename
>>> will not be rewritten on the command line.
[...]
>>  This is what I get from stock bash 4.2
>> with no patches (and bash 4.2.10 and 4.2.24 behaved the same way):
>> 
>>bash-4.2$ shopt -s dirspell
>>bash-4.2$ ls /ect/passwd
>>/ect/passwd: No such file or directory
>> 
>> The TAB there produces a space (and no bell) as though dirspell is in
>> fact acting on the filename, but executing the command shows that bash
>> is still using the misspelled filename internally.  
> 
> We're saying the same thing, differently.  The reason there is no bell and
> the space is appended is because readline thinks the completion has
> succeeded.  The reason it thinks the completion has succeeded is because it
> has -- bash has corrected ect to etc and told readline so.  The difference
> is that readline doesn't think it has to rewrite what appears in the actual
> input buffer, because bash has not told it to do so.  So readline gets
> /etc/passwd back as a valid completion, verifies that's not a directory
> and so appends a space and goes on.  The input buffer doesn't get changed
> and hitting return passes the still-uncorrected filename to ls.

Ok.  When you said the filename was "rewritten with spelling correction
internally" I took your meaning to be that although it may look wrong
on the screen it was correct internally and so it would produce correct
results.  But it sounds like you were really saying that although the
correct filename existed in some internal bash data structures, these
data structures never saw the light of day.  From a user perspective that
seems like an invisible distinction, which is why I assumed you meant
the former and not the latter.

> Having direxpand enabled tells bash (and,
> indirectly, readline) that it's ok to rewrite what appears in the input
> buffer to the form it uses internally to decide what to do about
> completions.  In the most common case, that means things like $PWD will
> be expanded to the current directory, or $HOME will be expanded to your
> home directory.

For what it's worth, if you type $HOME/ in tcsh it will
complete  relative to $HOME but leave $HOME untouched.  I
can see arguments for both ways, though I like how that behavior separates
path completion/spelling correction from inline variable expansion etc
(among other things it means that the command's history entry will still
reference the variable, so if the variable changes, executing the command
from history will pick up the new value).

- John


Re: printf %q represents null argument as empty string.

2013-01-11 Thread John Kearney
Am 11.01.2013 19:38, schrieb Dan Douglas:
> $ set --; printf %q\\n "$@"
> ''
>
> printf should perhaps only output '' when there is actually a corresponding
> empty argument, else eval "$(printf %q ...)" and similar may give different 
> results than expected. Other shells don't output '', even mksh's ${var@Q} 
> expansion. Zsh's ${(q)var} does.

That is not a bug in printf %q.

It is what you expect to happen with "${@}":
should that be 0 arguments if $# is 0?

I however find the behavior irritating, but correct from the description.

To do what you are suggesting you would need a special-case handler for
"${@}" as opposed to "${@}j" or any other variation.


what I tend to do as a workaround is

printf() {
    if [ $# -eq 2 -a -z "${2}" ]; then
        builtin printf "${1}"
    else
        builtin printf "${@}"
    fi
}


or not as good but ok in most cases something like

printf "%q" ${1:+"${@}"}
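The ambiguity itself is easy to demonstrate: printf processes its format even with no arguments left, and %q renders a missing argument the same way as an empty one, so the two cases below print identically:

```shell
set --                  # zero positional parameters
printf '%q\n' "$@"      # prints: ''  ("$@" vanished; %q saw no argument)

set -- ""               # one genuinely empty positional parameter
printf '%q\n' "$@"      # prints: ''  (indistinguishable from the case above)
```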






Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)

2013-01-11 Thread John Kearney
Am 11.01.2013 19:27, schrieb Dan Douglas:
> Bash treats the variable as essentially undefined until given at least an 
> empty value.
>
> $ bash -c 'typeset -i x; [[ -v x ]]; echo "$?, ${x+foo}"; typeset -p x'
> 1,
> bash: line 0: typeset: x: not found
> $ ksh -c 'typeset -i x; [[ -v x ]]; echo "$?, ${x+foo}"; typeset -p x'
> 0,
> typeset -i x
>
> Zsh implicitly gives integers a zero value if none are specified and the
> variable was previously undefined. Either the ksh or zsh ways are fine IMO.
>
> Also I'll throw this in:
>
> $ arr[1]=test; [[ -v arr[1] ]]; echo $?
> 1
>
> This now works in ksh to test if an individual element is set, though it 
> hasn't always. Maybe Bash should do the same? -v is tricky because it adds 
> some extra nuances to what it means for something to be defined...
>

Personally I like the current behavior, disclaimer I use nounset.
I see no problem with getting people to initialize variables.

it is a more robust programming approach.
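For reference, later bashes did eventually accept array subscripts with -v; this sketch assumes bash 4.3 or later (on 4.2, as the quoted example shows, the test returns 1 even for a set element):

```shell
arr[1]=test
[[ -v 'arr[1]' ]] && echo "arr[1] is set"      # element 1 exists
[[ -v 'arr[0]' ]] || echo "arr[0] is not set"  # element 0 was never assigned
```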

 



Re: printf %q represents null argument as empty string.

2013-01-11 Thread John Kearney
Am 11.01.2013 22:05, schrieb Dan Douglas:
> On Friday, January 11, 2013 09:39:00 PM John Kearney wrote:
>> Am 11.01.2013 19:38, schrieb Dan Douglas:
>>> $ set --; printf %q\\n "$@"
>>> ''
>>>
>>> printf should perhaps only output '' when there is actually a corresponding
>>> empty argument, else eval "$(printf %q ...)" and similar may give different
>>> results than expected. Other shells don't output '', even mksh's ${var@Q}
>>> expansion. Zsh's ${(q)var} does.
>> that is not a bug in printf %q
>>
>> it what you expect to happen with "${@}" 
>> should that be 0 arguments if $# is 0.
>>
>> I however find the behavior irritating, but correct from the description.
>>
>> to do what you are suggesting you would need a special case handler for this
>> "${@}" as oposed to "${@}j" or any other variation.
>>
>>
>> what I tend to do as a workaround is
>>
>> printf() {
>> if [ $# -eq 2 -a -z "${2}" ];then
>> builtin printf "${1}"
>> else
>> builtin printf "${@}"
>> fi
>> }
>>
>>
>> or not as good but ok in most cases something like
>>
>> printf "%q" ${1:+"${@}"}
>>
>>
> I don't understand what you mean. The issue I'm speaking of is that printf %q
> produces a quoted empty string both when given no args and when given one
> empty arg. A quoted "$@" with no positional parameters present expands to
> zero words (and correspondingly for "${arr[@]}"). Why do you think "x${@}x"
> is special? (Note that expansion didn't even work correctly a few patchsets
> ago.)
>
> Also as pointed out, every other shell with a printf %q feature disagrees
> with Bash. Are you saying that something in the manual says that it should
> do otherwise? I'm aware you could write a wrapper, I just don't see any
> utility in the default behavior.


um, maybe an example will clarify my attempted point

set -- arg1 arg2 arg3
set -- "--(${@})--"
printf "<%q> " "${@}"
<--\(arg1> <arg2> <arg3\)-->


set --
set -- "--(${@})--"
printf "<%q> " "${@}"
<--\(\)-->


So there is always at least one word, one arg; just because it's "${@}"
should not affect this behavior.

Is that clearer? As such, bash is doing the right thing as far as I'm
concerned. Truthfully it's not normally what I want, but that is beside
the point; consistency is more important, especially when it's so easy
to work around.


the relevant part of the man page is

   When there are no array members, ${name[@]} expands to nothing.

   If the double-quoted expansion occurs within a word, the expansion of
   the first parameter is joined with the beginning part of the original
   word, and the expansion of the last parameter is joined with the last
   part of the original word.  This is analogous to the expansion of the
   special parameters * and @ (see Special Parameters above).
   ${#name[subscript]} expands to the length of ${name[subscript]}.  If
   subscript is * or @, the expansion is the number of elements in the
   array.  Referencing an array variable without a subscript is
   equivalent to referencing the array with a subscript of 0.


so
set --

printf "%q" "${@}"
becomes
printf "%q" ""

which is correct as ''











Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)

2013-01-11 Thread John Kearney
Am 11.01.2013 22:34, schrieb Dan Douglas:
> On Friday, January 11, 2013 09:48:32 PM John Kearney wrote:
>> Am 11.01.2013 19:27, schrieb Dan Douglas:
>>> Bash treats the variable as essentially undefined until given at least an 
>>> empty value.
>>>
>>> $ bash -c 'typeset -i x; [[ -v x ]]; echo "$?, ${x+foo}"; typeset -p x'
>>> 1,
>>> bash: line 0: typeset: x: not found
>>> $ ksh -c 'typeset -i x; [[ -v x ]]; echo "$?, ${x+foo}"; typeset -p x'
>>> 0,
>>> typeset -i x
>>>
>>> Zsh implicitly gives integers a zero value if none are specified and the
>>> variable was previously undefined. Either the ksh or zsh ways are fine IMO.
>>>
>>> Also I'll throw this in:
>>>
>>> $ arr[1]=test; [[ -v arr[1] ]]; echo $?
>>> 1
>>>
>>> This now works in ksh to test if an individual element is set, though it 
>>> hasn't always. Maybe Bash should do the same? -v is tricky because it adds 
>>> some extra nuances to what it means for something to be defined...
>>>
>> Personally I like the current behavior, disclaimer I use nounset.
>> I see no problem with getting people to initialize variables.
> How is this relevant? It's an inconsistency in the way set/unset variables
> are normally handled. You don't use variadic functions? Unset variables /
> parameters are a normal part of most scripts.
>
>> it is a more robust programming approach.
> I strongly disagree. (Same goes for errexit.)
>
:)
We agree on errexit; the ERR trap, however, is another matter.
Note the only reason I don't like errexit is that it doesn't tell you
why it exited; nounset does.

No, nounset is very valuable during the entire testing and validation
phase. Admittedly bash is more of a hobby for me, but I still have unit
testing for the functions and more complex harness testing for the
higher-level stuff.

Before I ship code I may turn it off, but normally if it's really
critical I won't use bash for it anyway; I mainly use bash for analysis.

As such, if bash stops because it finds an unset variable, it is always
a bug that bash has helped me track down.

I guess it also depends on how big your scripts are. Up to a couple
thousand lines is OK, but once you get into the tens of thousands, to
keep your sanity and a high reliability you become more and more strict
about what you allow: strict naming conventions and coding styles.

Setting nounset is in the same category as setting warnings to all and
treating warnings as errors.

but then again I do mission critical designs so I guess I have a
different mindset.






Re: printf %q represents null argument as empty string.

2013-01-12 Thread John Kearney
Am 12.01.2013 15:34, schrieb Dan Douglas:
> On Friday, January 11, 2013 10:39:19 PM Dan Douglas wrote:
>> On Saturday, January 12, 2013 02:35:34 AM John Kearney wrote:
>> BTW, your wrappers won't work. A wrapper would need to implement format 
> Hrmf I should have clarified that I only meant a complete printf wrapper
> would be difficult. A single-purpose workaround is perfectly fine. e.g.
> printq() { ${1+printf %q "$@"}; }; ... which is probably something like
> what you meant. Sorry for the rant.
>
Don't worry I've got a thick skin ;) feel free to rant, you have a
different perspective and I like that.

anyway, now we have a point where I disagree: that
"${@}"

should expand to 0 or more words; from the documentation it should be 1
or more. At least that is how I read that paragraph. It says it will
split the word, not make the word vanish.
So I had to test, and it really does; how weird. Is that in the POSIX spec?
set --
test_func() { echo $#; }
test_func "${@}"
0
test_func "1${@}"
1
test_func "${@:-}"
1
test_func "${@-}"
1

Now I'm confused ...

Oh well, sorry, I had the functionality wrong in my head.









Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)

2013-01-12 Thread John Kearney
Am 12.01.2013 14:53, schrieb Dan Douglas:
> Yes some use -u / -e for debugging apparently. Actual logic relying upon 
> those 
> can be fragile of course. I prefer when things return nonzero instead of 
> throwing errors usually so that they're handleable.
ah but you can still do that if you want

you just do

${unsetvar:-0}  says you want 0 for null string or unset

${unsetvar-0}  says you want 0 for unset.
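A tiny illustration of the difference between the two forms (plain parameter expansion, nothing bash-specific):

```shell
unset v
echo "${v-default}"     # v unset:        prints "default"
echo "${v:-default}"    # v unset:        prints "default"

v=""
echo "${v-default}"     # v set but null: prints an empty line
echo "${v:-default}"    # v set but null: prints "default"
```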

I know these aren't the sort of things you want to add retroactively,
but if you program from the ground up with this in mind your code is
much more explicit, and less reliant on particular interpreter
behavior. So again it forces a more explicit programming style, which is
always better. Truthfully, most people complain that my scripts don't
look like scripts any more but more like programs. But once they get
used to the style most see its advantages; at the very least, when they
have to figure out what has gone wrong, they understand it.


Regarding -e, it mainly has a bad name because there is no good guide on
how to program with it.
So for example this causes stress:
[ ! -d ${dirname} ] && mkdir ${dirname}
because if the dir exists it will exit the script :)
[ -d ${dirname} ] || mkdir ${dirname}
this however is safe.

actually forcing myself to work with SIGERR taught me a lot about how
this sort of thing works.

That's why I, for example, use (old but simple example):
set -o errtrace
function TaceEvent {
local LASTERR=$?
local ETYPE="${1:?Missing Error Type}"
PrintFunctionStack 1
cErrorOut 1 "${ETYPE}
${BASH_SOURCE[1]}(${BASH_LINENO[1]}):${FUNCNAME[1]} ELEVEL=${LASTERR}
\"${BASH_COMMAND}\""
}
 trap 'TaceEvent ERR' ERR

which basically gives you a heads-up every time you haven't handled an
error return code.
so the following silly example

  test_func4() {
false
  }
  test_func3() {
test_func4
  }
  test_func2() {
test_func3
  }
  test_func1() {
test_func2
  }
  test_func1
will give me a log that looks like
#D: Sat Jan 12 15:49:13 CET 2013 : 18055 : test.sh (225 ) :
main: "[5]/home/dethrophes/scripts/bash/test.sh(225):test_func1"
#D: Sat Jan 12 15:49:13 CET 2013 : 18055 : test.sh (223 ) :
test_func1  : "[4]/home/dethrophes/scripts/bash/test.sh(223):test_func2"
#D: Sat Jan 12 15:49:13 CET 2013 : 18055 : test.sh (220 ) :
test_func2  : "[3]/home/dethrophes/scripts/bash/test.sh(220):test_func3"
#D: Sat Jan 12 15:49:13 CET 2013 : 18055 : test.sh (217 ) :
test_func3  : "[2]/home/dethrophes/scripts/bash/test.sh(217):test_func4"
#E: Sat Jan 12 15:49:13 CET 2013 : 18055 : test.sh (214 ) :
test_func4  : "ERR
/home/dethrophes/scripts/bash/test.sh(217):test_func4 ELEVEL=1 \"false\""
which allows me to very quickly root-cause the error and fix it.

If you really don't care, you can just stick a || true on the end to
ignore it in the future.

So in this case, do something like
test_func4() {
false || true
  }
 
I mean it would be nice to have an unset trap, but without it nounset is
the next best thing.

Also, I don't think of this as debugging; it's code verification/analysis.
I do this so I don't have to debug my code. This is a big help against
typos and scoping errors. Like I say, it's like using lint.
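The trap above leans on my own helpers (PrintFunctionStack, cErrorOut); a self-contained minimal sketch of the same idea, using only bash builtins (function and message format are illustrative, not the library versions):

```shell
set -o errtrace      # let the ERR trap fire inside functions too

on_err() {
    local rc=$?      # exit status of the command that tripped the trap
    echo "ERR rc=${rc} in ${FUNCNAME[1]:-main} at" \
         "${BASH_SOURCE[1]:-main}:${BASH_LINENO[0]}: ${BASH_COMMAND}" >&2
}
trap on_err ERR

broken() { false; }
broken    # the trap reports the failure; without errexit we keep running
true      # keep the script's own exit status at 0
```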






Re: printf %q represents null argument as empty string.

2013-01-12 Thread John Kearney
Am 12.01.2013 20:40, schrieb Chet Ramey:
> On 1/12/13 9:48 AM, John Kearney wrote:
>
>> anyway now we have a point I disagree that
>> "${@}"
>>
>> should expand to 0 or more words, from the documentation it should be 1
>> or more. At least that is how I read  that paragragh. IT says it will
>> split the word not make the word vanish.
>> so I had to test and it really does how weird, is that in the posix spec?.
> Yes.  Here's the relevant sentence from the man page description of $@:
>
>   When  there  are no positional parameters, "$@" and $@ expand to
>   nothing (i.e., they are removed).
>
> Posix says something similar:
>
>   If there are no positional parameters, the expansion of '@' shall
>   generate zero fields, even when '@' is double-quoted.
>
> Chet
Thanks, one lives and learns.





Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)

2013-01-12 Thread John Kearney
Am 13.01.2013 00:04, schrieb Chet Ramey:
> On 1/12/13 10:07 AM, John Kearney wrote:
>
>> regarding -e it mainly has a bad name because there is no good guide how
>> to program with it.
>> so for example this causes stress
>> [ ! -d ${dirname} ] && mkdir ${dirname}
>> because if the dir exists it will exit the scripts :)
> I'm not sure this is what you wanted to say.  When -e is set, that code
> will not cause an error exit if ${dirname} exists and is a directory.  Run
> this script in the bash source directory and see what happens:
>
> set -e
> [ ! -d builtins ] && mkdir builtins
> echo after
>
>
> Chet
:)
It's a little more complex; truthfully, I make rules for how I should do
stuff and then just follow them.

In this case you actually need to put the code in a function; then it's
actually the function return, not the command itself, that causes the
exit. At least I think that's what happens; truthfully, sometimes even
with the caller trace it can be hard to tell what is actually going on.
i.e.

set -o errexit
test_func() {
[ ! -d test ] && echo test2
}

echo test3
test_func
echo test4

now so long as test doesn't exist in the cwd it should errexit.
at least it did for me just now.

Like I say, the only reason I don't like errexit is that it doesn't say
why it exited, so I use the ERR trap, which is great.


Just to clarify, I'm not complaining, just saying why I think people
have bad experiences with errexit.

Having said that, it might be nice to get an optional backtrace on
errors. I do this myself, but it might help others if it were natively
supported.

John









Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)

2013-01-14 Thread John Kearney
Am 14.01.2013 14:33, schrieb Greg Wooledge:
> On Sun, Jan 13, 2013 at 03:31:24AM +0100, John Kearney wrote:
>> set -o errexit
>> test_func() {
>> [ ! -d test ] && echo test2
>> }
>>
>> echo test3
>> test_func
>> echo test4
>>
>> now so long as test doesn't exist in the cwd it should errexit.
>> at least it did for me just now.
> Cannot reproduce.
>
> imadev:~$ cat bar
> #!/bin/bash
>
> set -e
> f() { test ! -d nosuchdir && echo no dir; }
> f
> echo survived
> imadev:~$ ./bar
> no dir
> survived



The "no dir" above means that the test didn't fail; the exit only
happens if the test fails. Sorry, I keep making typos. I really
need more sleep.
This should exit:
#!/bin/bash

set -e
f() { test -d nosuchdir && echo no dir; }
echo testings
f
echo survived




All I was pointing out is that it's safer to use the syntax

[] ||

or

[] && ||

You always need a || on a one-liner to make sure the return value of the
line is 0.
This isn't necessary in the script body, I think, but in a function it
is, unless it's the last command; then its status is returned
automatically.

But let's say you want to do 2 things in a function; you have to do
something like:
f(){
mkdir "${1%/*}" || return $?  # so the line doesn't return an error.
touch "${1}"
}

Anyway, it is nearly always something that should be done anyway.
It's only the conditional one-liners that tend to frustrate people a
lot, from what I've seen.






Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)

2013-01-14 Thread John Kearney
Am 14.01.2013 20:25, schrieb Greg Wooledge:
> On Mon, Jan 14, 2013 at 08:08:53PM +0100, John Kearney wrote:
>> this should exit.
>> #!/bin/bash
>>
>> set -e
>> f() { test -d nosuchdir && echo no dir; }
>> echo testings
>> f
>> echo survived
> OK, cool.  That gives me more ammunition to use in the war against set -e.
>
> ==
> imadev:~$ cat foo
> #!/bin/bash
>
> set -e
> test -d nosuchdir && echo no dir
> echo survived
> imadev:~$ ./foo
> survived
> ==
> imadev:~$ cat bar
> #!/bin/bash
>
> set -e
> f() { test -d nosuchdir && echo no dir; }
> f
> echo survived
> imadev:~$ ./bar
> imadev:~$ 
> ==
> imadev:~$ cat baz
> #!/bin/bash
>
> set -e
> f() { if test -d nosuchdir; then echo no dir; fi; }
> f
> echo survived
> imadev:~$ ./baz
> survived
> ==
>
>> All I was pointing out that its safer to use syntax
>>
>> [] ||
>>
>> or
>>
>> [] && ||
> I don't even know what "safer" means any more.  As you can see in my
> code examples above, if you were expecting the "survived" line to appear,
> then you get burned if you wrap the test in a function, but only if the
> test uses the "shorthand" && instead of the "vanilla" if.
>
> But I'm not sure what people expect it to do.  It's hard enough just
> documenting what it ACTUALLY does.
>
>> you always need a || on a one liner to make sure the return value of the
>> line is a 0.
> Or stop using set -e.  No, really.  Just... fucking... stop. :-(
>
>> but lets say you want to do 2 things in a function you have to do
>> something like.
>> f(){
>> mkdir "${1%/*}" ||return $?  # so the line doesn't return an error.
>> touch "${1}"
>> }
> ... wait, so you're saying that even if you use set -e, you STILL have to
> include manual error checking?  The whole point of set -e was to allow
> lazy people to omit it, wasn't it?
>
> So, set -e lets you skip error checking, but you have to add error checking
> to work around the quirks of set -e.
>
> That's hilarious.
>
I have no idea why errexit exists; I doubt it was for lazy people,
though. It's more work to use it.
I use trap ERR, not errexit, which allows me to log unhandled errors.

I actually find trap ERR/errexit pretty straightforward now. I don't
really get why people are so against it, except that they seem to have
the wrong expectations for it.

btw
|| return $?

isn't actually error checking; it's error propagation.


f(){
    # not the last command in the function
    mkdir "${1%/*}"               # exit on error.
    mkdir "${1%/*}" || return $?  # return an error.
    mkdir "${1%/*}" || true       # ignore error.

    # last command in the function
    touch "${1}"                  # return exit code
}


What is confusing, though, is

f(){
touch "${1}"  # exit on error
return $?
}


This will not work as expected with errexit,
because the touch isn't the last command in the function; however, just
removing the return should fix it.


Also, you need to be careful of stuff like

x=$(false)

You need something more like

x=$(false||true)

or

if x=$(false) ; then

Basically, any situation in which a line returns a non-0 value is
probably going to cause the exit, especially in functions.
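A compact, self-contained sketch of those safe patterns under set -e:

```shell
set -e
x=$(false) || true      # the assignment takes the substitution's status;
                        # || true absorbs it so the script survives
echo "x='${x}'"

if y=$(false); then     # testing the assignment also keeps set -e calm
    echo "unreachable"
fi
echo "still here"       # an unguarded x=$(false) would have exited above
```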


I just do it automatically now.
 

I guess most people aren't used to considering line return values.



Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)

2013-01-14 Thread John Kearney
Am 14.01.2013 22:09, schrieb Ken Irving:
> On Mon, Jan 14, 2013 at 08:57:41PM +0100, John Kearney wrote:
>> ...
>> btw
>> || return $?
>>
>> isn't actually error checking its error propagation.
> Also btw, I think you can omit the $? in this case;  from bash(1):
>
> return [n]
> ...
> If n is omitted, the return status is that of the  last  command
> executed  in the function body.  ...
>
> and similarly for exit:
>
> exit [n]
> ...  If  n  is  omitted,
> the exit status is that of the last command executed.  ...
>
> Ken
>
Thanks, yeah, you're right, but I think it's clearer to include it,
especially for people with less experience. I try to be as explicit as
possible. Perl cured me of my taste for compactness in code. ;)






Re: typeset -p on an empty integer variable is an error. (plus -v test w/ array elements)

2013-01-15 Thread John Kearney
Am 14.01.2013 21:12, schrieb Chet Ramey:
> On 1/14/13 2:57 PM, John Kearney wrote:
>
>> I have no idea why errexit exists I doubt it was for lazy people
>> thought. its more work to use it.
> I had someone tell me one with a straight (electronic) face that -e
> exists `to allow "make" to work as expected' since historical make invokes
> sh -ce to run recipes.  Now, he maintains his own independently-written
> version of `make', so his opinion might be somewhat skewed.
>
> Chet
>
That actually makes a lot of sense. It explains the 2 weirdest things
about it: 1) no error message explaining what happened, 2) weird
behavior with functions.



|& in bash?

2013-01-17 Thread John Caruso
One feature of other shells (e.g. zsh and tcsh) I'd really love to have
in bash is "|&", which redirects both stdout and stderr--basically just
a shortcut for "2>&1 |".  Has this ever been considered for bash?

It may not seem like much of a difference, but it's saved me an enormous
number of keystrokes over the years.  There's nothing more frustrating
in bash than getting to (or worse, just past) "|" and realizing I need
to redirect stderr as well as stdout, then cursoring back and executing
a keyboard-acrobatic "2>&1" for the zillionth time.

- John


Re: |& in bash?

2013-01-17 Thread John Caruso
In article , Chet Ramey wrote:
> On 1/17/13 1:01 PM, John Caruso wrote:
>> One feature of other shells (e.g. zsh and tcsh) I'd really love to have
>> in bash is "|&", which redirects both stdout and stderr--basically just
>> a shortcut for "2>&1 |".  Has this ever been considered for bash?
> 
> That has been in bash since bash-4.0.

I'm simultaneously happy and chagrined :-) (most of the servers I manage
are on bash 3.x, so I hadn't encountered it).  Thanks.
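A quick demonstration for the archives (bash 4.0+; both streams are generated locally so the result is deterministic):

```shell
# |& pipes stderr as well as stdout; it is shorthand for 2>&1 |
{ echo "to stdout"; echo "to stderr" >&2; } |& sort
# prints "to stderr" then "to stdout": both streams reached the pipe
```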

- John


Re: |& in bash?

2013-01-21 Thread John Caruso
In article , Dan Douglas wrote:
> It isn't very common to dump multiple streams into one pipe.

echo "n't very"  >/dev/null 2>&1

> I suggest avoiding |&.

Personally I wouldn't use it in scripts, since I try to stick as close
to plain vanilla Bourne shell as possible (though that's not as much of
a consideration these days as it used to be).  But for interactive use
it's one of the greatest shell time- and effort-savers I know of, and
I'm very happy to hear it's made its way into bash.  I wouldn't suggest
avoiding it unless you like carpal tunnel.

- John

