compgen is slow for large numbers of options
Dear All,

I don't know for certain if this is a bug per se, but I think "compgen -W"
is much slower than it "should" be in the case of a large (10000+) number
of options. For example (on a fast i7 2700 CPU), I measure:

    compgen -W "`seq 1 50000`" 1794            #3.83 s
    compgen -W "`seq 1 50000 | grep 1794`"     #0.019 s

In these examples, I'm using `seq` as a trivial way to generate some data,
and picking 1794 as a totally arbitrary match. In the first example,
compgen is doing the filtering, whereas in the 2nd, I obtain the same
result very much faster with grep.

If I increase the upper number by a factor of 10, to 500000, these times
become 436 s (yes, really, 7 minutes!) and 0.20 s respectively. This
suggests that the algorithm used by compgen is O(n^2), whereas the
algorithm used by grep is O(1).

For a real-world example, see
https://bugs.mageia.org/show_bug.cgi?id=373#c8 in which we are using
completion on package-management. In this case, the number of options is
43031.

I hope this is helpful.

Richard
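[Editor's sketch of the prefiltering workaround, for illustration: the
_pkg_complete and all_packages names and the pkg command are hypothetical,
invented here; only the compgen/grep pattern itself is from the thread.]

    _pkg_complete() {
        local cur=${COMP_WORDS[COMP_CWORD]}
        #Prefilter with grep so that compgen only ever sees the few
        #candidates matching the current prefix, not all 43031 words.
        #(For simplicity this treats $cur as a regex prefix.)
        COMPREPLY=( $(compgen -W "$(all_packages | grep -- "^$cur")" -- "$cur") )
    }
    complete -F _pkg_complete pkg

Because grep shrinks the word list before compgen touches it, the cost per
keystroke stays proportional to the output, not to the full package list.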
Re: compgen is slow for large numbers of options
> If I increase the upper number by a factor of 10, to 500000, these times
> become 436 s (yes, really, 7 minutes!) and 0.20 s respectively. This
> suggests that the algorithm used by compgen is O(n^2), whereas the
> algorithm used by grep is O(1).

I meant: grep is O(n).
Re: compgen is slow for large numbers of options
Dear Bob,

Thanks for your explanation. I do understand what is going on and why. But
my point was that compgen has an implicit internal "grep" that is much
less efficient than actual grep. Why is the performance of compgen's
sorting/filtering algorithm so much worse than grep's? Both of them start
with a large list, and filter it to a small one.

At any rate, might I suggest this is worthy of a line or two of
documentation in the compgen manual? 40,000 possible completions is not
necessarily unreasonable. [In this case, urpmi, as used by
Mandriva/Mageia, has had it wrong for some time.]

Best wishes,

Richard

Message: 4
Date: Wed, 14 Mar 2012 13:40:36 -0600
From: Bob Proulx
To: bug-bash@gnu.org
Subject: Re: compgen is slow for large numbers of options
Message-ID: <20120314194036.ga12...@hysteria.proulx.com>
Content-Type: text/plain; charset=us-ascii

Richard Neill wrote:
> I don't know for certain if this is a bug per se, but I think
> "compgen -W" is much slower than it "should" be in the case of a
> large (10000+) number of options.

I don't think this is a bug but just simply a misunderstanding of how much
memory must be allocated in order to generate `seq 1 500000`.

> For example (on a fast i7 2700 CPU), I measure:
>
>     compgen -W "`seq 1 50000`" 1794            #3.83 s
>     compgen -W "`seq 1 50000 | grep 1794`"     #0.019 s
>
> In these examples, I'm using `seq` as a trivial way to generate some
> data, and picking 1794 as a totally arbitrary match.

Right. But in the first case you are generating 288,894 bytes of data and
in the second case 89 bytes of data. That is a large difference.

You could probably speed it up a small amount more by using grep -x -F and
avoid the common substring matches. And perhaps more depending upon your
environment with LC_ALL=C to avoid charset issues.

> In the first example, compgen is doing the filtering, whereas in the
> 2nd, I obtain the same result very much faster with grep.

Yes, since grep is reducing the argument size to 89 bytes. That makes
perfect sense to me. Anything that processes 89 bytes is going to be much
faster than anything that processes 288,894 bytes.

> If I increase the upper number by a factor of 10, to 500000, these
> times become 436 s (yes, really, 7 minutes!) and 0.20 s

That is an increase in argument size to 3,388,895 bytes and there will be
associated memory overhead to all of that increasing the size on top of
that value too. Three-plus megabytes of memory just for the creation of
the argument list, and then it must be processed. That isn't going to be
fast. You would do much better if you filtered that list down to something
reasonable.

> respectively. This suggests that the algorithm used by compgen is
> O(n^2), whereas the algorithm used by grep is O(1).

On the contrary, it suggests that the algorithm you are using, one of
fully allocating all memory, is inefficient. Whereas using grep as a
filter to reduce that memory to 89 bytes is of course more efficient.

I wrote a response on a similar issue previously. Instead of posting it
again let me post a pointer to the previous message.

    http://lists.gnu.org/archive/html/bug-bash/2011-11/msg00189.html

Bob
bug-report/request: allow "set +n" to re-enable commands.
Dear All,

Might I suggest/request that "set +n" should undo the effect of "set -n"?
For example:

    #!/bin/bash
    echo one
    set -n
    echo two
    set +n
    echo three

would print:

    one
    three

Here's why I think it would be useful:

1. Bash doesn't have a block-comment mechanism, like /* ... */, and this
would provide one.

2. The documentation for "help set" says that flags can be undone with
"+", thus the inverse of -n is +n. (Though, in contradiction, it also says
that subsequent commands, which would include the "set +n", are ignored.)

3. It would allow for a neat hack with polyglots. For example:

    #!/bin/bash -n

(There is actually a serious purpose to this, namely to create a php-cli
script that shows all errors, despite the settings in the system-wide
php.ini file.)

Example 3 works if you remove the "-n" and "set +n" parts, though it then
emits an annoying complaint about "?php: No such file or directory".

Thank you for your consideration,

Best wishes,

Richard
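[Editor's note: for comparison, two block-comment idioms that already work
without "set +n"; a minimal sketch:]

    #!/bin/bash
    echo one

    : <<'SKIP'
    echo two      #Never runs; the quoted delimiter also suppresses expansion.
    SKIP

    if false; then
        echo two-b    #Parsed, but never executed.
    fi

    echo three

The heredoc is fed to ":", which ignores its stdin, so everything up to
SKIP is skipped; the "if false" form is parsed (so it must stay
syntactically valid) but never executed.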
Bash - various feature requests
Dear All,

I hope I'm posting this in the right place, since it's really a feature
request, rather than a bug report. However, here are a few things which I
think would be nice to have in bash, and easy to implement:

1) substr support for a negative length argument. For example,

    stringZ=abcdef
    echo ${stringZ:2:-1}    #prints cde

i.e. ${string:x:y} returns the string, from start position x for y
characters; but, if x is negative, start from the right-hand side, and if
y is negative, print up to the end, minus y. This would work the same way
as PHP, and be extremely useful for, say, removing an extension from a
filename. http://uk.php.net/manual/en/function.substr.php

---

2) "help strings" and "help operators"

Strings and operators are all documented in the man page. Likewise, there
is an excellent guide here:
http://www.die.net/doc/linux/abs-guide/string-manipulation.html
However, a quick reminder from within the shell (in the same way that
'for' and ':' are documented) would be really useful.

---

3) Some way to canonicalise a path. For example, converting
./foo//bar/../baz.txt to /home/foo/baz.txt. This isn't easy to do from a
shell script. Maybe something like this could be a built-in?
http://publicobject.com/2006/06/canonical-path-of-file-in-bash.html
There is also a realpath program, but it seems not to be part of bash by
default:
http://src.opensolaris.org/source/xref/sfw/usr/src/cmd/bash/bash-3.0/examples/loadables/realpath.c

---

4) A way to find a relative directory path, i.e. the solution to: if I
were to start in directory /home/me/x/y/z and I wanted to get to
/home/me/x/a/b/c, then what argument would I need to provide to 'cd'?
(The answer is ../../a/b/c.) How could a shell script calculate this?
(One approach is sketched just after this message.)

---

5) An enhancement to read/readline, such that one can specify the initial
value with which the buffer is filled. Currently, we can do:

    read -ep 'Enter your name: ' NAME

and I might type "Richad Neill".    #Note the deliberate typo.

If the script recognises this as invalid, the best it can do is:

    echo "Name not recognised"
    read -ep 'Re-enter your name: ' NAME

and the user must retype it in full. I'd like to propose adding an "-i"
option for initial value:

    echo "Name not recognised"
    read -ep "Please correct your name: " -i "$NAME"

The shell prints:

    Please correct your name: Richad Neill

where the latter part is now editable. Thus the really nice editing
features of readline can be used for updating values already stored in
variables. This is extremely useful when the value is quite long.

---

I hope these thoughts are of some use. I'm not on this list, so please CC
me with any followup. If this report could have been more useful, I'd
welcome any feedback.

Best wishes,

Richard
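[Editor's sketch for request 4: one pure-bash approach. The relpath name
is invented; it assumes absolute paths sharing a common directory prefix,
and ignores edge cases such as trailing slashes or prefixes that end
mid-component.]

    relpath() {
        local from=$1 to=$2 up=''
        #Climb up from $from until it is a leading prefix of $to,
        #emitting one ../ per level climbed.
        while [[ ${to#"$from"} == "$to" ]]; do
            from=${from%/*}
            up="../$up"
        done
        echo "$up${to#"$from"/}"
    }

    relpath /home/me/x/y/z /home/me/x/a/b/c    #prints ../../a/b/c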
Re: Bash - various feature requests
Dear Grzegorz,

Thanks for your helpful reply.

Grzegorz Adam Hankiewicz wrote:
> On 2006-12-27, Richard Neill <[EMAIL PROTECTED]> wrote:
>> 1) substr support for a negative length argument. For example,
>>
>>     stringZ=abcdef
>>     echo ${stringZ:2:-1}    #prints cde
>>
>> i.e. ${string:x:y} returns the string, from start position x for y
>> characters; but, if x is negative, start from the right-hand side,
>> and if y is negative, print up to the end, minus y. This would work
>> the same way as PHP, and be extremely useful for, say, removing an
>> extension from a filename.
>
> If extension removal is all you need, you can already do it.
>
>     $ for f in *; do echo $f; done
>     Makefile.am
>     Makefile.in
>     ucl
>     $ for f in *; do echo ${f%.*}; done
>     Makefile
>     Makefile
>     ucl

I did know about this, but it seems slightly like a sledgehammer on a nut,
using a pattern match instead of a substring. Also, this way doesn't allow
you to trim both the start and the end. Lastly, I just think it would be a
really useful (and easy-to-implement, I think) feature, which is
consistent with the usage in PHP and Perl.

At the moment, the first parameter may be any integer (positive or
negative), but the second parameter can only be positive.

Best wishes,

Richard
Re: Bash - various feature requests
>> 5) An enhancement to read/readline, such that one can specify the
>> initial value with which the buffer is filled. Currently, we can do:
>>
>>     read -ep 'Enter your name: ' NAME
>>
>> and I might type "Richad Neill".    #Note the deliberate typo.
>>
>> If the script recognises this as invalid, the best it can do is:
>>
>>     echo "Name not recognised"
>>     read -ep 'Re-enter your name: ' NAME
>>
>> and the user must retype it in full. I'd like to propose adding an
>> "-i" option for initial value:
>>
>>     echo "Name not recognised"
>>     read -ep "Please correct your name: " -i "$NAME"
>>
>> The shell prints:
>>
>>     Please correct your name: Richad Neill
>>
>> where the latter part is now editable. Thus the really nice editing
>> features of readline can be used for updating values already stored
>> in variables. This is extremely useful when the value is quite long.
>
> Is this the same as
>
>     read -ep "Please correct your name: $NAME" NAME
>
> ?

No - that isn't what I meant. What I want to do is:

1) Print the text "Please correct your name: " without a trailing
newline.

2) Create an editable buffer (using readline), which is initialised with
the current value of the variable $NAME and printed to screen.

3) The user edits the buffer as desired, and when he presses ENTER, its
contents are assigned to the variable NAME.

In the example I gave, with my proposed -i option, the script would
present the user with this:

    Please correct your name: Richad Neill*

The cursor position starts at the *, but can be moved around anywhere
within the buffer. So the typo could be corrected by keying Alt-B, Ctrl-B,
Ctrl-B, r, ENTER, whereas currently the user must type in the full string
from scratch.

Is that clear? Sorry it's hard to explain without a diagram.

Richard
Bash arithmetic doesn't give error message on wrap.
Bash Version: 3.2
Patch Level: 13
Release Status: release

Description:
    $ echo $((4000000000*4000000000))
    -2446744073709551616

Repeat-By:
    Do some arithmetic in bash $(()). If the numbers are out of range,
    the output will be wrong in all sorts of interesting ways. No error
    message is given.

Fix:
    Arbitrary-precision maths would be nice. But at the least, could we
    have an error message if an overflow occurs? The man page says:
    "Evaluation is done in fixed-width integers with no check for
    overflow..." but I'd suggest this represents a bug, not a feature.

Regards,

Richard
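[Editor's sketch of what a script can do today: delegate big arithmetic to
bc, which is arbitrary-precision, or detect a wrapped multiplication after
the fact by checking the inverse division.]

    #Arbitrary precision, via bc:
    echo "4000000000 * 4000000000" | bc    #prints 16000000000000000000

    #Detecting wraparound in $(( )) by verifying with the inverse operation:
    a=4000000000 b=4000000000
    p=$(( a * b ))
    if (( b != 0 && p / b != a )); then
        echo "overflow: $a * $b wrapped to $p" >&2
    fi

Here p wraps to -2446744073709551616, so p/b is -611686018 rather than a,
and the check fires.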
Re: Bash arithmetic doesn't give error message on wrap.
Chet Ramey wrote:
>> Description:
>>     $ echo $((4000000000*4000000000))
>>     -2446744073709551616
>>
>> Repeat-By:
>>     Do some arithmetic in bash $(()). If the numbers are out of
>>     range, the output will be wrong in all sorts of interesting ways.
>>     No error message is given.
>>
>> Fix:
>>     Arbitrary-precision maths would be nice. But at the least, could
>>     we have an error message if an overflow occurs? The man page
>>     says: "Evaluation is done in fixed-width integers with no check
>>     for overflow..." but I'd suggest this represents a bug, not a
>>     feature.
>
> I'm comfortable with the current behavior. POSIX requires that
> expressions be evaluated according to the C standard, and that
> standard leaves the treatment of integer overflow as undefined.

If POSIX says the behaviour is undefined, then surely bash can do whatever
it wants. So, printing an error message would be allowed. The error
message would be:

a) Most helpful to the user (least surprise).

b) Consistent with other cases, where bash does give warnings. For
example:

    $ X=$((3+078))
    bash: 3+078: value too great for base (error token is "078")
    $ echo $?
    1

Regards,

Richard
Re: Bash arithmetic doesn't give error message on wrap.
Bob Proulx wrote:
> Richard Neill wrote:
>> b) Consistent with other cases, where bash does give warnings. For
>> example:
>>
>>     $ X=$((3+078))
>>     bash: 3+078: value too great for base (error token is "078")
>>     $ echo $?
>>     1
>
> That is not really a comparable case. The problem there is that the
> leading zero specifies an octal constant and the 8 cannot be converted
> to octal. The "3+" part is just a distraction.
>
>     echo $((08))
>     bash: 08: value too great for base (error token is "08")
>
> Bob

Are you sure this isn't comparable? After all, in both cases, the user has
submitted something to which bash cannot give a sensible answer. In the
integer-overflow case, bash simply returns the wrong answer, with no
warning. But in the octal case, bash (quite correctly, and helpfully)
prints a warning.

If bash were to be consistent, then it should display no error message in
the case of $((3+078)); it should either "do what I mean" [evaluate 3 +
078 as 0103, treating 78 as 7*8 + 8], or "do something daft" [e.g.
evaluate as 073, treating the 8 as an integer overflow in the units place,
which is not checked for].

There's obviously an advantage to the user in being warned when an integer
overflow occurs; is there any possible downside (apart from a couple of
extra lines of code)?

Richard
Re: Bash arithmetic doesn't give error message on wrap.
Bob Proulx wrote:
> Andreas Schwab wrote:
>> Richard Neill <[EMAIL PROTECTED]> writes:
>>> Are you sure this isn't comparable? After all, in both cases, the
>>> user has submitted something to which bash cannot give a sensible
>>> answer. In the integer-overflow case, bash simply returns the wrong
>>> answer, with no warning.
>>
>> The answer is not really wrong, it's the same you get from the
>> equivalent expression when evaluated in C.
>
> Let me phrase this in a different way. 8<----

[OK - agreed. Thanks for your explanation.]

> Avoiding overflow is actually a very difficult problem. People have
> been working with and around overflow issues for years and years.
> There is no clear "right" answer. On some cpus the result is done one
> way and on others the result is done a different way. It is in these
> cases where typically POSIX would give up and declare it undefined
> behavior. This is why "4000000000*4000000000" appears as a completely
> different problem than "08".

I thought testing for overflow was quite simple? Isn't it just a case of
looking at the carry-bit, and seeing whether it gets set? If so, then the
warning message would be a two-line patch to the code. That said, I don't
know enough about CPU internals to know what the carry-bits do with
multiplication. (Addition/subtraction overflows just change the
carry-flag; integer division never suffers from overflows.)

> About the only way for bash to avoid it would be to include a full
> arbitrary precision math library to evaluate these expressions itself.

I wasn't suggesting that! We have bc anyway. I was only suggesting that,
when the CPU detects an overflow, bash could pass on the warning.

> But that would slow the shell down by a large amount and it would make
> the shell much bigger than it is today. Both of those things would
> cause people problems. The entire reason cpus have math processors is
> because these operations can be quite slow when done in software.
>
> To give some additional weight to this, note that perl also uses the
> underlying cpu for numerical computations. So this is the same issue
> as would be true in Perl. Or in C/C++ too.
>
>     perl -e 'printf("%d\n",4000000000*4000000000);'
>     -2446744073709551616

Yes... but that's actually the %d doing it. Perl would automatically
convert to a float. E.g.

    $ perl -e 'print(4000000000*4000000000);'
    1.6e+19

Thanks very much for your explanation.

Best wishes,

Richard
Bash error message for unterminated heredoc is unhelpful.
Dear All,

In some cases, bash gives exceptionally unhelpful error messages, of the
sort "unexpected end of file". This is next-to-useless as a debugging aid,
since there is no way to find out where the error really lies.

I'm using bash version: bash-3.2-7mdv2008.1

Here are 2 shell scripts with examples. bug-example.sh demonstrates the
problem; bug-example2.sh is where bash gets it right.

Thanks very much,

Richard

[EMAIL PROTECTED] ~]$ cat bug-example.sh
-
#!/bin/bash
#This is an example of bash being unhelpful with syntax errors.

function usage () {
	cat <<-EOT
		This help text is a heredoc
	EOT 
	#NOTE that the line above contains a trailing TAB, and
	#is therefore NOT valid as the end of the heredoc.
}

usage

echo "A long script goes here..."
echo "Many lines later..."
exit 0

#This script generates the following error:
#   ./bug-example.sh: line 37: syntax error: unexpected end of file
#That is really not helpful, since there's no way to track down where
#the error started. Line 37 is not interesting. What we need is a
#warning about line 6.

#At a minimum, we should get this error message:
#   ./bug-example.sh: line 37: syntax error: unexpected end of file
#   (started at line 6)
#Better, would be this error message:
#   ./bug-example.sh: line 6: syntax error: unterminated heredoc.

#An additional buglet is that, in fact, trailing whitespace after
#a heredoc terminator should probably be ignored anyway?
-

[EMAIL PROTECTED] ~]$ cat bug-example2.sh
-
#!/bin/bash
#This is an example of bash being *helpful* with syntax errors.

X=$((1 + 2)      #NOTE missing trailing ).
echo "X is $X"   #Should print 'X is 3'

echo "A long script goes here..."
echo "Many lines later..."
exit 0

#This script gets it right with the error message:
#   ./bug-example2.sh: line 5: unexpected EOF while looking for
#   matching `)'
#   ./bug-example2.sh: line 20: syntax error: unexpected end of file
#So, we can quickly find the bug.
-
Re: Bash error message for unterminated heredoc is unhelpful.
Chet Ramey wrote:
> Richard Neill wrote:
>> Dear All,
>>
>> In some cases, bash gives exceptionally unhelpful error messages, of
>> the sort "unexpected end of file". This is next-to-useless as a
>> debugging aid, since there is no way to find out where the error
>> really lies.
>
> For better or worse, bash allows end-of-file to delimit a here
> document. That is historical sh behavior.
>
> The end-of-file syntax error message comes when the shell tries to
> read the token following the here document.
>
> Chet

Thanks for your reply. Fair point. But nevertheless, couldn't bash still
tell us *why* the EOF is unexpected, and where the _start_ of the
offending syntax lies? That was the point of my second example: the error
is still an unexpected EOF, but in that case, bash also informs the user
that the missing ')' is due to a '(' which was on line 5.

Incidentally, bash is similarly unhelpful in the case of an 'if' statement
which is missing its 'fi', or a 'case' missing its 'esac'. If the script
is a long one, there's no easy way to locate the offending token, short of
laboriously commenting out sections and using the divide-and-conquer
method.

Regards,

Richard
Bash/readline enhancement: wish to pre-set initial value of input text
Dear All,

When using read, it would be really neat to be able to pre-fill the form
with a default (or previous) value.

For example, a script which wants you to enter your name, and thinks that
my name is Richard, but that I might want to correct it. Alternatively,
this would be useful within a loop, to allow correction of
previously-entered text, without fully typing it again.

So, I propose an extra option, -i, to read, which will set the initial
value of the text following the prompt. For example,

    #!/bin/bash
    read -e -p 'Enter your name: ' -i 'Richard' NAME
    echo "Hello, $NAME"

This would print:

    Enter your name: Richard

I would then be able to edit the part after the prompt, and change it to:

    Enter your name: R. Neill

This would then print:

    Hello, R. Neill

It is equivalent to the following in PHP/HTML:

    Enter your name: <input type="text" name="name" value="Richard">

An alternative syntax might be to make use of stdin for the read command,
e.g.:

    echo 'Richard' | read -e -p 'Enter your name: ' NAME

though I think I prefer the -i.

I hope you like this idea. Thanks very much for your help.

Richard
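[Editor's note: bash 4.0 and later implement read -i with exactly these
semantics; it takes effect only together with -e. A minimal runnable
sketch of the correction-loop use-case; the validity regex is invented
for illustration.]

    #!/bin/bash
    NAME=""
    re='^[A-Za-z. ]+$'    #Hypothetical validity test.
    while : ; do
        #Re-offer the previous answer for editing, instead of retyping.
        read -e -p 'Enter your name: ' -i "$NAME" NAME
        [[ $NAME =~ $re ]] && break
        echo "Name not recognised"
    done
    echo "Hello, $NAME"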
Bash substrings: wish for support for negative length (read till n from end)
Dear All,

Substrings in bash take 2 parameters, the start and the length. Start may
be negative, but if length is negative, it throws an error. My request is
that bash should understand a negative length. This would be useful on
some occasions, and would be similar to the way PHP does it:
http://uk.php.net/manual/en/function.substr.php

For clarity, here are all the cases; the relevant ones are the last two:

    $ stringZ=abcdef

    $ echo ${stringZ:2}        #Positive start, no length.
    cdef                       #Reads from start.

    $ echo ${stringZ: -2}      #Negative start, no length.
    ef                         #Reads 2 back from end.

    $ echo ${stringZ:-2}       #No space before the -
    abcdef                     #Is this what we expect?
                               #(or an unrelated bug?)

    $ echo ${stringZ:2:1}      #Starts at 2, reads 1 char.
    c

    $ echo ${stringZ:2: -1}    #Wish: start at 2, read till
    ERROR                      #1 before the end, i.e.
                               #cde

    $ echo ${stringZ: -3: -1}  #Wish: start 3 back, read till
    ERROR                      #1 before the end, i.e.
                               #de

i.e. ${string:x:y}
* returns the string, from start position x for y characters
* but, if x is negative, start from the right-hand side
* if y is negative, print up to (the end - y).

Thanks very much,

Richard
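[Editor's sketch: the requested results can be emulated today by computing
the length explicitly from ${#stringZ}. (Bash 4.2 later added native
support for exactly this negative-length form.)]

    stringZ=abcdef
    echo "${stringZ:2:$(( ${#stringZ} - 2 - 1 ))}"   #cde: from 2, to 1 before the end
    echo "${stringZ: -3:$(( 3 - 1 ))}"               #de:  3 back, to 1 before the end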
Re: Bash substrings: wish for support for negative length (read till n from end)
Jan Schampera wrote:
> Richard Neill wrote:
>
>>     $ echo ${stringZ:2: -1}    #Wish: start at 2, read till
>>     ERROR                      #1 before the end, i.e.
>>                                #cde
>>
>>     $ echo ${stringZ: -3: -1}  #Wish: start 3 back, read till
>>     ERROR                      #1 before the end, i.e.
>
> Use (-1), i.e.
>
>     $ echo ${stringZ:2:(-1)}
>
> See also
> http://bash-hackers.org/wiki/doku.php/syntax/pe#substring_expansion
> (at the bottom of the section).

Dear Jan,

Thanks for your comment. I now understand why

    echo ${stringZ:-1}

is invalid, and it has to be

    echo ${stringZ:(-1)}    or    echo ${stringZ: -1}

BUT, the example that you gave doesn't actually work; it returns with an
error:

    -bash: (-1): substring expression < 0

My point is that, though the first parameter (offset) may be negative, the
second parameter (length) can only be positive. This is what would be
useful to change.

Thanks,

Richard
Re: Bash/readline enhancement: wish to pre-set initial value of input text
Jan Schampera wrote:
> Richard Neill wrote:
>> Dear All,
>>
>> When using read, it would be really neat to be able to pre-fill the
>> form with a default (or previous) value.
>>
>> For example, a script which wants you to enter your name, and thinks
>> that my name is Richard, but that I might want to correct it.
>> Alternatively, this would be useful within a loop, to allow
>> correction of previously-entered text, without fully typing it again.
>
> A bit of the functionality (in some way) is already there. You can
> preload the commandline history and use read -e:
>
> --snipsnap--
> If -e is supplied and the shell is interactive, readline is used to
> obtain the line.
> --snipsnap--
>
> A bit of hard work, though.

I do have a sort of workaround, namely to put the default value on the
clipboard, by using:

    function copy_to_clipboard () {
        #If we're running KDE, put the text into klipper.
        #Else, use xclip. Fail silently if we can't do it.
        { dcop klipper klipper setClipboardContents "$1" ||
          echo "$1" | xclip ; } > /dev/null 2>&1
    }

    copy_to_clipboard INITIAL_VALUE
    read -e -p "Enter name" NAME    #User must middle-click

but this is rather ugly, as well as only semi-functional. What I want to
do is just to pre-fill readline's buffer, before it gets presented to the
user. I also tried to hack something nasty out of xmacroplay, but that
doesn't work.

Richard
Bash: proposal for >>> operator
Dear All,

Might I propose that bash should add another operator, >>>, for
"redirection into a variable"? This would be by analogy with the <<<
operator.

For example, we can currently use <<< to save an "echo", by doing this:

    TEXT="Hello World"
    grep -o 'hello' <<< "$TEXT"

instead of

    TEXT="Hello World"
    echo "$TEXT" | grep -o 'hello'

I am therefore proposing that the following syntax would be useful:

    echo "Hello World" >>> TEXT

which creates a variable named TEXT, whose contents is the string
"Hello World".

Why is this useful?

1. Read has a nasty habit of creating subshells. For example,

    echo Hello | read TEXT

doesn't do what we expect. TEXT is empty!

2. The $() or `` constructs are great, excepting that they also create
subshells. This messes up things like PIPESTATUS. For example:

    echo hello | cat | cat | cat    #hello
    echo "${PIPESTATUS[@]}"         #0 0 0 0

    TEXT=$(echo hello | cat | cat | cat)
    echo "${PIPESTATUS[@]}"         #0

Here we've captured the output we wanted, but lost the pipestatus.

3. The $() construct doesn't let you capture both stderr and stdout into
different variables.

I know I could do it all with tempfiles, but that somewhat misses the
point.

Incidentally, if this is useful, it would be nice to support the rather
prettier counterpart to the <<< operator, and permit this usage:

    "$TEXT" >>> grep -o 'hello'

What do you think?

Regards,

Richard
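[Editor's sketch of two partial workarounds for points 1 and 2 that need
no new syntax. shopt -s lastpipe requires bash 4.2+ and only takes effect
when job control is off, e.g. in a script.]

    #Process substitution keeps read in the current shell:
    read -r TEXT < <(echo hello | cat | cat | cat)
    echo "$TEXT"    #hello

    #lastpipe runs the final pipeline element in the current shell,
    #so both TEXT and the full PIPESTATUS survive:
    shopt -s lastpipe
    echo hello | cat | cat | read -r TEXT
    echo "$TEXT" "${PIPESTATUS[@]}"    #hello 0 0 0 0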
bash: request for a way to return variables to the parent of a subshell
At the moment, variables set within a subshell can never be accessed by
the parent script. This is true even for an implicit subshell, such as
that caused by read. For example, consider the following (slightly
contrived) example:

    touch example-file
    ls -l | while read LINE ; do
        if [[ "$LINE" =~ example-file ]]; then
            MATCH=true                          # [a]
            echo "Match-1"
        fi
    done

    if [ "$MATCH" == true ]; then               # [b]
        echo "Match-2"
    fi

This prints "Match-1", but does not print "Match-2". The only way to get
data out of the read-subshell is by something like "exit 2", and looking
at $?.

It's already possible to export a variable into the environment, and for
subshells to inherit variables from the main script. Do we need a new
keyword to achieve the reverse? Is there any way to make sure that
variables defined at [a] can be made to still exist at [b]?

Thanks,

Richard
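[Editor's sketch, anticipating the answer later in this thread: the
example works today if the pipeline is inverted with process substitution,
so the while loop, and hence [a], runs in the parent shell.]

    touch example-file
    MATCH=false
    while read LINE ; do
        if [[ "$LINE" =~ example-file ]]; then
            MATCH=true                          # [a]
            echo "Match-1"
        fi
    done < <(ls -l)

    if [ "$MATCH" == true ]; then               # [b]
        echo "Match-2"
    fi

This version prints both "Match-1" and "Match-2".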
Re: bash: request for a way to return variables to the parent of a subshell
Dear Eric,

Thank you for your helpful answer. I'd understood that bash *doesn't* pass
info back from the child to the parent, but I didn't realise that it was
fundamentally *impossible* to do in Unix. I guess that tempfiles would do
it - though that's rather ugly.

Is there any way to use "read" to iterate over its standard input without
creating a subshell?

If it's of interest, the actual part of the script I use is below - the
aim is to parse the output of "ffmpeg -formats" to see whether certain
codecs are supported by that build.

Regards,

Richard

--
#Does this version of FFMPEG support the relevant file format? Exit
#if not. Arguments: $1='file' or 'codec'; $2='E','encode' or
#'D','decode'; $3=format/codec_name
#Example: check_ffmpeg_format file D ogg

#The output of `ffmpeg -formats` has section headings such as
#"File formats", and each section is delimited by a blank line.
#The first part of the line contains letters DEA(etc) depending
#on whether the codec/file is supported
#for reading (decoding) and/or writing (encoding).
function check_ffmpeg_format_support(){

    local filecodec=$1
    local decodeencode=$2
    local filecodec_name=$3

    if [ $filecodec == 'file' ];then
        local start_trigger='File formats:'
        local end_trigger=''
        local terminator='\ +'
        local filecodec_txt='file format'
    else
        local start_trigger='Codecs:'
        local end_trigger=''
        local terminator='$'
        local filecodec_txt='with codec'
    fi

    if [ $decodeencode == 'decode' -o $decodeencode == 'D' ];then
        local decodeencode='D[A-Z ]*'
        local decodeencode_txt='decoding'
    else
        local decodeencode='[A-Z ]?E[A-Z ]*'
        local decodeencode_txt='encoding'
    fi

    local matchme='^\ *'$decodeencode'\ +'$filecodec_name'\ *'$terminator
    local relevant=false

    #Warning: this pipe has the effect of a subshell. Variables are
    #set inside the pipeline, and cannot be accessed outside it.
    #Search between trigger points.
    ffmpeg -formats 2>/dev/null | while read line ; do
        if [[ $line =~ $start_trigger ]]; then
            relevant=true
        fi
        if [[ $line == $end_trigger ]]; then
            relevant=false
        fi
        if [ "$relevant" == true ];then
            if [[ $line =~ $matchme ]];then    #Regex match.
                exit 2
            fi
            #Exit the '| while read...' part, and return $? so we
            #know the result.
        fi
    done

    if [ $? != 2 ]; then
        echo -e "ERROR: the installed version of 'ffmpeg' was built without support enabled for $decodeencode_txt $filecodec_txt '$filecodec_name'.\n"
        exit 1
    fi
}
--

Eric Blake wrote:
> According to Richard Neill on 7/22/2008 8:04 PM:
> | This prints "Match-1", but does not print "Match-2".
> |
> | The only way to get data out of the read-subshell is by something
> | like "exit 2", and looking at $?
>
> You can also use files. The position within a seekable file is
> preserved from child to parent, but once you are using files, you
> might as well go with the file contents.
>
> | It's already possible to export a variable into the environment,
> | and for subshells to inherit variables from the main script. Do we
> | need a new keyword to achieve the reverse? Is there any way to make
> | sure that variables defined at [a] can be made to still exist at
> | [b]?
>
> A new keyword is insufficient. This is something that is fundamentally
> not provided by Unix systems - short of using the file system, there
> really isn't a way to pass arbitrary information from a child process
> back to the parent. But maybe the 'source' builtin is what you are
> looking for, to execute a script in the same context rather than
> forking off a child.
Re: bash: request for a way to return variables to the parent of a subshell
Thank you. That's a really neat solution - and it would never have
occurred to me. I always think from left to right!

Richard

Paul Jarc wrote:
> Richard Neill <[EMAIL PROTECTED]> wrote:
>> the aim is to parse the output of "ffmpeg -formats" to see whether
>> certain codecs are supported by that build.
>
> I'd use something like:
>
>     while read line; do
>         ...
>     done < <(ffmpeg -formats 2>/dev/null)
>
> That puts ffmpeg into a subshell instead of read.
>
> paul
Bash RFE: Goto (especially for jumping while debugging)
Dear All,

Here's a rather controversial request, namely that bash should support
'goto'. The reason I'd like to see it is to make debugging long scripts
easier. I'm working at the moment on a 2000+ line script, and if I want to
test stuff at the end, I'd really like to have something like the
following:

    #!/bin/bash
    #initialisation stuff goes here.

    goto LABEL

    #lots of stuff here that I want to skip.
    #Bash doesn't have a multi-line comment feature.
    #Even if it did, one can't do a multi-line comment containing
    #another multi-line comment. A good editor makes this less
    #painful, but isn't an ideal solution.

    LABEL

    #stuff I want to test
    exit

    #stuff at the end of the script, which I don't want to run
    #while testing.

We already have exit, which is really short for "jump directly to the
end". What would be great is a way to "jump into the middle".

What do you think?

Richard

P.S. I am sure lots of people will complain (correctly) that "Goto is
considered harmful". I'd agree that, in most cases, it is. But in some
cases, such as the above, it's quite useful. Maybe the feature could be
called "debug-goto" in order to emphasise its debugging nature, as opposed
to use in a regular program.

P.P.S. A hack that would demonstrate what I mean could be implemented by
abusing heredocs. Instead of

    goto LABEL
    ...
    LABEL

write

    cat >/dev/null <<'LABEL'
    ...
    LABEL

This has exactly the same effect.
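[Editor's sketch of a goto-free alternative: structure the script as
functions and pick the entry point from the command line. The function
names here are invented for illustration.]

    #!/bin/bash
    init()    { echo "initialisation stuff"; }
    middle()  { echo "lots of stuff I want to skip"; }
    test_it() { echo "stuff I want to test"; }

    init
    case ${1:-all} in
        debug) test_it ;;           #./script debug jumps straight to the test
        all)   middle; test_it ;;
    esac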
Re: Bash RFE: Goto (especially for jumping while debugging)
Bob Proulx wrote:
> Richard Neill wrote:
>> Dear All,
>
> In the future please start a new message for a new thread of
> discussion. When you reply to old messages from three months ago,
> those of us who actually keep months worth of email see the message
> threaded with the previous discussion about variables and subshells.
> If anyone has killed that thread then the continuation is also killed.

D'oh! Sorry about that. I had always thought that editing the subject line
was what changed the thread, rather than some internal state variable in
the mail-client. I stand corrected. BTW, doesn't bash have a bugzilla?

>> Here's a rather controversial request, namely that bash should
>> support 'goto'.
>
> Although goto does have uses it is definitely a keyword of the damned.
>
>>     cat >/dev/null <<'LABEL'
>>     ...
>>     LABEL
>>
>> This has exactly the same effect.
>
> I actually was going to reply and suggest exactly the above. But then
> realized you had already suggested it. That does work.

Actually, maybe the way is to define a function called "goto", which
handles the common case of a forward jump.

> Normally I would do a commented out section. Easy and
> non-controversial. Or I would do if false; then...fi

No good if one wants to jump over/out-of a control structure.

> For just a comment section this doesn't seem to be a compelling need.
> For that you would need to show how writing state-machines or
> something is so much easier or some such... I think existing
> functionality would work.

Given that 'goto' would intentionally override all other control
structures, I can suggest 2 ways this might be implemented:

i. Treat goto as "exit", but don't clean up variables and functions.
Then, go back to the start, skipping all instructions till LABEL. Doing it
this way means that goto quits out of all nested control structures.

ii. Source $0, but skip all instructions till LABEL. Doing it this way
means that goto does not quit out of any nested control structures.

Best wishes,

Richard
idea: statically-linked "busy-bash"
Dear All,

Here's an idea that occurred to me. I'm not sure whether it's a great
idea, or a really really stupid one, so please feel free to shoot it down.

Anyway, there are an awful lot of shell scripts where a huge number of the
coreutils get repeatedly called in separate processes. This call-overhead
makes the scripts run noticeably slower. What I'm suggesting is to
experimentally build a version of bash which has mv/cp/ls/stat/grep etc.
all built in. This would be a rather bigger binary (similar to busybox),
but might allow much much faster execution of long scripts.

A very quick experiment shows that this might be worthwhile:

    date
    for ((i=0;i<1000000;i++)); do echo -n ""; done
    date
    for ((i=0;i<10000;i++)); do /bin/echo -n ""; done
    date

prints:

    Thu Apr  9 07:05:19 BST 2009
    Thu Apr  9 07:05:30 BST 2009
    Thu Apr  9 07:05:47 BST 2009

In other words, 1E6 invocations of the builtin take about 11 seconds,
while 1E4 invocations of the standalone binary take 17 seconds. The
builtin echo is therefore about 150 times faster.

What do you think?

Richard
Re: idea: statically-linked "busy-bash"
Andreas Schwab wrote:
>> What I'm suggesting is to experimentally build a version of bash
>> which has mv/cp/ls/stat/grep etc. all built in.
>
> This is possible without rebuilding bash, see the documentation of the
> `enable' builtin. There are already a few examples in the bash
> distribution under examples/loadables.

Thanks - that's interesting to know. However, most of the examples seem to
be cut-down functions in some way - I can see that becoming
incompatible/non-portable. What I was contemplating is to build in most of
the actual GNU utils, with the aim of making e.g. the builtin "grep"
functionally identical to /bin/grep. Then I could make a system-wide
change, replacing the regular bash shell with the fat-bash, and still have
everything work...

Richard
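[Editor's sketch of loading one of those examples as a builtin; the path
to the compiled loadable is illustrative and varies by build and
distribution.]

    #Assuming realpath has been built from examples/loadables:
    enable -f ./examples/loadables/realpath realpath
    type realpath     #reports: realpath is a shell builtin
    realpath /foo//bar/../baz.txt

Once enabled, the builtin runs in-process, so it avoids exactly the
fork/exec overhead measured above.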
Re: $\n doesn't get expanded between double-quotes
Thanks for your reply.

>> Description:
>>     Bash allows escape characters to be embedded by using the $'\n'
>>     syntax. However, unlike all other $variables, this doesn't work
>>     with embedded newlines. I think it should.
>>
>> Repeat-By:
>>     X="a$'\n'b c"
>>     echo "$X"
>>
>>     expect to see:
>>     a
>>     b c
>>
>>     actually see:
>>     a$'\n'b c
>>
>> Fix:
>>     $'\n' should be expanded within double-quotes, like other
>>     variables are. Otherwise, please correct the man-page to make it
>>     clearer.
>
> $'\n' is not a variable. As the man page says:
>
>     Words of the form $'string' are treated specially.
>
> Note "Words". Inside double quotes, $'\n' is not a word.

I agree that this is technically correct. However, it violates the
principle of least surprise, which is that, in bash, the $ symbol always
expands the value of the thing after it (with the exceptions of '$' and
\$). On re-reading the man-page, I agree that the documentation is
consistent with your explanation, though it still appears more likely to
imply mine.

>> If this is a feature, not a bug, then is there a better way to
>> include newlines in a variable-assignment? The syntax
>>
>>     X="a"$'\n'"b c"
>>
>> will do it, but that is really really ugly.
>
> X=$'a\nb c'

This is still a missing feature: how to embed newlines in a double-quoted
bash string assignment. For example, if I want to write:

    EMAIL_BODY="Dear $NAME,$'\n\n'Here are the log-files for $(date)$'\n\n'Regards,$'\n\n'$SENDER"

then this doesn't work. There are ways around it, such as:
 - building up the string in pieces, or
 - EMAIL_BODY=$(echo -e "$EMAIL_BODY")
but it's really ugly to do.

As I imagine that nobody uses the current behaviour of $'\n' inside
double-quotes, may I request this as a functionality change?

Best wishes,

Richard
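[Editor's sketch of one more workaround: printf -v (bash 3.1+) formats
directly into a variable, so the newlines can live in the format string
rather than being spliced in.]

    printf -v EMAIL_BODY 'Dear %s,\n\nHere are the log-files for %s\n\nRegards,\n\n%s' \
        "$NAME" "$(date)" "$SENDER"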