On Thu, 4 Sep 2025, 00:06 Chet Ramey, <[email protected]> wrote:
> If you think "significant parts" of the shell's behavior are undocumented,
> either point them out or propose documentation patches.
>
Looking at doc/bash.1, the following are some features which I believe are
intentional but which are undocumented.
Compound statements generally:
• Semicolons are optional between a closing keyword (or symbol) and a
non-initial keyword. By way of example, note the lack of semicolons before
« then », « else », « fi », « do », « done », and « esac »:
*for* x *do if* (( RANDOM %2 )) *then case* $x *in* *) ( echo $x ) *esac
else while* [[ $1 = x ]] *do* echo $y ; *done fi done*
# *for*…*done* is all one line, in case your mailer decides to mangle it.
This might be clearer if “list of commands” was specified as *including* a
terminator, rather than explicitly specifying semicolon each time (which is
misleading, since newline and ampersand are equally valid there).
*Lists* says:
*A list is a sequence of one or more pipelines separated by one of the
operators ;, &, &&, or ||, and optionally terminated by one of ;, &, or
<newline>.*
This “optional” terminator is in fact *required* if the last command in the
list is (a pipeline whose last component is) a simple command, *and* the
following token is not one of « *)* », « *;;* », « *;;&* », or « *;&* ».
And then references to *list* should simply omit any mention of a following
semicolon, other than to refer to this definition.
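A quick way to check the rule is to ask bash to parse, but not run, a few fragments. This is only a sketch: the helper name `parses` is hypothetical, and it assumes a `bash` binary is on the PATH.

```bash
# parses SRC: succeed if bash accepts SRC syntactically (-n means parse only).
parses() { bash -n -c "$1" 2>/dev/null; }

parses 'if true; then echo a; fi'    && echo "terminated simple command: accepted"
parses 'if true; then ( echo a ) fi' && echo "subshell without terminator: accepted"
parses 'if true; then echo a fi'     || echo "unterminated simple command: rejected"
parses 'case x in x) echo a ;; esac' && echo "simple command before ;; : accepted"
```

In the rejected case the word `fi` is taken as an argument to `echo`, so the `if` is never closed.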
That said, it might be helpful to rename this *list* to *command-list* so
as to facilitate defining *word-list* (for use with « *for* *var* *in* … »
and « *select* *var* *in* … », and which likewise includes a trailing
semicolon or newline).
These definitions of *command-list* and *word-list* are, to the best of my
knowledge, the most consistent with the original Bourne shell
grammar, though the terminology used was doubtless different.
Arithmetic «for» loops:
*for ((* *expr1* *;* *expr2* *;* *expr3* *))* [*;*] *do* *list* *;* *done*
• After « *continue* », *expr3* will be evaluated and then *expr2*; that's
expected by anyone who writes C, but it's not stated in the manual;
• It notes that “*if any expression is omitted, it behaves as if it
evaluates to 1*”. That seems overly broad, as only *expr2* needs this
qualification; the others can simply be indicated as optional in the
synopsis, like:
*for ((* [*expr1*] *;* [*expr2*] *;* [*expr3*] *))* [*;*] *do*
*command-list* *done*
(using the definition of *command-list* as I suggested above)
• (I was going to mention that the semicolon before « do » is optional, but
that was recently updated in commit e009d30df, thank you.)
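The first two points can be illustrated with a short sketch. The variable names `evens` and `n` are arbitrary and chosen only for this example.

```bash
# continue skips the rest of the body; expr3 (i++) runs next, then expr2.
evens=
for (( i = 0; i < 6; i++ )); do
  (( i % 2 )) && continue
  evens+="$i "
done
echo "$evens"

# All three expressions omitted: the empty expr2 is treated as true,
# so the loop runs until break.
n=0
for (( ; ; )); do
  (( ++n >= 3 )) && break
done
echo "$n"
```

If `continue` skipped *expr3*, the first loop would never get past `i=1`.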
Select loops:
• synopsis indicates « *in* *word* » rather than « *in* *word-list* » or
equivalent
• if « *in* *word-list* » is absent, the semicolon before « do » is
optional (same as « *for* » loops, but see below about an alternative
definition for such lists).
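A small sketch of the second point, using a hypothetical function `pick` so that the positional parameters supply the menu items. The answer is fed on standard input so that it runs non-interactively.

```bash
# With no "in word-list", select iterates over the positional parameters,
# and no semicolon is needed before do.
pick() {
  select choice do echo "$choice" ; break ; done
}

# The menu and the PS3 prompt are written to stderr, so discard them here.
pick red green blue <<< 2 2>/dev/null
```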
Compound « case » statements:
• A semicolon is permitted before « *;;* », which is logically consistent with
« [*(*] *pattern* *)* *command-list* *;;* » using the definition of
« *command-list* » as above, including a terminating semicolon, ampersand, or
newline.
• The « ;; » before « esac » is optional (unless it's not preceded by a
command or a semicolon). However simply marking it as optional in the
synopsis is problematic unless COMMANDS is updated to mean "including the
trailing semicolon/ampersand/newline", and every other synopsis adjusted to
take that into account.
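Both points can be seen in one small sketch. The function name `classify` and its patterns are invented for illustration.

```bash
classify() {
  case $1 in
    a) echo first ; ;;     # a terminator before ;; is accepted
    b) ( echo second ) ;;  # compound command, no terminator needed
    *) echo other          # last clause: ;; omitted before esac
  esac
}

classify a
classify b
classify z
```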
Conditional Expressions
• != does not mention pattern matching with [[
• =, == says unequivocally “True if the strings are equal”, but readers
with long attention spans will later read “When used with the [[ command,
performs pattern matching”.
Both of these need to be more even-handed, like:
=, == for the [[ command, attempt pattern matching and return true iff
successful; for the test and [ commands, compare strings and return true
iff they are the same.
!= same as = except the return status is inverted.
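The difference is easy to demonstrate. In this sketch the variable names `s` and `r1` to `r4` are arbitrary.

```bash
s=foobar

[[ $s == foo* ]]   && r1=true || r1=false   # unquoted right-hand side is a pattern
[[ $s != foo* ]]   && r2=true || r2=false   # != inverts the same pattern match
[[ $s == "foo*" ]] && r3=true || r3=false   # quoted right-hand side is literal
[ "$s" = 'foo*' ]  && r4=true || r4=false   # [ compares strings only

echo "$r1 $r2 $r3 $r4"
```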
Readline Notation
*On some keyboards, the Meta key modifier produces characters with the
eighth bit (0200) set.*
Maybe document that this never applies to terminals that are in UTF-8 mode.
(Also, more users are familiar with decimal 128 or hexadecimal 0x80 than
with octal 0200.)
Expansions:
• Not actually *un*documented, but “documented in a way that leads to
confusion”: having the sentence “*Omitting the colon tests only for a
parameter that is unset*” *before* the descriptions of
« ${*parameter*:-*word*} », « ${*parameter*:=*word*} »,
« ${*parameter*:?*word*} », and « ${*parameter*:+*word*} » means it is
easily missed by someone searching for
the latter. Since those four expansions are the only ones to which this
applies, it might be clearer if that sentence was removed, and instead each
description mentioned both alternatives explicitly, for example:
${*parameter*-*word*} *or*
${*parameter*:-*word*}
Use Default Values.
If *parameter* is unset or, when using the second form, its value is null,
the expansion of *word* is substituted. Otherwise, the value of *parameter*
is substituted.
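A minimal sketch of the two forms side by side. The variable names are arbitrary: `u` is unset and `e` is set but null.

```bash
unset -v u          # u is unset
e=                  # e is set, but its value is null

a="${u-default}"    # unset: word is substituted
b="${u:-default}"   # unset: word is substituted
c="${e-default}"    # null, no colon: the (empty) value is substituted
d="${e:-default}"   # null, with colon: word is substituted

printf '[%s] [%s] [%s] [%s]\n' "$a" "$b" "$c" "$d"
```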
• Similarly the description of « ${*c* *command*;} » would be simpler if
introduced as:
${ *whitespace* *command-list* } *or*
${| *command-list* }
using the definition of *command-list* as I suggested above, but then note
that the whitespace is optional if *command-list* starts with a shell
metacharacter.
Built-in commands
• « declare », « local », and « typeset » have « -c » (capitalize on
assignment) but don't document it.
I'm sure there are more, but I'm tired and getting error-prone.
-Martin