Re: [sr #111166] ngettext syntax

2025-02-24 Thread Phi Debian
Maybe some orthogonality is missing regarding %d expecting a string vs. a num-expr

$ a=1234 i=1
$ echo ${a:i+1}
34
$ echo ${a:$((i+1))}
34
$

Why is this 'offset' evaluated? You could apply the same argument you
used for printf: $((...)) is good enough, adds clarity, and removes
ambiguity, and the docco for 'offset' could have said "a valid number".

Why is printf different?

How many people bother to write ${a:$((i+1))} when ${a:i+1} is good enough?

I am pretty sure that if printf were orthogonal with ${var:offset:length},
nobody would ever use $((...)) for the %d conversion specifier.
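
To make the contrast concrete, here is a minimal sketch, assuming bash (the variable names are only for illustration). The substring offset is evaluated as an arithmetic expression, while the %d argument has to be expanded with $((...)) beforehand.

```shell
#!/usr/bin/env bash
a=1234 i=1

# The offset in ${var:offset} is evaluated as an arithmetic expression.
sub_implicit=${a:i+1}
sub_explicit=${a:$((i+1))}
echo "$sub_implicit $sub_explicit"

# printf's %d takes its argument as a number string, so the expression
# is expanded by the shell before printf ever sees it.
num=$(printf '%d' "$((i+1))")
echo "$num"
```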

I think the 'clarity/ambiguity' argument is not the whole story; maybe it
is more about being aligned with /bin/printf, which is not able to evaluate
expressions. But if that is the reason, then bash printf has no reason to
support, say, %n, which /bin/printf cannot honor.

$ /bin/printf '123%n\n' i ; echo "i=$i"
123/bin/printf: %n: invalid conversion specification
i=3
$ printf '123%n\n' i ; echo "i=$i"
123
i=3
$

So if we admit bash printf is not /bin/printf, then %d expecting 'a string
that must be a valid num-expr' instead of 'a string that must be a valid
number' is a possible deviation. Again, maybe with a flag... if
compatibility with xyz is a concern.


Re: How to fix this

2025-02-24 Thread microsuxx
what u wanna fix on it

On Mon, Feb 24, 2025, 3:13 PM rjuliuscezar--- via Bug reports for the GNU
Bourne Again SHell  wrote:

> usage: command-not-found 
>
> ╭─u0_a440@localhost ~
> ╰─🍄💨➤➤
>
>


Re: [sr #111166] ngettext syntax

2025-02-24 Thread Phi Debian
On Mon, Feb 24, 2025 at 8:22 AM Martin D Kealey 
wrote:

>
>
> On Mon, 24 Feb 2025 at 13:48, Phi Debian  wrote:
>
>> I also forgot to mention that C99 introduced this
>>
>>    There may be no gaps in the numbers of arguments specified using
>>    '$'; for example, if arguments 1 and 3 are specified, argument 2
>>    must also be specified somewhere in the format string.
>
>
> That rule is necessary in C because of the many ways that varargs can:
> different kinds of values can potentially occupy differing numbers of
> consecutive bytes in the space used to store as parameters. (I would say
> "in the call stack", except that's not how ISO-9899 defines things, even
> though it is how most implementations work in practice.)
>

> In the shell, we already have an argv with consistently sized elements, so
> randomly indexing into it does not pose any particular challenge. So I
> wouldn't expect that rule to be necessary, in the sense that it doesn't
> offer any flexibility to the implementation.
>
That's true, yet it defines a kind of idiom regarding printf that I agree
a shell doesn't have to follow (and it doesn't in my ksh93 patch) but that
could still be considered.


> -Martin
>
> PS: Hmmm, well I guess repetition could be implemented by simply shifting
> off as many args as there are conversions, and that would be a simple way
> to get consistent repetition whether conversions are numbered or not, but
> it feels perverse and I don't like it - though maybe I'll change my mind
> when I've thought it over.
>

Not sure I get it, but if you mean
$ printf '%1$s %1$s %1$s\n' q w e r t y
should produce
q q q
r r r

I don't agree, or rather, this is not the current ksh93 behavior:
$ printf '%1$s %1$s %1$s\n' q w e r t y
q q q
w w w
e e e
r r r
t t t
y y y

Still ksh93 can produce your output (perverted indeed)
$ printf '%1$s %1$s %1$s%3$Z\n' q w e r t y
q q q
r r r

So while ksh can produce both outputs (flexible), your algorithm would prevent
the former output, so you lose flexibility.
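
The two reuse strategies can be emulated in any shell, even one without numbered conversions. This is only a sketch: repeat_first is a hypothetical helper, and the stride stands in for "how many arguments one pass over the format consumes".

```shell
#!/usr/bin/env bash
# Hypothetical helper: repeat_first STRIDE ARGS...
# Emulates the format '%1$s %1$s %1$s\n' being reused, advancing by STRIDE
# arguments after each pass over the format.
repeat_first() {
    local stride=$1
    shift
    while [ "$#" -gt 0 ]; do
        printf '%s %s %s\n' "$1" "$1" "$1"
        # Never shift past the end of the argument list.
        shift "$(( stride < $# ? stride : $# ))"
    done
}

# Advance by the highest index used (1): one line per argument.
by_max_index=$(repeat_first 1 q w e r t y)
# Advance by the number of conversions (3): "q q q" then "r r r".
by_conv_count=$(repeat_first 3 q w e r t y)

printf '%s\n' "$by_conv_count"
```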


> PPS: I wouldn't actually shift the list of course; I'd just track a
> striding index offset instead.
>


How to fix this

2025-02-24 Thread rjuliuscezar--- via Bug reports for the GNU Bourne Again SHell
usage: command-not-found 

╭─u0_a440@localhost ~
╰─🍄💨➤➤



Re: [sr #111166] ngettext syntax

2025-02-24 Thread Chet Ramey

On 2/22/25 4:38 AM, Phi Debian wrote:


This way pure unnumbered (no mixed indexed) works as before (as we are used to).
Pure numbered (no unnumbered) works as it does in C.
Mixed departs from C (as does fmt re-use), and the rule of thumb is easy to
remember: count unnumbered as they appear, and numbered as indexed. There
is one last catch for fmt re-use: where to start the next iteration. I
decided that the next arg in line for fmt re-use comes right after the max of
the unnumbered count and the highest numbered index, i.e. if there are 4
unnumbered and the max index is 3 then the next fmt arg start is 5, and the
other way around, if only 2 unnumbered were used and the max index is 4 then
again the next fmt start is at 5.


Yes, this is what I was trying to say the other day. You use some base for
the numbered arguments for the format reuse case, and it's the highest-
numbered argument one of the conversions consumes. That's basically what
POSIX says in point 10. That base starts at 0 so there's no offset the
first time through the format string.




This is one approach; other approaches are possible, like refusing
mix'n'match and accepting only pure access (numbered or unnumbered exclusively).


POSIX does say it's unspecified, and puts an application requirement in
place not to mix them.

Note that I used the same rule for numbered/unnumbered mix'n'match in the
implementation of %*.*s and %n$*w$.*p$s, that is, the use of num.prec:

    %n$*w$.*p$s
     |  |   |
     |  |   prec
     |  width
     pos

Note the pos is not too intuitive, but that's the libc implementation: it comes
before num.prec.


It looks like coreutils printf does the same thing.



 >
 >     It decreases clarity. If you have an argument '1+1', your proposal
 >     makes it depend on the conversion specifier. Right now, there's no
 >     ambiguity: 1+1 is a string, and $(( 1+1 )) is 2, regardless of whether
 >     or not the conversion specifier accepts an integer argument.


But bash already resolves a conversion specifier ambiguity:

$ printf '%d\n' yo
bash: printf: yo: invalid number
0
$ printf '%x\n' yo
bash: printf: yo: invalid number
0


I don't get the argument. There's no ambiguity: yo is always a string.
If you're saying that you should treat it as an expression, then you get
the value of $yo. I'm pretty sure that doesn't increase clarity.
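
The difference between the two readings of yo can be shown in a few lines. This is a minimal sketch; the value assigned to yo is an arbitrary assumption for illustration.

```shell
#!/usr/bin/env bash
yo=5

# As a printf argument, yo is just a string: %d rejects it, prints 0,
# and printf returns a non-zero status.
as_string=$(printf '%d' yo 2>/dev/null) && status=0 || status=$?

# As an arithmetic expression, yo names a variable and yields its value.
as_expr=$(printf '%d' "$((yo))")

echo "string: '$as_string' (status $status), expression: '$as_expr'"
```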



The heavy lifting is already done in bash: it scans the fmt string,
recognises an integer conversion specifier, fetches the arg 'string', and
applies an integer validity test to it. It is at this last step that the
validity check could be replaced by an arith_eval(string).


Sure, I get how it could be done. I just don't buy the argument for doing
it.



It is not an expression context at the moment but could be one day; for now
it is an 'integer' context, hence the error when the string is not an integer
(number). And nothing really defines the context of an arg in a
command's arg list, besides it being a string.


When I do
$ function f { echo $(($1)) ; }
$ f 1+1
2

This is perfectly valid: f() really accepts what you call an expression
context, while it is just a string that is internally viewed and used as an
arith expr.


That expansion inside $((...)) is standardized. POSIX says what happens
to the expression.

If f() is capable of deciding which arg can be an arith expression, why
couldn't printf?


Oh, it certainly could. I just don't think it adds anything. I know ksh93
does it, but ksh93 threw arithmetic evaluation in a bunch of different
places before $((...)) came along, and I think arithmetic expansion renders
them superfluous.
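
For anyone who wants the shortcut without any change to printf itself, a wrapper function is one option. The sketch below assumes bash; printd is a hypothetical name, and it only makes sense when every conversion in the format is numeric. Note that arithmetic evaluation of arbitrary strings is not safe for untrusted input.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: printd FORMAT EXPR...
# Evaluates every argument as an arithmetic expression, then hands the
# results to the builtin printf.
printd() {
    local fmt=$1 arg
    local -a vals=()
    shift
    for arg in "$@"; do
        # A variable's value is itself evaluated as an expression here.
        vals+=("$(( arg ))")
    done
    printf "$fmt" "${vals[@]}"
}

printd '%d %x\n' 1+1 '5*51'
```

With this definition the call above prints "2 ff", the same result as writing $((1+1)) and $((5*51)) by hand.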

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




Re: [sr #111166] ngettext syntax

2025-02-24 Thread Chet Ramey

On 2/22/25 5:05 AM, Phi Debian wrote:



On Sat, Feb 22, 2025 at 9:49 AM Andreas Schwab wrote:


On Feb 22 2025, Phi Debian wrote:

 > I forgot to mention your trick to nuke the fmt reuse still works
 >
 > $ printf '%s %s %s %999$s' A B C D E F G
 > A B C

As long as NL_ARGMAX >= 999.


Sounds like I was overoptimistic:

$ echo $KSH_VERSION
Version AJM 93u+m/1.1.0-alpha+b5087279 2025-01-15
$ printf '%s %s %999$s\n' 1 2 3 4 5 6 7 8
1 2
$ unset A B C D E F G H
$ printf '%s %s %999$s\n'  A B C D E F G H
./ksh: printf: C: parameter not set
./ksh: printf: D: parameter not set
./ksh: printf: E: parameter not set
./ksh: printf: F: parameter not set
./ksh: printf: G: parameter not set
./ksh: printf: H: parameter not set
A B
$


"11. If an argument operand to be consumed by a conversion specification
 does not exist:

o If it is a numbered argument conversion specification, printf should
  write a diagnostic message to standard error and exit with non-zero
  status, but may behave as for an unnumbered argument conversion
  specification."

I know "should" is less strict than "shall," but someone might still do it.
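
For comparison, this is what the "behave as for an unnumbered argument conversion specification" fallback looks like: a missing operand is treated as a null string for %s and as zero for a numeric conversion. A minimal sketch with unnumbered conversions only:

```shell
#!/usr/bin/env bash
# Only one operand is supplied for two conversions; the missing %d
# operand is treated as zero and no diagnostic is written.
missing=$(printf '[%s] [%d]' only)
echo "$missing"
```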



Re: [sr #111166] ngettext syntax

2025-02-24 Thread Chet Ramey

On 2/22/25 6:33 PM, Robert Elz wrote:


Further, this spec also violates Chet's rule:

chet.ra...@case.edu said:
   | There is no user who would think that using a numbered conversion specifier
   | is not an absolute position in the original argument list.

which does not apply when the format string is reused -- format string
reuse in the presence of numbered conversions would make a lot more
sense if it did, and %2$ *always* (on every iteration of the format
string) meant the 2nd arg.   Then mixing numbered and unnumbered
conversion specs would make sense, with only the unnumbered ones
consuming the excess args (and would make them make absolutely no
sense with only numbered conversions of course).


Yes, that rule is hard and fast only in the case of the first time through
the format string. User expectations differ if you reuse it.



And finally, to Andreas' comment which was what inspired me to continue
this relatively meaningless thread:

   | As long as NL_ARGMAX >= 999.

No, there is no such restriction. 


For printf(1), there is.

"In this case, the conversion specifier character '%' is replaced by the
sequence "%n$", where n is a decimal integer in the range [1,{NL_ARGMAX}],
giving the argument operand number."

An implementation is free to reject the numbered conversion if it exceeds
NL_ARGMAX, even if you say this is an application requirement, because of
rule 11 about out-of-range numbered conversion specifiers.
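
The limit on a given system can be queried with getconf. This is a sketch that assumes getconf is installed and knows NL_ARGMAX; the value varies by platform, and POSIX only guarantees a minimum of 9.

```shell
#!/usr/bin/env bash
# Fall back to a placeholder if getconf is missing or does not know the name.
limit=$(getconf NL_ARGMAX 2>/dev/null) || limit=unknown
echo "NL_ARGMAX: $limit"
```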



Re: [sr #111166] ngettext syntax

2025-02-24 Thread Chet Ramey

On 2/22/25 11:17 PM, Phi Debian wrote:
- if allowing mix and match of numbered/unnumbered, then stay away from the
rule 'an unnumbered following a numbered means the unnumbered is the last
numbered+1'. This is tempting but wrong, e.g. for '%3$s %s' one would be
tempted to say %s is equivalent to %4$s, but width.prec predates that, and one
must then define what happens when the index goes back, e.g. in
'%4$s %s %2$s %s' what is the last %s? Some will say 3, others will say 6; in
any case it takes verbose documentation to say something complicated. So my
recommendation is to say that unnumbered start at 1 and increment at each
unnumbered: easy to remember, matching the fully unnumbered case, and gaps are
skipped over and ignored.


FreeBSD chose this interpretation, and macOS picked it up. They resolve
the ambiguity by simply rejecting unnumbered field width and precision
arguments with a numbered conversion specifier.

$ /usr/bin/printf '%3$*.*d\n' 4 2 3
printf: incomplete use of n$
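
For reference, the fully unnumbered form of the same request is unambiguous: width, precision, and value are taken from the arguments in order. A minimal sketch:

```shell
#!/usr/bin/env bash
# Width 4, precision 2, value 3: zero-padded to two digits, then
# right-aligned in a field of four characters.
padded=$(printf '%*.*d' 4 2 3)
echo "[$padded]"
```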



Re: [sr #111166] ngettext syntax

2025-02-24 Thread Chet Ramey

On 2/23/25 10:48 PM, Phi Debian wrote:


 > Again the rule I used for ksh93 is that the next arg for fmt reuse comes right after
 > max( number_of_unnumbered_occurrences, highest_numbered_index )

So you make numbered conversion specifiers relative to some max base from
the previous scan through the format string, initializing that base to 0.
I had the same idea.


Yes, numbered are 1-based (0 is the fmt string :-) and doesn't move) and the
next fmt roll starts past the highest numbered index, so for '%2*4$.*1$s'
the next roll's arg 1 is the 5th.

For unnumbered we count: '%s %s %s' next roll starts at 4.
And the ugly mix:
'%s %3$s %s' next roll starts at 4 (sigma(%s)=2, max(indexes)=3 ==> next is 4)
'%s %s %3$s %s %s' next roll starts at 5 (sigma(%s)=4, max(indexes)=3 ==> next
is 5)

And we have gaps in numbered.


I think we're saying the same thing, and POSIX says it, too, in its own way.
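
The rule can be written down as a small function. This is a sketch, assuming bash; next_start is a hypothetical helper that only recognises plain '%s' and '%N$s' style conversions plus a literal '%%', and it ignores '*' width and precision.

```shell
#!/usr/bin/env bash
# Hypothetical helper: next_start FORMAT
# Prints the 1-based index of the argument that starts the next pass over
# FORMAT: max(unnumbered count, highest numbered index) + 1.
next_start() {
    local fmt=$1 unnumbered=0 highest=0 n
    local re='^([0-9]+)[$]'
    while [[ $fmt == *%* ]]; do
        fmt=${fmt#*%}                 # skip to just after the next '%'
        if [[ $fmt == %* ]]; then     # literal '%%' consumes no argument
            fmt=${fmt#%}
            continue
        fi
        if [[ $fmt =~ $re ]]; then    # numbered conversion: N$
            n=$((10#${BASH_REMATCH[1]}))
            if (( n > highest )); then highest=$n; fi
        else                          # unnumbered conversion
            unnumbered=$((unnumbered + 1))
        fi
    done
    echo $(( (unnumbered > highest ? unnumbered : highest) + 1 ))
}

next_start '%s %3$s %s'
next_start '%s %s %3$s %s %s'
```

The two calls print 4 and 5, matching the mixed examples quoted above.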


I also forgot to mention that C99 introduced this

    There may be no gaps in the numbers of arguments specified using '$';
    for example, if arguments 1 and 3 are specified, argument 2 must also
    be specified somewhere in the format string.

Another rule that could be imported; for ksh93 I didn't implement it, so
I accept the so-called gaps.


POSIX requires that:

"Unlike the printf() function, when numbered conversion specifications are
used, specifying the Nth argument does not require that all the leading
arguments, from the first to the (N-1)th, are specified in the format
string. For example, "%3$s %1$d\n" is an acceptable format operand which
evaluates the first and third argument operands but not the second."
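
In a shell whose printf has no numbered conversions, the effect of "%3$s %1$d\n" can be approximated by picking positional parameters explicitly. A minimal sketch; pick_3_1 is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: uses the third and first arguments and skips the
# second, as "%3$s %1$d\n" would.
pick_3_1() {
    printf '%s %d\n' "$3" "$1"
}

line=$(pick_3_1 1 two three)
echo "$line"
```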




Re: [sr #111166] ngettext syntax

2025-02-24 Thread Phi Debian
On Mon, Feb 24, 2025 at 4:14 PM Chet Ramey  wrote:

>
> >
> > The heavy lifting is already done in bash: it scans the fmt string,
> > recognises an integer conversion specifier, fetches the arg 'string', and
> > applies an integer validity test to it. It is at this last step that the
> > validity check could be replaced by an arith_eval(string).
>
> Sure, I get how it could be done. I just don't buy the argument for doing
> it.
>
> 
>
> Oh, it certainly could. I just don't think it adds anything. I know ksh93
> does it, but ksh93 threw arithmetic evaluation in a bunch of different
> places before $((...)) came along, and I think arithmetic expansion renders
> them superfluous.
>

Well, I can give the reason that pushed me toward arith expr for numeric
conversion chars in printf, but again I have no problem with bash not doing
it.
This is mainly to reduce my keyboard typing in interactive work, i.e. to
avoid constantly typing $((...)) on each printf, which I type hundreds of
times a day.

So bash can be used as a scripting language, and 'maybe' verbose keyboard
typing is acceptable with a lot of $((...)) everywhere in a script, but when
doing interactive work, saving the typing of $((...)) is a big deal,
especially since my typing skills are below average, and dyslexia adds to
the complexity: I keep typing $(...) in lieu of $((...)), with heavy use of
the history to fix the typo, thanx for that :-)

In some respects, fmt reuse makes a kind of iterator to produce things,
while printf was not an iterator in spirit. We have other iterators like
{x..y} etc.; fmt reuse allows printf to become an iterator to
prepend/append things to each word in a wordlist, and I abuse it a lot. In
the same vein, I think the eval of num conversions is a kind of shortcut, and
I love shortcuts when I am typing :-)
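
As an illustration of fmt reuse as an iterator, the format below is applied once per word, adding a prefix and a suffix to each. A minimal sketch; the words and file names are arbitrary.

```shell
#!/usr/bin/env bash
# The format has one conversion and three operands, so it is reused
# three times, once per word.
names=$(printf 'backup-%s.tar\n' etc home var)
printf '%s\n' "$names"
```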

And indeed, printf could have a flag allowing num expressions for num
conversions, either a printf option or a shopt option; after all, we do have
an option for extglob, meaning we already deal with various forms of patterns.

Needless to say, my .rc file has extglob on all the time and would have
printf_numexpr on all the time too :-)