Hello.

Chet Ramey wrote in
 <9b6dfdc2-ade9-16ad-8960-5b2887b35...@case.edu>:
 |On 7/7/22 12:11 PM, Steffen Nurpmeso wrote:
 |> Funnily my parser has only one (what i know) problem left, the
 |> same as bash.  On the other hand i found more.
 |
 |The thing about all of this is that these are operators, and so delimit
 |tokens. Whitespace is significant only when determining the length of the
 |token or operator. Operators have context-dependent meaning (e.g., `+' can
 |be a unary or binary operator depending on context, and `++10' does not
 |mean the same thing as `+ +10').

Yes for one; but on the other hand all parsers _do_ resolve series
of (overly) successive operators, including bash.
(I post my test -- as of now, i do not yet support ?: --, we
differ only in very few cases, which supports my thinking.)

  ...

You quoted the "wrong" things again here, which i only included in
my message for context (i wholeheartedly grant "my thoughts of
usable context", but then again you say it above, no?).
The problem of the bash parser is solely

  $ bash -c 'I=10;echo $((+10++I))'
  bash: line 1: +10++I: syntax error in expression (error token is "++I")

whereas it does it right for numerics

  $ bash -c 'I=10;echo $((+10++10))'
  20

I fixed that here today (if last "operator" is neither postfix nor
a number, and if the current operator is postfix, then check
whether a variable follows directly (after WS), if so, split
postfix into a binary to push, and decrement buffer by one; the
next is then parsed as UNARY and dropped, then we come to the
var).

 |> And one more thing.
 |>
 |>   -<802379605485813759>
 |>   +<9223372036854775807>
 |>
 |> This is from
 |>
 |>   $ bash -c 'echo $((999999999999999999999999999999999999999999999))'
 |>   802379605485813759
 |>   $ dash -c 'echo $((999999999999999999999999999999999999999999999))'
 |>   9223372036854775807
 |
 |dash is what happens when you clamp the value at INTMAX_MAX (LLONG_MAX)
 |because it overflows -- but don't say anything about it -- instead of just
 |doing the conversion without checking for overflow. Neither value is
 |`right', and even the predictability of INTMAX_MAX is useless.

Yes, what the standard "itoa" gives you.  (With POW2 bases parsed
as unsigned.  I do not really know what the standard gives you,
i "always used my own itoa and atoi things", ... or did not care
for corner cases aka just bailed for error.  I was baffled ...)
So you seem to use your own itoa, and here is (another) bash bug.
..I was baffled..
with having to change my own itoa after ~17-18 years (actually
a fully distinct 2nd implementation of "it"):

  [..automatic base detection code..]
      /* Char after prefix must be valid.  However, after some error
       * in the tor software all libraries (which had to) turned to
       * an interpretation of the C standard which says that the
       * prefix may optionally precede an otherwise valid sequence,
       * which means that "0x" is not a STATE_INVAL error but gives
       * a "0" result with a "STATE_BASE" error and a rest of "x" */
  [well i skip the code, better it is :)]

I think the POSIX people were talking about this; Eric Blake?
In this hindsight bash should bail with syntax error due to 0x:

  $ bash -c 'I=10;echo $((+10++0x))'
  10

bash, like other shells, but not like perl, for example, does not
support **= whereas it does support **.
(It does not understand 0b and 0B for binary.
So with limit-bound integers, **=, and +10++I, my MUA and bash
produce the very same results in a ~750 line test that includes
nonsense like "$((3+(3*(I=11,+++++++++++++++++++++++-+++++I))))".
Yay!)

Ciao from Germany,

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)
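
For illustration only, the two conversion corner cases discussed in this message (silent wraparound versus clamping at INTMAX_MAX, and a bare "0x" prefix) can be modelled with a small Python sketch. It is not the code of bash, dash, or the MUA mentioned above. It assumes a 64-bit intmax_t and strtol-like automatic base detection; the names `detect_base`, `parse_int`, and the `overflow` switch are hypothetical and exist only for this example.

```python
INTMAX_BITS = 64                              # assumption: 64-bit intmax_t
INTMAX_MAX = (1 << (INTMAX_BITS - 1)) - 1     # 9223372036854775807
DIGITS = "0123456789abcdef"


def detect_base(text):
    """Return (base, index of first digit), roughly like strtol(..., 0).

    A "0x"/"0X" prefix only counts if a hex digit follows it; otherwise
    the leading "0" is an ordinary (octal) digit and "x" is left over.
    """
    if (len(text) >= 3 and text[0] == "0" and text[1] in "xX"
            and text[2].lower() in DIGITS):
        return 16, 2
    if text.startswith("0"):
        return 8, 0
    return 10, 0


def parse_int(text, overflow="wrap"):
    """Parse an unsigned constant; return (value, unparsed rest).

    overflow="wrap":  no overflow check, result reduced modulo 2**64
                      and reinterpreted as a signed value.
    overflow="clamp": result silently pinned at INTMAX_MAX.
    """
    base, pos = detect_base(text)
    value = 0
    while pos < len(text):
        digit = DIGITS.find(text[pos].lower())
        if digit < 0 or digit >= base:
            break                             # first character that is no digit
        value = value * base + digit
        if overflow == "clamp":
            value = min(value, INTMAX_MAX)
        else:
            value &= (1 << INTMAX_BITS) - 1   # emulate 64-bit wraparound
        pos += 1
    if overflow != "clamp" and value > INTMAX_MAX:
        value -= 1 << INTMAX_BITS             # two's complement reinterpretation
    return value, text[pos:]


if __name__ == "__main__":
    nines = "9" * 45
    print(parse_int(nines, "wrap"))    # (802379605485813759, '')
    print(parse_int(nines, "clamp"))   # (9223372036854775807, '')
    print(parse_int("0x"))             # (0, 'x')
```

With the all-nines literal (45 digits, going by the outputs quoted above), the "wrap" mode yields the number bash printed and the "clamp" mode yields the number dash printed. A bare "0x" yields 0 with a rest of "x", which a caller can either reject as a syntax error or, as in the bash example above, apparently accept.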