On Tue, Jan 11, 2011 at 09:51:21AM -0500, Lowell Thomas wrote:
>
> Longest-match: In the original version of APG I used a longest-match rule.
> But I was dealing with applications that needed to parse thousands of text
> messages per second and for efficiency I quickly dropped it in favor of a
> prioritized-choice rule. I don't know that I did any timing tests, but
> having to test every alternative branch of every alternative to find the
> longest match just seemed unnecessary, even if most of them fail rather
> quickly.
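As a concrete illustration of that cost difference, here is a minimal sketch in Python (invented combinator names, not APG's actual implementation): a longest-match alternation has to evaluate every branch even after one succeeds, while prioritized choice stops at the first success.

```python
# Minimal sketch (hypothetical combinators): longest-match alternation
# must try every branch; prioritized choice commits to the first success.

def lit(c):
    return lambda s, i: i + 1 if s[i:i+1] == c else None

def seq(*ps):
    def p(s, i):
        for q in ps:
            i = q(s, i)
            if i is None:
                return None
        return i
    return p

def first(*ps):            # prioritized choice: stop at first success
    def p(s, i):
        for q in ps:
            r = q(s, i)
            if r is not None:
                return r
        return None
    return p

def longest(*ps):          # longest match: always evaluates every branch
    def p(s, i):
        best = None
        for q in ps:
            r = q(s, i)
            if r is not None and (best is None or r > best):
                best = r
        return best
    return p

a, b = lit('a'), lit('b')
alts = (a, seq(a, b))           # A / A B

print(first(*alts)("ab", 0))    # 1: commits to A
print(longest(*alts)("ab", 0))  # 2: A B matched more
```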
>
>
Longest match just sweeps problems under the rug.
Instead of (A | A B) you get the problem of (A B | A) B not matching AB
>
> Prioritized-choice: I have come across a few, very few actually, valid CFG
> phrases that I couldn't match with prioritized-choice. But I haven't found
> any that I couldn't match with prioritized-choice plus syntactic
> predicates.
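A small self-contained sketch of such a rescue (Python, invented combinator names): under prioritized choice, (A B / A) B rejects "ab", because A B succeeds and the choice commits; adding an and-predicate, as in (A B &B / A) B, lets the parser accept both "ab" and "abb".

```python
# Sketch: a syntactic predicate (&) rescuing a prioritized-choice ordering
# problem. Combinator names are invented for illustration.

def lit(c):
    return lambda s, i: i + 1 if s[i:i+1] == c else None

def seq(*ps):
    def p(s, i):
        for q in ps:
            i = q(s, i)
            if i is None:
                return None
        return i
    return p

def choice(*ps):           # prioritized choice: commit to first success
    def p(s, i):
        for q in ps:
            r = q(s, i)
            if r is not None:
                return r
        return None
    return p

def andp(q):               # and-predicate &q: succeed, consume nothing
    return lambda s, i: i if q(s, i) is not None else None

A, B = lit('a'), lit('b')

plain = seq(choice(seq(A, B), A), B)           # (A B / A) B
fixed = seq(choice(seq(A, B, andp(B)), A), B)  # (A B &B / A) B

print(plain("ab", 0))   # None: A B consumes "ab", outer B then fails
print(fixed("ab", 0))   # 2: &B rejects the first alternative in time
print(fixed("abb", 0))  # 3
```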
>
>
>
> Aside: However, if you do find a rule (phrase) you can't match, there is a
> simple way to handle it without throwing out the simplicity of
> recursive-descent parsing. If you allow semantic actions on the down side
> of the parse tree traversal and allow them to return a matched phrase
> without traversing the branch below, you can easily insert handwritten
> parsers for selected rules (phrases) to cover the offending case. Here, of
> course, you are getting into the murky waters of handwritten parsers and
> all the drawbacks that go with that. Not for everyone, probably, but I use
> that feature in a grammar-preserving way mainly for speed and efficiency.
> Sometimes simple phrases generate lengthy parse trees. I've found that I
> can sometimes speed up a parser by a factor of 7 or 8 by handwriting just
> a few of the simpler rules (alphanum, UTF-8, etc.) within a large, complex
> grammar.
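The down-side hook described here might look like the following sketch (all names are invented for illustration; this is not APG's API): before descending into a rule, the parser consults a registry of handwritten matchers, and if one returns a match length, the subtree below that rule is never traversed.

```python
# Sketch of a down-side semantic-action hook: handwritten matchers can
# return a matched phrase directly, skipping the branch below the rule.

handwritten = {}               # rule name -> handwritten matcher

def parse_rule(name, rules, s, i):
    hook = handwritten.get(name)
    if hook is not None:
        n = hook(s, i)         # chars consumed, or None on failure
        return i + n if n is not None else None
    return rules[name](s, i)   # otherwise, normal recursive descent

# Example: replace a per-character 'alphanum' rule with a direct scan.
def fast_alphanum(s, i):
    j = i
    while j < len(s) and s[j].isalnum():
        j += 1
    return (j - i) if j > i else None

handwritten['alphanum'] = fast_alphanum

print(parse_rule('alphanum', {}, "abc1 x", 0))  # 4: matched "abc1"
```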
>
>
>
> Maybe I've strayed a little from the title topic, but it somehow seems
> relevant to the comments I've read here.
>
>
>
> Lowell
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]]on Behalf Of Roman Redziejowski
> Sent: Monday, January 10, 2011 1:53 PM
> To: [email protected]
> Subject: Re: [PEG] Left recursion
>
> Hi Lowell, Ondrej, Peter & Peter!
>
> Many thanks for very enlightening comments to my mail, some of them
> coming straight from the other side of the globe! (Isn't NZ an almost
> exact antipode of Sweden?)
>
> I see I misunderstood your discussion of "parse trees": you clearly
> meant it abstractly as the way to describe how a parser actually parsed
> its input. So, forget my remarks about the parsers constructing them or
> not.
>
> My main point was that adding mechanisms for left recursion introduces
> extra complexity (and you all seem to agree). Even without it, PEG is
> not easy to understand as a language definition tool. This is my "what
> is this thing doing" problem. If I read a BNF specification, or a set of
> CFG productions, or a regular expression, I have a quite clear picture
> of the language it defines. Not so with PEG. Looking at a thing like A
> <- aAa / aa, "I need a cold towel around my head" (excuse me, Peter)
> to find out what and when it may consume. Bryan Ford made some things
> easy by defining the language of an expression to be the set of inputs
> on which the expression succeeds - without necessarily consuming all. I
> believe (correct me if I am wrong) that in this sense the language of A
> is just the set of all strings beginning with "aa". But to predict,
> in general, how a grammar containing such A will parse a given input
> requires a real effort.
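That claim can be checked mechanically. A direct transcription of A <- aAa / aa into a recursive function (a sketch; "success" here is in Ford's sense, returning a position without necessarily consuming everything) succeeds on exactly the inputs beginning with "aa", though how much it consumes varies in a non-obvious way.

```python
# A <- aAa / aa, transcribed directly. Returns the position after the
# consumed prefix, or None on failure; success need not consume all input.
from itertools import product

def A(s, i=0):
    if i < len(s) and s[i] == 'a':      # first alternative: a A a
        r = A(s, i + 1)
        if r is not None and r < len(s) and s[r] == 'a':
            return r + 1
    if s[i:i+2] == 'aa':                # second alternative: a a
        return i + 2
    return None

# Exhaustively check short strings over {a, b}: A succeeds exactly on
# inputs that start with "aa".
for n in range(8):
    for w in (''.join(t) for t in product('ab', repeat=n)):
        assert (A(w) is not None) == w.startswith('aa')

print(A('aa'), A('aaa'), A('aaaa'))   # 2 2 4: consumption is irregular
```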
>
> PEG is a bit of programmer's paradise. Easy to write something that
> works. Try it. Enjoy your tricks. Debug. Modify. Play. The Java grammar
> that I posted on my page succeeded without an error on thousands of Java
> programs. But only recently did I find that all along it had parsed
> hex-float literals as something entirely different. Shall we add left
> recursion to have more fun? Or instead do something to simplify? How
> about some discipline, like the discipline of "structured programming"
> that 40 years ago extracted us from the morass of "spaghetti
> programming"?
>
> Perhaps one day we shall learn to "think PEG" like we "think BNF" today
> (if we find it worthwhile).
>
> My second point was that you can do without left recursion. When you
> specify a language, you have to eventually say what it means. It is then
> useful to construct the grammar so that it facilitates specification of
> semantics. But, if you cannot specify left-to-right execution in the
> grammar -- tough luck, you have to do it in semantics. If you can
> specify priority of operators in the grammar -- do it. I fully agree
> with Peter G. that iteration and recursion have their places. (My
> remarks were a bit provocative.) No need to mention that specification
> has to be comprehensible for the user. (During my time as IBM's slave I
> had to describe syntax by "railroad diagrams" -- kind of horizontal
> flow-charts -- not to overstress the feeble minds of customers by
> "equations".)
>
> A question to Peter C.: how do you implement your longest match?
>
> And to Lowell: how do you produce your beautiful trees? Any ready-made
> tool, or hand-coded?
>
> Best regards from snowed-in Stockholm
>
> Roman
> _______________________________________________
> PEG mailing list
> [email protected]
> https://lists.csail.mit.edu/mailman/listinfo/peg
--
We didn't pay the Internet bill and it's been cut off.