exp <- digit '*' digit EOF -> {(lambda (x y) (* x y))}
This looks as step backwards
How it is different from old yacc
exp <- digit '*' digit EOF -> {$$ = (* $1 $3)}
My approach is different. I made tree structure explicit in rule by binding as
in example
tree = '(' number:value tree?:left tree?:rigth ')' result(Tree)
result(Tree) creates Tree object and sets value,left,rigth fields to
corresponding values
On Fri, Dec 10, 2010 at 01:47:59PM +1300, Peter Cashin wrote:
> Mathias:
> A good plan to separate out the:
> - grammar operators
> - parse tree building
> - parser action code
> I agree with you, but in my case I want to have the grammar language
> totally independent of the implementation programming language.
> So the grammar has no parser action code, although this can be added later
> in a particular programming language implementation. Parboiled has much
> closer integration with the programming language, which has advantages,
> but I really want grammar specifications that are totally independent.
> So now for the grammar operators, and parse tree building annotations. For
> many years I kept these separate: tree building annotations in the rule
> head (ie they annotate the rule name) or definition syntax (= for interior
> nodes, : for leaf nodes or terminals). The right hand side of the rule had
> the grammar expression body withs grammar operations only. One more head
> notation allows pruning the tree.
> This works fine, and maybe I should have stuck to that separation, but I
> introduced the `x prefix notation because I found that it was an advantage
> to be able to see the children nodes that would be generated by looking at
> the parent rule alone, and not having to refer to the children rule
> definitions to see how they are annotated.
> I have found that about half of many grammars turn out to be leaf nodes,
> so the ":" rule definition notation has a huge payoff in tree size and
> performance.
> Because of this the `x rule is not vital, but over time I found it was
> convenient, and a good trade-off for me.
> Cheers,
> Peter.
>
> On Fri, Dec 10, 2010 at 11:44 AM, Mathias <[1][email protected]>
> wrote:
>
> Over the course of developing parboiled (PEG parsing engine for Java and
> Scala using internals DSLs for grammar specification) I found that it's
> best to clearly separate the following three things:
> - grammar operators
> - parse tree building
> - parser action code
>
> The operators, together with the rules they are applied to, define the
> language your grammar recognizes. Nothing more, nothing less.
> Whether your engine builds a parse tree for input of that language or
> whether it doesn't should likely not be specified at the operator level,
> but it some other way.
> In "parboiled for Java" operators and rules are modeled as method calls
> whereas parse tree building is controlled via annotations on those
> methods. You can selectively enable or disable the creation of parse
> tree nodes per grammar rule, which allows you to tweak parse tree
> building exactly to your needs.
>
> The interface between the parsing engine and custom parser action code
> is the thing that has changed the most over the course of parboileds
> life cycle so far.
> Initially parser action code had to access the parse tree nodes in order
> to get to matched input. Creating a custom object structure during the
> parsing run (e.g. an AST) was done by decorating the parse tree. So the
> parse tree was the "work bench" of the parsing process.
> Over time this heavy centering around the parse tree turned out to have
> two problems:
>
> 1. Low Performance.
> Always having to create a parse tree even when all you care about is the
> AST is just wasteful. Especially since the parse tree can contain a huge
> number of nodes for larger inputs (sometimes more nodes than input
> characters).
>
> 2. Less room for automatic optimizations.
> The structure of the parse tree is dictated by the structure of the
> grammar rules. If your parser action code is built under the assumption
> that the parse tree has a given structure there is little leeway for the
> parsing engine to apply automatic rule optimizations. Decoupling the
> parser action code from the parse tree opens up the possibility to apply
> all kinds of automatic grammar tweaking before running the parser the
> first time. The engine might decide to completely change the rule
> structure, as long as all changes do not change the recognized language
> and are transparent to the parser action code.
>
> Currently parboiled implements the interface between the parsing engine
> and the action code in the following way:
> 1. Actions can appear anywhere in a rule.
> 2. Actions can access the matched input text of the sub rule immediately
> preceding the action but not of any other rule (so there is no sub rule
> labeling required).
> 3. For working with custom objects (e.g. AST nodes) the engine provides
> a "Value Stack", which is a simple stack structure that serves as a fast
> work bench for a parsing run. Actions can push objects onto this stack,
> pop them off, swap them around, and so on.
>
> This solution completely decouples the parse tree from everything else.
> You can enable or disable parse tree building without any effect on the
> rest of the parser. There is no need for addressing sub rules in action
> expressions and given a somewhat efficient value stack implementation
> the whole thing is quite fast. Additionally, in "parboiled for Scala",
> the action code with its manipulations of the value stack can be
> statically type-checked at compile time, which is a huge plus.
>
> In case you are interested in more details or broader explanations, the
> parboiled documentation is quite complete.
>
> Cheers,
> Mathias
>
> ---
> [2][email protected]
> [3]http://www.parboiled.org
> On 09.12.2010, at 21:01, Alan Post wrote:
>
> > I'm working on my PEG parser, in particular the interface between
> > the parse tree and the code one can attach to productions that
> > are executed on a successful parse.
> >
> > I've arranged for the two predicate operations, & and !, to not add
> > any output to the parse tree. �That means that the following
> > production:
> >
> > �rule <- &a !b "c"
> >
> > Produces the same parse tree as:
> >
> > �rule <- "c"
> >
> > Internally, this means that I recognize that the sequence operator
> > (which contains the productions '&a', '!b', and '"c"' in this
> > example) is being called with predicates in every position but one,
> > and rather than returning a list containing that single element,
> > I return just the single element.
> >
> > As I've been doing this, I've found that I want a new operator similar
> > to '&'. �'&' matches the production it is attached to, but it does not
> > advance the position of the input buffer.
> >
> > I'd like an operator that matches the production it is attached to,
> > advances the input buffer, but doesn't add anything to the parse
> > tree.
> >
> > Here's an example:
> >
> > �mulexp <- digit '*' digit EOF -> {(lambda (x y) (* x y))}
> >
> > the mulexp production is a sequence of four other rules, but only
> > two of them are needed by the associated code. �It would be nice
> > if I could write the code rule like it is above, rather than say
> > this:
> >
> > �(lambda (x op y EOF) (* x y))
> >
> > Having to account for all the rules in the sequence, but really
> > only caring about two of them. �Here is the example rewritten
> > with '^' expressing "match the rule, advance the input, but don't
> > modify the parse tree":
> >
> > �mulexp <- digit ^'*' digit ^EOF -> {(lambda (x y) (* x y))}
> >
> > Before I go inventing syntax for this use case, will you tell me if
> > this is already being done with other parsers? �Have any of you had
> > this problem and already solved it, and if so, what approach did you
> > take?
> >
> > -Alan
> > --
> > .i ko djuno fi le do sevzi
> >
> > _______________________________________________
> > PEG mailing list
> > [4][email protected]
> > [5]https://lists.csail.mit.edu/mailman/listinfo/peg
>
> _______________________________________________
> PEG mailing list
> [6][email protected]
> [7]https://lists.csail.mit.edu/mailman/listinfo/peg
>
> References
>
> Visible links
> 1. mailto:[email protected]
> 2. mailto:[email protected]
> 3. http://www.parboiled.org/
> 4. mailto:[email protected]
> 5. https://lists.csail.mit.edu/mailman/listinfo/peg
> 6. mailto:[email protected]
> 7. https://lists.csail.mit.edu/mailman/listinfo/peg
> _______________________________________________
> PEG mailing list
> [email protected]
> https://lists.csail.mit.edu/mailman/listinfo/peg
--
Plasma conduit breach
_______________________________________________
PEG mailing list
[email protected]
https://lists.csail.mit.edu/mailman/listinfo/peg