[antlr-dev] removing offensive fragments

Terence Parr Mon, 01 Feb 2010 12:46:51 -0800

Hi, as you know, I am trying to pipeline the new version of ANTLR as much as 
possible. The parser does little or nothing except create an AST. Then, I'm 
going to have a number of semantic checking passes over that tree. Only if it 
passes all those checks will it move on to grammar analysis and finally code 
generation.


The question I have relates to how to deal with an AST that has errors in it. 
I'm entertaining the idea of removing the snippets of the AST that have 
erroneous constructs, at least where possible. For example, if I see a lexer 
rule in a tree grammar, I should not only give an error. Shouldn't I also strip 
it from the AST so that further phases can assume valid input? Seems like that 
should be one of the goals of the semantic checking.
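
To make the stripping idea concrete, here is a minimal sketch in Python (not 
ANTLR's actual code). The Node class and the prune_lexer_rules helper are 
hypothetical names made up for illustration. The only ANTLR fact it relies on 
is the convention that lexer rule names start with an uppercase letter.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical AST node; not ANTLR's real tree class."""
    kind: str                      # e.g. "TREE_GRAMMAR", "RULE"
    text: str = ""                 # e.g. the rule name
    children: List["Node"] = field(default_factory=list)


def is_lexer_rule(node: Node) -> bool:
    # ANTLR convention: lexer rule names begin with an uppercase letter.
    return node.kind == "RULE" and node.text[:1].isupper()


def prune_lexer_rules(grammar: Node, errors: List[str]) -> Node:
    """Report each lexer rule in a tree grammar, then drop it from the AST."""
    kept = []
    for child in grammar.children:
        if is_lexer_rule(child):
            errors.append(f"lexer rule {child.text} not allowed in tree grammar")
        else:
            kept.append(child)
    grammar.children = kept        # later phases only see the valid rules
    return grammar
```

After the call, the error list holds one message per removed rule, and any 
later pass that walks the grammar's children sees only parser-style rules.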

OTOH, what I can do is simply avoid grammar analysis and even token type 
assignment if there are problems. For example, even a simple error like a token 
(instead of a rule) reference with an argument, like ID[34], would stop ANTLR 
before it tried to assign token types. I guess there is no point in proceeding 
very far down the pipeline because we won't be able to generate code in most 
cases. Let's get the errors and stop.

If you take a look at this page:

http://www.antlr.org/wiki/display/~admin/Enforcing+semantics

You'll see that I have listed all of the errors I think I can detect before 
getting into analysis and so on. These errors occur all the way up to and 
including token type assignment. As you can see, I have broken them into at 
least two phases. If anything bad happens in the first phase, I would stop 
ANTLR. If everything is okay in the first phase, but I fail in the symbol 
testing phase (like repeated rule definition), I would stop before getting to 
token type assignment. Etc...
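
The stop-after-a-failing-phase scheme described above can be sketched roughly 
as follows, in Python. This is only an illustration under stated assumptions: 
the grammar is stood in for by a plain dict, and the phase functions 
(basic_checks, symbol_checks, assign_token_types) and the dict keys are all 
hypothetical, not ANTLR's real API. Token type numbering here is arbitrary.

```python
from typing import Callable, Dict, List

# A phase inspects (and may annotate) the grammar and returns error messages.
Phase = Callable[[Dict], List[str]]


def basic_checks(grammar: Dict) -> List[str]:
    # Hypothetical first phase: e.g. a token reference with an argument, ID[34].
    return [f"token reference {t} cannot take arguments"
            for t in grammar.get("token_refs_with_args", [])]


def symbol_checks(grammar: Dict) -> List[str]:
    # Hypothetical symbol phase: detect repeated rule definitions.
    errors, seen = [], set()
    for name in grammar.get("rules", []):
        if name in seen:
            errors.append(f"rule {name} redefined")
        seen.add(name)
    return errors


def assign_token_types(grammar: Dict) -> List[str]:
    # Only reached when every earlier phase was error-free.
    grammar["token_types"] = {
        name: i for i, name in enumerate(grammar.get("tokens", []), start=1)
    }
    return []


def run_pipeline(grammar: Dict, phases: List[Phase]) -> List[str]:
    """Run phases in order; stop at the first phase that reports errors."""
    for phase in phases:
        errors = phase(grammar)
        if errors:
            return errors          # later phases never see invalid input
    return []
```

With a grammar that defines rule a twice, run_pipeline returns the errors from 
symbol_checks and never calls assign_token_types. The cost is the one 
discussed below: errors from later phases stay hidden until earlier ones are 
fixed.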

So, does anybody see a problem with stopping after the various phases? The only 
potential issue could be that people might want to see all errors at once. I 
remember it being quite a hassle in the old version to be fault-tolerant, with 
the various phases each trying to report as many errors as possible in one go. 
This proposal would be like a compiler that did not do any semantic analysis if 
it had a syntax error. So "int i:;" in Java would give you that one syntax 
error but would not tell you there was, perhaps, a duplicate method definition 
until you fixed that syntax error and moved on.

any thoughts appreciated... [note that I will not always be able to the tree 
maybe]

Ter
