On Sat, 18 Jun 2011 12:33:11 -0700, Terence Parr <[email protected]> wrote:
> hi. Sorry for the delay. I'm going to use voice recognition here to describe
> what's going on. You can now look in the depot
>
> read user mwright * //depot/code/antlr4/main/...
>
> for a look at what I'm doing. If you look at the org.antlr.v4.codegen dir,
> and the subdirectory model, you'll find the source related to code
> generation.
Hi Ter,
That's neat, thanks. I could not quite get it to compile, the error is:
Compiling 244 source files to /h/argus/2/j/antlr4/g/antlr4/build/classes
/h/argus/2/j/antlr4/g/antlr4/antlr4/tool/src/org/antlr/v4/semantics/BasicSemanticTriggers.java:928:
cannot find symbol
symbol : method inContext(java.lang.String)
location: class org.antlr.v4.semantics.BasicSemanticTriggers
if ( !((inContext("OPTIONS"))) ) {
So it seems it needs an antlr 3 more recent than 3.3. So I tried the latest
snapshot:
http://www.antlr.org/depot/antlr3/main/target/antlr-master-3.3.1-SNAPSHOT-completejar.jar
it fails with the same error. So I tried checking out what I thought might
be antlr 3:
//depot/code/antlr/main/...
But I'm not sure that is right, as the output includes the strings "4.0" and
"Antlr 3 Runtime". It fails because it can't find stringtemplate:
[INFO] Installing /h/argus/2/j/antlr3/g/antlr3/antlr3/pom.xml to
/h/argus/2/home/mwright/.m2/repository/org/antlr/antlr-master/4.0-SNAPSHOT/antlr-master-4.0-SNAPSHOT.pom
[INFO] ------------------------------------------------------------------------
[INFO] Building Antlr 3 Runtime
[INFO] task-segment: [install]
[INFO] ------------------------------------------------------------------------
[INFO] [buildnumber:create {execution: default}]
[INFO] Storing buildNumber: Jun 20, 2011 24:44:12 at timestamp: 1308494652922
[INFO] [resources:resources {execution: default-resources}]
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory
/h/argus/2/j/antlr3/g/antlr3/antlr3/runtime/Java/src/main/resources
[INFO] [compiler:compile {execution: default-compile}]
[INFO] Compiling 86 source files to
/h/argus/2/j/antlr3/g/antlr3/antlr3/runtime/Java/target/classes
[INFO] ------------------------------------------------------------------------
[ERROR] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Compilation failure
/h/argus/2/j/antlr3/g/antlr3/antlr3/runtime/Java/src/main/java/org/antlr/runtime/tree/DOTTreeGenerator.java:[30,28]
package org.stringtemplate.v4 does not exist
> Basic idea is that I parse grammar into an AST and then create a graph
> (augmented transition network) representation that is a lot like a syntax
> diagram version of the grammar. This ATN gets serialized and generated in
> parsers and lexers, just FYI. I do the necessary semantic analysis and so
> on to figure out what grammar means. Then, to generate code, I create a
> model of the output using the objects in the model package. Then, an
> automatic walker traverses this model and instantiates a template with the
> same name as the model object. I have some cleanup work to do and will add
> an annotation that says which of the object fields should be traversed by
> the object model walker.
>
> The model is not necessarily inherently imperative, but there is probably a
> lot of subtle imperative stuff in there. At the highest level, I create a
> ParserFile which contains a Parser model object and a set of named actions
> consisting of Action objects. A parser has lots of things including a set of
> RuleFunction objects, which in turn, have a series of actions, including
> InvokeRule and MatchToken.
>
> The templates are much simpler because my code generator is creating a very
> explicit output model. Templates should only say how to spit out that object
> in text. The previous v3 code generator required a huge amount of thinking
> inside the template (that's what I get for allowing nested IFs in
> StringTemplate! ;)).
>
> Anyway, this is a start and you could look at the templates in Java.stg to
> see how much simpler they are. We should be able to generate Haskell no
> problem.
>
> You can also look at the source code at:
>
> http://antlr.org/depot/antlr4/main
> http://antlr.org/depot/antlr4/main/tool/src/org/antlr/v4/codegen
>
> Ter
Thanks, that's really neat. The Java.stg is a lot smaller and simpler looking
than before.
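To check my understanding of the walker, here is a minimal Java sketch of
the idea as I read it. The class names echo the ones mentioned above, but the
fields, the @Nested annotation and the walk method are my own assumptions,
not what is in the depot. The real walker instantiates StringTemplate
templates; this one only records the template names it would look up.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Sketch only. The fields, @Nested and walk() are hypothetical, not v4 source.
class ModelWalkerSketch {
    // Hypothetical marker for the fields the walker should descend into.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Nested {}

    static class MatchToken { String name; MatchToken(String n) { name = n; } }
    static class InvokeRule { String name; InvokeRule(String n) { name = n; } }
    static class RuleFunction {
        String name;
        @Nested List<Object> code = new ArrayList<>();
        RuleFunction(String n) { name = n; }
    }
    static class Parser {
        String name;
        @Nested List<RuleFunction> funcs = new ArrayList<>();
        Parser(String n) { name = n; }
    }

    // Returns the template names in the order the walker would look them up.
    static List<String> walk(Object model) {
        List<String> templates = new ArrayList<>();
        visit(model, templates);
        return templates;
    }

    private static void visit(Object o, List<String> templates) {
        // The template name is simply the model object's class name.
        templates.add(o.getClass().getSimpleName());
        for (Field f : o.getClass().getDeclaredFields()) {
            if (!f.isAnnotationPresent(Nested.class)) continue;
            try {
                f.setAccessible(true);
                Object value = f.get(o);
                if (value instanceof Iterable) {
                    for (Object child : (Iterable<?>) value) visit(child, templates);
                } else if (value != null) {
                    visit(value, templates);
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    // Builds a tiny model: one parser, one rule, two operations in the rule.
    static List<String> demo() {
        Parser parser = new Parser("T");
        RuleFunction rule = new RuleFunction("decl");
        rule.code.add(new MatchToken("ID"));
        rule.code.add(new InvokeRule("type"));
        parser.funcs.add(rule);
        return walk(parser);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [Parser, RuleFunction, MatchToken, InvokeRule]
    }
}
```

If that reading is right, a new target only has to supply one template per
model class, and the walker itself stays target independent.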
I need to generate a trivial Java lexer, then try to hand code that
into Haskell and/or Scheme, then try hacking the string templates some more.
Scheme seems similar to Haskell for this, they are both weird and both
use continuations :-).
I'm trying to learn Scheme for this purpose, with the idea of trying to
write both Haskell and Scheme targets for v4.
Below is the same DFA sketch I sent earlier in Haskell, rewritten in Scheme.
I'm not sure whether v4 generates DFAs yet though, as the template in
Java.stg only emits a comment:
DFADecl(dfa) ::= <<
// define <dfa.name>
>>
Thanks, Mark
#!r6rs
(import (rnrs lists (6))
(rnrs base (6))
(rnrs io simple (6)))
; Copyright (c) 2011, Mark Wright. All rights reserved.
; Given:
;   is  the vector of tokens from the input file
;   p   the current parsing position
;   o   the ANTLR offset from the current parsing position, which differs from
;       a normal offset: 0 is undefined and returns tid/nothing, 1 is the
;       current token, 2 is the next token, -1 is the previous token.
; Returns the token at the ANTLR offset, or tid/nothing if the offset is 0 or
; falls before the beginning or beyond the end of the input.
; vector -> int -> int -> token
(define lt
  (lambda (is p o)
    (cond
      ((or (= o 0) (>= (- (+ p o) 1) (vector-length is)) (< (+ p o) 0))
       tid/nothing)
      ((> o 0) (vector-ref is (- (+ p o) 1)))
      (else (vector-ref is (+ p o))))))
; The token IDs
(define tid/nothing 1000000)
(define tid/void 4)
(define tid/int 5)
(define tid/left-parenthesis 6)
(define tid/right-parenthesis 7)
(define tid/left-curly-brace 8)
(define tid/right-curly-brace 9)
(define tid/comma 10)
(define tid/semicolon 11)
(define tid/id 12)
(define tid/ws 13)
; The DFA states as labelled in the DFA diagram on p. 261 of the ANTLR book.
(define ds1/s0 0)
(define ds1/s1 1)
(define ds1/s2 2)
(define ds1/s3 3)
(define ds1/s4 4)
(define ds1/s5 5)
(define ds1/s6 6)
(define ds1/s7 7)
(define ds1/s8 8)
(define ds1/s9 9)
(define ds1/s10 10)
(define ds1/s11 11)
; Scanning indicates the DFA is still running. NoMatch means the DFA does
; not match this input. Alt1 predicts a method forward declaration signature.
; Alt2 predicts a concrete method definition.
(define da1/scanning 0)
(define da1/no-match 1)
(define da1/alt-1 2)
(define da1/alt-2 3)
; The DFA state transition function.
; First parameter is the current state.
; Second parameter is the token ID.
; Result is the (DfaAlt1, DfaState1) pair, where the
; DfaAlt1 is Scanning while the DFA is still scanning ahead,
; in which case DfaState1 is the next state. Or if DfaAlt1 is
; NoMatch, then DfaState1 is the last state where the no match
; was detected. Or DfaAlt1 is the predicted alternative, and
; DfaState1 is the final state.
; sigmaDfa1 :: DfaState1 -> MethodTokenId -> (DfaAlt1, DfaState1)
(define sigma-dfa-1
  (lambda (ds tid)
    (cond
      ((and (= ds ds1/s0) (or (= tid tid/void) (= tid tid/int)))
       (values da1/scanning ds1/s1))
      ((and (= ds ds1/s1) (= tid tid/id))
       (values da1/scanning ds1/s2))
      ((and (= ds ds1/s2) (= tid tid/left-parenthesis))
       (values da1/scanning ds1/s3))
      ((and (= ds ds1/s3) (= tid tid/int))
       (values da1/scanning ds1/s4))
      ((and (= ds ds1/s4) (= tid tid/id))
       (values da1/scanning ds1/s5))
      ((and (= ds ds1/s5) (= tid tid/comma))
       (values da1/scanning ds1/s6))
      ((and (= ds ds1/s5) (= tid tid/right-parenthesis))
       (values da1/scanning ds1/s9))
      ((and (= ds ds1/s6) (= tid tid/int))
       (values da1/scanning ds1/s7))
      ((and (= ds ds1/s7) (= tid tid/id))
       (values da1/scanning ds1/s8))
      ((and (= ds ds1/s8) (= tid tid/comma))
       (values da1/scanning ds1/s6))
      ((and (= ds ds1/s8) (= tid tid/right-parenthesis))
       (values da1/scanning ds1/s9))
      ((and (= ds ds1/s9) (= tid tid/left-curly-brace))
       (values da1/alt-2 ds1/s11))
      ((and (= ds ds1/s9) (= tid tid/semicolon))
       (values da1/alt-1 ds1/s10))
      (else (values da1/no-match ds)))))
; Loop to run the DFA.
; alt indicates if we are still Scanning, or finished.
; s is the current state.
; is is the input stream of tokens.
; p is the current zero based offset from the start of the token stream.
; o is the lookahead one based token offset, as described in the lt function.
; DfaAlt1 is the predicted alternative, or NoMatch if no alternative is matched.
; scanDfa1 :: DfaAlt1 -> DfaState1 -> Array Int MethodTokenId -> Int -> Int
;          -> (DfaAlt1, DfaState1)
(define scan-dfa-1
  (lambda (alt s is p o)
    (if (= alt da1/scanning)
        (let ([t (lt is p o)])
          (if (= t tid/nothing)
              (cons da1/no-match s)
              (let-values ([(alt-2 s-2) (sigma-dfa-1 s t)])
                (scan-dfa-1 alt-2 s-2 is p (+ o 1)))))
        (cons alt s))))
; Run the DFA to find the predicted alternative if the rule matches.
; is is the input stream of tokens.
; p is the current zero based offset from the start of the token stream.
; o is the lookahead one based token offset, as described in the lt function.
; DfaAlt1 is the predicted alternative, or NoMatch if no alternative is matched.
; predictDfa1 :: Array Int MethodTokenId -> Int -> Int -> DfaAlt1
(define predict-dfa-1
  (lambda (is p o)
    (car (scan-dfa-1 da1/scanning ds1/s0 is p o))))
; Example token sequences for the DFA on page 261 of The Definitive ANTLR
; Reference: la1 should predict alt1 and la2 should predict alt2.
(define la1 (vector tid/int tid/id tid/left-parenthesis tid/int tid/id
tid/comma tid/int tid/id
tid/comma tid/int tid/id tid/right-parenthesis
tid/semicolon))
(define la2 (vector tid/int tid/id tid/left-parenthesis tid/int tid/id
tid/comma tid/int tid/id
tid/comma tid/int tid/id tid/right-parenthesis tid/left-curly-brace))
(display (predict-dfa-1 la1 0 1))
(display "\n")
(display (predict-dfa-1 la2 0 1))
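
As a cross-check of the Scheme above, a hand-written Java version of the same
DFA might look like the sketch below. It is not ANTLR output: the class name,
the transition map and its key encoding are assumptions made for this
example, and the token IDs and states are copied from the Scheme definitions.
The lt method keeps the same offset convention, but checks the computed index
directly against the array bounds.

```java
import java.util.HashMap;
import java.util.Map;

// Hand-written sketch, not generated code. Names mirror the Scheme program.
class Dfa1Sketch {
    static final int NOTHING = 1000000;
    static final int VOID = 4, INT = 5, LPAREN = 6, RPAREN = 7,
                     LCURLY = 8, RCURLY = 9, COMMA = 10, SEMI = 11, ID = 12;
    static final int NO_MATCH = 1, ALT_1 = 2, ALT_2 = 3;

    // ANTLR offset convention: 0 is undefined, 1 is the current token,
    // 2 is the next token, -1 is the previous token.
    static int lt(int[] is, int p, int o) {
        if (o == 0) return NOTHING;
        int i = (o > 0) ? p + o - 1 : p + o;
        return (i < 0 || i >= is.length) ? NOTHING : is[i];
    }

    // Transition table keyed by state * 100 + token ID (all IDs are < 100).
    private static final Map<Integer, Integer> NEXT = new HashMap<>();

    private static void edge(int from, int token, int to) {
        NEXT.put(from * 100 + token, to);
    }

    static {
        edge(0, VOID, 1);  edge(0, INT, 1);
        edge(1, ID, 2);
        edge(2, LPAREN, 3);
        edge(3, INT, 4);
        edge(4, ID, 5);
        edge(5, COMMA, 6); edge(5, RPAREN, 9);
        edge(6, INT, 7);
        edge(7, ID, 8);
        edge(8, COMMA, 6); edge(8, RPAREN, 9);
        edge(9, SEMI, 10); edge(9, LCURLY, 11);
    }

    // Runs the DFA from position p and returns ALT_1, ALT_2 or NO_MATCH.
    static int predict(int[] is, int p) {
        int state = 0;
        for (int o = 1; ; o++) {
            int t = lt(is, p, o);
            Integer next = (t == NOTHING) ? null : NEXT.get(state * 100 + t);
            if (next == null) return NO_MATCH;   // ran out of input or no edge
            state = next;
            if (state == 10) return ALT_1;       // saw ';' after the signature
            if (state == 11) return ALT_2;       // saw '{' after the signature
        }
    }

    public static void main(String[] args) {
        int[] la1 = {INT, ID, LPAREN, INT, ID, COMMA, INT, ID,
                     COMMA, INT, ID, RPAREN, SEMI};
        int[] la2 = {INT, ID, LPAREN, INT, ID, COMMA, INT, ID,
                     COMMA, INT, ID, RPAREN, LCURLY};
        System.out.println(predict(la1, 0)); // 2, i.e. ALT_1
        System.out.println(predict(la2, 0)); // 3, i.e. ALT_2
    }
}
```

Both versions should agree on la1 and la2, which gives a small regression
test to reuse when the templates start emitting Haskell or Scheme.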
_______________________________________________
antlr-dev mailing list
[email protected]
http://www.antlr.org/mailman/listinfo/antlr-dev