Hi all,

I've read your messages with great attention, especially Sam's very comprehensive answer. I now clearly understand that this is research-level work, and that I was definitely too ambitious in trying to dig into it, as I have limited time after my day job (and probably too limited knowledge as well, but that's a non-problem, since it wouldn't be a one-person task anyway). Nevertheless, I really appreciate the exchange.

Kind regards,
Nicolas

On Wednesday, December 2, 2020 at 7:56:58 PM UTC+1 Sam Tobin-Hochstadt wrote:

> A few thoughts on these topics, which I've been thinking about for a while.
>
> First, let's distinguish two things. One is an _incremental_ system,
> such as a parser, which is one which does less work in response to a
> small change than it would need to do from scratch. The other is a
> system with _error recovery_, which is one where in the presence of
> one error, the system can still provide a useful answer and/or
> continue on to discover other errors. tree-sitter, for example, aims
> to do both of these, but they're quite different.
>
> With that in mind, several points:
>
> 1. It would be relatively straightforward to build an incremental
> _reader_ -- going from text to s-expressions. You could start from the
> grammar here:
> https://github.com/racket/parser-tools/blob/master/parser-tools-lib/parser-tools/examples/read.rkt
> which is just for Scheme, and the lexer here:
> https://github.com/racket/syntax-color/blob/master/syntax-color-lib/syntax-color/racket-lexer.rkt
> which is for full Racket, which as Robby says is already
> error-tolerant. The read syntax (in the absence of reader extensions)
> is definitely context-free and probably LR(1). The code for the reader
> is here:
> https://github.com/racket/racket/tree/master/racket/src/expander/read
>
> However, just calling `read` from scratch every time isn't a big
> bottleneck -- the biggest Racket-syntax file I have around is about
> 86000 lines and takes 700ms to `read`.
>
> 2. As Robby points out, the big challenge is the macro expander, which
> is (a) not a grammar, (b) large and complicated (the code is here:
> https://github.com/racket/racket/tree/master/racket/src/expander and
> it's about 35k lines) and (c) it runs arbitrary Racket code in the
> form of macros. I'm definitely interested in thinking about what an
> incremental expander would look like, but that's a big research
> project and probably would require a different model of macros than
> Racket has right now. It would not work to use some existing parsing
> toolkit like tree-sitter. You could perhaps write a new macro expander
> using an incremental computation framework such as Adapton
> [https://docs.rs/adapton/0.3.31/adapton/] or write something like
> Adapton for Racket. How well that would work is an interesting
> question. You could also rewrite the macro expander to be incremental
> more directly.
>
> 3. An error-tolerant macro expander is more plausible, but would again
> require substantial changes to the expander. One possible idea is to
> use the information the macro stepper already uses to reconstruct the
> partial program right before it went wrong, and supply that to the IDE
> to use for completion/etc. Another idea would be to replace pieces of
> erroneous syntax with something that allows the expander to continue
> (this is how error-tolerant parsers work). There are probably lots
> more ideas that we could come up with.
>
> 4. Compiling to one of the OCaml intermediate languages is an
> interesting idea -- I've thought about their flambda language as a
> possible target before. The place to start is the `schemify` layer:
> https://github.com/racket/racket/tree/master/racket/src/schemify that
> turns fully-expanded Racket code into Scheme code for Chez Scheme.
> Changing that to produce flambda would be plausible, although there
> are a lot of mismatches between the languages that would be tricky to
> overcome. Another possibility would be to directly produce JavaScript
> from that layer. You might be interested in the RacketScript project:
> https://github.com/vishesh/racketscript
>
> If you're interested in thinking more about these topics, or working
> on them, I'm happy to offer more advice.
>
> Sam
>
> On Wed, Dec 2, 2020 at 9:53 AM nicobao <[email protected]> wrote:
> >
> > Hi!
> >
> > The Racket Reader and the Racket Expander always return "Error : blabla"
> > when you send them bad Racket source code.
> > As a consequence, when there is a source code error, DrRacket and the
> > Racket LSP cannot provide IDE functionalities like "find references", "info
> > on hover", "find definition", etc.
> > This is an issue, because 99% of the time one writes code, the code is
> > incorrect. Other languages (Rust, TypeScript/JS, Java, OCaml, etc.) rely on
> > an incremental parser that can provide a tree even if the source code is
> > wrong. Basically, it adds an "ERROR" node in the tree and goes on, instead of
> > stopping everything and returning at the first error.
> > Currently this compiler issue is blocking the Racket IDE from providing a
> > better user experience.
> > For my practical use case of Racket, it is important.
> >
> > I would like to help work in that direction.
> > I see two possible solutions to that:
> > 1) improve the recursive descent parser of the Reader, as well as the
> > Expander, to make them incremental and fault-tolerant
> > 2) rewrite the parser in something like tree-sitter or Menhir, at the
> > cost of having to rewrite the Reader/Expander logic (!!!)
> >
> > Both solutions are daunting tasks.
> >
> > For solution 1), could you point me to the source code of Racket's
> > recursive descent parser? What about the Expander?
> >
> > For solution 2), I was thinking of writing a tree-sitter grammar for
> > Racket. However, I can't find a formal description of the grammar, like
> > Scheme did here:
> > https://www.scheme.com/tspl4/grammar.html#APPENDIXFORMALSYNTAX
> > Of course, the Racket documentation is still quite comprehensive, but it
> > would be nice if anyone could tell me whether there is such a formal
> > document somewhere.
> > Besides, I wonder whether Racket/Scheme could even be described using an
> > LR(1) or a GLR grammar?
> >
> > Finally, has any work been started in this direction?
> >
> > Totally off-topic, but has anyone ever thought of compiling Racket down
> > to OCaml, in order to reuse js_of_ocaml and produce optimized JS code from
> > Racket?
> > I was wondering whether it would be feasible.
> >
> > Final note: I know all of that is _very_ ambitious!
> >
> > Kind regards,
> > Nicolas
> >
> > --
> > You received this message because you are subscribed to the Google
> > Groups "Racket Users" group.
> > To unsubscribe from this group and stop receiving emails from it, send
> > an email to [email protected].
> > To view this discussion on the web visit
> > https://groups.google.com/d/msgid/racket-users/d77440e3-1876-44e5-b52b-323d5715df66n%40googlegroups.com.


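For readers who want to see the error-recovery idea from this thread in concrete form (put an "ERROR" node in the tree and keep going, rather than stopping at the first problem), here is a minimal sketch. It is written in Python only to keep it short, and it is a toy under stated assumptions: it is not Racket's reader, it knows nothing about strings, comments, square brackets, or reader extensions, and every name in it (`Atom`, `ListNode`, `Error`, `read_all`, `read_form`) is made up for illustration. It also shows only error recovery, not incrementality: the whole input is re-read on every call.

```python
from dataclasses import dataclass, field


@dataclass
class Atom:
    text: str


@dataclass
class ListNode:
    items: list


@dataclass
class Error:
    # A recovered error: the message plus whatever was read before giving up.
    message: str
    items: list = field(default_factory=list)


def tokenize(src):
    # Toy tokenizer (assumption): parentheses and whitespace-separated atoms only.
    return src.replace("(", " ( ").replace(")", " ) ").split()


def read_form(tokens, pos):
    """Read one form starting at tokens[pos]; return (node, next_pos)."""
    tok = tokens[pos]
    if tok == ")":
        # Stray closer: record it as an error node and skip past it.
        return Error("unexpected )"), pos + 1
    if tok != "(":
        return Atom(tok), pos + 1
    pos += 1
    items = []
    while pos < len(tokens) and tokens[pos] != ")":
        node, pos = read_form(tokens, pos)
        items.append(node)
    if pos == len(tokens):
        # Ran out of input: keep the partial list inside an error node.
        return Error("missing )", items), pos
    return ListNode(items), pos + 1


def read_all(src):
    """Read every top-level form, never raising on malformed input."""
    tokens = tokenize(src)
    pos = 0
    forms = []
    while pos < len(tokens):
        node, pos = read_form(tokens, pos)
        forms.append(node)
    return forms


forms = read_all("(define x 1) ) (f (g 2")
```

With the input above, the first form is read normally, the stray `)` becomes its own error node, and the unterminated `(f (g 2` comes back as nested error nodes that still contain `f`, `g`, and `2`, which is the kind of partial tree an IDE could keep working with.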