Re: [racket-users] Towards an Incremental Racket Parser for better IDE experience?

nicobao Wed, 09 Dec 2020 09:46:46 -0800

Hi all,

I've read your messages with great attention, especially Sam's very 
comprehensive answer.
I now clearly understand that this is research-level work, and I was 
definitely too ambitious in trying to dig into it, as I have limited 
time after my day job (and probably too little knowledge as well, but 
that's not a problem, since it wouldn't be a one-person task anyway).
Nevertheless, I really appreciate the exchange.


Kind regards,
Nicolas

On Wednesday, December 2, 2020 at 7:56:58 PM UTC+1 Sam Tobin-Hochstadt 
wrote:

> A few thoughts on these topics, which I've been thinking about for a while.
>
> First, let's distinguish two things. One is an _incremental_ system,
> such as a parser, which does less work in response to a small change
> than it would need to do from scratch. The other is a
> system with _error recovery_, which is one where in the presence of
> one error, the system can still provide a useful answer and/or
> continue on to discover other errors. tree-sitter, for example, aims
> to do both of these, but they're quite different.
>
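
To make the first of these two properties concrete, here is a minimal
Python sketch of incrementality alone, assuming a toy s-expression syntax
with no strings, comments, or reader extensions. It is not how Racket's
reader works; `split_top_level`, `parse_form`, and `IncrementalReader` are
hypothetical names made up for this illustration. Each top-level form is
cached by its text, so an edit to one form only re-parses that form. The
sketch still rescans the whole text to find form boundaries, which a real
incremental parser would also avoid.

```python
from typing import Dict, List


def split_top_level(text: str) -> List[str]:
    """Split text into balanced top-level "(...)" chunks.

    Toy assumption: atoms outside parentheses and an unclosed trailing
    form are simply ignored.
    """
    chunks, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "(":
            if depth == 0:
                start = i
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                chunks.append(text[start:i + 1])
    return chunks


def parse_form(chunk: str) -> list:
    """Parse one balanced chunk into nested lists of atom strings."""
    tokens = chunk.replace("(", " ( ").replace(")", " ) ").split()
    stack: List[list] = [[]]
    for tok in tokens:
        if tok == "(":
            stack.append([])
        elif tok == ")":
            done = stack.pop()
            stack[-1].append(done)
        else:
            stack[-1].append(tok)
    return stack[0][0]


class IncrementalReader:
    """Re-reads only the top-level forms whose text it has not seen before."""

    def __init__(self) -> None:
        self.cache: Dict[str, list] = {}   # chunk text -> parsed form
        self.parsed_last_run = 0           # chunks actually parsed by read()

    def read(self, text: str) -> List[list]:
        self.parsed_last_run = 0
        forms = []
        for chunk in split_top_level(text):
            if chunk not in self.cache:
                self.cache[chunk] = parse_form(chunk)
                self.parsed_last_run += 1
            forms.append(self.cache[chunk])
        return forms


reader = IncrementalReader()
source = "(define a 1)\n(define b 2)\n(define c 3)"
reader.read(source)                          # first run parses all 3 forms
edited = source.replace("b 2", "b 20")
forms = reader.read(edited)                  # second run parses only 1 form
print(reader.parsed_last_run, forms[1])
```

Note that nothing here helps with malformed input: the two properties are
independent, and this sketch only addresses the first one.
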
> With that in mind, several points:
>
> 1. It would be relatively straightforward to build an incremental
> _reader_ -- going from text to s-expressions. You could start from the
> grammar here: 
> https://github.com/racket/parser-tools/blob/master/parser-tools-lib/parser-tools/examples/read.rkt
> which is just for Scheme, and the lexer here:
>
> https://github.com/racket/syntax-color/blob/master/syntax-color-lib/syntax-color/racket-lexer.rkt
> which is for full Racket, which as Robby says is already
> error-tolerant. The read syntax (in the absence of reader extensions)
> is definitely context-free and probably LR(1). The code for the reader
> is here: 
> https://github.com/racket/racket/tree/master/racket/src/expander/read
>
> However, just calling `read` from scratch every time isn't a big
> bottleneck -- the biggest Racket-syntax file I have around is about
> 86000 lines and takes 700ms to `read`.
>
> 2. As Robby points out, the big challenge is the macro expander, which
> is (a) not a grammar, (b) large and complicated (the code is here:
> https://github.com/racket/racket/tree/master/racket/src/expander and
> it's about 35k lines) and (c) it runs arbitrary Racket code in the
> form of macros. I'm definitely interested in thinking about what an
> incremental expander would look like, but that's a big research
> project and probably would require a different model of macros than
> Racket has right now. It would not work to use some existing parsing
> toolkit like tree-sitter. You could perhaps write a new macro expander
> using an incremental computation framework such as Adapton
> [https://docs.rs/adapton/0.3.31/adapton/] or write something like
> Adapton for Racket. How well that would work is an interesting
> question. You could also rewrite the macro expander to be incremental
> more directly.
>
> 3. An error-tolerant macro expander is more plausible, but would again
> require substantial changes to the expander. One possible idea is to
> use the information the macro stepper already uses to reconstruct the
> partial program right before it went wrong, and supply that to the IDE
> to use for completion/etc. Another idea would be to replace pieces of
> erroneous syntax with something that allows the expander to continue
> (this is how error-tolerant parsers work). There are probably lots
> more ideas that we could come up with.
>
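
The "replace the erroneous piece and keep going" idea from point 3 is
easiest to see at the reader level. Below is a minimal Python sketch,
assuming a toy s-expression syntax (parentheses and whitespace-separated
atoms only). `Error`, `tokenize`, and `read_tolerant` are hypothetical
names for this illustration, and it says nothing about how the real
expander could do the same thing, which is the hard part. Instead of
raising on unbalanced parentheses, the reader records a placeholder node
and continues.

```python
from dataclasses import dataclass


@dataclass
class Error:
    """Placeholder node recorded where the input was malformed."""
    message: str
    pos: int


def tokenize(text):
    """Yield (position, token) pairs: parentheses and atoms."""
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch in "()":
            yield i, ch
            i += 1
        else:
            start = i
            while i < len(text) and not text[i].isspace() and text[i] not in "()":
                i += 1
            yield start, text[start:i]


def read_tolerant(text):
    """Read all top-level forms without raising on unbalanced parentheses."""
    top = []
    stack = [top]   # lists currently being filled, innermost last
    opens = []      # positions of "(" tokens that are still unmatched
    for pos, tok in tokenize(text):
        if tok == "(":
            new = []
            stack[-1].append(new)
            stack.append(new)
            opens.append(pos)
        elif tok == ")":
            if opens:
                stack.pop()
                opens.pop()
            else:
                # Stray closer: record it and keep reading.
                stack[-1].append(Error("unexpected )", pos))
        else:
            stack[-1].append(tok)
    # Close every list still open at end of input, marking each one.
    while opens:
        stack.pop().append(Error("missing )", opens.pop()))
    return top


print(read_tolerant("(define (f x) (+ x 1"))
```

With this shape, an IDE could still see `define`, `f`, and `x` in a
half-typed definition, and the `Error` nodes say where the input went wrong.
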
> 4. Compiling to one of the OCaml intermediate languages is an
> interesting idea -- I've thought about their flambda language as a
> possible target before. The place to start is the `schemify` layer:
> https://github.com/racket/racket/tree/master/racket/src/schemify that
> turns fully-expanded Racket code into Scheme code for Chez Scheme.
> Changing that to produce flambda would be plausible, although there
> are a lot of mismatches between the languages that would be tricky to
> overcome. Another possibility would be to directly produce JavaScript
> from that layer. You might be interested in the RacketScript project:
> https://github.com/vishesh/racketscript
>
> If you're interested in thinking more about these topics, or working
> on them, I'm happy to offer more advice.
>
> Sam
>
> On Wed, Dec 2, 2020 at 9:53 AM nicobao <[email protected]> wrote:
> >
> > Hi!
> >
> > The Racket Reader and the Racket Expander always return "Error : blabla"
> > when you send them invalid Racket source code.
> > As a consequence, when there is a source code error, DrRacket and the
> > Racket LSP cannot provide IDE functionalities like "find references",
> > "info on hover", "find definition", etc.
> > This is an issue, because 99% of the time one writes code, the code is
> > incorrect. Other languages (Rust, TypeScript/JS, Java, OCaml, etc.) rely
> > on an incremental parser that can provide a tree even if the source code
> > is wrong. Basically, it adds an "ERROR" node to the tree and goes on
> > instead of stopping everything and returning at the first error.
> > Currently this compiler issue prevents Racket IDEs from providing a
> > better user experience.
> > For my practical use case of Racket, it is important.
> >
> > I would like to help work in that direction.
> > I see two possible solutions:
> > 1) improve the recursive descent parser of the Reader, as well as the
> > Expander, to make them incremental and fault-tolerant
> > 2) rewrite the parser in something like tree-sitter or Menhir, at the
> > cost of having to rewrite the Reader/Expander logic (!!!)
> >
> > Both solutions are daunting tasks.
> >
> > For solution 1), could you point me to the source code of Racket's
> > recursive descent parser? What about the Expander?
> >
> > For solution 2), I was thinking of writing a tree-sitter grammar for
> > Racket. However, I can't find a formal description of the grammar, like
> > the one Scheme has here:
> > https://www.scheme.com/tspl4/grammar.html#APPENDIXFORMALSYNTAX
> > Of course, the Racket documentation is still quite comprehensive, but it
> > would be nice if anyone could tell me whether such a formal document
> > exists somewhere.
> > Besides, I wonder whether Racket/Scheme could even be described using an
> > LR(1) or a GLR grammar?
> >
> > Finally, has any work been started in this direction?
> >
> > Totally off-topic, but has anyone ever thought of compiling Racket down
> > to OCaml, in order to reuse js_of_ocaml and produce optimized JS code
> > from Racket?
> > I was wondering whether it would be feasible.
> >
> > Final note: I know all of that is _very_ ambitious!
> >
> > Kind regards,
> > Nicolas
> >
>
