[PATCH] D130460: [pseudo] Add recovery for declarations

Haojian Wu via Phabricator via cfe-commits Thu, 22 Sep 2022 02:03:38 -0700

hokein added a comment.

I think the current version should be good to review, some bits need to take 
care of:


- left-recursive vs right-recursive of declaration-seq rule, see my other 
comments about it. Currently I keep it as-is to avoid the performance 
regression on large files (I still don't see your point how right-recursive can 
simplify the error recovery, it seems to me either can work at least for this 
case)
- recoveryNextDeclaration will consume at least 1 token (the return token index 
> Cursor) to avoid the risk of running in an infinite loop
- I keep the original implementation as-is (except fixing some bugs), but IMO 
`T->Indent == OrigIndent && LikelyStartsDeclaration(T->Kind)` is too strict 
(see my FIXME testcases), and we should probably include the IDENTIFIER (I 
believe it is a common case), `LikelyStartsDeclaration(T->Kind) || (T->Kind = 
IDENTIFIER && T->Indent == OrigIndent)` might be better.



================
Comment at: clang-tools-extra/pseudo/lib/cxx/CXX.cpp:291
+  // they appear at the beginning of a line.
+  auto LikelyStartsDeclaration = [&](tok::TokenKind K) -> bool {
+    switch (K) {
----------------
hokein wrote:
> the list looks reasonable -- I think we can lookup the `Follow(declaration)` 
> set to find more.
added more: kw_friend, kw_inline, kw_explicit, kw_virtual.


================
Comment at: clang-tools-extra/pseudo/lib/cxx/cxx.bnf:313
-declaration-seq := declaration
-declaration-seq := declaration-seq declaration
 declaration := block-declaration
----------------
hokein wrote:
> Just want to spell out, this causes a performance hit :(
> (a pure change of this has a ~6% slowdown, 7.17 MB/s -> 6.7MB/s on 
> SemaCodeComplete.cpp).
> 
> left-cursive grammar is more friendly to LR-like parsers, because the parser 
> can do the reduction earlier (after a parsed a `declaration`, we can perform 
> a `declaration-seq` reduce); while right-cursive grammar, the reduction 
> starts from the last declaration (this means our GSS is more deeper).
> 
> > change decl-sequence to right-recursive to simplify this
> I remembered we discussed this bit before, but I don't really remember the 
> details now :(
> 
Got more data, it does hurt the performance on large files:

SemaExpr.cpp: 9.54MB/s -> 7.98MB/s
SemaDeclCXX.cpp: 10.16MB/s -> 9.25MB/s

small/medium files doesn't seem to be affected (hypothesis: these files has 
less declarations than large files, thus being less affected by the 
right-recursive declaration-seq grammar rule)

Hover.cpp: 8MB/s -> 7.92MB/s
ASTSignals.cpp: 12.34M/s -> 12.29MB/s



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130460/new/

https://reviews.llvm.org/D130460

_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D130460: [pseudo] Add recovery for declarations

Reply via email to