krasimir added inline comments.
================ Comment at: clang/lib/Format/TokenAnnotator.cpp:125 + CurrentToken->Next->getStartOfNonWhitespace().getLocWithOffset( + -1))) return false; ---------------- penagos wrote: > krasimir wrote: > > penagos wrote: > > > MyDeveloperDay wrote: > > > > I don't really understand what we are saying here? > > > Effectively we are checking that, barring intervening whitespace, we are > > > analyzing 2 consecutive '>' tokens. If so, we treat such sequence as a > > > binary op in lieu of a closing template angle bracket. If there's another > > > more straightforward way of accomplishing this check, I'm open to that, > > > but this seemed to be the most straightforward way at the time. > > I'm worried that this may regress template code. How does this account for > > cases where two consecutive `>`-s are really two closing template brackets, > > e.g., > > `std::vector<std::decay_t<int& >> v;`? > > > > In particular, one added test case is ambiguous: `>>` could really be two > > closing template brackets: > > https://godbolt.org/z/v19hj9vKn > > > > I have to say that my general feeling about trying to disambiguate between > > bitshifts and template closers is: don't try too hard inside clang-format > > as the heuristics are generally quite brittle and make the code harder to > > maintain; in cases where clang-format wrongly detects bitshift as > > templates, users should add parens around the bitshift, which IMO improves > > readability. > As this patch currently stands, it does not disambiguate between bitshift > '>>' operators and 2 closing template brackets, so in your snippet, we would > no longer insert a space between the '>' characters (despite arguably being > the better formatting decision in this case). > > I agree with your feeling that user guided disambiguation between bitshift > operators and template closing brackets via parens is the ideal solution and > also improves readability, but IMO the approach taken by clang-format to > format the '>' token should be conservative in that any change made should be > non-semantic altering, which is not presently the case. While the case you > mentioned would regress, we would no longer potentially alter program > semantics. Thinking about this more, would it make sense to modify the actual > white-space change generation later on in the analysis to not break up >> > sequences of characters in lieu of annotating the tokens differently as the > proposed patch is currently doing? I tried and can't make this misinterpret two consecutive template `>` as a bit shift, IMO because this check is guarded by the `Left->ParentBracket != tok::less` condition. Both `std::vector<std::decay_t<int&>> v;` and `test<test<a | b>> c;` below are handled correctly. I'm less worried about regressions in common template cases now. Thank you for pointing out altering program semantics, I agree. Please add a comment about this tradeoff and and a bit of the reasoning behind it in code for future reference. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D100778/new/ https://reviews.llvm.org/D100778 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits