penagos added inline comments.

================
Comment at: clang/lib/Format/TokenAnnotator.cpp:125
+                 
CurrentToken->Next->getStartOfNonWhitespace().getLocWithOffset(
+                     -1)))
           return false;
----------------
krasimir wrote:
> penagos wrote:
> > krasimir wrote:
> > > penagos wrote:
> > > > MyDeveloperDay wrote:
> > > > > I don't really understand what we are saying here? 
> > > > Effectively we are checking that, barring intervening whitespace, we 
> > > > are analyzing 2 consecutive '>' tokens. If so, we treat such sequence 
> > > > as a binary op in lieu of a closing template angle bracket. If there's 
> > > > another more straightforward way of accomplishing this check, I'm open 
> > > > to that, but this seemed to be the most straightforward way at the time.
> > > I'm worried that this may regress template code. How does this account 
> > > for cases where two consecutive `>`-s are really two closing template 
> > > brackets, e.g.,
> > > `std::vector<std::decay_t<int& >> v;`?
> > > 
> > > In particular, one added test case is ambiguous: `>>` could really be two 
> > > closing template brackets:
> > > https://godbolt.org/z/v19hj9vKn
> > > 
> > > I have to say that my general feeling about trying to disambiguate 
> > > between bitshifts and template closers is: don't try too hard inside 
> > > clang-format as the heuristics are generally quite brittle and make the 
> > > code harder to maintain; in cases where clang-format wrongly detects 
> > > bitshift as templates, users should add parens around the bitshift, which 
> > > IMO improves readability.
> > As this patch currently stands, it does not disambiguate between bitshift 
> > '>>' operators and 2 closing template brackets, so in your snippet, we 
> > would no longer insert a space between the '>' characters (despite arguably 
> > being the better formatting decision in this case).
> > 
> > I agree with your feeling that user guided disambiguation between bitshift 
> > operators and template closing brackets via parens is the ideal solution 
> > and also improves readability, but IMO the approach taken by clang-format 
> > to format the '>' token should be conservative in that any change made 
> > should be non-semantic altering, which is not presently the case. While the 
> > case you mentioned would regress, we would no longer potentially alter 
> > program semantics. Thinking about this more, would it make sense to modify 
> > the actual white-space change generation later on in the analysis to not 
> > break up >> sequences of characters in lieu of annotating the tokens 
> > differently as the proposed patch is currently doing?
> I tried and can't make this misinterpret two consecutive template `>` as a 
> bit shift, IMO because this check is guarded by the `Left->ParentBracket != 
> tok::less` condition. Both `std::vector<std::decay_t<int&>> v;` and 
> `test<test<a | b>> c;` below are handled correctly.
> I'm less worried about regressions in common template cases now.
> Thank you for pointing out altering program semantics, I agree.
> Please add a comment about this tradeoff and and a bit of the reasoning 
> behind it in code for future reference.
I had come to the same conclusion when modifying the conditional here; namely 
the ParentBracket predicate is what catches the case you were alluding to 
earlier.
I've added a brief comment to `parseAngle()` to document the need for the 
change, explaining the conservative nature of the change w.r.t. nested template 
cases; thank you for the suggestion.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D100778/new/

https://reviews.llvm.org/D100778

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to