[PATCH] D100778: [clang-format] Prevent extraneous space insertion in bitshift operators

Krasimir Georgiev via Phabricator via cfe-commits Mon, 26 Apr 2021 02:33:41 -0700

krasimir added inline comments.


================
Comment at: clang/lib/Format/TokenAnnotator.cpp:125
+                 
CurrentToken->Next->getStartOfNonWhitespace().getLocWithOffset(
+                     -1)))
           return false;
----------------
penagos wrote:
> krasimir wrote:
> > penagos wrote:
> > > MyDeveloperDay wrote:
> > > > I don't really understand what we are saying here? 
> > > Effectively we are checking that, barring intervening whitespace, we are 
> > > analyzing 2 consecutive '>' tokens. If so, we treat such sequence as a 
> > > binary op in lieu of a closing template angle bracket. If there's another 
> > > more straightforward way of accomplishing this check, I'm open to that, 
> > > but this seemed to be the most straightforward way at the time.
> > I'm worried that this may regress template code. How does this account for 
> > cases where two consecutive `>`-s are really two closing template brackets, 
> > e.g.,
> > `std::vector<std::decay_t<int& >> v;`?
> > 
> > In particular, one added test case is ambiguous: `>>` could really be two 
> > closing template brackets:
> > https://godbolt.org/z/v19hj9vKn
> > 
> > I have to say that my general feeling about trying to disambiguate between 
> > bitshifts and template closers is: don't try too hard inside clang-format 
> > as the heuristics are generally quite brittle and make the code harder to 
> > maintain; in cases where clang-format wrongly detects bitshift as 
> > templates, users should add parens around the bitshift, which IMO improves 
> > readability.
> As this patch currently stands, it does not disambiguate between bitshift 
> '>>' operators and 2 closing template brackets, so in your snippet, we would 
> no longer insert a space between the '>' characters (despite arguably being 
> the better formatting decision in this case).
> 
> I agree with your feeling that user guided disambiguation between bitshift 
> operators and template closing brackets via parens is the ideal solution and 
> also improves readability, but IMO the approach taken by clang-format to 
> format the '>' token should be conservative in that any change made should be 
> non-semantic altering, which is not presently the case. While the case you 
> mentioned would regress, we would no longer potentially alter program 
> semantics. Thinking about this more, would it make sense to modify the actual 
> white-space change generation later on in the analysis to not break up >> 
> sequences of characters in lieu of annotating the tokens differently as the 
> proposed patch is currently doing?
I tried and can't make this misinterpret two consecutive template `>` as a bit 
shift, IMO because this check is guarded by the `Left->ParentBracket != 
tok::less` condition. Both `std::vector<std::decay_t<int&>> v;` and 
`test<test<a | b>> c;` below are handled correctly.
I'm less worried about regressions in common template cases now.
Thank you for pointing out altering program semantics, I agree.
Please add a comment about this tradeoff and and a bit of the reasoning behind 
it in code for future reference.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D100778/new/

https://reviews.llvm.org/D100778

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D100778: [clang-format] Prevent extraneous space insertion in bitshift operators

Reply via email to