[PATCH] D65043: [Format] Add C++20 standard to style options

Brian Gesiak via Phabricator via cfe-commits Wed, 24 Jul 2019 10:05:43 -0700

modocache marked 3 inline comments as done.
modocache added a comment.

> It sounds like currently they're very different, and you're proposing to make 
> them basically the same. I think that's a good thing.

I 100% agree with this statement, and thank you @sammccall for the suggestion 
to move in this direction.

> The obvious problem here is that this may well break existing projects that 
> use Cpp11 and C++14/17 features. I think we can punt on this by making 
> Cpp11/14 still enable 14/17 in LangOpts for now (so we don't change behavior, 
> just give people a better way to specify their intent) and change it later if 
> needed.

I like this idea. `LS_Cpp14` and `LS_Cpp17` are new options that we'll 
introduce and so clang-format should interpret them very precisely, whereas 
`LS_Cpp11` (for now) will remain a little ambiguous. Hopefully some thorough 
documentation will prevent some user confusion here.

The codebase I'm applying formatting to uses `-std=c++17` and `-fcoroutines-ts` 
to compile, but specifies `LS_Cpp11` in its .clang-format file -- as you 
mentioned, the intent was to simply use "new" C++ formatting, for whatever 
clang-format decided "new" to mean. I'd like to reason "out loud" about what 
changes we can make to clang-format, and what my codebase's migration path 
would look like should we make those changes. Based on the discussion above I'm 
thinking we make these commits/changes to clang-format itself:

1. Add the `enum LanguageStandard` @sammccall suggested above verbatim, thereby 
allowing users to specify an arbitrary standard such as `LS_Cpp17`. To split up 
the work for easier code review, we might want to consider just updating the 
.clang-format parsing to recognize `"Cpp17"`, but internally still have 
clang-format treat it as `LS_Cpp11`. This is because step (2) is a bit of a 
doozy.
2. Audit the existing uses of `FormatStyle::LS_Cpp{03,11,Auto}` in the 
clang-format codebase and modify them to instead check for the appropriate 
standard. We might want this work to land as several small commits.
  1. For example, formatting that should only be applied to C++17 should be 
modified, and instead of checking for `LS_Cpp11`, it should check for `LS_Cpp17 
|| LS_Cpp11` (`LS_Cpp11` is still included in the check because we're using 
that to mean "new"). Commits with changes like this can land without any impact 
to existing users.
  2. Formatting that should only be applied to C++20 should be modified to 
check for `LS_Cpp20` //instead of// `LS_Cpp11`. These commits //will// change 
end behavior for users. In my codebase, for example, we would probably want to 
update our .clang-format, from `"Cpp11"` to `"Cpp20"`, before we update to a 
clang-format that includes these commits, because otherwise our `co_yield` 
keywords will begin to get formatted incorrectly.
3. Change .clang-format parsing to recognize the new enum cases like `"Cpp17"` 
as `LS_Cpp17`, not `LS_Cpp11` (see item 1).
4. Commit the `CoroutinesTS` option from D65044 
<https://reviews.llvm.org/D65044>. My codebase would switch from `LS_Cpp20` to 
`LS_Cpp17` and `CoroutinesTS` and almost no formatting should change.

If the above series of patches sounds reasonable, I wouldn't mind working on 
some or all of them, so let me know!

================
Comment at: clang/docs/ClangFormatStyleOptions.rst:2227
+  * ``LS_Cpp20`` (in configuration: ``Cpp20``)
+    Use features of C++20 and C++2a (e.g.: treating ``co_yield`` as a keyword,
+    not an identifier, so ``co_yield++ i`` is formatted as ``co_yield ++i``).
----------------
Quuxplusone wrote:
> C++2a //will// be C++20, barring any radically unforeseen events. So saying 
> "C++20 and C++2a" is redundant. Personally I would follow GCC/Clang's lead 
> and say "C++2a" until the standard is actually out.
Thank you, I'll do that. I wasn't sure about the naming conventions. While 
we're talking about this, mind if I ask: why did "C++1z" use "z" whereas 
"C++2a" use "a"? Once C++20 is actually out, is the next version "C++2b", or 
does "C++2a" start referring to C++23? Sorry for the rookie questions.

================
Comment at: clang/include/clang/Format/Format.h:1878
     LS_Cpp11,
+    /// Use features of C++20 and C++2a (e.g.: treating ``co_yield`` as a
+    /// keyword, not an identifier, so ``co_yield++ i`` is formatted as
----------------
Quuxplusone wrote:
> Again, C++2a is likely a synonym for C++20.
> Three lines earlier, you might want to change "C++1z" to "C++17" (and grep 
> the codebase for other instances of "1z").
Cool, will do, thank you!

================
Comment at: clang/unittests/Format/FormatTest.cpp:3721
+      "        aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa) <= >\n"
+      "    5) {\n"
       "}");
----------------
Quuxplusone wrote:
> This doesn't seem to test what you set out to test, does it? `(x) <= > 5` 
> isn't a valid C++ expression anyway. Maybe what you want to test here is that 
> clang-format is willing to reformat pre-C++2a code
> 
>     LongName<&LongerName::operator<=> x;
> 
> into
> 
>     LongName<
>         &LongerName::operator<=
>     > x;
> 
> (that is, that it's willing to insert a line break between `<=` and `>`). 
> However, if clang-format //refused// to insert a line break in that one 
> position even in C++11 mode, would anything of value really be lost?
Absolutely right, I thought of this test not as a signal of what I want to 
happen, but rather to demonstrate what does happen. I'll either update this 
test to make that clear, or add a test like the one you suggested instead.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65043/new/

https://reviews.llvm.org/D65043

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D65043: [Format] Add C++20 standard to style options

Reply via email to