[PATCH] D28764: [clang-format] Implement comment reflowing (v3)

Krasimir Georgiev via Phabricator via cfe-commits Wed, 25 Jan 2017 03:26:08 -0800

krasimir added inline comments.


================
Comment at: lib/Format/ContinuationIndenter.cpp:1158-1159
+        CommentPragmasRegex.match(Current.TokenText.substr(2)) ||
+        Current.TokenText.substr(2).ltrim().startswith("clang-format on") ||
+        Current.TokenText.substr(2).ltrim().startswith("clang-format off"))
       return addMultilineToken(Current, State);
----------------
klimek wrote:
> krasimir wrote:
> > klimek wrote:
> > > krasimir wrote:
> > > > klimek wrote:
> > > > > Generally, we shouldn't need those here, as those should be part of 
> > > > > the Token.Finalized state.
> > > > Here the problem is that the comments /* clang-format on */ and /* 
> > > > clang-format off */ control the formatting of the tokens in between, 
> > > > but do not control their own formatting, that is they are not finalized 
> > > > themselves.
> > > > What happens is that if we rely solely on Token.Finalized and the 
> > > > ColumnLimit is small, say 20, then:
> > > > ```
> > > > /* clang-format off */ 
> > > > ```
> > > > gets broken into:
> > > > ```
> > > > /* clang-format off
> > > >   */
> > > > ```
> > > > 
> > > > Add comments about this.
> > > Isn't the right fix to change the tokens of the comment to be finalized?
> > That's arguable. Consider [[ 
> > https://github.com/llvm-mirror/clang/blob/master/unittests/Format/FormatTest.cpp#L10916
> >  | FormatTest.cpp:10916 ]]: in this case we still want to re-indent the /* 
> > clang-format on/off */ comments, just not break them. If we finalize the 
> > token, we can't re-indent.
> I see two options:
> - mark the comment with a special token type for clang-format on/off
> - pull out a function - I see you already have one called mayReflowContent; 
> can we use that?
mayReflowContent is too specific for content during region. I'll add a 
delimitsRegion method.


================
Comment at: lib/Format/ContinuationIndenter.cpp:1205-1210
+    if (SplitBefore.first != StringRef::npos) {
+      TailOffset = SplitBefore.first + SplitBefore.second;
+      ReflowInProgress = true;
+    } else {
+      ReflowInProgress = false;
+    }
----------------
klimek wrote:
> (optional) I'd probably write this:
>   ReflowInProgress = SplitBefore.first != StringRef::npos;
>   if (ReflowInProgress) {
>     TailOffset = SplitBefore.first + SplitBefore.second;
>   }
How about this?


================
Comment at: lib/Format/ContinuationIndenter.cpp:1231-1233
+      unsigned RemainingTokenColumnsAfterCompress =
+          Token->getLineLengthAfterCompress(LineIndex, TailOffset,
+                                            RemainingTokenColumns, Split);
----------------
klimek wrote:
> I think AfterCompression reads more naturally. Perhaps we should use "trim" 
> instead of compress, now that I think about it.
> So I'd probably go for getTrimmedLineLength or something.
I don't agree with "trimmig" because i have a connotation of "trimming" around 
the ends of strings like in ltrim/rtrim and this fundamentally operates 
somewhere inside. +1 for AfterCompression, though.


================
Comment at: lib/Format/WhitespaceManager.cpp:131
+        Changes[i - 1].Kind == tok::comment &&
+        OriginalWhitespaceStart != PreviousOriginalWhitespaceEnd;
   }
----------------
klimek wrote:
> Why was this needed? (what breaks without this change)
This is a dirty hack. The problem is that WhitespaceManager does its own 
formatting for trailing comments and sometimes it re-formats a piece of line 
even after it's been reflown with the previous line. Consider if we reflow the 
second line up:
```
// line 1
// line 2
```
That amounts to 2 changes, first (delimited by ()) for the whitespace between 
the two tokens, and second (delimited by []) for the beginning of the next line:
```
// line 1(
)[// ]line 2
``` 
So in the end we have two changes like this:
```
// line 1()[ ]line 2
```
Note that the OriginalWhitespaceStart of the second change is the same as the 
PreviousOriginalWhitespaceEnd of the first change.
In this case, the above code marks the second line as trailing comment and 
afterwards applies its own trailing-comment-alignment logic to it, which might 
introduce extra whitespace before "line 2".

For a proper solution we need a mechanism to say that a particular change 
emitted through the whitespace manager breaks the sequence of trailing comments.




================
Comment at: unittests/Format/FormatTest.cpp:2169
+
+  // Don't reflow lines starting with two punctuation characters.
+  EXPECT_EQ("// long long long\n"
----------------
klimek wrote:
> But we want to reflow lines starting with one punctuation character?
I guess I'm looking at lines that start like this: "ala", 'ala'. So it may be 
better to special-case these two instead.
The heuristic for now, in mayReflowContent is: the content of the line needs to 
have at least two characters and either the first or the second character must 
be non-punctuation.

Another approach would be to not do these funny things and to see how users 
will complain about mis-re-flows and implement blacklist/whitelist content 
regex prefixes in the future.

Add a test showing the current positive behavior.


https://reviews.llvm.org/D28764



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D28764: [clang-format] Implement comment reflowing (v3)

Reply via email to