[PATCH] D86559: [Sema, CodeGen] Allow [[likely]] and [[unlikely]] on labels

Aaron Ballman via Phabricator via cfe-commits Thu, 27 Aug 2020 06:16:58 -0700

aaron.ballman added a comment.

In D86559#2239859 <https://reviews.llvm.org/D86559#2239859>, @Mordante wrote:

> In D86559#2239013 <https://reviews.llvm.org/D86559#2239013>, @aaron.ballman 
> wrote:
>
>> I'd like to understand your reasoning for ignore + diagnose as opposed to 
>> count attrs (+ optionally diagnose) or other strategies. I can see pros and 
>> cons to basically anything we do here.
>>
>> If we decide to ignore conflicting likelihoods, then I agree we should issue 
>> a diagnostic. However, I suspect that's going to come up frequently in 
>> practice because people are going to write macros that include these 
>> attributes. For instance, it's very sensible for a developer to put 
>> [[unlikely]] in front of an `assert` under the assumption this is telling 
>> the compiler "this assertion isn't likely to happen" when in fact it's 
>> saying "this branch containing the assertion is unlikely to be taken".
>
> This macro is example why I dislike the counting. If `MyAssert` has an 
> `[[unlikely]]` attribute, then changing the number of `MyAssert` statements 
> will influence the likelihood of the branches taken.

<nod>, but this is an example of why I don't like the behavior of applying this 
to arbitrary statements; this is spooky behavior that would be very hard to 
spot in code reviews unless you're an expert on C++.

>> This is one of the reasons why the GCC behavior of allowing these attributes 
>> on arbitrary statements feels like an awful trap for users to get into. If 
>> the attribute only applied to compound statements following a flow control 
>> construct, I think the average programmer will have a far better chance of 
>> not using this wrong. (I know that I said in the other thread we should 
>> match the GCC behavior, but I'm still uncomfortable with it because that 
>> behavior seems to be fraught with dangers for users, and this patch 
>> introduces new sharp edges, as @Quuxplusone pointed out.)
>
> I'm also not fond of GCC's implementation, but I think their implementation 
> conforms with the standard.

Conforming to the standard is not hard in this case. We can parse the 
attributes and ignore their semantics entirely (but not the constraints) and 
that also conforms to the standard. Our job is to figure out what the best 
behavior is for our implementation (which may or may not mean following the GCC 
behavior).

> In hindsight I think it indeed makes more sense to only allow it on compound 
> statements. That will require thinking about how to mark multiple cases as 
> `[[likely]]` when falling through:
>
>   switch(a) {
>     case '0':
>     case '1':
>     case '2': [[likely]] { return 'a' - '0'; }      // All 3 cases likely?
>   
>     case '3':                                                      // neutral?
>     case '4': [[likely]]()                                    // likely?
>     case '5': [[unlikely]] {return 'a' - '0'; }  // unlikely?
>   }
>
> Of course this can be solved by only allowing it on the case labels:
>
>   switch(a) {
>    [[likely]] case '0':
>    [[likely]] case '1':
>    [[likely]] case '2':  { return 'a' - '0'; }    // All 3 cases likely
>   
>     case '3':                                                     // neutral
>     [[likely]] case '4':                                    // likely
>     [[unlikely case '5':  {return 'a' - '0'; }  // unlikely
>   }

That is exactly the behavior I am coming to believe we should follow. You can 
either write it on a compound statement that is guarded by a flow control 
decision (`if`/`else`/`while`) or you can write it on a label, otherwise the 
attribute is parsed and ignored (with a diagnostic). This feels like the right 
mixture of useful with understandable semantics so far, but perhaps we'll find 
other examples that change our minds.

The fallthrough behavior question has one more case we may want to think about:

  switch (x) {
  case 0:
  case 1:
  [[likely]] case 2: break;
  [[unlikely]] default:
  }

Does this mark cases `0` and `1` as being likely or not? I could see users 
wanting to use shorthand rather than repeat themselves on all the cases. 
However, I'm also not certain whether there would be any performance impact if 
we marked only `case 2` as likely and left cases `0` and `1` with default 
likelihood. My gut feeling is that this should only mark `case 2`, but others 
may have different views.

> I fully agree the behaviour mandated by the standard is way too complex and 
> user unfriendly. It would have been nice if there were simpler rules, making 
> it easier to use and to teach. Still I think it would be best to use the 
> complex approach now, since that's what the standard specifies. During that 
> process we can see whether there are more pitfalls. Then we can discuss it 
> with other vendors and see whether we can change the wording of the standard. 
> Do you agree?

The only requirement from the standard is that we parse `[[likely]]` or 
`[[unlikely]]` on a statement or label, that we don't allow conflicting 
attributes in the same attribute sequence, and that the attribute has no 
arguments to it. All the rest is recommended practice that's left up to the 
implementation as a matter of QoI. So we conform to the standard by following 
the approach on compound statements and labels + the constraint checking. We 
may not be following the recommended practice precisely, but we may not *want* 
to follow the recommended practice because it's bad for usability. So I'd like 
us to design the feature that we think gives the most value and is the easiest 
to use that matches the recommended practice as best we can, then start talking 
to other vendors. If other vendors don't want to follow our design, that's 
totally fine, but we should ensure our design *allows* users to write code that 
will behave similarly for other implementations without diagnostics. e.g., we 
don't want to design something where the user has to use macros to avoid 
diagnostics in cross-compiler code. After that, we may want to propose some 
changes to the recommended practice to WG21.

From my current thinking, it seems that there may be agreement that allowing 
these on arbitrary statements may be really difficult for users to understand 
the behavior of and that compound statements and labels are what we think is an 
understandable and useful feature and is also a strict subset of what we COULD 
support. By that, I think we should limit the feature to just compound 
statements and labels; this leaves the door open to allow the attributes on 
arbitrary statements in the future without breaking code. If we allow on 
arbitrary statements from the start, we can't later restrict the feature 
because we'd break code. This lets us start getting field experience with that 
behavior to see how we like it, which may also help when talking to other 
vendors because we'll have something concrete to talk about (hopefully). WDYT?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86559/new/

https://reviews.llvm.org/D86559

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D86559: [Sema, CodeGen] Allow [[likely]] and [[unlikely]] on labels

Reply via email to