Re: [RFC] [C]New syntax for the argument of counted_by attribute for C language

Bill Wendling Wed, 12 Mar 2025 13:03:05 -0700

On Mon, Mar 10, 2025 at 11:33 PM Henrik Olsson <h_ols...@apple.com> wrote:
>
>
>
> On Mar 10, 2025, at 11:04 PM, Martin Uecker <uec...@tugraz.at> wrote:
>
> Am Montag, dem 10.03.2025 um 19:30 -0400 schrieb John McCall:
>
> On 10 Mar 2025, at 18:30, Martin Uecker wrote:
>
> Am Montag, dem 10.03.2025 um 16:45 -0400 schrieb John McCall:
>
>
>
> ..
>
>
>
> While the next example is also ok in C++.
>
> constexpr int n = 2;
>
> struct foo {
>  char buf[n];
> };
>
> With both declarations of 'n' the example has UB in C++.
> So I am not convinced the proposed rules make a lot
> of sense for C++ either.
>
>
> If C required a diagnostic in your first example, it would actually
> put a fair amount of pressure on the C++ committee to get rid of
> this spurious UB rule.
>
>
> Why would C want a diagnostic here?
>
>
> When I said “your first example”, Martin, I did actually mean your
> first example:
>
>
> Sorry, I meant "there". No reason to be condescending though.
>
> But I think it’s clear that you and I just differ on some basic design
> philosophy, so let’s just end the conversation here.
>
>
> I think the issue is that if one does not agree with the
> design decisions made previously for the name lookup rules
> in the C and C++ languages and wants to change those (or
> add new inconsistent ones), then this is not simply
> a question of language design preferences.
>
> Martin
>
>
> John.
>
> I still think one could use designator syntax, i.e. '.n', which
> would be clearer and intuitive for both C and C++ programmers.
>
>
> This doesn’t really solve the ambiguity problem. If n is a field name,
> a programmer who writes __counted_by(n) almost certainly means to name
> the field. “The proper syntax is .n” is the cause of the bug, not its
> solution.
>
>
> Field names in C are in a different namespace. So far, when you write
> 'n' this *never* refers to a field member in any context. And I have
> never seen anybody request a warning for the examples above. So, no,
> "a programmer almost certainly means this" cannot possibly be true.
>
> Martin
>
>
> I won't speak for John, but the way I see it you can optimise for 
> theoretically consistent minimalist semantics, or strive to optimise each 
> feature for its most common use cases. That's a difference in design 
> philosophy.


It's not about having consistent minimalist semantics vs. optimizing
for the common case. Qing pointed out that the current way
'counted_by' and related attributes are implemented results in a real
conflict in parsing. She also pointed out that by continuing with the
current design, we're literally adding a new scoping rule to C. This
is a *massive* overreach for a new attribute feature. This won't be
less of an overreach by adding new diagnostics, no matter how helpful
they are. (Especially diagnostics which are suppressible, because once
a project suppresses a diagnostic, it's rarely turned back on.)
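
To make the scoping point concrete, here is a minimal sketch of how
standard C resolves a bare identifier inside a struct today. It is not
Qing's original example; the names 'n', 'struct foo', and
'foo_buf_size' are made up for illustration. Member names live in their
own name space, so the bare 'n' in the array bound is found by ordinary
lookup and refers to the file-scope constant, not the member declared
just above it:

```c
#include <assert.h>
#include <stddef.h>

enum { n = 2 };   /* ordinary identifier at file scope */

struct foo {
  int n;          /* member 'n': lives in struct foo's member name space */
  char buf[n];    /* bare 'n' is the file-scope constant, so buf has 2 elements */
};

/* Compile-time check that the array bound came from the file-scope 'n'. */
static_assert(sizeof(((struct foo *)0)->buf) == 2,
              "array bound uses the file-scope n");

/* Hypothetical helper: reports the size of the 'buf' member. */
size_t foo_buf_size(void) {
  return sizeof(((struct foo *)0)->buf);
}
```

An attribute argument that resolved a bare 'n' to the member instead
would be following a lookup rule that exists nowhere else in C.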

> Although I don't think these ambiguities matter in practice (as long as there 
> are workarounds) since their occurrence will be vanishingly rare, the fact 
> that 'n' currently means one thing doesn't mean that it should have to mean 
> the same thing in another context with different design goals, especially 
> when that is a completely new context. The context of "__counted_by(...)" 
> surrounding the expression should be enough to infer what is meant. The fact 
> that code could be structured in a way to mislead the reader is nothing new 
> to C, and when the context where that even could be a concern is largely 
> hypothetical I don't think eliminating it should be a priority.

Qing pointed out in four lines of code how there are two different
token resolution rules being used: one which is reliant upon C's
current scoping rules and the other which requires a completely new
scoping rule. This is no longer a question of what a programmer is
most likely to infer, whether the documentation makes it clear what's
happening, or any of the other ambiguity concerns that others and I
raised before. This is a serious problem. We essentially have two
options:

1. Create a proposal to the C standards committee adding an "instance
scope" into the language and use that for the feature, or
2. Find some other way, which doesn't require modifying the base language.

The suggestion is to use "__self" to refer to elements within the
least enclosing, non-anonymous struct, where "__self.name" is a field
in that struct and "name" is a global or shadowed local variable.
There have been other suggestions, such as using a function pointer,
e.g.:

struct b;
size_t do_calc(struct b *);

struct b {
  int num_elems;
  // a pointer to 'struct b' is implicitly passed in
  int *buf __counted_by(do_calc);
};

size_t do_calc(struct b *ptr) { ... }

This solves everything, but has the disadvantage of being a bit
heavy-handed (though I'd like to see this used for expressions).
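
For reference, the effect of the function-pointer form can be emulated
by hand in today's C. This is only a sketch under assumptions: a
'__counted_by' that takes a function is not an implemented feature as
far as I know, the body of 'do_calc' is invented, and 'b_get' is a
hypothetical accessor standing in for the check a compiler might
insert.

```c
#include <assert.h>
#include <stddef.h>

struct b;
size_t do_calc(struct b *);

struct b {
  int num_elems;
  int *buf;       /* imagined annotation: __counted_by(do_calc) */
};

/* Assumed body: the count is derived from the enclosing object. */
size_t do_calc(struct b *ptr) {
  return ptr->num_elems > 0 ? (size_t)ptr->num_elems : 0;
}

/* Hypothetical stand-in for a compiler-inserted bounds check:
   stores buf[i] in *out and returns 1 if i is in bounds, else returns 0. */
int b_get(struct b *p, size_t i, int *out) {
  assert(p != NULL && out != NULL);
  if (i >= do_calc(p))
    return 0;
  *out = p->buf[i];
  return 1;
}
```

Because 'do_calc' is an ordinary file-scope identifier, naming it in
the attribute needs no new lookup rule; all member access happens
inside the function through 'ptr'.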

We could possibly aid parsing when the 'count' occurs after the
attributed member by adding a declaration:

struct b {
  int *buf __counted_by(size_t num_elems; __self.num_elems);
  size_t num_elems;
};

And there may be other suggestions. However, until we come up with a
solution that doesn't require adding a new scoping rule to C, I don't
see how we can continue with releasing this feature. As for current
uses, it's possible to use 'clang-tidy' to modify code en masse. It's
a PITA, but I think it's necessary.
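
One way to limit the cost of a later syntax change, offered here as an
assumption rather than something proposed in this thread, is to route
every use through a project-local macro (the Linux kernel does
something similar with its own '__counted_by' macro). The sketch below
uses the flexible-array-member form, which as far as I know is the form
supported by released compilers; 'COUNTED_BY', 'struct a', 'a_new', and
'a_set' are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Expand to the attribute only where the compiler reports support for it. */
#if defined(__has_attribute)
#  if __has_attribute(counted_by)
#    define COUNTED_BY(member) __attribute__((counted_by(member)))
#  endif
#endif
#ifndef COUNTED_BY
#  define COUNTED_BY(member) /* unsupported: expands to nothing */
#endif

struct a {
  size_t num_elems;
  int buf[] COUNTED_BY(num_elems);
};

/* Allocates a 'struct a' with room for n elements, all set to zero. */
struct a *a_new(size_t n) {
  struct a *p = malloc(sizeof *p + n * sizeof p->buf[0]);
  if (p == NULL)
    return NULL;
  p->num_elems = n;   /* set the count before touching buf */
  for (size_t i = 0; i < n; i++)
    p->buf[i] = 0;
  return p;
}

/* Stores v at index i; the index is checked against the count. */
void a_set(struct a *p, size_t i, int v) {
  assert(p != NULL && i < p->num_elems);
  p->buf[i] = v;
}
```

If the argument syntax later changes (for example to a '__self.'
prefix), only the macro definition would need to be edited, not every
annotated member.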

-bw
