On Tuesday, 01.04.2025 at 15:01 +0000, Qing Zhao wrote:
> 
> > On Apr 1, 2025, at 10:04, Martin Uecker <uec...@tugraz.at> wrote:
> > 
> > 
> > 
> > On Monday, 31.03.2025 at 13:59 -0700, Bill Wendling wrote:
> > > > I'd like to offer up this to solve the issues we're facing. This is a
> > > > combination of everything that's been discussed here (or at least that
> > > > I've been able to read in the centi-thread :-).
> > 
> > Thanks! I think this proposal is much better as it avoids undue burden
> > on parsers, but it does not address all my concerns.
> > 
> > 
> > From my side, the issue about compromising the scoping rules of C
> > is also about unintended non-local effects of code changes. In
> > my opinion, a change in a library header elsewhere should not cause 
> > code in a local scope (which itself might also come from a macro) to
> > emit a warning or require a programmer to add a workaround. So I am
> > not convinced that adding warnings or a workaround such as
> > __builtin_global_ref  is a good solution.
> > 
> > 
> > I could see the following as a possible way forward: We only 
> > allow the following two syntaxes:
> > 
> > 1. Single argument referring to a member.
> > 
> > __counted_by(len)
> > 
> > with an argument that must be a single identifier and where
> > the identifier then must refer to a struct member. 
> > 
> > (I still think this is not ideal and potentially
> > confusing, but in contrast to new scoping rules it is
> > at least relatively easy to explain as a special rule.)
> > 
> > 
> > 2. Forward declarations. 
> > 
> > __counted_by(size_t len; len + PADDING)
> 
> In the above, the PADDING is some constant? 

In principle - when considering only the name lookup rules -
it could be a constant, a global variable, or an automatic
variable, i.e. any ordinary identifier that is visible at
this point.
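
As a rough analogy in today's C (not the proposed attribute syntax), a
VLA bound in block scope already mixes all three kinds of ordinary
identifiers. The names PADDING, extra_global, local and bound_demo are
made up for this sketch:

```c
#include <stddef.h>

enum { PADDING = 8 };            /* hypothetical constant */
static size_t extra_global = 2;  /* hypothetical file-scope variable */

/* Returns the size of a VLA whose bound uses a parameter, a constant,
 * a global variable and an automatic variable, all found by ordinary
 * name lookup at the point of the declaration. */
size_t bound_demo(size_t len)
{
    size_t local = 1;            /* automatic variable */
    char scratch[len + PADDING + extra_global + local];
    return sizeof scratch;       /* evaluated at run time for a VLA */
}
```

The forward-declaration form would look up everything after the
semicolon in the same way, except for the forward-declared member(s).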

> 
> More complicated expressions involving globals will not be supported?

I think one could allow such expressions, but they should be
restricted to expressions which have no side effects.

> 
> > where then the second part can also be a more complicated 
> > expression, but with the explicit requirement that all
> > identifiers in this expression are then looked up according to
> > regular C language rules. So except for the forward declared
> > member(s) they are *never* looked up in the member namespace of
> > the struct, i.e. no new name lookup rules are introduced.
> 
> One question here:
> 
> What’s the major issue if we’d like to add one new scoping rule, for example,
> “Structure scope” (the same as the C++’s instance scope) to C? 
> 
> (In addition to the "VLA in structure" issue I mentioned in my previous
> writeup, is there any other issue to prevent this new scoping rule being
> added into C?).

Note that the "VLA in structure" issue is a bit of a red herring.  The exact
same issues apply to lookup of any other ordinary identifier in this context,
for example the enumeration constant used as an array size here:

enum { MY_BUF_SIZE = 100 };
struct foo {
  char buf[MY_BUF_SIZE];
};


C++ has instance scope for member functions. The rules for C++ are also
complex and not very consistent (see the examples I posted earlier,
demonstrating UB and compiler divergence).  For C such a concept would
be new and much less useful, so the trade-off seems unfavorable (in
contrast to C++ where it is needed).  I also see other issues:  Fully
supporting instance scope would require changes to how C is parsed,
placing a burden on all C compilers and tooling. Depending on how you
specify it, it would also cause a change in semantics
for existing code, something C tries very hard to avoid. If you add
warnings as mitigation, this has the problem that it causes non-local
effects where introducing a name in an enclosing scope somewhere else
now necessitates a change to unrelated code, exactly what scoping rules
are meant to prevent.
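
A minimal sketch of the kind of existing code I have in mind. The names
len, packet and packet_buf_size are made up for illustration:

```c
#include <stddef.h>

enum { len = 4 };          /* ordinary identifier, e.g. from some header */

struct packet {
    int len;               /* member: separate name space, no conflict */
    char buf[len];         /* today: the enum constant, so 4 elements */
};

/* Reports the array size that the current lookup rules produce. */
size_t packet_buf_size(void)
{
    return sizeof(((struct packet *)0)->buf);
}
```

Under current rules the array bound is the enumeration constant. If
member names were searched first, the same declaration would refer to
the member instead, so its meaning would change (or it would need a
warning) depending on how the new scope is specified.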

In any case, it seems a major change with many ramifications, including
possibly unintended ones. This should certainly not be done without
having a clear specification and support from WG14 (and probably not
done at all.)

Martin

> 
> Qing
> 
> 
> > 
> > 
> > I think this could address my concerns about breaking
> > scoping in C. Still, I personally would prefer designator syntax
> > for both C and C++ as a nicer solution, and one that already
> > has some support from WG14.
> > 
> > Martin
> > 
> > 
> > > > 
> > > > ---
> > > > 
> > > > 1. The use of '__self' isn't feasible, so we won't use it. Instead,
> > > > we'll rely upon the current behavior—resolving any identifiers to the
> > > > "instance scope". This new scope is used __only__ in attributes, and
> > > > resolves identifiers to those in the least enclosing, non-anonymous
> > > > struct. For example:
> > > > 
> > > > struct foo {
> > > >  char count;
> > > >  struct bar {
> > > >    struct {
> > > >      int len;
> > > >    };
> > > >    struct {
> > > >      struct {
> > > >        int *valid_use __counted_by(len); // Valid.
> > > >      };
> > > >    };
> > > >    int *invalid_use __counted_by(count); // Invalid.
> > > >  } b;
> > > > };
> > > > 
> > > > Rationale: This is how '__guarded_by' currently resolves identifiers,
> > > > so there's precedent. And if we can't force its usage in all
> > > > situations, it's less a feature and more a "nicety" which will lead to
> > > > a massive discrepancy between compiler implementations. Despite the
> > > > fact that this introduces a new scoping mechanism to C, its use is not
> > > > as extensive as C++'s instance scoping and will apply only to
> > > > attributes. In the case where we have two different resolution
> > > > techniques happening within the same structure (e.g. VLAs), we can
> > > > issue warnings as outlined in Yeoul's RFC[1].
> > > > 
> > > > 2. A method of forward declaring variables will be added for variables
> > > > that occur in the struct after the attribute. For example:
> > > > 
> > > > A: Necessary usage:
> > > > 
> > > > struct foo {
> > > >  int *buf __counted_by(char count; count);
> > > >  char count;
> > > > };
> > > > 
> > > > B: Unnecessary, but still valid, usage:
> > > > 
> > > > struct foo {
> > > >  char count;
> > > >  int *buf __counted_by(char count; count);
> > > > };
> > > > 
> > > > * The forward declaration is required in (A) but not in (B).
> > > > * The type of 'count' as declared in '__counted_by' *must* match the real type.
> > > > 
> > > > Rationale: This alleviates the issues of "double parsing" for
> > > > compilers that aren't able to handle it. (We can also remove the
> > > > '-fexperimental-late-parse-attributes' flag in Clang.)
> > > > 
> > > > 3. A new builtin '__builtin_global_ref()' (or similarly named) is
> > > > added to refer to variables outside of the most-enclosing structure.
> > > > Example:
> > > > 
> > > > int count_that_will_never_change_we_promise;
> > > > 
> > > > struct foo {
> > > >  int *bar __counted_by(__builtin_global_ref(count_that_will_never_change_we_promise));
> > > >  unsigned flags;
> > > > };
> > > > 
> > > > As Yeoul pointed out, there isn't a way to refer to variables that
> > > > have been shadowed, so the 'global' in '__builtin_global_ref' is a bit
> > > > of a misnomer as it could refer to a local variable.
> > > > 
> > > > Rationale: For those who need the flexibility to use variables outside
> > > > of the struct, this is an acceptable escape route. It does make bounds
> > > > checking less strict, though, as we can't track any modifications to
> > > > the global, so caution must be used.
> > > > 
> > > > Bonus suggestion (by yours truly):
> > > > 
> > > > I'd like the option to allow functions to calculate expressions (it
> > > > can be used for a single identifier too, but that's too heavy-handed).
> > > > It won't be required for an expression, but is a good way to avoid any
> > > > issues regarding '__builtin_global_ref', like variables shadowing the
> > > > global variable. Example:
> > > > 
> > > > int global;
> > > > 
> > > > struct foo;
> > > > static int counted_by_calc(struct foo *);
> > > > 
> > > > struct foo {
> > > >  char count;
> > > >  int fnord;
> > > >  int *buf __counted_by(counted_by_calc);
> > > > };
> > > > 
> > > > static int counted_by_calc(struct foo *ptr) __attribute__((pure)) {
> > > >  return ptr->count * (global << 42) - ptr->fnord;
> > > > }
> > > > 
> > > > A pointer to the current least enclosing, non-anonymous struct is
> > > > passed into 'counted_by_calc' by the compiler.
> > > > 
> > > > Rationale: This gets rid of all ambiguities when calculating an
> > > > expression. It's marked 'pure' so there should be no side-effects.
> > > > 
> > > > ---
> > > > 
> > > > I believe these suggestions cover everything we've discussed. Please
> > > > comment with anything I missed and your opinions on each.
> > > > 
> > > > [1] 
> > > > https://discourse.llvm.org/t/rfc-forward-referencing-a-struct-member-within-bounds-annotations/85510
> > > > 
> > > > Share and enjoy!
> > > > -bw
> > 
> > 
> 

-- 
Univ.-Prof. Dr. rer. nat. Martin Uecker
Graz University of Technology
Institute of Biomedical Imaging

