On Tuesday, 01.04.2025 at 15:01 +0000, Qing Zhao wrote:
>
> > On Apr 1, 2025, at 10:04, Martin Uecker <uec...@tugraz.at> wrote:
> >
> > On Monday, 31.03.2025 at 13:59 -0700, Bill Wendling wrote:
> > > > I'd like to offer up this to solve the issues we're facing. This is a
> > > > combination of everything that's been discussed here (or at least that
> > > > I've been able to read in the centi-thread :-).
> >
> > Thanks! I think this proposal is much better as it avoids undue burden
> > on parsers, but it does not address all my concerns.
> >
> > From my side, the issue about compromising the scoping rules of C
> > is also about unintended non-local effects of code changes. In
> > my opinion, a change in a library header elsewhere should not cause
> > code in a local scope (which itself might also come from a macro) to
> > emit a warning or require a programmer to add a workaround. So I am
> > not convinced that adding warnings or a workaround such as
> > __builtin_global_ref is a good solution.
> >
> > I could see the following as a possible way forward: We only
> > allow the following two syntaxes:
> >
> > 1. Single argument referring to a member.
> >
> > __counted_by(len)
> >
> > with an argument that must be a single identifier and where
> > the identifier then must refer to a struct member.
> >
> > (I still think this is not ideal and potentially
> > confusing, but in contrast to new scoping rules it is
> > at least relatively easy to explain as a special rule.)
> >
> > 2. Forward declarations.
> >
> > __counted_by(size_t len; len + PADDING)
>
> In the above, the PADDING is some constant?
In principle - when considering only the name lookup rules -
it could be a constant, a global variable, or an automatic
variable, i.e. any ordinary identifier which is visible at
this point.

>
> More complicated expressions involving globals will not be supported?

I think one could allow such expressions, but I think the
expressions should be restricted to expressions which have
no side effects.

>
> > where then the second part can also be a more complicated
> > expression, but with the explicit requirement that all
> > identifiers in this expression are then looked up according to
> > regular C language rules. So except for the forward declared
> > member(s) they are *never* looked up in the member namespace of
> > the struct, i.e. no new name lookup rules are introduced.
>
> One question here:
>
> What’s the major issue if we’d like to add one new scoping rule, for example,
> “Structure scope” (the same as the C++’s instance scope) to C?
>
> (In addition to the "VLA in structure" issue I mentioned in my previous
> writeup, is there any other issue to prevent this new scoping rule being
> added into C?)

Note that the "VLA in structure" is a bit of a red herring. The exact same
issues apply to lookup of any other ordinary identifiers in this context.

enum { MY_BUF_SIZE = 100 };
struct foo {
    char buf[MY_BUF_SIZE];
};

C++ has instance scope for member functions. The rules for C++ are also
complex and not very consistent (see the examples I posted earlier,
demonstrating UB and compiler divergence). For C such a concept would
be new and much less useful, so the trade-off seems unfavorable (in
contrast to C++ where it is needed).

I also see other issues: Fully supporting instance scope would require
changes to how C is parsed, placing a burden on all C compilers and
tooling. Depending on how you specify it, it would also cause a change
in semantics for existing code, something C tries very hard to avoid.
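
To make the "change in semantics for existing code" point concrete, here
is a minimal sketch. It is not code from this thread; the names 'len',
'struct packet' and 'buf_size' are hypothetical and chosen only for
illustration. In C as it stands today, member names live in their own
name space, so the array bound below is resolved as an ordinary
identifier and finds the enumeration constant. A general instance scope
would presumably make the same 'len' name the member instead, which is
the kind of silent change in meaning described above.

```c
#include <stddef.h>

/* Hypothetical names, for illustration only. */
enum { len = 4 };        /* ordinary identifier at file scope */

struct packet {
    int len;             /* member: separate name space in C */
    char buf[len];       /* ordinary lookup: the enum constant 4,
                            not the member declared just above */
};

/* Reports the size the compiler actually gave to 'buf'. */
static size_t buf_size(void)
{
    struct packet p = {0};
    return sizeof p.buf;
}
```

Compiled as C, 'buf' has four elements. Note that this sketch is only
valid as C; C++ treats the member declaration differently.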
If you add warnings as mitigation, it has the problem that it causes
non-local effects where introducing a name in an enclosing scope
somewhere else now necessitates a change to unrelated code, exactly
what scoping rules are meant to prevent.

In any case, it seems a major change with many ramifications, including
possibly unintended ones. This should certainly not be done without
having a clear specification and support from WG14 (and probably not
done at all).

Martin

>
> Qing
>
> > I think this could address my concerns about breaking
> > scoping in C. Still, I personally would prefer designator syntax
> > for both C and C++ as a nicer solution, and one that already
> > has some support from WG14.
> >
> > Martin
> >
> > > > ---
> > > >
> > > > 1. The use of '__self' isn't feasible, so we won't use it. Instead,
> > > > we'll rely upon the current behavior: resolving any identifiers to the
> > > > "instance scope". This new scope is used __only__ in attributes, and
> > > > resolves identifiers to those in the least enclosing, non-anonymous
> > > > struct. For example:
> > > >
> > > > struct foo {
> > > >     char count;
> > > >     struct bar {
> > > >         struct {
> > > >             int len;
> > > >         };
> > > >         struct {
> > > >             struct {
> > > >                 int *valid_use __counted_by(len); // Valid.
> > > >             };
> > > >         };
> > > >         int *invalid_use __counted_by(count); // Invalid.
> > > >     } b;
> > > > };
> > > >
> > > > Rationale: This is how '__guarded_by' currently resolves identifiers,
> > > > so there's precedent. And if we can't force its usage in all
> > > > situations, it's less a feature and more a "nicety" which will lead to
> > > > a massive discrepancy between compiler implementations. Despite the
> > > > fact that this introduces a new scoping mechanism to C, its use is not
> > > > as extensive as C++'s instance scoping and will apply only to attributes.
> > > > In the case where we have two different resolution techniques happening
> > > > within the same structure (e.g. VLAs), we can issue warnings as outlined
> > > > in Yeoul's RFC[1].
> > > >
> > > > 2. A method of forward declaring variables will be added for variables
> > > > that occur in the struct after the attribute. For example:
> > > >
> > > > A: Necessary usage:
> > > >
> > > > struct foo {
> > > >     int *buf __counted_by(char count; count);
> > > >     char count;
> > > > };
> > > >
> > > > B: Unnecessary, but still valid, usage:
> > > >
> > > > struct foo {
> > > >     char count;
> > > >     int *buf __counted_by(char count; count);
> > > > };
> > > >
> > > > * The forward declaration is required in (A) but not in (B).
> > > > * The type of 'count' as declared in '__counted_by' *must* match the
> > > >   real type.
> > > >
> > > > Rationale: This alleviates the issues of "double parsing" for
> > > > compilers that aren't able to handle it. (We can also remove the
> > > > '-fexperimental-late-parse-attributes' flag in Clang.)
> > > >
> > > > 3. A new builtin '__builtin_global_ref()' (or similarly named) is
> > > > added to refer to variables outside of the most-enclosing structure.
> > > > Example:
> > > >
> > > > int count_that_will_never_change_we_promise;
> > > >
> > > > struct foo {
> > > >     int *bar
> > > >         __counted_by(__builtin_global_ref(count_that_will_never_change_we_promise));
> > > >     unsigned flags;
> > > > };
> > > >
> > > > As Yeoul pointed out, there isn't a way to refer to variables that
> > > > have been shadowed, so the 'global' in '__builtin_global_ref' is a bit
> > > > of a misnomer as it could refer to a local variable.
> > > >
> > > > Rationale: For those who need the flexibility to use variables outside
> > > > of the struct, this is an acceptable escape route. It does make bounds
> > > > checking less strict, though, as we can't track any modifications to
> > > > the global, so caution must be used.
> > > >
> > > > Bonus suggestion (by yours truly):
> > > >
> > > > I'd like the option to allow functions to calculate expressions (it
> > > > can be used for a single identifier too, but that's too heavy-handed).
> > > > It won't be required for an expression, but is a good way to avoid any
> > > > issues regarding '__builtin_global_ref', like variables shadowing the
> > > > global variable. Example:
> > > >
> > > > int global;
> > > >
> > > > struct foo;
> > > > static int counted_by_calc(struct foo *);
> > > >
> > > > struct foo {
> > > >     char count;
> > > >     int fnord;
> > > >     int *buf __counted_by(counted_by_calc);
> > > > };
> > > >
> > > > static int counted_by_calc(struct foo *ptr) __attribute__((pure)) {
> > > >     return ptr->count * (global << 42) - ptr->fnord;
> > > > }
> > > >
> > > > A pointer to the current least enclosing, non-anonymous struct is
> > > > passed into 'counted_by_calc' by the compiler.
> > > >
> > > > Rationale: This gets rid of all ambiguities when calculating an
> > > > expression. It's marked 'pure' so there should be no side-effects.
> > > >
> > > > ---
> > > >
> > > > I believe these suggestions cover everything we've discussed. Please
> > > > comment with anything I missed and your opinions on each.
> > > >
> > > > [1]
> > > > https://discourse.llvm.org/t/rfc-forward-referencing-a-struct-member-within-bounds-annotations/85510
> > > >
> > > > Share and enjoy!
> > > > -bw

--
Univ.-Prof. Dr. rer. nat. Martin Uecker
Graz University of Technology
Institute of Biomedical Imaging