On Thu, Mar 13, 2025 Martin Uecker <uec...@tugraz.at> wrote:

> ...
>
> So it seems to be a possible way forward while avoiding
> language divergence and without introducing anything too novel
> in either language.
>
> (But others still have concerns about .n and prefer __self__.)


     I would like to gently push back on __self__, or __self, or self,
because all of these are fairly common identifiers in code. When I was
writing the paper for __self_func (
https://thephd.dev/_vendor/future_cxx/papers/C%20-%20__self_func.html ), I
searched GitHub and other source code indexing and repository services:
__self, __self__, and self all have a substantial number of uses. If
there's an alternative spelling to consider, I think that would be helpful.

    I would also like to offer that other people have approached me about
`::` as a way to help disambiguate identifiers and prevent local shadowing
in macros ( see: https://github.com/ThePhD/future_cxx/issues/69 ). However,
I don't think it helps with the case of this GCC extension:

int main () {
    int n = 1; // a local variable n
    struct foo {
        int n;     // a member variable n
        int a[n + 10];  // for VLA, this n refers to the local variable n.
        //char *b __attribute__ ((counted_by(n + 10)))
        // for counted_by, this n refers to the member variable n.
    };
}

      If you use `::n`, this allows you to reference a global variable. But
the contentious `n` here isn't a global variable, it's a local. So it's not
of much help here. If you stack another "n" at the global scope, you then
have another problem:

extern int n;
int main () {
    int n = 1; // a local variable n, shadows global
    struct foo {
        int n;     // a member variable n
        int a[n + 10];  // for VLA, this n refers to the local variable n.
        //char *b __attribute__ ((counted_by(n + 10)))
        // for counted_by, this n refers to the member variable n.
    };
}

Now, even if you use C++-style `::n` together with the proposed
context-sensitive lookup rules, it becomes impossible to refer to the local
variable declared outside of the struct without additional annotation. You
get the opposite problem with `${KEYWORD}.n` (${KEYWORD} as a placeholder
for __self, since I still have the above-named problems with __self): it
enables referring to the structure member with ${KEYWORD}, and the local
variable with nothing, but leaves the global variable impossible to
reference. Part of this problem is self-inflicted: VLAs in structures are a
GNU extension and not an ISO C feature (for reasons like this one). But
it's still technically a problem, and we can't necessarily step on GCC's
latitude to make an extension in this space, so whatever we come up with
will have both problems to fix.
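
     For reference, the shadowing half of this can be reproduced in plain
ISO C today, with no extension involved. The sketch below is only an
illustration (the function name local_then_global is made up): it shows the
one existing way to reach a shadowed global, a block-scope `extern`
redeclaration. That workaround needs a block to hold the declaration, and a
struct member list offers no such place, which is presumably part of the
appeal of a spelling like `::n`.

```c
int n = 42; /* file-scope n */

/* Hypothetical helper: reads both the local n and the shadowed global n. */
int local_then_global(void) {
    int n = 1;            /* local n, shadows the file-scope n */
    int seen_local = n;   /* reads the local: 1 */
    int seen_global;
    {
        extern int n;     /* redeclares the file-scope n in this block */
        seen_global = n;  /* reads the global: 42 */
    }
    return seen_local * 100 + seen_global;
}
```

Calling local_then_global() yields 142: the outer block sees the local, the
inner block sees the global.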

     I see 2 plausible ways forward, though I've only thought about this
for 4 days:

     (0) Accept that Yeoul (and the others) are correct in that issuing an
error (diagnostic) for this case would be better. Effectively, it's just
bad code and you ask the user to change the local variable from e.g. `n`,
which is something they should have control over (theoretically). Then,
standardize `::n` to refer to the global. The local variable could have a
different name, the name in the structure might be similar to a global (but
is found by counted_by's lookup), and the global variable has to be named
explicitly with `::n`. This does not necessarily solve the forward
reference problem, but all solutions proposed here require delayed
resolution (especially to deal with the common struct case), so this seems
like a moot point in general.
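
     A rough sketch of what (0) looks like from the user's side, assuming a
compiler that may or may not provide the counted_by attribute. COUNTED_BY,
struct packet and packet_new are hypothetical names for illustration; the
macro expands to nothing when the attribute is unavailable. The `::n`
spelling is not valid C today, so it appears only in a comment.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrapper: use counted_by only where the compiler has it. */
#if defined(__has_attribute)
#  if __has_attribute(counted_by)
#    define COUNTED_BY(member) __attribute__((counted_by(member)))
#  endif
#endif
#ifndef COUNTED_BY
#  define COUNTED_BY(member) /* attribute unavailable: expands to nothing */
#endif

struct packet {
    size_t n;                   /* member n: the element count */
    char data[] COUNTED_BY(n);  /* plain n is looked up as the member */
    /* Under (0), a global n would have to be spelled ::n (hypothetical). */
};

/* The would-be conflicting local is named count instead of n. */
struct packet *packet_new(size_t count) {
    struct packet *p = malloc(sizeof *p + count);
    if (p != NULL) {
        p->n = count;           /* set the count before touching data */
    }
    return p;
}
```

The cost of (0) is visible in packet_new: the local is called count rather
than n, which is the renaming the diagnostic would ask for.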

     (1) Accept that we need ${KEYWORD}, or ${DOT}, to refer to members.
This does not solve the problem where a local variable shadows a global
variable, so even if this path is taken I would still suggest `::n` to go
with it, so that we can solve the problem where a local variable shadows a
global variable. Then there's no new real "lookup rule", so people who feel
like we're violating C's core design space might feel less uneasy because
you have to use the new syntax (a keyword or `.`) to access in-struct
things. This still has a forward reference problem, so it's once again moot
whether or not the forward reference problem can be solved here.

     The (0) solution can be seen as more "natural"; there are no dots, no
keyword, but it requires a potential change in local variables for
conflicting cases. `::global` comes along for the ride as the way to
separate member fields from globals. I could see this working and, as I
understand it, this is the choice Clang is currently progressing with (?).

     The (1) solution can be seen as less "natural"; it requires extra
syntax to say what is, overwhelmingly, the common use case and ISO
standard-supported use case to make way for a pathological GNU extension in
VLA members. It becomes a bit more natural if you use ${DOT}, rather than
${KEYWORD}, thanks to designated initializers being in both standard ISO C
and C++ now.
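
     The designated-initializer precedent is easy to show in standard C:
inside an initializer list, `.n` can only name a member, while a bare `n`
still names the local. This is only a sketch of where the spelling comes
from (struct foo's layout and the function foo_total are made up); using
`.n` in an array bound or attribute is what is being proposed, not
something that exists today.

```c
struct foo {
    int n;
    int total;
};

/* Hypothetical helper: mixes the member designator .n with a local n. */
int foo_total(void) {
    int n = 10;                                 /* local n */
    struct foo f = { .n = 3, .total = n + 1 };  /* .n: member, n: local */
    return f.n * 100 + f.total;
}
```

foo_total() returns 311: the member n is 3 and total is computed from the
local n as 10 + 1.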

     An additional solution that has been proposed (but the author dropped
the proposal) is _Outer. The proposal was primarily about macros, but it
applies to this situation too (
https://www.open-std.org/JTC1/SC22/WG14/www/docs/n2679.pdf ). You could use
(0) + _Outer as a means of annotating the pathological case, and diagnose
(error) when a local variable and a member field have the same name. This
would also get you over the finish line, without needing to change the name
of a local C variable. It would also not require you to add _Outer until
you write problematic code.

      I'm sure that this is not helpful, as I'm just sort of stating a
bunch of different ways to solve the problem without really doing any
complex analysis. I think it was a mistake that, when VLA syntax inside of
structures was first introduced, it used local variables rather than member
variables to determine the array's size, and that mistake is now having bad
consequences for this discussion. My preference for solutions is (0), then
(1), but this is only a reflection of personal expectation. It's also
colored by having to think about this problem for __counted_by / bounds
attributes for function parameters, which face similar issues in choosing
between a parameter name and a global variable. I think `_Outer` would be
helpful if neither (0) nor (1) finds traction, as an agreeable middle
ground that has other applications.

     I think that, at least for array syntax and attribute syntax, some
amount of delayed resolution (in structures and parameter lists) would be
both expedient and wise.
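
     To make the parameter-list side concrete: today's lookup is strictly
left to right, so whether `n` in an array declarator means the parameter or
a global depends only on declaration order. The sketch below assumes a
compiler with VLA support; the function names and the global's value are
made up for illustration.

```c
#include <stddef.h>

int n = 4; /* file-scope n, deliberately colliding with the parameter name */

/* n is declared before the declarator, so [n] uses the parameter. */
size_t row_bytes_param(int n, int (*row)[n]) {
    (void)n;
    return sizeof *row;  /* parameter n * sizeof(int) */
}

/* n is declared after the declarator, so [n] uses the file-scope n. */
size_t row_bytes_global(int (*row)[n], int n) {
    (void)n;             /* the parameter plays no part in the array type */
    return sizeof *row;  /* file-scope n * sizeof(int) */
}
```

With delayed resolution, the second form could be made to mean the
parameter; without it, the second form silently picks up the global.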

Sincerely,
JeanHeyd
