RE: v2.1 Draft for a lengthof paper
Sorry for top-posting, my work account is stuck on Outlook. :-/

> For a WG14 paper you should add these findings to support that choice.
> Another option would be for WG14 to standardize the then existing
> implementation with the double underscores.

+1, it's always good to explain prior art and existing uses as part of the paper. However, please also point out that C++ has prior art as well, which is slightly different and very much worth considering: they have one API for getting the array's rank, and another for getting a specific rank's extent. This is a general solution that doesn't require the programmer to have deep knowledge of C's declarator syntax and how it relates to multidimensional arrays.

That said, I suspect WG14 would not be keen on standardizing `lengthof` without an ugly keyword given that there are plenty of other uses of it that would break:

https://sourcegraph.com/github.com/illumos/illumos-gate/-/blob/usr/src/cmd/mailx/names.c?L53-55
https://sourcegraph.com/github.com/Rockbox/rockbox/-/blob/tools/ipod_fw.c?L292-294
https://sourcegraph.com/github.com/OpenSmalltalk/opensmalltalk-vm/-/blob/src/spur64.stack/validImage.c?L7014-7018
(and many, many others)

>> > As for the parentheses, I personally think lengthof should follow
>> > similar rules compared to sizeof.
>>
>> I think most people agree with this.
>
> I still don't, in particular not for standardisation.
>
> We have to remember that there are many small C compilers out there.

Those compilers already have to handle parsing this for sizeof, so that's not particularly compelling (even if we wanted to design C for the lowest common denominator of implementation effort, which I'm not convinced is a good approach these days).

That said, if we went with a rank/extent design, I think we'd *have* to use parens because the extent interface would take two operands (the array and the rank you're interested in getting the extent of) and it would be inconsistent for the rank interface to then not require parens. ~Aaron -Original Message- From: Jens Gustedt Sent: Wednesday, August 14, 2024 2:11 AM To: Alejandro Colomar ; Xavier Del Campo Romero Cc: Gcc Patches ; Daniel Plakosh ; Martin Uecker ; Joseph Myers ; Gabriel Ravier ; Jakub Jelinek ; Kees Cook ; Qing Zhao ; David Brown ; Florian Weimer ; Andreas Schwab ; Timm Baeder ; A. Jiang ; Eugene Zelenko ; Ballman, Aaron Subject: Re: v2.1 Draft for a lengthof paper Am 14. August 2024 01:27:33 MESZ schrieb Alejandro Colomar : > Hi Xavier, > > On Wed, Aug 14, 2024 at 12:38:53AM GMT, Xavier Del Campo Romero wrote: > > I have been overseeing these last emails - > > Ahhh, good to know; thanks! :) > > > thank you very much for your > > efforts, Alex! > > :-) > > > I did not reply until now because I do not have prior experience > > with gcc internals, so my feedback would probably have not been that > > useful. > > Ok. > > > Those emails from 2020 were in fact discussing two completely > > different proposals at once: > > > > 1. Add _Lengthof + #include 2. Allow static > > qualifier on compound literals > > Yup. > > > Whereas proposal #2 made it into C23 (kudos to Jens Gustedt!), and > > as you already know by now, proposal #1 received some negative > > feedback, suggesting _Typeof/typeof + some macro magic as a > > pragmatic workaround instead. > > The original author of that negative feedback talked to me in private > a week ago, and said he likes my proposal. We have no negative > feedback anymore. :) > > > Since the proposal did not get much traction and I would had been > > unable to contribute to gcc myself, I just gave up on it. IIRC the > > deadline for new proposals closed soon after, anyway. > > Ok. 
> > > But I am glad that someone with proper experience took the initiative. > > Fun fact: this is my second non-trivial patch to GCC. I wouldn't say > I had the proper experience with GCC internals when I started this > patch set. But I'm unemployed at the moment, which gives me all the > time I need for learning those. :) > > > I still think the proposal is relevant and has interesting use cases. > > > > > I have only added lengthof for now, not _Lengthof, as suggested by Jens. > > > Depending on feedback, I'll propose the uglified version. > > > > Probably, all of us know why the uglified version is the usual > > approach preferred by the C standard: we do not know how many > > applications would break otherwise. > > Yup. > > > However, we see that this trend is now changing with C23, so > > probably it makes sense to d
RE: v2.1 Draft for a lengthof paper
> I think that this argument falls short. E.g., implementations that already
> have compound expressions (or lambdas ;-) may provide a quality
> implementation using `static_assert` and `typeof` alone, and don't have to
> touch their compiler at all.
>
> We should not impose an implementation in the language where doing it in a
> header can be completely sufficient.

But can doing this in a header be completely sufficient in practice? E.g., the user who passes a pointer rather than an array is in for quite a surprise, or passing a struct, or passing a FAM, etc. If we want to put constraints on the interface, that may be more challenging to do from a header file than from the compiler. offsetof is a cautionary tale in that compilers that want a reasonable QoI basically all implement this as a builtin rather than the header-only version.

> Plus, implementing as a macro in a header (probably ) makes also a
> feature test, for those applications that already have something similar.
> this was basically what we did for `unreachable` and I think it worked out
> fine.

True!

I'm still thinking on how important rank + extent is vs overall array length. If C had constexpr functions, then I'd almost certainly want array rank and extent to be the building blocks and then lengthof can be a constexpr function looping over rank and summing extents. But we don't have that yet, and "bird hand" vs "bird in bush"... :-D

~Aaron

-----Original Message-----
From: Jens Gustedt
Sent: Wednesday, August 14, 2024 8:18 AM
To: Ballman, Aaron; Alejandro Colomar; Xavier Del Campo Romero
Cc: Gcc Patches; Daniel Plakosh; Martin Uecker; Joseph Myers; Gabriel Ravier; Jakub Jelinek; Kees Cook; Qing Zhao; David Brown; Florian Weimer; Andreas Schwab; Timm Baeder; A. Jiang; Eugene Zelenko
Subject: RE: v2.1 Draft for a lengthof paper

Hi Aaron,

On 14 August 2024 at 13:31:19 CEST, "Ballman, Aaron" wrote:
> Sorry for top-posting, my work account is stuck on Outlook.
:-/ > > > For a WG14 paper you should add these findings to support that choice. > > Another option would be for WG14 to standardize the then existing > > implementation with the double underscores. > > +1, it's always good to explain prior art and existing uses as part of the > paper. However, please also point out that C++ has a prior art as well which > is slightly different and very much worth considering: they have one API for > getting the array's rank, and another for getting a specific rank's extent. > This is a general solution that doesn't require the programmer to have deep > knowledge of C's declarator syntax and how it relates to multidimensional > arrays. > > That said, I suspect WG14 would not be keen on standardizing `lengthof` > without an ugly keyword given that there are plenty of other uses of it that > would break: > > https://sourcegraph.com/github.com/illumos/illumos-gate/-/blob/usr/src > /cmd/mailx/names.c?L53-55 > https://sourcegraph.com/github.com/Rockbox/rockbox/-/blob/tools/ipod_f > w.c?L292-294 > https://sourcegraph.com/github.com/OpenSmalltalk/opensmalltalk-vm/-/bl > ob/src/spur64.stack/validImage.c?L7014-7018 > (and many, many others) > > >> > As for the parentheses, I personally think lengthof should follow > >> > similar rules compared to sizeof. > >> > >> I think most people agree with this. > > > > I still don't, in particular not for standardisation. > > > > We have to remember that there are many small C compilers out there. > > Those compilers already have to handle parsing this for sizeof, so that's not > particularly compelling (even if we wanted to design C for the lowest common > denominator of implementation effort, which I'm not convinced is a good > approach these days). 
That said, if we went with a rank/extent design, I > think we'd *have* to use parens because the extent interface would take two > operands (the array and the rank you're interested in getting the extent of) > and it would be inconsistent for the rank interface to then not require > parens. I think that this argument goes too short. E. g. implementation that already have compound expressions (or lambdas ;-) may provide a quality implementation using `static_assert` and `typeof` alone, and don't have to touch their compiler at all. We should not impose an implementation in the language where doing it in a header can be completely sufficient. Plus, implementing as a macro in a header (probably ) makes also a feature test, for those applications that already have something similar. this was basically what we did for `unreachable` and I think it worked out fine. Jens > ~A
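To make the header-only argument concrete, here is a minimal sketch of a macro that rejects pointers at compile time. It is not taken from the paper or from anyone's patch: the names `IS_ARRAY` and `LENGTHOF` are hypothetical, and the check relies on the GNU builtin `__builtin_types_compatible_p` and on `__typeof__`, which is an assumption about the compiler (GCC and Clang provide both).

```c
#include <assert.h>   /* static_assert (a macro before C23) */
#include <stddef.h>

/* Hypothetical helper: 1 if A has array type, 0 if it is a pointer.
 * &(A)[0] always has pointer-to-element type, while __typeof__(A)
 * keeps the array type, so the two only match for pointers. */
#define IS_ARRAY(A) \
    (!__builtin_types_compatible_p(__typeof__(A), __typeof__(&(A)[0])))

/* Hypothetical header-only lengthof: the usual sizeof division plus a
 * static_assert hidden in an unnamed struct that contributes 0. */
#define LENGTHOF(A)                                              \
    (sizeof(A) / sizeof((A)[0]) +                                \
     0 * sizeof(struct {                                         \
         static_assert(IS_ARRAY(A), "LENGTHOF: not an array");   \
         char unused_;                                           \
     }))

static int table[7];

/* Number of elements of the outer (and only) dimension of table. */
size_t table_length(void)
{
    return LENGTHOF(table);
}
```

Passing a pointer, for example `LENGTHOF(p)` with `int *p`, is expected to fail the `static_assert` instead of silently dividing `sizeof(int *)` by `sizeof(int)`. Whether diagnostics of this quality are good enough is exactly the QoI question raised in the thread.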
RE: v2.1 Draft for a lengthof paper
> What regex did you use for searching? I went cheap and easy rather than trying to narrow down: https://sourcegraph.com/search?q=context:global+lang:C+lengthof&patternType=regexp&sm=0 > I was thinking of renaming the proposal to elementsof(), to avoid confusion > between length of an array and length of a string. Would you mind checking > if elementsof() is ok? From what I was seeing, it looks to be used more uniformly as a function-like macro accepting a single argument. ~Aaron -Original Message- From: Alejandro Colomar Sent: Wednesday, August 14, 2024 8:58 AM To: Jens Gustedt ; Ballman, Aaron Cc: Xavier Del Campo Romero ; Gcc Patches ; Daniel Plakosh ; Martin Uecker ; Joseph Myers ; Gabriel Ravier ; Jakub Jelinek ; Kees Cook ; Qing Zhao ; David Brown ; Florian Weimer ; Andreas Schwab ; Timm Baeder ; A. Jiang ; Eugene Zelenko Subject: Re: v2.1 Draft for a lengthof paper Hi Aaron, Jens, On Wed, Aug 14, 2024 at 02:17:52PM GMT, Jens Gustedt wrote: > Am 14. August 2024 13:31:19 MESZ schrieb "Ballman, Aaron" > : > > Sorry for top-posting, my work account is stuck on Outlook. :-/ > > > > > For a WG14 paper you should add these findings to support that choice. > > > Another option would be for WG14 to standardize the then existing > > > implementation with the double underscores. > > > > +1, it's always good to explain prior art and existing uses as part > > of the paper. However, please also point out that C++ has a prior > > art as well which is slightly different and very much worth > > considering: they have one API for getting the array's rank, and > > another for getting a specific rank's extent. This is a general > > solution that doesn't require the programmer to have deep knowledge > > of C's declarator syntax and how it relates to multidimensional > > arrays. I have added that to my draft. I'll publish it soon as a reply to the GCC mailing list. See below for details of what I have added for now. 
> > > > That said, I suspect WG14 would not be keen on standardizing > > `lengthof` without an ugly keyword given that there are plenty of other > > uses of it that would break: > > > > https://sourcegraph.com/github.com/illumos/illumos-gate/-/blob/usr/s > > rc/cmd/mailx/names.c?L53-55 > > https://sourcegraph.com/github.com/Rockbox/rockbox/-/blob/tools/ipod > > _fw.c?L292-294 > > https://sourcegraph.com/github.com/OpenSmalltalk/opensmalltalk-vm/-/ > > blob/src/spur64.stack/validImage.c?L7014-7018 > > (and many, many others) What regex did you use for searching? I was thinking of renaming the proposal to elementsof(), to avoid confusion between length of an array and length of a string. Would you mind checking if elementsof() is ok? > > >> > As for the parentheses, I personally think lengthof should > > >> > follow similar rules compared to sizeof. > > >> > > >> I think most people agree with this. > > > > > > I still don't, in particular not for standardisation. > > > > > > We have to remember that there are many small C compilers out there. > > > > Those compilers already have to handle parsing this for sizeof, so > > that's not particularly compelling Agree. I suspect it will be simpler for existing compilers to follow sizeof than to have new syntax. However, it's easy to keep it as a QoI detail, so I've temporarily changed the wording to require parentheses, and let implementations lift that restriction. > > (even if we wanted to design C > > for the lowest common denominator of implementation effort, which > > I'm not convinced is a good approach these days). Off-topic, but I wish that had been the approach when a few implementations (I suspect proprietary vendors; this was never disclosed) rejected redefining NULL as the right thing: (void *) 0. I fixed one of the last free-software implementations of NULL that expanded to 0, and nullptr would probably never have been added if WG14 had not accepted the pressure from such horrible implementations. 
<https://github.com/cc65/cc65/issues/1823> > > That said, if we went with a rank/extent design, I think we'd *have* > > to use parens because the extent interface would take two operands > > (the array and the rank you're interested in getting the extent of) > > and it would be inconsistent for the rank interface to then not > > require parens. Prior art C It is common in C programs to get the number of elements of an array via the usual sizeof division and wrap it in a macro. Common names include: β’
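As an illustration of the "usual sizeof division" and of how it relates to a rank/extent design, here is a small sketch. The macro name `NELEMS` and the `grid` array are hypothetical and chosen only for this example. In C++ the corresponding facilities are `std::rank` and `std::extent` from `<type_traits>`; in C the per-dimension extents have to be spelled out by indexing.

```c
#include <stddef.h>

/* Hypothetical name for the classic macro: elements of the OUTER dimension. */
#define NELEMS(a) (sizeof(a) / sizeof((a)[0]))

static int grid[3][5];

/* Extent of the outer dimension. */
size_t grid_rows(void)
{
    return NELEMS(grid);
}

/* Extent of the next dimension, obtained by indexing once. */
size_t grid_cols(void)
{
    return NELEMS(grid[0]);
}

/* Flattened element count: the product of the per-dimension extents. */
size_t grid_total(void)
{
    return NELEMS(grid) * NELEMS(grid[0]);
}
```

The macro itself only ever sees one dimension, so getting at inner extents requires the programmer to know how the declarator nests, which is the point made above about C's declarator syntax.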
RE: v2.1 Draft for a lengthof paper
> I am currently on a summer bike trip, so not able to provide a full reference
> implementation. But could do so, once I am back.

No need (after thinking on this a bit more, I believe you're right that this can be done in a macro-only implementation; we might not go that route in Clang because of AST matching needs and whatnot, but that's not an issue), but thank you for the offer. Please enjoy your summer bike trip!

> Why would you be looping? lengthof only addresses the outer dimension;
> sizeof would need a loop, no?

Due to poor reading comprehension, I missed in the paper that lengthof works on the outer dimension. I think having a way to get the flattened size of a multidimensional array is a useful feature.

~Aaron

-----Original Message-----
From: Jens Gustedt
Sent: Wednesday, August 14, 2024 9:25 AM
To: Ballman, Aaron; Alejandro Colomar; Xavier Del Campo Romero
Cc: Gcc Patches; Daniel Plakosh; Martin Uecker; Joseph Myers; Gabriel Ravier; Jakub Jelinek; Kees Cook; Qing Zhao; David Brown; Florian Weimer; Andreas Schwab; Timm Baeder; A. Jiang; Eugene Zelenko
Subject: RE: v2.1 Draft for a lengthof paper

On 14 August 2024 at 14:40:41 CEST, "Ballman, Aaron" wrote:
> > I think that this argument falls short. E.g., implementations that
> > already have compound expressions (or lambdas ;-) may provide a quality
> > implementation using `static_assert` and `typeof` alone, and don't have to
> > touch their compiler at all.
> >
> > We should not impose an implementation in the language where doing it in a
> > header can be completely sufficient.
>
> But can doing this in a header be completely sufficient in practice?

I think so.

> e.g., the user who passes a pointer rather than an array is in for quite a
> surprise, or passing a struct, or passing a FAM, etc. If we want to put
> constraints on the interface, that may be more challenging to do from a
> header file than from the compiler.
> offsetof is a cautionary tale in that
> compilers that want a reasonable QoI basically all implement this as a
> builtin rather than the header-only version.

Yes, with the tools that I listed and the ideas that are already in the paper you can basically do all that, including giving valuable feedback in case of failure.

I am currently on a summer bike trip, so not able to provide a full reference implementation. But I could do so once I am back.

> > Plus, implementing as a macro in a header (probably ) makes also
> > a feature test, for those applications that already have something similar.
> > this was basically what we did for `unreachable` and I think it worked out
> > fine.
>
> True!
>
> I'm still thinking on how important rank + extent is vs overall array
> length. If C had constexpr functions, then I'd almost certainly want
> array rank and extent to be the building blocks and then lengthof can
> be a constexpr function looping over rank and summing extents. But we
> don't have that yet, and "bird hand" vs "bird in bush"... :-D

Why would you be looping? lengthof only addresses the outer dimension; sizeof would need a loop, no?

Generally I would be opposed to imposing a complicated solution for a simple feature.

Jens

>
> ~Aaron
>
> -----Original Message-----
> From: Jens Gustedt
> Sent: Wednesday, August 14, 2024 8:18 AM
> To: Ballman, Aaron; Alejandro Colomar; Xavier Del Campo Romero
> Cc: Gcc Patches; Daniel Plakosh; Martin Uecker; Joseph Myers; Gabriel
> Ravier; Jakub Jelinek; Kees Cook; Qing Zhao; David Brown; Florian Weimer;
> Andreas Schwab; Timm Baeder; A. Jiang; Eugene Zelenko
> Subject: RE: v2.1 Draft for a lengthof paper
>
> Hi Aaron,
>
> On 14 August 2024 at 13:31:19 CEST, "Ballman, Aaron" wrote:
> > Sorry for top-posting, my work account is stuck on Outlook. :-/
> >
> > > For a WG14 paper you should add these findings to support that choice.
> > > Another option would be for WG14 to standardize the then existing > > > implementation with the double underscores. > > > > +1, it's always good to explain prior art and existing uses as part of the > > paper. However, please also point out that C++ has a prior art as well > > which is slightly different and very much worth considering: they have one > > API for getting the array's rank, and another for getting a specific rank's > > extent. This is a general solution that doesn't require the programmer to > > have deep knowledge of C's declarator syntax and how it relates to > > multidimensional arrays. > > > > That said, I suspect WG14 would no
RE: v2.1 Draft for a lengthof paper
> Ahh, context:global seems to be what I wanted. Where is that documented?

For me it is the default when I go to https://sourcegraph.com/search but there's documentation at https://sourcegraph.com/docs/code-search/working/search_contexts

> Thanks! I'll rename it to elementsof().

Rather than renaming it, I'd say that the name chosen in the proposed text is a placeholder, and have a section in the prose that describes different naming choices, pros and cons, suggests a name from you as the author, but asks WG14 to pick the final name. I know Jens mentioned he doesn't like the name `elementsof` and I suspect if we ask five more people we'll get about seven more opinions on what the name could/should be.

~Aaron

-----Original Message-----
From: Alejandro Colomar
Sent: Wednesday, August 14, 2024 10:00 AM
To: Ballman, Aaron
Cc: Jens Gustedt; Xavier Del Campo Romero; Gcc Patches; Daniel Plakosh; Martin Uecker; Joseph Myers; Gabriel Ravier; Jakub Jelinek; Kees Cook; Qing Zhao; David Brown; Florian Weimer; Andreas Schwab; Timm Baeder; A. Jiang; Eugene Zelenko
Subject: Re: v2.1 Draft for a lengthof paper

Hi Aaron,

On Wed, Aug 14, 2024 at 01:21:18PM GMT, Ballman, Aaron wrote:
> > What regex did you use for searching?
>
> I went cheap and easy rather than trying to narrow down:
> https://sourcegraph.com/search?q=context:global+lang:C+lengthof&patternType=regexp&sm=0

Ahh, context:global seems to be what I wanted. Where is that documented?

> > I was thinking of renaming the proposal to elementsof(), to avoid confusion
> > between length of an array and length of a string. Would you mind checking
> > if elementsof() is ok?
>
> From what I was seeing, it looks to be used more uniformly as a
> function-like macro accepting a single argument.

Thanks! I'll rename it to elementsof().

Cheers,
Alex

> ~Aaron

--
<https://www.alejandro-colomar.es/>
RE: v2.1 Draft for a lengthof paper
> I would love to see a proposal for adding this GNU extension to ISO C.
> Did nobody do it yet? I could try to, if I find some time. (But I'll take a
> longish time for that; if anyone else does it, it would be great.)

It's been discussed but hasn't moved forward because there are design issues with it (the odd way in which it produces a resulting value, sometimes surprising behavior with how it interacts with flow control, the fact that it can't be used in all contexts, etc). The committee was leaning more towards lambdas despite those being a bit orthogonal.

~Aaron

-----Original Message-----
From: Alejandro Colomar
Sent: Wednesday, August 14, 2024 10:48 AM
To: Jens Gustedt
Cc: Ballman, Aaron; Xavier Del Campo Romero; Gcc Patches; Daniel Plakosh; Martin Uecker; Joseph Myers; Gabriel Ravier; Jakub Jelinek; Kees Cook; Qing Zhao; David Brown; Florian Weimer; Andreas Schwab; Timm Baeder; A. Jiang; Eugene Zelenko
Subject: Re: v2.1 Draft for a lengthof paper

On Wed, Aug 14, 2024 at 03:50:21PM GMT, Jens Gustedt wrote:
> > > > That said, I suspect WG14 would not be keen on standardizing
> > > > `lengthof` without an ugly keyword given that there are plenty of other
> > > > uses of it that would break:
> > > >
> > > > https://sourcegraph.com/github.com/illumos/illumos-gate/-/blob/usr/src/cmd/mailx/names.c?L53-55
> > > > https://sourcegraph.com/github.com/Rockbox/rockbox/-/blob/tools/ipod_fw.c?L292-294
> > > > https://sourcegraph.com/github.com/OpenSmalltalk/opensmalltalk-vm/-/blob/src/spur64.stack/validImage.c?L7014-7018
> > > > (and many, many others)
> >
> > What regex did you use for searching?
> >
> > I was thinking of renaming the proposal to elementsof(), to avoid
> > confusion between length of an array and length of a string. Would
> > you mind checking if elementsof() is ok?
>
> No, not for me. I really want us to consistently talk about
> array length for this. Consistent terminology is important.

I understand your desire for consistency. I think your paper is a net improvement over the status quo (which is a mix of length, size, and number of elements). After your proposal, there will be only length and number of elements. That's great.

However, strlen(3) came first, and we must respect it. Since you haven't proposed eliminating "number of elements" from the standard, and it would still be used alongside length, I think elementsof() would be consistent with your view (consistent with "number of elements").

Alternatively, you could use a new term, for example extent, for referring to the number of elements of an array. That would be more respectful to strlen(3), keeping a strong distinction between string length and array **.

Or how about always referring to it as "number of elements"? It's longer to type, but would be the most consistent approach.

Also, elementsof() is free to use, while lengthof() has several existing incompatible uses (as Aaron has shown), so we can't use that name so freely.

> > I have concerns about a libc (or a predefined macro) implementation:
> > the sizeof division causes double evaluation with any VLAs, while my
> > implementation for GCC has fewer cases of evaluation, and when it
> > needs to evaluate, it only does it once. It would be hard to find a
> > good wording that would allow an implementation to implement this as a
> > macro.
>
> No, we should not allow double evaluation.
>
> putting this in a `({})`

I would love to see a proposal for adding this GNU extension to ISO C. Did nobody do it yet? I could try to, if I find some time. (But I'll take a longish time for that; if anyone else does it, it would be great.)

> and doing a `typedef typeof(X) _my_type;` with the macro parameter `X`
> at the beginning completely avoids double evaluation. So quality
> implementations are possible, but perhaps differently and with other builtins
> than we are imagining. Don't impose the view of one particular implementation
> onto others.

Ahhh, good. I hadn't thought of that possibility. Sure, that makes sense now. It gives more strength to your proposal of allowing libc implementations, and thus requiring parens in the standard.

> Somewhere was brought in an argument with `offsetof`.
> This is exactly what we need. Implementations being able to start with
> a simple solution (as everybody did in the beginning of `offsetof`),
> and improve that implementation at their pace when they are ready for
> it.

Agree.

> > > this was basically what we did for `unreachable` and I think it
> > > worked out fine.
>
> I still think that the different options that we had there can be used
> to ask the right questions for WG14.

I'm looking at it. I've already taken some parts of it. :)

Cheers,
Alex

--
<https://www.alejandro-colomar.es/>
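The double-evaluation concern can be sketched as follows. This is an illustration only, not the GCC patch and not necessarily the construction Jens has in mind: the macro names are hypothetical, and instead of `typedef typeof(X) _my_type;` the single-evaluation variant captures the operand's address once with the GNU extensions `__auto_type` and statement expressions (assumed to be available, as in GCC and Clang). A function call is used as the index so that any repeated evaluation is well defined and can be counted.

```c
#include <stddef.h>

/* Classic division: X appears twice. When the element type is itself a
 * VLA, both sizeof operands have VLA type, so X may be evaluated twice. */
#define NELEMS_NAIVE(X) (sizeof(X) / sizeof((X)[0]))

/* Hypothetical single-evaluation variant (GNU C). X is evaluated exactly
 * once, when its address is taken; the sizeof operands only refer to the
 * captured pointer and have no side effects. */
#define NELEMS_ONCE(X) __extension__ ({                  \
    __auto_type nelems_p_ = &(X);                        \
    sizeof(*nelems_p_) / sizeof((*nelems_p_)[0]);        \
})

static int calls;

/* Index expression with an observable side effect. */
static int next_index(void)
{
    return calls++;
}

/* How many times the naive macro evaluates its operand (n must be >= 2). */
int evaluations_naive(int n)
{
    int a[n][n][n];
    calls = 0;
    (void)NELEMS_NAIVE(a[next_index()]);
    return calls;
}

/* How many times the single-evaluation macro evaluates its operand. */
int evaluations_once(int n)
{
    int a[n][n][n];
    calls = 0;
    (void)NELEMS_ONCE(a[next_index()]);
    return calls;
}

/* The result itself: number of elements of a[0], which is n. */
size_t length_once(int n)
{
    int a[n][n][n];
    return NELEMS_ONCE(a[0]);
}
```

A statement expression cannot be used at file scope or in a constant expression, which is one of the "can't be used in all contexts" limitations mentioned earlier in the thread, so a real header would likely need a separate path for non-VLA operands.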