Hi Martin,

On Sun, Sep 01, 2024 at 11:51:10AM GMT, Martin Uecker wrote:
>
> Alex,
>
> I am all for making things more consistent, but there is also a cost
> to changing stuff too much.  length is the established
> term in most programming languages and I would recommend to stick
> to it.
>
> Note that it is not true that the standard consistently refers to
>
> char a[3][n]
>
> as a VLA.  It does so in the description in sizeof but not in the
> type compatibility rules, at least as understood by most compilers.

The wording in 6.2.7 (Compatible type and composite type) is unclear,
IMO.  Also, it is not the definition of the term "variable length
array", but rather a use of it to describe the composite type.  I'd say
the wording is incorrect and should be fixed, since it's a misuse of the
term "variable length array".

The term VLA is actually defined in 6.7.7.3 (Array declarators), and it
clearly defines what is and is not a VLA.

<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3301.pdf#subsubsection.6.7.7.3>

N3301::6.7.7.3p4 (with formatting blanks inserted by me):

	[...]
	If the size is an integer constant expression
	and the element type has a known constant size,
	the array type is not a variable length array type;
	otherwise,
	the array type is a _variable length array_ type.
	(Variable length arrays with automatic storage duration
	are a conditional feature
	that implementations may support;
	see 6.10.10.4.)

(The underscores represent the italics in the document.)

> This is an inconsistency we *should* fix, but I do not think that
> changing away from "length" is a good idea.
>
> Note that "number of elements" is inherently an ambiguous term for
> multi-dimensional arrays,

I presume you consider the ambiguity in the following way:

	int a[3][4];

has 12 int "elements".  While colloquially that could make sense, it has
no precedent in ISO C or in the technical language used in C projects
(other than maybe in a few rare projects).

In C, there's no such thing as a multi-dimensional array.  While there
are some mentions of "multi-dimensional array" in the standard, they're
used to clarify the syntax with arrays of arrays (especially to clarify
that C is different from, for example, Fortran).  However, the standard
also makes clear that technically they're just arrays of elements that
happen to have array type.

It does so in
<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3301.pdf#subsubsection.6.5.3.2>,
where it says:

	EXAMPLE The following snippet has an array object defined by
	the declaration:

		int x[3][5];

	Here x is a 3 × 5 array of objects of type int; more precisely,
	x is an array of three element objects, each of which is an
	array of five objects of type int.  In the expression x[i],
	which is equivalent to (*((x)+(i))), x is first converted to a
	pointer to the initial array of five objects of type int.  Then
	i is adjusted according to the type of x, which conceptually
	entails multiplying i by the size of the object to which the
	pointer points, namely an array of five int objects.  The
	results are added and indirection is applied to yield an array
	of five objects of type int.  When used in the expression
	x[i][j], that array is in turn converted to a pointer to the
	first of the objects of type int, so x[i][j] yields an int.

The "flat" number of elements is something that C does not have (it's
impossible to calculate it in a generic way), which is a consequence of
multi-dimensional arrays not being a thing in C.  Consider for example
the following array:

	struct s {
		int a[10][10];
	};

	struct s a[10][10];

The number of elements of a is 10.  The number of ints is 10000, and the
number of s structures is 100, but I don't think either of those should
be a fundamental property described by a standard term, nor should any
operator (or operator-like macro) yield either of those two.  I'd like
to see code that needs such a thing before considering it.

If some specific code needs the number of structures, it can get it with
sizeof(a) / sizeof(struct s), or if some specific code needs the number
of integers, it can get it with sizeof(a) / sizeof(int).  But that's
still no justification for having a term for it (and currently there's
none).

> and I am not sure how you want to avoid
> this without making the wording more complex (e.g. "number of elements
> of the outermost array").

The standard already uses "number of elements of an array" throughout,
and IMO it's quite clear and simple about it.  It is also consistent
with decay of arrays, which decay to a pointer to their first element
(which can itself be an array; it is understood that the ultimate
fundamental object is not an element of the (outermost) array).

<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3301.pdf#subsection.6.2.5>

	An _array type_ describes a contiguously allocated nonempty set
	of objects with a particular member object type, called the
	_element type_.  The element type shall be complete whenever the
	array type is specified.  Array types are characterized by their
	element type and by the number of elements in the array.  An
	array type is said to be derived from its element type, and if
	its element type is T, the array type is sometimes called
	"array of T".  The construction of an array type from an element
	type is called "array type derivation".

<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3301.pdf#subsection.6.5.4.5>

	EXAMPLE 2 Another use of the sizeof operator is to compute the
	number of elements in an array:

		sizeof array / sizeof array[0]

<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3301.pdf#subsection.6.7.7.4>

	A declaration of a parameter as "array of type" shall be
	adjusted to "qualified pointer to type", where the type
	qualifiers (if any) are those specified within the [ and ] of
	the array type derivation.  If the keyword static also appears
	within the [ and ] of the array type derivation, then for each
	call to the function, the value of the corresponding actual
	argument shall provide access to the first element of an array
	with at least as many elements as specified by the size
	expression.

<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3301.pdf#section.17.2>

	(69) A declaration of an array parameter includes the keyword
	static within the [ and ] and the corresponding argument does
	not provide access to the first element of an array with at
	least the specified number of elements (6.7.7.4).

Seems quite consistent and unambiguous to me.  Every time the standard
needs precise wording for this, it uses the term "number of elements".
It uses length only in a few colloquial expressions, and size is used
more often, but I think we agree that that's an abuse.

>
> So I would recommend not to go this way.  You would need a really
> good argument to convince me to vote for this, and I haven't seen
> any such argument.

How about the above?

Have a lovely night!
Alex

>
> Martin
>
>
>
> On Sunday, 01.09.2024 at 11:10 +0200, Alejandro Colomar wrote:
> > Hi Jens, Martin,
> >
> > On Wed, Aug 14, 2024 at 05:44:57PM GMT, Jens Gustedt wrote:
> > > On 14 August 2024 16:47:32 CEST, Alejandro Colomar
> > > <a...@kernel.org> wrote:
> > > > > > I was thinking of renaming the proposal to elementsof(), to avoid
> > > > > > confusion between length of an array and length of a string.  Would
> > > > > > you mind checking if elementsof() is ok?
> > > > >
> > > > > No, not for me. I really want us to go consistently to talk about
> > > > > array length for this. Consistent terminology is important.
> > > >
> > > > I understand your desire for consistency.  I think your paper is a net
> > > > improvement over the status quo (which is a mix of length, size, and
> > > > number of elements).  After your proposal, there will be only length and
> > > > number of elements.  That's great.
> > > >
> > > > However, strlen(3) came first, and we must respect it.
> > >
> > > Sure, string length, a dynamic feature, and array length are two
> > > features.
> > >
> > > But we also have VLA and not VNEA in the standard, so we should respect
> > > this ;-)
> >
> > I hadn't thought about it until yesterday after Martin insisted on
> > preferring lengthof over nelementsof or a contraction of it, and worried
> > about nelementsof possibly causing ambiguity with multi-dimensional
> > arrays.  But:
> >
> > VLA is a misnomer.
> > ~~~~~~~~~~~~~~~~~~
> >
> > First, let's assume length refers to the number of elements, as we all
> > agree that length should not refer to the size in bytes of an array,
> > since we already have the term "size" for it, which is consistent with
> > sizeof.
> >
> > 	int vla[3][n];
> >
> > The array from above is a so-called variable length array, according to
> > the standard.  But it does not have a variable length, according to the
> > presumed meaning of length.  It does indeed have a variable size.  The
> > element of vla is itself an array, which is the one that really has a
> > variable length (or number of elements, as is the more technical term).
> >
> > So, if n3187 develops, and really intends to uniquely and unambiguously
> > use a term for the number of elements and another one for the size of an
> > array, it should also rename "variable length array" into "variable size
> > array".
> >
> > It is indeed due to this problematic misuse of the colloquial term
> > length that "length" and not "number of elements" is misleading in
> > multi-dimensional arrays.  The standard is very strict in using NoE for
> > the first dimension of an array (so its true dimension), and not for
> > the dimensions of arrays that are elements of it.
> >
> > And now you could say that this is only a problem of multi-dimensional
> > arrays.  It's not.  They're just the composition of arrays with elements
> > of type array.
> > The same problem arises with single dimensional arrays
> > in complex situations (although, admittedly, this is non-standard):
> >
> > 	$ cat vla.c
> > 	int
> > 	main(void)
> > 	{
> > 		int n = 5;
> >
> > 		struct s {
> > 			int v[n];
> > 		};
> >
> > 		struct s a[3];
> >
> > 		return sizeof(a);
> > 	}
> > 	$ gcc -Wall -Wextra -Wpedantic vla.c
> > 	vla.c: In function ‘main’:
> > 	vla.c:7:22: warning: a member of a structure or union cannot have a
> > 	variably modified type [-Wpedantic]
> > 	    7 |                 int v[n];
> > 	      |                      ^
> > 	$ ./a.out; echo $?
> > 	60
> >
> > a is a VLA even if it is a single-dimension array of known constant
> > number of elements.  Huh?  :)
> >
> > Terminology
> > ~~~~~~~~~~~
> >
> > Once we've determined that "length" in VLA does refer to the size and
> > not the number of elements, it's hard to justify a reform of
> > terminology based on length meaning number of elements.
> >
> > Indeed, whether we base the justification for length on strlen(3) or
> > on VLA, we must conclude that "variable length array" must be renamed
> > to "variable size array".  I'm preparing a paper for that.
> >
> > If that paper were eventually accepted, I'd prepare a second paper
> > that would reform every use of size and length with arrays so that size
> > always refers to the size in bytes, length is completely removed, and
> > number of elements stands as the only term to refer to the number of
> > elements.
> >
> >
> > Have a lovely day!
> > Alex
> >
> > > > Since you haven't proposed eliminating "number of elements" from the
> > > > standard, and it would still be used alongside length, I think
> > > > elementsof() would be consistent with your view (consistent with "number
> > > > of elements").
> > >
> > > didn't we ? Then this is actually a good idea to do so, thanks for the
> > > idea !
> > >

-- 
<https://www.alejandro-colomar.es/>