Hi,
The integer promotions (and the usual arithmetic conversions built on top
of them) are sometimes quite unexpected, especially on architectures where
int is wider than the programmer assumed.
Fixed-width math in C has been partially supported since C99. I say
partially, because the following code may surprise some (few) programmers:
uint16_t x = 1;
uint16_t y = 2;
if (x - y < 0)
    puts("Hmm");
But the following code will surprise probably some more programmers:
uint32_t x = 1;
uint32_t y = 2;
if (x - y < 0)
    puts("Really?"); // happens on ILP64 for example
And there's no guarantee that the future will avoid surprises in the
following code:
uint64_t x = 1;
uint64_t y = 2;
if (x - y < 0)
    puts("WTF is going on?!"); // may happen in the future
These simple examples are easy to fix, but in more complex expressions,
adding the casts needed to keep every step in a fixed-width type quickly
makes the code unreadable. Also, many programmers may assume that int is
never going to have some insane width such as 128 bits, and write code
that will break in the future.
There is really no way of making sure that operations are done in their
original types, unless you fill your code with explicit casts
everywhere. That is because the native C types have defined minimum
widths, but no maximum ones. A theoretical future implementation
might make int 128 bits wide, and break all code that assumes uint64_t
math will never be subject to the integer promotions.
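
To make the cast noise concrete, here is a minimal sketch of the 16-bit
case from the top of this mail. The function names sub_promoted() and
sub_u16() are hypothetical, and the sketch assumes that int is wider than
16 bits (it refuses to compile otherwise):

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Assumption of this sketch: int is wider than 16 bits,
 * so uint16_t operands are promoted to int. */
static_assert(INT_MAX > UINT16_MAX, "uint16_t must promote to int");

/* Today's rules: both operands are promoted to int before the
 * subtraction, so 1 - 2 yields -1 instead of wrapping around. */
int sub_promoted(uint16_t x, uint16_t y)
{
    return x - y;
}

/* Cast workaround: truncate back to 16 bits after the operation,
 * which gives the modulo-65536 result. */
uint16_t sub_u16(uint16_t x, uint16_t y)
{
    return (uint16_t)(x - y);
}
```

With these, sub_promoted(1, 2) is -1 and sub_u16(1, 2) is 65535. In a
longer expression, every intermediate step needs its own cast like the one
in sub_u16().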
The common integer promotions may have been useful in the old pre-C99
times, but they are more of a pain right now. Of course changing the
language to use a different set of rules would break much existing code,
so that's not viable. However, defining such different rules for
specific paths of code (or whole new programs) may be useful.
---
I'm proposing an attribute that would apply a new set of rules,
specifically designed for fixed-width operations, but which also tries to
provide sane defaults for any other scenario. Let's call it
[[gnu::no_int_promotion]].
The rules would be:
- An object or expression with an integer type to which this attribute
has been applied will not be converted to an int or unsigned int
automatically. (Disables the last paragraph of C2X::6.3.1.1.2).
- Usual arithmetic conversions will still apply however, to determine a
common type for operands in the case of non-unary operators.
(C2X::6.3.1.8 will still apply).
Let's see this with some examples. I'll add a comment to expressions,
noting the type to which the operands are converted:
[[gnu::no_int_promotion]] unsigned char u8;
[[gnu::no_int_promotion]] signed char s8;
[[gnu::no_int_promotion]] unsigned short u16;
[[gnu::no_int_promotion]] short s16;
[[gnu::no_int_promotion]] unsigned long u64;
[[gnu::no_int_promotion]] long long s64ll;
u8 + u8; // unsigned char
s8 + s16; // short
u8 + u16; // unsigned short
u8 + s8; // unsigned char
u16 + s8; // unsigned short
u8 + s16; // short
u64 + s64ll; // unsigned long long
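
For comparison, this is roughly what expressions like those evaluate to
today, without the attribute. It is only a sketch using C11 _Generic;
IS_TYPE() is a hypothetical helper (not part of any library), and the
checks assume that int is wider than short, as on all common ABIs:

```c
#include <assert.h>

/* Hypothetical helper: 1 if expression e has exactly type T, else 0.
 * The operand of _Generic is not evaluated. */
#define IS_TYPE(e, T)  _Generic((e), T: 1, default: 0)

unsigned char   u8;
signed char     s8;
unsigned short  u16;
short           s16;

/* Without the attribute, every operand narrower than int is
 * promoted first, so all of these are computed in int. */
static_assert(IS_TYPE(u8 + u8, int),   "u8 + u8 is int");
static_assert(IS_TYPE(s8 + s16, int),  "s8 + s16 is int");
static_assert(IS_TYPE(u8 + u16, int),  "u8 + u16 is int");
static_assert(IS_TYPE(u16 + s8, int),  "u16 + s8 is int");

/* Today, the narrow type only comes back through a cast on the result. */
static_assert(IS_TYPE((unsigned char)(u8 + u8), unsigned char),
              "a cast is needed to get unsigned char back");
```

If the file compiles, all the assertions hold on that platform.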
This attribute would also apply to types, so that it can be used in
typedefs:
typedef uint8_t [[gnu::no_int_promotion]] u8_t;
and then any variable declared with type u8_t (or cast to that type)
would also have the attribute applied to it, and will therefore not
promote to int.
If an expression mixes types with the attribute and types without it, the
result type carries the attribute only if the type with the attribute is
the one that "won"/persisted according to the rules of C2X::6.3.1.8.1. I
guess it will never persist in such cases: the other operand is already
promoted to at least int, and int will typically win over a narrower type
(and the attribute is already useless if the type is wider than int).
Still, it may have some consequence in some weird scenario.
Also, to be clear:
(u8 << 7) >> 1; // unsigned char
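
The difference is observable in the value, not just the type. The sketch
below compares today's behavior with an emulation of the proposed rules
that uses casts after every step; both function names are hypothetical:

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* The promoted shift is done in int; 0xFF << 7 must not overflow it. */
static_assert(INT_MAX >= (0xFF << 7), "0xFF << 7 must fit in int");

/* Today's rules: u8 is promoted to int, so the bits shifted past
 * bit 7 survive and come back with the right shift. */
int shift_promoted(uint8_t u8)
{
    return (u8 << 7) >> 1;
}

/* Emulation of the proposed rules: truncate to 8 bits after every
 * step, so the bits shifted out on the left are really lost. */
uint8_t shift_fixed_width(uint8_t u8)
{
    return (uint8_t)((uint8_t)(u8 << 7) >> 1);
}
```

For u8 = 0xFF, shift_promoted() returns 0x3FC0 (16320), while
shift_fixed_width() returns 0x40 (64), which is what the attribute would
give without any casts.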
---
If this proposal seems acceptable, a more complete feature set would
also include a new set of fixed-width types (non-fixed-width types don't
make much sense with these attributes, so don't bother creating them),
which could be named as:
typedef int8_t [[gnu::no_int_promotion]] s8_t;
typedef uint8_t [[gnu::no_int_promotion]] u8_t;
typedef int16_t [[gnu::no_int_promotion]] s16_t;
typedef uint16_t [[gnu::no_int_promotion]] u16_t;
typedef int32_t [[gnu::no_int_promotion]] s32_t;
typedef uint32_t [[gnu::no_int_promotion]] u32_t;
typedef int64_t [[gnu::no_int_promotion]] s64_t;
typedef uint64_t [[gnu::no_int_promotion]] u64_t;
So that it's easier to declare and use variables of really-fixed-width
type. Maybe another name is preferable, but since _t is already
reserved for the implementation, I think these short names are nice and
valid, and probably appealing to users (uintN_t were already a bit ugly
for some, and uint_fastN_t even more, so something like int_fixedN_t
wouldn't be as accepted as u8_t).
UINT8_C() doesn't help here, since it provides a value of the type
corresponding to uint_least8_t _with integer promotions applied_, that
is, promoted to int.
Therefore, we need a new macro for specifying constants of type u8_t, to
be able to do the following:
u8 + U8_C(1);
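
The problem with the existing macro can be checked today. A small sketch,
again with a hypothetical IS_TYPE() helper built on C11 _Generic, and
assuming a typical implementation where uint8_t is unsigned char:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 1 if expression e has exactly type T, else 0. */
#define IS_TYPE(e, T)  _Generic((e), T: 1, default: 0)

uint8_t u8;

/* UINT8_C() yields an already-promoted constant: plain int. */
static_assert(IS_TYPE(UINT8_C(1), int), "UINT8_C(1) is an int");
static_assert(IS_TYPE(u8 + UINT8_C(1), int), "so the sum is an int");

/* A cast does produce a uint8_t value ... */
static_assert(IS_TYPE((uint8_t)1, uint8_t), "(uint8_t)1 is a uint8_t");
/* ... but it is promoted again as soon as it is used as an operand. */
static_assert(IS_TYPE(u8 + (uint8_t)1, int), "the sum is still an int");
```

So neither the macro nor a cast on the constant keeps the addition in 8
bits under the current rules.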
And since the language specifies no suffixes for constants shorter than
int, we would also need to create a suffix as an extension. Since the
standard suffixes already mirror scanf syntax, I propose doing so and
creating h/H and hh/HH, for short and char respectively, and that would
allow the implementation of the macros mentioned above:
#define U8_C(c) ([[gnu::no_int_promotion]] c ## UHH)
#define S8_C(c) ([[gnu::no_int_promotion]] c ## HH)
#define U16_C(c) ([[gnu::no_int_promotion]] c ## UH)
#define S16_C(c) ([[gnu::no_int_promotion]] c ## H)
#define U32_C(c) ([[gnu::no_int_promotion]] c ## U)
#define S32_C(c) ([[gnu::no_int_promotion]] c)
#define U64_C(c) ([[gnu::no_int_promotion]] c ## UL)
#define S64_C(c) ([[gnu::no_int_promotion]] c ## L)
Of course that is an example for amd64; other archs may use different
underlying types. I'm also not sure if I put the attribute in the
correct position, but the idea is there. I also don't know whether
attributes may cause some problems with the preprocessor. Another idea
would be to add a completely new set of suffixes for these types, which
already imply the attribute by definition (a new core language feature):
#define U8_C(c) c ## U8
#define S8_C(c) c ## S8
#define U16_C(c) c ## U16
#define S16_C(c) c ## S16
#define U32_C(c) c ## U32
#define S32_C(c) c ## S32
#define U64_C(c) c ## U64
#define S64_C(c) c ## S64
I personally prefer this last approach. It's simpler, and also allows
using the suffixes directly, without needing to resort to macros that
make the code longer.
---
Maybe it would also be nice to provide a way for some programs to
completely turn off int promotion (and another one to override that
decision):
-fno-int-promotion
-fint-promotion
---
What do you think about this feature set? Does it sound interesting?
Thanks,
Alex
--
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/