On Jul 2, 2009, at 06:02, Paul Chavent wrote:
> Hi.
>
> I have already posted about the endianness attribute
> (http://gcc.gnu.org/ml/gcc/2008-11/threads.html#00146).
>
> For some years now I have really needed this feature in C projects.
> Today I would like to dig into the internals of GCC, and I would
> like to implement this feature as an exercise. You already warned me
> that it would be a hard task (aliasing, etc.), but I would like to
> begin with basic specs.
As another gcc user (and, once upon a time, developer) who's had to
deal with occasional byte ordering issues (mainly in network
protocols), I can imagine some uses for something like this. But...
> The spec could be:
>
> - add an attribute (this description could change to be compatible
>   with existing ones (diabdata, for example))
>
>   __attribute__ ((endian("big")))
>   __attribute__ ((endian("lil")))
I would use "little" spelled out, rather than trying to use some cute
abbreviation. Whether it should be a string vs a C token like little
or __little__, I don't know, or particularly care.
> - this attribute only applies to ints
It should at least be any integral type -- short to long long or
whatever TImode is. (Technically maybe char/QImode could be allowed
but it wouldn't have any effect on code generation.) I wouldn't jump
to the conclusion that it would be useless for pointers or floating
point values, but I don't know what the use cases for those would be
like. However, I think that's a case where you could limit the
implementation initially, then expand the support later if needed,
unlike the pointer issue below.
> - this attribute only applies to variable declarations
> - a pointer to this variable doesn't inherit the attribute (this
>   behavior could change later, I don't know...)
This seems like a poor idea -- for one thing, my use cases would
probably involve something like pointers to unaligned big-endian
integers in allocated buffers, or maybe integer fields in packed
structures, again via pointers. (It looks like you may be trying to
handle the latter but not the former in the code you've got so far.)
For another, one operation that may be used in code refactoring
involves taking a bunch of code accessing some variable x (and
presumably similar blocks of code elsewhere that may use different
variables), and pulling it out into a separate function that takes the
address of the thing to be modified, passed in at the call sites to
the new function; if direct access to x and access via &x behave
differently under this attribute, suddenly this formerly reasonable
transformation is unsafe -- and perhaps worst of all, the behavior
change would be silent, since the compiler would have nothing to
complain about.
Also, changing the behavior later means changing the interpretation of
some code after deploying a compiler using one interpretation.
Consider this on a 32-bit little-endian machine:
  unsigned int x __attribute__((endian("big")));
  ...
  *&x = 0x12345678;
In normal C code without this attribute, reading and writing "*&x" is
the same as reading and writing x. In your proposed version, "*&x"
would use the little-endian interpretation, and "x" would use the big-
endian interpretation, with nothing at the site of the executable code
to indicate that the two should be different. But an expression like
this can come up naturally when dealing with macro expansions. Or,
someone using this attribute may write code depending on that
different handling of "*&x" to deal with a selected byte order in some
cases and native byte order in other cases. Then if you update the
compiler so that the attribute is passed along to the pointer type, in
the next release, suddenly the two cases behave the same -- breaking
the user's code when it worked under the previous compiler release.
If you support taking the address of specified-endianness variables at
all, you need to get the pointer handling right the first time around.
I would suggest that if you implement something like this, the
attribute should be associated with the data type, not the variable
decl; so in the declaration above, x wouldn't be treated specially,
but its type would be "big-endian unsigned int", a distinct type from
"int" (even on a big-endian machine, probably).
The one advantage I see to associating the attribute with the decl
rather than the type is that I could write:
uint32_t thing __attribute__((endian("big")));
rather than needing to figure out what uint32_t is in fundamental C
types and create a new typedef incorporating the underlying type plus
the attribute, kind of like how you can't write a declaration using
"signed size_t". But that's a long-standing issue in C, and I don't
think making the language inconsistent so you can fix the problem in
some cases but not others is a very good idea.
> - the test case is
>
>   uint32_t x __attribute__ ((endian("big")));
>   uint32_t *ptr_x = &x;
Related to my suggestions above, I think this assignment should get a
warning about incompatible pointer types.
It does bring up an interesting additional question, though -- should
pointers to big-endian int and "normal" int be compatible on big-
endian machines? Under C, "char", "unsigned char" and "signed char"
are three distinct types, even though "char" must functionally be the
same as one of the others. I'd suggest that probably the normal type
should be incompatible with both of the explicit-endian types, to help
make the code type-safe and not dependent on the target machine's byte
order.
Ken