> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of
> [EMAIL PROTECTED]
> Sent: Wednesday, January 24, 2007 12:19 AM
> To: gcc@gcc.gnu.org
> Subject: char should be signed by default
>
> GCC should treat plain char in the same fashion on all types of machines
> (by default).

No. GCC should fit in with the environment it is running in. That's the
whole point of ABIs. Even in the case of GNU/Linux, where you had a clean
slate at the beginning, there are now existing ABIs that you need to
adhere to.

> The ISO C standard leaves it up to the implementation whether a char
> declared plain char is signed or not. This in effect creates two
> alternative dialects of C.

During the standards process we called those "don't chars". But there are
other places where the standard explicitly doesn't say which alternative an
implementation should choose (whether plain bitfields sign extend or not,
whether ints are 32 or 64 bits, etc.).

> The GNU C compiler supports both dialects; you can specify the signed
> dialect with -fsigned-char and the unsigned dialect with
> -funsigned-char. However, this leaves open the question of which dialect
> to use by default.

You use the ABI, which specifies whether chars and plain bitfields sign
extend or not.

> The preferred dialect makes plain char signed, because this is simplest.
> Since int is the same as signed int, short is the same as signed short,
> etc., it is cleanest for char to be the same.

However, I've worked on machines that did not have a signed character
instruction, and you had to generate about 3 instructions to sign extend a
char. During the standards process of the original C standard (ANSI C89),
Dennis Ritchie expressed an opinion that, in hindsight, making chars signed
was a bad idea, and that logically chars should be unsigned. This is
because outside of the USA, people use 8-bit character sets, and you want
to index into arrays with character values.

> Some computer manufacturers have published Application Binary Interface
> standards which specify that plain char should be unsigned. It is a
> mistake, however, to say anything about this issue in an ABI. This is
> because the handling of plain char distinguishes two dialects of C. Both
> dialects are meaningful on every type of machine.
> Whether a particular object file was compiled using signed char or
> unsigned is of no concern to other object files, even if they access
> the same chars in the same data structures.

No, this is the whole purpose of an ABI: to nail down all of these niggling
details. If you use either -fsigned-char or -funsigned-char, you are
essentially breaking the ABI. Now in the case of chars, usually it won't
bite you, but it can if you include header files with structure fields
written for the ABI.

> A given program is written in one or the other of these two dialects.
> The program stands a chance to work on most any machine if it is
> compiled with the proper dialect. It is unlikely to work at all if
> compiled with the wrong dialect.

It depends on the program, and on whether or not chars in the user's
character set are sign extended (i.e., in the USA, you likely won't notice
a difference between the two if chars just hold character values).

> Many users appreciate the GNU C compiler because it provides an
> environment that is uniform across machines. These users would be
> inconvenienced if the compiler treated plain char differently on certain
> machines.

And many users appreciate that GNU C fits in with the accepted practices on
their machine.

> Occasionally users write programs intended only for a particular machine
> type. On these occasions, the users would benefit if the GNU C compiler
> were to support by default the same dialect as the other compilers on
> that machine. But such applications are rare. And users writing a
> program to run on more than one type of machine cannot possibly benefit
> from this kind of compatibility.
>
> There are some arguments for making char unsigned by default on all
> machines. If, for example, this becomes a universal de facto standard,
> it would make sense for GCC to go along with it. This is something to be
> considered in the future.

Unfortunately you are usually limited by the choices you made at the
original implementation.
Any change involves a massive flag day.

> (Of course, users strongly concerned about portability should indicate
> explicitly whether each char is signed or not. In this way, they write
> programs which have the same meaning in both C dialects.)

--
Michael Meissner
AMD, MS 83-29
90 Central Street
Boxborough, MA 01719
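
To make the array-indexing point and the "indicate explicitly" advice
concrete, here is a minimal sketch in C. It is an illustration only, not
code from this thread: the function names are hypothetical, and it assumes
8-bit chars and GCC's documented behaviour of reducing an out-of-range
value modulo 2^N when converting to a signed type. The byte 0xE9 is "é" in
ISO 8859-1.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumption of this sketch: chars are 8 bits wide. */
static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit chars");

/* Returns 1 if plain char is signed under the current ABI default
   (or under -fsigned-char), and 0 if it is unsigned. */
static int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* The index a naive `table[c]` would use: the value of c after the
   usual promotion to int. This is where the two dialects diverge for
   bytes above 0x7F. */
static int naive_index(char c)
{
    return c;
}

/* A dialect-independent index: convert through unsigned char first,
   so the result is always in the range 0..255. */
static size_t table_index(char c)
{
    return (size_t)(unsigned char)c;
}
```

Compiling the same file once with -fsigned-char and once with
-funsigned-char should show naive_index((char)0xE9) changing between a
negative value and 233, while table_index((char)0xE9) stays at 233 in both
builds.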