Re: A question about a possible build problem.
George R Goffe <[EMAIL PROTECTED]> writes:

> tail: cannot open `+16c' for reading: No such file or directory
> tail: cannot open `+16c' for reading: No such file or directory
> tail: cannot open `+16c' for reading: No such file or directory

You have the buggy version of coreutils that does not recognize old-fashioned options. You may be able to work around this with 'export _POSIX2_VERSION=199209' (or equivalent construct if you are using some other shell).

zw
Re: Do C++ signed types have modulo semantics?
Joe Buck wrote:

  Here's a simple example.

  int blah(int);
  int func(int a, int b) {
    if (b >= 0) {
      int c = a + b;
      int count = 0;
      for (int i = a; i <= c; i++)
        count++;
      blah(count);
    }
  }

>>> Mark Mitchell wrote:
>>> I just didn't imagine that these kinds of opportunities came up very often. (Perhaps that's because I routinely write code that can't be compiled well, and so don't think about this situation. In particular, I often use unsigned types when the underlying quantity really is always non-negative, and I'm saddened to learn that doing that would result in inferior code.)
>>
>> However, it's not clear that an "optimization" which alters side effects which have subsequent dependents is ever desirable (unless of course the goal is to produce the same likely useless result as fast as some other implementation may, but without any other redeeming benefits).
>
> On Tue, Jun 28, 2005 at 09:32:53PM -0400, Paul Schlie wrote:
>> As the example clearly shows, by assuming that signed overflow traps, when it may not, such an optimization actually alters the behavior of the code,
>
> There is no such assumption. Rather, we assume that overflow does not occur, and make no assumptions about what happens on overflow. Then, for the case where overflow does not occur, we get fast code. For many cases where overflow occurs with a 32-bit int, our optimized program behaves the same as if we had a wider int. In fact, the program will work as if we had 33-bit ints. Far from producing a useless result, the optimized program has consistent behavior over a broader range. To see this, consider what the program does with a=MAX_INT, b=MAX_INT-1. My optimized version always calls blah(b+1), which is what a 33-bit int machine would do. It does not trap.
>
> Since you made an incorrect analysis, you draw incorrect conclusions.

- fair enough, however it seems to me that assuming overflow does not occur and assuming overflows are trapped are logically equivalent?

- But regardless, given that a and b are defined as arbitrary integer arguments, unless the value range of a+b is known to be less than INT_MAX, presuming otherwise may yield a different behavior for targets which wrap signed overflow (which is basically all of them). So unless by some magic the compiler can guess that the author of the code didn't actually desire the behavior previously produced by the compiler, the optimization will only likely produce an undesired result quicker, likely to no one's benefit, although neither behavior is necessarily portable.

  (And I confess I don't understand the 33-bit int concept, as a 32-bit int target will still wrap or trap b+1 on overflow when computed; and for targets which wrap signed overflow it seems irrelevant, as the optimized result may not be consistent with the code's previously compiled results regardless, nor more correct, and is therefore seemingly most likely less useful.)

Overall, I guess I still simply believe the first rule of optimization is to preserve existing semantics unless explicitly authorized otherwise, and then only if accompanied with corresponding warnings for all potentially behavior-altering assumptions applied.
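To make the disputed transformation concrete, here is a sketch (mine, not taken from any compiler in the thread) of what Joe Buck's example effectively becomes once the compiler assumes that the signed values a+b and i never overflow:

    int blah(int);

    /* Original form of the example.  */
    int func(int a, int b)
    {
      if (b >= 0) {
        int c = a + b;
        int count = 0;
        for (int i = a; i <= c; i++)
          count++;
        blah(count);
      }
      return 0;   /* added only to make this sketch self-contained */
    }

    /* Under the no-overflow assumption, the loop from a to a+b runs
       exactly b+1 times, so the whole loop folds away.  */
    int func_optimized(int a, int b)
    {
      if (b >= 0)
        blah(b + 1);
      return 0;
    }

On a target where signed addition wraps, the two versions can of course diverge once a+b overflows, which is exactly the disagreement above.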
The utility of standard's semantics for overflow
After reading many messages on the subject of overflow of signed integral values, I am inclined to think that the standard was wrong to say that int overflow is undefined. Of course this definition improves performance, but at what cost? At the cost of program stability?

If the programmer wants a robust application, then casting to unsigned must be present for almost any usage of int. In the end, nobody will use int, because it is very difficult to prove that it does not wrap for all possible input. And since the compiler will draw more and more conclusions from "assume overflow never happens", the effects will become more and more destructive.

Checking for int overflow after the fact is much simpler than playing tricks with casting before the fact. So people will simply move to unsigned and implement 2's complement themselves.

To let compilers gain the performance the standard intended, it should have introduced a new type or a modifier. Only

    for (nooverflow int i=0; i <= x; ++i) ++count;

would be transformed to

    count += x+1;

Normal 'int' would then have implementation-defined behavior on overflow.

Unfortunately, the standard attempts to improve legacy code, and as a result breaks legacy code. This is unlike aliasing, where most lines of code out there did not break aliasing rules (even before they were introduced). The int overflow rules are violated by most lines of code I have seen (it is very uncommon to find code that asserts no overflow before a+b).

Also, the compiler itself may introduce new cases of overflow (e.g. after transforming a*b+a*c to a*(b+c), when run with a==0 and b,c == MAX_INT). I am not sure if this may create invalid assumptions in later compiler passes (today's gcc or later); I did not try to formally prove it either way. (I tend to think that invalid assumptions will be introduced by, e.g., a later VRP pass.)

I don't know what gcc can do to improve the situation; the standard is a thing gcc has to live with. Maybe start by trying to affect c++0x?

Michael
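As an illustration of the casting tricks being referred to, here is a minimal sketch (my own example, not from the mail above) of doing a signed addition "by hand" in unsigned arithmetic and detecting the would-be overflow after the fact:

    /* Add a and b; store the wrapped 2's-complement result in *sum and
       return 1 if the signed addition would have overflowed, 0 otherwise.
       The arithmetic is done in unsigned, which is well defined.  */
    static int add_overflows(int a, int b, int *sum)
    {
        unsigned int ua = (unsigned int) a;
        unsigned int ub = (unsigned int) b;
        unsigned int us = ua + ub;   /* wraps modulo 2^N by definition */

        *sum = (int) us;             /* implementation-defined for large us,
                                        2's complement in practice */
        /* Overflow happened iff both operands have the same sign and the
           result's sign differs from it.  */
        return ((a >= 0) == (b >= 0)) && ((*sum >= 0) != (a >= 0));
    }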
Re: The utility of standard's semantics for overflow
Michael Veksler wrote:

> If the programmer wants a robust application, then casting to unsigned
> must be present for almost any usage of int.

If you have a variable in your program that is signed but must always be in the range of int, then int is the appropriate representation. If the pressure in a tank must be in the range -2**31 .. +2**31-1, it does not make the application more robust to ensure that if the pressure goes "over" the max, it suddenly turns negative. That's likely to be just as disastrous as any other behavior in this serious error situation.

In practice of course, the pressure in this example is likely to be in a much more constrained range, and part of making a "robust application" will be to ensure that the value always remains in the required range. In a more expressive language like Ada, the corresponding type would be declared with the appropriate range, and an exception would be raised if the value is outside this range.

In practice in a critical application, you are likely to not want any exceptions, so proving such a program exception free is one of the tasks that faces the applications programmer in writing a reliable program. See for example:

    http://www.praxis-cs.com/pdfs/Industrial_strength.pdf
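A rough C analogue of the Ada-style range-constrained type described above might look like the following sketch (the type name, bounds and check are invented purely for illustration):

    #include <assert.h>

    /* Invented example: pressure is constrained to a narrow, known range.
       In Ada this would be a range-constrained subtype; in C the check
       has to be written (or generated) by hand.  */
    #define PRESSURE_MIN 0
    #define PRESSURE_MAX 5000

    typedef int pressure_t;

    static pressure_t set_pressure(int value)
    {
        /* Mimics Ada's Constraint_Error: an out-of-range value is a
           program error, not something to silently wrap.  */
        assert(value >= PRESSURE_MIN && value <= PRESSURE_MAX);
        return (pressure_t) value;
    }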
named address spaces (update)
I continued to work on the support for named address spaces in GCC. I managed to move much of the managing code for the namespace attribute into the create functions of tree nodes, so in most cases only the language frontends need to assign and check the named address spaces. I moved creation of the namespace list into the genmodes program (all examples are taken from my m68hc05 GCC port, http://www.auto.tuwien.ac.at/~mkoegler/index.php/gcc ).

A named address space is allocated with NAMESPACE(name) in the mode definition file of the port, e.g.:

NAMESPACE(EEPROM);
NAMESPACE(LORAM);

(I know that NAMESPACE is not the correct naming, but "named address space" is a bit too long. Any suggestions?)

This will result in the address spaces EEPROMspace and LORAMspace. By default the address spaces DEFAULTspace and NONEspace are generated. DEFAULTspace is the normal memory, NONEspace an invalid address space to catch errors.

Index: genmodes.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/genmodes.c,v
retrieving revision 1.17
diff -u -r1.17 genmodes.c
--- genmodes.c  1 Jun 2005 02:55:50 -  1.17
+++ genmodes.c  19 Jun 2005 13:02:16 -
@@ -104,6 +104,87 @@
 static struct mode_adjust *adj_alignment;
 static struct mode_adjust *adj_format;
 
+/* named address space */
+struct namespace_data
+{
+  struct namespace_data *next;  /* next this class - arbitrary order */
+
+  const char *name;             /* printable mode name without suffix */
+  const char *file;             /* file and line of definition, */
+  unsigned int line;            /* for error reporting */
+};
+
+static struct namespace_data *namespaces = 0;
+static int n_namespaces = 0;
+static htab_t namespaces_by_name;
+
+/* Utility routines.  */
+static inline struct namespace_data *
+find_namespace (const char *name)
+{
+  struct mode_data key;
+
+  key.name = name;
+  return (struct namespace_data *) htab_find (namespaces_by_name, &key);
+}
+
+static struct namespace_data *
+new_namespace (const char *name,
+               const char *file, unsigned int line)
+{
+  struct namespace_data *m;
+
+  m = find_namespace (name);
+  if (m)
+    {
+      error ("%s:%d: duplicate definition of namespace \"%s\"",
+             trim_filename (file), line, name);
+      error ("%s:%d: previous definition here", m->file, m->line);
+      return m;
+    }
+
+  m = XNEW (struct namespace_data);
+  memset (m, 0, sizeof (struct namespace_data));
+  m->name = name;
+  if (file)
+    m->file = trim_filename (file);
+  m->line = line;
+
+  m->next = namespaces;
+  namespaces = m;
+  n_namespaces++;
+
+  *htab_find_slot (namespaces_by_name, m, INSERT) = m;
+
+  return m;
+}
+
+static hashval_t
+hash_namespace (const void *p)
+{
+  const struct namespace_data *m = (const struct namespace_data *)p;
+  return htab_hash_string (m->name);
+}
+
+static int
+eq_namespace (const void *p, const void *q)
+{
+  const struct namespace_data *a = (const struct namespace_data *)p;
+  const struct namespace_data *b = (const struct namespace_data *)q;
+
+  return !strcmp (a->name, b->name);
+}
+
+#define NAMESPACE(N) make_namespace(#N, __FILE__, __LINE__)
+
+static void
+make_namespace (const char *name,
+                const char *file, unsigned int line)
+{
+  new_namespace (name, file, line);
+}
+
+
 /* Mode class operations.  */
 static enum mode_class
 complex_class (enum mode_class c)
@@ -769,6 +850,7 @@
 {
   int c;
   struct mode_data *m, *first, *last;
+  struct namespace_data *n;
 
   printf ("/* Generated automatically from machmode.def%s%s\n",
           HAVE_EXTRA_MODES ? " and " : "",
@@ -827,6 +909,23 @@
 #if 0 /* disabled for backward compatibility, temporary */
   printf ("#define CONST_REAL_FORMAT_FOR_MODE%s\n", adj_format ? "" : " const");
 #endif
+
+  puts ("\
+\n\
+enum namespace_type\n{\n");
+
+  for (n = namespaces; n; n = n->next)
+    {
+      int count_;
+      printf (" %sspace,%n", n->name, &count_);
+      printf ("%*s/* %s:%d */\n", 27 - count_, "",
+              trim_filename (n->file), n->line);
+    }
+
+  puts ("\
+  MAX_NAMESPACE,\n\
+  NUM_NAMESPACES = MAX_NAMESPACE\n\
+};\n");
 
   puts ("\
 \n\
 #endif /* insn-modes.h */");
@@ -866,6 +965,19 @@
 }
 
 static void
+emit_namespace_name (void)
+{
+  struct namespace_data *m;
+
+  print_decl ("char *const", "namespace_name", "NUM_NAMESPACES");
+
+  for (m = namespaces; m; m = m->next)
+    printf (" \"%s\",\n", m->name);
+
+  print_closer ();
+}
+
+static void
 emit_mode_name (void)
 {
   int c;
@@ -1190,6 +1302,7 @@
 {
   emit_insn_modes_c_header ();
   emit_mode_name ();
+  emit_namespace_name ();
   emit_mode_class ();
   emit_mode_precision ();
   emit_mode_size ();
@@ -1233,6 +1346,7 @@
 }
 
   modes_by_name = htab_create_alloc (64, hash_mode, eq_mode, 0, xcalloc, free);
+  namespaces_by_name = htab_create_alloc (64, hash_namespace, eq_namespace, 0, xcalloc, free);
 
   create_modes ();
   complete_all_modes ();

Index: machmode.de
Re: The utility of standard's semantics for overflow
> This is unlike aliasing, when most lines of code out there,
> did not break aliasing rules (even before they were
> introduced).

Are you sure? IIRC -fstrict-aliasing was once enabled at -O2 and then disabled to give people more time to fix their code.

-- 
Eric Botcazou
Re: GCC-4.1.0 size optimization bug for MIPS architecture...
On Tue, Jun 28, 2005 at 11:59:15PM -0500, Steven J. Hill wrote: > I have discovered what appears to be an optimization bug with '-Os' > in GCC-4.1.0 for the MIPS architecture. It appears that functions > which are declared as 'inline' are being ignored and instead turned > into to function calls ... Not a bug. The inline marker is merely suggestive. You told the compiler to optimize for size, and it is doing that. If you absolutely have to have the function inlined, then you need to use __attribute__((__always_inline__)). r~
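A minimal usage sketch of the attribute mentioned (the function and names are invented for illustration):

    /* 'inline' alone is only a hint; the attribute forces inlining even
       at -Os, or produces an error if the call cannot be inlined.  */
    static inline __attribute__((__always_inline__)) int
    add_one(int x)
    {
        return x + 1;
    }

    int caller(int v)
    {
        return add_one(v);   /* expanded inline, no call emitted */
    }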
Re: The utility of standard's semantics for overflow
Robert Dewar <[EMAIL PROTECTED]> wrote on 29/06/2005 11:42:07: > Michael Veksler wrote: > > > If the programmer wants a robust application, > > then casting to unsigned must be present for almost any > > usage of int. > > If you have a variable in your program that is signed but must > always be in the range of int, then int is the appropriate > representation. If the pressure in a tank must be in the > range -2**31 .. +2**31-1, it does not make the application > more robust to ensure that if the pressure goes "over" the max, it > suddenly turns negative. That's likely to be just as > disastrous as any other behavior in this serious error > situation. This is right to some extent, and I referred to it in my original mail. I claim that it is easier to write a code that checks these cases after the overflow, rather than before. I also claim that checking overflows (as implied by the standard) results in almost pure unsigned arithmetic, so why do we have signed arithmetic to begin with? > > In practice of course, the pressure in this example is > likely to be in a much more constrained range, and part > of making a "robust application" will be to ensure that > the value always remains in the required range. In a more > expressive language like Ada, the corresponding type would > be declared with the appropriate range, and an exception > would be raised if the value is outside this range. But this is not what C does. In C there is an assumption "bugs like int overflow may be transformed into any other possible bug". No exception need be raised. > In practice in a critical application, you are likely to > not want any exceptions, so proving such a program exception > free is one of the tasks that faces the applications programmer > in writing a reliable program. See for example: > > http://www.praxis-cs.com/pdfs/Industrial_strength.pdf > And as it is written in section 3 "... gives compiler writers substantial freedom to re-order expressions ..." and then "A more sound approach is to design a language so that these ordering effects cannot occur". This last quote can be implemented only by moving to modulo semantics.
Re: The utility of standard's semantics for overflow
Michael Veksler wrote: This is right to some extent, and I referred to it in my original mail. I claim that it is easier to write a code that checks these cases after the overflow, rather than before. I also claim that checking overflows (as implied by the standard) results in almost pure unsigned arithmetic, so why do we have signed arithmetic to begin with? er .. because we want 1 / (-1) to be -1 instead of 0? because we want -1 to be less than 1 etc. signed arithmetic is a bit different from unsigned arithmetic :-) But this is not what C does. In C there is an assumption "bugs like int overflow may be transformed into any other possible bug". No exception need be raised. Right, which is like Ada with checks suppressed, but in general in critical applications exceptions are turned off anyway, which leaves Ada in EXACTLY the same situation as C (overflow erroneous) And as it is written in section 3 "... gives compiler writers substantial freedom to re-order expressions ..." and then "A more sound approach is to design a language so that these ordering effects cannot occur". This last quote can be implemented only by moving to modulo semantics. Overflow and ordering effects are quite different. And if you want to avoid undefined for overflow, modulo semantics is only one of many ways of doing this (none of which has been suggested or adopted by the C standards committee -- when you find yourself disagreeing with this committee in this way, you need to really do some studying to find out why before deciding they were wrong).
Re: The utility of standard's semantics for overflow
On Wed, 2005-06-29 at 11:32 +0300, Michael Veksler wrote:
> This is unlike aliasing, when most lines of code out there,
> did not break aliasing rules (even before they were
> introduced). Int overflow is violated by most lines of
> code I have seen (it is very uncommon to find code that
> asserts no overflow before a+b).

Believe it or not, most uses of integral values are made in such a way that overflow is the exception rather than the rule (at least on general computers, where int arithmetic and memory are cheap; in embedded systems the situation might differ somewhat, even though I have doubts whether the embedded processors are of 32b class; for 8/16b processors the story is of course different). In most cases, the programmers choose the type to allow for all the standard cases and do not look at the possibility of overflow. How many loops are written using ints or unsigned for very small integers where even a short might be sufficient?

Until now, there has been a widespread assumption that 2^32 or 2^31 is equivalent to infinity for most purposes, because those numbers will never be reached in any practical situation (remember the unix clock ticks within a 32 bit unsigned, which still has a few (counted) years to go), unless of course a user wants to break the code and has switches to provide initial values.

So unless you do arithmetics or combinatorics, most of the uses of "wide" (ie > 32b) integral types semantically (ie in the programmer's mind) assume that overflow does not happen in practice in the program.
Re: GCC-4.1.0 size optimization bug for MIPS architecture...
"Steven J. Hill" <[EMAIL PROTECTED]> writes: > I have discovered what appears to be an optimization bug with '-Os' > in GCC-4.1.0 for the MIPS architecture. It appears that functions > which are declared as 'inline' are being ignored and instead turned > into to function calls which is breaking the dynamic linker loader > for uClibc on MIPS. You should mark the functions that absolutely need to be inlined with __attribute__((always_inline)). Or just -Dinline="__attribute__((always_inline))" -Andi
Re: The utility of standard's semantics for overflow
Eric Botcazou <[EMAIL PROTECTED]> wrote on 29/06/2005 11:49:24:
> > This is unlike aliasing, when most lines of code out there,
> > did not break aliasing rules (even before they were
> > introduced).
>
> Are you sure? IIRC -fstrict-aliasing was once enabled at -O2 and then
> disabled to give people more time to fix their code.

Yes, I am pretty sure. I said "most lines of code", not "most applications", to indicate the density difference. If each line of code has, e.g., a 1% chance to violate overflow rules, and a 0.01% chance to violate aliasing rules, then for 10KLOC you have:
 - a probability of 63% to violate aliasing rules
 - and 100% (99.999 with 43 nines) to violate overflow rules.

So most likely even a small application will have at least one violation of each rule. However, the number of violations will be substantially different between the two. The mean number of violations for 10KLOC is:
 - aliasing: 1
 - overflow: 100

The numbers on probabilities are concocted, but they give the general feeling of what I meant.

Aliasing problems may happen only when either:
 1. Using type casts of pointers.
 2. In C: implicit void* conversions.
 3. Implicit pointer conversions through a union.
These are less common (as a percentage of LOC) in most code than i++ or a+b.

Algorithm inputs are rarely validated for potential future overflow, but outputs are normally validated for sane results. If you are not supposed to produce a negative result, you are going to catch it no later than at the beginning of your next stage (assuming you have at least some sanity checks).

One more note: programming languages should not only be sound, they should fulfill a wide range of needs of the development cycle. One of the needs is the ease of writing correct code and verifying it. Defining a language rule that makes C/C++ very fast at the cost of 2x more bugs is an unacceptable design decision. Most of these new bugs will be latent bugs, and users will normally not encounter them, but they will be there waiting for new bad inputs and an improved compiler intelligence.

Michael
Re: The utility of standard's semantics for overflow
Theodore Papadopoulo wrote: So unless you do arithmetics or combinatorics, most of the uses of "wide" (ie > 32b) integral types semantically (ie in the programmer's mind) assume that overflow does not happen in practise in the program. I think that's probably right. And in the context of this discussion, what does happen to most programs if an int used with this assumption overflows? Answer, it's probably a bug, and it is unlikely that silent wrapping will correspond to correct behavior.
Re: named address spaces (update)
> Limitations are: > * All pointer have Pmode size. The ability to have various pointer widths would be nice too.
Re: GCC-4.1.0 size optimization bug for MIPS architecture...
Richard Henderson wrote:
> Not a bug. The inline marker is merely suggestive. You told
> the compiler to optimize for size, and it is doing that.
>
> If you absolutely have to have the function inlined, then you
> need to use __attribute__((__always_inline__)).

This makes sense, but I also have binutils-2.16.1, gcc-3.4.4 and the same uClibc code, and gcc-3.4.4 does produce a valid dynamic loader with '-Os'. When looking at the disassembly for that, the _syscall1 and other functions are inlined. So, apparently things have changed with regard to inlining from the gcc-3.4.x series to gcc-4.1.x? I can upload the binaries for the gcc-3.4.4 produced version if needed.

-Steve
Re: GCC-4.1.0 size optimization bug for MIPS architecture...
On Wed, 2005-06-29 at 07:44 -0500, Steven J. Hill wrote: > Richard Henderson wrote: > > > > Not a bug. The inline marker is merely suggestive. You told > > the compiler to optimize for size, and it is doing that. > > > > If you absolutely have to have the function inlined, then you > > need to use __attribute__((__always_inline__)). > > > This makes sense, but I also have a binutils-2.16.1, gcc-3.4.4 > and the same uClibc code and gcc-3.4.4 does produce a valid > dynamic loader with '-Os'. When looking at the dissassembly > for that, the _syscall1 and other functions are inlined. So, > apparently things have changed with regards to inling from the > gcc-3.4.x series to gcc-4.1.x? The inlining heuristics have changed a bit between those two, and in addition, because of the new intermediate representation, the inliner will make different decisions than it used to. In other words, you were probably just getting lucky with 3.x, and your luck has run out :)
Re: Do C++ signed types have modulo semantics?
> Overall, I guess I still simply believe the the first rule of optimization > is to preserve existing semantics unless explicitly authorized otherwise, > and then only if accompanied with corresponding warnings for all potentially > behavior altering assumptions applied. It is, you just believe that semantics exist where they don't.
RE: The utility of standard's semantics for overflow
Original Message >From: Robert Dewar >Sent: 29 June 2005 13:14 > Theodore Papadopoulo wrote: > >> So unless you do arithmetics or combinatorics, most of the uses of >> "wide" (ie > 32b) integral types semantically (ie in the programmer's >> mind) assume that overflow does not happen in practise in the program. > > I think that's probably right. And in the context of this discussion, > what does happen to most programs if an int used with this assumption > overflows? Answer, it's probably a bug, and it is unlikely that > silent wrapping will correspond to correct behavior. In fact, doesn't this suggest that in _most_ circumstances, *saturation* would be the best behaviour? And of course, since it's undefined, a compiler could entirely legitimately use saturating instructions (on a platform that supports them) for signed int arithmetic. cheers, DaveK -- Can't think of a witty .sigline today
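For concreteness, here is a sketch of what saturating signed addition means, written as a plain C emulation (this only illustrates the semantics; it makes no claim about any particular instruction set):

    #include <limits.h>

    /* Saturating add: instead of wrapping or trapping, clamp the result
       to the representable range.  Written with a pre-check so that the
       C code itself never relies on signed overflow.  */
    static int sat_add(int a, int b)
    {
        if (b > 0 && a > INT_MAX - b)
            return INT_MAX;
        if (b < 0 && a < INT_MIN - b)
            return INT_MIN;
        return a + b;
    }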
Re: named address spaces (update)
On Wed, Jun 29, 2005 at 10:47:40AM +0200, Martin Koegler wrote: > NAMESPACE(EEPROM); > NAMESPACE(LORAM); > > (I know, that the NAMESPACE is not the correct naming, but named > address space is a bit too long. Any suggestions?) ADDRSPACE? Named is implicit, since you provided a name. BTW, you may get more comments if all of your text is before all of the patches; I nearly missed three quarters of your message :-) -- Daniel Jacobowitz CodeSourcery, LLC
Re: The utility of standard's semantics for overflow
Dave Korn wrote: In fact, doesn't this suggest that in _most_ circumstances, *saturation* would be the best behaviour? Actually I think a handlable exception, a la Ada, is the best solution. Whether saturation is appropriate is really problem dependent. If you are counting the number of primes less than bla, then saturating is not likely to be helpful. If you are computing some real quantity with minimal precision, saturation might be helpful And of course, since it's undefined, a compiler could entirely legitimately use saturating instructions (on a platform that supports them) for signed int arithmetic. yes, of course, this would indeed be legitimate, and could be an option. Almost anything is reasonable as an option. I think it would be quite reasonable to have options for trapping (-ftrapv working nicely :-) and for wrap around. But I would not normally make either of these the default in C. cheers, DaveK
Re: Do C++ signed types have modulo semantics?
On Tue, 28 Jun 2005, Joe Buck wrote:

> There is no such assumption. Rather, we assume that overflow does not occur, and make no assumptions about what happens on overflow. Then, for the case where overflow does not occur, we get fast code. For many cases where overflow occurs with a 32-bit int, our optimized program behaves the same as if we had a wider int. In fact, the program will work as if we had 33-bit ints. Far from producing a useless result, the optimized program has consistent behavior over a broader range. To see this, consider what the program does with a=MAX_INT, b=MAX_INT-1. My optimized version always calls blah(b+1), which is what a 33-bit int machine would do. It does not trap.

This point about 33-bit machines is interesting because it raises an optimisation scenario that hasn't been mentioned so far.

Consider doing 32-bit integer arithmetic on 64-bit machines which only support 64-bit arithmetic instructions. On such machines you have to use sign-extensions or zero-extensions after 64-bit operations to ensure wrap-around semantics (unless you can prove that the operation will not overflow the bottom 32 bits, or that the value will not be used in a way that exposes the fact you're using 64-bit arithmetic).

But -- if I have understood correctly -- if the 32-bit values are signed integers, a C compiler for such a machine could legitimately omit the sign-extension. Whereas for unsigned 32-bit values the C standard implies that you must zero-extend afterwards.

I hadn't realised that. This has been an enlightening thread :)

Nick
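A small example of the distinction being drawn (invented code, just to show where the extension is needed):

    #include <stdint.h>

    /* On a machine with only 64-bit arithmetic, both additions may be
       done in 64-bit registers.  The unsigned result must behave as if
       reduced modulo 2^32, so a zero-extension (masking of the low 32
       bits) is needed before the value can be observed.  For the signed
       case, overflow is undefined, so the compiler may in principle keep
       the 64-bit value and omit the sign-extension (subject to ABI/ISA
       rules, as the follow-up about MIPS below points out).  */
    uint32_t add_u32(uint32_t a, uint32_t b) { return a + b; }
    int32_t  add_s32(int32_t  a, int32_t  b) { return a + b; }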
Re: Do C++ signed types have modulo semantics?
> From: Daniel Berlin <[EMAIL PROTECTED]>
>> Overall, I guess I still simply believe the the first rule of optimization
>> is to preserve existing semantics unless explicitly authorized otherwise,
>> and then only if accompanied with corresponding warnings for all potentially
>> behavior altering assumptions applied.
>
> It is, you just believe that semantics exist where they don't.

- Agreed, I do believe that an implementation should strive to define likely useful and consistent behaviors for those which the standard has defined as being undefined. If an implementation chooses not to define a specific behavior for one specified as being undefined (which it may, as an undefined behavior may invoke any behavior), then it remains undefined; and it may correspondingly yield a truly undefined resulting behavior if ever presumed to differ from the factual behavior which the compiler would otherwise produce, and is therefore of likely dubious value.
Re: Do C++ signed types have modulo semantics?
> "Nicholas" == Nicholas Nethercote <[EMAIL PROTECTED]> writes: Nicholas> This point about 33-bit machines is interesting because it Nicholas> raises an optimisation scenario that hasn't been mentioned Nicholas> so far. Nicholas> Consider doing 32-bit integer arithmetic on 64-bit machines Nicholas> which only support 64-bit arithmetic instructions. On such Nicholas> machines you have to use sign-extensions or zero-extensions Nicholas> after 64-bit operations to ensure wrap-around semantics Nicholas> (unless you can prove that the operation will not overflow Nicholas> the bottom 32 bits, or that the value will not be used in a Nicholas> way that exposes the fact you're using 64-bit arithmetic). Nicholas> But -- if I have understood correctly -- if the 32-bit Nicholas> values are signed integers, a C compiler for such a machine Nicholas> could legitimately omit the sign-extension. Not necessarily. Not on MIPS, for example, because the spec requires that signed values in registers must be sign extended or the operations are undefined (not undefined for overflow -- undefined all the time). paul
Re: named address spaces (update)
Martin Koegler wrote: I continued to work on the support for named address spaces in GCC. I managed to move much of the managing code for the namespace attribute into the create funtions of tree nodes, so in most cases, only the language frontends need to assign and check the named address spaces. I moved to creation of the namespace list to the gen-modes program (all examples are taken out of my m68hc05 GCC port http://www.auto.tuwien.ac.at/~mkoegler/index.php/gcc ). A named address space is allocated with NAMESPACE(name) in the mode definition file of the port, eg: NAMESPACE(EEPROM); NAMESPACE(LORAM); (I know, that the NAMESPACE is not the correct naming, but named address space is a bit too long. Any suggestions?) Because it's a space of addresses (as opposed to a space of names), how about this?: ADDRESSSPACE (EEPROM); ADDRESSSPACE (LORAM); Eric
Re: named address spaces (update)
DJ Delorie wrote: Limitations are: * All pointer have Pmode size. The ability to have various pointer widths would be nice too. I would agree with this too. It would be very useful, e.g. for the AVR port. Eric
Pro64-based GPLed compiler
Hello everyone, In 2000, SGI released a GPLed compiler suite. http://gcc.gnu.org/ml/gcc/2000-05/threads.html#00632 http://web.archive.org/www.sgi.com/newsroom/press_releases/2000/may/linux-ia64.html I've taken PathScale's source tree (they've removed the IA-64 code generator, and added an x86/AMD64 code generator), and tweaked the Makefiles. I thought some of you might want to take a look at the compiler. http://www-rocq.inria.fr/~gonzalez/vrac/open64-alchemy-src.tar.bz2 Disclaimer: this release has received *very* little testing. Some might cringe when they see the way I hacked the Makefile structure. (I welcome all comments and suggestions.) I haven't managed to build the IPA (inter-procedure analyzer) module. -- Regards, Marc
Re: The utility of standard's semantics for overflow
> Yes, I am pretty sure. I said "most lines of code", not "most > applications", > to indicate the density difference. If each line of code has, e.g., 1% > chance > to violate overflow rules, and 0.01% chance to violate aliasing rules, > then for 10KLOC, you have: > - probability of 63% to violate aliasing rules > - and 100% (99.999 with 43 nines) to violate overflow rules. Then there are different "most"s because, if each line of code has 1% chance to violate overflow rules, "most" of them don't for reasonable definitions of "most". -- Eric Botcazou
Re: Do C++ signed types have modulo semantics?
On Tuesday 28 June 2005 14:09, Steven Bosscher wrote:
> On Tuesday 28 June 2005 14:02, Ulrich Weigand wrote:
> > Steven Bosscher wrote:
> > > Anyway, I've started a SPEC run with "-O2" vs. "-O2 -fwrapv". Let's
> > > see how big the damage would be ;-)
> >
> > Please make sure to include a 64-bit target, where it actually makes any
> > difference. (I recall performance degradations of 20-30% in some
> > SPECfp cases from getting induction variable reduction wrong ...)
>
> Yeah, I'm testing on an AMD64 box, both 64 bits and 32 bits.

And the numbers are, only those tests that build in both cases, left is base == "-O2", right is peak == "-O2 -fwrapv":

                    32-bits        64-bits
    164.gzip        733   733      819   820
    175.vpr         703   707      718   719
    176.gcc         886   892      977   955
    181.mcf         527   527      415   414
    186.crafty      877   893     1345  1351
    253.perlbmk     941   944      971   975
    254.gap         769   759      784   782
    255.vortex     1094  1086     1153  1122
    256.bzip2       708   707      786   782
    300.twolf      1037  1030      834   830
    168.wupwise     762   755      865   829
    171.swim        695   679      696   699
    172.mgrid       395   394      741   562
    173.applu       590   588      693   656
    177.mesa        701   693     1055  1058
    179.art         479   484      930   912
    183.equake      825   834      840   808
    188.ammp        716   723      877   862
    200.sixtrack    446   456      434   414

Note that (for unknown reasons) peak is always ~.5% higher than base on this tester even if you compare identical compilers. So 1% wins are not really interesting. What is interesting is the higher score for crafty with -fwrapv for the 32-bits case. The rest is in the noise for 32-bits. For 64-bits, gcc itself takes a hit and so do vortex and all the SPECfp benchmarks. See especially mgrid.

Gr.
Steven
Re: Do C++ signed types have modulo semantics?
On Wed, Jun 29, 2005 at 03:40:11AM -0400, Paul Schlie wrote: > > Since you made an incorrect analysis, you draw incorrect conclusions. > > - fair enough, however it seems to me that assuming overflow does not occur > and assuming overflows are trapped are logically equivalent? No. Assuming overflows are trapped puts an unnecessary constraint on the compiler. It prevents treating integer arithmetic as associative, for example (as that might introduce a new trap). > Overall, I guess I still simply believe the the first rule of optimization > is to preserve existing semantics unless explicitly authorized otherwise, > and then only if accompanied with corresponding warnings for all potentially > behavior altering assumptions applied. But C does not define the semantics of integer overflow, so there are no existing semantics to preserve.
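To illustrate the associativity point, a small example (mine, not from the mail):

    /* With overflow assumed simply not to happen, the compiler is free
       to reassociate and fold this to just 'x'.  If it instead had to
       assume that signed overflow traps, the folding could remove a trap
       that the unoptimized code would have raised, so the transformation
       would be forbidden.  */
    int fold_me(int x, int big)
    {
        return (x + big) - big;
    }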
Statement expression with function-call variable lifetime
Hello,

I'm implementing a tiny vfork/exit implementation using setjmp and longjmp. Since the function calling setjmp can't return (if you still want to longjmp to its jmp_buf) I implemented vfork using a statement expression macro. Here's my implementation of vfork.

jmp_buf *vfork_jmp_buf;

#define vfork() ({ \
        int setjmp_ret; \
        jmp_buf *prev_jmp_buf = vfork_jmp_buf, new_jmp_buf; \
        vfork_jmp_buf = &new_jmp_buf; \
        if( (setjmp_ret = setjmp(*vfork_jmp_buf)) != 0 ) \
                vfork_jmp_buf = prev_jmp_buf; \
        setjmp_ret; \
        })

Unfortunately, after tracing a nasty bug I found that the same problem applies to leaving a statement expression as does returning from a function. The storage allocated for prev_jmp_buf and new_jmp_buf is unallocated as soon as we leave the scope of the statement expression. gcc quickly reuses that stack space later in the same function, overwriting the saved jmp_buf.

Does anyone have a suggestion how I can allocate some storage space here for prev_jmp_buf and new_jmp_buf that will last the lifetime of the function call instead of the lifetime of the statement expression macro? My best idea was to use alloca, but it wouldn't look pretty. Can someone confirm that memory allocated with alloca would last the lifetime of the function call, and not the lifetime of the statement expression?

Please cc me in your reply.

Thanks,
Shaun
Re: Statement expression with function-call variable lifetime
On Wed, Jun 29, 2005 at 10:34:20AM -0700, Shaun Jackman wrote: > the statement expression macro? My best idea was to use alloca, but it > wouldn't look pretty. Can someone confirm that memory allocated with > alloca would last the lifetime of the function call, and not the > lifetime of the statement expression? Yes, that's correct (and the only way to do this). -- Daniel Jacobowitz CodeSourcery, LLC
Re: Pro64-based GPLed compiler
Marc Gonzalez-Sigler wrote: Hello everyone, I've taken PathScale's source tree (they've removed the IA-64 code generator, and added an x86/AMD64 code generator), and tweaked the Makefiles. I thought some of you might want to take a look at the compiler. http://www-rocq.inria.fr/~gonzalez/vrac/open64-alchemy-src.tar.bz2 This reference doesn't work. The directory vrac looks empty. Vlad
Re: Do C++ signed types have modulo semantics?
On Wed, 2005-06-29 at 18:46 +0200, Steven Bosscher wrote: > On Tuesday 28 June 2005 14:09, Steven Bosscher wrote: > > On Tuesday 28 June 2005 14:02, Ulrich Weigand wrote: > > > Steven Bosscher wrote: > > > > Anyway, I've started a SPEC run with "-O2" vs. "-O2 -fwrapv". Let's > > > > see how big the damage would be ;-) > > > > > > Please make sure to include a 64-bit target, where it actually makes any > > > difference. (I recall performance degradations of 20-30% in some > > > SPECfp cases from getting induction variable reduction wrong ...) > > > > Yeah, I'm testing on an AMD64 box, both 64 bits and 32 bits. > > And the numbers are, only those tests that build in both cases, > left is base == "-O2", right is peak == "-O2 -fwrapv: None of these numbers actually include real loop transformations that often take advantage of wrapping semantics, so it's not interesting. In fact, i'm surprised that the numbers are different *at all*. It's only heavy duty reordering things like vectorization, linear loop transforms, distribution, you name it, etc, that really want to know the number of iterations in a loop, etc that it makes any significant difference. So i would advise anyone arguing against turning on -fwrapv simply because it doesn't seem to hurt us at O2. And i'll again point out that the exact opposite is the default in every other compiler i'm aware of. XLC at O2 has qstrict_induction on by default (the equivalent), and warns the user when it sees a loop where it's making the assumption[1] The XLC people told me since they turned this on in 1998, they have had one real piece of code where it actually mattered, and that was a char induction variable. ICC does the same, though i don't think it bothers to warn. Open64 does the same, but no warning. Not sure about Sun CC, but i'd be very surprised if they did it. Personally, i only care about wrapping for induction variables. If you guys want to leave regular variables to do whatever, fine. But if you turn on this wrapping behavior, you have more or less given up any chance we have of performing heavy duty loop transforms on most real user code, even though the user doesn't actually give two shits about wrapping. --Dan [1] The manual points out the performance degradation using the option is quite severe. I've sent actual numbers privately to some people, but i don't want to make them public because i'm not sure the xlc folks would like it.
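The kind of loop being discussed can be shown with a short invented example:

    /* With signed overflow undefined, the compiler may assume the
       induction variable never wraps, conclude the loop executes exactly
       n+1 iterations, and then vectorize or otherwise transform it.
       With -fwrapv, n == INT_MAX would make this loop effectively
       infinite, so no trip count can be derived.  */
    void clear_array(int *a, int n)
    {
        for (int i = 0; i <= n; i++)
            a[i] = 0;
    }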
Re: Do C++ signed types have modulo semantics?
On Wednesday 29 June 2005 20:01, Daniel Berlin wrote: > So i would advise anyone arguing against turning on -fwrapv simply > because it doesn't seem to hurt us at O2. wtf, "doesn't seem to hurt us at -O2". Look again at the 64 bits numbers! Losing 5% on the fp benchmarks is a serious regression. Even without exercising the heavy-ammo loop optimizers, -fwrapv is a serious performance-hurter. Gr. Steven
Re: The utility of standard's semantics for overflow
On Wed, Jun 29, 2005 at 02:12:40PM +0100, Dave Korn wrote: > In fact, doesn't this suggest that in _most_ circumstances, *saturation* > would be the best behaviour? No, you'd be killing most emulators and a lot of virtual machine implementations. char x = (char)((unsigned char)y + (unsigned char)z) is too ugly to live. OG.
Re: Do C++ signed types have modulo semantics?
Steven Bosscher wrote: On Wednesday 29 June 2005 20:01, Daniel Berlin wrote: So i would advise anyone arguing against turning on -fwrapv simply because it doesn't seem to hurt us at O2. wtf, "doesn't seem to hurt us at -O2". Look again at the 64 bits numbers! Losing 5% on the fp benchmarks is a serious regression. Even without exercising the heavy-ammo loop optimizers, -fwrapv is a serious performance-hurter. Still these figures are very interesting for Ada, where I am afraid the kludge we do in the front end for required overflow checking is much more expensive than these figures indicate. Of course this will be *highly* target dependent.
Re:
Tom Tromey wrote: I'm checking this in on the trunk. If I remember I'll put it on the 4.0 branch once it reopens (there are a fair number of patches pending for it ... I hope it reopens soon). Mark, The extended freeze of the 4.0 branch is making things difficult for libgcj because we have a large backlog of runtime patches which we are unable to apply at this time. The longer the freeze continues, the more difficult it becomes for us to keep track of what needs applying and increases the chances that something will be forgotten, resulting in problems and wasted time further down the line. Could we get an exemption from the freeze rules for low-risk, runtime only libgcj fixes as determined by the libgcj maintainers? Alternatively, could we rethink our release policy to ensure that the duration of freezes is minimized in the future? I do think that a hard freeze in the days leading up to a release is useful and important, but keeping it in place for weeks just because a couple of bugs remain unfixed doesn't seem helpful. Thanks Bryce
Re: Do C++ signed types have modulo semantics?
On Wed, 29 Jun 2005, Daniel Berlin wrote: So i would advise anyone arguing against turning on -fwrapv simply because it doesn't seem to hurt us at O2. And i'll again point out that the exact opposite is the default in every other compiler i'm aware of. Sorry, I couldn't parse those sentences... are you saying the -fwrapv behaviour (ie. wrap-on-signed-integer-overflow) is the default or not the default in these other compilers? XLC at O2 has qstrict_induction on by default (the equivalent), and warns the user when it sees a loop where it's making the assumption[1] Which assumption? The XLC people told me since they turned this on in 1998, they have had one real piece of code where it actually mattered, and that was a char induction variable. ICC does the same, though i don't think it bothers to warn. Open64 does the same, but no warning. Not sure about Sun CC, but i'd be very surprised if they did it. Personally, i only care about wrapping for induction variables. If you guys want to leave regular variables to do whatever, fine. Are you saying you don't want induction variables to have to wrap, but you don't care about non-induction variables? Sorry if I'm being dim... I think it's excellent you're discussing what other compilers do, I just can't understand what you've said as expressed :) Nick
Re:
Bryce McKinlay wrote: Mark, Could we get an exemption from the freeze rules for low-risk, runtime only libgcj fixes as determined by the libgcj maintainers? I don't think we want to do that. First, we're close to a release. We've been waiting on one fix since Friday, and Jeff Law has promised to fix it today. Second, the whole point of freezes is to ensure stability. We're going to RC3 on this release because we've found bad problems in RC1 and RC2. If a change were to be checked in, for whatever reason, happens to break something, then we need an RC4, and everybody loses. Alternatively, could we rethink our release policy to ensure that the duration of freezes is minimized in the future? I do think that a hard freeze in the days leading up to a release is useful and important, but keeping it in place for weeks just because a couple of bugs remain unfixed doesn't seem helpful. Obviously, I'd like to have a shorter freeze period. This is the longest pre-release freeze I can remember since I've been running releases. That reflects the fact that a lot of critical problems were uncovered in 4.0.0, including some during the 4.0.1 release process. The way to get a shorter freeze period is to have fewer bugs and to fix them more quickly. I think that this release is somewhat exceptional in that after we released 4.0.0, a lot of people started using a lot of new technology, and, unsurprisingly, there are more bugs in 4.0.0 than in the average release. Please hang in there. Thanks, -- Mark Mitchell CodeSourcery, LLC [EMAIL PROTECTED] (916) 791-8304
Re: Pro64-based GPLed compiler
Vladimir Makarov wrote: > Marc Gonzalez-Sigler wrote: > >> I've taken PathScale's source tree (they've removed the IA-64 code >> generator, and added an x86/AMD64 code generator), and tweaked the >> Makefiles. >> >> I thought some of you might want to take a look at the compiler. >> >> http://www-rocq.inria.fr/~gonzalez/vrac/open64-alchemy-src.tar.bz2 > > This reference doesn't work. The directory vrac looks empty. Fixed. I'll never understand how AFS ACLs work ;-( This message was sent using IMP, the Internet Messaging Program.
Question on tree-ssa-loop-ivopts.c:constant_multiple_of
Isn't it the case that *any* conversion can be stripped for the purpose of this routine? I get an ICE compiling the Ada RTS a-strfix.adb because of that. The following seems to fix it, but is it right?

*** tree-ssa-loop-ivopts.c      26 Jun 2005 21:21:32 -      2.82
--- tree-ssa-loop-ivopts.c      29 Jun 2005 21:38:29 -
*************** constant_multiple_of (tree type, tree to
*** 2594,2599 ****
  bool negate;

! STRIP_NOPS (top);
! STRIP_NOPS (bot);

  if (operand_equal_p (top, bot, 0))
--- 2594,2603 ----
  bool negate;

! /* For determining the condition above, we can ignore all conversions, not
!    just those that don't change the mode, so can't use STRIP_NOPS here.  */
! while (TREE_CODE (top) == NOP_EXPR || TREE_CODE (top) == CONVERT_EXPR)
!   top = TREE_OPERAND (top, 0);
! while (TREE_CODE (bot) == NOP_EXPR || TREE_CODE (bot) == CONVERT_EXPR)
!   bot = TREE_OPERAND (bot, 0);

  if (operand_equal_p (top, bot, 0))
Re: Statement expression with function-call variable lifetime
On 6/29/05, Daniel Jacobowitz <[EMAIL PROTECTED]> wrote:
> On Wed, Jun 29, 2005 at 10:34:20AM -0700, Shaun Jackman wrote:
> > the statement expression macro? My best idea was to use alloca, but it
> > wouldn't look pretty. Can someone confirm that memory allocated with
> > alloca would last the lifetime of the function call, and not the
> > lifetime of the statement expression?
>
> Yes, that's correct (and the only way to do this).

Great! I've rewritten the vfork macro to use alloca. Here's the big catch: the pointer returned by alloca is stored on the stack, and it gets trashed upon leaving the statement expression. Where can I store the pointer returned by alloca? Please don't say "allocate some memory for the pointer using alloca".

My best idea so far was to make the pointer returned by alloca (struct context *c, see below) static, so that each time the macro is invoked, a little static memory is put aside for it. I think this would work, but it doesn't seem to me to be the best solution.

Here's the revised vfork macro that uses alloca:

jmp_buf *vfork_jmp_buf;

struct context {
        jmp_buf *prev_jmp_buf;
        jmp_buf new_jmp_buf;
};

# define vfork() ({ \
        int setjmp_ret; \
        struct context *c = alloca(sizeof *c); \
        c->prev_jmp_buf = vfork_jmp_buf; \
        vfork_jmp_buf = &c->new_jmp_buf; \
        if( (setjmp_ret = setjmp(*vfork_jmp_buf)) != 0 ) \
                vfork_jmp_buf = c->prev_jmp_buf; \
        setjmp_ret; \
        })

Cheers,
Shaun
Add clog10 to builtins.def, round 2
The fortran front-end needs to recognize clog10, clog10f and clog10l as proper built-ins. The attached patch tries to add them to builtins.def, under a new category: DEF_EXT_C99RES_BUILTIN (as suggested by jsm28).

Can someone review this? Is it OK?

Thanks,
François-Xavier

Index: gcc/builtins.def
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/builtins.def,v
retrieving revision 1.104
diff -u -3 -p -r1.104 builtins.def
--- gcc/builtins.def    25 Jun 2005 01:59:13 -      1.104
+++ gcc/builtins.def    29 Jun 2005 21:25:38 -
@@ -119,6 +119,13 @@ Software Foundation, 51 Franklin Street,
 DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,    \
              true, true, !flag_isoc99, ATTRS, TARGET_C99_FUNCTIONS, true)
 
+/* Builtin that C99 reserve the name for future use. We can still recognize
+   the builtin in C99 mode but we can't produce it implicitly.  */
+#undef DEF_EXT_C99RES_BUILTIN
+#define DEF_EXT_C99RES_BUILTIN(ENUM, NAME, TYPE, ATTRS)         \
+  DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,  \
+               true, true, true, ATTRS, false, true)
+
 /* Allocate the enum and the name for a builtin, but do not actually
    define it here at all.  */
 #undef DEF_BUILTIN_STUB
@@ -436,6 +443,9 @@ DEF_C99_BUILTIN        (BUILT_IN_CIMAGL,
 DEF_C99_BUILTIN(BUILT_IN_CLOG, "clog", BT_FN_COMPLEX_DOUBLE_COMPLEX_DOUBLE, ATTR_MATHFN_FPROUNDING)
 DEF_C99_BUILTIN(BUILT_IN_CLOGF, "clogf", BT_FN_COMPLEX_FLOAT_COMPLEX_FLOAT, ATTR_MATHFN_FPROUNDING)
 DEF_C99_BUILTIN(BUILT_IN_CLOGL, "clogl", BT_FN_COMPLEX_LONGDOUBLE_COMPLEX_LONGDOUBLE, ATTR_MATHFN_FPROUNDING)
+DEF_EXT_C99RES_BUILTIN (BUILT_IN_CLOG10, "clog10", BT_FN_COMPLEX_DOUBLE_COMPLEX_DOUBLE, ATTR_MATHFN_FPROUNDING)
+DEF_EXT_C99RES_BUILTIN (BUILT_IN_CLOG10F, "clog10f", BT_FN_COMPLEX_FLOAT_COMPLEX_FLOAT, ATTR_MATHFN_FPROUNDING)
+DEF_EXT_C99RES_BUILTIN (BUILT_IN_CLOG10L, "clog10l", BT_FN_COMPLEX_LONGDOUBLE_COMPLEX_LONGDOUBLE, ATTR_MATHFN_FPROUNDING)
 DEF_C99_BUILTIN(BUILT_IN_CONJ, "conj", BT_FN_COMPLEX_DOUBLE_COMPLEX_DOUBLE, ATTR_CONST_NOTHROW_LIST)
 DEF_C99_BUILTIN(BUILT_IN_CONJF, "conjf", BT_FN_COMPLEX_FLOAT_COMPLEX_FLOAT, ATTR_CONST_NOTHROW_LIST)
 DEF_C99_BUILTIN(BUILT_IN_CONJL, "conjl", BT_FN_COMPLEX_LONGDOUBLE_COMPLEX_LONGDOUBLE, ATTR_CONST_NOTHROW_LIST)
Re: Re:
On Wed, 2005-06-29 at 13:39 -0700, Mark Mitchell wrote: > First, we're close to a release. We've been waiting on one fix since > Friday, and Jeff Law has promised to fix it today. The fix is written, I'm just waiting on test results. Someone mucked up the hpux11 target description (*) which caused libstdc++ to fail to build and I had to re-bootstrap after working around that bit of lameness. I'd be surprised if I had results tonight. More likely early tomorrow assuming the machines keep running (I killed one in Raleigh yesterday to go along with the one in my basement...) Jeff (*) The hpux11 target description assumes that the linker shipped with hpux11 supports +init as an option. Well, that may work OK for some versions of hpux11, but it certainly doesn't work for hpux11.00 with the 990903 version of the linker. Grrr.
Re:
Mark Mitchell <[EMAIL PROTECTED]> writes: > Bryce McKinlay wrote: > > > Mark, > > > Could we get an exemption from the freeze rules for low-risk, > > runtime only libgcj fixes as determined by the libgcj maintainers? > > I don't think we want to do that. > > First, we're close to a release. We've been waiting on one fix since > Friday, and Jeff Law has promised to fix it today. > > Second, the whole point of freezes is to ensure stability. We're > going to RC3 on this release because we've found bad problems in RC1 > and RC2. If a change were to be checked in, for whatever reason, > happens to break something, then we need an RC4, and everybody loses. This kind of conflict is solved in version control systems by the use of branches...
Scheduler questions (related to PR17808)
Hi, I have a question about the scheduler. Forgive me if I'm totally missing the point here, this scheduling business is not my thing ;-) Consider the following snippet that I've derived from PR17808 with a few hacks in the compiler to renumber insns and dump RTL with all the dependencies before scheduling. There is a predicate register that gets set, then a few cond_exec insns, then a jump, and finally a set using some of the registers that may be set by the cond_exec insns. This is the RTL before scheduling: (insn 8 7 9 0 (set (reg:BI 262 p6 [353]) (ne:BI (reg/v:SI 15 r15 [orig:348 b1 ] [348]) (const_int 0 [0x0]))) 226 {*cmpsi_normal} (insn_list:REG_DEP_TRUE 7 (nil)) (nil)) (insn 9 8 10 0 (cond_exec (ne (reg:BI 262 p6 [353]) (const_int 0 [0x0])) (set (reg/v/f:DI 14 r14 [orig:347 t16 ] [347]) (reg/v/f:DI 112 r32 [orig:351 t ] [351]))) 680 {sync_lock_releasedi+5} (insn_list:REG_DEP_TRUE 8 (nil)) (nil)) (insn 10 9 11 0 (cond_exec (ne (reg:BI 262 p6 [353]) (const_int 0 [0x0])) (set (reg/v:SI 17 r17 [orig:346 iftmp ] [346]) (const_int 0 [0x0]))) 679 {sync_lock_releasedi+4} (insn_list:REG_DEP_TRUE 8 (nil)) (nil)) (insn 11 10 12 0 (cond_exec (ne (reg:BI 262 p6 [353]) (const_int 0 [0x0])) (set (reg/v:SI 16 r16 [orig:349 i ] [349]) (const_int 0 [0x0]))) 679 {sync_lock_releasedi+4} (insn_list:REG_DEP_TRUE 8 (nil)) (nil)) (jump_insn 12 11 13 0 (set (pc) (if_then_else (eq (reg:BI 262 p6 [353]) (const_int 0 [0x0])) (label_ref:DI 39) (pc))) 235 {*br_true} (insn_list:REG_DEP_TRUE 8 (insn_list:REG_DEP_ANTI 7 (nil))) (expr_list:REG_DEAD (reg:BI 262 p6 [353]) (expr_list:REG_BR_PROB (const_int 3300 [0xce4]) (nil ;; End of basic block 0, registers live: 1 [r1] 12 [r12] 14 [r14] 15 [r15] 16 [r16] 17 [r17] 18 [r18] 112 [r32] 320 [b0] 331 [ar.pfs] ;; Start of basic block 1, registers live: 1 [r1] 12 [r12] 14 [r14] 16 [r16] 17 [r17] 18 [r18] 112 [r32] 320 [b0] 331 [ar.pfs] (note 13 12 14 1 [bb 1] NOTE_INSN_BASIC_BLOCK) (insn 14 13 15 1 (set (mem:SI (reg/v/f:DI 14 r14 [orig:347 t16 ] [347]) [2 S4 A32]) (reg/v:SI 17 r17 [orig:346 iftmp ] [346])) 4 {*movsi_internal} (insn_list:REG_DEP_TRUE 12 (insn_list:REG_DEP_ANTI 7 (nil))) (expr_list:REG_DEAD (reg/v:SI 17 r17 [orig:346 iftmp ] [346]) (expr_list:REG_DEAD (reg/v/f:DI 14 r14 [orig:347 t16 ] [347]) (nil Then the ia64 machine-reorg scheduler gets to work, and it produces: (insn:TI 8 70 12 0 (set (reg:BI 262 p6 [353]) (ne:BI (reg/v:SI 15 r15 [orig:348 b1 ] [348]) (const_int 0 [0x0]))) 226 {*cmpsi_normal} (insn_list:REG_DEP_TRUE 7 (nil)) (nil)) (jump_insn 12 8 77 0 (set (pc) (if_then_else (eq (reg:BI 262 p6 [353]) (const_int 0 [0x0])) (label_ref:DI 39) (pc))) 235 {*br_true} (insn_list:REG_DEP_TRUE 8 (insn_list:REG_DEP_ANTI 7 (nil))) (expr_list:REG_DEAD (reg:BI 262 p6 [353]) (expr_list:REG_BR_PROB (const_int 3300 [0xce4]) (nil (note 77 12 69 1 [bb 1] NOTE_INSN_BASIC_BLOCK) (... other non-jump, non-cond_exec insns ...) (insn 14 15 16 1 (set (mem:SI (reg/v/f:DI 14 r14 [orig:347 t16 ] [347]) [2 S4 A32]) (reg/v:SI 17 r17 [orig:346 iftmp ] [346])) 4 {*movsi_internal} (insn_list:REG_DEP_TRUE 12 (insn_list:REG_DEP_ANTI 7 (nil))) (expr_list:REG_DEAD (reg/v:SI 17 r17 [orig:346 iftmp ] [346]) (expr_list:REG_DEAD (reg/v/f:DI 14 r14 [orig:347 t16 ] [347]) (nil (... other non-jump, non-cond_exec insns ...) 
(insn 10 18 11 1 (cond_exec (ne (reg:BI 262 p6 [353]) (const_int 0 [0x0])) (set (reg/v:SI 17 r17 [orig:346 iftmp ] [346]) (const_int 0 [0x0]))) 679 {sync_lock_releasedi+4} (insn_list:REG_DEP_TRUE 8 (nil)) (nil)) (insn 11 10 67 1 (cond_exec (ne (reg:BI 262 p6 [353]) (const_int 0 [0x0])) (set (reg/v:SI 16 r16 [orig:349 i ] [349]) (const_int 0 [0x0]))) 679 {sync_lock_releasedi+4} (insn_list:REG_DEP_TRUE 8 (nil)) (nil)) (... other non-jump, non-cond_exec insns ...) (insn:TI 9 60 65 1 (cond_exec (ne (reg:BI 262 p6 [353]) (const_int 0 [0x0])) (set (reg/v/f:DI 14 r14 [orig:347 t16 ] [347]) (reg/v/f:DI 112 r32 [orig:351 t ] [351]))) 680 {sync_lock_releasedi+5} (insn_list:REG_DEP_TRUE 8 (nil)) (nil)) Notice how the conditional sets of r14 and r17 in insns 9 and 10 have been moved past insn 14, which uses these registers. Shouldn't there be true dependencies on insns 9 and 10 for insn 14? Gr. Steven
PARM_DECL of DECL_SIZE 0, but TYPE_SIZE of 96 bits
Why is it that C++ can't create normal DECLs like everyone else? Case in point:

(gdb) unit size ... addressable used BLK file minimal.c line 194 size unit size $11 = void (gdb) y

So we've got a parm decl that, if you ask it for the DECL_SIZE, says 0, but has a TYPE_SIZE of 12 bytes, and we access fields in it, etc.

This is causing a bug in detecting what portions of structures are used in tree-ssa-alias, because we use DECL_SIZE, and this causes us to think we use no part of the structure, since it gets a min/max of 0!

I'm going to work around this by using TYPE_SIZE, but it would be nice if somebody could explain the purpose for this behavior (if it's a bug, i'll file a bug report). I would imagine we don't have truly empty things in C++, so you could simply assert that TREE_INT_CST_LOW of whatever you are setting DECL_SIZE to is not 0 and find these that way.

--Dan
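The workaround being described presumably amounts to something like the following fragment (a sketch using the usual tree accessors; this is not the actual patch):

    /* Fall back to the size of the type when the decl's own DECL_SIZE is
       missing or claims zero bits, as with the C++ PARM_DECL above.  */
    tree size = DECL_SIZE (decl);
    if (!size || integer_zerop (size))
      size = TYPE_SIZE (TREE_TYPE (decl));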
Re: Re:
On Wed, Jun 29, 2005 at 03:53:18PM -0600, Jeffrey A Law wrote:
> (*) The hpux11 target description assumes that the linker shipped with
> hpux11 supports +init as an option.  Well, that may work OK for some
> versions of hpux11, but it certainly doesn't work for hpux11.00 with
> the 990903 version of the linker.  Grrr.

I have an hpux11.0 box available, and could run tests of your patch
if you like.
Re: Scheduler questions (related to PR17808)
On Jun 29, 2005, at 3:46 PM, Steven Bosscher wrote:
> I have a question about the scheduler.  Forgive me if I'm totally
> missing the point here, this scheduling business is not my thing ;-)
>
> Consider the following snippet that I've derived from PR17808 with a
> few hacks in the compiler to renumber insns and dump RTL with all the
> dependencies before scheduling.  There is a predicate register that
> gets set, then a few cond_exec insns, then a jump, and finally a set
> using some of the registers that may be set by the cond_exec insns.
> This is the RTL before scheduling:
>
> Notice how the conditional sets of r14 and r17 in insns 9 and 10 have
> been moved past insn 14, which uses these registers.  Shouldn't there
> be true dependencies on insns 9 and 10 for insn 14?

I think so.  This is figured out in sched_analyze_insn in sched-deps.c,
I'd suggest stepping through there.
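For readers who would rather see the shape of the problem in C than in
RTL, here is a rough source-level analogue of the snippet (my own
reconstruction for illustration; it is not the actual PR17808 test case):

/* Hedged C analogue of the RTL above (my own reconstruction, not the
   PR17808 source).  The store through t16 sits in the block reached
   only when b1 != 0, so it must not be scheduled above the conditional
   sets that feed it.  */
void
analogue (int b1, int *t)
{
  int *t16 = (int *) 0;
  int iftmp = 1;
  int i = 1;

  if (b1 != 0)            /* insn 8 sets p6; insns 9-11 are predicated on it */
    {
      t16 = t;            /* insn 9:  cond_exec set of r14 */
      iftmp = 0;          /* insn 10: cond_exec set of r17 */
      i = 0;              /* insn 11: cond_exec set of r16 */
    }

  if (b1 == 0)            /* jump_insn 12: branch to label 39 when p6 == 0 */
    return;

  *t16 = iftmp;           /* insn 14 (basic block 1): reads r14 and r17 */
  (void) i;
}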
Re:
Hi Guys,

I don't get any feedback, but I am having problems sending to
[EMAIL PROTECTED]  Is there a trick to sending email?  I signed up for the
mailing list today.  Any suggestion would be helpful.

Best regards,
~Aalok
Re: Re: hpux regression vs 4.0.1
On Wed, 2005-06-29 at 16:17 -0700, Joe Buck wrote:
> On Wed, Jun 29, 2005 at 03:53:18PM -0600, Jeffrey A Law wrote:
> > (*) The hpux11 target description assumes that the linker shipped with
> > hpux11 supports +init as an option.  Well, that may work OK for some
> > versions of hpux11, but it certainly doesn't work for hpux11.00 with
> > the 990903 version of the linker.  Grrr.
>
> I have an hpux11.0 box available, and could run tests of your patch
> if you like.

I doubt it would have (or will make) much of a difference at this point.
I don't think anyone could have predicted the muck-up with the arguments
passed to ld and the re-bootstrap that had to run after working around it.

As long as I don't kill this 3rd machine I think we'll be OK.

jeff
Re: PARM_DECL of DECL_SIZE 0, but TYPE_SIZE of 96 bits
On Wed, Jun 29, 2005 at 06:57:12PM -0400, Daniel Berlin wrote:
> I'm going to work around this by using TYPE_SIZE, but it would be nice
> if somebody could explain the purpose for this behavior (if it's a bug,
> i'll file a bug report). I would imagine we don't have truly empty
> things in C++, so you could simply assert that TREE_INT_CST_LOW of
> whatever you are setting DECL_SIZE to is not 0 and find these that way.

It is most definitely a bug.  I'm surprised about the 0 instead of a NULL
there.  The latter would easily be explicable by forgetting to call
layout_decl.

My only guess is that this decl had an incomplete type at some point.
Is the function in question a template?  I could see as how maybe we need
to call relayout_decl after instantiation, or simply re-order how the
parm_decls are created.


r~
Re: Re: hpux regression vs 4.0.1
> > I have an hpux11.0 box available, and could run tests of your patch
> > if you like.

> I doubt it would have (or will make) much of a difference at this
> point.  I don't think anyone could have predicted the muck-up with
> the arguments passed to ld and the re-bootstrap that had to run
> after working around it.

HP bug fixed in one of their many linker updates.  The linker requirements
are documented but I can appreciate one just expects things to work in an
emergency...

> As long as I don't kill this 3rd machine I think we'll be OK.

If you have problems, please send the patch for testing.

Dave
--
J. David Anglin                                     [EMAIL PROTECTED]
National Research Council of Canada       (613) 990-0752 (FAX: 952-6602)
Re: PARM_DECL of DECL_SIZE 0, but TYPE_SIZE of 96 bits
On Wed, 2005-06-29 at 16:55 -0700, Richard Henderson wrote:
> On Wed, Jun 29, 2005 at 06:57:12PM -0400, Daniel Berlin wrote:
> > I'm going to work around this by using TYPE_SIZE, but it would be nice
> > if somebody could explain the purpose for this behavior (if it's a bug,
> > i'll file a bug report). I would imagine we don't have truly empty
> > things in C++, so you could simply assert that TREE_INT_CST_LOW of
> > whatever you are setting DECL_SIZE to is not 0 and find these that way.
>
> It is most definitely a bug.  I'm surprised about the 0 instead
> of a NULL there.  The latter would easily be explicable by
> forgetting to call layout_decl.

Okay, well, here's what happens:

1. layout_decl gets called on the original PARM_DECL, when it's still a
   template parameter, which sets the size to TYPE_SIZE, which is still 0
   at that point, so we get a DECL_SIZE of INTEGER_CST (0).

2. We copy the PARM_DECL when instantiating the template.

3. layout_decl gets called later on the copy of the PARM_DECL in the
   instantiated function, which still has a DECL_SIZE of INTEGER_CST (0).
   layout_decl does nothing to DECL_SIZE and DECL_SIZE_UNIT of this
   PARM_DECL in this case, even though TYPE_SIZE has changed from 0 to
   96 bits.

I imagine the correct thing to do here is to

1. In require_complete_types_for_parms, in the C++ FE, reset DECL_SIZE to
   NULL before we call layout_decl on the parm and let layout_decl figure
   out what to do, or

2. Add code in layout_decl to copy TYPE_SIZE/TYPE_SIZE_UNIT if the
   DECL_SIZE is INTEGER_CST (0), or

3. Not call layout_decl on the template types until they are completed.

Doing 1 fixes the bug I'm seeing (a rough sketch is below) and seems to do
the correct thing, but I'm not a C++ person, so there may be dragons this
way.

--Dan
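For concreteness, the option-1 change looks roughly like the following.
This is an untested sketch: the loop over the parms and the completeness
test are paraphrased, not copied from the actual cp/decl.c code.

/* Untested sketch of option 1: before laying out a parm whose type has
   just become complete, drop the stale DECL_SIZE so layout_decl
   recomputes it from the (now non-zero) TYPE_SIZE.  The surrounding
   loop is paraphrased, not the real require_complete_types_for_parms.  */
static void
relayout_parms_sketch (tree parms)
{
  tree parm;

  for (parm = parms; parm; parm = TREE_CHAIN (parm))
    {
      tree type = TREE_TYPE (parm);

      if (type == NULL_TREE || VOID_TYPE_P (type))
        continue;

      if (COMPLETE_TYPE_P (complete_type (type)))
        {
          /* Forget the INTEGER_CST (0) size computed while the type was
             still incomplete; layout_decl will pick up TYPE_SIZE.  */
          DECL_SIZE (parm) = NULL_TREE;
          DECL_SIZE_UNIT (parm) = NULL_TREE;
          layout_decl (parm, 0);
        }
    }
}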
Re: Pro64-based GPLed compiler
On Wed, 2005-06-29 at 14:01 -0400, Vladimir Makarov wrote:
> Marc Gonzalez-Sigler wrote:
>
> > Hello everyone,
> >
> > I've taken PathScale's source tree (they've removed the IA-64 code
> > generator, and added an x86/AMD64 code generator), and tweaked the
> > Makefiles.
> >
> > I thought some of you might want to take a look at the compiler.
> >
> > http://www-rocq.inria.fr/~gonzalez/vrac/open64-alchemy-src.tar.bz2
>
> This reference doesn't work.  The directory vrac looks empty.

> The only other interesting thing they've done is add a simdizer.

I diff'd the PathScale compiler and the open64 compiler source, and the
main differences are:

- A bunch of random code #ifdef KEY'd
- A SIMDizer, which doesn't look like it's as good as ours; it just has
  better dependence and alias info to work with ATM.
Re: Question on tree-ssa-loop-ivopts.c:constant_multiple_of
On Jun 29, 2005, at 5:44 PM, Richard Kenner wrote:
> Isn't it the case that *any* conversion can be stripped for the purpose
> of this routine?  I get an ICE compiling the Ada RTS a-strfix.adb
> because of that.  The following seems to fix it, but is it right?

This is PR 22212: http://gcc.gnu.org/PR22212

Thanks,
Andrew Pinski
Re: Statement expression with function-call variable lifetime
Shaun Jackman wrote:
> Hello,
>
> I'm implementing a tiny vfork/exit implementation using setjmp and
> longjmp.  Since the function calling setjmp can't return (if you still
> want to longjmp to its jmp_buf), I implemented vfork using a statement
> expression macro.  Here's my implementation of vfork.
>
> jmp_buf *vfork_jmp_buf;
>
> #define vfork() ({ \
>     int setjmp_ret; \
>     jmp_buf *prev_jmp_buf = vfork_jmp_buf, new_jmp_buf; \
>     vfork_jmp_buf = &new_jmp_buf; \
>     if( (setjmp_ret = setjmp(*vfork_jmp_buf)) != 0 ) \
>         vfork_jmp_buf = prev_jmp_buf; \
>     setjmp_ret; \
> })
>
> Unfortunately, after tracing a nasty bug I found that the same problem
> applies to leaving a statement expression as does returning from a
> function.  The storage allocated for prev_jmp_buf and new_jmp_buf is
> deallocated as soon as we leave the scope of the statement expression.
> gcc quickly reuses that stack space later in the same function,
> overwriting the saved jmp_buf.
>
> Does anyone have a suggestion how I can allocate some storage space here
> for prev_jmp_buf and new_jmp_buf that will last the lifetime of the
> function call instead of the lifetime of the statement expression macro?
> My best idea was to use alloca, but it wouldn't look pretty.  Can someone
> confirm that memory allocated with alloca would last the lifetime of the
> function call, and not the lifetime of the statement expression?
>
> Please cc me in your reply.
>
> Thanks,
> Shaun

Alloca is like creating a stack variable, except it just gives you some
generic bytes that don't mean anything.  Exiting the local scope will
trash the local variables and anything done with alloca().  You'll need
to store some information in a global variable.  This C exceptions method
stores things as a linked list in nested stack frames and keeps a pointer
to the list in a global variable.  Well worth studying:

http://ldeniau.home.cern.ch/ldeniau/html/exception/exception.html
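To make the failure mode concrete, here is a minimal stand-alone demo (my
own, not from the thread) of an address escaping a statement expression
and the slot possibly being reused afterwards; whether the clobber is
visible depends on the compiler and flags, since reading through the
escaped pointer is undefined behaviour either way:

/* Minimal demo (not from the thread) of the lifetime problem described
   above: a pointer to a statement-expression local escapes, and the
   compiler is free to reuse that stack slot once the ({ ... }) ends.  */
#include <stdio.h>

int *escaped;

int
main (void)
{
  int a = ({ int tmp = 42; escaped = &tmp; tmp; });

  int filler[16];
  for (int i = 0; i < 16; i++)
    filler[i] = i;                /* may land on top of tmp's old slot */

  printf ("a = %d, *escaped = %d\n", a, *escaped);
  return filler[3];
}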
Ada is broken in a clean directory
Ada is now broken on the mainline by:

2005-06-28  Paul Brook  <[EMAIL PROTECTED]>

	* Makefile.in: Set and use UNWIND_H.  Install as unwind.h.
	* c-decl.c (finish_decl): Call default_init_unwind_resume_libfunc.
	* except.c (add_ehspec_entry): Generate arm eabi filter lists.
	(assign_filter_values): Ditto.
	...

The error is:

/Users/pinskia/src/cool/gcc/gcc/ada/raise.c:98:20: unwind.h: No such file or directory
/Users/pinskia/src/cool/gcc/gcc/ada/raise.c:109: error: parse error before "__gnat_Unwind_RaiseException"
/Users/pinskia/src/cool/gcc/gcc/ada/raise.c:109: warning: type defaults to `int' in declaration of `__gnat_Unwind_RaiseException'

Thanks,
Andrew Pinski
initializing wider-than-pointers
For some chips, like xstormy16 and m16c, function pointers are wider than
other pointers.  These types of chips can use thunks to get around this
for gcc, but there are still a few cases when you want to know the real
(>16 bit) address of the function (reset vectors, for example).  If you
use code like this:

	long x = (long) &main;

(long long suffices for 32 bit machines, or -m32 on x86_64)

You stumble across two problems.  First, the tree for this has not one but
two conversions, so this code in varasm.c:

    case CONVERT_EXPR:
    case NOP_EXPR:
      {
	tree src;
	tree src_type;
	tree dest_type;

	src = TREE_OPERAND (value, 0);
	src_type = TREE_TYPE (src);
	dest_type = TREE_TYPE (value);

needs to look something like this:

	src = TREE_OPERAND (value, 0);
	while (TREE_CODE (src) == CONVERT_EXPR
	       || TREE_CODE (src) == NOP_EXPR)
	  src = TREE_OPERAND (src, 0);

Once past that, there's a second bit of code in varasm.c (output_constant):

	thissize = int_size_in_bytes (TREE_TYPE (exp));

which means the example code gets assembled like this (for example):

	.short main
	.zero 2

What we need, however, is this:

	.long main

First, this allows the linker to put in the real address of main rather
than the 16 bit thunk, and second, the code in output_constant() always
puts the zeros after the short, even on big endian machines (visual
inspection, not tested).

Ideas?
[C++] Re: PARM_DECL of DECL_SIZE 0, but TYPE_SIZE of 96 bits
On Wed, Jun 29, 2005 at 09:17:07PM -0400, Daniel Berlin wrote:
> 1. In require_complete_types_for_parms, in the C++ FE, reset DECL_SIZE
> to NULL before we call layout_decl on the parm and let layout_decl
> figure out what to do.

This is what relayout_decl does.

> 2. Add code in layout_decl to copy TYPE_SIZE/TYPE_SIZE_UNIT if the
> DECL_SIZE is integer_cst (0)

Bad.

> 3. Not call layout_decl on the template types until they are completed.

Certainly an option; not doing extra work is good.

4. Make sure that template types are incomplete.  That is, with
TYPE_SIZE/TYPE_SIZE_UNIT unset.

A C++ front end maintainer should help choose.


r~
Re: Statement expression with function-call variable lifetime
On 6/29/05, Russell Shaw <[EMAIL PROTECTED]> wrote:
> Alloca is like creating a stack variable, except it just gives you some
> generic bytes that don't mean anything.  Exiting the local scope will
> trash the local variables and anything done with alloca().  You'll need
> to store some information in a global variable.  This C exceptions method
> stores things as a linked list in nested stack frames and keeps a pointer
> to the list in a global variable.  Well worth studying:
>
> http://ldeniau.home.cern.ch/ldeniau/html/exception/exception.html

Thanks!  That link was a very good reference.  Here's the new and improved
vfork macro.  It works nicely!  If anyone's curious enough, I can post the
exit and waitpid macros that accompany it.  I'm using these macros to
implement a Busybox system that does not need a fork system call.

Thanks for all your help, Daniel and Russell!

Cheers,
Shaun

struct vfork_context {
	struct vfork_context *prev;
	jmp_buf jmp;
} *vfork_context;

# define vfork() ({ \
	int setjmp_ret; \
	struct vfork_context *c = alloca(sizeof *c); \
	c->prev = vfork_context; \
	vfork_context = c; \
	if( (setjmp_ret = setjmp(c->jmp)) != 0 ) \
		vfork_context = vfork_context->prev; \
	setjmp_ret; \
})
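Since the thread stops short of the companion macros, here is a
hypothetical sketch (mine, not Shaun's code) of how an exit() macro and a
caller might fit together with the vfork() above.  The 0x10000 status tag
is an invented convention so that an exit status of 0 still makes setjmp
return nonzero; the vfork_context struct and vfork() macro are repeated
from the mail above only to keep the sketch self-contained.

/* Hypothetical companion sketch, not from the thread.  */
#include <alloca.h>
#include <setjmp.h>

struct vfork_context {
	struct vfork_context *prev;
	jmp_buf jmp;
} *vfork_context;

#define vfork() ({ \
	int setjmp_ret; \
	struct vfork_context *c = alloca(sizeof *c); \
	c->prev = vfork_context; \
	vfork_context = c; \
	if( (setjmp_ret = setjmp(c->jmp)) != 0 ) \
		vfork_context = vfork_context->prev; \
	setjmp_ret; \
})

/* Invented exit(): hand control back to the matching vfork() site,
   tagging the status so the longjmp value is never 0.  */
#define exit(status) longjmp(vfork_context->jmp, 0x10000 | ((status) & 0xff))

static void child_work(void) { /* applet main would go here */ }

int run_child(void)
{
	int ret = vfork();
	if (ret == 0) {
		/* "child": runs on the parent's stack until it calls exit() */
		child_work();
		exit(0);
	}
	/* "parent": resumes here after the child's exit(); low byte is the status */
	return ret & 0xff;
}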