Re: gcc.c-torture/execute/stdarg-2.c: long vs int
On Mon, Aug 22, 2005 at 08:38:01PM -0400, DJ Delorie wrote:
> This test assumes that integer constants passed as varargs are
> promoted to a type at least as big as "long", which is not valid on 16
> bit hosts.  For example:
>
> void
> f1 (int i, ...)
> {
>   va_start (gap, i);
>   x = va_arg (gap, long);
>
> int
> main (void)
> {
>   f1 (1, 79);
>   if (x != 79)
>     abort ();
>
> Shouldn't those constants be 79L, not just 79?  That change fixes one
> m32c failure, but given that it's a test case I'm not going to make
> any assumptions about it.

This certainly wasn't my intention, please change it to 79L.

	Jakub
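[For illustration only, not part of the original mail: under the default
argument promotions an unsuffixed 79 is passed as an int, so on a target
where long is wider than int, va_arg (gap, long) reads the wrong number of
bytes.  A minimal self-contained sketch of the corrected pairing, using the
names from the quoted test case:]

    #include <stdarg.h>
    #include <stdlib.h>

    static long x;

    static void
    f1 (int i, ...)
    {
      va_list gap;
      va_start (gap, i);
      x = va_arg (gap, long);   /* expects a long in the argument list */
      va_end (gap);
    }

    int
    main (void)
    {
      f1 (1, 79L);              /* 79L matches va_arg (gap, long);
                                   a plain 79 is only promoted to int */
      if (x != 79L)
        abort ();
      return 0;
    }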
Re: Warning Behavior
Ivan Novick <[EMAIL PROTECTED]> writes:
> How come the following code would not be considered a Warning?

Try -Wextra.

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Re: please update the gcj main page
* Gerald Pfeifer:

> On Sun, 31 Jul 2005, Daniel Berlin wrote:
>> For code.
>> I have never seen such claims made for documentation, since it's much
>> easier to remove and deal with infringing docs than code.
>
> I have seen such statements, by RMS himself.

The official position might have changed (e.g. copyright assignments
and documentation).
Re: Memory usage reduction in loop.c ?
Christophe Jaillet <[EMAIL PROTECTED]> wrote:

> I think that the structure 'struct loop_info' in loop.c could be
> shrunk a bit if all the 'int has_XXX' fields were turned into a
> bitfield, just as in 'struct iv_class' or 'struct induction' in the
> same file.
>
> I don't know if it is worth it (in terms of memory usage reduction)
> nor what the impact on performance would be.
>
> If anyone is interested, I can try it and do a bootstrap, but I don't
> have the tools to run benchmarks (memory usage or speed of the
> compiler).

loop.c is a dead man walking.  It will probably be removed in GCC 4.2, so
I wouldn't waste my time on it.  If you want to improve the RTL loop
optimizers, look into the new RTL loop optimizer (loop-*.c).

Giovanni Bajo
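[For reference, a hedged sketch of the kind of change being discussed; the
flag names below are illustrative and not necessarily the actual members of
'struct loop_info':]

    /* Before: each flag occupies a full int.  */
    struct loop_info_example_before
    {
      int has_call;                     /* illustrative names only */
      int has_prefetch;
      int has_multiple_exit_targets;
    };

    /* After: single-bit fields, in the style already used by
       'struct iv_class' and 'struct induction', so all the flags
       share a single word instead of one int each.  */
    struct loop_info_example_after
    {
      unsigned has_call : 1;
      unsigned has_prefetch : 1;
      unsigned has_multiple_exit_targets : 1;
    };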
Successful build of gcc-3.4.4 on Mac OS X 10.2.8
Not that it comes as a big surprise, but I successfully compiled gcc-3.4.4
on darwin 6.8 (as specified by uname -a).

config.guess reports powerpc-apple-darwin.6.8

gcc -v reports:
  Configured with ../gcc-3.4.4/configure --program-suffix=-3.4.4
    --enable-languages=c,c++,f77,java,objc
  Thread model: posix
  gcc version 3.4.4

Compiled for c, c++, fortran and java; bootstrapped from Apple's gcc-3.3.
No problems to report, clean compile.

Xavier
Question about an rtx expression.
Hello,

Is it true that in a SET, a search for a _use_ of a register in the LHS
should be done only inside a memory address?

Like in this SET:

(set (mem:SI (plus:DI (reg:DI 159)
                      (reg/v/f:DI 150)))
     (subreg/s:SI (reg/v:DI 142 [ j ]) 4)) -1 (nil)

Registers 142, 159 and 150 are used and no register is defined.

Thanks,
Leehod.
Re: Question about an rtx expression.
Leehod Baruch wrote:
> Hello,
> Is it true that in a SET, a search for a _use_ of a register in the LHS
> should be done only inside a memory address?

Also within the second and third arguments of a ZERO_EXTRACT.  And its
first argument may be a MEM, in which case you should look into it.  Look
at df_uses_record in df.c for more information.

But you can simply use the data flow info you compute, and just avoid uses
that have the DF_REF_READ_WRITE flag set (because they occur in the LHS,
or within an autoincrement/autodecrement)?

Paolo
[GCC 4.x][AMD64 ABI] variadic function
Hi to everyone,

I cannot figure out how variadic functions are practically implemented.
In the called (variadic) function, after a few pushes, %rsp is suddenly
decremented by N bytes: does the red zone start 128 bytes below the NEW
%rsp, or at %rsp-N and above?

Is it possible to find the register save area and the overflowing
arguments within the called function without using %ebp (that is, with
-fomit-frame-pointer set) and knowing nothing about the caller?

Is the so-called spill reg area placed at a fixed address?

Thanks in advance,
Matteo
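[Illustrative note, not from the original mail: whatever the exact frame
layout, the portable way for the callee to reach both the register save
area and the stack overflow arguments is va_start/va_arg, which the
compiler expands into the ABI-mandated bookkeeping.  A minimal sketch; the
function and its arguments are hypothetical:]

    #include <stdarg.h>
    #include <stdio.h>

    /* sum_ints walks its variable arguments without knowing anything
       about the caller's frame; va_start/va_arg hide whether a given
       argument lives in the register save area or the overflow area.  */
    static long
    sum_ints (int count, ...)
    {
      va_list ap;
      long sum = 0;
      int i;

      va_start (ap, count);
      for (i = 0; i < count; i++)
        sum += va_arg (ap, int);
      va_end (ap);
      return sum;
    }

    int
    main (void)
    {
      printf ("%ld\n", sum_ints (3, 1, 2, 3));  /* prints 6 */
      return 0;
    }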
m64
Hello,

can anyone tell me how to use the option -m64 in g++ (GCC) 3.4.3 20050227
(Red Hat 3.4.3-22.1)?

When I run the command line:

  g++ -m64 -o test test.cc

this error message is output:

  /tmp/ccyjpGIh.o(.text+0x900): In function `main':
  : relocation truncated to fit: R_X86_64_32 . . .

best regards
Jian
Re: Warning Behavior
Andreas Schwab wrote:
> Try -Wextra.

Ah thanks!  I have already lost time several times due to this almost
invisible mistake and I didn't know -Wextra would catch it.  However, it
seems to only work for the C compiler, not for C++.  (Using GCC 3.4.4)

(Oops, sorry Andreas, I actually meant to only send the message to the
list.)

jlh
Re: [GCC 4.x][AMD64 ABI] variadic function
* Matteo Emanuele:

> Is it possible to find the register save area and the
> overflowing arguments within the called function
> without using %ebp (that means with
> -fomit-frame-pointer set) and knowing nothing of the
> caller?

You mean, if the caller called the function as if it were a non-variadic
function?
Re: [RFA] Nonfunctioning split in rs6000 back-end
David Edelsohn wrote:
> Paolo Bonzini writes:
>
> Paolo> I'm testing a patch that does this replacement, and I can post it
> Paolo> tomorrow morning.  It has triggered only a dozen times so far
> Paolo> (half in libgcc, half in the compiler), but it may be worth
> Paolo> keeping it.
>
> It would be nice to keep this type of optimization if the re-engineered
> version works.

Here it is, bootstrapped and regtested on powerpc-apple-darwin8.1.0.
Ok for mainline?

Paolo

2005-08-22  Paolo Bonzini  <[EMAIL PROTECTED]>

	* config/rs6000/predicates.md (equality_operator): New.
	* config/rs6000/rs6000.md: Rewrite as a peephole2 the split for
	comparison with a large constant.

Index: config/rs6000/predicates.md
===
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/predicates.md,v
retrieving revision 1.23
diff -p -u -r1.23 predicates.md
--- config/rs6000/predicates.md	11 Aug 2005 21:18:11 -	1.23
+++ config/rs6000/predicates.md	22 Aug 2005 20:44:32 -
@@ -710,6 +710,10 @@ (define_predicate "boolean_or_operator"
   (match_code "ior,xor"))
 
+;; Return true if operand is an equality operator.
+(define_special_predicate "equality_operator"
+  (match_code "eq,ne"))
+
 ;; Return true if operand is MIN or MAX operator.
 (define_predicate "min_max_operator"
   (match_code "smin,smax,umin,umax"))

Index: rs6000.md
===
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.400
diff -p -u -r1.400 rs6000.md
--- rs6000.md	20 Aug 2005 04:17:17 -	1.400
+++ rs6000.md	22 Aug 2005 20:41:44 -
@@ -10727,32 +10727,43 @@
   [(set_attr "type" "cmp")])
 
 ;; If we are comparing a register for equality with a large constant,
-;; we can do this with an XOR followed by a compare.  But we need a scratch
-;; register for the result of the XOR.
-
-(define_split
-  [(set (match_operand:CC 0 "cc_reg_operand" "")
-	(compare:CC (match_operand:SI 1 "gpc_reg_operand" "")
-		    (match_operand:SI 2 "non_short_cint_operand" "")))
-   (clobber (match_operand:SI 3 "gpc_reg_operand" ""))]
-  "find_single_use (operands[0], insn, 0)
-   && (GET_CODE (*find_single_use (operands[0], insn, 0)) == EQ
-       || GET_CODE (*find_single_use (operands[0], insn, 0)) == NE)"
-  [(set (match_dup 3) (xor:SI (match_dup 1) (match_dup 4)))
-   (set (match_dup 0) (compare:CC (match_dup 3) (match_dup 5)))]
-  "
-{
-  /* Get the constant we are comparing against, C, and see what it looks like
-     sign-extended to 16 bits.  Then see what constant could be XOR'ed
-     with C to get the sign-extended value.  */
-
-  HOST_WIDE_INT c = INTVAL (operands[2]);
+;; we can do this with an XOR followed by a compare.  But this is profitable
+;; only if the large constant is only used for the comparison (and in this
+;; case we already have a register to reuse as scratch).
+
+(define_peephole2
+  [(set (match_operand:GPR 0 "register_operand")
+	(match_operand:GPR 1 "logical_operand" ""))
+   (set (match_dup 0) (match_operator:GPR 3 "boolean_or_operator"
+			[(match_dup 0)
+			 (match_operand:GPR 2 "logical_operand" "")]))
+   (set (match_operand:CC 4 "cc_reg_operand" "")
+	(compare:CC (match_operand:GPR 5 "gpc_reg_operand" "")
+		    (match_dup 0)))
+   (set (pc)
+	(if_then_else (match_operator 6 "equality_operator"
+			[(match_dup 4) (const_int 0)])
+		      (match_operand 7 "" "")
+		      (match_operand 8 "" "")))]
+  "peep2_reg_dead_p (3, operands[0])"
+  [(set (match_dup 0) (xor:GPR (match_dup 5) (match_dup 9)))
+   (set (match_dup 4) (compare:CC (match_dup 0) (match_dup 10)))
+   (set (pc) (if_then_else (match_dup 6) (match_dup 7) (match_dup 8)))]
+
+{
+  /* Get the constant we are comparing against, and see what it looks like
+     when sign-extended from 16 to 32 bits.  Then see what constant we could
+     XOR with SEXTC to get the sign-extended value.  */
+  rtx cnst = simplify_const_binary_operation (GET_CODE (operands[3]),
+					      GET_MODE (operands[3]),
+					      operands[1], operands[2]);
+  HOST_WIDE_INT c = INTVAL (cnst);
   HOST_WIDE_INT sextc = ((c & 0x) ^ 0x8000) - 0x8000;
   HOST_WIDE_INT xorv = c ^ sextc;
-  operands[4] = GEN_INT (xorv);
-  operands[5] = GEN_INT (sextc);
-}")
+  operands[9] = GEN_INT (xorv);
+  operands[10] = GEN_INT (sextc);
+})
 
 (define_insn "*cmpsi_internal2"
   [(set (match_operand:CCUNS 0 "cc_reg_operand" "=y")
Re: Problem with the special live analyzer in global alloc
Hello,

sorry for the late answer.

> Vlad promised to update it to use df.c once it wasn't "1% slower", which
> would make it easily reusable elsewhere, but never did.
> Of course, you could reuse it without that, but then someone will
> invariably come along and mess with it.

Ok, I understand that implementing the special liveness analyzer in global
alloc using the df.c framework would ease reusing it somewhere else.  But
my question was more basic.

So do you agree that using one liveness analyzer for checking what an
optimizer step has done, based on a second liveness analyzer's output, is
wrong?  If so, what is the way to fix this?  Going back to the normal
analyzer in global alloc would make global alloc create worse code.  But
on the other hand, using the global alloc liveness analyzer everywhere
else would be a change which nobody would agree with in the current
development stage.

Because this is a regression from 4.0 to 4.1 this should be fixed as soon
as possible.

Bye,

-Andreas-
Re: Problem with the special live analyzer in global alloc
Andreas Krebbel wrote:

> Ok, I understand that implementing the special liveness analyzer in
> global alloc using the df.c framework would ease reusing it somewhere
> else.  But my question was more basic.
> So do you agree that using one liveness analyzer for checking what an
> optimizer step has done, based on a second liveness analyzer's output,
> is wrong?  If so, what is the way to fix this?  Going back to the
> normal analyzer in global alloc would make global alloc create worse
> code.  But on the other hand, using the global alloc liveness analyzer
> everywhere else would be a change which nobody would agree with in the
> current development stage.

Jim Wilson once suggested we should just emit insns to make sure every
register is initialized and be done with it - problem solved.  I had
started to work on that; if people think it's a good idea I can dig that
stuff out again.

Bernd
Re: Problem with the special live analyzer in global alloc
On Tue, 2005-08-23 at 16:44 +0200, Bernd Schmidt wrote:
> Andreas Krebbel wrote:
>
> > Ok, I understand that implementing the special liveness analyzer in
> > global alloc using the df.c framework would ease reusing it somewhere
> > else.  But my question was more basic.
> > So do you agree that using one liveness analyzer for checking what an
> > optimizer step has done, based on a second liveness analyzer's output,
> > is wrong?  If so, what is the way to fix this?  Going back to the
> > normal analyzer in global alloc would make global alloc create worse
> > code.  But on the other hand, using the global alloc liveness analyzer
> > everywhere else would be a change which nobody would agree with in the
> > current development stage.
>
> Jim Wilson once suggested we should just emit insns to make sure every
> register is initialized and be done with it - problem solved.

But doesn't this actually make the information you get worse?

Partial liveness gives you an answer, which is "It's not really live here,
because it's not defined".  If you make them all defined, then it's going
to be live where it wasn't before, even though it's not really *used* over
those paths.

> I had started to work on that, if people think it's a good idea I can
> dig that stuff out again.
>
> Bernd
Re: Problem with the special live analyzer in global alloc
Daniel Berlin wrote:
> If you make them all defined, then it's going to be live where it wasn't
> before, even though it's not really *used* over those paths.

The idea is to put the initialization insns only on the paths where the
register will be uninitialized.

Bernd
Re: Problem with the special live analyzer in global alloc
On Tuesday 23 August 2005 17:06, Bernd Schmidt wrote:
> The idea is to put the initialization insns only on the paths where the
> register will be uninitialized.

int foo (int n)
{
  int a;
  while (--n)
    a = n;
  return a;
}

Not knowing n, how can you be sure whether "a" is uninitialized for the
"return" statement or not?

Gr.
Steven
Re: Problem with the special live analyzer in global alloc
On Tue, 2005-08-23 at 17:06 +0200, Bernd Schmidt wrote:
> Daniel Berlin wrote:
>
> > If you make them all defined, then it's going to be live where it
> > wasn't before, even though it's not really *used* over those paths.
>
> The idea is to put the initialization insns only on the paths where the
> register will be uninitialized.

Again, that will just make the register live over those paths, when it
wasn't before, which makes your information about liveness worse.

IE if you had

int foo(void)
{
  int a;
  if (blah)
    a = 5;
}

and you transform this to:

int foo(void)
{
  int a;
  if (blah)
    a = 5;
  else
    a = 0;
}

a is now considered live over both paths of the branch, whereas with the
partial availability liveness, it will only be considered live over the
path it is actually initialized before use, which is the if branch.

Conservative initialization will also lead to sets you can't eliminate,
and will generate real code, even if unreachable in practice.  Consider:

int argc;
int foo(void)
{
  int a;
  while (argc--)
    a =
}

Because you don't know the value of argc, dataflow will tell you it may be
uninitialized here.  To make it initialized, you'd have to conservatively
transform this to:

int a;
a = 0;
while (argc--)
  a =

Because you still don't know the value of argc, you won't be able to
remove the a = 0.

Besides not being able to remove them, you have to worry about placement
when it comes to loops.  Consider the simple nested loops

for i = 1 to 10
{
  while (argv--)
  {
    a =
  }
}

If you just "stupidly" place the initializations (IE don't do LCM-like
dataflow to determine where they can go), you will transform this into:

for i = 1 to 10
{
  a = 0;
  while (argv--)
  {
    a =
  }
}

You could avoid all but the "worse information" problem by tracking which
sets you added, and thus know you can remove the sets if things get bad,
since they don't affect the original program.  However, this probably ends
up being just as ugly as the partial liveness stuff.

--Dan
Re: Question about an rtx expression.
Leehod Baruch <[EMAIL PROTECTED]> writes:
> Is it true that in a SET, a search for a _use_ of a register
> in the LHS should be done only inside a memory address?

See refers_to_regno_p for an example of a function which looks for all
uses of a register.

Ian
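[Hedged sketch, not from the original mails, using the rtlanal.c helper Ian
mentions; the wrapper function is hypothetical and register 142 is simply
taken from Leehod's example:]

    /* Does INSN use register 142 anywhere?  refers_to_regno_p walks the
       whole rtx, so a use buried in the LHS (e.g. inside a MEM address)
       is found, while a plain store into the register itself is not
       counted as a use.  */
    static int
    insn_uses_reg_142_p (rtx insn)
    {
      return refers_to_regno_p (142, 143, PATTERN (insn), NULL);
    }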
Re: Bug in builtin_floor optimization
On Mon, 22 Aug 2005, Dale Johannesen wrote:
> There is some clever code in convert_to_real that converts
>
>   double d;
>   (float)floor(d)
>
> to
>
>   floorf((float)d)
> ...
>
> Comments?  Should I preserve the buggy behavior with -ffast-math?

Good catch.  This is indeed a -ffast-math (or more precisely a
flag_unsafe_math_optimizations) transformation.  I'd prefer to keep these
transformations with -ffast-math, as Jan described them as significantly
helping SPEC's mesa when they were added.

My one comment is that we should try to make sure that we continue to
optimize the common safe case (even without -ffast-math):

  float x, y;
  x = floor(y);

i.e. that (float)floor((double)y) is the same as floorf(y).

Hmm, it might be good to know the relative merits of the safe vs. unsafe
variants.  If the majority of the benefit is from the "safe" form, I
wouldn't be opposed to removing the "unsafe" form completely, if people
think it's an optimization "too far".

Thanks for investigating this.

Roger
--
Re: please update the gcj main page
--- Florian Weimer <[EMAIL PROTECTED]> wrote:
> * Gerald Pfeifer:
>
> > On Sun, 31 Jul 2005, Daniel Berlin wrote:
> >> For code.
> >> I have never seen such claims made for documentation, since it's much
> >> easier to remove and deal with infringing docs than code.
> >
> > I have seen such statements, by RMS himself.
>
> The official position might have changed (e.g. copyright assignments
> and documentation).

I had one thing I'd like to add to this thread:

I spend some amount of time updating various GNU/Linux-related docs on the
web.  Before wikis became popular (or, at least, before I knew about
them), updating a project's docs meant figuring out how to get the site's
source via cvs, learning LinuxDoc/DocBook, and sending patches or getting
commit access.  I never got involved with that.

Now that many projects are using wikis, I can log in, make
corrections/additions, and log out.  Not to mention how simple most wiki
formatting rules are.  It's a piece of cake.  The only thing that bugs me
is that sometimes the wiki police trample over some nicely crafted bit of
work I've done, but that's not too often.

Devs on these mailing lists have repeatedly mentioned how receptive they
are to having more newb-friendly docs contributed, but it's just *so*
*darn* *easy* to work with a wiki that I'm spoiled rotten, and I'm quickly
getting too lazy to start doing it the old way.  (It occurs to me to
wonder if tldp is beginning to see fewer updates to their docs because
folks are preferring to use wikis.)

IMO, it's best to keep wikis editable only by folks/accounts that've been
approved somehow.  It shouldn't be too much trouble for a wiki maintainer
to enable/disable users as needed.  (Though some folks have mentioned that
they monitor the wiki continuously and are emailed notifications every
time a change is made, so maybe it's not necessary to only allow approved
contributors.)

Anyhow, that's my opinion FWIW, coming from someone who writes pretty good
newb-friendly docs, on various wikis, every now and again.  IMO, if
there's some issue with licensing/copyright and wikis for GNU projects, it
should be straightened out so everyone can easily start contributing to
the docs, wiki-style.  That seems to be the future of web docs AFAICT.

---John
Re: Bug in builtin_floor optimization
On Tue, Aug 23, 2005 at 09:28:50AM -0600, Roger Sayle wrote:
> Good catch.  This is indeed a -ffast-math (or more precisely a
> flag_unsafe_math_optimizations) transformation.  I'd prefer to
> keep these transformations with -ffast-math, as Jan described them
> as significantly helping SPEC's mesa when they were added.

Are you sure it was "(float)floor(d)"->"floorf((float)d)" that helped
mesa and not "(float)floor((double)f)"->"floorf(f)" ?

It wouldn't bother me if the first transformation went away even for
-ffast-math.  It seems egregiously wrong.

r~
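[A small numeric illustration, not from the original mails, of why the
first transformation is unsafe: pick a double that rounds up when narrowed
to float, so floor-then-narrow and narrow-then-floorf disagree.]

    #include <math.h>
    #include <stdio.h>

    int
    main (void)
    {
      double d = 16777215.5;         /* 2^24 - 0.5, not representable as float */

      float a = (float) floor (d);   /* floor(d) = 16777215.0, exact in float   */
      float b = floorf ((float) d);  /* (float)d rounds up to 16777216.0f       */

      printf ("%.1f %.1f\n", a, b);  /* prints 16777215.0 16777216.0 */
      return 0;
    }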
Assembling pending decls before writing their debug info
Hi Guys,

There is a problem with unit-at-a-time compilation and DWARF debug info
generation.  Consider this small test case which has been derived from
GDB's observer.c source file:

  int observer_test_first_observer = 0;
  int observer_test_second_observer = 0;
  int observer_test_third_observer = 0;

  void
  observer_test_first_notification_function (void)
  {
    observer_test_first_observer++;
  }

  void
  observer_test_second_notification_function (void)
  {
    observer_test_second_observer++;
  }

  void
  observer_test_third_notification_function (void)
  {
    observer_test_third_observer++;
  }

When compiled with the current mainline gcc sources for an x86 native
target and with "-g -O2 -dA" on the command line the following debug info
is produced:

  [snip]
        .long   .LASF0   # DW_AT_name: "observer_test_first_observer"
        .byte   0x1      # DW_AT_decl_file
        .byte   0x1      # DW_AT_decl_line
        .long   0x37     # DW_AT_type
        .byte   0x1      # DW_AT_external
        .byte   0x5      # DW_AT_location
        .byte   0x3      # DW_OP_addr
        .long   observer_test_first_observer
        .uleb128 0x3     # (DIE (0x37) DW_TAG_base_type)
        .ascii "int\0"   # DW_AT_name
        .byte   0x4      # DW_AT_byte_size
        .byte   0x5      # DW_AT_encoding
        .uleb128 0x4     # (DIE (0x3e) DW_TAG_variable)
        .long   .LASF1   # DW_AT_name: "observer_test_second_observer"
        .byte   0x1      # DW_AT_decl_file
        .byte   0x2      # DW_AT_decl_line
        .long   0x37     # DW_AT_type
        .byte   0x1      # DW_AT_external
        .byte   0x0      # DW_AT_const_value
  [snip]

Note how observer_test_first_observer is correctly defined as having a
DW_AT_location and a DW_OP_addr whereas observer_test_second_observer is
incorrectly defined as having a DW_AT_const_value.  ie the debug info is
saying that it is a variable without a location in memory.

The reason for this behaviour is that the debug information is being
written out before the variables have been fully resolved.  In particular
DECL_SET() for the second and third observer functions is NULL when the
debug info is generated, which is why they are being given the
DW_AT_const_value attribute.

In trying to solve this I found that switching the order of the calls to
lang_hooks.decls.final_write_globals() and
cgraph_varpool_assemble_pending_decls() in compile_file() worked, and this
seemed to be intuitively correct.  But when I reran the gcc testsuite I
found that the change introduced a regression: gcc.dg/varpool-1.c now had
the variable "unnecessary_static_initialized_variable" still defined at
the end of compilation.

I have investigated some more but not gotten much further, so I am asking
for help.  Can anyone suggest where the conflict between generating the
debug info and deciding if the variable is going to be emitted should
really be resolved ?

Cheers
  Nick
Re: Assembling pending decls before writing their debug info
> Hi Guys,
>
> There is a problem with unit-at-a-time compilation and DWARF debug
> info generation.  Consider this small test case which has been
> derived from GDB's observer.c source file:

There were even more issues with uninitialized variables a month ago.
This was all caused by Mark's patch to fix PR 18556.

This is a regression from 3.4.x.

Thanks,
Andrew Pinski
Re: Bug in builtin_floor optimization
On Aug 23, 2005, at 9:53 AM, Richard Henderson wrote:
> On Tue, Aug 23, 2005 at 09:28:50AM -0600, Roger Sayle wrote:
>> Good catch.  This is indeed a -ffast-math (or more precisely a
>> flag_unsafe_math_optimizations) transformation.  I'd prefer to
>> keep these transformations with -ffast-math, as Jan described them
>> as significantly helping SPEC's mesa when they were added.
>
> Are you sure it was "(float)floor(d)"->"floorf((float)d)" that helped
> mesa and not "(float)floor((double)f)"->"floorf(f)" ?

All the floor calls in mesa seem to be of the form (int)floor((double)f)
or (f - floor((double)f)).  (The casts to double are implicit, actually.)

> It wouldn't bother me if the first transformation went away even for
> -ffast-math.  It seems egregiously wrong.

I think I'd prefer this, given that it is not useful in mesa.  Will put
together a patch.
Re: Problem with the special live analyzer in global alloc
On Tue, 2005-08-23 at 07:44, Bernd Schmidt wrote:
> Jim Wilson once suggested we should just emit insns to make sure every
> register is initialized and be done with it - problem solved.  I had
> started to work on that, if people think it's a good idea I can dig
> that stuff out again.

I'd like this because of an IA-64 specific problem.

IA-64 has Not-a-Thing (NaT) bits, which are used for speculation.  If a
speculative load fails, the NaT bit is set, which indicates that we must
refetch the value before using it.  NaT bits propagate through most
operations, allowing us to speculate a series of instructions instead of
just loads.  However, they will generate an illegal instruction exception
if used in an operation with side-effects, like a store.

So the problem here is that any use of an uninitialized register may
generate an exception, if the instruction has side-effects, and the
uninitialized register just happens to have the NaT bit set.

Mostly we get by because gcc doesn't have speculation support yet, but it
is only a matter of time before someone writes it.  Meanwhile, there are
some hand-written glibc routines that do use speculation, and could
potentially trigger this problem.  This is a disaster waiting to happen
for anyone using gcc on IA-64 machines.

I created PR 2 for this problem, and it contains an artificial testcase
that demonstrates the problem using bitfield assignments.
-- 
Jim Wilson, GNU Tools Support, http://www.specifix.com
Automake versions (was: Patch to make libgcj work with autoreconf again)
--- Tom Tromey <[EMAIL PROTECTED]> wrote: > > "KC" == Kelley Cook <[EMAIL PROTECTED]> writes: > > KC> 2005-08-19 Kelley Cook <[EMAIL PROTECTED]> > KC> * Makefile.am (ACLOCAL_AMFLAGS): Also include "..". > KC> * acinclude.m4: Delete. Extract CHECK_FOR_BROKEN_MINGW_LD to > ... > KC> * mingwld.m4: ... this new file. > KC> * aclocal.m4, Makefile.in, gcj/Makefile.in: Regenerate. > KC> * include/Makefile.in, testsuite/Makfile.in: Regenerate. > > You used automake 1.9.4 to build Makefile.in. Yes, Andrew had used 1.9.4 in his patch (http://gcc.gnu.org/ml/gcc-cvs/2005-08/msg00618.html) from a few days before, so I did also. I actually had to download that version first. > AIUI, with the exception of libgfortran, the tree is currently > standardized on automake 1.9.3. I wouldn't mind an update, but it > ought to be done globally and install.texi ought to be updated. > Meanwhile, having folks using different versions causes cvs churn... Unfortunately, we have automake 1.9.3, 1.9.4 and 1.9.5 floating throughout the tree. I propose standardizing the entire tree on 1.9.6, as it is the current release; moreover the 1.9 branch has only had a few minor patches since 1.9.6 was released 6 weeks ago so 1.9.6 might be stable for a while. > > Probably we should have a script in contrib/ that downloads and > builds all the currently-required tool versions. This would be very cool.
Re: Automake versions (was: Patch to make libgcj work with autoreconf again)
Thanks Tom for pointing this out. We have to all keep these autotools versions synced: it bugs everybody to have extraneous differences in trees due to version mis-match. >Unfortunately, we have automake 1.9.3, 1.9.4 and 1.9.5 floating >throughout the tree. How did this happen? >I propose standardizing the entire tree on 1.9.6, >as it is the current release; moreover the 1.9 branch has only had a >few minor patches since 1.9.6 was released 6 weeks ago so 1.9.6 might >be stable for a while. I am in support of this. The sooner the better. >> Probably we should have a script in contrib/ that downloads and >> builds all the currently-required tool versions. >This would be very cool. That seems like the only solution to end this continual issue. I'm strongly in favor of it. -benjamin
Re: gcc.c-torture/execute/stdarg-2.c: long vs int
> This certainly wasn't my intention, please change it to 79L.

How's this?  It passes both m32c and x86-64.

2005-08-23  DJ Delorie  <[EMAIL PROTECTED]>

	* gcc.c-torture/execute/stdarg-2.c (main): Make sure long constants
	have the L suffix.

Index: gcc.c-torture/execute/stdarg-2.c
===
RCS file: /cvs/gcc/gcc/gcc/testsuite/gcc.c-torture/execute/stdarg-2.c,v
retrieving revision 1.2
diff -p -U3 -r1.2 stdarg-2.c
--- gcc.c-torture/execute/stdarg-2.c	3 Nov 2004 21:53:39 -	1.2
+++ gcc.c-torture/execute/stdarg-2.c	23 Aug 2005 18:27:57 -
@@ -143,8 +143,8 @@ f12 (int i, ...)
 int
 main (void)
 {
-  f1 (1, 79);
-  if (x != 79)
+  f1 (1, 79L);
+  if (x != 79L)
     abort ();
   f2 (0x4002, 13, -14.0);
   if (bar_arg != 0x4002)
Re: Automake versions (was: Patch to make libgcj work with autoreconf again)
> Thanks Tom for pointing this out. We have to all keep these > autotools versions synced: it bugs everybody to have extraneous > differences in trees due to version mis-match. Could we modify the CVS commit filters to *require* the right versions? If it detects a commit with the wrong version (at least, assuming the old rev had the right version), it can just reject it.
pushl vs movl + movl on x86
For this code (from PR23525):

extern int waiting_for_initial_map;
extern int cp_pipe[2];
extern int pc_pipe[2];
extern int close (int __fd);

void first_map_occurred(void)
{
  close(cp_pipe[0]);
  close(pc_pipe[1]);
  waiting_for_initial_map = 0;
}

gcc -march=i686 -O2 generates:

        movl    cp_pipe, %eax
        movl    %eax, (%esp)
        call    close
        movl    pc_pipe+4, %eax
        movl    %eax, (%esp)
        call    close

The Intel compiler with the same flags generates:

        pushl   cp_pipe         #9.11
        call    close           #9.5
        pushl   4+pc_pipe       #10.11
        call    close           #10.5

gcc -march=i686 -Os generates similar code to the Intel compiler.

Is there a performance difference between the movl + movl and pushl code
sequences?  If not, maybe gcc should generate pushl for -O2 too because it
is smaller code.

Thanks
Re: Automake versions (was: Patch to make libgcj work with autoreconf again)
> Could we modify the CVS commit filters to *require* the right > versions? If it detects a commit with the wrong version (at least, > assuming the old rev had the right version), it can just reject it. Dunno if this is possible, but this would be great. It would be nice if there was a way to set different versions per branch. For instance, the gcc-4_0-branch, gcc-3_4-branch, and mainline might have different autotools requirements. -benjamin
Re: Problem with the special live analyzer in global alloc
Steven Bosscher wrote:
> On Tuesday 23 August 2005 17:06, Bernd Schmidt wrote:
>> The idea is to put the initialization insns only on the paths where the
>> register will be uninitialized.
>
> int foo (int n)
> {
>   int a;
>   while (--n)
>     a = n;
>   return a;
> }
>
> Not knowing n, how can you be sure whether "a" is uninitialized for the
> "return" statement or not?

In this case, assuming nothing interesting happens to the loop, you'll
have to conservatively initialize "a" near the top of the function.  In
many cases you can do better and either initialize just before the use, or
initialize on an edge on which the register is uninitialized.  For
register allocation purposes however, this should be as good as using
Vlad's new liveness analysis.

As Jim points out, we may have to do that for IA64 anyway, so we could
consider doing it on all targets.  Dan is correct that this can introduce
new code that won't be eliminated.  One question is how often this is
going to occur in practice.

Bernd
Re: [RFA] Nonfunctioning split in rs6000 back-end
Paolo Bonzini <[EMAIL PROTECTED]> wrote: > While researching who is really using flow's computed LOG_LINKS, I > found > a define_split in the rs6000 back-end that uses them through > find_single_use. It turns out the only users are combine, this split, > and a function in regmove. See also: http://gcc.gnu.org/ml/gcc-patches/2004-01/msg02371.html Giovanni Bajo
Re: pushl vs movl + movl on x86
On Tue, Aug 23, 2005 at 11:40:16AM -0700, Dan Nicolaescu wrote:
> Is there a performance difference between the movl + movl and pushl
> code sequences?

In this case, no.

> If not, maybe gcc should generate pushl for -O2 too because it is
> smaller code.

It's not quite as simple as you make out.  You can get pushes out of gcc
with -mno-accumulate-outgoing-args, but then we have to add other
compensation code elsewhere.  IIRC, it was fairly well explored that we
get equal or better performance by not using pushes on P2 class machines
and later.

r~
RE: pushl vs movl + movl on x86
Dan,

> Is there a performance difference between the movl + movl and
> pushl code sequences?

Not in this example, but movl is faster in some circumstances than pushl.
A sequence of pushl has an implicit dependency chain on %esp, as it
changes after each pushl, whereas a sequence of movl can enjoy better ILP.
However, movl is quite a bit longer than pushl, as you pointed out, which
may affect cache efficiency.  Therefore, the sweet spot is somewhere in
the middle.

It's more important to use movl wisely in prologues and epilogues than
when passing arguments, though.  For, as RTH mentioned,
-maccumulate-outgoing-args is desirable to avoid frequent stack
maintenance.

That being said, it depends largely on the underlying architecture
implementation.

HTH

-- 
Evandro Menezes          AMD          Austin, TX
Re: Automake versions (was: Patch to make libgcj work with autoreconf again)
> "KC" == Kelley Cook <[EMAIL PROTECTED]> writes: KC> Unfortunately, we have automake 1.9.3, 1.9.4 and 1.9.5 floating KC> throughout the tree. I propose standardizing the entire tree on 1.9.6, KC> as it is the current release; moreover the 1.9 branch has only had a KC> few minor patches since 1.9.6 was released 6 weeks ago so 1.9.6 might KC> be stable for a while. This sounds great to me. >> Probably we should have a script in contrib/ that downloads and >> builds all the currently-required tool versions. KC> This would be very cool. I submitted one. Tom
SSE builtins for ia32
Two things I'm wondering about:

1. Why do __builtin_ia32_paddusb and similar functions take signed vector
   arguments, when the hardware primitive is defined to operate on
   unsigned vectors?

2. Why are there no SSE equivalents of those functions, ones that operate
   on 128 bit values (i.e., paddusb for v16qi vectors)?

paul
Re: gcc.c-torture/execute/stdarg-2.c: long vs int
DJ Delorie wrote:
>> This certainly wasn't my intention, please change it to 79L.
>
> How's this?  It passes both m32c and x86-64.
>
> 2005-08-23  DJ Delorie  <[EMAIL PROTECTED]>
>
> 	* gcc.c-torture/execute/stdarg-2.c (main): Make sure long constants
> 	have the L suffix.

OK.

-- 
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304
Re: Searching for a branch for the see optimization.
Steven Bosscher wrote:
> On Monday 22 August 2005 14:46, Leehod Baruch wrote:
>> Hello,
>> I would like to know if someone knows a suitable branch for the sign
>> extension optimization pass.
>
> Why not just maintain it in a local tree and post refined versions every
> now and then, until stage 1 for GCC 4.2 opens?  Branches are for major
> work and a new pass is not that major.

It's also fine to create a new branch for this work.  That lets other
people see what you're working on.

-- 
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304
Re: Automake versions (was: Patch to make libgcj work with autoreconf again)
Benjamin Kosnik <[EMAIL PROTECTED]> writes:

>> Could we modify the CVS commit filters to *require* the right
>> versions?  If it detects a commit with the wrong version (at least,
>> assuming the old rev had the right version), it can just reject it.
>
> Dunno if this is possible, but this would be great.

This is possible--the file to modify is CVSROOT/commitinfo, to run some
script for a specific set of files.

> It would be nice if there was a way to set different versions per
> branch.  For instance, the gcc-4_0-branch, gcc-3_4-branch, and mainline
> might have different autotools requirements.

I'm not sure this is available.  It might be possible to look in CVS/Tag
to find the branch tag for the file.  I don't know whether that file is
certain to exist when commitinfo is run, but it seems that it might.

Ian
Re: SSE builtins for ia32
On Tue, Aug 23, 2005 at 04:32:42PM -0400, Paul Koning wrote:
> 1. Why do __builtin_ia32_paddusb and similar functions take signed
>    vector arguments, when the hardware primitive is defined to operate
>    on unsigned vectors?

Because the interface you're actually supposed to be using is
_mm_adds_pu8, which uses an opaque type.  The underlying builtins all use
signed vectors because it was simple to make them all the same.

> 2. Why are there no SSE equivalents of those functions, ones that
>    operate on 128 bit values (i.e., paddusb for v16qi vectors)?

There are.  See _mm_adds_epu8 in emmintrin.h.

r~
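[Illustrative sketch, not from the original mail: _mm_adds_epu8 is the
emmintrin.h intrinsic named above; the surrounding program is just a
hypothetical way to exercise it and needs -msse2.]

    #include <emmintrin.h>   /* SSE2: __m128i, _mm_adds_epu8 */
    #include <stdio.h>

    int
    main (void)
    {
      /* Saturating unsigned byte add on 16 elements at once.  */
      __m128i a = _mm_set1_epi8 ((char) 250);
      __m128i b = _mm_set1_epi8 ((char) 10);
      __m128i c = _mm_adds_epu8 (a, b);      /* each lane saturates at 255 */

      unsigned char out[16];
      _mm_storeu_si128 ((__m128i *) out, c);
      printf ("%u\n", out[0]);               /* prints 255 */
      return 0;
    }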
gcc-3.4-20050823 is now available
Snapshot gcc-3.4-20050823 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/3.4-20050823/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 3.4 CVS branch with the
following options: -rgcc-ss-3_4-20050823

You'll find:

gcc-3.4-20050823.tar.bz2            Complete GCC (includes all of below)
gcc-core-3.4-20050823.tar.bz2       C front end and core compiler
gcc-ada-3.4-20050823.tar.bz2        Ada front end and runtime
gcc-g++-3.4-20050823.tar.bz2        C++ front end and runtime
gcc-g77-3.4-20050823.tar.bz2        Fortran 77 front end and runtime
gcc-java-3.4-20050823.tar.bz2       Java front end and runtime
gcc-objc-3.4-20050823.tar.bz2       Objective-C front end and runtime
gcc-testsuite-3.4-20050823.tar.bz2  The GCC testsuite

Diffs from 3.4-20050816 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-3.4
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.
Re: Problem with the special live analyzer in global alloc
On Tue, 2005-08-23 at 21:26 +0200, Bernd Schmidt wrote:
> As Jim points out, we may have to do that for IA64 anyway, so we could
> consider doing it on all targets.  Dan is correct that this can
> introduce new code that won't be eliminated.  One question is how often
> this is going to occur in practice.

The IBM iSeries (aka AS/400) compiler actually inserts definitions on
edges where a pseudo/register is undefined.  However, unlike the
discussion here, our "pseudo" definitions never lead to generated code.
Our pseudo definitions were added to simplify some analysis phases in the
compiler (eg, liveness can be simplified down to LIVE rather than
LIVE & AVAIL).  Note that we needed to handle these pseudo definitions
specially in some cases so they don't reduce optimization opportunities.
If I remember correctly (it's been a while since I left the team):

  1) All pseudo defs get the value of so rematerialization, etc. are not
     pessimized.
  2) Pseudo definitions are ignored during the interference graph
     construction (ie, they never cause edges to be added to the
     interference graph).
  3) More things I can't think of at the moment.

This was a win for the iSeries compiler since a fair number of
applications were/are written in RPG, which is essentially a one-procedure
application, so the number of basic blocks and live ranges/webs can be
quite high.  I recall one program we ran into that had about 150K basic
blocks and about 1.5M live ranges.

I know we used to have a white paper describing the internals of the
iSeries compiler (titled "The AS/400 Optimizing Translator"), but all of
the links I can find are stale.  However, I did come across their patent
(5,761,514) describing the idea: "Register allocation method and apparatus
for truncating runaway lifetimes of program variables in a computer
system".  I have no idea whether this was one of the patents made
available by IBM for use by the OSS community or not.

Peter

-- 
Peter Bergner
Linux on Power Toolchain
IBM Linux Technology Center
Re: Problem with the special live analyzer in global alloc
On Tue, 2005-08-23 at 22:10 -0500, Peter Bergner wrote:
> The IBM iSeries (aka AS/400) compiler actually inserts definitions
> on edges where a pseudo/register is undefined.  However, unlike the
> discussion here, our "pseudo" definitions never lead to generated
> code.

I listed that as a possible option; the problem is that you have to know
that they are pseudo definitions, and teach other things this too.  This
is the part I alluded to as being probably uglier than the partial
liveness analysis itself.

> Our pseudo definitions were added to simplify some analysis
> phases in the compiler (eg, liveness can be simplified down to LIVE
> rather than LIVE & AVAIL).  Note that we needed to handle these pseudo
> definitions specially in some cases so they don't reduce optimization
> opportunities.

Like this :)

Is LIVE & AVAIL really that much slower these days for most programs?  I
imagine if you have 300k bb's or 1.5 million live pseudos to consider, it
probably makes a real difference, but that's not *too* common in our
supported languages (30k bb's/150k pseudos is probably the practical upper
limit of what we see, though I'm sure someone is going to say they've seen
larger :P).

> I know we used to have a white paper describing the internals of the
> iSeries compiler (titled "The AS/400 Optimizing Translator"), but all
> of the links I can find are stale.  However, I did come across their
> patent (5,761,514) describing the idea: "Register allocation method
> and apparatus for truncating runaway lifetimes of program variables
> in a computer system".  I have no idea whether this was one of the
> patents made available by IBM for use by the OSS community or not.

Just FYI, I've read this patent, and regardless of whether you think this
is something that should have been patented, etc., the claims are broad
enough to cover any approach, as long as you are doing liveness analysis
and then inserting something into the instruction stream to truncate the
ranges (real, fake, whatever).  However, if this is what you guys want to
do, please don't let that stop you.  Let me know if you want to go this
route, and we'll work on getting IBM to release it.

--Dan