Re: Mudflap and freeing C runtime memory upon exit (feature request)
[EMAIL PROTECTED] (Frank Ch. Eigler) wrote:
> "Vesselin Peev" <[EMAIL PROTECTED]> writes:
> > [...] I have a feature request for mudflap. It should have an
> > option to run glibc's __libc_freeres function that forces the C
> > runtime library to free all of its memory [...]
>
> Good idea. (It should not take more than a dozen lines of code - a
> threshold below which one may not even need a copyright assignment in
> order to contribute.)

I get the message :). Thanks for the pointer, I'll go for it. Once done, I'll most likely have to get back to the appropriate list for help resolving the "mudflap warning: unaccessed registered object" problem that I described.
Re: libgcc-math specification
On 8/4/06, Sashan Govender <[EMAIL PROTECTED]> wrote:
> Hi
>
> Is there a specification that describes a set of routines for
> libgcc-math? I read through previous emails on this topic and it seems
> that it has been removed from head. I'd like to contribute but am not
> sure what direction to go in. Is there a specific branch that needs
> checking out?

It is not on a branch at the moment, but it is supposed to re-appear once 4.2 branches. Also, there is no fixed set of routines at the moment, only what is used by the patch supporting vectorization of math intrinsics.

Richard.
Can't use exceptions on non-mainstream targets
It appears that sjlj exceptions are broken (PRs 19774/25266/28493). I'm not sure this is true for _every_ target, but it appears to be true for many.

So it seems that those of us affected by this bug have three workarounds:

1) compile everything with -fno-exceptions and remove all eh from project code
2) implement dwarf2 eh for the target
3) fix the #19774 regression

Given the relative significance of tasks #1 and #2, I'm surprised that a fix for #19774 isn't higher priority.

In any case, I am attempting to fix it myself. I have been dumping trees and have narrowed it down to either the eh or vregs RTL pass, and would be glad to hear if anyone else has information on this problem.

Aaron
Re: Can't use exceptions on non-mainstream targets
Aaron Graham wrote:
> It appears that sjlj exceptions are broken (PRs 19774/25266/28493).
> I'm not sure this is true for _every_ target, but it appears to be
> true for many.

The real bug is PR28493, not PR19774. The latter (and PR25266, which is related) only affect less common cases, i.e. using alloca or variable-length arrays.

Unlike PR19774, which is as old as 3.4, PR28493 is new to 4.1.0 (as you know because you are the reporter). I think it is low priority only because nobody has yet confirmed it (most likely because most GCC developers work on dw2 targets).

Paolo
gcc-4.2-20060805 is now available
Snapshot gcc-4.2-20060805 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.2-20060805/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.2 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/trunk revision 115951

You'll find:

gcc-4.2-20060805.tar.bz2            Complete GCC (includes all of below)
gcc-core-4.2-20060805.tar.bz2       C front end and core compiler
gcc-ada-4.2-20060805.tar.bz2        Ada front end and runtime
gcc-fortran-4.2-20060805.tar.bz2    Fortran front end and runtime
gcc-g++-4.2-20060805.tar.bz2        C++ front end and runtime
gcc-java-4.2-20060805.tar.bz2       Java front end and runtime
gcc-objc-4.2-20060805.tar.bz2       Objective-C front end and runtime
gcc-testsuite-4.2-20060805.tar.bz2  The GCC testsuite

Diffs from 4.2-20060729 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.2
link is updated and a message is sent to the gcc list. Please do not use
a snapshot before it has been announced that way.
Re: Modifying ARM code generator for elimination of 8bit writes - need help
Rask,

On Friday 21 July 2006 15:26, Rask Ingemann Lambertsen wrote:
> I found that this peephole optimization improves the code a whole
> lot:

Done.

> Another way of improving the code was to swap the order of the two
> last alternatives of _arm_movqi_insn_swp.

Done.

Anyway, the problems with reload continue... error: unrecognizable insn

First, I had a problem with loading a register with a constant (no clobber). I have solved this problem by adding

> (define_insn "_arm_movqi_insn_const"
>   [(set (match_operand:QI 0 "register_operand" "=r")
>         (match_operand:QI 1 "const_int_operand" ""))]
>   "TARGET_ARM && TARGET_SWP_BYTE_WRITES
>    && ( register_operand (operands[0], QImode))"
>   "@
>    mov%?\\t%0, %1"
>   [(set_attr "type" "*")
>    (set_attr "predicable" "yes")]
> )

I am very sure that this only cures the symptoms, and that it would be better to fix this in the reload stage, but at least it worked, and I was able to compile the whole Linux kernel!

After testing that the kernel runs, I tried to compile uCLinux. And there is the next problem:

> ../ncurses/./base/lib_set_term.c: In function '_nc_setupscreen':
> ../ncurses/./base/lib_set_term.c:470: error: unrecognizable insn:
> (insn 1199 1198 696 37 ../ncurses/./base/lib_set_term.c:429 (parallel
>     [ (set (mem/s/j:QI (reg/f:SI 3 r3 [491]) [0 ._clear+0 S1 A8]
>       ) (reg:QI 0 r0))
>       (clobber (subreg:QI (reg:DI 11 fp) 0))
>     ]) -1 (nil)
>     (nil))
> ../ncurses/./base/lib_set_term.c:470: internal compiler error: in
> extract_insn, at recog.c:2020 P

The source code line is:

> newscr->_clear = TRUE;

Obviously, TRUE is loaded in r0, but I don't know why this construct (storing a byte into a struct member referenced by a pointer) is not recognized.

I fear that these problems are turning into an endless story, and I am sorry for generating traffic on this list, because I'm still no gcc expert... On the other hand, the compiler has now generated code from hundreds of files, and maybe I'm very near to success.

regards
Wolfgang
-- 
We're back to the times when men were men and wrote their own device drivers. (Linus Torvalds)
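For readers following along, the failing construct in the message above is a one-byte store through a pointer to a structure. The following is a minimal sketch of that shape, not the actual ncurses code: the structure, its field names, and both function names are hypothetical and chosen only for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for the ncurses structure; only the one-byte
   flag next to a wider field matters for this illustration. */
struct window {
    int  rows;
    char clear;   /* one-byte flag, playing the role of _clear */
};

/* A byte store through a pointer. In RTL terms this is a
   (set (mem:QI ...) (reg:QI ...)), the kind of insn in the dump above. */
void request_clear(struct window *w)
{
    w->clear = 1;
}

/* Hypothetical self-check: the flag is set and the neighbouring field is
   untouched, which is what any replacement for a plain byte write must
   also guarantee. */
int check_request_clear(void)
{
    struct window w = { 24, 0 };

    request_clear(&w);
    assert(w.rows == 24);
    return w.clear;
}
```

Compiling a file like this with the modified back end may be a quicker way to exercise the byte-store patterns than rebuilding all of ncurses, though there is no guarantee that it triggers the same reload path.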
Re: Modifying ARM code generator for elimination of 8bit writes - need help
On Sat, Aug 05, 2006 at 09:03:34PM +0200, Wolfgang Mües wrote:

> First, I have had a problem with loading a register with a constant.
> (no clobber). I have solved this problem by adding
>
> > (define_insn "_arm_movqi_insn_const"
[cut]
>
> I am very sure that this does only cure the symptoms, and it will
> better to fix this in the reload stage, but at least, it worked, and I
> was able to compile the whole linux kernel!

Yes, it only cures the symptom, but it could take a lot of time to find the cause, and the gain is small, so I think it is OK to leave it like this for now.

> After testing that the kernel is running, I have tried to compile
> uCLinux. And there is the next problem
>
> > ../ncurses/./base/lib_set_term.c: In function '_nc_setupscreen':
> > ../ncurses/./base/lib_set_term.c:470: error: unrecognizable insn:
> > (insn 1199 1198 696 37 ../ncurses/./base/lib_set_term.c:429 (parallel
> >     [ (set (mem/s/j:QI (reg/f:SI 3 r3 [491]) [0 ._clear+0 S1 A8]
> >       ) (reg:QI 0 r0))
> >       (clobber (subreg:QI (reg:DI 11 fp) 0))
> >     ]) -1 (nil)
> >     (nil))
> > ../ncurses/./base/lib_set_term.c:470: internal compiler error: in
> > extract_insn, at recog.c:2020 P

This insn was generated from the "reload_outqi" pattern. I don't completely understand why it isn't recognized. The (subreg:QI (reg:DI 11 fp) 0) part won't be matched by (match_scratch ...), but simplify_gen_subreg() should have simplified it to (reg:QI 11 fp), since this is one of the main purposes of having simplify_(gen_)subreg() in the first place. Try changing

    operands[3] = simplify_gen_subreg (QImode, operands[2], DImode, 0);

into

    operands[3] = gen_rtx_REG (QImode, REGNO (operands[2]));

(in "reload_outqi") and see if that works.

> I fear that these problems are creating an endless story, and sorry for
> generating traffic on this list, because I'm still no gcc expert...

You shouldn't be sorry about that. GCC provides a good, solid foundation for learning something new every day.

> On the other hand, the compiler now has generated code from hundreds of
> files, and maybe I'm very near to success now.

I think so too.

-- 
Rask Ingemann Lambertsen
___divti3 and ___umodti3 missing on Darwin
While testing the state of gfortran in gcc trunk at -m64 on MacOS X 10.4, I discovered a huge number of test failures (848, compared to 26 with -m32). Almost all of these failures appear to be due to two undefined symbols in the ppc64 version of libgfortran's shared library:

http://gcc.gnu.org/ml/fortran/2006-08/msg00112.html

The symbols, ___divti3 and ___umodti3, are not present in darwin-libgcc.10.4.ver or darwin-libgcc.10.5.ver found in gcc/config/rs6000, but they are present in libgcc-std.ver in the gcc directory. I haven't found anything in bugzilla about this issue, but it really should be addressed before gcc 4.2 is released.

Jack
___divti3 and ___umodti3 missing on Darwin
I can produce the first missing symbol with current gcc trunk on MacOS X 10.4 using this testcase...

    main()
    {
      __int128_t a, b;
      b= a % 10;
    }

When compiled with "gcc-4 -m64 modulo.c", this produces the linkage failure of...

    can't resolve symbols:
      ___modti3, referenced from:
          _main in cc87vdlF.o
    ld64 failed: symbol(s) not found
    collect2: ld returned 1 exit status

Hopefully this helps.

Jack
RE: ___divti3 and ___umodti3 missing on Darwin
The second missing symbol generated by gcc trunk on MacOS X 10.4 when compiling with -m64 can be demonstrated with this test case...

    main()
    {
      __int128_t a, b;
      b= a / 10;
    }

which when compiled with "gcc-4 -m64 division.c" produces the linkage error...

    can't resolve symbols:
      ___divti3, referenced from:
          _main in ccDcwJYL.o
    ld64 failed: symbol(s) not found
    collect2: ld returned 1 exit status

Hopefully fixes for these already reside in Apple's gcc branch and can be ported over without much grief.

Jack
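The two test cases above reduce to 128-bit division and remainder, which GCC typically lowers to calls into libgcc on 64-bit targets that lack a native instruction for them. The sketch below is an illustration rather than part of the original report: it uses initialized operands so the results can be checked, and the function names (div128, mod128, umod128, check_int128_helpers) are made up for this example. Which helper each operation maps to depends on the target and optimization level, so the comments say what usually happens rather than what must.

```c
#include <assert.h>

/* Signed 128-bit quotient; usually lowered to a call to __divti3. */
__int128_t div128(__int128_t a, __int128_t b) { return a / b; }

/* Signed 128-bit remainder; usually lowered to a call to __modti3. */
__int128_t mod128(__int128_t a, __int128_t b) { return a % b; }

/* Unsigned 128-bit remainder; usually lowered to a call to __umodti3. */
__uint128_t umod128(__uint128_t a, __uint128_t b) { return a % b; }

/* Hypothetical self-check: returns 0 if the program linked and all three
   operations produce the expected values. */
int check_int128_helpers(void)
{
    __int128_t big = (__int128_t)1 << 100;   /* does not fit in 64 bits */

    assert(div128(big, (__int128_t)1 << 90) == 1024);
    assert(mod128(-7, 10) == -7);            /* division truncates toward zero */
    assert(umod128((__uint128_t)big + 7, 10) == 3);   /* 2^100 ends in 6 */
    return 0;
}
```

If the helpers are missing from the libgcc being linked, a file like this should fail at link time with an unresolved-symbol error similar to the ones quoted above, before any of the checks run.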
fancy x87 ops, SSE and -mfpmath=sse,387 performance
Basically, I'd like to have my cake and eat it too.

With g++-4.2-20060805/cygwin on a k8 box, on a software path with lots of single-precision float ops but no transcendentals or library calls:

    -mfpmath=sse,387:  5.2 Mray/s
    -mfpmath=sse:      6 Mray/s

That 15% performance difference is no surprise when you see things like

    4037c8: flds   0x4(%esp)
    4037cc: mulss  %xmm5,%xmm2
    4037d0: fsubrp %st,%st(1)
    4037d2: movss  %xmm1,0x4(%esp)
    4037d8: addss  0x278(%esp,%ecx,4),%xmm0
    4037e1: flds   0x4(%esp)
    4037e5: fsubrp %st,%st(1)
    4037e7: addss  %xmm2,%xmm0
    4037eb: movss  %xmm0,0x4(%esp)
    4037f1: flds   0x4(%esp)
    4037f5: fdivrp %st,%st(1)
    4037f7: fcomi  %st(1),%st
    4037f9: fldz
    4037fb: setae  %dl
    4037fe: fcomip %st(1),%st
    403800: seta   %al
    403803: or     %al,%dl
    403805: je     4036ca

where values bounce between SSE and x87 registers through the stack. Therefore -mfpmath=sse is the way to go, and it is in fact on par with or better than what I get out of icc 9.1 for the same code.

Where it gets ugly is when, for example, you throw some cosf() into the same compilation unit: with -mfpmath=sse you pay for some really, really slow library function calls (at least on cygwin).

Wishful thinking got me trying -march=k8 -mfpmath=sse -mfancy-math-387, to no avail :(

Is there a way to enable such exotic codegen for 32-bit environments?