Re: Invoking static library automatically
> I have built a static runtime library and i want the linker to access > it automatically without having to pass it explicitly. Wrong list, this one is for GCC development, not development with GCC. Try [EMAIL PROTECTED] instead. -- Eric Botcazou
-m{arch,tune}=native and Core Duo
Currently, the way the native CPU detection code in driver-i386.c is set up, using -m{arch,tune}=native with an Intel Core Duo (*not Core 2 Duo*) processor will result in -m{arch,tune}=prescott. Is this the correct setting for this chip? There seems to be a lot of confusion across the net as to whether a Core Duo (aka Yonah, aka Centrino Duo) should be using -march=prescott or -march=pentium-m. Some argue that because Core chips share a lot more in common with the P6 microarchitecture than with Netburst, -march=pentium-m should be the correct choice. Others (myself included) point out that just because the design is based on the Pentium M doesn't make it a Pentium M. One major argument supporting -march=prescott is that using -march=pentium-m will greatly prefer using x87 over SSE scalar code, since the Pentium M sucked at SSE (~30% slower than x87 according to Intel's Optimization Manual). Since then, things like improved decoding and micro-op fusion have made SSE a definite win on Netburst and Core CPUs. Some have come to the conclusion that -march=pentium-m -mfpmath=sse -msse3 is the best solution.

So anyways, should -m{arch,tune}=native be setting pentium-m for Core CPUs, or is prescott really the better choice in the end?

--
by design, by neglect, for a fact or just for effect
dirtyepic gentoo org
9B81 6C9F E791 83BB 3AB3 5B2D E625 A073 8379 37E8 (0x837937E8)
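One way to see what is at stake in the x87 versus scalar SSE question is to compile a small floating-point kernel with each candidate flag set and compare the generated assembly. The sketch below is only an illustration: the file name fp.c and the function dot() are made up for this note, and -m32 assumes a compiler with 32-bit x86 support installed.

```c
/*
 * Scalar floating-point kernel; its instruction selection (x87 vs. scalar
 * SSE) is what the -march/-mfpmath choice in this thread influences.
 *
 * Commands one might compare (fp.c is an assumed file name):
 *   gcc -O2 -m32 -march=pentium-m -S fp.c
 *   gcc -O2 -m32 -march=pentium-m -mfpmath=sse -msse3 -S fp.c
 *   gcc -O2 -m32 -march=prescott -mtune=generic -S fp.c
 */
#include <assert.h>
#include <stddef.h>

/* Dot product of two arrays of n doubles. */
double dot(const double *a, const double *b, size_t n)
{
    double sum = 0.0;

    /* Callers must pass valid arrays whenever n is non-zero. */
    assert(n == 0 || (a != NULL && b != NULL));

    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}
```

Looking at the resulting .s files for fmul/fadd (x87) versus mulsd/addsd (scalar SSE) shows which code path each flag combination selects; it says nothing by itself about which is faster on a given chip.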
CEA (France) has signed assignment of copyright to FSF on GCC
Dear All,

Sorry to disturb uninterested people with the following information.

For information (and for future reference), my employing organisation CEA http://www.cea.fr/ ("Commissariat a l'Energie Atomique, a French state-owned research entity with a scientific, technical, and industrial activity duly organized under the laws of France") has signed an assignment of copyright in the GNU Compiler Collection.

The copyright assignment (signed on November 20th 2006 by R.Cammoun, Director of CEA-LIST) has reference RT306238; in my understanding (but I am not a lawyer) it could be applicable to any of the 14,910 employees of CEA who is or will be submitting patches to GCC.

I apologize for sending this here, but I thought that it could be the place to tell this, and it could interest (now or in the future) some distant CEA colleagues (CEA is big enough for that) that I do not know and cannot reach otherwise.

BTW, I am surprised that it is not easy to know exactly which organizations have signed such legal papers. It could happen (in big organizations) that such an assignment has been signed, and a putative minor contributor to GCC does not know about it yet.

Regards
--
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faïencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***
block reordering at GIMPLE level?
Hello, While working on our CLI port, I realized that we were missing, among others, the block reordering pass. This is because we emit CLI code before the RTL passes are reached. Looking at the backend optimizations, it is clear that some modify the CFG. But my understanding is that loop optimizations and unrolling are also being moved to GIMPLE. I do not know about others. Could it be that sometime all the optimizations that modify the CFG are run on GIMPLE? Is there any plan/interest in having a block layout pass running at GIMPLE level? Cheers, -- Erven.
Re: -m{arch,tune}=native and Core Duo
On Fri, Dec 01, 2006 at 03:36:59AM -0600, Ryan Hill wrote: > Currently, the way the native CPU detection code in driver-i386.c > is set up, using -m{arch,tune}=native with an Intel Core Duo (*not > Core 2 Duo*) processor will result in -m{arch,tune}=prescott. Is this > the correct setting for this chip? There seems to be a lot of confusion > across the net as to whether a Core Duo (aka Yonah, aka Centrino Duo) > should be using -march=prescott or -march=pentium-m. Some argue > that because Core chips share a lot more in common with with the P6 > microarchitecture than with Netburst, -march=pentium-m should be the > correct choice. Others (myself included) point out that just because > the design is based on the Pentium M doesn't make it a Pentium M. One > major argument supporting -march=prescott is that using > -march=pentium-m will greatly prefer using x87 over SSE scalar code, > since the Pentium M sucked at SSE (~30% slower than x87 according to > Intel's Optimization Manual). Since then, things like improved > decoding and micro-op fusion have made SSE a definite win on Netburst > and Core CPUs. Some have come to the conclusion that > -march=pentium-m -mfpmath=sse -msse3 is the best solution. > > So anyways, should -m{arch,tune}=native be setting pentium-m for Core > CPU's, or is prescott really the better choice in the end? It should be -march=prescott -mtune=generic. I will look into it. H.J.
Re: block reordering at GIMPLE level?
> Looking at the backend optimizations, it is clear that some modify the
> CFG. But my understanding is that loop optimizations and unrolling are
> also being moved to GIMPLE. I do not know about others.

Loop optimizations are performed on GIMPLE only because they are really hard to perform on RTL (in the case of ivopts) or because they are high-level optimizations (such as loop interchange). There are indeed some parts of the back-end that could be done on GIMPLE too (for example load PRE).

> Could it be that sometime all the optimizations that modify the CFG are
> run on GIMPLE?

Some more, for sure. For example, I'm working in my spare time on moving switch statement expansion to trees so that the tree passes can see the decision trees. But not all of them; GCSE for example can modify the CFG (it will be preceded by a critical edge split pass on RTL in the future).

> Is there any plan/interest in having a block layout pass running at
> GIMPLE level?

I don't think so, especially considering that tree->RTL expansion itself modifies the CFG. But basic block reordering should be pretty easy to port from RTL to trees for your own tree.

Paolo
Re: block reordering at GIMPLE level?
Hi,

I know little about CLI, but assuming that your backend is nonstandard enough that it makes sense to replace the RTL bits, I guess it would make sense to make bb-reorder run at the GIMPLE level too, while keeping bb-reorder at the RTL level for the common compilation path. This is an example of a pass that has very little dependency on the particular IL, so our CFG manipulation abstraction can probably be extended rather easily to make it practically IL independent.

The reason why it is run late is that the RTL backend modifies the CFG enough to make this important. CLI might have the same property. What you might want to consider is to simply port our CFG code to the CLI IL representation, whatever it is, and share the pass.

The tracer pass, very similar to bb-reorder in nature, has been ported to work on GIMPLE, but the implementation is not in mainline yet. You might want to take a look at the changes necessary, as bb-reorder should be about the same (minus the SSA updating, since you probably want to run bb-reorder after leaving SSA form).

Honza

> Hello,
>
> While working on our CLI port, I realized that we were missing, among
> others, the block reordering pass. This is because we emit CLI code
> before the RTL passes are reached.
> Looking at the backend optimizations, it is clear that some modify the
> CFG. But my understanding is that loop optimizations and unrolling are
> also being moved to GIMPLE. I do not know about others.
>
> Could it be that sometime all the optimizations that modify the CFG are
> run on GIMPLE?
> Is there any plan/interest in having a block layout pass running at
> GIMPLE level?
>
> Cheers,
>
> --
> Erven.
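To illustrate why a block layout pass needs little beyond the CFG itself, here is a deliberately simplified sketch. It is not GCC's bb-reorder algorithm; it is a plain greedy chain builder that follows the most probable not-yet-placed successor. Every name in it (struct block, layout_blocks, MAX_BLOCKS, MAX_SUCCS) is hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BLOCKS 64   /* arbitrary limit for this sketch */
#define MAX_SUCCS  2    /* at most two outgoing edges per block */

/* A basic block reduced to what layout needs: edges and their weights. */
struct block {
    int succ[MAX_SUCCS];   /* successor block indices, -1 if unused */
    int prob[MAX_SUCCS];   /* edge probability in percent */
};

/*
 * Greedy layout: starting from each unplaced block in index order, keep
 * appending the most probable successor that has not been placed yet.
 * Writes the chosen order into order[] and returns the number of blocks.
 */
size_t layout_blocks(const struct block *cfg, size_t n, int *order)
{
    unsigned char placed[MAX_BLOCKS] = {0};
    size_t count = 0;

    assert(n <= MAX_BLOCKS);

    for (size_t seed = 0; seed < n; seed++) {
        int cur = (int)seed;

        while (cur >= 0 && !placed[cur]) {
            int best = -1;
            int best_prob = -1;

            placed[cur] = 1;
            order[count++] = cur;

            /* Pick the likeliest successor that is still unplaced. */
            for (int i = 0; i < MAX_SUCCS; i++) {
                int s = cfg[cur].succ[i];

                if (s >= 0 && (size_t)s < n && !placed[s]
                    && cfg[cur].prob[i] > best_prob) {
                    best = s;
                    best_prob = cfg[cur].prob[i];
                }
            }
            cur = best;   /* -1 ends the current chain */
        }
    }
    return count;
}
```

For a diamond where block 0 branches to block 2 with 90% probability and to block 1 with 10%, and both join in block 3, this places the blocks as 0, 2, 3, 1, so the hot path falls through. Nothing here touches the instructions inside a block, which is the sense in which such a pass is nearly IL independent.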
Re: CEA (France) has signed assignment of copyright to FSF on GCC
> BTW, I am surprised that it is not easy to know which organizations
> exactly has signed such legal papers. It could happen (in big
> organizations) that such an assignment has been signed, and a putative
> minor contributor to GCC does not know about it yet.

There is a copyright list on gnu.org machines that people with accounts there have access to. It lists every person and organization with a copyright assignment. Personally, I think the list should be somewhere that *all* gcc maintainers have access to (not all of us have gnu.org accounts).
Re: -m{arch,tune}=native and Core Duo
On Fri, Dec 01, 2006 at 06:43:46AM -0800, H. J. Lu wrote: > On Fri, Dec 01, 2006 at 03:36:59AM -0600, Ryan Hill wrote: > > Currently, the way the native CPU detection code in driver-i386.c > > is set up, using -m{arch,tune}=native with an Intel Core Duo (*not > > Core 2 Duo*) processor will result in -m{arch,tune}=prescott. Is this > > the correct setting for this chip? There seems to be a lot of confusion > > across the net as to whether a Core Duo (aka Yonah, aka Centrino Duo) > > should be using -march=prescott or -march=pentium-m. Some argue > > that because Core chips share a lot more in common with with the P6 > > microarchitecture than with Netburst, -march=pentium-m should be the > > correct choice. Others (myself included) point out that just because > > the design is based on the Pentium M doesn't make it a Pentium M. One > > major argument supporting -march=prescott is that using > > -march=pentium-m will greatly prefer using x87 over SSE scalar code, > > since the Pentium M sucked at SSE (~30% slower than x87 according to > > Intel's Optimization Manual). Since then, things like improved > > decoding and micro-op fusion have made SSE a definite win on Netburst > > and Core CPUs. Some have come to the conclusion that > > -march=pentium-m -mfpmath=sse -msse3 is the best solution. > > > > So anyways, should -m{arch,tune}=native be setting pentium-m for Core > > CPU's, or is prescott really the better choice in the end? > > It should be -march=prescott -mtune=generic. I will look into it. I opened http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30040 H.J.
Re: CEA (France) has signed assignment of copyright to FSF on GCC
> Personally, I think the list should be somewhere that *all* gcc
> maintainers have access to (not all of us have gnu.org accounts).

I agree that in principle, GCC "code maintainers" would need to check it after approving a patch of somebody who has no CVS access. But the FSF does not care about who can commit what and under what conditions, and in fact several maintainers (not random people who have access to the subversion repo!) are complete unknowns to the FSF because it's their employer who signed the copyright assignment. The GCC "FSF-appointed maintainer" is the SC, and in fact several people on the SC have access to the list.

There could be privacy problems too. I don't know the relevant legislation, but the list includes personal data (year of birth, citizenship, employer) and, in Italy, I would have to sign a form if I had access to such data. The legislation in the US, however, is probably very different: for example, the copyright assignment form would have a separate signature, or a click-through if done via web, where you accept that the FSF keeps the data on a computer in order to process the assignment.

Paolo
mainline slowdown
My bootstrap/make check cycle took about 10 hours with yesterday's checkout (way longer than expected). A quick investigation shows C++ compilation times are through the roof.

Using quick (in theory) and trusty cpgram.ii, I get:

 tree PTA :1135.48 (88%) usr   5.47 (55%) sys 1168.23 (85%) wall    4045 kB ( 1%) ggc
 TOTAL    :1283.62             9.97           1381.98             451745 kB

Is this new code, or is this the old issue we had a few weeks ago? I lost track.

Andrew
SPEC CFP2000 and polyhedron runtime scores dropped from 13. november onwards
Hello!

At least on x86_64 and i686, the SPEC [1] and Polyhedron [2] scores dropped noticeably. For the SPEC benchmarks, the mgrid, galgel, ammp and sixtrack tests are affected, and for Polyhedron, ac (second regression in the peak) and protein (?) regressed in that time frame.

[1] http://www.suse.de/~aj/SPEC/amd64/CFP/summary-britten/recent.html
[2] http://www.suse.de/~gcctest/c++bench/polyhedron/polyhedron-summary.txt-2-0.html

Does anybody have any idea what is going on there?

Uros.
Re: mainline slowdown
On Fri, 2006-12-01 at 10:40 -0500, Andrew MacLeod wrote: > My bootstrap/make check cycle took about 10 hours with yesterdays > checkout (way longer than expected). A quick investigation shows C++ > compilation timed are through the roof. > > Using quick (in theory) and trusty cpgram.ii, I get: > > tree PTA :1135.48 (88%) usr 5.47 (55%) sys1168.23 (85%) wall > 4045 kB ( 1%) ggc > TOTAL :1283.62 9.97 1381.98 > 451745 kB > > > Is this new code, or is this the old issue we had a few weeks ago? I > lost track. btw, x86 linux... Andrew
Re: CEA (France) has signed assignment of copyright to FSF on GCC
Paolo Bonzini wrote:
> The legislation in the US, however, is probably very different: for
> example, the copyright assignment form would have a separate signature,
> or a click-through if done via web, where you accept that the FSF keeps
> the data on a computer in order to process the assignment.

Just to be clear, this example refers to Italy too.

Paolo
Re: mainline slowdown
On Fri, 2006-12-01 at 10:56 -0500, Andrew MacLeod wrote: > On Fri, 2006-12-01 at 10:40 -0500, Andrew MacLeod wrote: > > My bootstrap/make check cycle took about 10 hours with yesterdays > > checkout (way longer than expected). A quick investigation shows C++ > > compilation timed are through the roof. > > > > Using quick (in theory) and trusty cpgram.ii, I get: > > > > tree PTA :1135.48 (88%) usr 5.47 (55%) sys1168.23 (85%) wall > >4045 kB ( 1%) ggc > > TOTAL :1283.62 9.97 1381.98 > > 451745 kB > > > > > > Is this new code, or is this the old issue we had a few weeks ago? I > > lost track. > > btw, x86 linux... Crap. And its with checking enabled as well. I haven't checked with it disabled. I hope thats all the relevant info :-) Andrew
Re: SPEC CFP2000 and polyhedron runtime scores dropped from 13. november onwards
On 12/1/06, Uros Bizjak <[EMAIL PROTECTED]> wrote:
> Hello!
>
> At least on x86_64 and i686 SPEC score [1] and polyhedron [2] scores
> dropped noticeably. For SPEC benchmarks, mgrid, galgel, ammp and
> sixtrack tests are affected and for polygedron, ac (second regression
> in the peak) and protein (?) regressed in that time frame.
>
> [1] http://www.suse.de/~aj/SPEC/amd64/CFP/summary-britten/recent.html
> [2] http://www.suse.de/~gcctest/c++bench/polyhedron/polyhedron-summary.txt-2-0.html
>
> Does anybody have any idea what is going on there?

It correlates with the PPRE introduction (enabled at -O3 only) which might increase register pressure, but also improves Polyhedron rnflow a lot.

Richard.
Re: Help on compiling with Japanese Text
* Alan Ong ([EMAIL PROTECTED]) [20061201 03:35]: > Hello, >I am trying to compile my code with hard-coded Japanese Kanji and > full-width katakana string text but the compiler view some of the text > as escape characters. Please ask on [EMAIL PROTECTED] This list deals with the development of GCC. Philipp
Re: SPEC CFP2000 and polyhedron runtime scores dropped from 13. november onwards
On 12/1/06, Richard Guenther <[EMAIL PROTECTED]> wrote:
> On 12/1/06, Uros Bizjak <[EMAIL PROTECTED]> wrote:
> > Hello!
> >
> > At least on x86_64 and i686 SPEC score [1] and polyhedron [2] scores
> > dropped noticeably. For SPEC benchmarks, mgrid, galgel, ammp and
> > sixtrack tests are affected and for polygedron, ac (second regression
> > in the peak) and protein (?) regressed in that time frame.
> >
> > [1] http://www.suse.de/~aj/SPEC/amd64/CFP/summary-britten/recent.html
> > [2] http://www.suse.de/~gcctest/c++bench/polyhedron/polyhedron-summary.txt-2-0.html
> >
> > Does anybody have any idea what is going on there?
>
> It correlates with the PPRE introduction (enabled at -O3 only) which might
> increase register pressure, but also improves Polyhedron rnflow a lot.

Feel free to disable it and let me know if it helps. If it's really affecting scores that badly, i'm happy to turn it off until we can deal with the register pressure (though i thought we had out-of-ssa changes to help with this now).
Re: [Bug middle-end/29695] [4.1/4.2/4.3 Regression] Folding breaks (a & 0x80) ? 0x80 : 0 for unsigned char or unsigned short a
Sure. Sorry for the huge log. I will use your method for the merge commit message in the future. Thanks a lot! Regards, Chao-ying - Original Message - From: "Jakub Jelinek" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Cc: Sent: Thursday, November 30, 2006 11:24 PM Subject: Re: [Bug middle-end/29695] [4.1/4.2/4.3 Regression] Folding breaks (a & 0x80) ? 0x80 : 0 for unsigned char or unsigned short a > On Fri, Dec 01, 2006 at 12:07:18AM -, chaoyingfu at gcc dot gnu dot org wrote: > > > > > > --- Comment #6 from chaoyingfu at gcc dot gnu dot org 2006-12-01 00:07 --- > > Subject: Bug 29695 > > > > Author: chaoyingfu > > Date: Fri Dec 1 00:05:26 2006 > > New Revision: 119383 > > > > URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=119383 > > Log: > > Merged revisions 118455-118543 via svnmerge from > > svn+ssh://[EMAIL PROTECTED]/svn/gcc/trunk > > Please, when svn merging to private branches never use all the ChangeLog entries > in svn commit messages (it is excessively huge anyway and > Merged revisions 118455-118543 via svnmerge from > svn+ssh://[EMAIL PROTECTED]/svn/gcc/trunk > information is all that is needed), or at least remove > all PR references from it, as it spams all the PR pages. > > Jakub >
IA64 psABI discussion group created
FYI, I created an IA64 psABI discussion group: http://groups-beta.google.com/group/ia64-abi H.J.
The Linux binutils 2.17.50.0.8 is released
This is the beta release of binutils 2.17.50.0.8 for Linux, which is based on binutils 2006 1201 in CVS on sourceware.org plus various changes. It is purely for Linux.

Starting from the 2.17.50.0.8 release, the default output section LMA (load memory address) has changed for allocatable sections from being equal to VMA (virtual memory address), to keeping the difference between LMA and VMA the same as the previous output section in the same region.

For

	.data.init_task : { *(.data.init_task) }

LMA of .data.init_task section is equal to its VMA with the old linker. With the new linker, it depends on the previous output section. You can use

	.data.init_task : AT (ADDR(.data.init_task)) { *(.data.init_task) }

to ensure that LMA of .data.init_task section is always equal to its VMA. The linker script in the older 2.6 x86-64 kernel depends on the old behavior. You can add AT (ADDR(section)) to force LMA of .data.init_task section equal to its VMA. It will work with both old and new linkers. The x86-64 kernel linker script in kernel 2.6.13 and above is OK.

The new x86_64 assembler no longer accepts

	monitor %eax,%ecx,%edx

You should use

	monitor %rax,%ecx,%edx

or

	monitor

which works with both old and new x86_64 assemblers. They should generate the same opcode.

The new i386/x86_64 assemblers no longer accept instructions for moving between a segment register and a 32bit memory location, i.e.,

	movl (%eax),%ds
	movl %ds,(%eax)

To generate instructions for moving between a segment register and a 16bit memory location without the 16bit operand size prefix, 0x66,

	mov (%eax),%ds
	mov %ds,(%eax)

should be used. It will work with both new and old assemblers. The assembler starting from 2.16.90.0.1 will also support

	movw (%eax),%ds
	movw %ds,(%eax)

without the 0x66 prefix.
Patches for 2.4 and 2.6 Linux kernels are available at

	http://www.kernel.org/pub/linux/devel/binutils/linux-2.4-seg-4.patch
	http://www.kernel.org/pub/linux/devel/binutils/linux-2.6-seg-5.patch

The ia64 assembler is now defaulted to tune for Itanium 2 processors. To build a kernel for Itanium 1 processors, you will need to add

	ifeq ($(CONFIG_ITANIUM),y)
		CFLAGS += -Wa,-mtune=itanium1
		AFLAGS += -Wa,-mtune=itanium1
	endif

to arch/ia64/Makefile in your kernel source tree.

Please report any bugs related to binutils 2.17.50.0.8 to [EMAIL PROTECTED] and http://www.sourceware.org/bugzilla/

If you don't use

	# rpmbuild -ta binutils-xx.xx.xx.xx.xx.tar.bz2

to compile the Linux binutils, please read patches/README in source tree to apply Linux patches if there are any.

Changes from binutils 2.17.50.0.7:

1. Update from binutils 2006 1201.
2. Fix "objcopy --only-keep-debug" crash. PR 3609.
3. Fix various ARM ELF bugs.
4. Fix various xtensa bugs.
5. Update x86 disassembler.

Changes from binutils 2.17.50.0.6:

1. Update from binutils 2006 1127.
2. Properly set ELF output segment address when the first section in input segment is removed.
3. Better merging of CIEs in linker .eh_frame optimizations.
4. Support .cfi_personality and .cfi_lsda assembler directives.
5. Fix an ARM linker crash. PR 3532.
6. Fix various PPC64 ELF bugs.
7. Mark discarded debug info more thoroughly in linker output.
8. Fix various MIPS ELF bugs.
9. Fix readelf to display program interpreter path > 64 chars. PR 3384.
10. Add support for PowerPC SPU.
11. Properly handle cloned symbols used in relocations in assembler. PR 3469.
12. Update opcode for POPCNT in amdfam10 architecture.

Changes from binutils 2.17.50.0.5:

1. Update from binutils 2006 1020.
2. Don't make debug symbol dynamic. PR 3290.
3. Don't page align empty SHF_ALLOC sections, which leads to very large executables. PR 3314.
4. Use a different section index for section relative symbols against removed empty sections.
5. Fix a few ELF EH frame handling bugs.
6. Don't ignore relocation overflow on branches to undefweaks for x86-64. PR 3283.
7. Rename MNI to SSSE3.
8. Properly append symbol list for --dynamic-list. lists.
9. Various ARM ELF fixes.
10. Correct 64bit library search path for Linux/x86 linker with 64bit support.
11. Fix ELF linker to copy OS/PROC specific flags from input section to output section.
12. Fix DW_FORM_ref_addr handling in linker dwarf reader. PR 3191.
13. Fix ELF indirect symbol handling. PR 3351.
14. Fix PT_GNU_RELRO segment handling for SHF_TLS sections. Don't add PT_GNU_RELRO segment when there are no relro sections. PR 3281.
15. Various MIPS ELF fixes.
16. Various Sparc ELF fixes.
17. Various Xtensa ELF fixes.

Changes from binutils 2.17.50.0.4:

1. Update from binutils 2006 0927.
2. Fix linker regressions of section address and section relative symbol with empty output section. PR 3223/3267.
3. Fix "strings -T". PR 3257.
4. Fix "objcopy --only-keep-debug". PR 3262.
5. Add Intel iwmmxt2 support.
6. Fix an x86 disassembler bug. PR 3100.

Changes from binutils 2.17.50.0.3:

1. Update from binutils 200
Re: mainline slowdown
On 12/1/06, Andrew MacLeod <[EMAIL PROTECTED]> wrote:
> My bootstrap/make check cycle took about 10 hours with yesterdays
> checkout (way longer than expected). A quick investigation shows C++
> compilation timed are through the roof.

10 hours?

> Using quick (in theory) and trusty cpgram.ii, I get:
>
> tree PTA :1135.48 (88%) usr 5.47 (55%) sys1168.23 (85%) wall 4045 kB ( 1%) ggc
> TOTAL :1283.62 9.97 1381.98 451745 kB

This is uh, like 20 minutes wall time. So where is 10 hours coming from?

> Is this new code, or is this the old issue we had a few weeks ago? I
> lost track.

Same issue. Still working on it. The patch i posted to you ended up having a bunch of underlying issues that needed solving, so it's taking longer than expected.

I'm down to 1 testsuite failure. I'm happy to commit what i have now and fix that one testsuite failure in a followup if that is what we want.
Re: CEA (France) has signed assignment of copyright to FSF on GCC
On Fri, Dec 01, 2006 at 04:35:32PM +0100, Paolo Bonzini wrote: > There could be privacy problems too. I don't know the relevant > legislation, but the [copyright assignment] list includes personal data > (year of birth, citizenship, employer) and, in Italy, I would have to > sign a form if I had access to such data. The legislation in the US, > however, is probably very different: The US laws on the data privacy practices private organizations must follow are far looser than those in the EU. If an organization publishes a privacy policy they must follow it, or if they agreed to keep data confidential when they collected it they must follow that, and there are tougher rules for some kinds of data (particularly financial information), but otherwise the rules are pretty loose (I'll put in the usual IANAL disclaimer here). Just the same, the FSF properly tries to keep access to people's personal information limited, going beyond the legal requirements. Right now, if there is doubt about someone's status, we can get someone who has a gnu.org account to check the list.
Re: mainline slowdown
On Fri, 2006-12-01 at 13:49 -0500, Daniel Berlin wrote: > On 12/1/06, Andrew MacLeod <[EMAIL PROTECTED]> wrote: > > My bootstrap/make check cycle took about 10 hours with yesterdays > > checkout (way longer than expected). A quick investigation shows C++ > > compilation timed are through the roof. > > 10 hours? read carefully. "bootstrap/make check" > > > > > Using quick (in theory) and trusty cpgram.ii, I get: > > > > tree PTA :1135.48 (88%) usr 5.47 (55%) sys1168.23 (85%) wall > >4045 kB ( 1%) ggc > > TOTAL :1283.62 9.97 1381.98 > > 451745 kB > > This is uh, like 20 minutes wall time. > So where is 10 hours coming from? this says cpgram.ii, not bootstrap/make check cycle. Big difference. > > > > > > > Is this new code, or is this the old issue we had a few weeks ago? I > > lost track. > > Same issue. > > Still working on it. > The patch i posted to you ended up having a bunch of underlying issues > that needed solving, so it's taking longer than expected. > S'OK. like I said, I wasn't paying close attention and didnt know if you had checked anything in or not. I was under the impression you had, so wanted to make sure we weren't seeing a regression > I'm down to 1 testsuite failure > I'm happy to commit what i have now and fix that one testsuite failure > in a followup if that is what we want. Well, I dont feel strongly one way or the other, I just want to make sure it addressed :-) Andrew
Re: SPEC CFP2000 and polyhedron runtime scores dropped from 13. november onwards
On Fri, 2006-12-01 at 11:45 -0500, Daniel Berlin wrote: > On 12/1/06, Richard Guenther <[EMAIL PROTECTED]> wrote: > > On 12/1/06, Uros Bizjak <[EMAIL PROTECTED]> wrote: > > > Hello! > until we can deal with the register pressure (though i thought we had > out-of-ssa changes to help with this now). I don't know why you would think that... I've never supplied any patches or numbers. I have barely even started that work. The out of SSA changes I have (which are not checked in yet) are merely rewriting chunks of out of ssa to speed it up and take care of various warts. (And to make it more suitable for some register pressure experimenting, and a potenital/eventual merge with expand). Andrew
call for 4.3 project reviewer for amdfam10 project
Hello, In accordance with http://gcc.gnu.org/ml/gcc/2006-09/msg00454.html, I am looking for a reviewer for patches that add tuning for AMD's new AMDFAM10 architecture to gcc. The changes are all confined to the i386 backend and are only turned on with -march=amdfam10 and/or -mtune=amdfam10. The patches are ready, but they need to be ported to the latest mainline. We have run tests to make sure that nothing changes for the other -march/tune paths like "generic". There will be some more patches before stage 2 closes, but it would be good to get the ones that are ready & tested checked in sooner. Jan Hubicka has offered to pre-review the patches. If some one else is willing to help with giving the patches a final review that would be very helpful. Thanks, Harsha
Re: mainline slowdown
On 12/1/06, Andrew MacLeod <[EMAIL PROTECTED]> wrote:
> On Fri, 2006-12-01 at 13:49 -0500, Daniel Berlin wrote:
> > On 12/1/06, Andrew MacLeod <[EMAIL PROTECTED]> wrote:
> > > My bootstrap/make check cycle took about 10 hours with yesterdays
> > > checkout (way longer than expected). A quick investigation shows C++
> > > compilation timed are through the roof.
> >
> > 10 hours?
>
> read carefully. "bootstrap/make check"

Yes, so, i've never seen a bootstrap make check take 10 hours. :)
Re: mainline slowdown
On 12/1/06, Andrew MacLeod <[EMAIL PROTECTED]> wrote:
> On Fri, 2006-12-01 at 13:49 -0500, Daniel Berlin wrote:
> > On 12/1/06, Andrew MacLeod <[EMAIL PROTECTED]> wrote:
> > > My bootstrap/make check cycle took about 10 hours with yesterdays
> > > checkout (way longer than expected). A quick investigation shows C++
> > > compilation timed are through the roof.
> >
> > 10 hours?
>
> read carefully. "bootstrap/make check"
>
> > > Using quick (in theory) and trusty cpgram.ii, I get:
> > >
> > > tree PTA :1135.48 (88%) usr 5.47 (55%) sys1168.23 (85%) wall 4045 kB ( 1%) ggc
> > > TOTAL :1283.62 9.97 1381.98 451745 kB
> >
> > This is uh, like 20 minutes wall time.
> > So where is 10 hours coming from?
>
> this says cpgram.ii, not bootstrap/make check cycle. Big difference.

BTW, what do you think these have to do with each other? One is a pathological testcase with about 1-5 initializers, the other is a whole bunch of relatively normal code. So why would you attempt to draw conclusions about bootstrap/regtest from cpgram.ii?

*particularly* when the other issue you keep harping on has in fact, been shown *not* to increase GCC compile time by the regression testers.
Re: mainline slowdown
On Fri, 2006-12-01 at 15:06 -0500, Daniel Berlin wrote: > On 12/1/06, Andrew MacLeod <[EMAIL PROTECTED]> wrote: > > On Fri, 2006-12-01 at 13:49 -0500, Daniel Berlin wrote: > > > > > > > > Using quick (in theory) and trusty cpgram.ii, I get: > > > > > > > > tree PTA :1135.48 (88%) usr 5.47 (55%) sys1168.23 (85%) > > > > wall4045 kB ( 1%) ggc > > > > TOTAL :1283.62 9.97 1381.98 > > > > 451745 kB > > > > > > This is uh, like 20 minutes wall time. > > > So where is 10 hours coming from? > > > > this says cpgram.ii, not bootstrap/make check cycle. Big difference. > > BTW, what do you think these have to do with each other? > nothing except thats its C++ code and I had it handy. I thought You had fixed this and was making sure that if you had you were aware there was a regression. Hence I asked: "Is this new code, or is this the old issue we had a few weeks ago? I lost track." > > *particularly* when the other issue you keep harping on has in fact, > been shown *not* to increase GCC compile time by the regression > testers. what other issue have I harped on? One note asking about mainline bootstrap/make check that seemed excessively long? I don't recall ever "harping on" about anything else except possibly discussing this cpgram issue a few weeks ago. Andrew
Re: mainline slowdown
On Fri, 2006-12-01 at 14:59 -0500, Daniel Berlin wrote: > On 12/1/06, Andrew MacLeod <[EMAIL PROTECTED]> wrote: > > On Fri, 2006-12-01 at 13:49 -0500, Daniel Berlin wrote: > > > On 12/1/06, Andrew MacLeod <[EMAIL PROTECTED]> wrote: > > > > My bootstrap/make check cycle took about 10 hours with yesterdays > > > > checkout (way longer than expected). A quick investigation shows C++ > > > > compilation timed are through the roof. > > > > > > 10 hours? > > > > read carefully. "bootstrap/make check" > > Yes, so, i've never seen a bootstrap make check take 10 hours. > :) me either!! Hence the question if anyone else was seeing it. Diego says PPC runs are faster than they use to be. Anyone with x86 seeing slowdowns? Maybe my computer just hates me this week. Andrew
Re: mainline slowdown
On 01/12/06, Andrew MacLeod <[EMAIL PROTECTED]> wrote:
> On Fri, 2006-12-01 at 14:59 -0500, Daniel Berlin wrote:
> > On 12/1/06, Andrew MacLeod <[EMAIL PROTECTED]> wrote:
> > > On Fri, 2006-12-01 at 13:49 -0500, Daniel Berlin wrote:
> > > > On 12/1/06, Andrew MacLeod <[EMAIL PROTECTED]> wrote:
> > > > > My bootstrap/make check cycle took about 10 hours with yesterdays
> > > > > checkout (way longer than expected). A quick investigation shows C++
> > > > > compilation timed are through the roof.
> > > >
> > > > 10 hours?
> > >
> > > read carefully. "bootstrap/make check"
> >
> > Yes, so, i've never seen a bootstrap make check take 10 hours.
> > :)
>
> me either!! Hence the question if anyone else was seeing it.
>
> Diego says PPC runs are faster than they use to be. Anyone with x86
> seeing slowdowns? Maybe my computer just hates me this week.

I have just finished bootstrapping/make check revision 119259 and it took 9 hours 25 minutes on a dual Pentium III (256KB cache) 1GHz with 1GB of RAM.

	configure --enable-decimal-float --enable-languages=c,c++,fortran,java,objc
	make -j3 CFLAGS='-gdwarf-2 -g3'
	make -k check

As far as I know, nothing else has been running on the machine meanwhile.

I hope this information helps,

Manuel.
[RFC] timers, pointers to functions and type safety
There's a bunch of related issues, some kernel, some gcc, thus the Cc from hell on that one.

First of all, in theory the timers in kernel are done that way:
* they have callback of type void (*)(unsigned long)
* they have data to be passed to it - of type unsigned long
* callback is called by the code that even in theory has no chance whatsoever of inlining the call.
* one of the constraints on the targets we can port the kernel on is that unsigned long must be uintptr_t.

The last one means that we can pass any pointers to these suckers; just cast to unsigned long and cast back in the callback. While that is safe (modulo the portability constraint that affects much more code than just timers), it ends up very inconvenient and leads to lousy type safety.

The thing is, absolute majority of callbacks really want a pointer to some object. There is a handful of cases where we really want a genuine number - not a pointer cast to unsigned long, not an index in array, etc. They certainly can be dealt with. Nearly a thousand of other instances definitely want pointers.

Another observation is that quite a few places are playing fast and loose with function pointers. Some are just too lazy and cast void (*)(void) to void (*)(unsigned long). These, IMO, should stop wanking and grow an unused argument. Not worth the ugliness...

However, there are other cases, including very interesting

    timer->function = (void (*)(unsigned long))func;
    timer->data = (unsigned long)p;

with func actually being void (void *) and p being void *. Now, _that_ is definitely not a valid C. Worse, it doesn't take much to come up with realistic architecture that would have different calling conventions for those.
Just assume that
* there are two groups of registers (A and D)
* base address for memory access must be in some A register
* both A and D registers can be used for arithmetic
* ABI is such that functions with few arguments have them passed via A and D registers, with pointers going via A and numbers via D.

Realistic enough? I wouldn't be surprised if such beasts actually existed - embedded processors influenced by m68k are not particularly rare and picking such ABI would make sense for them.

Note that this kind of casts is not just in some obscure code; e.g. rpc_init_task() does just that.

And that's where it gets interesting. It would be very nice to get to the following situation:
* callbacks are void (*)(void *)
* data is void *
* instances can take void * or pointer to object type
* a macro SETUP_TIMER(timer, func, data) sets callback and data and checks if func(data) would be valid.

It would remove a lot of cruft and definitely improve the type safety of the entire thing. It's not hard to do; all it takes [warning: non portable C ahead] is

    typeof(*data) *p = data;
    timer->function = (void (*)(void *))func;
    timer->data = (void *)p;
    (void)(0 && (func(p),0));

Again, that's not a portable C, even leaving aside the use of typeof. Casts between the incompatible function types are undefined behaviour; rationale is that we might have different calling conventions for those. However, here we are at much safer ground; calling conventions are not likely to change if you replace a pointer to object with void *. It's still possible in theory, but let's face it, we would have far worse problems if it ever came to porting to such targets.

Note that here we have absolutely no possibility of eventual call ever being inlined, no matter what kind of analysis compiler might be doing.
Call happens when kernel/timer.c gets to structure while trawling the lists and it simply has no way to tell which callback might be there (as a matter of fact, callback can and often does come from a module). IOW, "gcc would ICE if it ever inlined it" kind of arguments (i.e. what had triggered gcc refusing to do direct call via such cast) doesn't apply here.

Question to gcc folks: can we expect no problems with the approach above, provided that calling conventions are the same for passing void * and pointers to objects? No versions (including current 4.3 snapshots) create any problems here...

BTW, similar trick would be very useful for workqueues - there we already have void * as argument, but extra type safety would be nice to have.

Now, there's another question: how do we get there? Or, at least, from current void (*)(unsigned long) to void (*)(void *)...

"A fscking huge patch flipping everything at once" is obviously not an answer; too much PITA merging and impossible to review. There is a tempting possibility to do that gradually, with all intermediates still in working state, provided that on all platforms currently supported by the kernel unsigned long and void * are indeed passed the same way (again, only for current kernel targets and existing gcc versions). Namely, we could
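For readers who want to try the SETUP_TIMER idea from the message above, here is a minimal user-space sketch. It is not the kernel's code: struct timer_sketch, struct counter, bump() and run_timer() are hypothetical stand-ins, the macro's third parameter is named ptr so that it does not collide with the data member, and __typeof__ is used as the spelling of GNU typeof that GCC also accepts in strict -std= modes. As the message warns, calling through the cast function pointer is not portable C; the sketch assumes an ABI that passes void * and pointers to objects the same way.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's timer structure. */
struct timer_sketch {
    void (*function)(void *);
    void *data;
};

/*
 * Store callback and data, and make the compiler type-check func(ptr).
 * The call sits behind "0 &&", so it is checked but never evaluated.
 * Non-portable: relies on GNU typeof and on a cast between
 * incompatible function pointer types.
 */
#define SETUP_TIMER(timer, func, ptr)                     \
    do {                                                  \
        __typeof__(*(ptr)) *p_ = (ptr);                   \
        (timer)->function = (void (*)(void *))(func);     \
        (timer)->data = (void *)p_;                       \
        (void)(0 && ((func)(p_), 0));                     \
    } while (0)

/* Hypothetical object and a callback that takes a typed pointer. */
struct counter {
    int ticks;
};

void bump(struct counter *c)
{
    c->ticks++;
}

/* Stand-in for the code that fires an expired timer. */
void run_timer(struct timer_sketch *t)
{
    assert(t->function != NULL);
    t->function(t->data);
}
```

With this in place, SETUP_TIMER(&t, bump, &c) compiles for a struct counter c, while passing a pointer to some unrelated type draws an incompatible-pointer diagnostic from the hidden func(p_) call, which is the extra type safety the message is after.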
gcc-4.1-20061201 is now available
Snapshot gcc-4.1-20061201 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.1-20061201/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.1 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_1-branch revision 119420

You'll find:

gcc-4.1-20061201.tar.bz2            Complete GCC (includes all of below)
gcc-core-4.1-20061201.tar.bz2       C front end and core compiler
gcc-ada-4.1-20061201.tar.bz2        Ada front end and runtime
gcc-fortran-4.1-20061201.tar.bz2    Fortran front end and runtime
gcc-g++-4.1-20061201.tar.bz2        C++ front end and runtime
gcc-java-4.1-20061201.tar.bz2       Java front end and runtime
gcc-objc-4.1-20061201.tar.bz2       Objective-C front end and runtime
gcc-testsuite-4.1-20061201.tar.bz2  The GCC testsuite

Diffs from 4.1-20061124 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.1 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.
rtl dumps
Hi, I have noticed that the INSN_CODE for all patterns in the rtl dumps .00.expand are -1 ... does this mean that the .md file was not used for the initial RTL generation? best regards Andrija Radicevic
Re: rtl dumps
On 12/1/06, Andrija Radicevic <[EMAIL PROTECTED]> wrote:
> Hi, I have noticed that the INSN_CODE for all patterns in the rtl dumps
> .00.expand are -1 ... does this mean that the .md file was not used for
> the initial RTL generation?

It was used, but it is assumed that the initial RTL produced by 'expand' is valid, i.e. you should be able to call recog() on all insns and not fail.

Gr. Steven
[C/C++] same warning/error, different text
The message for the following error:

    enum e { E3 = 1 / 0 };

is in C:

    error: enumerator value for 'E3' is not an integer constant

and in C++:

    error: enumerator value for 'E3' not integer constant

The code in C is

    error ("enumerator value for %qE is not an integer constant", name);

and in C++ is

    error ("enumerator value for %qD not integer constant", name);

Is anyone against fixing this? What would be the preferred message?

This arises because I am working on a patch that fixes PR28986 (overflow warnings in C++). Since there are currently not many testcases in C++ for this (actually there are zero for the above error), I was planning to copy the ones from the C front-end (which are excellent, and I guess they should apply to C++ most of the time).
Re: mainline slowdown
On Fri, Dec 01, 2006 at 03:37:22PM -0500, Andrew MacLeod wrote:
> On Fri, 2006-12-01 at 14:59 -0500, Daniel Berlin wrote:
> > On 12/1/06, Andrew MacLeod <[EMAIL PROTECTED]> wrote:
> > > On Fri, 2006-12-01 at 13:49 -0500, Daniel Berlin wrote:
> > > > On 12/1/06, Andrew MacLeod <[EMAIL PROTECTED]> wrote:
> > > > > My bootstrap/make check cycle took about 10 hours with yesterday's
> > > > > checkout (way longer than expected). A quick investigation shows C++
> > > > > compilation times are through the roof.
> > > >
> > > > 10 hours?
> > >
> > > read carefully. "bootstrap/make check"
> >
> > Yes, so, i've never seen a bootstrap make check take 10 hours.
> > :)
>
> me either!! Hence the question if anyone else was seeing it. Diego says
> PPC runs are faster than they used to be. Anyone with x86 seeing
> slowdowns? Maybe my computer just hates me this week.

These are what I have for Nov. on a dual-cpu P4 3.2GHz with HT:

nohup.20061103:7129.07user 1279.83system 42:27.00elapsed 330%CPU (0avgtext+0avgdata 0maxresident)k
nohup.20061103:8203.19user 2691.67system 57:45.37elapsed 314%CPU (0avgtext+0avgdata 0maxresident)k
nohup.20061108:7082.96user 1306.05system 42:25.73elapsed 329%CPU (0avgtext+0avgdata 0maxresident)k
nohup.20061108:8796.31user 2801.81system 53:57.75elapsed 358%CPU (0avgtext+0avgdata 0maxresident)k
nohup.20061110:7315.57user 1292.64system 45:59.12elapsed 311%CPU (0avgtext+0avgdata 0maxresident)k
nohup.20061110:8326.86user 2727.21system 58:12.37elapsed 316%CPU (0avgtext+0avgdata 0maxresident)k
nohup.20061113:7158.86user 1288.96system 45:19.70elapsed 310%CPU (0avgtext+0avgdata 0maxresident)k
nohup.20061113:8161.41user 2746.78system 58:02.40elapsed 313%CPU (0avgtext+0avgdata 0maxresident)k
nohup.20061115:8554.33user 1277.61system 54:21.33elapsed 301%CPU (0avgtext+0avgdata 0maxresident)k
nohup.20061115:8592.50user 2801.66system 53:12.89elapsed 356%CPU (0avgtext+0avgdata 0maxresident)k
nohup.20061116:8553.23user 1286.27system 54:27.57elapsed 301%CPU (0avgtext+0avgdata 0maxresident)k
nohup.20061116:8816.32user 2784.34system 54:07.30elapsed 357%CPU (0avgtext+0avgdata 0maxresident)k
nohup.20061118:7201.91user 1278.13system 45:30.10elapsed 310%CPU (0avgtext+0avgdata 0maxresident)k
nohup.20061118:8615.97user 2816.65system 53:31.23elapsed 356%CPU (0avgtext+0avgdata 0maxresident)k
nohup.20061120:7251.84user 1297.72system 45:52.32elapsed 310%CPU (0avgtext+0avgdata 0maxresident)k
nohup.20061120:8240.75user 2786.54system 58:16.48elapsed 315%CPU (0avgtext+0avgdata 0maxresident)k
nohup.20061122:7108.80user 1319.07system 45:23.87elapsed 309%CPU (0avgtext+0avgdata 0maxresident)k
nohup.20061122:8135.23user 2772.20system 57:52.77elapsed 314%CPU (0avgtext+0avgdata 0maxresident)k
nohup.20061123:7118.26user 1295.97system 45:18.50elapsed 309%CPU (0avgtext+0avgdata 0maxresident)k
nohup.20061123:8555.19user 2828.30system 53:29.78elapsed 354%CPU (0avgtext+0avgdata 0maxresident)k
nohup.20061125:7075.41user 1278.07system 45:08.26elapsed 308%CPU (0avgtext+0avgdata 0maxresident)k
nohup.20061125:8192.55user 2751.37system 56:20.08elapsed 323%CPU (0avgtext+0avgdata 0maxresident)k
nohup.20061127:7177.19user 1295.59system 45:34.88elapsed 309%CPU (0avgtext+0avgdata 0maxresident)k
nohup.20061127:8173.81user 2752.39system 57:50.67elapsed 314%CPU (0avgtext+0avgdata 0maxresident)k
nohup.20061201:6880.76user 1284.99system 41:35.44elapsed 327%CPU (0avgtext+0avgdata 0maxresident)k
nohup.20061201:7942.93user 2742.14system 57:23.53elapsed 310%CPU (0avgtext+0avgdata 0maxresident)k

For each date, the first line is bootstrap and the second one is check. I used --enable-checking=assert.

H.J.
Optimizing a 16-bit * 8-bit -> 24-bit multiplication
I would like to multiply a 16-bit number by an 8-bit number and produce a 24-bit result on the AVR. The AVR has a hardware 8-bit * 8-bit -> 16-bit multiplier. If I multiply a 16-bit number by a 16-bit number, it produces a 16-bit result, which isn't wide enough to hold the result. If I cast one of the operands to 32-bit and multiply a 32-bit number by a 16-bit number, GCC generates a call to __mulsi3, which is the routine to multiply a 32-bit number by a 32-bit number and produce a 32-bit result and requires ten 8-bit * 8-bit multiplications. A 16-bit * 8-bit -> 24-bit multiplication only requires two 8-bit * 8-bit multiplications. A 16-bit * 16-bit -> 32-bit multiplication requires four 8-bit * 8-bit multiplications. I could write a mul24_16_8 (16-bit * 8-bit -> 24-bit) function using unions and 8-bit * 8-bit -> 16-bit multiplications, but before I go down that path, is there any way to coerce GCC into generating the code I desire? Cheers, Shaun
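As a point of comparison for the question above, here is a minimal sketch of the hand-written fallback the message mentions. The name mul24_16_8 comes from the message; the body is an assumption, written with shifts and masks rather than unions so that it stays portable C. It splits the 16-bit operand into bytes and combines two 8-bit * 8-bit -> 16-bit partial products, returning the 24-bit result in the low bits of a uint32_t. Whether avr-gcc turns each partial product into a single hardware multiply depends on the compiler version and options, so the generated assembly is worth checking.

```c
#include <stdint.h>

/*
 * 16-bit * 8-bit -> 24-bit multiply from two 8x8 -> 16 partial products:
 *   a * b = (a_lo * b) + (a_hi * b) * 256
 * The result always fits in 24 bits (maximum 0xFFFF * 0xFF = 0xFEFF01).
 */
uint32_t mul24_16_8(uint16_t a, uint8_t b)
{
    uint8_t a_lo = (uint8_t)(a & 0xFF);
    uint8_t a_hi = (uint8_t)(a >> 8);

    /* Each partial product fits in 16 bits: 255 * 255 = 65025. */
    uint16_t p_lo = (uint16_t)((uint16_t)a_lo * b);
    uint16_t p_hi = (uint16_t)((uint16_t)a_hi * b);

    /* Widen only for the final shift and add. */
    return (uint32_t)p_lo + ((uint32_t)p_hi << 8);
}
```

The widening to 32 bits happens only in the last line, so the multiplications themselves never need more than a 16-bit result.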
Determining if a function has vague linkage
Hi all,

I understand that all template functions in GCC should have vague linkage and thus may be exported into numerous translation units where they are used. I have been attempting to use a few different macros on both an instantiated template function's FUNCTION_DECL node and a normal function's FUNCTION_DECL node to see what the results are. I have tried:

  DECL_WEAK(fndecl)
  DECL_ONE_ONLY(fndecl)
  DECL_COMDAT(fndecl)
  DECL_DEFER_OUTPUT(fndecl)
  DECL_REPO_AVAILABLE_P(fndecl)
  IDENTIFIER_REPO_CHOSEN(DECL_ASSEMBLER_NAME(fndecl))

and ALL of these macros are returning false for both FUNCTION_DECL nodes.

Is there any macro I can use to determine if a FUNCTION_DECL node has vague linkage? Or do I need to just assume that it is the case for a template function?

Also, what are some other examples of functions that should return true for any of the above macros (I assume inlines do sometimes, according to the vague linkage GCC page)?

Thanks,
Brendon.
Re: [RFC] timers, pointers to functions and type safety
On Fri, 2006-12-01 at 17:21 +, Al Viro wrote:
> There's a bunch of related issues, some kernel, some gcc,
> thus the Cc from hell on that one.

I don't really see how this is a GCC question, rather I see this as a C question which means this should have gone to either [EMAIL PROTECTED] or the C news group.

> While that is safe (modulo the portability constraint that affects much
> more code than just timers), it ends up very inconvenient and leads to
> lousy type safety.

Why do you say it is inconvenient? That shows up all the time in real code :).

> The thing is, absolute majority of callbacks really want a pointer to
> some object. There is a handful of cases where we really want a genuine
> number - not a pointer cast to unsigned long, not an index in array, etc.
> They certainly can be dealt with. Nearly a thousand of other instances
> definitely want pointers.

Then create a union which contains the two different types of callback. You know:

    union a {
      void (*callbackwithulong) (unsigned long);
      void (*callbackwithptr) (void*);
    };

And then you just use the correct one in the correct place. I don't see why there is a mystery about this?

Thanks,
Andrew Pinski

PS don't cross post and I still don't see a GCC development question in here, only a C one.
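The union suggestion above can be sketched roughly as follows. Everything here beyond the two callback signatures is an assumption for illustration: struct cb_timer, the kind tag, timer_fire() and the two sample callbacks are hypothetical names. The tag is added because a bare union does not record which member was stored, so the code that fires the timer needs some way to pick the matching call.

```c
#include <stddef.h>

/* The two callback signatures from the message, sharing one slot. */
union timer_callback {
    void (*with_ulong)(unsigned long);
    void (*with_ptr)(void *);
};

/* Hypothetical tag: records which union member is in use. */
enum timer_kind { TIMER_ULONG, TIMER_PTR };

/* Hypothetical timer holding a tagged callback and matching data. */
struct cb_timer {
    enum timer_kind kind;
    union timer_callback fn;
    union {
        unsigned long num;
        void *ptr;
    } data;
};

/* Call the stored callback through the member that was actually set. */
void timer_fire(struct cb_timer *t)
{
    if (t->kind == TIMER_ULONG)
        t->fn.with_ulong(t->data.num);
    else
        t->fn.with_ptr(t->data.ptr);
}

/* Sample callbacks, one per signature, for trying the sketch out. */
unsigned long seen_num;

void record_num(unsigned long n)
{
    seen_num = n;
}

void bump_int(void *p)
{
    ++*(int *)p;
}
```

Each function pointer is always called through its own type here, so no cast between incompatible function types is involved. It does not by itself give callbacks a typed pointer argument; they still receive void *.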
Re: [RFC] timers, pointers to functions and type safety
Andrew Pinski wrote:
> On Fri, 2006-12-01 at 17:21 +, Al Viro wrote:
> > There's a bunch of related issues, some kernel, some gcc,
> > thus the Cc from hell on that one.
>
> I don't really see how this is a GCC question, rather I see this as a C
> question which means this should have gone to either [EMAIL PROTECTED]
> or the C news group.
> . . .
> PS don't cross post and I still don't see a GCC development question in
> here, only a C one.

Andrew,

There are times when we should cut people a little slack. If not for the sake of general harmony, then at least to facilitate the improvement of two important programs (the Linux kernel and GCC).

There is a lot more that could be said, but I will leave it at that,
David Daney
Re: [RFC] timers, pointers to functions and type safety
On 12/1/06, Al Viro <[EMAIL PROTECTED]> wrote:
> There's a bunch of related issues, some kernel, some gcc, thus the Cc from hell on that one.
>
> First of all, in theory the timers in kernel are done that way:
> * they have callback of type void (*)(unsigned long)
> * they have data to be passed to it - of type unsigned long
> * callback is called by the code that even in theory has no chance whatsoever of inlining the call.
> * one of the constraints on the targets we can port the kernel on is that unsigned long must be uintptr_t.
>
> The last one means that we can pass any pointers to these suckers; just cast to unsigned long and cast back in the callback. While that is safe (modulo the portability constraint that affects much more code than just timers), it ends up very inconvenient and leads to lousy type safety.

Understandable. I assume you are trying to get more type safety more for error checking than optimization, being that the kernel still defaults to -fno-strict-aliasing.

> The thing is, absolute majority of callbacks really want a pointer to some object. There is a handful of cases where we really want a genuine number - not a pointer cast to unsigned long, not an index in array, etc. They certainly can be dealt with. Nearly a thousand of other instances definitely want pointers.
>
> Another observation is that quite a few places are playing fast and loose with function pointers. Some are just too lazy and cast void (*)(void) to void (*)(unsigned long). These, IMO, should stop wanking and grow an unused argument. Not worth the ugliness...
>
> However, there are other cases, including very interesting
>
>     timer->function = (void (*)(unsigned long))func;
>     timer->data = (unsigned long)p;
>
> with func actually being void (void *) and p being void *. Now, _that_ is definitely not a valid C. Worse, it doesn't take much to come up with realistic architecture that would have different calling conventions for those.
> Just assume that
> * there are two groups of registers (A and D)
> * base address for memory access must be in some A register
> * both A and D registers can be used for arithmetic
> * ABI is such that functions with few arguments have them passed via A and D registers, with pointers going via A and numbers via D.
>
> Realistic enough? I wouldn't be surprised if such beasts actually existed - embedded processors influenced by m68k are not particularly rare and picking such ABI would make sense for them.
>
> Note that this kind of casts is not just in some obscure code; e.g. rpc_init_task() does just that.
>
> And that's where it gets interesting. It would be very nice to get to the following situation:
> * callbacks are void (*)(void *)
> * data is void *
> * instances can take void * or pointer to object type
> * a macro SETUP_TIMER(timer, func, data) sets callback and data and checks if func(data) would be valid.
>
> It would remove a lot of cruft and definitely improve the type safety of the entire thing. It's not hard to do; all it takes [warning: non portable C ahead] is
>
>     typeof(*data) *p = data;
>     timer->function = (void (*)(void *))func;
>     timer->data = (void *)p;
>     (void)(0 && (func(p),0));
>
> Again, that's not a portable C, even leaving aside the use of typeof. Casts between the incompatible function types are undefined behaviour; rationale is that we might have different calling conventions for those. However, here we are at much safer ground; calling conventions are not likely to change if you replace a pointer to object with void *.

Is this true of the ports you guys support even if the object is a function pointer or a function? (Though the first case may be insane. I can't think of a *good* reason you'd pass a pointer to a function pointer to a timer callback.)

> It's still possible in theory, but let's face it, we would have far worse problems if it ever came to porting to such targets.
> Note that here we have absolutely no possibility of eventual call ever being inlined, no matter what kind of analysis compiler might be doing.

Ah, well, here is where you are kinda wrong, but not for the reason you are probably thinking of.

> Call happens when kernel/timer.c gets to structure while trawling the lists and it simply has no way to tell which callback might be there (as a matter of fact, callback can and often does come from a module).

Right, it doesn't know what it will *always* be, but it may add if's and inline *possible* target sites based on profile results. Particularly since the profile will tell us which are *actually* called. This shouldn't matter however, we still shouldn't ICE if we inline it :)

> IOW, "gcc would ICE if it ever inlined it" kind of arguments (i.e. what had triggered gcc refusing to do direct call via such cast) doesn't apply here.
>
> Question to gcc folks: can we expect no problems with the approach above, provided that calling conventions