Re: How to implement pattens with more that 30 alternatives
On Mon, 2009-12-21 at 18:44 +, Paul Brook wrote:
> > > > I am doing a port in GCC 4.4.0 for a 32 bit target. As a part of
> > > > scheduling framework i have to write the move patterns with more
> > > > clarity, so that i could control the scheduling with the help of
> > > > attributes. Rewriting the pattern resulted in movsi pattern with 41
> > > > alternatives :(
> > >
> > > Use rtl expressions instead of alternatives. e.g. arm.md:arith_shiftsi
> >
> > Or use the more modern iterators approach.
>
> Aren't iterators for generating multiple insns (e.g. movsi and movdi)
> from the same pattern, whereas in this case we have a single insn that
> needs to accept many different operand combinations?

Yes, but that is often better, I suspect, than having too fancy a
pattern that breaks the optimization simplifications that genrecog does.

Note that the attributes that were requested could be made part of the
iterator as well, using a mode_attribute.

R.
Re: [PATCH] ARM: Convert BUG() to use unreachable()
On 12/17/2009 06:17 PM, Richard Guenther wrote:
> It shouldn't as *(int *)0 = 0; might trap. But if you want to be sure
> use __builtin_trap (); instead for the whole sequence (the unreachable
> is implied then). GCC chooses a size-optimal trap representation for
> your target then.

Agree that it shouldn't, but just to be sure I'd use

    *(volatile int *)0 = 0;
    unreachable ();

Paolo
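For readers comparing the two suggestions in this message, here is a minimal C sketch of both forms. The macro and function names (MY_BUG_NULL_STORE, MY_BUG_TRAP, checked_div, checked_index) are hypothetical and are not the kernel's actual BUG() definition. The sketch assumes a compiler that provides __builtin_trap() and __builtin_unreachable() (GCC 4.5 or later for the latter).

```c
#include <assert.h>

/* Hypothetical names for illustration; not the kernel's real macros. */

/* Variant 1: explicit NULL store. The 'volatile' qualifier asks the
   compiler to keep the store; __builtin_unreachable() then tells it
   that control never continues past this point. */
#define MY_BUG_NULL_STORE()          \
    do {                             \
        *(volatile int *)0 = 0;      \
        __builtin_unreachable();     \
    } while (0)

/* Variant 2: let the compiler choose the trap instruction itself. */
#define MY_BUG_TRAP() __builtin_trap()

/* Divides, treating a zero divisor as a fatal bug. */
static int checked_div(int num, int den)
{
    if (den == 0)
        MY_BUG_NULL_STORE();
    return num / den;
}

/* Indexes a table, treating an out-of-range index as a fatal bug. */
static int checked_index(const int *table, int len, int i)
{
    if (i < 0 || i >= len)
        MY_BUG_TRAP();
    return table[i];
}
```

Only the non-failing paths can be exercised safely, e.g. checked_div(6, 3). How each failing path actually terminates depends on the target and the compiler.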
Re: How to implement pattens with more that 30 alternatives
> > > Or use the more modern iterators approach.
> >
> > Aren't iterators for generating multiple insns (e.g. movsi and movdi)
> > from the same pattern, whereas in this case we have a single insn that
> > needs to accept many different operand combinations?
>
> Yes, but that is often better, I suspect, than having too fancy a
> pattern that breaks the optimization simplifications that genrecog does.

My understanding was that you should not have multiple patterns that
match the same RTL. Especially with things like mov, it's important that
all alternatives be the same insn so that register allocation and reload
work. i.e. the following will work as expected:

(define_insn "*my_movsi"
  (set (match_operand:SI "..." "=a,b")
       (match_operand:SI "..." "ab,ab")))

However the following will not. Once gcc has picked a particular insn
(_a or _b), it will tend to stick with it and not try other patterns.

(define_insn "*my_movsi_a"
  (set (match_operand:SI "..." "=a")
       (match_operand:SI "..." "ab")))

(define_insn "*my_movsi_b"
  (set (match_operand:SI "..." "=b")
       (match_operand:SI "..." "ab")))

Paul
Re: [PATCH] ARM: Convert BUG() to use unreachable()
Russell King - ARM Linux wrote:
> On Mon, Dec 21, 2009 at 11:30:43AM -0800, Richard Henderson wrote:
>> On 12/17/2009 10:17 AM, Russell King - ARM Linux wrote:
>>> How is "size-optimal trap" defined?
>> E.g. Sparc and MIPS have "tcc" instructions that trap based on the
>> condition codes, and so we eliminate the branch. That's the only
>> optimization we apply with __builtin_trap.
>>
>>> Let me put it another way: I want this function to terminate with an
>>> explicit NULL pointer dereference in every case.
>> Then just use that.
>
> That's precisely what we have been using for many years.

I don't understand. It should still work just fine; the original version
posted appears to simply lack 'volatile' on the (int *) cast.

cheers,
DaveK
Re: How to implement pattens with more that 30 alternatives
On Tue, Dec 22, 2009 at 12:09:55PM +, Paul Brook wrote:
> i.e. the following will work as expected:
>
> (define_insn "*my_movsi"
>   (set (match_operand:SI "..." "=a,b")
>        (match_operand:SI "..." "ab,ab")))
>
> However the following will not. Once gcc has picked a particular insn
> (_a or _b), it will tend to stick with it and not try other patterns.

This is my understanding too - but it's a real nuisance. Suppose you
have two optional ISA extensions that have their own move instructions.
For the sake of conversation I'll call them Alice and Bob... no, I'll
call them TARGET_MAVERICK and TARGET_NEON. Now you need a minimum of
three copies of the mov pattern that are mostly the same.

It'd be nice if there was a way to compose instruction patterns :-(

--
Daniel Jacobowitz
CodeSourcery
Re: [PATCH] ARM: Convert BUG() to use unreachable()
On Tue, Dec 22, 2009 at 02:09:02PM +, Dave Korn wrote:
> Russell King - ARM Linux wrote:
> > On Mon, Dec 21, 2009 at 11:30:43AM -0800, Richard Henderson wrote:
> >> On 12/17/2009 10:17 AM, Russell King - ARM Linux wrote:
> >>> How is "size-optimal trap" defined?
> >> E.g. Sparc and MIPS have "tcc" instructions that trap based on the
> >> condition codes, and so we eliminate the branch. That's the only
> >> optimization we apply with __builtin_trap.
> >>
> >>> Let me put it another way: I want this function to terminate with an
> >>> explicit NULL pointer dereference in every case.
> >> Then just use that.
> >
> > That's precisely what we have been using for many years.
>
> I don't understand. It should still work just fine; the original version
> posted appears to simply lack 'volatile' on the (int *) cast.

Neither do I - AFAIK the existing code works fine. I think this is just
a noisy thread about people wanting to use the latest and greatest
compiler "features" whether they make sense or not, and this thread
should probably die until some problem has actually been identified.

If it ain't broke, don't fix.
Re: How to implement pattens with more that 30 alternatives
On Tue, 2009-12-22 at 09:10 -0500, Daniel Jacobowitz wrote:
> On Tue, Dec 22, 2009 at 12:09:55PM +, Paul Brook wrote:
> > i.e. the following will work as expected:
> >
> > (define_insn "*my_movsi"
> >   (set (match_operand:SI "..." "=a,b")
> >        (match_operand:SI "..." "ab,ab")))
> >
> > However the following will not. Once gcc has picked a particular insn
> > (_a or _b), it will tend to stick with it and not try other patterns.
>
> This is my understanding too - but it's a real nuisance. Suppose you
> have two optional ISA extensions that have their own move
> instructions. For the sake of conversation I'll call them Alice and
> Bob... no, I'll call them TARGET_MAVERICK and TARGET_NEON. Now you
> need a minimum of three copies of the mov pattern that are
> mostly the same.
>
> It'd be nice if there was a way to compose instruction patterns :-(
>

There is. Look at attribute "enabled".

I've not worked out how to use that properly yet, but it is used in the
m68k back-end.

R.
Re: How to implement pattens with more that 30 alternatives
On Tue, Dec 22, 2009 at 02:12:48PM +, Richard Earnshaw wrote:
> There is. Look at attribute "enabled".
>
> I've not worked out how to use that properly yet, but it is used in the
> m68k back-end.

Interesting. This seems to replace needing either (A) a bunch of
similar patterns with different insn predicates, or (B) a bunch of new
constraints that expand to "not available unless such and such ISA is
enabled". That's a nice improvement, although we're back to the number
of alternatives getting quite high.

This does still leave us with weird operand predicates. For instance,
in a patch I'm working on for ARM cmpdi patterns, I ended up needing
"cmpdi_lhs_operand" and "cmpdi_rhs_operand" predicates because Cirrus
and VFP targets accept different constants. Automatically generating
that would be a bit excessive though.

--
Daniel Jacobowitz
CodeSourcery
Re: [PATCH] ARM: Convert BUG() to use unreachable()
Russell King - ARM Linux wrote:
> [ ... ] this thread should probably die until some problem has actually
> been identified.
>
> If it ain't broke, don't fix.

Couldn't agree more. Happy Christmas!

cheers,
DaveK
Re: LTO FAILs: invalid section name - asterisks in DECL_ASSEMBLER_NAME.
Dave Korn wrote:
> For the declaration
>
>> int xxx(void) __asm__("_" "xxx");
>
> DECL_ASSEMBLER_NAME returns "*_xxx".
>
> I see that other parts of LTO are aware of this problem:
>
>> /* FIXME lto: this is from assemble_name_raw in varasm.c. For some
>>    architectures we might have to do the same name manipulations that
>>    ASM_OUTPUT_LABELREF does. */
>> if (name[0] == '*')
>>   name = &name[1];
>
> ... so is there an overarching plan to clean this up, or should I just try
> copy'n'pasting the same workaround into produce_asm()?

To avoid any wasted or duplicated effort, I'll just mention that I'm
preparing a patch along these lines, but it'll probably be after
Christmas by the time I send it in.

cheers,
DaveK
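The workaround quoted in this message amounts to skipping a leading '*' before the name is used, for example as part of a section name. A standalone C sketch of that step follows. The helper name skip_verbatim_marker is hypothetical and this is not code from the GCC tree. As the quoted FIXME notes, some targets may need further name manipulation that this sketch does not attempt.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, for illustration only. A leading '*' marks an
   assembler name that is to be emitted verbatim; skip it so that the
   remainder can be used as a plain symbol name. */
static const char *skip_verbatim_marker(const char *name)
{
    if (name[0] == '*')
        return name + 1;
    return name;
}
```

With the example from the message, skip_verbatim_marker("*_xxx") yields "_xxx", and a name without the marker is returned unchanged.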
Re: How to implement pattens with more that 30 alternatives
On 12/22/09, Daniel Jacobowitz wrote:
> in a patch I'm working on for ARM cmpdi patterns, I ended up needing
> "cmpdi_lhs_operand" and "cmpdi_rhs_operand" predicates because Cirrus
> and VFP targets accept different constants. Automatically generating
> that would be a bit excessive though.

I wouldn't bother implementing that if the VFP/Cirrus conflict is the
only thing that needs that.
GCC has never been able to generate working code for Cirrus
MaverickCrunch, for over a dozen separate reasons, from incorrect use
of the way the Maverick sets the condition codes to hardware bugs in
the 64-bit instructions (or in the way GCC uses them).

I eventually cooked up over a dozen patches to make 4.[23] generate
reliable crunch floating point code, but if you enable the 64-bit insns
it still fails the openssl testsuite.

M
Re: How to implement pattens with more that 30 alternatives
On Tue, Dec 22, 2009 at 04:24:01PM +, Martin Guy wrote:
> I wouldn't bother implementing that if the VFP/Cirrus conflict is the
> only thing that needs that.
> GCC has never been able to generate working code for Cirrus
> MaverickCrunch for over a dozen separate reasons, from incorrect use
> of the way the Maverick sets the condition codes to hardware bugs in
> the 64-bit instructions (or in the way GCC uses them).
>
> I eventually cooked up over a dozen patches to make 4.[23] generate
> reliable crunch floating point code but if you enable the 64-bit insns
> it still fails the openssl testsuite.

Interesting, I knew you had a lot of Cirrus patches but I didn't
realize the state of the checked-in code was so bad.

Is what's there useful or actively harmful?

--
Daniel Jacobowitz
CodeSourcery
Preserving order of variable declarations
Hi,

For every pointer in a program, I want to add another pointer
associated with it. E.g. if the program has,

    char *p;

I want this to be transformed into,

    char *pa;
    char *p;

such that (&p)-4 is the same as &(pa).

I am only aware that I can insert temporary variables in a gcc gimple
pass. But temporary variables can be moved around, and their order is
not guaranteed to stay the same. This made me think I will need to wrap
them into a struct or something, so that every pointer declaration is
replaced with a struct holding the associated pointer and the original
pointer. But this means I need to rewrite the entire program, replacing
all pointer accesses by the corresponding pointer in the struct.

Is this the only way it can be done? Or can I have (non-temporary)
variables inserted whose order is always preserved?

Thanks,
Aravinda
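The struct-wrapping idea described in this message can be sketched in plain C as follows. The type name shadowed_ptr and the helper shadow_of are hypothetical. The sketch shows only the layout guarantee that a struct gives (member order is preserved) and is not a GIMPLE pass. It uses offsetof rather than a hard-coded 4, since the pointer size depends on the target.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper type; the field names follow the post. Struct
   members are laid out in declaration order, so 'pa' always precedes
   'p' in memory. */
struct shadowed_ptr {
    char *pa;   /* the associated pointer */
    char *p;    /* the original pointer */
};

/* Recover the address of 'pa' given only the address of 'p'. */
static char **shadow_of(char **p_addr)
{
    return (char **)((char *)p_addr - offsetof(struct shadowed_ptr, p));
}
```

For a variable `struct shadowed_ptr s`, shadow_of(&s.p) gives &s.pa, which is the relationship the post asks for between &p and &pa.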
Which optimizer should remove redundant subreg of sign_extension?
I came across this RTL on AVR in the combine dump (part of the
va-arg-9.c test):

(set (reg:QI 25 r25 [+1 ])
     (subreg:QI (sign_extend:HI (reg:QI 49)) 1))

The sign extension is completely redundant - the upper part of the
register is not used elsewhere - but the RTL remains unchanged through
all the optimizers and the sign_extension appears in the final code.

Which RTL optimisation should be taking care of this? Propagation?

It would help me look in the right place to understand and perhaps fix
the issue. I suspect the presence of the hard register is why it does
not get removed. (The hard register is the function return value.)

Andy
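As a reading aid, here is a small C model of the value that the RTL expression above computes. It assumes that byte offset 1 of the HImode value selects the high byte, which is the case on a little-endian target. The function name is hypothetical, and the sketch says nothing about whether later code uses the result.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of (subreg:QI (sign_extend:HI (reg:QI x)) 1),
   assuming byte 1 is the high byte of the 16-bit value. */
static uint8_t subreg_byte1_of_sign_extend(int8_t q)
{
    int16_t h = (int16_t)q;                /* sign_extend:HI */
    return (uint8_t)((uint16_t)h >> 8);    /* take the high byte */
}
```

In this model the result is 0x00 for non-negative inputs and 0xFF for negative ones, i.e. the sign-fill byte of the extended value.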
Re: Which optimizer should remove redundant subreg of sign_extension?
On 12/22/09 11:16, Andrew Hutchinson wrote:
> I came across this RTL on AVR in combine dump (part of va-arg-9.c test)
>
> (set (reg:QI 25 r25 [+1 ])
>      (subreg:QI (sign_extend:HI (reg:QI 49)) 1))
>
> The sign extension is completely redundant - the upper part of register
> is not used elsewhere - but the RTL remains unchanged through all the
> optimizers and sign_extension appears in final code.
>
> Which RTL optimisation should be taking care of this? Propagation?
>
> It would help me look in the right place to understand and perhaps fix
> issue. I suspect the presence of hard register is why it does not get
> removed. (the hard register is the function return value)

I'd look at combine, though I think it's more concerned with
determining that an extension is redundant because the bits already
have the proper value rather than the bits not being used later. It
might be the case that you can extend what's already in combine to do
what you want.

jeff
Re: How to implement pattens with more that 30 alternatives
On 12/22/09, Daniel Jacobowitz wrote:
> Interesting, I knew you had a lot of Cirrus patches but I didn't
> realize the state of the checked-in code was so bad.
>
> Is what's there useful or actively harmful?

Neither useful nor harmful except in that it adds noise to the arm
backend. It's useful if you want to get a working compiler by applying
my patches...

The basic insn description is ok but the algorithms to use the insns
are defective; I suppose it's passively harmful, since until it's fixed
it just adds noise and size to the arm backend.

I did the copyright assignment thing but I haven't mainlined the code,
partly because it currently has an embarrassing -mcirrus-di flag to
enable the imperfect 64-bit int support, partly out of laziness (it
would need a dejagnu testsuite for all insns it can generate and for
the more interesting resolved bugs). Maybe one day...

M
gcc-4.4-20091222 is now available
Snapshot gcc-4.4-20091222 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.4-20091222/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.4 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_4-branch revision 155409

You'll find:

gcc-4.4-20091222.tar.bz2            Complete GCC (includes all of below)
gcc-core-4.4-20091222.tar.bz2       C front end and core compiler
gcc-ada-4.4-20091222.tar.bz2        Ada front end and runtime
gcc-fortran-4.4-20091222.tar.bz2    Fortran front end and runtime
gcc-g++-4.4-20091222.tar.bz2        C++ front end and runtime
gcc-java-4.4-20091222.tar.bz2       Java front end and runtime
gcc-objc-4.4-20091222.tar.bz2       Objective-C front end and runtime
gcc-testsuite-4.4-20091222.tar.bz2  The GCC testsuite

Diffs from 4.4-20091215 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.4
link is updated and a message is sent to the gcc list. Please do not use
a snapshot before it has been announced that way.
Re: How to implement pattens with more that 30 alternatives
2009/12/22 Richard Earnshaw:
>
> On Mon, 2009-12-21 at 18:44 +, Paul Brook wrote:
>> > > > I am doing a port in GCC 4.4.0 for a 32 bit target. As a part of
>> > > > scheduling framework i have to write the move patterns with more
>> > > > clarity, so that i could control the scheduling with the help of
>> > > > attributes. Rewriting the pattern resulted in movsi pattern with 41
>> > > > alternatives :(
>> > >
>> > > Use rtl expressions instead of alternatives. e.g. arm.md:arith_shiftsi
>> >
>> > Or use the more modern iterators approach.
>>
>> Aren't iterators for generating multiple insns (e.g. movsi and movdi)
>> from the same pattern, whereas in this case we have a single insn that
>> needs to accept many different operand combinations?
>
> Yes, but that is often better, I suspect, than having too fancy a
> pattern that breaks the optimization simplifications that genrecog does.
>
> Note that the attributes that were requested could be made part of the
> iterator as well, using a mode_attribute.
>

I can't find a back-end that does this. Can you show me an example?

Regards,
Shafi
Question on PR36873
Hi,

We just got a similar problem on Blackfin GCC recently. Let me take the
test code from the bug as an example:

typedef unsigned short u16;
typedef unsigned int u32;

u32 a(volatile u16* off) {
        return *off;
}

u32 b(u16* off) {
        return *off;
}

compiled with

mingw32-gcc-4.3.0.exe -c -O2 -fomit-frame-pointer -mtune=core2 test.c

it produces:

<_a>:
   0:   8b 44 24 04     mov    0x4(%esp),%eax
   4:   0f b7 00        movzwl (%eax),%eax
   7:   0f b7 c0        movzwl %ax,%eax    <== The redundant insn
   a:   c3              ret

0010 <_b>:
  10:   8b 44 24 04     mov    0x4(%esp),%eax
  14:   0f b7 00        movzwl (%eax),%eax
  17:   c3              ret

I don't understand Richard's comment. Why do we not optimize volatile
accesses in this test case? I know we cannot do many optimizations on
volatile accesses, but I think it's OK to remove the redundant insn in
this case. Could someone provide me with a case in which we cannot
remove it?

Thanks,
Jie
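The claim that the second movzwl is redundant rests on both functions returning the same zero-extended value. The C sketch below checks only that value-level equivalence. The function names load_volatile and load_plain are hypothetical stand-ins for a and b from the post. It does not show which instructions a given compiler emits, and it does not settle whether the compiler is allowed to combine the volatile load with the extension.

```c
#include <assert.h>

typedef unsigned short u16;
typedef unsigned int u32;

/* Same bodies as a() and b() in the post, under hypothetical names. */
static u32 load_volatile(volatile u16 *off)
{
    return *off;    /* volatile load, then zero-extend to u32 */
}

static u32 load_plain(u16 *off)
{
    return *off;    /* plain load, then zero-extend to u32 */
}
```

For any u16 value v, load_volatile(&v) and load_plain(&v) return the same result, so the difference between the two listings is in code generation only.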
Package hosting sites for MPC
If anyone would like to obtain MPC from a pre-built package, there is a
page on the MPC website listing known providers for various OSes.

http://www.multiprecision.org/index.php?prog=mpc&page=packages

It's still missing several of our important platforms, including
solaris, aix and hpux. If you can convince your favorite package site
(e.g. blastwave.org or hpux.connect.org.uk) to offer binaries for these
systems, please do, and notify the MPC mailing list (not me) about it
here:

http://lists.gforge.inria.fr/mailman/listinfo/mpc-discuss

Thanks,
--Kaveh
--
Kaveh R. Ghazi