http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263
--- Comment #7 from Oleg Endo <oleg.e...@t-online.de> 2011-10-09 23:34:45 UTC ---
(In reply to comment #6)
> Yep, maintenance burden but I don't mean ack/nak for anything.
> If it's enough fruitful, we should take that route.  When it
> gives 5% improvement in the usual working set like as CSiBE,
> hundreds lines would be OK, but if it's ~0.5% or less, it doesn't
> look worth to add many patterns for that.
>
> > Isn't there a way to tell the combine pass not to do so, but instead first
> > look deeper at what is in the MD?
>
> I don't know how to do it cleanly.
>

I've tried out a couple of things and got some CSiBE numbers based on trunk
rev 179430.  Unfortunately these are only code size comparisons, no run time
performance numbers.  The tests were compiled with

  -ml -m4-single -Os -mnomacsave -mpretend-cmove -mfused-madd -freg-struct-return

Option 1)
Use many (~10) patterns in the MD and some cost calculation tuning.
The last patch required some adaptation, because the combine pass started
trying to match things slightly differently.  I've also noticed that it
requires a special case for one pattern on SH4A...

size of all modules:  2916390 -> 2909026   -7364 / -0.252504 %
avg difference over all modules:   -409.111111 / -0.273887 %
max: compiler         22808 -> 22812   +4 / +0.017538 %
min: libpng-1.2.5     99120 -> 97804   -1316 / -1.327684 %

Option 2)
I've added another combine pass which has the make_compound_operation
function turned off.  The make_compound_operation function is used to produce
zero_extract patterns.  If the resulting "simplified" pattern does not match
anything in the MD, combine reverts the transformation and proceeds with the
next insn.  That way, it never tries to match the tst #imm pattern in the MD.
With this option only ~5 patterns seem to be required, plus a small extension
of the cost calculation.

size of all modules:  2916390 -> 2909170   -7220 / -0.247566 %
avg difference over all modules:   -401.111111 / -0.254423 %
max: compiler         22808 -> 22812   +4 / +0.017538 %
min: libpng-1.2.5     99120 -> 97836   -1284 / -1.295400 %

Not so spectacular on average.  It highly depends on the type of SW being
compiled, but it hits quite a lot of files in CSiBE.
Option 2 seems more robust, even if it seems less effective.  What do you think?
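
As a sanity check on the percentages quoted above, here is a minimal Python sketch that recomputes the absolute and relative size changes from the before/after byte counts. The helper name size_delta and the results table are purely illustrative assumptions; they are not part of CSiBE or of any patch attached to this bug.

```python
def size_delta(before, after):
    """Return (absolute change in bytes, relative change in percent)."""
    diff = after - before
    return diff, 100.0 * diff / before


# Before/after code sizes in bytes, copied from the figures in this comment.
results = {
    "option 1, all modules": (2916390, 2909026),
    "option 1, libpng-1.2.5": (99120, 97804),
    "option 2, all modules": (2916390, 2909170),
    "option 2, libpng-1.2.5": (99120, 97836),
    "both options, compiler": (22808, 22812),
}

for name, (before, after) in results.items():
    diff, pct = size_delta(before, after)
    # Same layout as the summary lines: absolute change / relative change.
    print(f"{name}: {before} -> {after}   {diff:+d} / {pct:+.6f} %")
```

The relative change is taken against the unpatched size, which appears to be how the summary lines above were computed.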