On Sat, Oct 4, 2014 at 12:49 PM, Richard Sandiford <rdsandif...@googlemail.com> wrote: > Uros Bizjak <ubiz...@gmail.com> writes: >> On Thu, Oct 2, 2014 at 10:13 PM, Vladimir Makarov <vmaka...@redhat.com> >> wrote: >> >>>>>> I guess we achieved the consensus about the following patch to fix >>>>>> PR61360 >>>>>> >>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61360 >>>>>> >>>>>> The patch was successfully bootstrapped and tested (w/wo >>>>>> -march=amdfam10) on x86/x86-64. >>>>>> >>>>>> Is it ok to commit to trunk? >>>>> >>>>> >>>>> I've tested your patch and unfortunately it doesn't work: >>>>> >>>>> In file included from >>>>> /var/tmp/moz-build-dir/js/src/shell/Unified_cpp_js_src_shell0.cpp:15:0: >>>>> /var/tmp/mozilla-central/js/src/shell/js.cpp: In function ‘void >>>>> Process(JSContext*, JSObject*, const char*, bool)’: >>>>> /var/tmp/mozilla-central/js/src/shell/js.cpp:592:1: internal compiler >>>>> error: in lra_update_insn_recog_data, at lra.c:1221 >>>>> } >>>>> ^ >>>>> 0xa9d9ec lra_update_insn_recog_data(rtx_insn*) >>>>> ../../gcc/gcc/lra.c:1220 >>>>> 0xab450f eliminate_regs_in_insn >>>>> ../../gcc/gcc/lra-eliminations.c:1077 >>>>> 0xab450f process_insn_for_elimination >>>>> ../../gcc/gcc/lra-eliminations.c:1344 >>>>> 0xab450f lra_eliminate(bool, bool) >>>>> ../../gcc/gcc/lra-eliminations.c:1408 >>>>> 0xa9f2da lra(_IO_FILE*) >>>>> ../../gcc/gcc/lra.c:2270 >>>>> 0xa5d659 do_reload >>>>> ../../gcc/gcc/ira.c:5311 >>>>> 0xa5d659 execute >>>>> ../../gcc/gcc/ira.c:5470 >>>> >>>> >>>> Testcase is attached: >>>> >>>> % g++ -c -march=amdfam10 -w -O2 js.ii >>>> js.ii: In function ‘void RunFile(C)’: >>>> js.ii:64:1: internal compiler error: in lra_update_insn_recog_data, at >>>> lra.c:1221 >>>> >>> >>> Thanks for reporting this, Marcus. The problem now is in >>> optimize_function_for_size_p. It is false, when we define and cache enable >>> attributes at early stages (instantation of virtual regs) and true later. >>> >>> It is returning us to the same problem. I believe that we should not have >>> enable attributes depending on optimization options or on the current >>> running pass if we don't want the current solution in the trunk (with >>> recog_init). Setting right value for optimize_function_for_size_p does not >>> solve the problem as we can have different options for different functions >>> in the same compilation file. >>> >>> So minimal solution would be removing optimize_function_for_size_p from the >>> attribute definition. But I guess we could remove all condition. >>> Unfortunately, Ganesh did not post is it really beneficial to switch off >>> this alternative for AMD CPUs even if AMD optimization guide recommends it. >> >> I propose to split the pattern into size-optimized and normal pattern. >> The patch implements this proposal. > > An alternative would be to add two new enabled-like attributes, > "good_for_size" and "good_for_speed", that say whether it is efficient > to use a particular alternative. These attributes wouldn't ever stop > an existing instruction from being recognised; they would just say > whether the RA and optimisers should consider the alternative when > optimising for size or speed respectively. The attributes would have > the same restrictions as the enabled attribute and could be cached in > the same way. > > The "enabled" attribute would then be purely about whether an > instruction is available, not whether it's efficient in a particular > situation. > > The main advantage is that it would be possible to make the size/speed > choice at a basic block level rather than a function level. In the worst > case we might move an instruction from a hot block to a cold block or > vice versa, but with intelligent spilling that shouldn't happen too > often and at least it wouldn't trigger an ICE. > > If that sounds OK, I'll try to get something together next week. > I think Ramana said he had a use for this on ARM too.
I think that this would be way better than duplication of patterns. Perhaps we can name these attributes "enable_for_size" and "enable_for_speed", and have them in addition to "enable" attribute. The final enable condition would be an union of "enable", "enable_for_speed" and "enable_for_size" attributes. Uros.