http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39423
--- Comment #19 from chrbr at gcc dot gnu.org 2012-07-11 15:24:27 UTC ---
(In reply to comment #18)
> (In reply to comment #17)
> > Created attachment 27775 [details]
> > plus add combine
> >
> > Here is the patch that I've been running since some time, it also use the
> > same combine pattern matcher, but the goal of this patch was originally to
> > fix up chains of multiple mult-add instructions.
> > Optimizing the cst+reg addressing mode appears as a nice effects. Out of
> > range indexes seems to be handled as far as I can see.
> >
> > This brings a EEMBC telecom speedup of 10%. FFMPEG code size reduced to 30%
> > on a few objects.
> > Validated on whole linux distribution, with only improvements (few
> > regression only below noise).
>
> Interesting.
> BTW, do you happen to have any (runtime) numbers for GCC 4.7.x vs current GCC
> 4.8?
>

For now I only track the 4.6 and 4.7 branches. 4.8 is moving too fast, but I
could easily cherry-pick your other SH changes (like your fix for PR53911).
BTW, I only bench on the SH4 and SH4A.

> > This patch is only for comments/illustration. Need a few polishing before
> > proposing. I'm having a look at your implementation to see how they compare
> > and possibly combined together. Both approaches look interesting.
>
> I guess folding the mul-add sequences like you did should be more useful than
> just handling one mem:SI pattern. In any case, if you find my impl useful
> please let me know, because then I'd also pop in patterns for mem:QI and
> mem:HI patterns.

Sure. By the way, my patch on its own is not complete enough to fix the
original problem. I need to extract the other chunks that unleash it. Will post.
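
For readers following along, here is a minimal sketch of the kind of source that produces the address arithmetic discussed above. It is not taken from the attachment or from the PR's test case: the function names and the constants are hypothetical, and the address formulas in the comments assume a 4-byte int. The first function is a plain constant-plus-register access, the second one needs a chain of multiply and add steps before the load.

```c
#include <stddef.h>

/* Hypothetical illustration only, not code from the patch.
 * Address is base + i*4 + 16 (assuming 4-byte int): a register part
 * plus a constant that could be folded into a displacement. */
int load_with_offset(const int *base, size_t i)
{
    return base[i + 4];
}

/* Address is base + row*32 + col*4 + 8 (assuming 4-byte int):
 * a chain of multiply and add steps feeding a single load. */
int load_row_col(const int *base, size_t row, size_t col)
{
    return base[row * 8 + col + 2];
}
```

Comparing the assembly generated for such functions with and without a patch is one way to see whether the constant ends up in the addressing mode or in separate add instructions.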