[Bug target/39423] [4.6/4.7/4.8 Regression] [SH] performance regression: lost mov @(disp,Rn)

olegendo at gcc dot gnu.org Wed, 11 Jul 2012 08:09:29 -0700

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39423


--- Comment #18 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-07-11 15:09:02 UTC ---
(In reply to comment #17)
> Created attachment 27775 [details]
> plus add combine
> 
> Here is the patch that I've been running for some time. It also uses the same
> combine pattern matcher, but the goal of this patch was originally to fix up
> chains of multiple mult-add instructions.
> Optimizing the cst+reg addressing mode appears as a nice side effect.
> Out-of-range indexes seem to be handled, as far as I can see.
> 
> This brings an EEMBC telecom speedup of 10%. FFMPEG code size reduced to 30%
> on a few objects.
> Validated on a whole Linux distribution, with only improvements (a few
> regressions, all below noise).

Interesting.  
BTW, do you happen to have any (runtime) numbers for GCC 4.7.x vs current GCC
4.8?

> This patch is only for comments/illustration. It needs some polishing before
> proposing. I'm having a look at your implementation to see how they compare
> and whether they could be combined. Both approaches look interesting.

I guess folding the mul-add sequences like you did should be more useful than
just handling one mem:SI pattern.  In any case, if you find my impl useful
please let me know, because then I'd also pop in patterns for mem:QI and
mem:HI.
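
For context on why the access mode matters, here is a minimal sketch (not
GCC code, and not taken from either patch) of the range check that a
mov @(disp,Rn) pattern has to respect.  It assumes the classic SH encodings,
where the displacement is a 4-bit field scaled by the access size; the helper
name sh_disp_fits is made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical helper, for illustration only (not part of GCC).
   Returns true when a constant byte offset can be encoded directly in
   an SH "mov @(disp,Rn)" instruction for an access of `size` bytes
   (1 = QI, 2 = HI, 4 = SI).

   Assumption: the classic 4-bit displacement field, scaled by the
   access size, so the reachable offsets are
     QI: 0..15, HI: 0..30 (even), SI: 0..60 (multiples of 4).  */
static bool sh_disp_fits(int offset, int size)
{
    if (size != 1 && size != 2 && size != 4)
        return false;            /* not a QI/HI/SI access */
    if (offset < 0 || offset % size != 0)
        return false;            /* negative or misaligned offset */
    return offset / size <= 15;  /* must fit in the 4-bit field */
}
```

With this check, sh_disp_fits(60, 4) holds but sh_disp_fits(64, 4) does not,
so the latter offset would need a separate address calculation.  Note also
that the byte and word displacement forms only work with R0 as the data
register, so QI and HI patterns are more constrained than the SI one.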

[Bug target/39423] [4.6/4.7/4.8 Regression] [SH] performance regression: lost mov @(disp,Rn)
