On Tue, 2020-09-15 at 10:49 +0930, Alan Modra via Gcc-patches wrote:
> This patch series fixes a number of issues in rs6000_rtx_costs, the
> aim being to provide costing somewhat closer to reality.  Probably
> the
> most important patch of the series is patch 4, which just adds a
> comment.  Without the analysis that went into that comment, I found
> myself making what seemed to be good changes but which introduced
> regressions.
> 
> So far these changes have not introduced any testsuite regressions
> on --with-cpu=power8 and --with-cpu=power9 all lang bootstraps on
> powerpc64le-linux.  Pat spec tested on power9 against a baseline
> master from a few months ago, seeing a few small improvements and no
> degradations above the noise.

I've read through all the patches in this series, (including the tests
that were sent a bit later).  Your use of comments does a good job
helping describe whats going on.

One comment/question/point of clarity for the AND patch that I'll send
separately.  

That said, the series lgtm.  :-) 

thanks, 
-Will


> 
> Some notes:
> 
> Examination of varasm.o shows quite a number of cases where
> if-conversion succeeds due to different seq_cost.  One example:
> 
> extern int foo ();
> int
> default_assemble_integer (unsigned size)
> {
>   extern unsigned long rs6000_isa_flags;
> 
>   if (size > (!((rs6000_isa_flags & (1UL << 35)) != 0) ? 4 : 8))
>     return 0;
>   return foo ();
> }
> 
> This rather horrible code turns the rs6000_isa_flags value into
> either
> 4 or 8:
>       rldicr 9,9,28,0
>       srdi 9,9,28
>       addic 9,9,-1
>       subfe 9,9,9
>       rldicr 9,9,0,61
>       addi 9,9,8
> Better would be
>       rldicl 9,9,29,63
>       sldi 9,9,2
>       addi 9,9,4
> 
> There is also a "rlwinm ra,rb,3,0,26" instead of "rldicr ra,rb,3,60",
> and "li r31,0x4000; rotldi r31,r31,17" vs.
> "lis r31,0x8000; clrldi r31,r31,32".
> Neither of these is a real change.  I saw one occurrence of a 5 insn
> sequence being replaced with a load from memory in
> default_function_rodata_section, for ".rodata", and others elsewhere.
> 
> Sometimes correct insn cost leads to unexpected results.  For
> example:
> 
> extern unsigned bar (void);
> unsigned
> f1 (unsigned a)
> {
>   if ((a & 0x01000200) == 0x01000200)
>     return bar ();
>   return 0;
> }
> 
> emits for a & 0x01000200
>  (set (reg) (and (reg) (const_int 0x01000200)))
> at expand time (two rlwinm insns) rather than the older
>  (set (reg) (const_int 0x01000200))
>  (set (reg) (and (reg) (reg)))
> which is three insns.  However, since 0x01000200 is needed later the
> older code after optimisation is smaller.

Reply via email to