Re: Indirect memory addresses vs. lra
Hi John, On Mon, Aug 12, 2019 at 08:47:43AM +0200, John Darrington wrote: > On Sat, Aug 10, 2019 at 11:12:18AM -0500, Segher Boessenkool wrote: > On Sat, Aug 10, 2019 at 08:05:53AM +0200, John Darrington wrote: > > Choosing alt 5 in insn 14: (0) m (1) m {*movsi} > >14: [r40:PSI+0x20]=[r41:PSI] > > Inserting insn reload before: > >48: r40:PSI=r34:PSI > >49: r41:PSI=[y:PSI+0x2f] > > insn 14 is a mem-to-mem move (another feature not many more modern / > more RISCy CPUs have). That requires both of your address registers. > So far, so good. The reloads (insn 48 and 49) require address > registers themselves; that isn't necessarily a problem either. > > So far as I can see, insn 48 is completely redundant. It's copying a > pseudo reg (74) into another pseudo reg (40). > This is pointless and a waste, since insn 14 does not modify 74. > I don't understand why lra feels the need to do it. LRA always does this, I think... it reloads all inputs to all insns that may need reloading. It later optimises most of that away again, but this gives it a lot of freedom to move things around. Or that is what it always looked like to me. I haven't looked at the code to see if that is the real reason, blush. > If lra knew about (mem (mem ...)) style addressing, then insn 49 would > also be redundant (which is why I raised the topic). Yes. But it probably should be able to deal with things like this, too, or some other testcases will die a horrible death. > In summary, what we have is: > > (insn 48 84 49 2 (set (reg/f:PSI 40 [34]) > (reg/f:PSI 74 [34])) > (nil)) > (insn 49 48 14 2 (set (reg:PSI 41) > (mem/f/c:PSI (plus:PSI (reg/f:PSI 9 y) > (const_int 47 [0x2f])) [3 p+0 S4 A8])) > (nil)) > (insn 14 49 15 2 (set (mem:SI (plus:PSI (reg/f:PSI 40 [34]) > (const_int 32 [0x20])) [2 S4 A64]) > (mem:SI (reg:PSI 41) [2 *p_5(D)+0 S4 A8])) > > where, like you say, insns 48 and 49 are reloads. But these two reloads > are unnecessary and cause the machine to run out of PSImode registers. Anyway, please have patience, and see what Vladimir comes up with. These things take time. Segher
GCC 9.2 Released
The GNU Compiler Collection version 9.2 has been released. GCC 9.2 is a bug-fix release from the GCC 9 branch containing important fixes for regressions and serious bugs in GCC 9.1 with more than 68 bugs fixed since the previous release. This release is available from the FTP servers listed at: http://www.gnu.org/order/ftp.html Please do not contact me directly regarding questions or comments about this release. Instead, use the resources available from http://gcc.gnu.org. As always, a vast number of people contributed to this GCC release -- far too many to thank them individually!
GCC 9.3 Status Report (2019-08-12)
Status == GCC 9.2 has been released and the branch is again open for regression and documentation fixes. History makes us expect a GCC 9.3 release at the end of this or the beginning of next year. Quality Data Priority # Change from last report --- --- P10 P2 198 P3 51 + 2 P4 141 P5 24 --- --- Total P1-P3 249 + 2 Total 414 + 2 Previous Report === https://gcc.gnu.org/ml/gcc/2019-08/msg00016.html
Re: Indirect memory addresses vs. lra
On 2019-08-10 2:05 a.m., John Darrington wrote: On Fri, Aug 09, 2019 at 01:34:36PM -0400, Vladimir Makarov wrote: If you provide LRA dump for such test (it is better to use -fira-verbose=15 to output full RA info into stderr), I probably could say more. I've attached such a dump (generated from gcc/testsuite/gcc.c-torture/compile/pr53410-2.c). Unfortunately, this info is not enough for me to say what is the problem. I only found suspicious that LRA is trying to assign a few registers to a pseudo register and fails even though these registers are not assigned to anything. Probably HARD_REGNO_MODE_OK prevents this. So it would be interesting to know how many registers of Pmode are actually available. In any case I'll try to look at this problem more on this week using your built gcc on gcc135. The less regs the architecture has, thoke easier to run into such error message if something described wrong in the back-end.?? I see your architecture is 16-bit micro-controller with only 8 regs, some of them is specialized.?? So your architecture is really register constrained. That's not quite correct. It is a 24-bit micro-controller (the address space is 24 bits wide). There are 2 address registers (plus stack pointer and program counter) and there are 8 general purpose data registers (of differing sizes).
Re: Expansion of narrowing math built-ins into power instructions
Hi, I have the following code in my rs6000.md (I haven't used new TARGET_* yet) : (define_expand "add_truncdfsf3" [(set (match_operand:SF 0 "gpc_reg_operand") (float_truncate:SF (plus:DF (match_operand:DF 1 "gpc_reg_operand") (match_operand:DF 2 "gpc_reg_operand"] "TARGET_HARD_FLOAT" "") (define_insn "*add_truncdfsf3_fpr" [(set (match_operand:SF 0 "gpc_reg_operand" "=f,wa") (float_truncate:SF (plus:DF (match_operand:DF 1 "gpc_reg_operand" "%d,wa") (match_operand:DF 2 "gpc_reg_operand" "d,wa"] "TARGET_HARD_FLOAT" "@ fadds %0,%1,%2 xsaddsp %x0,%x1,%x2" [(set_attr "type" "fp")]) with following optab in optabs.def : OPTAB_CD(fadd_optab, "add_trunc$b$a3") (what is the difference between $b$a and $a$b?) I have also tried adding fadd, add_truncdfsf3 in rs6000-builtin.def, examined rtl dumps multiple times but couldn't get fadd to be exapanded. What am I missing here? Thanks, Tejas On Sun, 11 Aug 2019 at 22:29, Segher Boessenkool wrote: > > Hi Tejas, > > On Sat, Aug 10, 2019 at 04:00:53PM +0530, Tejas Joshi wrote: > > +(define_expand "add_truncdfsf3" > > + [(set (float_extend:DF (match_operand:SF 0 "gpc_reg_operand")) > > + (plus:DF (match_operand:DF 1 "gpc_reg_operand") > > + (match_operand:DF 2 "gpc_reg_operand")))] > > + "TARGET_HARD_FLOAT" > > + "") > > float_extend on the LHS is never correct. I think the following should > work, never mind that it looks like it does double rounding, because it > doesn't (famous last words ;-) ): > > (define_expand "add_truncdfsf3" > [(set (match_operand:SF 0 "gpc_reg_operand") > (float_truncate:SF > (plus:DF (match_operand:DF 1 "gpc_reg_operand") >(match_operand:DF 2 "gpc_reg_operand"] > "TARGET_HARD_FLOAT" > "") > > > +(define_insn "*add_truncdfsf3_fpr" > > + [(set (float_extend:DF (match_operand:SF 0 "gpc_reg_operand" "=")) > > + (plus:DF (match_operand:DF 1 "gpc_reg_operand" "%") > > + (match_operand:DF 2 "gpc_reg_operand" "")))] > > + "TARGET_HARD_FLOAT" > > + "fadd %0,%1,%2" > > + [(set_attr "type" "fp")]) > > The constraints should be "f", "%d", "d", respectively. says to > display something for the mode in a mode iterator. There is no mode > iterator here. (In what you copied this from, there was SFDF). > > You want to output "fadds", not "fadd". > > Maybe it is easier to immediately write the VSX scalar version for this > as well? That's xsaddsp. Oh, and you need to restrict all of this to > more recent CPUs, we'll have to do some new TARGET_* flag for that I > think. > > Finally: please send patches to gcc-patches@ (not gcc@). > > Thanks, > > > Segher
Re: Expansion of narrowing math built-ins into power instructions
On Mon, Aug 12, 2019 at 11:01:11PM +0530, Tejas Joshi wrote: > I have the following code in my rs6000.md (I haven't used new TARGET_* yet) : > > (define_expand "add_truncdfsf3" > [(set (match_operand:SF 0 "gpc_reg_operand") >(float_truncate:SF >(plus:DF (match_operand:DF 1 "gpc_reg_operand") > (match_operand:DF 2 "gpc_reg_operand"] > "TARGET_HARD_FLOAT" > "") > > (define_insn "*add_truncdfsf3_fpr" > [(set (match_operand:SF 0 "gpc_reg_operand" "=f,wa") >(float_truncate:SF >(plus:DF (match_operand:DF 1 "gpc_reg_operand" "%d,wa") > (match_operand:DF 2 "gpc_reg_operand" "d,wa"] > "TARGET_HARD_FLOAT" > "@ >fadds %0,%1,%2 >xsaddsp %x0,%x1,%x2" > [(set_attr "type" "fp")]) Those look fine. You can also merge them into one: (define_insn "add_truncdfsf3" [(set (match_operand:SF 0 "gpc_reg_operand" "=f,wa") (float_truncate:SF (plus:DF (match_operand:DF 1 "gpc_reg_operand" "%d,wa") (match_operand:DF 2 "gpc_reg_operand" "d,wa"] "TARGET_HARD_FLOAT" "@ fadds %0,%1,%2 xsaddsp %x0,%x1,%x2" [(set_attr "type" "fp")]) > with following optab in optabs.def : > > OPTAB_CD(fadd_optab, "add_trunc$b$a3") (what is the > difference between $b$a and $a$b?) Which of the two modes becomes $a and which becomes $b? It depends on the definition of fadd_optab what order is expected, I think. Segher
Re: Expansion of narrowing math built-ins into power instructions
On Mon, 12 Aug 2019, Segher Boessenkool wrote: > (define_insn "add_truncdfsf3" > [(set (match_operand:SF 0 "gpc_reg_operand" "=f,wa") > (float_truncate:SF > (plus:DF (match_operand:DF 1 "gpc_reg_operand" "%d,wa") > (match_operand:DF 2 "gpc_reg_operand" "d,wa"] That sort of pattern is incorrect for a fused operation such as fadd, because combine could match it for code that is supposed to do separate addition and narrowing conversion. The RTL needs to be something that does *not* match the combination of separate operations (just as fma has its own RTL, and a separate pass is responsible for converting separate operations to fused ones in the -ffp-contract=fast case where it's permitted). -- Joseph S. Myers jos...@codesourcery.com
Re: Expansion of narrowing math built-ins into power instructions
On Mon, Aug 12, 2019 at 09:20:18PM +, Joseph Myers wrote: > On Mon, 12 Aug 2019, Segher Boessenkool wrote: > > > (define_insn "add_truncdfsf3" > > [(set (match_operand:SF 0 "gpc_reg_operand" "=f,wa") > > (float_truncate:SF > > (plus:DF (match_operand:DF 1 "gpc_reg_operand" "%d,wa") > >(match_operand:DF 2 "gpc_reg_operand" "d,wa"] > > That sort of pattern is incorrect for a fused operation such as fadd, > because combine could match it for code that is supposed to do separate > addition and narrowing conversion. The RTL needs to be something that > does *not* match the combination of separate operations (just as fma has > its own RTL, and a separate pass is responsible for converting separate > operations to fused ones in the -ffp-contract=fast case where it's > permitted). Ugh, we allow disabling contraction, I forgot. Rats. Segher