Re: Expansion of narrowing math built-ins into power instructions
Segher Boessenkool writes:
> On Thu, Aug 15, 2019 at 01:47:47PM +0100, Richard Sandiford wrote:
>> Tejas Joshi writes:
>> > Hello.
>> > I just wanted to make sure that I am looking at the correct code here.
>> > Except for rtl.def where I should be introducing something like
>> > float_contract (or float_narrow?) and also simplify-rtx.c, breakpoints
>
> I like that "float_narrow" name :-)
>
>> > set on functions around expr.c, cfgexpand.c where I grep for
>> > float_truncate/FLOAT_TRUNCATE did not hit.
>> > Also, in what manner should float_contract/narrow be different from
>> > float_truncate as both are trying to do similar things? (truncation
>> > from DF to SF)
>>
>> I think the code should instead be a fused addition and truncation,
>> a bit like FMA is a fused addition and multiplication.  Describing it as
>> a DFmode addition followed by some conversion to SF would still involve
>> double rounding.
>
> How so?  It would *mean* there is only single rounding, even!  That's
> the whole point of it.

But a PLUS should behave as a PLUS in any context.  Making its
behaviour dependent on the containing rtxes (if any) would be a
can of worms.

Richard
Special Memory Constraint [was Re: Indirect memory addresses vs. lra]
On Thu, Aug 15, 2019 at 02:23:45PM -0400, Vladimir Makarov wrote:

     > I tried this solution earlier.  But unfortunately it makes things worse.  What happens is it libgcc cannot
     > even be built -- ICEs occur on a memory from address reg insn such as:
     > (insn 117 2981 3697 5 (set (mem/f:PSI (plus:PSI (reg:PSI 1309)
     >             (const_int 102 [0x66])) [3 fs_129(D)->pc+0 S4 A8])
     >         (reg:PSI 1310)) "/home/jmd/Source/GCC2/libgcc/unwind-dw2.c":977:9 96 {movpsi}
     >
     I see.  Then for the insn, you could try to create a pattern "memory,
     special memory constraint".  The special memory constraint should satisfy
     only spilled pseudo (pseudo with reg_renumber == -1).  I believe
     lra-constraints.c can spill the pseudo and the end you will have
     mem[disp1 + r8|r9|sp] = mem[disp1+sp].

You mean something like this:

(define_special_memory_constraint "a"
  "My special memory constraint"
  (match_operand 0 "my_special_predicate")
  )

(define_predicate "my_special_predicate"
  (match_operand 0 "memory_operand")
{
  debug_rtx (op);
  if (MEM_P (op))
    {
      op = XEXP (op, 0);
      if (GET_CODE (op) == PLUS)
        {
          op = XEXP (op, 0);
          if (REG_P (op))
            {
              fprintf (stderr, "Reg number is %d\n", REGNO (op));
              if (REGNO (op) >= 0)
                return false;
            }
        }
    }
  return true;
})

When I use this I get lots of the following ICEs "internal compiler error:
maximum number of generated reload insns per insn achieved (90)"

It seems logical to me that this would happen since the constraint is not
going to match any operand with resolved registers.  Thus it will continually
reload.

... which makes me think I've probably misunderstood what you are saying.

J'

--
Avoid eavesdropping.  Send strong encrypted email.
PGP Public key ID: 1024D/2DE827B3
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://sks-keyservers.net or any PGP keyserver for public key.
Re: Special Memory Constraint [was Re: Indirect memory addresses vs. lra]
On 2019-08-16 7:23 a.m., John Darrington wrote:
> On Thu, Aug 15, 2019 at 02:23:45PM -0400, Vladimir Makarov wrote:
>
>      > I tried this solution earlier.  But unfortunately it makes things worse.  What happens is it libgcc cannot
>      > even be built -- ICEs occur on a memory from address reg insn such as:
>      > (insn 117 2981 3697 5 (set (mem/f:PSI (plus:PSI (reg:PSI 1309)
>      >             (const_int 102 [0x66])) [3 fs_129(D)->pc+0 S4 A8])
>      >         (reg:PSI 1310)) "/home/jmd/Source/GCC2/libgcc/unwind-dw2.c":977:9 96 {movpsi}
>      >
>      I see.  Then for the insn, you could try to create a pattern "memory,
>      special memory constraint".  The special memory constraint should satisfy
>      only spilled pseudo (pseudo with reg_renumber == -1).  I believe
>      lra-constraints.c can spill the pseudo and the end you will have
>      mem[disp1 + r8|r9|sp] = mem[disp1+sp].
>
> You mean something like this:
>
> (define_special_memory_constraint "a"
>   "My special memory constraint"
>   (match_operand 0 "my_special_predicate")
>   )
>
> (define_predicate "my_special_predicate"
>   (match_operand 0 "memory_operand")
> {
>   debug_rtx (op);
>   if (MEM_P (op))
>     {
>       op = XEXP (op, 0);
>       if (GET_CODE (op) == PLUS)
>         {
>           op = XEXP (op, 0);
>           if (REG_P (op))
>             {
>               fprintf (stderr, "Reg number is %d\n", REGNO (op));
>               if (REGNO (op) >= 0)
>                 return false;
>             }
>         }
>     }
>   return true;
> })

No I meant something like that

(define_special_memory_constraint "a" ...)

(define_predicate "my_special_predicate" ...
{
  if (lra_in_progress_p)
    return REG_P (op) && REGNO (op) >= FIRST_PSEUDO_REGISTER && reg_renumber[REGNO(op)] < 0;
  return true if memory with sp addressing;
})

I think LRA spills pseudo-register and it will be memory addressed by sp
at the end of LRA.

> When I use this I get lots of the following ICEs "internal compiler error:
> maximum number of generated reload insns per insn achieved (90)"
>
> It seems logical to me that this would happen since the constraint is not
> going to match any operand with resolved registers.  Thus it will continually
> reload.
>
> ... which makes me think I've probably misunderstood what you are saying.
>
> J'
gcc-8-20190816 is now available
Snapshot gcc-8-20190816 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/8-20190816/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 8 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-8-branch revision 274590

You'll find:

 gcc-8-20190816.tar.xz                Complete GCC

  SHA256=f5ad4a42df2ce767e050faca2ba8c7c45b72e834fed5afeccdc5071995e020e3
  SHA1=3090a37707212d2deb9deece7fc3885fb3fd4f7b

Diffs from 8-20190809 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-8
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.
Question about pass_fix_loops or other loop fixing classes
Greetings,

I was wondering why the execute functions of the loop-fixing pass classes
in tree-ssa-loop.c are not each run on a separate thread.  It seems a good
idea to call execute for these classes on another thread and then join up
with the main thread, in order to avoid shared state or waiting on complex
loops to be optimized.

Maybe I'm missing something, so if anyone has a good reason why not, let
me know.

Nick
Re: Expansion of narrowing math built-ins into power instructions
Hi,

> It's just a different name, nothing more, nothing less.  Because it is
> a different name it can not be accidentally generated from actual
> truncations.

I have introduced float_narrow, but I could not find the appropriate
places to generate it for a call to fadd instead of generating a CALL.
I used GDB to set breakpoints, which hit fold_rtx and cse_insn, but I got
confused by the rtx codes and the passes which generate the respective RTL.
It should not be similar to FLOAT_TRUNCATE, then, if we want to avoid
generating it for actual truncations?

Thanks,
Tejas

On Fri, 16 Aug 2019 at 15:53, Richard Sandiford wrote:
>
> Segher Boessenkool writes:
> > On Thu, Aug 15, 2019 at 01:47:47PM +0100, Richard Sandiford wrote:
> >> Tejas Joshi writes:
> >> > Hello.
> >> > I just wanted to make sure that I am looking at the correct code here.
> >> > Except for rtl.def where I should be introducing something like
> >> > float_contract (or float_narrow?) and also simplify-rtx.c, breakpoints
> >
> > I like that "float_narrow" name :-)
> >
> >> > set on functions around expr.c, cfgexpand.c where I grep for
> >> > float_truncate/FLOAT_TRUNCATE did not hit.
> >> > Also, in what manner should float_contract/narrow be different from
> >> > float_truncate as both are trying to do similar things? (truncation
> >> > from DF to SF)
> >>
> >> I think the code should instead be a fused addition and truncation,
> >> a bit like FMA is a fused addition and multiplication.  Describing it as
> >> a DFmode addition followed by some conversion to SF would still involve
> >> double rounding.
> >
> > How so?  It would *mean* there is only single rounding, even!  That's
> > the whole point of it.
>
> But a PLUS should behave as a PLUS in any context.  Making its
> behaviour dependent on the containing rtxes (if any) would be a
> can of worms.
>
> Richard