Why doesn't this pattern match?

2022-01-06 Thread Andras Tantos
Hello!

My name is Andras Tantos and I just joined this list, so if I'm asking
something off-topic or not following the rules of the community, please
let me know.

I'm porting GCC (and Binutils) to a new CPU ISA that I call 'brew'.
While developing for this target, I got the following error:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
first.c: In function ‘test_call’:
first.c:61:52: error: insn does not satisfy its constraints:
   61 | int test_call(int a, int b) { return test_qm(a,b); }
  |^
(insn 25 8 9 (set (reg:SI 6 $r6)
(reg:SI 0 $pc)) "first.c":61:38 17 {*movsi}
 (nil))
during RTL pass: final
first.c:61:52: internal compiler error: in final_scan_insn_1, at
final.c:2811
0x6c4c23 _fatal_insn(char const*, rtx_def const*, char const*, int,
char const*)
../../brew-gcc/gcc/rtl-error.c:108
0x6c4c4f _fatal_insn_not_found(rtx_def const*, char const*, int, char
const*)
../../brew-gcc/gcc/rtl-error.c:118
0x643585 final_scan_insn_1
../../brew-gcc/gcc/final.c:2811
0xb1ef3f final_scan_insn(rtx_insn*, _IO_FILE*, int, int, int*)
../../brew-gcc/gcc/final.c:2940
0xb1f207 final_1
../../brew-gcc/gcc/final.c:1997
0xb1fbe6 rest_of_handle_final
../../brew-gcc/gcc/final.c:4285
0xb1fbe6 execute
../../brew-gcc/gcc/final.c:4363
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

Clearly, the compiler couldn't find a rule that works for this register
move. The relevant section of the .md file is:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
(define_expand "movsi"
  [(set (match_operand:SI 0 "general_operand" "")
        (match_operand:SI 1 "general_operand" ""))]
  ""
  "
{
  /* If this is a store, force the value into a register.  */
  if (! (reload_in_progress || reload_completed))
    {
      if (MEM_P (operands[0]))
        {
          operands[1] = force_reg (SImode, operands[1]);
          if (MEM_P (XEXP (operands[0], 0)))
            operands[0] = gen_rtx_MEM (SImode,
                                       force_reg (SImode,
                                                  XEXP (operands[0], 0)));
        }
      else if (MEM_P (operands[1])
               && MEM_P (XEXP (operands[1], 0)))
        operands[1] = gen_rtx_MEM (SImode,
                                   force_reg (SImode,
                                              XEXP (operands[1], 0)));
    }
}")

(define_insn "*movsi"
  [(set (match_operand:SI 0 "nonimmediate_operand"
                            "=r,r,r,W,A,B,r,r,r")
        (match_operand:SI 1 "brew_general_mov_src_operand"
                            "O,r,i,r,r,r,W,A,B"))]
  "register_operand (operands[0], SImode)
   || register_operand (operands[1], SImode)"
  "@
   %0 <- %0 - %0
   %0 <- %1
   %0 <- %1
   mem[%0] <- %1
   mem[%0] <- %1
   mem[%0] <- %1
   %0 <- mem[%1]
   %0 <- mem[%1]
   %0 <- mem[%1]"
)
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

As you can imagine, I'm fairly new to GCC development, so I must be
making some rookie mistake here, but I would have thought that the
second alternative in the "*movsi" rule above would match the pattern.

brew_general_mov_src_operand is defined as follows:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
(define_predicate "brew_general_mov_src_operand"
  (match_code "mem,const_int,reg,subreg,symbol_ref,label_ref,const")
{
  /* Any (MEM LABEL_REF) is OK.  That is a pc-relative load.  */
  if (MEM_P (op) && GET_CODE (XEXP (op, 0)) == LABEL_REF)
    return 1;

  if (MEM_P (op)
      && GET_CODE (XEXP (op, 0)) == PLUS
      && GET_CODE (XEXP (XEXP (op, 0), 0)) == REG
      && GET_CODE (XEXP (XEXP (op, 0), 1)) == CONST_INT)
    return 1;
  /* Any register is good too */
  if (REG_P (op))
    return 1;
  /* PC as source is also acceptable */
  if (op == pc_rtx)
    return 1;
  return general_operand (op, mode);
})
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

Thanks for all the help,
Andras




Re: Why doesn't this pattern match?

2022-01-07 Thread Andras Tantos
Thanks for the help, that's exactly it!

Andras

On Thu, 2022-01-06 at 20:25 -0800, Andrew Pinski wrote:
> On Thu, Jan 6, 2022 at 8:13 PM Andras Tantos wrote:
> > Hello!
> > 
> > My name is Andras Tantos and I just joined this list, so if I'm
> > asking
> > something off-topic or not following the rules of the community,
> > please
> > let me know.
> > 
> > What I'm working on is to port GCC (and Binutils) to a new CPU ISA,
> > I
> > call 'brew'. During developing for this target, I got the following
> > error:
> 
> How are the following constraints defined:
> W,A,B
> 
> Does one include the pc register?
> Otherwise you have a mismatch between the predicate
> brew_general_mov_src_operand (which accepts the pc register) and the
> constraint which does not.
> 
> Thanks,
> Andrew Pinski
> 
> 
> > 
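
Andrew's point about a predicate/constraint mismatch can be illustrated
with a toy model in plain C. This is not GCC code: the operand kinds and
the toy_* functions are hypothetical stand-ins, and the model assumes
that 'r' maps to a register class that excludes $pc. It sketches how an
insn can be recognized (the predicate passes) and still fail later with
"insn does not satisfy its constraints" because no alternative accepts
the operand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operand kinds for the toy model; not GCC's rtx codes.  */
enum operand_kind { OP_GENERAL_REG, OP_PC_REG, OP_IMMEDIATE, OP_MEMORY };

/* Stand-in for a predicate that, like brew_general_mov_src_operand,
   accepts any register (including $pc), immediates and memory.  */
bool toy_predicate (enum operand_kind op)
{
  (void) op;
  return true;
}

/* Stand-in for one constraint letter.  Assumption: 'r' covers only the
   general registers, 'i' immediates, 'm' memory; nothing covers $pc.  */
bool toy_constraint_ok (char letter, enum operand_kind op)
{
  switch (letter)
    {
    case 'r': return op == OP_GENERAL_REG;
    case 'i': return op == OP_IMMEDIATE;
    case 'm': return op == OP_MEMORY;
    default:  return false;
    }
}

/* Return the index of the first alternative whose constraint accepts OP,
   or -1 if none does.  ALTS holds one letter per alternative.  */
int toy_find_alternative (const char *alts, enum operand_kind op)
{
  assert (alts != NULL);
  for (size_t i = 0; alts[i] != '\0'; i++)
    if (toy_constraint_ok (alts[i], op))
      return (int) i;
  return -1;
}
```

With the source alternatives "rim", toy_predicate (OP_PC_REG) is true
but toy_find_alternative ("rim", OP_PC_REG) returns -1, which mirrors
the failure reported in this thread; a general register selects
alternative 0.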



What's wrong with this RTL?

2022-01-09 Thread Andras Tantos
All!

I'm trying to port GCC to a new target that I call 'brew'. I've based it
on the Moxie target, mostly because of its simplicity.

I must be doing something horribly wrong, as the following C code croaks
in the LRA path:

   long long foo (long long a, long long *w)
   {
 return __builtin_add_overflow (a, a, w);
   }

The error message I get is the following:

   during RTL pass: reload
   ../brew-gcc-build/second.c: In function ‘foo’:
   ../brew-gcc-build/second.c:5:1: internal compiler error: maximum
   number of generated reload insns per insn achieved (90)
   5 | }
 | ^
   0xd23854 lra_constraints(bool)
../../brew-gcc/gcc/lra-constraints.c:5095
   0xd10322 lra(_IO_FILE*)
../../brew-gcc/gcc/lra.c:2336
   0xcc86d9 do_reload
../../brew-gcc/gcc/ira.c:5932
   0xcc86d9 execute
../../brew-gcc/gcc/ira.c:6118
   Please submit a full bug report,
   with preprocessed source if appropriate.
   Please include the complete backtrace with any bug report.
   See <https://gcc.gnu.org/bugs/> for instructions.

The repro seems to go away if:

   1. I don't use the return value from __builtin_add_overflow
   2. I don't return a long long value
   3. I use some other function, which is not inlined
   4. I write and call some other (somewhat non-trivial) force-inline
      function myself

In other words, this is pretty much the minimal repro I've found so
far.

The initial RTL passed in to LRA appears to be:

   (insn 2 4 3 2 (set (mem/f/c:SI (plus:SI (reg/f:SI 16 ?ap)
   (const_int 12 [0xc])) [2 w+0 S4 A32])
   (reg:SI 4 $r4)) "../brew-gcc-build/second.c":2:1 23
   {*movsi_store}
(expr_list:REG_DEAD (reg:SI 4 $r4)
   (nil)))

   (note 3 2 8 2 NOTE_INSN_FUNCTION_BEG)

   (insn 8 3 6 2 (clobber (reg:DI 27 [ _5+8 ])) "../brew-gcc-
   build/second.c":3:10 -1
(nil))

   (insn 6 8 7 2 (set (subreg:SI (reg:DI 27 [ _5+8 ]) 0)
   (const_int 0 [0])) "../brew-gcc-build/second.c":3:10 20
   {*movsi_immed}
(nil))

   (insn 7 6 13 2 (set (subreg:SI (reg:DI 27 [ _5+8 ]) 4)
   (const_int 0 [0])) "../brew-gcc-build/second.c":3:10 20
   {*movsi_immed}
(nil))

   (insn 13 7 18 2 (clobber (reg:DI 30)) "../brew-gcc-
   build/second.c":3:10 -1
(nil))

   (insn 18 13 19 2 (clobber (reg:DI 33)) "../brew-gcc-
   build/second.c":3:10 -1
(nil))

   (insn 19 18 20 2 (clobber (reg:DI 36)) "../brew-gcc-
   build/second.c":3:10 -1
(nil))

   (insn 20 19 21 2 (set (subreg:SI (reg:DI 36) 4)
   (plus:SI (subreg:SI (reg:DI 30) 4)
   (subreg:SI (reg:DI 33) 4))) "../brew-gcc-
   build/second.c":3:10 2 {addsi3}
(nil))

   (insn 21 20 22 2 (set (reg:SI 37)
   (const_int 1 [0x1])) "../brew-gcc-build/second.c":3:10 20
   {*movsi_immed}
(nil))

   (jump_insn 22 21 83 2 (set (pc)
   (if_then_else (ltu (subreg:SI (reg:DI 36) 4)
   (subreg:SI (reg:DI 30) 4))
   (label_ref 24)
   (pc))) "../brew-gcc-build/second.c":3:10 38 {cbranchsi4}
(nil)
-> 24)

   (note 83 22 23 3 [bb 3] NOTE_INSN_BASIC_BLOCK)

   (insn 23 83 24 3 (set (reg:SI 37)
   (const_int 0 [0])) "../brew-gcc-build/second.c":3:10 20
   {*movsi_immed}
(nil))

   (code_label 24 23 84 4 4 (nil) [1 uses])

   (note 84 24 25 4 [bb 4] NOTE_INSN_BASIC_BLOCK)

   (insn 25 84 26 4 (set (subreg:SI (reg:DI 36) 0)
   (plus:SI (subreg:SI (reg:DI 30) 0)
   (subreg:SI (reg:DI 33) 0))) "../brew-gcc-
   build/second.c":3:10 2 {addsi3}
(expr_list:REG_DEAD (reg:DI 33)
   (expr_list:REG_DEAD (reg:DI 30)
   (nil))))

   (insn 26 25 27 4 (set (reg:SI 38)
   (plus:SI (reg:SI 37)
   (subreg:SI (reg:DI 36) 0))) "../brew-gcc-
   build/second.c":3:10 2 {addsi3}
(expr_list:REG_DEAD (reg:SI 37)
   (nil)))

   (insn 27 26 28 4 (set (subreg:SI (reg:DI 36) 0)
   (reg:SI 38)) "../brew-gcc-build/second.c":3:10 21
   {*movsi_move}
(expr_list:REG_DEAD (reg:SI 38)
   (nil)))

   (insn 28 27 29 4 (set (reg:SI 40)
   (mem/c:SI (plus:SI (reg/f:SI 16 ?ap)
   (const_int 4 [0x4])) [1 a+0 S4 A32])) "../brew-gcc-
   build/second.c":3:10 22 {*movsi_load}
(nil))

   (insn 29 28 30 4 (set (subreg:SI (reg:DI 39) 0)
   (xor:SI (reg:SI 40)
   (reg:SI 40))) "../brew-gcc-build/second.c":3:10 15
   {xorsi3}
(expr_list:REG_DEAD (reg:SI 40)
   (nil)))

   (insn 30 29 31 4 (set (reg:SI 41)
   (mem/c:SI (plus:SI (reg/f:SI 16 ?ap)
   (const_int 8 [0x8])) [1 a+4 S4 A32])) "../brew-gcc-
   build/second.c":3:10 22 {*movsi_load}
(nil))

   (insn 31 30 32 4 (set (subreg:SI (reg:DI 39) 4)
   (xor:SI (reg:SI 41)
   (reg:SI 41))) "../brew-gcc-build/second.c":3:10 15
   {xorsi3}
(expr_list:REG_DEAD (reg:SI 41)
   (nil)))

   (insn 32 31 33

Re: What's wrong with this RTL?

2022-01-10 Thread Andras Tantos
On Sun, 2022-01-09 at 22:19 -0800, Andrew Pinski wrote:
> On Sun, Jan 9, 2022 at 8:49 PM Andras Tantos wrote:
> > All!
> > 
> > I'm trying to port GCC to a new target that I call 'brew'. I've
> > based it on the Moxie target, mostly because of its simplicity.
> > 
> > I must be doing something horribly wrong, as the following C code
> > croaks in the LRA path:
> > 
> >long long foo (long long a, long long *w)
> >{
> >  return __builtin_add_overflow (a, a, w);
> >}
> > 
> > The error message I get is the following:
> > 
> >during RTL pass: reload
> >../brew-gcc-build/second.c: In function ‘foo’:
> >../brew-gcc-build/second.c:5:1: internal compiler error: maximum
> >number of generated reload insns per insn achieved (90)
> >5 | }
> >  | ^
> >0xd23854 lra_constraints(bool)
> > ../../brew-gcc/gcc/lra-constraints.c:5095
> >0xd10322 lra(_IO_FILE*)
> > ../../brew-gcc/gcc/lra.c:2336
> >0xcc86d9 do_reload
> > ../../brew-gcc/gcc/ira.c:5932
> >0xcc86d9 execute
> > ../../brew-gcc/gcc/ira.c:6118
> >Please submit a full bug report,
> >with preprocessed source if appropriate.
> >Please include the complete backtrace with any bug report.
> >See <https://gcc.gnu.org/bugs/> for instructions.
> 
> This usually means the move instruction is being reloaded over and
> over again as you describe below.
> I think you should have one merged movsi instruction which handles
> all
> of the constraints together. mov is "special" in that it needs to be
> done that way otherwise this happens.
> But really there seems to be another issue where (subreg:SI (reg:DI))
> is not being accepted for the xor set too.
> What regclasses are being chosen for the reg DI mode? Etc.
> 
> Thanks,
> Andrew Pinski
> 

That's what it was: after merging all my movsi variants, the problem
went away.

Thanks for the help!
Andras
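
The advice to merge the movsi variants can be sketched with a toy model
in plain C. This is not GCC code: the types and pattern tables below are
hypothetical, and the model rests on the assumption that reload may pick
a different alternative within the insn's own pattern but does not move
the insn to a different pattern. Under that assumption, a split
register-to-register pattern has nothing to fall back on once one of its
operands is spilled to memory, while a merged pattern does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operand kinds: register, memory, immediate.  */
enum kind { K_REG = 'r', K_MEM = 'm', K_IMM = 'i' };

/* One constraint alternative: a destination/source letter pair.  */
struct alternative { char dest; char src; };

/* A toy insn pattern: a fixed list of alternatives.  */
struct pattern
{
  const struct alternative *alts;
  size_t n_alts;
};

/* True if some alternative of PAT accepts the (DEST, SRC) operand pair.  */
bool pattern_accepts (const struct pattern *pat, enum kind dest, enum kind src)
{
  assert (pat != NULL);
  for (size_t i = 0; i < pat->n_alts; i++)
    if (pat->alts[i].dest == (char) dest && pat->alts[i].src == (char) src)
      return true;
  return false;
}

/* Split variant: register-to-register moves only.  */
static const struct alternative move_only_alts[] = { { 'r', 'r' } };
const struct pattern movsi_move = { move_only_alts, 1 };

/* Merged variant: immediate, move, store and load in one pattern.  */
static const struct alternative merged_alts[] =
  { { 'r', 'i' }, { 'r', 'r' }, { 'm', 'r' }, { 'r', 'm' } };
const struct pattern movsi_merged = { merged_alts, 4 };
```

If the source pseudo of a register move is spilled, the operand pair
becomes (K_REG, K_MEM): movsi_move no longer accepts it, while
movsi_merged still does.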





How to generate a call inst. sequence?

2022-01-18 Thread Andras Tantos

All,

I'm working on porting GCC to a processor architecture that has neither
a (HW) stack nor a call instruction. This means that for calls, I need
to generate the following instruction sequence:


    // move stack-pointer:
    $sp <- $sp-4
    // load return address:
    $r3 <- return_label
    // store return address on stack:
    mem[$sp] <- $r3
    // jump to callee:
    $pc <- 
  return_label:

Now, I can do all of that as a multi-instruction string sequence in my
.md file (which is what I'm doing right now), but there are two problems
with that approach. First, it hard-codes the temp register ($r3 above)
and requires me to reserve it, even though the register allocator could
otherwise use it between calls. Second, this approach (I think, at least)
prevents any passes from merging stack-frame preparation for the call
arguments, such as eliminating the stack-pointer update above.


I thought I could circumvent these problems by emitting a piece of RTL 
in the 'call' pattern:


  (define_expand "call"
    [(call
  (match_operand:QI 0 "memory_operand" "")
  (match_operand 1 "" "")
    )]
    ""
  {
    brew_expand_call(Pmode, operands);
  })

where brew_expand_call is:

  void brew_expand_call(machine_mode mode, rtx *operands)
  {
    gcc_assert (MEM_P(operands[0]));

    rtx_code_label *label = gen_label_rtx();
    rtx label_ref = gen_rtx_LABEL_REF(SImode, label);
    rtx temp_reg = gen_reg_rtx(mode);

    // $sp <- $sp - 4
    emit_insn(gen_subsi3(
      stack_pointer_rtx,
      stack_pointer_rtx,
      GEN_INT(4)
    ));
    // $r3 <- 
    emit_insn(gen_move_insn(
      temp_reg,
      label_ref
    ));
    // mem[$sp] <- $r3
    emit_insn(gen_move_insn(
      gen_rtx_MEM(Pmode, stack_pointer_rtx),
      temp_reg
    ));
    emit_jump_insn(gen_jump(operands[0]));
    emit_label(label);
  }

If I try to compile the following test:

  void x(void)
  {
  }

  int main(void)
  {
    x();
    return 0;
  }

I get an assert:

  during RTL pass: expand
  dump file: call.c.252r.expand
  call.c: In function ‘main’:
  call.c:9:1: internal compiler error: in as_a, at is-a.h:242
  9 | }
    | ^
  0x6999b7 rtx_insn* as_a(rtx_def*)
  ../../brew-gcc/gcc/is-a.h:242
  0x6999b7 rtx_sequence::insn(int) const
  ../../brew-gcc/gcc/rtl.h:1439
  0x6999b7 mark_jump_label_1
  ../../brew-gcc/gcc/jump.cc:1077
  0xcfc31f mark_jump_label_1
  ../../brew-gcc/gcc/jump.cc:1171
  0xcfc73d mark_all_labels
  ../../brew-gcc/gcc/jump.cc:332
  0xcfc73d rebuild_jump_labels_1
  ../../brew-gcc/gcc/jump.cc:74
  0x9e8e62 execute
  ../../brew-gcc/gcc/cfgexpand.cc:6845

The reference dump file:

  ;; Function x (x, funcdef_no=0, decl_uid=1383, cgraph_uid=1, 
symbol_order=0)



  ;; Generating RTL for gimple basic block 2


  try_optimize_cfg iteration 1

  Merging block 3 into block 2...
  Merged blocks 2 and 3.
  Merged 2 and 3 without moving.
  Merging block 4 into block 2...
  Merged blocks 2 and 4.
  Merged 2 and 4 without moving.


  try_optimize_cfg iteration 2



  ;;
  ;; Full RTL generated for this function:
  ;;
  (note 1 0 3 NOTE_INSN_DELETED)
  (note 3 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
  (note 2 3 0 2 NOTE_INSN_FUNCTION_BEG)

  ;; Function main (main, funcdef_no=1, decl_uid=1386, cgraph_uid=2, 
symbol_order=1)



  ;; Generating RTL for gimple basic block 2

  ;; Generating RTL for gimple basic block 3



  EMERGENCY DUMP:

  int main ()
  {
  (note 3 1 2 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
  (note 2 3 4 4 NOTE_INSN_FUNCTION_BEG)

  (note 4 2 5 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
  (insn 5 4 6 2 (set (reg/f:SI 1 $sp)
  (minus:SI (reg/f:SI 1 $sp)
  (const_int 4 [0x4]))) "call.c":7:5 -1
  (nil))
  (insn 6 5 7 2 (set (reg:SI 25)
  (label_ref:SI 9)) "call.c":7:5 -1
  (insn_list:REG_LABEL_OPERAND 9 (nil)))
  (insn 7 6 8 2 (set (mem:SI (reg/f:SI 1 $sp) [0  S4 A32])
  (reg:SI 25)) "call.c":7:5 -1
  (nil))
  (jump_insn 8 7 9 2 (set (pc)
  (label_ref (mem:QI (symbol_ref:SI ("x") [flags 0x3] 
) [0 x S1 A8]))) "call.c":7:5 -1

  (nil))
  (code_label 9 8 10 2 3 (nil) [1 uses])
  (call_insn 10 9 11 2 (call (mem:QI (symbol_ref:SI ("x") [flags 0x3]  
) [0 x S1 A8])

  (const_int 16 [0x10])) "call.c":7:5 -1
  (nil)
  (nil))
  (insn 11 10 12 2 (set (reg:SI 23 [ _3 ])
  (const_int 0 [0])) "call.c":8:12 -1
  (nil))

  (code_label 12 11 13 3 4 (nil) [0 uses])
  (note 13 12 14 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
  (insn 14 13 15 3 (set (reg:SI 24 [  ])
  (reg:SI 23 [ _3 ])) "call.c":9:1 -1
  (nil))
  (jump_insn 15 14 16 3 (set (pc)
  (label_ref 17)) "call.c":9:1 -1
  (nil))

  (code_label 17 16 20 5 2 (nil) [0 uses])
  (note 20 17 18 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
  (insn 18 20 19 5 (set (reg/i:SI 4 $r4)
  (reg:SI 24 [  ])) "call.c":9:1 -1
  (nil))
  (insn 19 18 0 5 (use (reg/i:SI 4 $r4)) "call.c":9:1 -1
  (nil))

  }

As a test to narrow the problem down, I removed the 'emit_jump_insn' 
c

Re: How to generate a call inst. sequence?

2022-01-20 Thread Andras Tantos
On Wed, 2022-01-19 at 10:45 +, Richard Sandiford wrote:
> Andras Tantos writes:
> > All,
> > 
> > I'm working on porting GCC to a processor architecture that doesn't
> > have 
> > a (HW) stack nor a call instruction. This means that for calls, I
> > need 
> > to generate the following instruction sequence:
> > 
> >  // move stack-pointer:
> >  $sp <- $sp-4
> >  // load return address:
> >  $r3 <- return_label
> >  // store return address on stack:
> >  mem[$sp] <- $r3
> >  // jump to callee:
> >  $pc <- 
> 
> Even though this is internally a jump, it still needs to be
> represented
> as a (call …) rtx in rtl, and emitted using emit_call_insn.
> 
> In other words, the "call" expander must always emit a call_insn
> of some kind.  (But it can emit other instructions too, such as the
> ones you describe above.)
> 
> Richard
> 

Richard,

Thanks for the reply. While what you're saying makes sense, it didn't
solve my problems. The symptoms changed, but didn't completely go away.
At the same time, I realized that - in this architecture -
link-register-style calls are more efficient anyway, so I've changed my
call implementation to that configuration. That got rid of the need to
solve this particular problem.

So, just to document for people who might be looking at this thread in
the future: this wasn't the complete answer to my problem, but I took a
different route which removed the whole problem class.

Thanks again,
Andras
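
For readers following the thread, the stack-based call sequence from the
first message can be modeled at the machine level with a small C
simulation. This is only a sketch of what the four instructions do to
the registers and memory; it says nothing about how to express the
sequence in RTL. The toy_cpu type and both functions are hypothetical,
and the return sequence is an assumption, since the thread only shows
the call side.

```c
#include <assert.h>
#include <stdint.h>

/* Toy machine state; word-indexed memory keeps the model short.
   All names here are hypothetical and only mirror the thread's sequence.  */
enum { TOY_MEM_WORDS = 64 };

struct toy_cpu
{
  uint32_t pc;                    /* program counter */
  uint32_t sp;                    /* stack pointer, in bytes */
  uint32_t tmp;                   /* temp register ($r3 in the thread) */
  uint32_t mem[TOY_MEM_WORDS];    /* data memory */
};

/* Software call: push RETURN_LABEL, then jump to CALLEE.  */
void toy_call (struct toy_cpu *cpu, uint32_t callee, uint32_t return_label)
{
  assert (cpu->sp >= 4 && cpu->sp / 4 <= TOY_MEM_WORDS);
  cpu->sp -= 4;                       /* $sp <- $sp - 4       */
  cpu->tmp = return_label;            /* $r3 <- return_label  */
  cpu->mem[cpu->sp / 4] = cpu->tmp;   /* mem[$sp] <- $r3      */
  cpu->pc = callee;                   /* $pc <- callee        */
}

/* Assumed matching return: pop the saved address into $pc.  */
void toy_return (struct toy_cpu *cpu)
{
  assert (cpu->sp / 4 < TOY_MEM_WORDS);
  cpu->pc = cpu->mem[cpu->sp / 4];    /* $pc <- mem[$sp]      */
  cpu->sp += 4;                       /* $sp <- $sp + 4       */
}
```

Starting from sp = 64, toy_call (&cpu, 200, 104) leaves pc = 200,
sp = 60 and the return address 104 in mem[15]; toy_return then restores
pc = 104 and sp = 64. A link-register scheme would keep the return
address in a register instead of storing it to memory.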





Benchmark recommendations needed

2022-02-15 Thread Andras Tantos
Hello all!

I'm working on porting GCC to a new processor architecture. I think
I've finally got to a fairly stable stage, so the next logical step
would be to test and optimize. For that, I would need some benchmarks,
and this is where I'm seeking your help.

This being a hobby project, I can't shell out $1000+ for the spec
suite. On top of that, I only have newlib ported as the runtime, which
means very limited support for OS facilities.

I already have dhrystone, but what else would you recommend using?

I'm looking for 'general purpose' workloads: things where I can judge
object code size, instruction set utilization, look for tuning
opportunities, etc.

I would like to also be able to compare the results with other
architectures (FPGA cores, such as nios2 as well as some low-end cores,
such as 32-bit arm/thumb and riscv-RV32IMFC).

So, can you suggest some benchmarks, or applications that could be used
as benchmarks?

Thanks a bunch!
Andras




Re: Benchmark recommendations needed

2022-02-21 Thread Andras Tantos
That's true, I did notice GCC being rather ... peculiar about
dhrystone. Is there a way to make it less clever about the benchmark?

Or is there some alteration to the benchmark I can make to not trigger
the special behavior in GCC?

Andras

On Mon, 2022-02-21 at 03:19 +, Gary Oblock via Gcc wrote:
> Trying to use dhrystone isn't going to be very useful. It has
> many downsides, not the least of which is that gcc's optimizer can
> run rings around it.
> 
> Gary
> 
> 



Multiple types of load/store: how to create .md rules?

2022-05-02 Thread Andras Tantos
All,

Thanks for all the help from the past. I'm (still) working on porting
GCC to a new processor ISA and ran into the following problem: the CPU
supports two kinds of register+offset based loads (and stores).

The generic format accepts any base register and any offset. The syntax
for this type of operation is:

  $rX <- mem[$rB + ]

For code compactness reasons, there's another (shorter) form that
accepts only $sp and $fp (the stack and frame pointers) as base
registers, and an offset in the range of -256 to 252. Finally, this
form only supports a transfer size of 32 bits. The syntax for this
format is:

  $rX <- mem[tiny $rB + ]

I would like to coerce GCC into using the 'tiny' form, whenever it can.
In order to do that, I've created a new predicate:

  (define_predicate "brew_tiny_memory_operand"
    (and
      (match_operand 0 "memory_operand")
      (match_test
        "MEM_P(op) &&
         brew_legitimate_tiny_address_p(XEXP(op, 0))"
      )
    )
  )

The function referenced in the predicate is as follows:

  static bool
  brew_reg_ok_for_tiny_base_p(const_rtx reg)
  {
    int regno = REGNO(reg);

    return
      regno == BREW_REG_FP ||
      regno == BREW_REG_SP ||
      regno == BREW_QFP ||
      regno == BREW_QAP;
  }

  bool
  brew_legitimate_tiny_address_p(rtx x)
  {
    if (
      GET_CODE(x) == PLUS &&
      REG_P(XEXP(x, 0)) &&
      brew_reg_ok_for_tiny_base_p(XEXP(x, 0)) &&
      CONST_INT_P(XEXP(x, 1)) &&
      IN_RANGE(INTVAL(XEXP(x, 1)), -256, 252) &&
      (INTVAL(XEXP(x,1)) & 3) == 0
    )
      return true;
    return false;
  }
  }
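
The address check above can be exercised outside of GCC with a small
standalone C sketch. The register numbers and toy_* names are
hypothetical; only the base-register restriction and the offset rule
(range -256 to 252, multiple of 4) are taken from the function above.
It also counts how many offsets the 'tiny' form can express, which may
be useful when reasoning about the encoding.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical register numbers for the sketch; the real port uses
   BREW_REG_FP, BREW_REG_SP, BREW_QFP and BREW_QAP.  */
enum toy_reg { TOY_REG_SP = 1, TOY_REG_FP = 2, TOY_REG_R8 = 8 };

/* Only the stack and frame pointers may serve as a 'tiny' base.  */
bool toy_tiny_base_ok (enum toy_reg base)
{
  return base == TOY_REG_SP || base == TOY_REG_FP;
}

/* Offset must lie in [-256, 252] and be a multiple of 4.  */
bool toy_tiny_offset_ok (long offset)
{
  return offset >= -256 && offset <= 252 && (offset & 3) == 0;
}

/* Combined check, mirroring brew_legitimate_tiny_address_p for an
   address of the form base + offset.  */
bool toy_tiny_address_ok (enum toy_reg base, long offset)
{
  return toy_tiny_base_ok (base) && toy_tiny_offset_ok (offset);
}

/* Count the offsets the 'tiny' form can express.  */
int toy_count_tiny_offsets (void)
{
  int count = 0;
  for (long off = -256; off <= 252; off++)
    if (toy_tiny_offset_ok (off))
      count++;
  assert (count > 0);
  return count;
}
```

Under these rules toy_tiny_address_ok (TOY_REG_FP, -32) holds, so an
$fp-based store at offset -32 satisfies the address check itself, and
toy_count_tiny_offsets () returns 128, the number of values a 7-bit
field can hold.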

Finally, I've created rules for the use of these new predicates:

  (define_expand "movsi"
    [(set
      (match_operand:SI 0 "general_operand" "")
      (match_operand:SI 1 "general_operand" "")
    )]
    ""
    "
    {
      if (!(reload_in_progress || reload_completed))
        {
          if(MEM_P(operands[0]))
            {
              // For stores, force the second arg. into a register
              operands[1] = force_reg(SImode, operands[1]);
              // We should make sure that the address
              // generated for the store is based on a
              // + pattern
              if(MEM_P(XEXP(operands[0], 0)))
                operands[0] = gen_rtx_MEM(
                  SImode,
                  force_reg(SImode, XEXP(operands[0], 0))
                );
            }
          else if(MEM_P(operands[1]))
            {
              // We should make sure that the address
              // generated for the load is based on a
              // + pattern
              if(MEM_P(XEXP (operands[1], 0)))
                operands[1] = gen_rtx_MEM(
                  SImode,
                  force_reg(SImode, XEXP(operands[1], 0))
                );
            }
        }
    }")

  (define_insn "*movsi_tiny_store"
    [(set
      (match_operand:SI 0 "brew_tiny_memory_operand"  "=m")
      (match_operand:SI 1 "register_operand"          "r")
    )]
    ""
    "mem[tiny %0] <- %1"
    [(set_attr "length" "2")]
  )

  (define_insn "*movsi_tiny_load"
    [(set
      (match_operand:SI 0 "register_operand"          "=r")
      (match_operand:SI 1 "brew_tiny_memory_operand"  "m")
    )]
    ""
    "%0 <- mem[tiny %1]"
    [(set_attr "length" "2")]
  )

  (define_insn "*movsi_general"
    [(set
      (match_operand:SI 0 "nonimmediate_operand"  "=r,r,r,r,m,r")
      (match_operand:SI 1 "general_operand"       "N,L,i,r,r,m")
    )]
    ""
    "@
    %0 <- tiny %1
    %0 <- short %1
    %0 <- %1
    %0 <- %1
    mem[%0] <- %1
    %0 <- mem[%1]"
    [(set_attr "length" "2,4,6,2,6,6")]
  )

When I tested this code, I noticed a funny thing: the function
prologs and epilogs seem to use the 'tiny' versions of loads/stores
just fine. However, (I think) some of the spills/reloads for local
variables end up using the extended version. Even more surprising is
that this behavior only manifests itself during optimized (-Os, -O1,
-O2) compilation. It seems that -O0 is free from this problem. Here's
one example:

.file   "dtoa.c"
.text
.global __udivsi3
.p2align1
.global quorem
.type   quorem, @function
  quorem:
mem[tiny $sp + -4] <- $fp ### <--- OK
mem[tiny $sp + -8] <- $lr
mem[tiny $sp + -12] <- $r8
mem[tiny $sp + -16] <- $r9
mem[tiny $sp + -20] <- $r10
mem[tiny $sp + -24] <- $r11
$fp <- $sp
$sp <- short $sp - 48
$r9 <- $a0
$r8 <- mem[$a1 + 16]
$r0 <- mem[$a0 + 16]
if signed $r0 < $r8 $pc <- .L7
$r8 <- tiny $r8 + -1
$r0 <- tiny 2
$r0 <- $r8 << $r0
$r10 <- short $a0 + 20
$r1 <- $r10 + $r0
mem[$fp + -32] <- $r1  ## < should be 'tiny'
$r11 <- mem[$r1]

In reply to an earlier question of mine, Andrew Pinski said that I should
merge all *movsi patterns into a single one to avoid (in that case)
strange deletions in the generated assembly. Is that possible here? It
appears to me that I would need the ability to differentiate the
different patterns using constraints, but is t