On Sat, Jun 07, 2025 at 06:53:28PM +0300, Dimitar Dimitrov wrote:
> On Sat, Jun 07, 2025 at 11:38:46AM +0100, Stafford Horne wrote:
> > On Fri, Jun 06, 2025 at 09:54:53PM +0100, Stafford Horne wrote:
> > > On Fri, Jun 06, 2025 at 11:31:00PM +0300, Dimitar Dimitrov wrote:
> > > > On Fri, Jun 06, 2025 at 06:12:13PM +0100, Stafford Horne wrote:
> > > > > On Fri, Jun 06, 2025 at 04:41:21PM +0100, Stafford Horne wrote:
> > > > > > On Fri, Jun 06, 2025 at 04:20:10PM +0100, Richard Sandiford wrote:
> > > > > > > Stafford Horne <[email protected]> writes:
> > > > > > > > Hello,
> > > > > > > >
> > > > > > > > This seems to be causing a build regression on the or1k port.
> > > > > > > >
> > > > > > > > I have not quite figured it out yet but I have bisected to this
> > > > > > > > commit.
> > > > > > > >
> > > > > > > > The failure is as below, this seems to be caused by the
> > > > > > > > cstoresi4 instruction produced by
> > > > > > > > or1k.md. So I think its likely something we are doing funny in
> > > > > > > > the port.
> > > > > > > >
> > > > > > > > Thanks to Jeff for pointing this out. If you have any idea let
> > > > > > > > me know.
> > > > > > > >
> > > > > > > > Log:
> > > > > > > >
> > > > > > > > during RTL pass: ce1
> > > > > > > > dump file: libgcc2.c.286r.ce1
> > > > > > > > /home/shorne/work/gnu-toolchain/gcc/libgcc/libgcc2.c: In
> > > > > > > > function '__muldi3':
> > > > > > > > /home/shorne/work/gnu-toolchain/gcc/libgcc/libgcc2.c:538:1:
> > > > > > > > internal compiler error: in emit_move_multi_word, at
> > > > > > > > expr.cc:4492
> > > > > > > > 538 | }
> > > > > > > > | ^
> > > > > > > > 0x1b094df internal_error(char const*, ...)
> > > > > > > >
> > > > > > > > /home/shorne/work/gnu-toolchain/gcc/gcc/diagnostic-global-context.cc:517
> > > > > > > > 0x6459c1 fancy_abort(char const*, int, char const*)
> > > > > > > >
> > > > > > > > /home/shorne/work/gnu-toolchain/gcc/gcc/diagnostic.cc:1815
> > > > > > > > 0x473a2e emit_move_multi_word
> > > > > > > > /home/shorne/work/gnu-toolchain/gcc/gcc/expr.cc:4492
> > > > > > > > 0x93fd7d emit_move_insn(rtx_def*, rtx_def*)
> > > > > > > > /home/shorne/work/gnu-toolchain/gcc/gcc/expr.cc:4746
> > > > > > > > 0x9175f1 copy_to_reg(rtx_def*)
> > > > > > > >
> > > > > > > > /home/shorne/work/gnu-toolchain/gcc/gcc/explow.cc:630
> > > > > > > > 0xd18977 gen_lowpart_general(machine_mode, rtx_def*)
> > > > > > > >
> > > > > > > > /home/shorne/work/gnu-toolchain/gcc/gcc/rtlhooks.cc:56
> > > > > > > > 0x9414c7 convert_mode_scalar
> > > > > > > > /home/shorne/work/gnu-toolchain/gcc/gcc/expr.cc:818
> > > > > > > > 0x9421a1 convert_modes(machine_mode, machine_mode,
> > > > > > > > rtx_def*, int)
> > > > > > > > /home/shorne/work/gnu-toolchain/gcc/gcc/expr.cc:961
> > > > > > > > 0xc0afa0 prepare_operand(insn_code, rtx_def*, int,
> > > > > > > > machine_mode, machine_mode, int)
> > > > > > > >
> > > > > > > > /home/shorne/work/gnu-toolchain/gcc/gcc/optabs.cc:4637
> > > > > > > > 0xc0b33a prepare_cmp_insn
> > > > > > > >
> > > > > > > > /home/shorne/work/gnu-toolchain/gcc/gcc/optabs.cc:4538
> > > > > > > > 0xc0f0dd emit_conditional_move(rtx_def*, rtx_comparison,
> > > > > > > > rtx_def*, rtx_def*, machine_mode, int)
> > > > > > > >
> > > > > > > > /home/shorne/work/gnu-toolchain/gcc/gcc/optabs.cc:5098
> > > > > > > > 0x192cc7b noce_emit_cmove
> > > > > > > >
> > > > > > > > /home/shorne/work/gnu-toolchain/gcc/gcc/ifcvt.cc:1777
> > > > > > > > 0x193365b try_emit_cmove_seq
> > > > > > > >
> > > > > > > > /home/shorne/work/gnu-toolchain/gcc/gcc/ifcvt.cc:3410
> > > > > > > > 0x193365b noce_convert_multiple_sets_1
> > > > > > > >
> > > > > > > > /home/shorne/work/gnu-toolchain/gcc/gcc/ifcvt.cc:3705
> > > > > > > > 0x1933c71 noce_convert_multiple_sets
> > > > > > > >
> > > > > > > > /home/shorne/work/gnu-toolchain/gcc/gcc/ifcvt.cc:3496
> > > > > > > > 0x1938687 noce_process_if_block
> > > > > > > >
> > > > > > > > /home/shorne/work/gnu-toolchain/gcc/gcc/ifcvt.cc:4045
> > > > > > > > 0x1938687 noce_find_if_block
> > > > > > > >
> > > > > > > > /home/shorne/work/gnu-toolchain/gcc/gcc/ifcvt.cc:4726
> > > > > > > > 0x1938687 find_if_header
> > > > > > > >
> > > > > > > > /home/shorne/work/gnu-toolchain/gcc/gcc/ifcvt.cc:4931
> > > > > > > > 0x1938687 if_convert
> > > > > > > >
> > > > > > > > /home/shorne/work/gnu-toolchain/gcc/gcc/ifcvt.cc:6078
> > > > > > > > 0x1938d01 rest_of_handle_if_conversion
> > > > > > > >
> > > > > > > > /home/shorne/work/gnu-toolchain/gcc/gcc/ifcvt.cc:6143
> > > > > > > >
> > > > > > > > -Stafford
> > > > > > >
> > > > > > > Can you attach the .i?
> > > > > >
> > > > > > Sure please find attached, I can reproduce with the or1k compiler
> > > > > > using:
> > > > > >
> > > > > > $HOME/work/gnu-toolchain/build-gcc/./gcc/cc1 -O2 __muldi3.i
> > > > > >
> > > > > > I am still debugging, but I notice when reverting the patch the
> > > > > > compiler never
> > > > > > goes to emit_move_multi_word(). I am trying to work backwards as
> > > > > > to where it goes
> > > > > > different, it's just taking a bit of time as I haven't debugged GCC
> > > > > > ICE's for a
> > > > > > while.
> > > > >
> > > > > I made some progress on this. In openRISC we generate cstoresi4 with
> > > > > special
> > > > > register SR_F (condition flag) bit register.
> > > > >
> > > > > This is represented as (reg:BI 34 ?sr_f).
> > > > >
> > > > > The RTL produced for cstoresi4 becomes something like:
> > > > >
> > > > > // The compare
> > > > > (insn 49 48 50 (set (reg:BI 34 ?sr_f)
> > > > > (leu:BI (reg/v:SI 68 [ __x2D.6783 ])
> > > > > (reg/v:SI 69 [ __x1D.6782 ])))
> > > > > "/home/shorne/work/gnu-toolchain/gcc/libgcc/libgcc2.c":532:22 -1
> > > > > (nil)
> > > > > // The conditional branch
> > > > > (jump_insn 50 49 0 (set (pc)
> > > > > (if_then_else (ne (reg:BI 34 ?sr_f)
> > > > > (const_int 0 [0]))
> > > > > (label_ref 0)
> > > > > (pc)))
> > > > > "/home/shorne/work/gnu-toolchain/gcc/libgcc/libgcc2.c":532:22 -1
> > > > > (int_list:REG_BR_PROB 536870916 (nil)))
> > > > >
> > > > > During ce1 the try_emit_cmove_seq would try to convert (reg:BI 34
> > > > > ?sr_f) to SI.
> > > > > Before this patch it was done in gen_lowpart_general by
> > > > > simplify_context::simplify_gen_subreg.
> > > > >
> > > > > The simplity to subreg operation would convert:
> > > > > (reg:BI 34 ?sr_f)
> > > > >
> > > > > To:
> > > > > (subreg:SI (reg:BI 34 ?sr_f) 0)
> > > > >
> > > > > But now the call to validate_subreg is returning false and
> > > > > simplify_gen_subreg
> > > > > returns NULL_RTX, this causes gen_lowpart_general to try to do
> > > > > copy_to_reg,
> > > > > which fails.
> > > > >
> > > > > I need to step away for a bit, but I would next look into why
> > > > > validate_subreg
> > > > > is now failing to allow (reg:BI 34 ?sr_f) to be subreg'd.
> > > >
> > > > I think that the REG_CAN_CHANGE_MODE_P call in validate_subreg is
> > > > rejecting the subreg, because or1k_can_change_mode_class rejects the
> > > > mode change of sr_f from BI to SI.
> > > >
> > > Hi Dimitar,
> > >
> > > That's right, I noticed the same thing and I am testing out a patch to
> > > or1k_can_change_mode_class now, it fixes the ICE. But I'll have to spend
> > > a bit
> > > more time testing.
> >
> > Hi All,
> >
> > The patch I have now is as below and it looks to work ok:
> >
> > --- a/gcc/config/or1k/or1k.cc
> > +++ b/gcc/config/or1k/or1k.cc
> > @@ -1408,8 +1408,9 @@ static bool
> > or1k_can_change_mode_class (machine_mode from, machine_mode to,
> > reg_class_t rclass)
> > {
> > + /* Allow cnoverting special flags to SI mode subregs. */
> > if (rclass == FLAG_REGS)
> > - return from == to;
> > + return from == to || (from == BImode && to == SImode);
> > return true;
> > }
>
> Another patch I intend to merge tomorrow will tighten the checks even
> more: https://gcc.gnu.org/pipermail/gcc-patches/2025-June/685837.html
Hi Dimitar,
Have you merged this yet?
> Seeing that or1k_hard_regno_mode_ok forbids SImode for class FLAG_REGS,
> the validate_subreg may once again start rejecting subregs for or1k.
As you rightfully point out, after that patch the issue returns again. I have
the following fix which allows convert_mode_scalar to bypass trying to use
subregs and use "sign extension" instead as a method to move the SR[F] details
into a register.
diff --git a/gcc/config/or1k/or1k.md b/gcc/config/or1k/or1k.md
index 627e40084b3..66e5af10a98 100644
--- a/gcc/config/or1k/or1k.md
+++ b/gcc/config/or1k/or1k.md
@@ -515,6 +515,24 @@
(ne:SI (reg:BI SR_F_REGNUM) (const_int 0)))]
"")
+(define_expand "zero_extendbisi2"
+ [(set (match_operand:SI 0 "register_operand" "")
+ (zero_extend:SI (match_operand:BI 1 "register_operand" "")))]
+ ""
+{
+ emit_insn(gen_sne_sr_f (operands[0]));
+ DONE;
+})
+
+(define_expand "extendbisi2"
+ [(set (match_operand:SI 0 "register_operand" "")
+ (sign_extend:SI (match_operand:BI 1 "register_operand" "")))]
+ ""
+{
+ emit_insn(gen_sne_sr_f (operands[0]));
+ DONE;
+})
+
This fixes the ICE issue, but it's a bit of a hack and it points out that the
ce1
pass is not able to perform good optimizations on OpenRISC.
This is because openrisc very early (during expand) converts branch statements
separate to "set flag" and "branch if flag". If we would maintain the branch
RTL in the original form a bit longer I think the ce1 pass could have a better
chance to optimize.
e.g. cbranchsi2 on OpenRISC produces:
(insn 11 10 12 2 (set (reg:BI 34 ?sr_f)
(le:BI (reg:SI 46)
(reg:SI 47))) "cmov.c":30:6 -1
(nil))
(jump_insn 12 11 13 2 (set (pc)
(if_then_else (ne (reg:BI 34 ?sr_f)
(const_int 0 [0]))
(label_ref:SI 25)
(pc))) "cmov.c":30:6 77 {*cbranch}
(int_list:REG_BR_PROB 365072228 (nil))
-> 25)
The ce1 pass tries to optimize "insn 12".
I think it could do better if we first kept the form, as below, and let later
split patterns generate the "*sr_f" and "*cbranch" patterns.
(jump_insn (set (pc)
(if_then_else (le (reg:SI 47)
(reg: SI 47))
(label_ref:SI 25)
(pc)))
)
I will try this.
-Stafford