> commit fa761b10d40aaa71e62fbc0c9f2ab8fc07a98b49 (HEAD, refs/bisect/bad) > Author: wilco <wilco@138bc75d-0d04-0410-961f-82ee72b054a4> > Date: Wed Sep 18 18:33:30 2019 +0000 > > [ARM] Cleanup 64-bit multiplies > > Cleanup 64-bit multiplies. Combine the expanders using iterators. > Merge the signed/unsigned multiplies as well as the pre-Armv6 and Armv6 > variants. Split DImode operands early into parallel sets inside the > MULL/MLAL instructions - this improves register allocation and avoids > subreg issues due to other DImode operations splitting early. > > gcc/ > * config/arm/arm.md (maddsidi4): Remove expander. > (mulsidi3adddi): Remove pattern. > (mulsidi3adddi_v6): Likewise. > (mulsidi3_nov6): Likewise. > (mulsidi3_v6): Likewise. > (umulsidi3): Remove expander. > (umulsidi3_nov6): Remove pattern. > (umulsidi3_v6): Likewise. > (umulsidi3adddi): Likewise. > (umulsidi3adddi_v6): Likewise. > (<Us>mulsidi3): Add combined expander. > (<Us>maddsidi4): Likewise. > (<US>mull): Add combined umull and smull pattern. > (<US>mlal): Likewise. > * config/arm/iterators.md (Us): Add new iterator.
Is causing the linux kernel to fail to build for arm-linux-gnueabi. -O2 with this testcase: __extension__ typedef unsigned long long __u64; typedef __u64 u64; static inline unsigned long __timespec64_to_jiffies (u64 sec, long nsec) { return ((sec * ((unsigned long) ((((u64) 1000000000L << (32 - 7)) + ((1000000000L + 100 / 2) / 100) - 1) / (u64) ((1000000000L + 100 / 2) / 100)))) + (((u64) nsec * ((unsigned long) ((((u64) 1 << ((32 - 7) + 29)) + ((1000000000L + 100 / 2) / 100) - 1) / (u64) ((1000000000L + 100 / 2) / 100)))) >> (((32 - 7) + 29) - (32 - 7)))) >> (32 - 7); } unsigned long __timespec_to_jiffies (unsigned long sec, long nsec) { return __timespec64_to_jiffies ((u64) sec, nsec); } But I think your change has just exposed a latent bug in cse.c A bit of review of how CONST_INT nodes work. They are modeless which is a historical wart and tends to cause all kinds of not so fun problems. Primarily because the validity of any particular constant is context dependent. Essentially if the constant is used in a mode smaller than a HOST_WIDE_INT, then the constant must be sign extended from that mode to a HOST_WIDE_INT. So a node like (const_int 0xc8000000) is valid if it is used in a DImode context, but not in an SImode context (where it would have to be 0xffffffffc8000000). So with that background we have the following key insns: (insn 13 12 14 2 (set (reg:SI 124) (const_int -939524096 [0xffffffffc8000000])) "j.c":10:54 161 {*arm_movsi_insn} (nil)) (insn 14 13 16 2 (parallel [ (set (reg:SI 132) (plus:SI (mult:SI (zero_extend:DI (reg/v:SI 115 [ sec ])) (zero_extend:DI (reg:SI 124))) (reg:SI 130))) (set (reg:SI 133 [+4 ]) (plus:SI (truncate:SI (lshiftrt:DI (plus:DI (mult:DI (zero_extend:DI (reg/v:SI 115 [ sec ])) (zero_extend:DI (reg:SI 124))) (zero_extend:DI (reg:SI 130))) (const_int 32 [0x20]))) (reg:SI 131 [+4 ]))) ]) "j.c":10:54 60 {umlal} (expr_list:REG_DEAD (reg:SI 131 [+4 ]) (expr_list:REG_DEAD (reg:SI 130) (expr_list:REG_DEAD (reg:SI 124) (expr_list:REG_DEAD (reg/v:SI 115 [ sec ]) (nil)))))) So the (const_int -939524096) mode in insn 13 is absolutely correct. When we substitute it into insn 14 and simplify the zero_extend the form changes into (const_int 3355443200) since it's used in a DImode context. Things have already gone wrong at this point though, but let's follow things further to see what eventually happens. Eventually we want to simplify MULT and call simplify_binary_operation_1. code = MULT mode = E_SImode op0/trueop0 = (zero_extend:DI (reg/v:SI 115 [ sec ])) op1/trueop1 = (const_int 3355443200 [0xc8000000]) It's critical to note that MODE here is the mode of the *output operand*, so SImode is the right value. Inside simplify_binary_operation_1 we want to convert a multiply by a constant power of two into a shift via this code: > /* Convert multiply by constant power of two into shift. */ > if (CONST_SCALAR_INT_P (trueop1)) > { > val = wi::exact_log2 (rtx_mode_t (trueop1, mode)); > if (val >= 0) > return simplify_gen_binary (ASHIFT, mode, op0, > gen_int_shift_amount (mode, val)); > } And we promptly fault inside wi::exact_log2 because we passed it a constant (0x3355443200) that is not valid in the passed-in mode. This is a common failure mode due to the modeless nature of CONST_INTs. Sadly RTL does not constrain things enough that we can get the mode from the other operand (operands do not have to have the same mode). As it turns out cse.c::fold_rtx actually tries to do the right thing here (but gets it wrong): > /* Pick the least expensive of the argument and an equivalent constant > argument. */ > if (const_arg != 0 > && const_arg != folded_arg > && (COST_IN (const_arg, mode_arg, code, i) > <= COST_IN (folded_arg, mode_arg, code, i)) > > /* It's not safe to substitute the operand of a conversion > operator with a constant, as the conversion's identity > depends upon the mode of its operand. This optimization > is handled by the call to simplify_unary_operation. */ > && (GET_RTX_CLASS (code) != RTX_UNARY > || GET_MODE (const_arg) == mode_arg0 > || (code != ZERO_EXTEND > && code != SIGN_EXTEND > && code != TRUNCATE > && code != FLOAT_TRUNCATE > && code != FLOAT_EXTEND > && code != FLOAT > && code != FIX > && code != UNSIGNED_FLOAT > && code != UNSIGNED_FIX))) > folded_arg = const_arg; I'll continue tomorrow. Mostly I wanted to post the analysis to-date so that nobody ends up repeating it. Jeff