Richard Henderson <[email protected]> writes:
> On 07/23/2016 02:14 PM, Nikunj A Dadhania wrote:
>> Adding following instructions:
>>
>> moduw: Modulo Unsigned Word
>> modsw: Modulo Signed Word
>>
>> Signed-off-by: Nikunj A Dadhania <[email protected]>
>> ---
>> target-ppc/helper.h | 2 ++
>> target-ppc/int_helper.c | 15 +++++++++++++++
>> target-ppc/translate.c | 19 +++++++++++++++++++
>> 3 files changed, 36 insertions(+)
>>
>> diff --git a/target-ppc/helper.h b/target-ppc/helper.h
>> index 1f5cfd0..76072fd 100644
>> --- a/target-ppc/helper.h
>> +++ b/target-ppc/helper.h
>> @@ -41,6 +41,8 @@ DEF_HELPER_FLAGS_1(cntlzw, TCG_CALL_NO_RWG_SE, tl, tl)
>> DEF_HELPER_FLAGS_1(popcntb, TCG_CALL_NO_RWG_SE, tl, tl)
>> DEF_HELPER_FLAGS_1(popcntw, TCG_CALL_NO_RWG_SE, tl, tl)
>> DEF_HELPER_FLAGS_2(cmpb, TCG_CALL_NO_RWG_SE, tl, tl, tl)
>> +DEF_HELPER_FLAGS_2(modsw, TCG_CALL_NO_RWG_SE, i32, i32, i32)
>> +DEF_HELPER_FLAGS_2(moduw, TCG_CALL_NO_RWG_SE, i32, i32, i32)
>> DEF_HELPER_3(sraw, tl, env, tl, tl)
>> #if defined(TARGET_PPC64)
>> DEF_HELPER_FLAGS_1(cntlzd, TCG_CALL_NO_RWG_SE, tl, tl)
>> diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
>> index 7445376..631e0b4 100644
>> --- a/target-ppc/int_helper.c
>> +++ b/target-ppc/int_helper.c
>> @@ -139,6 +139,21 @@ uint64_t helper_divde(CPUPPCState *env, uint64_t rau,
>> uint64_t rbu, uint32_t oe)
>>
>> #endif
>>
>> +uint32_t helper_modsw(uint32_t rau, uint32_t rbu)
>> +{
>> +    int32_t ra = (int32_t) rau;
>> +    int32_t rb = (int32_t) rbu;
>> +
>> +    if ((rb == 0) || (ra == INT32_MIN && rb == -1)) {
>> +        return 0;
>> +    }
>> +    return ra % rb;
>> +}
>> +
>> +uint32_t helper_moduw(uint32_t ra, uint32_t rb)
>> +{
>> +    return rb ? ra % rb : 0;
>> +}
>
> I think, like you, I got distracted by the current div implementation in ppc.
> I've just re-read the spec and seen the "undefined" language. Which of course
> gives us much more freedom.
>
> With this freedom, we can do the division inline, without branches. Please see
> target-mips/translate.c, gen_r6_muldiv.
>
> Basically, we check for the offending cases and modify the divisor prior to the
> division. For unsigned:
>
>
> a / (b == 0 ? 1 : b)

Modulo case: a % (b == 0 ? 1 : b)

    /* t0 = a, t1 = b (low 32 bits of rA and rB) */
    tcg_gen_trunc_tl_i32(t0, cpu_gpr[rA(ctx->opcode)]);
    tcg_gen_trunc_tl_i32(t1, cpu_gpr[rB(ctx->opcode)]);
    /* t2 = (b == 0) */
    tcg_gen_setcondi_i32(TCG_COND_EQ, t2, t1, 0);
    tcg_gen_movi_i32(t3, 0);
    /* t1 = (t2 != 0) ? t2 : t1, so the divisor becomes 1 when b == 0 */
    tcg_gen_movcond_i32(TCG_COND_NE, t1, t2, t3, t2, t1);
    tcg_gen_remu_i32(t3, t0, t1);
    tcg_gen_extu_i32_tl(cpu_gpr[rD(ctx->opcode)], t3);
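To sanity-check the unsigned sequence, here is a minimal plain-C sketch of what it should compute. The names moduw_model and check_moduw_model are hypothetical. This only models the data flow for checking the logic and is not QEMU code, and a C compiler may still emit a branch for the select.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the unsigned TCG sequence; not part of QEMU. */
static uint32_t moduw_model(uint32_t a, uint32_t b)
{
    uint32_t is_zero = (b == 0);               /* setcond EQ: 1 when b == 0 */
    uint32_t divisor = is_zero ? is_zero : b;  /* movcond NE: 1 or b */

    return a % divisor;                        /* divisor is never 0 here */
}

/* Spot checks: a zero divisor yields 0, like helper_moduw in the patch. */
static void check_moduw_model(void)
{
    assert(moduw_model(7, 3) == 1);
    assert(moduw_model(7, 0) == 0);
    assert(moduw_model(UINT32_MAX, 2) == 1);
}
```

Since the architected result is undefined for a zero divisor, returning a % 1 == 0 there is just one convenient choice.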
> For signed:
>
> a / ((a == INT_MIN & b == -1) | (b == 0) ? 1 : b)

Modulo case: a % ((a == INT_MIN & b == -1) | (b == 0) ? 1 : b)

    /* t0 = a, t1 = b (low 32 bits of rA and rB) */
    tcg_gen_trunc_tl_i32(t0, cpu_gpr[rA(ctx->opcode)]);
    tcg_gen_trunc_tl_i32(t1, cpu_gpr[rB(ctx->opcode)]);
    /* t2 = (a == INT_MIN) & (b == -1) */
    tcg_gen_setcondi_i32(TCG_COND_EQ, t2, t0, INT_MIN);
    tcg_gen_setcondi_i32(TCG_COND_EQ, t3, t1, -1);
    tcg_gen_and_i32(t2, t2, t3);
    /* t2 |= (b == 0) */
    tcg_gen_setcondi_i32(TCG_COND_EQ, t3, t1, 0);
    tcg_gen_or_i32(t2, t2, t3);
    tcg_gen_movi_i32(t3, 0);
    /* t1 = (t2 != 0) ? t2 : t1, so the divisor becomes 1 in the bad cases */
    tcg_gen_movcond_i32(TCG_COND_NE, t1, t2, t3, t2, t1);
    tcg_gen_rem_i32(t3, t0, t1);
    tcg_gen_extu_i32_tl(cpu_gpr[rD(ctx->opcode)], t3);
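The signed sequence can be modelled the same way. Again, this is only a sketch with hypothetical names (modsw_model, check_modsw_model), not QEMU code. It assumes C99 or later, where % truncates toward zero, so the remainder takes the sign of the dividend.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the signed TCG sequence; not part of QEMU. */
static int32_t modsw_model(int32_t a, int32_t b)
{
    /* 1 for the two offending cases: INT32_MIN % -1 and division by zero */
    int32_t bad = ((a == INT32_MIN) & (b == -1)) | (b == 0);
    int32_t divisor = bad ? bad : b;   /* movcond NE: 1 or b */

    return a % divisor;                /* never traps for these inputs */
}

/* Spot checks: both offending cases yield 0, like helper_modsw. */
static void check_modsw_model(void)
{
    assert(modsw_model(-7, 3) == -1);
    assert(modsw_model(7, -3) == 1);
    assert(modsw_model(INT32_MIN, -1) == 0);
    assert(modsw_model(5, 0) == 0);
}
```

Replacing the divisor with 1 makes both offending cases return 0, which matches what helper_modsw in the patch returns for them.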
I think you were suggesting something like the above?

For "div[wd]o." we will have further cases to implement overflow.

Regards,
Nikunj