On 05/04/2016 11:08 AM, Laurent Vivier wrote:
-void HELPER(divu)(CPUM68KState *env, uint32_t word)
-{
-    uint32_t num;
-    uint32_t den;
-    uint32_t quot;
-    uint32_t rem;
-
-    num = env->div1;
-    den = env->div2;
-    /* ??? This needs to make sure the throwing location is accurate. */
-    if (den == 0) {
-        raise_exception(env, EXCP_DIV0);
-    }
Because of the exception and the required branching, I question taking the division operations back inline. The throwing location is easily fixed: create a version of raise_exception that takes the GETPC() return address as an argument and uses cpu_loop_exit_restore.
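Something along these lines, perhaps (untested sketch; the raise_exception_ra name and the cpu lookup are only illustrative, not from the patch):

static void raise_exception_ra(CPUM68KState *env, int tt, uintptr_t raddr)
{
    CPUState *cs = CPU(m68k_env_get_cpu(env));

    cs->exception_index = tt;
    /* Unwind guest state to the faulting insn before delivering.  */
    cpu_loop_exit_restore(cs, raddr);
}

void HELPER(divu)(CPUM68KState *env, uint32_t word)
{
    ...
    if (den == 0) {
        /* GETPC() must be taken in the top-level helper itself.  */
        raise_exception_ra(env, EXCP_DIV0, GETPC());
    }
    ...
}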
@@ -1179,64 +1187,182 @@ DISAS_INSN(mulw)
 DISAS_INSN(divw)
 {
+    TCGLabel *l1;
+    TCGv t0, src;
+    TCGv quot, rem;
     int sign;
 
     sign = (insn & 0x100) != 0;
+
+    tcg_gen_movi_i32(QREG_CC_C, 0); /* C is always cleared, use as 0 */
+
+    /* dest.l / src.w */
+
+    SRC_EA(env, t0, OS_WORD, sign, NULL);
+
+    src = tcg_temp_local_new_i32();
+    tcg_gen_mov_i32(src, t0);
+    l1 = gen_new_label();
+    tcg_gen_brcondi_i32(TCG_COND_NE, src, 0, l1);
+    tcg_gen_movi_i32(QREG_PC, s->insn_pc);
+    gen_raise_exception(EXCP_DIV0);
+    gen_set_label(l1);
+
+    tcg_gen_movi_i32(QREG_CC_C, 0); /* C is always cleared, use as 0 */
Redundant store to CC_C? Is that simply so that the 0 is forward propagated within the second block by the tcg optimizer? If so, the comments could be clearer.
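If that is the intent, e.g.:

    tcg_gen_movi_i32(QREG_CC_C, 0); /* Redundant, but lets the optimizer see C == 0 in this block. */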
+    quot = tcg_temp_new();
+    rem = tcg_temp_new();
     if (sign) {
-        gen_helper_divs(cpu_env, tcg_const_i32(1));
+        tcg_gen_div_i32(quot, DREG(insn, 9), src);
+        tcg_gen_rem_i32(rem, DREG(insn, 9), src);
+        tcg_gen_ext16s_i32(QREG_CC_V, quot);
+        tcg_gen_movi_i32(src, -1);
+        tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_V,
+                            QREG_CC_V, quot,
+                            QREG_CC_C /* 0 */, src /* -1 */);
setcond + neg is better than movcond here.
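I.e. something like (sketch, reusing the operands above; this also drops the movi of -1 into src):

    tcg_gen_ext16s_i32(QREG_CC_V, quot);
    tcg_gen_setcond_i32(TCG_COND_NE, QREG_CC_V, QREG_CC_V, quot); /* 1 iff quot doesn't fit in 16 bits */
    tcg_gen_neg_i32(QREG_CC_V, QREG_CC_V);                        /* -1 on overflow, 0 otherwise */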
+        tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_V,
+                            QREG_CC_V, QREG_CC_C /* 0 */,
+                            QREG_CC_V /* 0 */, src /* -1 */);
Likewise.
+    /* result rem:quot */
+
+    tcg_gen_ext16u_i32(quot, quot);
+    tcg_gen_deposit_i32(quot, quot, rem, 16, 16);
The ext16u is redundant with the deposit.
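The deposit already overwrites bits 16..31 with rem and leaves the low half of quot untouched, so this alone should be enough (sketch):

    tcg_gen_deposit_i32(quot, quot, rem, 16, 16);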
+    tcg_temp_free(rem);
+
+    /* on overflow, operands and flags are unaffected */
+
+    tcg_gen_movcond_i32(TCG_COND_EQ, DREG(insn, 9),
+                        QREG_CC_V, QREG_CC_C /* zero */,
+                        quot, DREG(insn, 9));
+    tcg_gen_ext16s_i32(quot, quot);
+    tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_Z,
+                        QREG_CC_V, QREG_CC_C /* zero */,
+                        quot, QREG_CC_Z);
+    tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_N,
+                        QREG_CC_V, QREG_CC_C /* zero */,
+                        quot, QREG_CC_N);
+    tcg_temp_free(quot);
Interesting. The manual says "may", as in "may or may not". Was the existing behaviour, which had no early check for overflow, a bug in general, or only a bug when exactly emulating specific CPU models?
Anyway, this multiplicity of movcond is another reason to consider leaving the code out of line in a helper.
     set_cc_op(s, CC_OP_FLAGS);
This needs to happen before the branch, along with an update_cc_op, surely. Otherwise the store to CC_C doesn't necessarily have an effect on the computed flags.
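I.e. roughly (sketch of the ordering only; everything else as in the patch):

    set_cc_op(s, CC_OP_FLAGS);      /* make the QREG_CC_* stores below the live flags */
    tcg_gen_movi_i32(QREG_CC_C, 0); /* C is always cleared, use as 0 */
    ...
    update_cc_op(s);                /* flush cc_op before the possible exception exit */
    tcg_gen_brcondi_i32(TCG_COND_NE, src, 0, l1);
    tcg_gen_movi_i32(QREG_PC, s->insn_pc);
    gen_raise_exception(EXCP_DIV0);
    gen_set_label(l1);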
+    if (sign) {
+        tcg_gen_ext32s_i64(quot64, quot64);
+        tcg_gen_extrh_i64_i32(rem, quot64);
This is a long way around to simply extract the same sign bit out of quot, i.e.

    tcg_gen_sari_i32(rem, quot, 31);
+        tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_V,
+                            QREG_CC_V, rem,
+                            QREG_CC_C /* 0 */, minusone /* -1 */);
Again with setcond + neg vs movcond.

r~