On Fri, Jan 29, 2016 at 06:40:21PM +0000, James Clarke wrote:
> Here is the description of the mcrfs instruction from the PowerPC Architecture
> Book, Version 2.02, Book I: PowerPC User Instruction Set Architecture
> (http://www.ibm.com/developerworks/systems/library/es-archguide-v2.html),
> found on page 120:
>
>     The contents of FPSCR field BFA are copied to Condition Register field BF.
>     All exception bits copied are set to 0 in the FPSCR. If the FX bit is
>     copied, it is set to 0 in the FPSCR.
>
>     Special Registers Altered:
>         CR field BF
>         FX OX                        (if BFA=0)
>         UX ZX XX VXSNAN              (if BFA=1)
>         VXISI VXIDI VXZDZ VXIMZ      (if BFA=2)
>         VXVC                         (if BFA=3)
>         VXSOFT VXSQRT VXCVI          (if BFA=5)
>
> However, currently every bit in FPSCR field BFA is set to 0, including ones
> not on that list.
>
> This can be seen in the following simple C program:
>
>     #include <fenv.h>
>     #include <stdio.h>
>
>     int main(int argc, char **argv) {
>         int ret;
>         ret = fegetround();
>         printf("Current rounding: %d\n", ret);
>         ret = fesetround(FE_UPWARD);
>         printf("Setting to FE_UPWARD (%d): %d\n", FE_UPWARD, ret);
>         ret = fegetround();
>         printf("Current rounding: %d\n", ret);
>         ret = fegetround();
>         printf("Current rounding: %d\n", ret);
>         return 0;
>     }
>
> which gave the output (before this commit):
>
>     Current rounding: 0
>     Setting to FE_UPWARD (2): 0
>     Current rounding: 2
>     Current rounding: 0
>
> instead of (after this commit):
>
>     Current rounding: 0
>     Setting to FE_UPWARD (2): 0
>     Current rounding: 2
>     Current rounding: 2
>
> The relevant disassembly is in fegetround(), which, on my system, is:
>
>     __GI___fegetround:
>     <+0>:   mcrfs  cr7, cr7
>     <+4>:   mfcr   r3
>     <+8>:   clrldi r3, r3, 62
>     <+12>:  blr
>
> What happens is that, the first time fegetround() is called, FPSCR field 7 is
> retrieved. However, because of the bug in mcrfs, the entirety of field 7 is
> set to 0, which includes the rounding mode.
>
> There are other issues this will fix, such as condition flags not persisting
> when they should if read, and if you were to read a specific field with some
> exception bits set, but no others were set in the entire register, then the
> bits would be cleared correctly, but FEX/VX would not be updated to 0 as they
> should be.
>
> Signed-off-by: James Clarke <[email protected]>

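For anyone following along, the masking rule quoted above can be sketched as a small standalone C model. This is only an illustration, not the QEMU code: mcrfs_model() and EX_CLEAR_BITS are hypothetical names, the mask value assumes the usual 32-bit FPSCR layout with FX as the most significant bit, and the FEX/VX recomputation that the patch delegates to helper_store_fpscr() is deliberately left out.

```c
#include <stdint.h>

/*
 * Assumed mask of the bits mcrfs may clear, using LSB-0 bit numbers:
 * FX (31), OX (28), UX/ZX/XX/VXSNAN (27-24), VXISI/VXIDI/VXZDZ/VXIMZ (23-20),
 * VXVC (19), VXSOFT/VXSQRT/VXCVI (10-8).
 */
#define EX_CLEAR_BITS UINT32_C(0x9FF80700)

/*
 * Hypothetical model of mcrfs: return the 4-bit FPSCR field selected by
 * bfa (0..7, field 0 being the most significant nibble) and clear only the
 * exception bits inside that field. FEX and VX are not recomputed here.
 */
static uint32_t mcrfs_model(uint32_t *fpscr, unsigned bfa)
{
    unsigned shift = 4 * (7 - bfa);
    uint32_t field_mask = UINT32_C(0xF) << shift;
    uint32_t crf = (*fpscr >> shift) & 0xF;

    /* Bits of the field that are not exception bits are left untouched. */
    *fpscr &= ~(field_mask & EX_CLEAR_BITS);
    return crf;
}
```

With fpscr = 0x2 (RN set to round upward), mcrfs_model(&fpscr, 7) returns 2 and leaves fpscr unchanged, since field 7 holds no clearable bits. That is the case the second fegetround() call in the test program depends on.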
Thanks for the fixup. It actually looks like helper_store_fpscr()
should really take a target_ulong instead of a u64, and have the (single)
caller which wants to pass a 64-bit value do the truncation. But that can
be a cleanup for another day.

Applied to ppc-for-2.6.

> ---
> target-ppc/cpu.h | 6 ++++++
> target-ppc/translate.c | 21 +++++++++++++++++----
> 2 files changed, 23 insertions(+), 4 deletions(-)
>
> diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
> index 3a967b7..d811bc9 100644
> --- a/target-ppc/cpu.h
> +++ b/target-ppc/cpu.h
> @@ -718,6 +718,12 @@ enum {
> #define FP_RN1 (1ull << FPSCR_RN1)
> #define FP_RN (1ull << FPSCR_RN)
>
> +/* the exception bits which can be cleared by mcrfs - includes FX */
> +#define FP_EX_CLEAR_BITS (FP_FX | FP_OX | FP_UX | FP_ZX | \
> + FP_XX | FP_VXSNAN | FP_VXISI | FP_VXIDI | \
> + FP_VXZDZ | FP_VXIMZ | FP_VXVC | FP_VXSOFT | \
> + FP_VXSQRT | FP_VXCVI)
> +
>
> /*****************************************************************************/
> /* Vector status and control register */
> #define VSCR_NJ 16 /* Vector non-java */
> diff --git a/target-ppc/translate.c b/target-ppc/translate.c
> index 4be7eaa..ca10bd1 100644
> --- a/target-ppc/translate.c
> +++ b/target-ppc/translate.c
> @@ -2500,18 +2500,31 @@ static void gen_fmrgow(DisasContext *ctx)
> static void gen_mcrfs(DisasContext *ctx)
> {
> TCGv tmp = tcg_temp_new();
> + TCGv_i32 tmask;
> + TCGv_i64 tnew_fpscr = tcg_temp_new_i64();
> int bfa;
> + int nibble;
> + int shift;
>
> if (unlikely(!ctx->fpu_enabled)) {
> gen_exception(ctx, POWERPC_EXCP_FPU);
> return;
> }
> - bfa = 4 * (7 - crfS(ctx->opcode));
> - tcg_gen_shri_tl(tmp, cpu_fpscr, bfa);
> + bfa = crfS(ctx->opcode);
> + nibble = 7 - bfa;
> + shift = 4 * nibble;
> + tcg_gen_shri_tl(tmp, cpu_fpscr, shift);
> tcg_gen_trunc_tl_i32(cpu_crf[crfD(ctx->opcode)], tmp);
> - tcg_temp_free(tmp);
> tcg_gen_andi_i32(cpu_crf[crfD(ctx->opcode)], cpu_crf[crfD(ctx->opcode)],
> 0xf);
> - tcg_gen_andi_tl(cpu_fpscr, cpu_fpscr, ~(0xF << bfa));
> + tcg_temp_free(tmp);
> + tcg_gen_extu_tl_i64(tnew_fpscr, cpu_fpscr);
> + /* Only the exception bits (including FX) should be cleared if read */
> + tcg_gen_andi_i64(tnew_fpscr, tnew_fpscr, ~((0xF << shift) & FP_EX_CLEAR_BITS));
> + /* FEX and VX need to be updated, so don't set fpscr directly */
> + tmask = tcg_const_i32(1 << nibble);
> + gen_helper_store_fpscr(cpu_env, tnew_fpscr, tmask);
> + tcg_temp_free_i32(tmask);
> + tcg_temp_free_i64(tnew_fpscr);
> }
>
> /* mffs */
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
