> Originally worked out by joshe@; this corrects the timing of the call to > fpuinit() for the primary CPU on amd64 so that the saved mask value is > initialized correctly.
Hmm, I'm not too happy about moving fpuinit() out of cpu_hatch() on amd64, while leaving npxinit() in there for i386. Also: > Index: sys/arch/amd64/amd64/cpu.c > =================================================================== > RCS file: /cvs/src/sys/arch/amd64/amd64/cpu.c,v > retrieving revision 1.40 > diff -u -p -r1.40 cpu.c > --- sys/arch/amd64/amd64/cpu.c 27 Nov 2010 13:03:04 -0000 1.40 > +++ sys/arch/amd64/amd64/cpu.c 25 Dec 2010 19:38:48 -0000 > @@ -103,6 +105,16 @@ fpuinit(struct cpu_info *ci) > { > lcr0(rcr0() & ~(CR0_EM|CR0_TS)); > fninit(); > + if (CPU_IS_PRIMARY(ci)) { > + struct fxsave64 fx __attribute__((aligned(16))); > + > + bzero(&fx, sizeof(fx)); > + fxsave(&fx); > + if (fx.fx_mxcsr_mask) > + fpu_mxcsr_mask = fx.fx_mxcsr_mask; > + else > + fpu_mxcsr_mask = __INITIAL_MXCSR_MASK__; > + } > lcr0(rcr0() | (CR0_TS)); > } This function will be run again upon resume. Now overwriting fpu_mxcsr_mask shouldn't hurt, but perhaps replacing: if (CPU_IS_PRIMARY(ci)) { with if (fpu_mxcsr_mask == 0) { would be better? > Index: sys/arch/i386/i386/process_machdep.c > =================================================================== > RCS file: /cvs/src/sys/arch/i386/i386/process_machdep.c,v > retrieving revision 1.25 > diff -u -p -r1.25 process_machdep.c > --- sys/arch/i386/i386/process_machdep.c 29 Sep 2010 15:11:31 -0000 > 1.25 > +++ sys/arch/i386/i386/process_machdep.c 25 Dec 2010 19:38:52 -0000 > @@ -308,6 +309,7 @@ process_write_fpregs(struct proc *p, str > /* XXX Yuck. */ > memcpy(&s87, regs, sizeof(*regs)); > process_s87_to_xmm(&s87, &frame->sv_xmm); > + frame->sv_xmm.sv_env.en_mxcsr &= fpu_mxcsr_mask; > } else > memcpy(&frame->sv_87, regs, sizeof(*regs)); Oh, this actually points out an interesting problem. The MXCSR register isn't initialized from data passed to us by userland. However, if a user calls ptrace(PT_SETFPREGS, ...) on a process that didn't use the FPU yet, it isn't quite clear what we initialize the XMM state to. While your approach will make sure we won't cause a segfault, we might still be leaking stuff from the kernel to userland. I think the diff below is a better way to address the issue. Index: process_machdep.c =================================================================== RCS file: /cvs/src/sys/arch/i386/i386/process_machdep.c,v retrieving revision 1.25 diff -u -p -r1.25 process_machdep.c --- process_machdep.c 29 Sep 2010 15:11:31 -0000 1.25 +++ process_machdep.c 26 Dec 2010 15:23:33 -0000 @@ -299,8 +299,15 @@ process_write_fpregs(struct proc *p, str #if NNPX > 0 npxsave_proc(p, 0); #endif - } else + } else { + /* + * Make sure MXCSR and the XMM registers are + * initialized to sane defaults. + */ + if (i386_use_fxsave) + process_fninit_xmm(&frame->sv_xmm); p->p_md.md_flags |= MDP_USEDFPU; + } if (i386_use_fxsave) { struct save87 s87;