> Date: Sun, 26 Dec 2010 21:57:21 -0800 > From: Philip Guenther <guent...@gmail.com> > > On Sun, Dec 26, 2010 at 7:24 AM, Mark Kettenis <mark.kette...@xs4all.nl> > wrote: > >> Originally worked out by joshe@; this corrects the timing of the call to > >> fpuinit() for the primary CPU on amd64 so that the saved mask value is > >> initialized correctly. > > > > Hmm, I'm not too happy about moving fpuinit() out of cpu_hatch() on > > amd64, while leaving npxinit() in there for i386. > > Well, on i386, npxinit() is called for the primary cpu when the npx is > attached, which is way after the secondary cpus are attached. Can we > assume it to be safe to call npxinit() before the npx device is > attached? It wouldn't surprise me to hear that it only works when npx > errors are reported using exceptions and not ISA interrupts...which > would explain why the early calls are fine on multi-CPU systems, > 'cause they all use exceptions.
Looks like you're fooled by another not-so-subtle difference between i386 and amd64. On amd64 the CPUs are spun up when they attach. On i386 that doesn't happen until we call cpu_boot_secondary_processors() in main(), which is long after npx(4) attaches and npxinit() gets called on the primary CPU. > So no, I don't think it's safe to move up the npxinit() call into > cpu_init(), unless you're suggesting that we delay initializing > secondary cpus until after the isa bus is probed (barf). The > consistent alternative would be to leave the fpuinit() call in > cpu_hatch() and call it for the primary cpu near the end of > init_x86_64(), that being the rough equivalent for the primary cpu. I think that is actually what I'd prefer. > >> + if (CPU_IS_PRIMARY(ci)) { > >> + struct fxsave64 fx __attribute__((aligned(16))); > >> + > >> + bzero(&fx, sizeof(fx)); > >> + fxsave(&fx); > >> + if (fx.fx_mxcsr_mask) > >> + fpu_mxcsr_mask = fx.fx_mxcsr_mask; > >> + else > >> + fpu_mxcsr_mask = __INITIAL_MXCSR_MASK__; > >> + } > > > > This function will be run again upon resume. Now overwriting > > fpu_mxcsr_mask shouldn't hurt, but perhaps replacing: > > > > if (CPU_IS_PRIMARY(ci)) { > > > > with > > > > if (fpu_mxcsr_mask == 0) { > > > > would be better? > > No. The primary cpu check is necessary, at least on i386, because > secondary cpus call npxinit() before the primary cpu *and* before > setting CR4_OSFXSR. See my comment above. > >> Index: sys/arch/i386/i386/process_machdep.c > >> =================================================================== > >> RCS file: /cvs/src/sys/arch/i386/i386/process_machdep.c,v > >> retrieving revision 1.25 > >> diff -u -p -r1.25 process_machdep.c > >> --- sys/arch/i386/i386/process_machdep.c 29 Sep 2010 15:11:31 -0000 > >> 1.25 > >> +++ sys/arch/i386/i386/process_machdep.c 25 Dec 2010 19:38:52 -0000 > >> @@ -308,6 +309,7 @@ process_write_fpregs(struct proc *p, str > >> /* XXX Yuck. */ > >> memcpy(&s87, regs, sizeof(*regs)); > >> process_s87_to_xmm(&s87, &frame->sv_xmm); > >> + frame->sv_xmm.sv_env.en_mxcsr &= fpu_mxcsr_mask; > >> } else > >> memcpy(&frame->sv_87, regs, sizeof(*regs)); > > > > Oh, this actually points out an interesting problem. The MXCSR > > register isn't initialized from data passed to us by userland. > > However, if a user calls ptrace(PT_SETFPREGS, ...) on a process that > > didn't use the FPU yet, it isn't quite clear what we initialize the > > XMM state to. While your approach will make sure we won't cause a > > segfault, we might still be leaking stuff from the kernel to userland. > > I think the diff below is a better way to address the issue. > > > > Index: process_machdep.c > > =================================================================== > > RCS file: /cvs/src/sys/arch/i386/i386/process_machdep.c,v > > retrieving revision 1.25 > > diff -u -p -r1.25 process_machdep.c > > --- process_machdep.c 29 Sep 2010 15:11:31 -0000 1.25 > > +++ process_machdep.c 26 Dec 2010 15:23:33 -0000 > > @@ -299,8 +299,15 @@ process_write_fpregs(struct proc *p, str > > #if NNPX > 0 > > npxsave_proc(p, 0); > > #endif > > - } else > > + } else { > > + /* > > + * Make sure MXCSR and the XMM registers are > > + * initialized to sane defaults. > > + */ > > + if (i386_use_fxsave) > > + process_fninit_xmm(&frame->sv_xmm); > > p->p_md.md_flags |= MDP_USEDFPU; > > + } > > > > if (i386_use_fxsave) { > > struct save87 s87; > > Yes, I agree. IMO, that can be committed separately (ok guenther@) > from this mxcsr_mask diff. Ok, done. > What this really points out, however, is that the masking added by my > diff was in the wrong function: it should have been in > process_write_xmmregs(), duh. I'll fix that in the next rev. Right.