On Mon, Sep 06, 2010 at 01:09:39AM -0500, Philip Guenther wrote:
> The amd64 pmap keeps track of which cpu(s) a given pmap is in use on, so
> that when the pmap's page tables are modified it can send IPIs to the
> necessary processors to have them invalidate those pages in their caches.
> Unfortunately, there are two bugs in this: 1) the bitmaps used to do that
> aren't actually initialized to zero, so they get the 0xdeadbeef
> default fill, and 2) the cpu_fork() logic leaves the bit for the cpu that
> the parent was running on set in the child.
>
> Both of those result in many superfluous IPIs.  Fixing those reduces the
> number of shootpage and shootrange IPIs by a factor of 21 on my 4 CPU
> laptop, completely eliminating the IPI on over 7/8ths of the calls to
> those functions.
>
> The diff below does the initialization, stops setting the bit in
> pmap_activate unless it's the current proc, and adds a DIAGNOSTIC printf
> to pmap_destroy() to complain if a pmap's pm_cpus member isn't zero when
> it's being destroyed.  It also eliminates the pm_flags member of the pmap,
> which is unused.
>
>
> Philip Guenther

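For anyone following along without the source handy, the bookkeeping
described above can be sketched in userland C roughly as follows.  This is
not the kernel code or the diff: the struct, the function names, and the
64-bit width of the bitmap are all made up for illustration.  Only the
pm_cpus member name and the 0xdeadbeef fill come from the description above.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAXCPUS	64	/* assumed bitmap width, not taken from the kernel */

/* Hypothetical stand-in for the pmap; only pm_cpus is modelled. */
struct pmap_sketch {
	uint64_t pm_cpus;	/* bit N set: this pmap is in use on cpu N */
};

/* Bug 1 fix: start from an empty bitmap rather than allocator fill. */
static void
pmap_sketch_init(struct pmap_sketch *pm)
{
	pm->pm_cpus = 0;
}

/*
 * Bug 2 fix: only record the cpu when the pmap being activated belongs
 * to the proc that is actually running there (not a freshly forked child).
 */
static void
pmap_sketch_activate(struct pmap_sketch *pm, unsigned int cpu, int is_curproc)
{
	assert(cpu < SKETCH_MAXCPUS);
	if (is_curproc)
		pm->pm_cpus |= (uint64_t)1 << cpu;
}

/* Clear the cpu's bit when the pmap stops being used on it. */
static void
pmap_sketch_deactivate(struct pmap_sketch *pm, unsigned int cpu)
{
	assert(cpu < SKETCH_MAXCPUS);
	pm->pm_cpus &= ~((uint64_t)1 << cpu);
}

/*
 * Mask of cpus other than `self` that would be sent a shootdown IPI
 * when this pmap's page tables change.  Zero means no IPI at all.
 */
static uint64_t
pmap_sketch_ipi_targets(const struct pmap_sketch *pm, unsigned int self,
    unsigned int ncpus)
{
	uint64_t online;

	assert(self < SKETCH_MAXCPUS);
	online = (ncpus >= SKETCH_MAXCPUS) ?
	    ~(uint64_t)0 : (((uint64_t)1 << ncpus) - 1);
	return pm->pm_cpus & online & ~((uint64_t)1 << self);
}
```

With the bitmap left at 0xdeadbeef, the low four bits are all set, so on a
4 CPU machine every other cpu looks like a target.  After zeroing, a pmap
that is only in use on the local cpu has an empty target mask and needs no
IPI.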
Running a build cycle with this on my 6xAMD Phenom(tm) II X6 1055T box.
So far so good.

.... Ken