On Wed, Mar 08, 2006 at 10:40:38AM +0100, Andi Kleen wrote:
> On Wednesday 08 March 2006 03:38, Benjamin LaHaise wrote:
>
> > It's hardly that uncommon for pages to cross cachelines or for pages to
> > move around CPUs with networking.
>
> Data?

I posted a workload that shows this.  Anything that transmits over TCP on
a CPU other than the one the interrupt occurs on is going to hit that
behaviour for 7/8 of pages.

> > Please name some sort of benchmarks that show your concerns for decreased
> > performance.
>
> Anything that manipulates lots of data.

LMBench certainly doesn't look any worse.  In fact, things look slightly
better.  (2.6.16n is with WANT_PAGE_VIRTUAL, while 2.6.16O is without).

cd results && make summary percent 2>/dev/null | more
make[1]: Entering directory `/md0/root/LMbench2/results'

                 L M B E N C H  2 . 0   S U M M A R Y
                 ------------------------------------

Basic system parameters
----------------------------------------------------
Host                 OS Description              Mhz
--------- ------------- ----------------------- ----
cobra.kva Linux 2.6.16n        x86_64-linux-gnu 2997
cobra.kva Linux 2.6.16n        x86_64-linux-gnu 2997
cobra.kva Linux 2.6.16n        x86_64-linux-gnu 2997
cobra.kva Linux 2.6.16n        x86_64-linux-gnu 2997
cobra.kva Linux 2.6.16n        x86_64-linux-gnu 2997
cobra.kva Linux 2.6.16O        x86_64-linux-gnu 2997
cobra.kva Linux 2.6.16O        x86_64-linux-gnu 2997
cobra.kva Linux 2.6.16O        x86_64-linux-gnu 2997
cobra.kva Linux 2.6.16O        x86_64-linux-gnu 2997
cobra.kva Linux 2.6.16O        x86_64-linux-gnu 2997

Processor, Processes - times in microseconds - smaller is better
----------------------------------------------------------------
Host                 OS  Mhz null null      open selct sig  sig  fork exec sh
                             call  I/O stat clos TCP   inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ----- ---- ---- ---- ---- ----
cobra.kva Linux 2.6.16n 2997 0.23 0.30 1.45 2.70 8.974 0.35 3.19 111. 443. 1674
cobra.kva Linux 2.6.16n 2997 0.23 0.30 1.46 2.81 8.984 0.35 3.19 116. 443. 1679
cobra.kva Linux 2.6.16n 2997 0.23 0.30 1.45 2.74 9.761 0.35 3.17 112. 446. 1683
cobra.kva Linux 2.6.16n 2997 0.23 0.30 1.43 2.82 8.992 0.34 3.20 113. 444. 1683
cobra.kva Linux 2.6.16n 2997 0.23 0.30 1.45 2.83 8.975 0.35 3.18 112. 445. 1673
cobra.kva Linux 2.6.16O 2997 0.23 0.30 1.44 2.74 9.008 0.36 3.17 110. 443. 1674
cobra.kva Linux 2.6.16O 2997 0.23 0.30 1.45 2.75 9.003 0.37 3.19 112. 445. 1681
cobra.kva Linux 2.6.16O 2997 0.23 0.30 1.45 2.80 9.010 0.37 3.17 112. 452. 1689
cobra.kva Linux 2.6.16O 2997 0.23 0.30 1.45 2.75 9.007 0.36 3.15 112. 453. 1689
cobra.kva Linux 2.6.16O 2997 0.23 0.30 1.44 2.77 9.021 0.36 3.15 113. 448. 1686

Context switching - times in microseconds - smaller is better
-------------------------------------------------------------
Host                 OS 2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                        ctxsw  ctxsw  ctxsw  ctxsw  ctxsw   ctxsw   ctxsw
--------- ------------- ----- ------ ------ ------ ------ ------- -------
cobra.kva Linux 2.6.16n 2.360 3.7500 6.6100 4.6600   10.1 5.20000    15.3
cobra.kva Linux 2.6.16n 2.420 3.7600 6.6500 4.8500   10.5 5.41000    15.0
cobra.kva Linux 2.6.16n 2.400 3.7600 6.5900 4.7000 9.6600 5.37000    15.0
cobra.kva Linux 2.6.16n 2.400 3.7600 6.5600 4.6300 9.6100 5.81000    15.0
cobra.kva Linux 2.6.16n 2.420 3.8200 6.5800 4.8400   10.7 5.47000    14.7
cobra.kva Linux 2.6.16O 2.430 4.4800 7.2100 4.8500   10.4 5.91000    15.8
cobra.kva Linux 2.6.16O 2.460 4.3100 7.2400 4.9000   10.7 5.42000    15.9
cobra.kva Linux 2.6.16O 2.450 4.4800 7.2800 4.7000   10.1 5.20000    15.9
cobra.kva Linux 2.6.16O 2.460 4.3900 6.9900 4.7200 9.7400 5.65000    15.2
cobra.kva Linux 2.6.16O 2.450 4.4700 6.9400 4.8100 8.7100 5.52000    15.5

*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------------
Host                 OS 2p/0K  Pipe AF     UDP  RPC/   TCP  RPC/ TCP
                        ctxsw       UNIX         UDP         TCP conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
cobra.kva Linux 2.6.16n 2.360 8.228 12.8  12.7  21.7  17.6  29.0 43.6
cobra.kva Linux 2.6.16n 2.420 8.192 13.1  12.7  22.1  17.6  29.2 44.0
cobra.kva Linux 2.6.16n 2.400 8.163 13.0  12.7  22.4  17.6  29.3 43.9
cobra.kva Linux 2.6.16n 2.400 8.163 11.7  12.7  21.9  17.5  29.0 44.4
cobra.kva Linux 2.6.16n 2.420 8.211 13.1  12.7  22.3  17.6  29.5 44.4
cobra.kva Linux 2.6.16O 2.430 8.273 13.1  14.4  23.8  17.7  29.7 43.9
cobra.kva Linux 2.6.16O 2.460 8.284 10.6  14.2  24.1  17.7  29.6 43.8
cobra.kva Linux 2.6.16O 2.450 8.454 13.5  14.3  24.1  17.7  29.8 43.9
cobra.kva Linux 2.6.16O 2.460 8.245 10.7  14.2  24.3  17.7  29.9 44.1
cobra.kva Linux 2.6.16O 2.450 8.395 13.6  14.3  23.8  17.8  29.8 44.2

File & VM system latencies in microseconds - smaller is better
--------------------------------------------------------------
Host                 OS   0K File      10K File      Mmap    Prot    Page
                        Create Delete Create Delete  Latency Fault   Fault
--------- ------------- ------ ------ ------ ------  ------- -----   -----
cobra.kva Linux 2.6.16n   15.4   12.5   49.1   26.2   2500.0       1.00000
cobra.kva Linux 2.6.16n   15.4   12.5   50.9   26.2   2512.0 0.126 1.00000
cobra.kva Linux 2.6.16n   15.5   12.6   50.3   25.6   2507.0       1.00000
cobra.kva Linux 2.6.16n   15.5   12.5   50.8   26.2   2514.0       1.00000
cobra.kva Linux 2.6.16n   15.4   12.5   51.0   26.2   2507.0       1.00000
cobra.kva Linux 2.6.16O   15.3   12.5   51.4   26.7   2533.0       1.00000
cobra.kva Linux 2.6.16O   15.3   12.5   51.7   26.7   2509.0       1.00000
cobra.kva Linux 2.6.16O   15.3   12.5   51.6   26.7   2522.0       1.00000
cobra.kva Linux 2.6.16O   15.3   12.5   51.6   26.8   2542.0       1.00000
cobra.kva Linux 2.6.16O   15.8   12.5   49.5   26.8   2521.0       1.00000

*Local* Communication bandwidths in MB/s - bigger is better
-----------------------------------------------------------
Host                 OS Pipe AF    TCP  File   Mmap  Bcopy  Bcopy  Mem   Mem
                             UNIX      reread reread (libc) (hand) read write
--------- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
cobra.kva Linux 2.6.16n 1652 614K 1399 2365.9 4345.9 1369.6 1396.1 4347 1717.
cobra.kva Linux 2.6.16n 1654 615K 1383 2360.7 4347.5 1268.7 1303.2 4344 1900.
cobra.kva Linux 2.6.16n 1651 597K 1392 2361.1 4346.8 1257.1 1287.2 4346 1918.
cobra.kva Linux 2.6.16n 1608 593K 1372 2358.5 4348.1 1290.7 1328.7 4347 1960.
cobra.kva Linux 2.6.16n 1629 583K 1388 2354.8 4344.5 1291.5 1318.1 4346 1986.
cobra.kva Linux 2.6.16O 1625 657K 1394 2360.6 4346.1 1358.4 1352.0 4347 1740.
cobra.kva Linux 2.6.16O 1625 658K 1373 2349.9 4344.0 1361.3 1273.6 4347 1856.
cobra.kva Linux 2.6.16O 1657 585K 1370 2346.5 4344.5 1277.0 1299.7 4341 1934.
cobra.kva Linux 2.6.16O 1535 631K 1385 2343.8 4348.1 1278.9 1285.0 4347 1959.
cobra.kva Linux 2.6.16O 1621 605K 1368 2337.4 4342.7 1289.2 1295.0 4340 1982.

Memory latencies in nanoseconds - smaller is better
    (WARNING - may not be correct, check graphs)
---------------------------------------------------
Host                 OS   Mhz  L1 $   L2 $    Main mem    Guesses
--------- -------------  ---- ----- ------    --------    -------
cobra.kva Linux 2.6.16n  2997 1.336 9.3600        44.2
cobra.kva Linux 2.6.16n  2997 1.336 9.4000        44.2
cobra.kva Linux 2.6.16n  2997 1.336 9.3540        44.1
cobra.kva Linux 2.6.16n  2997 1.336 9.3820        44.2
cobra.kva Linux 2.6.16n  2997 1.336 9.3500        44.2
cobra.kva Linux 2.6.16O  2997 1.336 9.3680        44.2
cobra.kva Linux 2.6.16O  2997 1.336 9.3780        44.2
cobra.kva Linux 2.6.16O  2997 1.336 9.3710        44.2
cobra.kva Linux 2.6.16O  2997 1.336 9.3650        44.2
cobra.kva Linux 2.6.16O  2997 1.336 9.3580        44.1
make[1]: Leaving directory `/md0/root/LMbench2/results'

> > I've shown you one that gets improved, and I think the pages
> > not overlapping cachelines is only a good thing.
>
> I think increasing the working set and wasting lots of money for this is only
> a bad thing.
>
> > I know these things look like piddly little worthless optimizations
>
> In this case they look more like "make the big picture worse for some
> microbenchmark" to me.

You haven't come up with any data to support your position, let alone even
a specific benchmark that shows your concerns.  That sort of position is
unreasonable to have to argue against.  Please come up with a specific
benchmark addressing your concerns instead of this vague handwaving.

Besides, we're reducing the cache footprint for many users by only using
1 cacheline per struct page instead of the 2 for every 7 out of 8 struct
pages currently.  How many places in the kernel really do a linear walk of
the struct page array?
Aside from early boot initialization, I think the answer is 0.  Random is
far more likely, and this patch helps that usage model.

		-ben
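
The cacheline arithmetic in the message can be checked with a small sketch. None of the sizes below are stated in the thread; they are assumptions for illustration: a 56-byte struct page without the `virtual` field, 64 bytes with it (WANT_PAGE_VIRTUAL), 64-byte cachelines, and an array that starts on a cacheline boundary. The function name `straddle_stats` is hypothetical. Under those assumptions, 7 of every 8 entries start off a line boundary and 6 of those 8 span two lines, while the padded layout has none of either; the exact counts depend on the real struct layout and line size.

```python
def straddle_stats(struct_size, line_size=64, count=8):
    """Return (misaligned, straddling) counts for an array of structs.

    Assumes the array itself begins on a cacheline boundary.
    misaligned: elements whose first byte is not at the start of a line.
    straddling: elements whose first and last bytes fall in different lines.
    """
    misaligned = 0
    straddling = 0
    for i in range(count):
        start = i * struct_size          # offset of the first byte of element i
        end = start + struct_size - 1    # offset of the last byte of element i
        if start % line_size != 0:
            misaligned += 1
        if start // line_size != end // line_size:
            straddling += 1
    return misaligned, straddling


if __name__ == "__main__":
    # 56 and 64 are assumed sizes (without / with the extra pointer field).
    for size in (56, 64):
        mis, strad = straddle_stats(size)
        print(f"struct size {size}: {mis}/8 misaligned, {strad}/8 straddling")
```

Passing a different `line_size` (for example 128) shows how the fraction changes on hardware with larger lines.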