On Mon, 29 Sep 2008, Lawrence Stewart wrote:
> > The IEEE-1588 "Precision Time Protocol" can provide such levels of
> > global clock synchronization.
>
> That's the one I was trying to remember, but I didn't compose a good
> query and couldn't find it.
>
> IIRC the NIC timestamps arriving packets right off the wire?  We have
> an on-chip logic analyzer gadget that can do that, but the
> synchronization problem we have is only to find one-time offsets, so
> we didn't need to go this deep.
Only a very few NICs add a timestamp at receive time, and the Linux kernel
doesn't have a portable way to extract those timestamps.  Even with a
hardware receive timestamp, the number is less useful (and less accurate)
than you might initially expect.  Some chunk of code really should correct
for inaccuracies in the NIC clock -- apply an offset and a linear drift
factor (a rough sketch of that correction is at the end of this message).
If you want real accuracy from the timestamp you also need to know whether
it represents the initial symbol, the header, the final byte or the
terminating symbol of the packet.

Oh, and the sending system has to synchronize transmission of the packet.
There is only one NIC I know of that had a "defer transmission until time
T" feature, and it appeared that no one had actually used or debugged that
feature.  (It appeared to be intended for low-rate, higher-priority
quasi-isochronous traffic, as it was a separate transmit queue.)

Back to the original topic: why is there a belief that we need accurate
time synchronization?

The paper referenced was:

> "The Case of the Missing Supercomputer Performance: Achieving Optimal
> Performance on the 8,192 Processors of ASCI Q"
> (Petrini, Kerbyson and Pakin)
> http://hpc.pnl.gov/people/fabrizio/papers/sc03_noise.pdf

If you read it you find that they started by suspecting the already-known
problem: that the performance hit they were seeing with large-node-count,
lock-step applications was because of scheduling "noise".  They were
running a bunch of daemons that were frequently waking up, doing a trivial
amount of work and going back to sleep.  Their first, too-simple tests
didn't confirm this.  Only when they re-wrote their tests to keep all of
the cores on a node busy were they able to accurately reproduce the effect
and confirm that it was indeed OS daemons (in their case, TruCluster and
Quadrics network control) causing the performance loss.

It's easy to mis-remember what the paper actually says.  They addressed
the problem by mapping processes so that nodes kept one core free for OS
daemons and random kernel work.  What the paper DOES NOT say is that you
need a globally synchronized clock to fix the problem.

They happened to have Quadrics, which had global synchronization
operations.  Given that large, expensive hammer it was natural to propose
using the network to synchronize the execution of the "noise" (junk jobs),
rather than re-think the need for running them at all.  IIRC, it was
companies such as Octiga Bay that actually implemented global-clock gang
scheduling of system daemons, again with a network that implemented global
synchronization operations (a sketch of that co-scheduling idea is also at
the end of this message).

-- 
Donald Becker                           [EMAIL PROTECTED]
Penguin Computing / Scyld Software
www.penguincomputing.com                www.scyld.com
Annapolis MD and San Francisco CA
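For concreteness, here is a minimal sketch of the offset-and-drift
correction mentioned above.  The structure, field names and calibration
values are illustrative assumptions, not any particular NIC's or driver's
interface:

    /*
     * Map a raw NIC receive-timestamp counter to host-clock nanoseconds
     * by applying a measured offset and a linear drift factor.
     * (Illustrative types and names only -- not a real driver interface.)
     */
    #include <stdint.h>

    struct nic_clock_cal {
        uint64_t cal_tick;     /* raw NIC counter value at calibration time */
        double   cal_host_ns;  /* host clock reading at that same instant */
        double   ns_per_tick;  /* nominal period of one NIC counter tick */
        double   drift;        /* fractional frequency error, e.g. 50e-6 = 50 ppm fast */
    };

    static double nic_ts_to_host_ns(const struct nic_clock_cal *cal,
                                    uint64_t raw_tick)
    {
        /* Elapsed time as the NIC clock sees it... */
        double nic_elapsed_ns =
            (double)(raw_tick - cal->cal_tick) * cal->ns_per_tick;

        /* ...scaled by the drift and added to the calibration offset. */
        return cal->cal_host_ns + nic_elapsed_ns / (1.0 + cal->drift);
    }

Even with that, the result is only as good as the calibration, and you
still need to know at which point in the packet the counter was latched.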
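And a sketch of the co-scheduling idea, assuming every node shares a
synchronized clock (here pretended to be CLOCK_REALTIME; the
Quadrics/Octiga Bay implementations used the network's global
synchronization primitives, not POSIX timers).  Each node defers its
housekeeping work to the same short slot, so a lock-step application loses
one slot per period on every node at once instead of being interrupted at
random times on each node:

    #define _POSIX_C_SOURCE 200112L
    #include <time.h>

    #define NOISE_PERIOD_NS  (100LL * 1000 * 1000)  /* one housekeeping slot every 100 ms */

    /* Sleep until the next period boundary of the (assumed) globally
     * synchronized clock, so that all nodes wake and run their daemons
     * at the same instant. */
    static void sleep_until_next_slot(void)
    {
        struct timespec now, next;
        clock_gettime(CLOCK_REALTIME, &now);

        long long ns = (long long)now.tv_sec * 1000000000LL + now.tv_nsec;
        long long boundary = (ns / NOISE_PERIOD_NS + 1) * NOISE_PERIOD_NS;

        next.tv_sec  = boundary / 1000000000LL;
        next.tv_nsec = boundary % 1000000000LL;
        clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME, &next, NULL);
    }

    /* A daemon's main loop would then look like
     *     for (;;) { sleep_until_next_slot(); do_housekeeping(); }
     * instead of waking on its own uncoordinated timer. */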