Hmm,

1 us accuracy, whereas the CPU has a hardware counter for all of this.

To be honest, I find 1 microsecond very inaccurate now that network cards have latencies close to that.

Let's assume now a simple example of 2 nodes.

Node A and node B.

Node A has time X.
Node A ships time X to B.

Then we do a loop.

Node A ships data to B and B responds to A.

A then measures the two-way ping-pong latency and,
based on that, gives B a new time X'.

Nowadays network cards need a microsecond or two for this.

Doing that a couple of thousand times, we should get a fairly accurate
time in B, far more accurate than 1 microsecond, as the deviation in the
one-way ping-pong latency isn't very big. It's quite constant.

Only the deviation of that latency limits the accuracy with which you can
synchronize the clocks.
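
The loop above can be sketched as a small Python simulation. This is not real network code. The function name estimate_offset, the 1 us one-way latency, the 0.05 us jitter, and the assumptions that both directions have the same average latency and that B replies immediately are all made up for illustration.

```python
import random
import statistics


def estimate_offset(true_offset_us, one_way_us=1.0, jitter_us=0.05,
                    rounds=2000, seed=0):
    """Simulate an A <-> B ping-pong loop.

    Returns (mean, stdev) of the per-round estimates of how far B's clock
    is ahead of A's clock, in microseconds. All parameters are hypothetical.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(rounds):
        t_send_a = rng.uniform(0.0, 1e6)            # A's clock when it sends
        fwd = one_way_us + rng.gauss(0.0, jitter_us)   # latency A -> B
        back = one_way_us + rng.gauss(0.0, jitter_us)  # latency B -> A
        # B stamps the arrival with its own (offset) clock and replies at once.
        t_recv_b = t_send_a + fwd + true_offset_us
        # A sees the reply on its own clock and measures the round trip.
        t_back_a = t_send_a + fwd + back
        rtt = t_back_a - t_send_a
        # Assumption: the one-way latency is half the round trip.
        samples.append(t_recv_b - (t_send_a + rtt / 2.0))
    return statistics.mean(samples), statistics.stdev(samples)
```

Calling estimate_offset(37.5) returns an averaged offset and the spread of the individual rounds. In this model the per-round error is half the difference between the two one-way latencies, so only the jitter matters, not the latency itself, and averaging over the rounds shrinks the error further. A constant asymmetry between the two directions would not average out.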

Now this is a simple 2 node example. It is of course possible for a
cluster to use the measurements of many nodes and synchronize to those,
just as the coordinate calculation for GPS uses several satellites.
Using many nodes gets the average error down. Of course, to synchronize
many nodes, each node uses its own clock as a new 'source' of
measurement; if for the synchronization accuracy we always assume the
same clock from node A, then getting the error down is a lot tougher.
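
A rough way to see the effect of using many nodes, again as a Python simulation under loud assumptions: each peer gives an offset estimate with its own independent, zero-mean error of 0.05 us (a made-up figure), and the node simply averages them. The name combined_error is hypothetical. The sketch does not model the case where every estimate depends on the same clock from node A, where the errors are no longer independent.

```python
import random
import statistics


def combined_error(num_peers, per_peer_sigma_us=0.05, trials=2000, seed=1):
    """Spread (in us) of the averaged offset error when a node combines
    independent estimates from num_peers peers. Parameters are hypothetical.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        # One independent, zero-mean error per peer (assumption).
        estimates = [rng.gauss(0.0, per_peer_sigma_us)
                     for _ in range(num_peers)]
        errors.append(statistics.mean(estimates))
    return statistics.pstdev(errors)
```

Comparing combined_error(1) with combined_error(16) shows the spread dropping by roughly the square root of the number of peers, which holds only as long as the errors really are independent.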

Vincent


On Sep 29, 2008, at 11:21 PM, Lombard, David N wrote:

On Mon, Sep 29, 2008 at 01:10:49PM -0700, Prentice Bisbal wrote:
In the previous thread I instigated about running services on cluster
nodes, there was some mention of precisely synchronizing the system
clocks, and this issue is also mentioned in this paper:

"The Case of the Missing Supercomputer Performance: Achieving Optimal
Performance on the 8,192 Processors of ASCI Q" (Petrini, Kerbyson and Pakin)
http://hpc.pnl.gov/people/fabrizio/papers/sc03_noise.pdf

I've also read a few other papers on the topic, and it seems you need to
sync the system clocks to ~1 us. On top of that, I imagine you also need
to sync the activities of each system so they all stop to do the same
system-level tasks at the same time.

The IEEE-1588 "Precision Time Protocol" can provide such levels of global clock
synchronization.

Shameless plug: See "Hardware Assisted Precision Time Protocol (PTP, IEEE-1588)
- Design and Case Study" presented at the recent LCI conference;
<http://www.linuxclustersinstitute.org/conferences/archive/2008/technicalpapers.html>

--

David N. Lombard, Intel, Irvine, CA
I do not speak for Intel Corporation; all comments are strictly my own.
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

