Hmm,
1 us accuracy, while the CPU has a hardware counter for all this.
To be honest, I find 1 microsecond very inaccurate now that network
cards have latencies near that.
Let's take a simple example with 2 nodes, node A and node B.
Node A has time X.
Node A ships time X to B.
Then we loop:
Node A ships data to B and B responds to A.
A measures the time needed for the two-way ping-pong and, based on
that, gives B a new time X'.
Nowadays network cards need a microsecond or two for this.
Doing that a couple of thousand times, we should get a fairly accurate
time in B, far more accurate than 1 microsecond, as the deviation in
the one-way ping-pong latency isn't very big; it's quite constant.
Only the deviation of that latency limits the accuracy with which you
can synchronize the clocks.
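To make the 2 node loop concrete, here is a rough simulation sketch in
Python. Everything in it is an assumption for illustration: the 1 us
mean one-way latency, the Gaussian jitter of 0.05 us per leg, the
symmetric path, and the name simulate_two_node_sync. It shows B
averaging its corrections over a couple of thousand rounds, not how any
real NIC or protocol does it.

```python
import random
import statistics


def simulate_two_node_sync(rounds=2000, one_way_us=1.0, jitter_us=0.05, seed=1):
    """Return (final_error_us, per_round_spread_us) for B's clock estimate.

    Assumed, not measured: 1 us mean one-way latency, Gaussian jitter with
    a 0.05 us standard deviation on every leg, and a symmetric path.
    """
    rng = random.Random(seed)
    b_minus_a_us = 12345.678  # true clock offset, unknown to both nodes

    def leg():
        # One network traversal; latency can never be negative.
        return max(0.0, rng.gauss(one_way_us, jitter_us))

    corrections = []
    a_clock = 0.0
    for _ in range(rounds):
        # Ping-pong: A times the round trip on its own clock.
        rtt = leg() + leg()
        a_clock += rtt

        # A ships X' = its current time plus the estimated one-way latency.
        x_prime = a_clock + rtt / 2.0

        # The message carrying X' takes one more leg to reach B.
        travel = leg()
        b_clock_at_receipt = a_clock + travel + b_minus_a_us
        a_clock += travel

        # B notes how far its own clock is from the time A announced.
        corrections.append(x_prime - b_clock_at_receipt)

    # Averaging the corrections estimates -(B - A).
    estimated_b_minus_a = -statistics.mean(corrections)
    final_error = estimated_b_minus_a - b_minus_a_us
    spread = statistics.stdev(corrections)
    return final_error, spread


if __name__ == "__main__":
    error, spread = simulate_two_node_sync()
    print(f"single-round spread: {spread:.4f} us")
    print(f"error after averaging: {error:+.5f} us")
```

In this toy model the single-round spread depends only on the jitter,
not on the latency itself, and the averaged error shrinks as more
rounds are added.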
Now this is a simple 2 node example. A cluster can of course use the
measurements of many nodes and synchronize to those, just as the
coordinate calculation for GPS uses several satellites. Using many
nodes gets the average error down. Of course, to synchronize many
nodes, each node uses its own clock as a new 'source' of measurement;
if we always rely on the same clock from node A for the
synchronization accuracy, then getting the error down is a lot
tougher.
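A second rough sketch, again in Python, of why many sources help. It is
only my reading of the argument above, under loud assumptions: each
source's estimate carries Gaussian error with a 0.05 us standard
deviation, and "always the same clock from node A" is modelled as every
estimate sharing one common error. The name averaged_error_us is made
up for the example.

```python
import random
import statistics


def averaged_error_us(n_refs, trials=2000, sigma_us=0.05, shared=False, seed=2):
    """Spread (std dev, in us) of the averaged estimate over many trials.

    shared=False: every source contributes its own independent error.
    shared=True:  all estimates inherit the same error from one source
                  (a deliberately crude model of a single reference clock).
    """
    rng = random.Random(seed)
    averaged = []
    for _ in range(trials):
        if shared:
            common = rng.gauss(0.0, sigma_us)
            samples = [common] * n_refs
        else:
            samples = [rng.gauss(0.0, sigma_us) for _ in range(n_refs)]
        averaged.append(sum(samples) / n_refs)
    return statistics.pstdev(averaged)


if __name__ == "__main__":
    for n in (1, 4, 16):
        independent = averaged_error_us(n)
        common = averaged_error_us(n, shared=True)
        print(f"{n:2d} sources: independent {independent:.4f} us, "
              f"shared {common:.4f} us")
```

With independent sources the spread falls roughly as 1/sqrt(N) in this
model; with a shared error it does not fall at all, however many
estimates are averaged.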
Vincent
On Sep 29, 2008, at 11:21 PM, Lombard, David N wrote:
On Mon, Sep 29, 2008 at 01:10:49PM -0700, Prentice Bisbal wrote:
In the previous thread I instigated about running services in cluster
nodes, there was some mentioning of precisely synchronizing the system
clocks and this issue is also mentioned in this paper:
"The Case of Missing Supercomputer Performance: Achieving Optimal
Performance on the 8,192 processor ASCI Q" (Petrini, Kerbyson and
Pakin)
http://hpc.pnl.gov/people/fabrizio/papers/sc03_noise.pdf
I've also read a few other papers on the topic, and it seems you need
to sync the system clocks to ~1 us. On top of that, I imagine you also
need to synch the activities of each system so they all stop to do the
same system-level tasks at the same time.
The IEEE-1588 "Precision Time Protocol" can provide such levels of
global clock synchronization.
Shameless plug: See "Hardware Assisted Precision Time Protocol (PTP,
IEEE-1588) - Design and Case Study" presented at the recent LCI
conference;
<http://www.linuxclustersinstitute.org/conferences/archive/2008/technicalpapers.html>
--
David N. Lombard, Intel, Irvine, CA
I do not speak for Intel Corporation; all comments are strictly my own.
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf