Dennis Gearon [gear...@sbcglobal.net] wrote:
> Even microseconds may not be enough on some really good, fast machine.

True, especially since the timer might not provide microsecond granularity 
even though the returned value is in microseconds. However, a unique timestamp 
generator should keep track of the previous timestamp to guard against 
duplicates. Uniqueness can thus be guaranteed by waiting a bit or cheating on 
the decimals. With microseconds we can produce 1 million timestamps / second. 
While I agree that duplicates within a microsecond can occur on a fast machine, 
guaranteeing uniqueness by waiting should only be a performance problem when 
the number of duplicates is high. That's still a few years off, I think.
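
To make the "keep track of the previous timestamp" idea concrete, here is a 
minimal sketch of the "cheating on the decimals" variant. The class name 
UniqueMicros is made up for illustration, and I assume a millisecond clock 
scaled to microseconds, which is exactly the coarse-timer case above.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Sketch: strictly increasing microsecond timestamps within a single JVM. */
final class UniqueMicros {
    // Last value handed out, shared by all threads in this JVM.
    private static final AtomicLong LAST = new AtomicLong(Long.MIN_VALUE);

    private UniqueMicros() {}

    /** Returns a microsecond timestamp larger than any value returned before. */
    static long next() {
        // Assumption: the clock only has millisecond granularity, so we scale it.
        final long now = System.currentTimeMillis() * 1000L;
        // If the clock has not advanced past the previous value, bump by one
        // microsecond instead of waiting ("cheating on the decimals").
        return LAST.updateAndGet(prev -> Math.max(now, prev + 1));
    }

    public static void main(String[] args) {
        System.out.println(next());
        System.out.println(next());
    }
}
```

Under sustained load above 1 million requests / second the returned values 
drift ahead of the wall clock; the waiting variant avoids the drift at the 
cost of throughput.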

As Michael pointed out, using normal timestamps as unique IDs might not be such 
a great idea as it effectively locks index-building to a single JVM. This could 
be fixed by going the ugly route: express the time in nanos, but with only 
microsecond granularity, and use the last 3 digits for a builder ID. 
Not very clean though, as the contract is not expressed in the data themselves 
but must nevertheless be obeyed by all builders to avoid collisions. It also 
raises the question of who should assign the builder IDs. Not trivial in an 
anarchistic setup where new builders can be added by different controllers.
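
A rough sketch of that layout, assuming the builder IDs (0-999) have somehow 
been assigned without overlap. BuilderStamp and builderIdOf are names I made 
up for illustration, and the clock is again a millisecond clock scaled up.

```java
/** Sketch: nanosecond-scale IDs whose last 3 digits carry a builder ID. */
final class BuilderStamp {
    private final int builderId;              // 0..999, assumed unique per builder
    private long lastMicros = Long.MIN_VALUE; // last microsecond value handed out

    BuilderStamp(int builderId) {
        if (builderId < 0 || builderId > 999) {
            throw new IllegalArgumentException("builderId must be in 0..999");
        }
        this.builderId = builderId;
    }

    /** Returns micros * 1000 + builderId, unique as long as builder IDs are unique. */
    synchronized long next() {
        long nowMicros = System.currentTimeMillis() * 1000L;
        // Guard against duplicates within this builder by never going backwards.
        lastMicros = Math.max(nowMicros, lastMicros + 1);
        return lastMicros * 1000L + builderId;
    }

    /** Recovers the builder ID from the last 3 digits of an ID. */
    static int builderIdOf(long id) {
        return (int) (id % 1000L);
    }

    public static void main(String[] args) {
        BuilderStamp builder = new BuilderStamp(42);
        long id = builder.next();
        System.out.println(id + " came from builder " + builderIdOf(id));
    }
}
```

Nothing in the resulting long says that the last 3 digits are special, which 
is the unexpressed contract I am complaining about.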

Pragmatists might use the PID % 1000 or similar for the builder ID as it does 
not require coordination, but this is where the Birthday Paradox hits us again: 
The chance of two processes on different machines ending up with the same 
PID % 1000 is about 10% if just 15 machines are used (1% for 5 machines, about 
50% for 37 machines). I don't like those odds, and that's assuming that the 
PIDs are randomly distributed, which they aren't. The chance could be lowered 
by reserving more digits for the salt, but then we would decrease the maximum 
number of timestamps / second, still without guaranteed uniqueness. Guys a lot 
smarter than me have spent time on the unique ID problem and it's clearly not 
easy: Java's UUID takes up 128 bits.
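
For anyone who wants to check the odds, this is the standard birthday 
calculation as a small sketch. It assumes the builder IDs are drawn uniformly 
at random from 1000 values, which PIDs are not; BirthdayOdds is just an 
illustrative name.

```java
/** Sketch: chance that at least two of n builders draw the same random ID. */
final class BirthdayOdds {
    private BirthdayOdds() {}

    static double collisionProbability(int builders, int idSpace) {
        if (builders > idSpace) {
            return 1.0; // pigeonhole: a collision is certain
        }
        // Probability that all builders get distinct IDs:
        // (idSpace-1)/idSpace * (idSpace-2)/idSpace * ...
        double allDistinct = 1.0;
        for (int i = 1; i < builders; i++) {
            allDistinct *= (double) (idSpace - i) / idSpace;
        }
        return 1.0 - allDistinct;
    }

    public static void main(String[] args) {
        for (int n : new int[] {5, 15, 37}) {
            System.out.printf("%d machines: %.1f%%%n",
                    n, 100 * collisionProbability(n, 1000));
        }
    }
}
```

Reserving more digits for the salt just means calling it with a larger 
idSpace, e.g. collisionProbability(15, 1000000).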

- Toke
