On Mon, Apr 4, 2016 at 6:06 PM, Chris Hostetter <hossman_luc...@fucit.org> wrote:
> :
> : Not sure I understand... _version_ is time based and hence will give
> : roughly the same accuracy as something like
> : TimestampUpdateProcessorFactory that you recommend below. Both
>
> Hmmm... last time i looked, i thought _version_ numbers were allocated &
> incremented on a per-shard basis and "time" was only used for initial
> seeding when the leader started up

No, time is used for every version generated. Upper bits are milliseconds
and lower bits are incremented only if needed for uniqueness in the shard
(i.e. two documents indexed at the same millisecond). We have 20 lower
bits (2^20, or about 1M values per millisecond), so one would need a
sustained indexing rate of over 1M documents per millisecond (or 1B
docs/sec) to introduce a permanent skew due to indexing.

There is system clock skew between shards of course, but an update
processor that added a date field would include that as well.

The code in VersionInfo is:

    public long getNewClock() {
      synchronized (clockSync) {
        long time = System.currentTimeMillis();
        long result = time << 20;
        if (result <= vclock) {
          result = vclock + 1;
        }
        vclock = result;
        return vclock;
      }
    }

-Yonik

> -- so in a stable system running for
> a long time, if shardA gets significantly more updates than shardB the
> _version_ numbers can get skewed and a new doc in shardB might be updated
> with a _version_ less than the _version_ of a document added to shardA
> well before that.
>
> But maybe I'm remembering wrong?
>
>
>
> -Hoss
> http://www.lucidworks.com/
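
For anyone who wants to see the bit layout from the reply above in action, here is a minimal sketch. It repeats the getNewClock() logic, but with the time passed in as a parameter so the behavior can be checked deterministically. The class name VersionClockSketch and the helpers newClock, millisOf and counterOf are made up for illustration and are not part of Solr; the decoding also assumes a positive version value.

```java
import java.time.Instant;

// Illustrative only: a stand-alone copy of the clock logic quoted above,
// plus hypothetical helpers that split a version into its two parts.
public class VersionClockSketch {
    // Low-order bits reserved for the per-millisecond counter (20, per the email).
    static final int COUNTER_BITS = 20;
    static final long COUNTER_MASK = (1L << COUNTER_BITS) - 1;

    private final Object clockSync = new Object();
    private long vclock;

    // Same steps as getNewClock(), except the caller supplies the time.
    long newClock(long timeMillis) {
        synchronized (clockSync) {
            long result = timeMillis << COUNTER_BITS;
            if (result <= vclock) {
                // Same millisecond (or clock went backwards): bump by one.
                result = vclock + 1;
            }
            vclock = result;
            return vclock;
        }
    }

    // Upper bits: milliseconds since the epoch (assumes version > 0).
    static long millisOf(long version) {
        return version >>> COUNTER_BITS;
    }

    // Lower 20 bits: the uniqueness counter within that millisecond.
    static long counterOf(long version) {
        return version & COUNTER_MASK;
    }

    public static void main(String[] args) {
        VersionClockSketch clock = new VersionClockSketch();
        long now = System.currentTimeMillis();
        long v1 = clock.newClock(now);
        long v2 = clock.newClock(now); // same millisecond, so the counter increments
        System.out.println(Instant.ofEpochMilli(millisOf(v1)) + " counter=" + counterOf(v1));
        System.out.println(Instant.ofEpochMilli(millisOf(v2)) + " counter=" + counterOf(v2));
    }
}
```

Two versions generated in the same millisecond decode to the same timestamp and differ only in the counter, which is why the versions stay close to wall-clock time unless the counter overflows its 20 bits on a sustained basis.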