This is all good stuff. Thank you all for your insight.

Steve

On Mon, Apr 4, 2016 at 6:15 PM, Yonik Seeley <ysee...@gmail.com> wrote:

> On Mon, Apr 4, 2016 at 6:06 PM, Chris Hostetter
> <hossman_luc...@fucit.org> wrote:
> >
> > :
> > : Not sure I understand... _version_ is time based and hence will give
> > : roughly the same accuracy as something like
> > : TimestampUpdateProcessorFactory that you recommend below. Both
> >
> > Hmmm... last time i looked, i thought _version_ numbers were allocated &
> > incremented on a per-shard basis and "time" was only used for initial
> > seeding when the leader started up
>
> No, time is used for every version generated. Upper bits are
> milliseconds and lower bits are incremented only if needed for
> uniqueness in the shard (i.e. two documents indexed at the same
> millisecond). We have 20 lower bits, so one would need a sustained
> indexing rate of over 1M documents per millisecond (or 1B docs/sec) to
> introduce a permanent skew due to indexing.
>
> There is system clock skew between shards of course, but an update
> processor that added a date field would include that as well.
>
> The code in VersionInfo is:
>
>   public long getNewClock() {
>     synchronized (clockSync) {
>       long time = System.currentTimeMillis();
>       long result = time << 20;
>       if (result <= vclock) {
>         result = vclock + 1;
>       }
>       vclock = result;
>       return vclock;
>     }
>   }
>
>
> -Yonik
>
>
> > -- so in a stable system running for
> > a long time, if shardA gets significantly more updates than shardB the
> > _version_ numbers can get skewed and a new doc in shardB might be updated
> > with a _version_ less than the _version_ of a document added to shardA
> > well before that.
> >
> > But maybe I'm remembering wrong?
> >
> >
> >
> > -Hoss
> > http://www.lucidworks.com/
>
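
For anyone following along, here is a rough sketch of the bit layout described in the quoted getNewClock() code: milliseconds in the upper bits, a 20-bit tie-breaker in the lower bits. The class name VersionClockSketch and the helpers millisOf() and counterOf() are my own hypothetical names for illustration, not part of Solr, and the time is passed in as a parameter (instead of calling System.currentTimeMillis() inside) only so the behavior is easy to check.

```java
/** Hypothetical stand-in for the quoted clock logic; not Solr's actual VersionInfo class. */
final class VersionClockSketch {
    // Number of low bits reserved for the uniqueness counter, per the quoted code.
    private static final int COUNTER_BITS = 20;
    private static final long COUNTER_MASK = (1L << COUNTER_BITS) - 1;

    private final Object clockSync = new Object();
    private long vclock;

    /** Same steps as the quoted getNewClock(), with the wall-clock time supplied by the caller. */
    long newVersion(long nowMillis) {
        synchronized (clockSync) {
            long result = nowMillis << COUNTER_BITS;
            if (result <= vclock) {
                // Same millisecond (or the clock went backwards): bump the low bits instead.
                result = vclock + 1;
            }
            vclock = result;
            return vclock;
        }
    }

    /** Upper bits: the millisecond timestamp the version was derived from. */
    static long millisOf(long version) {
        return version >>> COUNTER_BITS;
    }

    /** Lower 20 bits: how many times the version was bumped within that millisecond. */
    static long counterOf(long version) {
        return version & COUNTER_MASK;
    }

    public static void main(String[] args) {
        VersionClockSketch clock = new VersionClockSketch();
        long now = System.currentTimeMillis();
        long a = clock.newVersion(now);
        long b = clock.newVersion(now); // same millisecond, so only the low bits change
        System.out.println(millisOf(a) + " / " + counterOf(a));
        System.out.println(millisOf(b) + " / " + counterOf(b));
    }
}
```

With this layout the low bits only spill over into the millisecond bits after more than 2^20 (1,048,576) versions are handed out within a single millisecond, which is where the "over 1M documents per millisecond" figure comes from.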