This is all good stuff.  Thank you all for your insight.

Steve

On Mon, Apr 4, 2016 at 6:15 PM, Yonik Seeley <ysee...@gmail.com> wrote:

> On Mon, Apr 4, 2016 at 6:06 PM, Chris Hostetter
> <hossman_luc...@fucit.org> wrote:
> > :
> > : Not sure I understand... _version_ is time based and hence will give
> > : roughly the same accuracy as something like
> > : TimestampUpdateProcessorFactory that you recommend below.  Both
> >
> > Hmmm... last time I looked, I thought _version_ numbers were allocated &
> > incremented on a per-shard basis and "time" was only used for initial
> > seeding when the leader started up
>
> No, time is used for every version generated.  Upper bits are
> milliseconds and lower bits are incremented only if needed for
> uniqueness in the shard (i.e. two documents indexed at the same
> millisecond).  We have 20 lower bits, so one would need a sustained
> indexing rate of over 1M documents per millisecond (or 1B docs/sec) to
> introduce a permanent skew due to indexing.
>
> There is system clock skew between shards of course, but an update
> processor that added a date field would include that as well.
>
> The code in VersionInfo is:
>
> public long getNewClock() {
>   synchronized (clockSync) {
>     long time = System.currentTimeMillis();
>     long result = time << 20;
>     if (result <= vclock) {
>       result = vclock + 1;
>     }
>     vclock = result;
>     return vclock;
>   }
> }
>
>
> -Yonik
>
> > -- so in a stable system running for
> > a long time, if shardA gets significantly more updates than shardB the
> > _version_ numbers can get skewed and a new doc in shardB might be updated
> > with a _version_ less than the _version_ of a document added to shardA
> > well before that.
> >
> > But maybe I'm remembering wrong?
> >
> >
> >
> > -Hoss
> > http://www.lucidworks.com/
>

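For anyone who wants to see the bit layout Yonik describes, here is a minimal sketch based on the getNewClock() snippet quoted above. It is not the actual VersionInfo code: the class name, the helper methods, and passing the time in as a parameter (instead of calling System.currentTimeMillis()) are my own assumptions, made only so the behavior is easy to check.

```java
// Illustrative sketch only; class and helper names are hypothetical, not Solr API.
class VersionClockSketch {
    // Lower bits, bumped only to break ties within the same millisecond.
    static final int COUNTER_BITS = 20;

    private final Object clockSync = new Object();
    private long vclock;

    // Same logic as the quoted getNewClock(), with the clock reading passed in.
    long newVersion(long timeMillis) {
        synchronized (clockSync) {
            long result = timeMillis << COUNTER_BITS;
            if (result <= vclock) {
                // Same (or earlier) millisecond: fall back to incrementing.
                result = vclock + 1;
            }
            vclock = result;
            return vclock;
        }
    }

    // Upper bits: the millisecond timestamp the version was derived from.
    static long millisOf(long version) {
        return version >>> COUNTER_BITS;
    }

    // Lower 20 bits: the tie-breaking counter within that millisecond.
    static long counterOf(long version) {
        return version & ((1L << COUNTER_BITS) - 1);
    }
}
```

Two versions generated in the same millisecond share the upper bits and differ only in the counter. Since 2^20 = 1,048,576, the counter would only spill into the timestamp bits if more than about 1M versions were generated in a single millisecond, which is the skew case mentioned above.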