: Thank-you, that all sounds great. My assumption about documents being
: missed was something like this:
        ...
: In that situation D would always be missed, whether the cursorMark 'C or
: greater' or 'greater than B' (I'm not sure which it is in practice), simply
: because the cursorMark is the unique ID and the unique ID is not your first
: sort mechanism.

First off: nothing about your example would result in "the cursorMark is 
the unique ID" ... let's clear that misconception up right away:

Using Cursors requires a deterministic sort w/o any "ties" that can result 
in ambiguity.  For this reason (eliminating the ambiguity) it is necessary 
that the uniqueKey always be included in a sort -- but the cursorMark 
values that get computed are determined by *all* of the sort criteria used.
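
If it helps to see why the tie-breaker matters, here is a toy model in 
Python.  It is not Solr code: next_page(), walk() and the tuple used as a 
"mark" are stand-ins made up for illustration, and the only assumption is 
that a page consists of whatever sorts strictly after the last doc 
already returned.

```python
def next_page(docs, sort_key, mark, rows):
    """Return (page, new_mark): the first `rows` docs sorting strictly after `mark`.

    `mark` is None for the first request; otherwise it is the sort key of
    the last doc on the previous page (a stand-in for a real cursorMark).
    """
    ordered = sorted(docs, key=sort_key)
    if mark is not None:
        ordered = [d for d in ordered if sort_key(d) > mark]
    page = ordered[:rows]
    new_mark = sort_key(page[-1]) if page else mark
    return page, new_mark


def walk(docs, sort_key, rows):
    """Page through all docs and return the ids seen, in order."""
    seen, mark = [], None
    while True:
        page, mark = next_page(docs, sort_key, mark, rows)
        if not page:
            return seen
        seen.extend(d["id"] for d in page)


# X and Y share a timestamp, so sorting on timestamp alone has a tie.
docs = [{"id": "X", "timestamp": 5},
        {"id": "Y", "timestamp": 5},
        {"id": "Z", "timestamp": 9}]

ids_without_tiebreak = walk(docs, lambda d: (d["timestamp"],), rows=1)
ids_with_tiebreak = walk(docs, lambda d: (d["timestamp"], d["id"]), rows=1)
```

In this model the walk sorted on timestamp alone never sees "Y", because 
nothing distinguishes it from "X"; adding the id to the sort key removes 
the tie and all three docs come back.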

So let's revisit your example, but let's make sure we are explicit about 
everything involved:  

 * A,B,C,D are all uniqueKey values in the "id" field
 * 1,2,3.... are all time values in a "timestamp" field.
 * we're going to use a "sort=timestamp asc, id asc" param in this example
 * when we say "X(123)" we mean "Document with id 'X' which currently has 
   value '123' in the timestamp field"

Let's suppose that at the start of the example, all of the docs in your 
example, in sorted order, look like this...

  A(1), B(3), C(14), D(32)

A client uses our sort, along with cursorMark=* & rows=2.  That client 
will get back A(1) and B(3) as well as some nextCursorMark value of "$%^" 
(deliberately not using any letters or numbers so as not to mislead you 
into thinking the cursorMark value is an id or a timestamp -- it's 
neither, it's an encoded binary value that has no meaning to the client 
other than as a "mark" to send back to the server).

Now let's suppose that B & C are edited as you mention -- their new 
timestamp values must -- by definition -- be greater than D's existing 
timestamp value of "32" (otherwise it's not really a timestamp field).  So 
let's assume now that the total ordering of all our docs, using our sort, 
is:

  A(1), D(32), B(56), C(57)

After B & C are modified, the client makes a follow-up request using 
the same sort, rows=2, and cursorMark=$%^ (the nextCursorMark returned 
from the previous request).  The two documents the client will get this 
time are D(32) and B(56).

 - "D" will never be skipped.
 - "B" will be returned twice, because its timestamp 
   value was updated after it was fetched
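
The walk-through above can be replayed with a small Python model.  This 
is only a sketch under one assumption: that the mark stands for the sort 
values (timestamp, id) of the last doc returned, and that the next page 
is whatever sorts strictly after it.  fetch_page() and the tuple mark are 
hypothetical; a real cursorMark is opaque to the client.

```python
def fetch_page(docs, mark, rows):
    """Model one cursor request against `docs` (a dict of id -> timestamp).

    `mark` is None for cursorMark=*; otherwise it is the (timestamp, id)
    pair of the last doc on the previous page.
    """
    # sort=timestamp asc, id asc
    ordered = sorted((ts, doc_id) for doc_id, ts in docs.items())
    if mark is not None:
        ordered = [key for key in ordered if key > mark]
    page = ordered[:rows]
    next_mark = page[-1] if page else mark
    return [doc_id for _, doc_id in page], next_mark


docs = {"A": 1, "B": 3, "C": 14, "D": 32}

# First request: cursorMark=* & rows=2
page1, mark = fetch_page(docs, None, rows=2)

# B and C are edited, so they get newer timestamps than D.
docs["B"] = 56
docs["C"] = 57

# Follow-up request with the mark returned by the first one.
page2, mark = fetch_page(docs, mark, rows=2)
```

Here page1 holds A and B, and page2 holds D and B: "D" is not skipped, 
and "B" shows up a second time because it now sorts after the mark.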

Does that make sense?

You can try this out manually if you want to see it for yourself -- 
either using a "real" auto-assigned timestamp field, or just using a 
simple numeric field you set yourself when updating docs.
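
If you'd rather script the two requests than type them, something like 
this Python sketch builds the URLs.  The host and collection name in 
SELECT_URL are hypothetical, and cursor_query_url() is just a helper 
invented here; q, sort, rows and cursorMark are the request params used 
in the example above.  A real nextCursorMark may contain characters that 
need escaping, which is why the params go through urlencode.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint; substitute your own host and collection name.
SELECT_URL = "http://localhost:8983/solr/mycollection/select"


def cursor_query_url(cursor_mark, rows=2, sort="timestamp asc, id asc"):
    """Build the select URL for one page of a cursor walk."""
    params = {"q": "*:*", "sort": sort, "rows": rows,
              "cursorMark": cursor_mark}
    return SELECT_URL + "?" + urlencode(params)


# First request starts the cursor with the special "*" mark.
first_url = cursor_query_url("*")

# Pretend "$%^" came back as nextCursorMark; it gets escaped in the URL.
second_url = cursor_query_url("$%^")

# Decode the query string again to confirm the mark survives intact.
sent = parse_qs(urlsplit(second_url).query)
```

Fetch first_url, update a couple of docs, then fetch the second URL with 
the nextCursorMark you actually got back in place of "$%^".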



-Hoss
http://www.lucidworks.com/
