Your phrasing of the question may be confusing things -- you referred to 
"DocID" but it's not clear if you mean...

 * the low level internal lucene doc id
 * the uniqueKey field of your schema.xml
 * some identifier whose provenance you don't care about.

In the first case, you can use the doc transformer Jack mentioned, and you 
can even sort on the internal lucene id using "_docid_", but you can't 
really filter on it.

In the context of your specific problem, the internal lucene doc id 
wouldn't help you anyway, since internal lucene doc ids can change as 
segments get merged and deletes get flushed.

You can, however, use either of the latter two cases, along with an fq, to 
implement "cursor" style logic to ensure you never get the same document 
more than once.  Instead of increasing the "start" param, you just 
specify an fq param that filters on your id field using a range 
query, and you continually increase (or decrease) the boundary on the 
range based on the last document fetched.

using your previous example...
:         query                               return
:         start=0&rows=1                       A
:         start=1&rows=1                       B
:         start=2&rows=1                       C

you would instead do...

    start=0&rows=1&sort=id+asc                  A
    start=0&rows=1&sort=id+asc&fq=id:{A TO *]   B
    start=0&rows=1&sort=id+asc&fq=id:{B TO *]   C
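
As a rough sketch of how a client might build those requests, the helper 
below returns the params for the next page given the last id seen.  The 
function name "cursor_params" and the field name "id" are assumptions for 
illustration, and the id is assumed to need no query-syntax escaping; your 
own "q" and other params would be added alongside these.

```python
def cursor_params(last_id=None, rows=1, id_field="id"):
    """Build the paging params for the next request of an id-ordered walk.

    Hypothetical helper: 'start' always stays at 0, and the position in
    the result set is carried entirely by the fq range on the id field.
    """
    params = {"start": 0, "rows": rows, "sort": f"{id_field} asc"}
    if last_id is not None:
        # '{' makes the lower bound exclusive, so last_id itself is skipped;
        # ']' with '*' leaves the upper bound open.
        params["fq"] = f"{id_field}:{{{last_id} TO *]"
    return params


first = cursor_params()       # no fq on the very first request
second = cursor_params("A")   # fq=id:{A TO *]
third = cursor_params("B")    # fq=id:{B TO *]
```

The resulting dict can be passed through urllib.parse.urlencode (or to 
whatever HTTP client you use) to produce the query string.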

If you choose your id field such that the ids are always increasing (i.e. 
time based), then you can also be certain that you will always be able to 
fetch all documents (i.e. you will never miss a doc because you were 
already "past" its place in the ordered list of docs).
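
To illustrate that last point, here is a small in-memory simulation (no 
Solr involved).  The "fetch_page" function is only a stand-in for a 
request with sort=id asc and fq=id:{last_id TO *], and "walk" is a 
hypothetical client loop that adds one new doc to the index after each 
fetch.

```python
def fetch_page(index, last_id=None, rows=1):
    """Stand-in for a Solr request: sort=id asc, fq=id:{last_id TO *]."""
    ids = sorted(index)
    if last_id is not None:
        ids = [i for i in ids if i > last_id]  # exclusive lower bound
    return ids[:rows]


def walk(index, arrivals):
    """Walk the index one doc at a time.

    arrivals[n] is added to the index right after the n-th fetch, to
    mimic documents being indexed while the client is still paging.
    """
    seen, last_id, step = [], None, 0
    while True:
        page = fetch_page(index, last_id)
        if not page:
            break
        seen.extend(page)
        last_id = page[-1]
        if step < len(arrivals):
            index.add(arrivals[step])
        step += 1
    return seen


# New doc sorts after everything already fetched: nothing is missed.
increasing = walk({"A", "B", "C"}, ["D"])

# New doc sorts before the current boundary: the walk never sees it.
not_increasing = walk({"B", "C", "D"}, ["A"])
```

In the first walk every doc is returned exactly once; in the second, "A" 
arrives after the boundary has already moved past it and is skipped.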

-Hoss
