: I have learned Solr as a power user and written a couple of simple
: filters. I'm not a Lucene heavy. Where is this in Lucene? Is it the
: default? I don't remember Lucene having the notion of a unique id
: (primary key).
I can't answer that question (because Yonik's answer surprised me too), but
as for this one...
: In this merge code, with the latest Lucene 2.3, will the duplicates in
: solr/data1 override the records in solr/data0? Or the other way around?
Neither. Duplicate overwriting is done when adding individual documents;
when merging two indexes this logic doesn't come into play.
The easiest way I can think of to deal with this would be:
1) merge the indexes (using the existing IndexMerger)
2) iterate over a TermEnum for the uniqueKey field.
3) if any term has a docFreq > 1, delete all but the lowest (or
highest) docid for that term (which one you keep depends on the order
in which you merged the indexes)
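To make steps 2 and 3 concrete, here is a rough sketch of just the selection
logic in plain Java, with no Lucene dependency so it can be run on its own.
The class name DedupSketch, the method docsToDelete, and the keysByDocId
array are all made up for illustration. Against a real index you would walk
the uniqueKey terms with IndexReader.terms(Term), check TermEnum.docFreq(),
list the docids with IndexReader.termDocs(Term), and remove the losers with
IndexReader.deleteDocument(int). Treat this as an untested outline of the
idea, not a drop-in tool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the merged index: keysByDocId[i] is the
// uniqueKey value stored in the document with docid i.
class DedupSketch {

    // Returns the docids that should be deleted so that each uniqueKey
    // value survives exactly once.
    static List<Integer> docsToDelete(String[] keysByDocId, boolean keepHighest) {
        // term -> docids in ascending order; this plays the role of
        // TermEnum (the sorted terms) plus TermDocs (the docids per term).
        Map<String, List<Integer>> postings = new TreeMap<>();
        for (int docId = 0; docId < keysByDocId.length; docId++) {
            postings.computeIfAbsent(keysByDocId[docId], k -> new ArrayList<>())
                    .add(docId);
        }

        List<Integer> toDelete = new ArrayList<>();
        for (List<Integer> docs : postings.values()) {
            if (docs.size() <= 1) {
                continue; // docFreq == 1: no duplicate for this key
            }
            // Keep one copy: the highest or the lowest docid.
            int keepIndex = keepHighest ? docs.size() - 1 : 0;
            for (int i = 0; i < docs.size(); i++) {
                if (i != keepIndex) {
                    toDelete.add(docs.get(i));
                }
            }
        }
        return toDelete;
    }
}
```

Whether you pass keepHighest as true or false depends on which source index
you want to win: keeping the highest docid favors whichever index ended up
later in the merged docid order.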
BTW: Would you mind updating that wiki page with some more details based
on your experience once you get it working?
-Hoss