On 8/7/2015 11:48 AM, Shawn Heisey <apa...@elyograg.org> wrote:
>  On 8/7/2015 8:56 AM, Davis, Daniel (NIH/NLM) [C] wrote:
> > ... snip... 
> > Each document has an id I wish to use as the unique ID, but I also want to 
> > compute a signature.   Is there some way I can use an
> > updateRequestProcessorChain to throw away a document if its signature and 
> > document id match based on real-time get?
> 
> My main Solr indexes are each generated from a MySQL database.  One contains 
> over 100 million rows, another over 200 million.  
> A third contains about 18 million.  Here's how we handle the requirement you 
> asked about:
> 
> The main table has a delete id column that is its primary key.  This is an 
> autoincrement column.  There is another unique index 
> on another column in that table, which is the canonical unique identifier, 
> used as Solr's uniqueKey.
> 
> The main table also has triggers for DELETE and UPDATE which add records to 
> the idx_delete table (contains delete id values) 
> and idx_reinsert table (contains unique key values).  These extra tables each 
> have a primary key on an autoincrement column.
> 
> The build program (written in Java using SolrJ) tracks three values for every 
> update -- the last did value in the main table, and 
> the last id value in idx_delete and idx_reinsert.

Thanks, Shawn - this is a better solution, and I've used something similar with
PostgreSQL in the past. I don't control the schema, but I can make the
suggestion.
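
For anyone reading this in the archive, here is a minimal sketch of the scheme
as described above. It is not Shawn's actual build program (that is Java with
SolrJ against MySQL). It assumes SQLite in place of MySQL and a plain dict in
place of the Solr index. It reads "did" as the name of the delete id column,
and it assumes the DELETE trigger feeds idx_delete and the UPDATE trigger
feeds idx_reinsert. The names main, ukey, body, build_schema, and apply_update
are hypothetical.

```python
import sqlite3


def build_schema(conn):
    """Create the main table, the two tracking tables, and the triggers."""
    conn.executescript("""
    CREATE TABLE main (
        did  INTEGER PRIMARY KEY AUTOINCREMENT,  -- delete id (assumed name)
        ukey TEXT NOT NULL UNIQUE,               -- canonical id, Solr uniqueKey
        body TEXT
    );
    CREATE TABLE idx_delete (
        id  INTEGER PRIMARY KEY AUTOINCREMENT,
        did INTEGER NOT NULL                     -- delete id of the removed row
    );
    CREATE TABLE idx_reinsert (
        id   INTEGER PRIMARY KEY AUTOINCREMENT,
        ukey TEXT NOT NULL                       -- unique key of the changed row
    );
    -- Assumption: deletes are logged by did, updates by unique key.
    CREATE TRIGGER main_del AFTER DELETE ON main
    BEGIN
        INSERT INTO idx_delete (did) VALUES (OLD.did);
    END;
    CREATE TRIGGER main_upd AFTER UPDATE ON main
    BEGIN
        INSERT INTO idx_reinsert (ukey) VALUES (NEW.ukey);
    END;
    """)


def apply_update(conn, index, pos):
    """Apply everything past the three tracked positions; return new positions.

    `index` is a dict standing in for the Solr index: ukey -> {"did", "body"}.
    `pos` holds the last seen main.did, idx_delete.id and idx_reinsert.id.
    """
    new_rows = conn.execute(
        "SELECT did, ukey, body FROM main WHERE did > ? ORDER BY did",
        (pos["main"],)).fetchall()
    deletes = conn.execute(
        "SELECT id, did FROM idx_delete WHERE id > ? ORDER BY id",
        (pos["delete"],)).fetchall()
    reinserts = conn.execute(
        "SELECT id, ukey FROM idx_reinsert WHERE id > ? ORDER BY id",
        (pos["reinsert"],)).fetchall()

    # 1. Drop documents whose delete id was logged by the DELETE trigger.
    gone = {did for _, did in deletes}
    for key in [k for k, doc in index.items() if doc["did"] in gone]:
        del index[key]

    # 2. Re-fetch and re-index rows logged by the UPDATE trigger.
    for _, key in reinserts:
        row = conn.execute(
            "SELECT did, ukey, body FROM main WHERE ukey = ?", (key,)).fetchone()
        if row is not None:  # the row may have been deleted since the update
            index[row[1]] = {"did": row[0], "body": row[2]}

    # 3. Index rows added since the last run.
    for did, ukey, body in new_rows:
        index[ukey] = {"did": did, "body": body}

    return {
        "main": new_rows[-1][0] if new_rows else pos["main"],
        "delete": deletes[-1][0] if deletes else pos["delete"],
        "reinsert": reinserts[-1][0] if reinserts else pos["reinsert"],
    }
```

Each run starts from the positions returned by the previous run, so unchanged
rows are never read again, which is the part that replaces the
signature-plus-real-time-get check I originally asked about.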
