On 4/20/2010 9:09 PM, caman wrote:
> Shawn,
> Is this your custom implementation?
> "For a delta-import, minDid comes from
> the maxDid value stored after the last successful import."
> Are you updating the dataTable after the import was successful? How did you
> handle this? I have a similar scenario and your approach will work for my
> use-case as well.

For safety, I do not use a DB user with write access. All of the
build infrastructure that I wrote (Perl scripts) lives on an NFS share that
all the hosts can reach. One of the directories under that share holds a
series of config files that guide the automation. The important one for
this is named minDid. The update script changes that config file when
the delta-import is successful, so that the machine with the cronjobs
(whichever host in the load balancer cluster is active) can access it
the next time around. Everything is centralized in this way because
there are multiple shards (and two different roles in the index as a
whole), so the update, delete, and rebuild scripts run in one place and
do remote operations via HTTP.
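
A minimal sketch of that file-based milestone, in Python rather than the Perl the real scripts use. The function names, the one-integer-per-file layout, and the do_import callback are all assumptions made for illustration, not the actual scripts. The point it shows is that the minDid file is only rewritten after the import reports success, so a failed run is retried from the same starting point.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def read_min_did(path: Path, default: int = 0) -> int:
    """Return the stored milestone, or `default` if the file does not exist yet."""
    try:
        return int(path.read_text().strip())
    except FileNotFoundError:
        return default


def write_min_did(path: Path, value: int) -> None:
    """Write to a temp file in the same directory, then rename it over the old file."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(f"{value}\n")
        os.replace(tmp_name, path)
    except BaseException:
        os.unlink(tmp_name)  # do not leave a stray temp file behind
        raise


def run_delta(path: Path, max_did: int, do_import: Callable[[int, int], bool]) -> bool:
    """Run one delta-import and advance the milestone only if it succeeded.

    `do_import(min_did, max_did)` is a hypothetical callback standing in for
    whatever actually triggers the import and checks its status.
    """
    min_did = read_min_did(path)
    if max_did <= min_did:
        return True  # nothing new to import
    if not do_import(min_did, max_did):
        return False  # leave the file untouched so the next run retries
    write_min_did(path, max_did)
    return True
```

Writing to a temporary file and renaming it means a reader on another host should see either the old value or the new one, never a half-written file. How strictly that holds on a given NFS setup is an assumption worth checking.
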
I believe it would be possible to use a hybrid of the regular DIH
method and mine: retrieve the milestone from the database with
DIH, but store it with an external process. Lance Norskog has asked
that I file a JIRA request to have DIH use arbitrary milestones, which I
think is a very good idea.
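
For the HTTP side, here is a rough sketch of how an external process could hand the milestone to DIH as request parameters. DIH makes request parameters available in the config as ${dataimporter.request.<name>}, so a query there could refer to ${dataimporter.request.minDid}. The parameter names follow this thread; the host, core name, and the /dataimport handler path are assumptions and depend on how the handler is registered in solrconfig.xml.

```python
from urllib.parse import urlencode


def delta_import_url(base_url: str, min_did: int, max_did: int) -> str:
    """Build a delta-import URL that carries the milestone as request parameters."""
    params = {
        "command": "delta-import",
        "minDid": min_did,  # parameter names assumed from the thread
        "maxDid": max_did,
    }
    # "/dataimport" is an assumed handler path, not a fixed part of Solr.
    return f"{base_url.rstrip('/')}/dataimport?{urlencode(params)}"


# Hypothetical host and core name, for illustration only.
url = delta_import_url("http://idx1.example.com:8983/solr/shard1", 100, 200)
```

The script would then fetch that URL on each shard and only store the new milestone once every import has reported success.
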