Noble,

Thanks for the suggestion. The unfortunate thing is that we really don't
know ahead of time what sort of replication delay we're going to encounter
-- it could be one millisecond or it could be one hour. So, we end up
needing to do something like:

For delta-import run N:
1. query DB slave for "seconds_behind_master", use this to calculate
Date(N).
2. query DB slave for records updated since Date(N - 1)
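
To make the two steps above concrete, here is a minimal sketch of the bookkeeping in plain Java. It is only an illustration: the class name DeltaWatermark and the method advance are hypothetical and not part of DataImportHandler, and it assumes the lag has already been read from the slave as a whole number of seconds.

```java
import java.time.Instant;

// Hypothetical helper (not a DIH class): tracks Date(N) across delta-import runs.
final class DeltaWatermark {
    // Date(N - 1): the point up to which the slave was known to be current
    // at the start of the previous run.
    private Instant previous;

    DeltaWatermark(Instant initial) {
        this.previous = initial;
    }

    /**
     * Called at the start of run N.
     * Computes Date(N) = now - secondsBehindMaster, stores it for the next run,
     * and returns Date(N - 1), the lower bound for this run's delta query.
     */
    Instant advance(Instant now, long secondsBehindMaster) {
        if (secondsBehindMaster < 0) {
            throw new IllegalArgumentException("lag must be non-negative");
        }
        Instant lowerBound = previous;
        previous = now.minusSeconds(secondsBehindMaster);
        return lowerBound;
    }
}
```

The value returned by advance would be the one bound into the delta query in place of the clock-based last index time; how it gets exposed to DIH is the open question below.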

I see there are plugin points for EventListener classes (onImportStart,
onImportEnd). Would those be the right spot to calculate these dates so that
I could expose them to my custom function at query time?

Thanks.

--Gregg

On Wed, Jan 28, 2009 at 11:20 PM, Noble Paul നോബിള്‍ नोब्ळ् <
noble.p...@gmail.com> wrote:

> The problem you are trying to solve is that you cannot use
> ${dataimporter.last_index_time} as is. You may need something like
> ${dataimporter.last_index_time} - 3secs
>
> Am I right?
>
> There is no straightforward way to do this.
> 1) You can write your own function, say 'lastIndexMinus3Secs', and
> register it. Functions can be plugged in to DIH using a <function
> name="lastIndexMinus3Secs" class="foo.Foo"/> under the <dataConfig>
> tag, and you can then use it as
> ${dataimporter.functions.lastIndexMinus3Secs()}
> This adds to the existing built-in functions.
>
> http://wiki.apache.org/solr/DataImportHandler#head-5675e913396a42eb7c6c5d3c894ada5dadbb62d7
>
> The class must extend org.apache.solr.handler.dataimport.Evaluator
>
> We may add a standard function for this too. You can raise an issue.
> --Noble
>
>
>
> On Thu, Jan 29, 2009 at 6:26 AM, Gregg <gregg...@gmail.com> wrote:
> > I'd like to use the DataImportHandler running against a slave database that,
> > at any given time, may be significantly behind the master DB. This can cause
> > updates to be missed if you use the clock-time as the "last_index_time."
> > E.g., if the slave catches up to the master between two delta-imports.
> >
> > Has anyone run into this? In our non-DIH indexing system we get around this
> > by either using the slave DB's seconds-behind-master or the max last update
> > time of the records returned.
> >
> > Thanks.
> >
> > Gregg
> >
>
>
>
> --
> --Noble Paul
>
