I have two questions:

1. I am pulling data from two data sources using the DIH, with the deltaQuery functionality. Because the data sources are polled sequentially, some data from my second data source gets re-indexed unnecessarily. Hopefully this helps illustrate my problem:
Assume last_index_time is 0.

At time = 1, data is pulled from data source 1 with a query that includes "last_modified > '${dataimporter.last_index_time}'". Note that this pulls data for the time interval [0,1]. This step takes 1 time interval.

At time = 2, data source 2 is polled with the same query. This step also takes 1 time interval. Note that this pulls data for the time interval [0,2].

At time = 3, last_index_time is set to 1.

The next time I run the DIH, I will be unnecessarily re-indexing data that appeared in data source 2 in the interval [1,2].

Ideally, I'd like to have access to something like ${dataimporter.current_index_time}, so I could restrict my delta query to:

    "last_modified > '${dataimporter.last_index_time}' AND last_modified < '${dataimporter.current_index_time}'"

Is this available?

2. I have a transient table that I query with the DIH to load my index. After loading values into the index, I want to delete them from the transient table. Is there a way to do this from the DIH? I tried stuffing a delete statement into the deltaQuery attribute, but that didn't work:

    <dataConfig>
      <dataSource driver="org.hsqldb.jdbcDriver" url="jdbc:hsqldb:/temp/example/ex" user="sa" />
      <document name="products">
        <entity name="item" pk="ID"
                query="select * from item"
                deltaQuery="select id from item where last_modified > '${dataimporter.last_index_time}'; delete from item where last_modified &lt; '${dataimporter.last_index_time}'">
        </entity>
      </document>
    </dataConfig>

-- 
View this message in context: http://www.nabble.com/DataImportHandler-current_index_time---post-completion-action-tp18498832p18498832.html
Sent from the Solr - User mailing list archive at Nabble.com.
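P.S. To make the timeline in question 1 concrete, here is a minimal Python sketch of the overlap. It is only a model of the scenario described above, not DIH code: the function names are made up for illustration, the second run is assumed to start at time 5, and the `bounded` flag stands in for the hypothetical ${dataimporter.current_index_time} upper bound, which may or may not exist.

```python
def windows(run_start, last_index_time, n_sources, bounded):
    """Intervals of last_modified values selected from each source in one run.

    Source i is polled at run_start + i (each poll takes one time unit).
    If bounded, the query also requires last_modified < run_start, which
    models the hypothetical ${dataimporter.current_index_time} restriction.
    """
    result = []
    for i in range(n_sources):
        poll_time = run_start + i
        upper = run_start if bounded else poll_time
        result.append((last_index_time, upper))
    return result


def overlap(a, b):
    """Return the intersection of two intervals, or None if it is empty."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo < hi else None


def reindexed(bounded):
    """Per-source interval that is fetched by both of two consecutive runs."""
    first = windows(run_start=1, last_index_time=0, n_sources=2, bounded=bounded)
    # After the first run, last_index_time becomes that run's start time (1).
    # The second run starting at time 5 is an arbitrary assumption.
    second = windows(run_start=5, last_index_time=1, n_sources=2, bounded=bounded)
    return [overlap(a, b) for a, b in zip(first, second)]


print(reindexed(bounded=False))  # [None, (1, 2)]
print(reindexed(bounded=True))   # [None, None]
```

Without the upper bound, data source 2 is fetched twice for the interval [1,2]; with it, neither source overlaps between runs.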