Re: DataImportHandler Robustness For Imports That Take A Long Time

2009-03-13 Thread Noble Paul നോബിള്‍ नोब्ळ्
Alternatively, you can do the commit yourself after marking in the DB: Context#getSolrCore().getUpdateHandler().commit(). Or, as you mentioned, you can use autocommit.

On Sat, Mar 14, 2009 at 12:31 AM, Chris Harris wrote:
> Wouldn't this approach get confused if there was an error that caused
> DIH

Re: DataImportHandler Robustness For Imports That Take A Long Time

2009-03-13 Thread Chris Harris
Wouldn't this approach get confused if there was an error that caused DIH to do a rollback? For example, suppose this happened:

* 1000 successful document adds
* The custom transformer saves some marker in the DB to signal that the above docs have been successfully indexed
* The next document add
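
To make the concern concrete, here is a minimal Python simulation of a marker getting ahead of the index. It is only a sketch: `FakeIndex`, `import_docs`, and all parameters are hypothetical names invented for illustration and are not Solr or DIH APIs. It assumes a rollback discards every uncommitted add, and it also shows the effect of committing whenever the marker is written, which is the remedy suggested elsewhere in this thread. Committing just before writing the marker is an ordering chosen for this sketch.

```python
class FakeIndex:
    """Stand-in for an index with pending and committed docs (not Solr's API)."""

    def __init__(self):
        self.committed = []
        self.pending = []

    def add(self, doc_id):
        self.pending.append(doc_id)

    def commit(self):
        # Make all pending adds durable.
        self.committed.extend(self.pending)
        self.pending.clear()

    def rollback(self):
        # Assumption: a rollback throws away everything not yet committed.
        self.pending.clear()


def import_docs(index, doc_ids, marker_every, fail_at, commit_with_marker):
    """Return the last marker written before the simulated failure."""
    marker = 0
    for count, doc_id in enumerate(doc_ids, start=1):
        if doc_id == fail_at:
            index.rollback()  # simulated error on this document
            return marker
        index.add(doc_id)
        if count % marker_every == 0:
            if commit_with_marker:
                index.commit()
            marker = doc_id  # hypothetical "indexed up to here" marker
    index.commit()
    return marker


# Marker written without a commit: the marker says 1000, the index holds nothing.
plain = FakeIndex()
print(import_docs(plain, range(1, 1501), 1000, 1001, False), len(plain.committed))
# prints: 1000 0

# Commit whenever the marker is written: marker and index agree.
safe = FakeIndex()
print(import_docs(safe, range(1, 1501), 1000, 1001, True), len(safe.committed))
# prints: 1000 1000
```

In the first run a resumed import would skip 1000 documents that were never committed; in the second, the rollback only loses work done after the last marker.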

Re: DataImportHandler Robustness For Imports That Take A Long Time

2009-03-09 Thread Noble Paul നോബിള്‍ नोब्ळ्
I recommend writing a simple transformer that writes an entry into the DB after every n documents (say 1000), and modifying your query to take that entry into account so that subsequent imports start from there. DIH does not write the last_index_time unless the import completes successfully.

On Tue, M

DataImportHandler Robustness For Imports That Take A Long Time

2009-03-09 Thread Chris Harris
I have a dataset (7M-ish docs, each of which is maybe 1-100K) that, with my current indexing process, takes a few days or maybe a week to put into Solr. I'm considering switching to indexing with the DataImportHandler, but I'm concerned about the impact of this on indexing robustness: If I u