Thanks, Shawn.

In your opinion, which is easier: writing the importer from scratch, or extending the DIH (for example, by adding the state, etc.)?

Yuval

On Thu, Apr 24, 2014 at 6:47 PM, Shawn Heisey <s...@elyograg.org> wrote:

> On 4/24/2014 9:24 AM, Yuval Dotan wrote:
>
>> I want to use the DIH component in order to import data from old
>> postgresql DB.
>> I want to be able to recover from errors and crashes.
>> If an error occurs I should be able to restart and continue indexing
>> from where it stopped.
>> Is the DIH good enough for my requirements ?
>> If not is it possible to extend one of its classes in order to support
>> the recovery?
>
> The entity in the Dataimport Handler (DIH) config has an "onError"
> attribute.
>
> http://wiki.apache.org/solr/DataImportHandler#Schema_for_the_data_config
> https://cwiki.apache.org/confluence/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler#UploadingStructuredDataStoreDatawiththeDataImportHandler-EntityProcessors
>
> But honestly, if you want a really robust Java program that indexes to
> Solr and does precisely what you want, you may be better off writing it
> yourself using SolrJ and JDBC. DIH is powerful and efficient, but when
> you write the program yourself, you can do anything you want with your
> data.
>
> You also have the possibility of resuming an import after a Solr crash.
> Because DIH is embedded in Solr and doesn't save any kind of state data
> about an import in progress, that's pretty much impossible with DIH.
> With a SolrJ program, you'd have to handle that yourself, but it would
> be *possible*.
>
> https://cwiki.apache.org/confluence/display/solr/Using+SolrJ
>
> Thanks,
> Shawn
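
To make the "handle that yourself" part concrete, here is a rough sketch of how a standalone importer might keep its own state. It is only an illustration under assumptions that are not from this thread: rows have a positive, increasing numeric id, and the names ResumableImport, RowSource, BatchSink and the checkpoint file are all hypothetical. RowSource stands in for a JDBC query and BatchSink for a SolrJ add followed by a commit; neither is a real API. The point is only the ordering: the checkpoint is written after a batch succeeds, so a restart continues from the last batch that was known to be indexed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Sketch of a resumable import loop; progress lives in a small checkpoint file. */
final class ResumableImport {

    /** Hypothetical stand-in for a JDBC query such as
     *  SELECT id FROM docs WHERE id > ? ORDER BY id LIMIT ?
     *  (a real implementation would wrap SQLException). */
    interface RowSource {
        List<Long> idsAfter(long lastId, int limit);
    }

    /** Hypothetical stand-in for sending a batch to Solr and committing it. */
    interface BatchSink {
        void indexAndCommit(List<Long> ids);
    }

    /** Returns the last recorded id, or 0 if no checkpoint file exists yet. */
    static long readCheckpoint(Path file) {
        if (!Files.exists(file)) {
            return 0L;
        }
        try {
            byte[] raw = Files.readAllBytes(file);
            return Long.parseLong(new String(raw, StandardCharsets.UTF_8).trim());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Overwrites the checkpoint file with the given id. */
    static void writeCheckpoint(Path file, long lastId) {
        try {
            Files.write(file, Long.toString(lastId).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Imports every row after the checkpoint; returns the rows indexed in this run. */
    static int run(RowSource source, BatchSink sink, Path checkpoint, int batchSize) {
        long lastId = readCheckpoint(checkpoint);
        int total = 0;
        while (true) {
            List<Long> batch = source.idsAfter(lastId, batchSize);
            if (batch.isEmpty()) {
                return total;
            }
            sink.indexAndCommit(batch);           // if this throws, the checkpoint is untouched
            lastId = batch.get(batch.size() - 1); // batches are assumed to be ordered by id
            writeCheckpoint(checkpoint, lastId);  // record progress only after success
            total += batch.size();
        }
    }
}
```

If the process dies between the commit and the checkpoint write, the next run sends that one batch again. That should be harmless as long as the Solr schema defines a uniqueKey, since re-adding a document with the same key replaces the earlier copy.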