I guess SOLR-1352 should solve all the problems with performance. I am currently working on it and hope to submit a patch soon.
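
Until then, the several-handlers setup from point 1 of the quoted thread can be driven concurrently from a small client script. The following is only a minimal sketch under stated assumptions: the handler names (`/dataimport-a`, `/dataimport-b`), the request parameter `source`, and the base URL are hypothetical and must match whatever is configured in solrconfig.xml and data-config. `command=full-import` and `clean` are the usual DIH request parameters; `clean=false` is set here because all imports end up in the same index and one import should not wipe what another has added.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode
from urllib.request import urlopen


def build_import_url(base_url, handler, params):
    """Build a DIH full-import URL for one handler.

    `params` holds the custom request parameters that the data-config
    is assumed to read (hypothetical names, e.g. {"source": "a"}).
    """
    # clean=false: do not delete existing documents before importing,
    # since several imports share one index.
    query = {"command": "full-import", "clean": "false"}
    query.update(params)
    return f"{base_url.rstrip('/')}/{handler.lstrip('/')}?{urlencode(query)}"


def http_get(url):
    """Send the request and return the HTTP status code."""
    with urlopen(url, timeout=30) as response:
        return response.status


def fire_imports(urls, fetch=http_get, max_workers=4):
    """Send one request per URL in parallel; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))


# Example (hypothetical handlers, one per parameter value):
# urls = [
#     build_import_url("http://localhost:8983/solr", "/dataimport-a", {"source": "a"}),
#     build_import_url("http://localhost:8983/solr", "/dataimport-b", {"source": "b"}),
# ]
# fire_imports(urls)
```

This only starts the imports in parallel. As Avlesh notes in the quoted thread, the handlers can still block each other when a Lucene commit is performed, so the speed-up may be smaller than the number of handlers suggests.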

On Thu, Nov 12, 2009 at 8:05 PM, Sascha Szott <sz...@zib.de> wrote:
> Hi Avlesh,
>
> Avlesh Singh wrote:
>>>
>>> 1. Is it considered as good practice to set up several DIH request
>>> handlers, one for each possible parameter value?
>>>
>> Nothing wrong with this. My assumption is that you want to do this to
>> speed up indexing. Each DIH instance would block all others, once a
>> Lucene commit for the former is performed.
>
> Thanks for this clarification.
>
>>> 2. In case the range of parameter values is broad, it's not convenient to
>>> define separate request handlers for each value. But this entails a
>>> limitation (as far as I see): It is not possible to fire several request
>>> to the same DIH handler (with different parameter values) at the same
>>> time.
>>>
>> Nope.
>>
>> I had done a similar exercise in my quest to write a
>> ParallelDataImportHandler. This thread might be of interest to you -
>> http://www.lucidimagination.com/search/document/a9b26ade46466ee/queries_regarding_a_paralleldataimporthandler.
>> Though there is a ticket in JIRA, I haven't been able to contribute this
>> back. If you think this is what you need, lemme know.
>
> Actually, I've already read this thread. In my opinion, both support for
> batch processing and multi-threading are important extensions of DIH's
> current capabilities, though issue SOLR-1352 mainly targets the latter. Is
> your PDIH implementation able to deal with batch processing right now?
>
> Best,
> Sascha
>
>> On Thu, Nov 12, 2009 at 6:35 AM, Sascha Szott <sz...@zib.de> wrote:
>>
>>> Hi all,
>>>
>>> I'm using the DIH in a parameterized way by passing request parameters
>>> that are used inside of my data-config. All imports end up in the same
>>> index.
>>>
>>> 1. Is it considered as good practice to set up several DIH request
>>> handlers, one for each possible parameter value?
>>>
>>> 2. In case the range of parameter values is broad, it's not convenient
>>> to define separate request handlers for each value. But this entails a
>>> limitation (as far as I see): It is not possible to fire several request
>>> to the same DIH handler (with different parameter values) at the same
>>> time. However, in case several request handlers would be used (as in
>>> 1.), concurrent requests (to the different handlers) are possible. So,
>>> how to overcome this limitation?
>>>
>>> Best,
>>> Sascha
>>>
>>
>
>

--
-----------------------------------------------------
Noble Paul | Principal Engineer| AOL | http://aol.com