However, you can create multiple DIH configs under a core/collection. You can run each of them in parallel and commit at the end.
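A minimal sketch of that sequence in Python, for illustration only. It just builds the request URLs: one full-import per handler with clean=false and commit=false, then a single explicit commit. The host, core name and handler names (/dih1, /dih2) are hypothetical; command, clean and commit are the usual DIH request parameters.

```python
from urllib.parse import urlencode

# Hypothetical values: adjust host, core and handler names to your setup.
BASE_URL = "http://localhost:8983/solr/mycore"
HANDLERS = ["/dih1", "/dih2"]


def import_url(base_url, handler):
    """URL that starts a full import without wiping or committing the index."""
    params = {"command": "full-import", "clean": "false", "commit": "false"}
    return f"{base_url}{handler}?{urlencode(params)}"


def status_url(base_url, handler):
    """URL that asks one handler for its current import status."""
    return f"{base_url}{handler}?{urlencode({'command': 'status'})}"


def commit_url(base_url):
    """URL for the single explicit commit sent after all imports finish."""
    return f"{base_url}/update?{urlencode({'commit': 'true'})}"


def plan(base_url, handlers):
    """Ordered list of requests: start every import, then commit once."""
    return [import_url(base_url, h) for h in handlers] + [commit_url(base_url)]


if __name__ == "__main__":
    for url in plan(BASE_URL, HANDLERS):
        print(url)
```

Each import request returns right away while the import keeps running in the background, so a real script would likely poll status_url() for every handler and send the commit only once none of them is still busy.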
SELECT * FROM existingtable WHERE column >= 1 AND column <= 2000;
SELECT * FROM existingtable WHERE column >= 2001 AND column <= 4000;

Something like that works for us to speed it up.

On Wed, Jan 25, 2017 at 4:01 PM, Davis, Daniel (NIH/NLM) [C] <daniel.da...@nih.gov> wrote:

> DIH is not multi-threaded, and so the idea of "queueing" up requests is a
> misnomer. You might be better off using something other than
> DataImportHandler.
> LogStash can pull what it calls "events" from a database and then push
> them into Solr, and you have some of the same row transformation
> capabilities that DataImportHandler has.
>
> This is also the bread and butter of ETL tools such as
> Kettle/Talend/MuleSoft/etc.
>
> That said, what I have done in the past is to take different streams of
> data and divide them into different requestHandlers, all using
> DataImportHandler.
> Each of these request handlers has its own context as to whether it is
> busy or not, and so each can be separately active/inactive.
>
>   <!-- Data Import Handler for Health Topics -->
>   <requestHandler name="/import/health-topics"
>                   class="solr.DataImportHandler">
>     <lst name="defaults">
>       <str name="config">health-topics-conf.xml</str>
>     </lst>
>   </requestHandler>
>
>   <!-- Data Import Handler for Drugs and Supplements -->
>   <requestHandler name="/import/drugs" class="solr.DataImportHandler">
>     <lst name="defaults">
>       <str name="config">drugs-conf.xml</str>
>     </lst>
>   </requestHandler>
>
> Both of the above are XML imports, but with database imports, I also
> once implemented a sort of multithreading by having 4 request handlers
> and 4 data-config files, each taking its own slice of the data:
>
> data-config-0.xml:
> ...
>   <entity name="medsite" dataSource="proddb" rootEntity="true"
>           query="SELECT * FROM (SELECT t.*, Mod(RowNum, 4) threadid FROM
>                  my_data_view t) WHERE threadid = 0"
>           transformer="TemplateTransformer,LogTransformer"
>           logTemplate="topic thread 0">
> ...
>
> data-config-1.xml:
> ...
>   <entity name="medsite" dataSource="proddb" rootEntity="true"
>           query="SELECT * FROM (SELECT t.*, Mod(RowNum, 4) threadid FROM
>                  my_data_view t) WHERE threadid = 1"
>           transformer="TemplateTransformer,LogTransformer"
>           logTemplate="topic thread 1" logLevel="debug">
> ...
>
> And so on...
>
> -----Original Message-----
> From: William Bell [mailto:billnb...@gmail.com]
> Sent: Wednesday, January 25, 2017 5:39 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Does DIH queues up requests
>
> What we do is:
>
> Run URL to delete *:*, but do not commit.
>
> 1. Kick off indexing on DIH1, clean=false, commit=false.
> 2. Kick off indexing on DIH2, clean=false, commit=false.
>
> Then we manually commit.
>
> On Wed, Jan 25, 2017 at 2:57 PM, Nkeet Shah <nkeet.s...@mathworks.com>
> wrote:
>
> > Hi,
> > I have a multi-threaded application that makes DIH requests to perform
> > indexing. What I could not gather from the documentation is whether
> > DIH requests are queued up.
> >
> > In essence, if I made a request to, say, DIH1 and it has accepted the
> > request and is working on the indexing, what would happen if another
> > request is made to the same DIH1? Will it be queued or rejected?
> >
> > Thanks
> > Ankit!
>
> --
> Bill Bell
> billnb...@gmail.com
> cell 720-256-8076

--
Bill Bell
billnb...@gmail.com
cell 720-256-8076