Instead of writing code, I’d fire up SQL Workbench/J, load the same JDBC driver that is being used in Solr, and run the query.
https://www.sql-workbench.eu

If that takes 3.5 hours, you have isolated the problem.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Aug 18, 2020, at 6:50 AM, David Hastings <hastings.recurs...@gmail.com> wrote:
>
> Another thing to mention is to make sure the indexer you build doesn't send
> commits until it's actually done. Made that mistake with some early
> in-house indexers.
>
> On Tue, Aug 18, 2020 at 9:38 AM Charlie Hull <char...@flax.co.uk> wrote:
>
>> 1. You could write some code to pull the items out of Mongo and dump
>> them to disk - if this is still slow, then it's Mongo that's the problem.
>> 2. Write a standalone indexer to replace DIH, it's single threaded and
>> deprecated anyway.
>> 3. Minor point - consider whether you need to index everything every
>> time or just the deltas.
>> 4. Upgrade Solr anyway, not for speed reasons but because that's a very
>> old version you're running.
>>
>> HTH
>>
>> Charlie
>>
>> On 17/08/2020 19:22, Abhijit Pawar wrote:
>>> Hello,
>>>
>>> We are indexing some 200K plus documents in SOLR 5.4.1 with no shards /
>>> replicas and just single core.
>>> It takes almost 3.5 hours to index that data.
>>> I am using a data import handler to import data from the mongo database.
>>>
>>> Is there something we can do to reduce the time taken to index?
>>> Will upgrade to newer version help?
>>>
>>> Appreciate your help!
>>>
>>> Regards,
>>> Abhijit
>>>
>>
>> --
>> Charlie Hull
>> OpenSource Connections, previously Flax
>>
>> tel/fax: +44 (0)8700 118334
>> mobile: +44 (0)7767 825828
>> web: www.o19s.com
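
For anyone who does go the standalone-indexer route, here is a rough sketch of how the advice in this thread might fit together: time the fetch from the source separately from the send to the index, batch the documents, and commit only once at the end. It is only an illustration, not anyone's actual indexer. The names `timed_index`, `fetch_docs`, `send_batch` and `commit` are made up for this example; in a real setup `fetch_docs` would wrap the Mongo (or JDBC) read and `send_batch` / `commit` would talk to Solr's update handler. The batch size of 1000 is an arbitrary assumption.

```python
import time
from typing import Any, Callable, Dict, Iterable, List

_DONE = object()  # sentinel marking the end of the source iterator


def timed_index(
    fetch_docs: Callable[[], Iterable[Dict[str, Any]]],   # hypothetical: reads from the source DB
    send_batch: Callable[[List[Dict[str, Any]]], None],   # hypothetical: sends docs, no commit
    commit: Callable[[], None],                           # hypothetical: issues one commit
    batch_size: int = 1000,                               # arbitrary assumption
) -> Dict[str, float]:
    """Index all documents in batches, committing once at the end.

    Returns the document count plus the seconds spent waiting on the
    source (fetch_s), on the index (send_s), and on the final commit.
    """
    stats: Dict[str, float] = {"docs": 0, "fetch_s": 0.0, "send_s": 0.0, "commit_s": 0.0}
    batch: List[Dict[str, Any]] = []
    source = iter(fetch_docs())

    while True:
        # Time spent waiting for the next document from the source.
        start = time.perf_counter()
        doc = next(source, _DONE)
        stats["fetch_s"] += time.perf_counter() - start

        if doc is not _DONE:
            batch.append(doc)

        # Send when the batch is full, or flush the remainder at the end.
        if batch and (doc is _DONE or len(batch) >= batch_size):
            start = time.perf_counter()
            send_batch(batch)
            stats["send_s"] += time.perf_counter() - start
            stats["docs"] += len(batch)
            batch = []

        if doc is _DONE:
            break

    # Single commit, only after everything has been sent.
    start = time.perf_counter()
    commit()
    stats["commit_s"] = time.perf_counter() - start
    return stats


if __name__ == "__main__":
    # In-memory stand-ins so the sketch runs without Mongo or Solr.
    sent: List[List[Dict[str, Any]]] = []
    commits: List[bool] = []
    result = timed_index(
        fetch_docs=lambda: ({"id": str(i)} for i in range(2500)),
        send_batch=sent.append,
        commit=lambda: commits.append(True),
    )
    print(result)
```

If `fetch_s` dominates, the source query is the bottleneck, which is the same thing the SQL Workbench/J or dump-to-disk test would show. If `send_s` dominates, the time is going into Solr itself.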