I have a scenario in which I need to post 500,000 documents to Solr as a test. I have these documents in XML files, already formatted in Solr's XML format.
Posting them to Solr with post.jar takes 1m55s. With a bit of bash jiggery-pokery, I was able to get this down to 1m08s by running four concurrent post.jar instances, which strikes me as a significant improvement (roughly 40% less time).

I'm considering adding multithreaded capabilities to post.jar. Given that the SimplePostTool is becoming far from simple, I wanted to see whether anyone else would consider this a useful feature, and whether it is likely to be accepted, before I put in the effort.

I would also need to consider which parts of the tool to add it to. Currently I only want it for posting XML docs, but the tool has crawling capabilities too.

Thoughts?

Upayavira