By dependencies, do you mean other Java classes? I was thinking of splitting it out into a few classes, each of which is clearer in its purpose.
Upayavira

On Tue, Feb 5, 2013, at 02:26 PM, Jan Høydahl wrote:
> Wiki page exists already: http://wiki.apache.org/solr/post.jar
>
> I'm happy to consider a refactoring, especially if it makes it SIMPLER to
> read and interact with and doesn't add a ton of mandatory dependencies.
> It should probably still be possible to say something like
>
> javac org/apache/solr/util/SimplePostTool.java
> java -cp . org.apache.solr.util.SimplePostTool -h
>
> That's just how I've been thinking so far though. If other committers are
> happy with abandoning the simple-ness and instead creating a best-practices
> based feature-rich tool with dependencies, then I'll not object.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> Solr Training - www.solrtraining.com
>
> On 5 Feb 2013, at 05:22, Upayavira <u...@odoko.co.uk> wrote:
>
> > Thx Jan,
> >
> > All I know is I've got a data set of 500k documents, Solr formatted, and
> > I want it to be as easy as possible to get them into Solr. I also want
> > to be able to show the benefit of multithreading. The outcome would
> > really be "make sure your code uses multiple threads to push to Solr"
> > rather than "use post.jar in production". I see post.jar as a
> > demonstration tool, rather than anything else, and am considering adding
> > another feature to enhance that.
> >
> > However, I did stall once I started looking at the SimplePostTool
> > class, because it is losing its connection with the term 'Simple'.
> > Adding multithreading, however useful, correct, whatever, would
> > completely push it over the edge. Thus, I think the proper approach is
> > to refactor the tool into a number of classes, and only then think about
> > adding multithreading as a completely separate affair. I'm more than
> > happy to have a go at that refactoring, especially if you're prepared to
> > review it.
> >
> > I guess the other thing that is much needed is a wiki page that details
> > the features of the tool, and also explains that its role is
> > educational, rather than anything else.
> >
> > Upayavira
> >
> > On Mon, Feb 4, 2013, at 09:10 PM, Jan Høydahl wrote:
> >> Hi,
> >>
> >> Hmm, the tool is getting bloated for a one-class no-deps tool already :)
> >> Guess it would be useful too with real-life code examples using SolrJ and
> >> other libs as well (such as robots.txt lib, commons-cli etc), but whether
> >> that should be an extension of SimplePostTool or a totally new tool from
> >> scratch is something to discuss. Please bring on your ideas of how you
> >> plan to extend it, perhaps even simplifying the code in the process?
> >>
> >> --
> >> Jan Høydahl, search solution architect
> >> Cominvent AS - www.cominvent.com
> >> Solr Training - www.solrtraining.com
> >>
> >> On 3 Feb 2013, at 17:19, Upayavira <u...@odoko.co.uk> wrote:
> >>
> >>> I have a scenario in which I need to post 500,000 documents to Solr as a
> >>> test. I have these documents in XML files already formatted in Solr's
> >>> XML format.
> >>>
> >>> Posting to Solr using post.jar takes 1m55s. With a bit of bash
> >>> jiggery-pokery, I was able to get this down to 1m08s by running four
> >>> concurrent post.jar instances, which strikes me as a significant
> >>> improvement.
> >>>
> >>> I'm considering adding multithreaded capabilities to post.jar, but
> >>> before I go to that effort, I wanted to see if anyone else would
> >>> consider it a useful feature. Given that the SimplePostTool is becoming
> >>> far from simple, I wanted to see whether the feature is likely to be
> >>> accepted before I put in the effort. Also, I would need to consider
> >>> which parts of the tool to add that to. Currently I only want it for
> >>> posting XML docs, but there are also crawling capabilities in it too.
> >>>
> >>> Thoughts?
> >>>
> >>> Upayavira
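
For reference, the "use multiple threads to push to Solr" idea from this thread could be sketched roughly as below. This is only an illustration, not SimplePostTool code: the class name ParallelPostSketch, the Poster interface, and the postAll and httpPoster helpers are all hypothetical, and the update URL in main is an assumption about a local default install. It uses only JDK classes, in keeping with the no-dependencies constraint discussed above.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelPostSketch {

    /** Hypothetical abstraction so the posting step can be swapped or stubbed. */
    public interface Poster {
        void post(Path file) throws IOException;
    }

    /** Returns a Poster that sends one Solr-XML file per HTTP POST request. */
    public static Poster httpPoster(final String updateUrl) {
        return file -> {
            HttpURLConnection conn =
                    (HttpURLConnection) new URL(updateUrl).openConnection();
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "application/xml");
            try (OutputStream out = conn.getOutputStream()) {
                Files.copy(file, out); // stream the file as the request body
            }
            int status = conn.getResponseCode();
            conn.disconnect();
            if (status >= 400) {
                throw new IOException("HTTP " + status + " for " + file);
            }
        };
    }

    /**
     * Posts every file using a fixed pool of worker threads.
     * Returns the number of files posted without an exception.
     */
    public static int postAll(List<Path> files, Poster poster, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Void>> results = new ArrayList<>();
            for (Path file : files) {
                Callable<Void> task = () -> {
                    poster.post(file);
                    return null;
                };
                results.add(pool.submit(task));
            }
            int succeeded = 0;
            for (Future<Void> result : results) {
                try {
                    result.get(); // waits for this file's task to finish
                    succeeded++;
                } catch (ExecutionException e) {
                    System.err.println("Post failed: " + e.getCause());
                }
            }
            return succeeded;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumed URL of a local Solr update handler; adjust as needed.
        String url = "http://localhost:8983/solr/update";
        List<Path> files = new ArrayList<>();
        for (String arg : args) {
            files.add(Paths.get(arg));
        }
        int ok = postAll(files, httpPoster(url), 4);
        System.out.println(ok + " of " + files.size() + " files posted");
    }
}
```

Four threads mirrors the four concurrent post.jar instances mentioned above; the best number would need measuring. The sketch does not send a commit, so one would still have to be issued separately once all files are posted.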