I took the cheap and cheerful approach, and created another class that
wraps SimplePostTool. It makes lots of assumptions, such as that the
shell will already have expanded any globs/wildcards, and just assigns
various arguments to the various threads. It is good enough for what I
need. 
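
For anyone curious, the general shape is roughly as below. This is a minimal sketch rather than the actual wrapper: the class name ParallelPostSketch, the round-robin split, the fixed count of four threads and the `poster` callback are all assumptions made for illustration, and the callback stands in for whatever actually hands the files to SimplePostTool.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical wrapper: splits already-expanded file arguments across worker threads. */
final class ParallelPostSketch {

    /** Deal the file names out round-robin into `threads` buckets. */
    static List<List<String>> partition(List<String> files, int threads) {
        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            buckets.add(new ArrayList<>());
        }
        for (int i = 0; i < files.size(); i++) {
            buckets.get(i % threads).add(files.get(i));
        }
        return buckets;
    }

    /**
     * Run one poster call per non-empty bucket on a fixed thread pool and wait
     * for all of them to finish. Exceptions thrown by the poster are not
     * reported here; a real tool would keep the Futures and check them.
     */
    static void postAll(List<String> files, int threads, Consumer<List<String>> poster)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (List<String> bucket : partition(files, threads)) {
            if (!bucket.isEmpty()) {
                pool.submit(() -> poster.accept(bucket));
            }
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }

    public static void main(String[] args) throws InterruptedException {
        if (args.length == 0) {
            System.out.println("usage: java ParallelPostSketch.java <files...>");
            return;
        }
        // Assumption: the shell has already expanded any globs, so args are file names.
        // Placeholder poster: it only prints what each thread would have posted.
        postAll(Arrays.asList(args), 4, bucket ->
                System.out.println(Thread.currentThread().getName() + " would post " + bucket));
    }
}
```

Run with something like `java ParallelPostSketch.java docs/*.xml`, the shell expands the glob and each of the four threads receives roughly a quarter of the files.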

The idea of a shell is an interesting one. But is there stuff we
couldn't achieve without creative use of 'curl'?

Upayavira

On Tue, Feb 26, 2013, at 04:34 AM, Otis Gospodnetic wrote:
> Upayavira, did you ever do this?
> 
> Ha, look at my email from 20 days ago and this:
> https://github.com/javanna/elasticshell
> 
> Otis
> --
> Solr & ElasticSearch Support
> http://sematext.com/
> 
> On Wed, Feb 6, 2013 at 2:38 PM, Otis Gospodnetic <otis.gospodne...@gmail.com> wrote:
> 
> > Btw, wouldn't this be a chance to create a Solr CLI tool, much like
> > es2unix? Maybe with a shell? I'm offline now, but I recently came across
> > a Java lib that makes this easy... jclam jsomething ...
> >
> > Otis
> > Solr & ElasticSearch Support
> > http://sematext.com/
> > On Feb 6, 2013 8:48 AM, "Jan Høydahl" <jan....@cominvent.com> wrote:
> >
> >> With dependencies I meant external jar dependencies. Perhaps extensions
> >> could have deps while leaving the "core" compilable without?
> >>
> >> --
> >> Jan Høydahl, search solution architect
> >> Cominvent AS - www.cominvent.com
> >> Solr Training - www.solrtraining.com
> >>
> >> 5. feb. 2013 kl. 17:10 skrev Upayavira <u...@odoko.co.uk>:
> >>
> >> > By dependencies, do you mean other java classes? I was thinking of
> >> > splitting it out into a few classes, each of which is clearer in its
> >> > purpose.
> >> >
> >> > Upayavira
> >> >
> >> > On Tue, Feb 5, 2013, at 02:26 PM, Jan Høydahl wrote:
> >> >> Wiki page exists already: http://wiki.apache.org/solr/post.jar
> >> >>
> >> >> I'm happy to consider a refactoring, especially if it makes it SIMPLER
> >> >> to read and interact with and doesn't add a ton of mandatory
> >> >> dependencies. It should probably still be possible to say something like
> >> >>
> >> >>  javac org/apache/solr/util/SimplePostTool.java
> >> >>  java -cp . org.apache.solr.util.SimplePostTool -h
> >> >>
> >> >> That's just how I've been thinking so far though. If other committers
> >> >> are happy with abandoning the simple-ness and instead creating a
> >> >> best-practices based feature-rich tool with dependencies, then I'll
> >> >> not object.
> >> >>
> >> >> --
> >> >> Jan Høydahl, search solution architect
> >> >> Cominvent AS - www.cominvent.com
> >> >> Solr Training - www.solrtraining.com
> >> >>
> >> >> 5. feb. 2013 kl. 05:22 skrev Upayavira <u...@odoko.co.uk>:
> >> >>
> >> >>> Thx Jan,
> >> >>>
> >> >>> All I know is I've got a data set of 500k documents, Solr formatted,
> >> >>> and I want it to be as easy as possible to get them into Solr. I also
> >> >>> want to be able to show the benefit of multithreading. The outcome
> >> >>> would really be "make sure your code uses multiple threads to push to
> >> >>> Solr" rather than "use post.jar in production". I see post.jar as a
> >> >>> demonstration tool, rather than anything else, and am considering
> >> >>> adding another feature to enhance that.
> >> >>>
> >> >>> However, I did stall once I started looking at the SimplePostTool
> >> >>> class, because it is losing its connection with the term 'Simple'.
> >> >>> Adding multithreading, however useful, correct, whatever, would
> >> >>> completely push it over the edge. Thus, I think the proper approach is
> >> >>> to refactor the tool into a number of classes, and only then think
> >> >>> about adding multithreading as a completely separate affair. I'm more
> >> >>> than happy to have a go at that refactoring, especially if you're
> >> >>> prepared to review it.
> >> >>>
> >> >>> I guess the other thing that is much needed is a wiki page that
> >> >>> details the features of the tool, and also explains that its role is
> >> >>> educational, rather than anything else.
> >> >>>
> >> >>> Upayavira
> >> >>>
> >> >>> On Mon, Feb 4, 2013, at 09:10 PM, Jan Høydahl wrote:
> >> >>>> Hi,
> >> >>>>
> >> >>>> Hmm, the tool is getting bloated for a one-class no-deps tool
> >> >>>> already :) Guess it would be useful too with real-life code examples
> >> >>>> using SolrJ and other libs as well (such as robots.txt lib,
> >> >>>> commons-cli etc), but whether that should be an extension of
> >> >>>> SimplePostTool or a totally new tool from scratch is something to
> >> >>>> discuss. Please bring on your ideas of how you plan to extend it,
> >> >>>> perhaps even simplifying the code in the process?
> >> >>>>
> >> >>>> --
> >> >>>> Jan Høydahl, search solution architect
> >> >>>> Cominvent AS - www.cominvent.com
> >> >>>> Solr Training - www.solrtraining.com
> >> >>>>
> >> >>>> 3. feb. 2013 kl. 17:19 skrev Upayavira <u...@odoko.co.uk>:
> >> >>>>
> >> >>>>> I have a scenario in which I need to post 500,000 documents to Solr
> >> >>>>> as a test. I have these documents in XML files already formatted in
> >> >>>>> Solr's XML format.
> >> >>>>>
> >> >>>>> Posting to Solr using post.jar takes 1m55s. With a bit of bash
> >> >>>>> jiggery-pokery, I was able to get this down to 1m08s by running four
> >> >>>>> concurrent post.jar instances, which strikes me as a significant
> >> >>>>> improvement.
> >> >>>>>
> >> >>>>> I'm considering adding multithreaded capabilities to post.jar, but
> >> >>>>> before I go to that effort, I wanted to see if anyone else would
> >> >>>>> consider it a useful feature. Given that the SimplePostTool is
> >> >>>>> becoming far from simple, I wanted to see whether the feature is
> >> >>>>> likely to be accepted before I put in the effort. Also, I would need
> >> >>>>> to consider which parts of the tool to add that to. Currently I only
> >> >>>>> want it for posting XML docs, but there are also crawling
> >> >>>>> capabilities in it.
> >> >>>>>
> >> >>>>> Thoughts?
> >> >>>>>
> >> >>>>> Upayavira
> >> >>>>
> >> >>
> >>
> >>
