Easier than:

solrpost.sh a*.xml > a.log &
solrpost.sh b*.xml > b.log &
solrpost.sh c*.xml > c.log &

and so on?
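
For what it's worth, the one-liners above can be wrapped in a small function that also waits for the background jobs and reports failures. This is only a rough sketch: POST_CMD and run_parallel_posts are made-up names, and it assumes the uploader (solrpost.sh here) exits non-zero when a post fails.

```shell
#!/bin/sh
# Sketch: start one uploader per file-name prefix in the background,
# then wait for every job and report whether any of them failed.
# POST_CMD is a hypothetical setting naming the uploader script.
POST_CMD="${POST_CMD:-solrpost.sh}"

run_parallel_posts() {
    pids=""
    for prefix in "$@"; do
        # Each job gets its own log file, as in the one-liners above.
        "$POST_CMD" "$prefix"*.xml > "$prefix.log" 2>&1 &
        pids="$pids $!"
    done

    failed=0
    for pid in $pids; do
        # wait returns the exit status of the job with this process ID.
        if ! wait "$pid"; then
            echo "upload job $pid failed; check the logs" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

Calling `run_parallel_posts a b c` from the directory holding the XML files would start three jobs and return non-zero if any of them failed.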

We have a fair selection of Solr servers where I work (Chegg), loaded several 
different ways, and one of our production cores is loaded with curl sending in 
a CSV file and checking for errors. Works great.
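
Roughly, a curl-based CSV load with an error check could look like the sketch below. It is not our production script: the URL, the SOLR_CSV_URL variable, and the post_csv name are all assumptions for illustration, so adjust the host, port, and core to your own setup.

```shell
#!/bin/sh
# Sketch: send one CSV file to Solr with curl and treat an HTTP error
# as a failure. SOLR_CSV_URL is an assumed endpoint, not a known setup.
SOLR_CSV_URL="${SOLR_CSV_URL:-http://localhost:8983/solr/update/csv?commit=true}"

post_csv() {
    file="$1"
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 2
    fi
    # --fail makes curl exit non-zero on HTTP 4xx/5xx responses, so the
    # exit status can be checked instead of parsing the response body.
    if ! curl --fail --silent --show-error \
            -H 'Content-Type: text/csv; charset=utf-8' \
            --data-binary @"$file" "$SOLR_CSV_URL" > /dev/null; then
        echo "upload of $file failed" >&2
        return 1
    fi
}
```

A caller can then run something like `post_csv docs.csv || exit 1` and stop the load on the first failed file.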

wunder

On Feb 4, 2013, at 8:47 PM, Upayavira wrote:

> Heh, I've considered all sorts of things :-) Including precisely what
> you are referring to :-) In the end, I need something that will require
> the minimum of effort for a new user, so updating post.jar is going to
> be the most straight-forward, as otherwise I'd need to find a cross
> platform multithreading aware scripting language that is available on
> all platforms by default, and such are in short supply! Whether or not
> the Solr community is interested in my changes is another matter.
> 
> Upayavira
> 
> On Tue, Feb 5, 2013, at 04:43 AM, Walter Underwood wrote:
>> Have you considered writing a script to upload them with curl and running
>> multiple copies of the script in the background?
>> 
>> wunder
>> 
>> On Feb 4, 2013, at 8:22 PM, Upayavira wrote:
>> 
>>> Thx Jan,
>>> 
>>> All I know is I've got a data set of 500k documents, Solr formatted, and
>>> I want it to be as easy as possible to get them into Solr. I also want
>>> to be able to show the benefit of multithreading. The outcome would
>>> really be "make sure your code uses multiple threads to push to Solr"
>>> rather than "use post.jar in production". I see post.jar as a
>>> demonstration tool, rather than anything else, and am considering adding
>>> another feature to enhance that.
>>> 
>>> However, I did stall once I started looking at the SimplePostTool
>>> class, because it is losing its connection with the term 'Simple'.
>>> Adding multithreading, however useful, correct, whatever, would
>>> completely push it over the edge. Thus, I think the proper approach is
>>> to refactor the tool into a number of classes, and only then think about
>>> adding multithreading as a completely separate affair. I'm more than
>>> happy to have a go at that refactoring, especially if you're prepared to
>>> review it.
>>> 
>>> I guess the other thing that is much needed is a wiki page that details
>>> the features of the tool, and also explains that its role is
>>> educational, rather than anything else.
>>> 
>>> Upayavira
>>> 
>>> On Mon, Feb 4, 2013, at 09:10 PM, Jan Høydahl wrote:
>>>> Hi,
>>>> 
>>>> Hmm, the tool is getting bloated for a one-class no-deps tool already :)
>>>> Guess it would also be useful to have real-life code examples using SolrJ and
>>>> other libs as well (such as robots.txt lib, commons-cli etc), but whether
>>>> that should be an extension of SimplePostTool or a totally new tool from
>>>> scratch is something to discuss. Please bring on your ideas of how you
>>>> plan to extend it, perhaps even simplifying the code in the process?
>>>> 
>>>> --
>>>> Jan Høydahl, search solution architect
>>>> Cominvent AS - www.cominvent.com
>>>> Solr Training - www.solrtraining.com
>>>> 
>>>> 3. feb. 2013 kl. 17:19 skrev Upayavira <u...@odoko.co.uk>:
>>>> 
>>>>> I have a scenario in which I need to post 500,000 documents to Solr as a
>>>>> test. I have these documents in XML files already formatted in Solr's
>>>>> xml format.
>>>>> 
>>>>> Posting to Solr using post.jar takes 1m55s. With a bit of bash
>>>>> jiggery-pokery, I was able to get this down to 1m08s by running four
>>>>> concurrent post.jar instances, which strikes me as a significant
>>>>> improvement.
>>>>> 
>>>>> I'm considering adding multithreaded capabilities to post.jar, but
>>>>> before I go to that effort, I wanted to see if anyone else would
>>>>> consider it a useful feature. Given that the SimplePostTool is becoming
>>>>> far from simple, I wanted to see whether the feature is likely to be
>>>>> accepted before I put in the effort. Also, I would need to consider
>>>>> which parts of the tool to add that to. Currently I only want it for
>>>>> posting XML docs, but there are crawling capabilities in it too.
>>>>> 
>>>>> Thoughts?
>>>>> 
>>>>> Upayavira
>>>> 



