Your indexing project is disk-bound. My modern midrange laptop gets
30MB/s doing "cat > /dev/null" (one 7200rpm disk). The Amazon instances
I'm playing with get 50-60MB/s (I really want to know how it fits
together). Your laptop might get 10-20MB/s?
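
If you want to repeat that kind of measurement without cat, here is a
rough Python sketch. The function name and the 1MB block size are just
my choices for illustration, nothing to do with Solr itself. Point it
at a file bigger than RAM, or one that has not been read recently,
otherwise the OS page cache answers the reads and you end up measuring
memory instead of the disk.

```python
import time


def read_throughput_mb_s(path, block_size=1024 * 1024):
    """Read `path` sequentially and return the observed rate in MB/s.

    Illustrative helper (hypothetical name): it times raw sequential
    reads, roughly what piping a file through cat to /dev/null does.
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 gives unbuffered reads of up to block_size bytes each.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:  # empty bytes means end of file
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        return float("inf")
    return (total / 1e6) / elapsed
```

Call it as read_throughput_mb_s("/path/to/some/large/file") (the path
is a placeholder) and compare the result against the numbers above.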

On Thu, Sep 24, 2009 at 11:54 PM, Constantijn Visinescu
<baeli...@gmail.com> wrote:
> This may or may not help but here goes :)
>
> When i was running performance tests i took a look at the simple post tool
> that comes with the solr examples.
>
> First i changed my schema.xml to fit my needs and then i deleted the old
> index so solr created a blank one when i started up.
> Then i had a process chew on my data and spit out xml files that are
> formatted similarly to the xml files that the SimplePostTool example uses.
> Next i used the simple Post tool to post the xml files to solr (60k-80k
> records per xml file). Each file only took a couple minutes to index this
> way.
> Commit and optimize after that (took less than 10 minutes) and after about
> 2.5 hrs i had indexed just under 8 million records.
>
> This was on a 4 year old single core laptop using resin 3 as my servlet
> container.
>
> Hope this helps.
>
>
> On Fri, Sep 25, 2009 at 3:51 AM, Lance Norskog <goks...@gmail.com> wrote:
>
>> In "top", press the '1' key. This will give a list of the CPUs and how
>> much load is on each. The display is otherwise a little weird for
>> multi-cpu machines. But don't be surprised when Solr is I/O bound. The
>> biggest fanciest RAID is often a better investment than CPUs. On one
>> project we bought low-end rack servers that come with 6-8 disk bays,
>> filling them with 10k/15k RPM disks.
>>
>> On Wed, Sep 23, 2009 at 2:47 PM, Dan A. Dickey <dan.dic...@savvis.net>
>> wrote:
>> > On Friday 11 September 2009 11:06:20 am Dan A. Dickey wrote:
>> > ...
>> >> Our JBoss expert and I will be looking into why this might be occurring.
>> >> Does anyone know of any JBoss related slowness with Solr?
>> >> And does anyone have any other sort of suggestions to speed indexing
>> >> performance?   Thanks for your help all!  I'll keep you up to date with
>> >> further progress.
>> >
>> > Ok, further progress... just to keep any interested parties up to date
>> > and for the record...
>> >
>> > I'm finding that using the "example" jetty setup (will be switching very
>> > very soon to a "real" jetty installation) is about the fastest.  Using
>> > several processes to send posts to Solr helps a lot, and we're seeing
>> > about 80 posts a second this way.
>> >
>> > We also stripped down JBoss to the bare bones and the Solr in it
>> > is running nearly as fast - about 50 posts a second.  It was our previous
>> > JBoss configuration that was making it appear "slow" for some reason.
>> >
>> > We will be running more tests and spreading out the "pre-index" workload
>> > across more machines and more processes. In our case we were seeing
>> > the bottleneck being one machine running 18 processes.
>> > The 2 quad core xeon system is experiencing about a 25% cpu load.
>> > And I'm not certain, but I think this may be actually 25% of one of the 8
>> > cores.
>> > So, there's *lots* of room for Solr to be doing more work there.
>> >        -Dan
>> >
>> > --
>> > Dan A. Dickey | Senior Software Engineer
>> >
>> > Savvis
>> > 10900 Hampshire Ave. S., Bloomington, MN  55438
>> > Office: 952.852.4803 | Fax: 952.852.4951
>> > E-mail: dan.dic...@savvis.net
>> >
>>
>>
>>
>> --
>> Lance Norskog
>> goks...@gmail.com
>>
>



-- 
Lance Norskog
goks...@gmail.com
