Hi there,

I am working on some performance numbers too, as part of my evaluation of
Solr. I am planning to replace a legacy search engine and need to find out
whether this is possible with Solr.
I have loaded 1.1 million documents into Solr so far. Indexing speed is not a
big concern for me: I got about 17 documents per second, while my indexing
client is still only a Python prototype with a very slow filtering engine
based on the Windows IFilter interface.

I am measuring search performance with a Python client that continually
queries Solr. It grabs a random word from the results and uses it for the
next search. For every request, it records the time from sending the request
until receiving the response. Each query uses one word as the search text
and one word as the filter query text. Highlighting is on.
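The core of such a client can be sketched roughly as follows (the Solr URL and the `text` field name are placeholders for my setup, and I am assuming the JSON response writer here):

```python
import json
import random
import time
import urllib.parse
import urllib.request

# Placeholder endpoint; adjust host, port, and core to your installation.
SOLR_URL = "http://localhost:8983/solr/select"

def pick_random_word(docs, field="text"):
    """Pick a random word from the stored field of the returned documents,
    to be used as the next query term."""
    words = [w for d in docs for w in str(d.get(field, "")).split() if w.isalpha()]
    return random.choice(words) if words else None

def timed_query(q, fq):
    """Send one query with highlighting on and return
    (latency in seconds, parsed JSON response)."""
    params = urllib.parse.urlencode({"q": q, "fq": fq, "hl": "true", "wt": "json"})
    start = time.time()
    with urllib.request.urlopen(SOLR_URL + "?" + params) as resp:
        body = json.load(resp)
    return time.time() - start, body
```

The loop then just feeds the word picked from each response into the next `timed_query` call and collects the latencies.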

Some first results:

Solr loaded with 1,120,000 documents:
Max queries per second: 14.5
Average request duration with only 1 client: 0.08 s
My criterion of 90% of requests completing in less than 1 second is met with
a maximum of 10 parallel clients.
I expect to serve at least 300 users with one system like this.
(Measured on a single-CPU Pentium 4 3 GHz, 2 GB RAM, internal standard ATA drive.)
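The 90%-under-1-second criterion reduces to a simple count over the recorded latencies; a minimal helper:

```python
def fraction_under(latencies, threshold=1.0):
    """Fraction of requests that completed in under `threshold` seconds.
    The criterion is met when this is at least 0.9."""
    if not latencies:
        return 0.0
    return sum(1 for t in latencies if t < threshold) / float(len(latencies))
```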

The next step will be to increase the number of documents until I reach the
point where no request completes in less than 1 second. (From that point on,
no amount of replication can bring me back to production performance.)

I have a few questions, too:
- What size is the largest known Solr installation?
- How many documents do you think Solr can handle?
- Solr uses only one Lucene index. There has been a thread about this before,
but it was more about bringing together different Lucene indexes under one
Solr server. I potentially need a solution for up to 500 million documents.
I believe this will not work without splitting the index. What do you think?
- Does anybody have their own performance numbers they would share?
- Solr was running under Jetty for my performance tests. Which servlet
container is best suited for high performance?
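On the index-splitting question above: one approach I am considering is partitioning documents across several independent Solr instances at index time by hashing the document id, and merging results in the client. A sketch (the shard URLs are placeholders; this is only the routing step, not the result merging):

```python
import zlib

def shard_for(doc_id, shard_urls):
    """Route a document to one of several independent Solr instances
    by hashing its id. The same id always maps to the same shard."""
    idx = zlib.crc32(doc_id.encode("utf-8")) % len(shard_urls)
    return shard_urls[idx]
```

At query time the client would fan the query out to all shards and merge the ranked results, which is the harder half of the problem.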


Thanks a lot for the inspiring discussion going on on this mailing list.

Christian


-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]]
Sent: Monday, 5 February 2007 11:23
To: solr-user@lucene.apache.org
Subject: performance testing practices


This week I'm going to be incrementally loading up to 3.7M records  
into Solr, in 50k chunks.

I'd like to capture some performance numbers after each chunk to see  
how she holds up.

What numbers are folks capturing?  What techniques are you using to  
capture numbers?  I'm not looking for anything elaborate, as the goal  
is really to see how faceting fares as more data is loaded.  We've  
got some ugly data in our initial experiment, so the faceting  
concerns me.

Thanks,
        Erik
