Hi there, I am working on some performance numbers too. This is part of my evaluation of Solr: I'm planning to replace a legacy search engine and need to find out whether Solr can do the job. I have loaded 1.1 million documents into Solr so far. Indexing speed is not a big concern for me; I got about 17 documents per second, while my indexing client is still only a Python prototype with a very slow filtering engine based on Windows IFilter.
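For reference, an indexing client along these lines can be as simple as posting Solr's XML update messages over HTTP. This is only a minimal sketch of that idea: the update URL (default Jetty port) and the field names "id" and "text" are assumptions, and the IFilter-based text extraction is left out entirely.

```python
# Minimal sketch of a Python indexing client for Solr's XML update handler.
# SOLR_UPDATE_URL and the field names are illustrative assumptions.
import urllib.request
from xml.sax.saxutils import escape

SOLR_UPDATE_URL = "http://localhost:8983/solr/update"  # assumed default

def docs_to_xml(docs):
    """Render a list of {field: value} dicts as an <add> update message."""
    parts = ["<add>"]
    for doc in docs:
        parts.append("<doc>")
        for name, value in doc.items():
            parts.append('<field name="%s">%s</field>' % (name, escape(str(value))))
        parts.append("</doc>")
    parts.append("</add>")
    return "".join(parts)

def post_xml(xml):
    """POST one XML message (an <add> batch or a <commit/>) to Solr."""
    req = urllib.request.Request(
        SOLR_UPDATE_URL,
        data=xml.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"})
    return urllib.request.urlopen(req).read()

# Usage (against a running Solr instance):
#   post_xml(docs_to_xml([{"id": "1", "text": "hello solr"}]))
#   post_xml("<commit/>")   # make the new documents searchable
```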
I'm measuring search performance with a Python client that continually queries Solr. It grabs a random word from the results and uses it for the next search. For every request, the time from sending the request until receiving the response is recorded. Every query uses one word as search text and one word as filter query text. Highlighting is on.

Some first results, with Solr loaded with 1,120,000 documents:
- Max queries per second: 14.5
- Average request duration with only 1 client: 0.08 s

My criterion of 90% of requests completing in less than 1 second is met with a maximum of 10 parallel clients, so I expect to serve at least 300 users with one system like this. (Measured on a single-CPU Pentium 4 3 GHz, 2 GB RAM, internal standard ATA drive.)

The next step will be to increase the number of documents until I reach the point where no request completes in less than 1 second. (From that point on, no amount of replication can bring me back to production performance.)

I have a few questions, too:
- What size is the largest known Solr server?
- How many documents do you think Solr can handle?
- Solr uses only one Lucene index. There has been a thread about this before, but it was more about bringing together different Lucene indexes under one Solr server. I potentially need a solution for up to 500 million documents, and I believe this will not work without splitting the index. What do you think?
- Does anybody have performance numbers of their own they would share?
- Solr was running under Jetty for my performance tests. Which container is best suited for high performance?

Thanks a lot for the inspiring talk going on on this mailing list.

Christian

-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Monday, 5 February 2007 11:23
To: solr-user@lucene.apache.org
Subject: performance testing practices

This week I'm going to be incrementally loading up to 3.7M records into Solr, in 50k chunks.
I'd like to capture some performance numbers after each chunk to see how she holds up. What numbers are folks capturing? What techniques are you using to capture numbers? I'm not looking for anything elaborate, as the goal is really to see how faceting fares as more data is loaded. We've got some ugly data in our initial experiment, so the faceting concerns me.

Thanks,
Erik
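The query-timing client described at the top of this thread could be sketched in Python roughly as below. The select URL, the `wt=json` parameter, and the word-picking heuristic are assumptions, not details from the original post; a driver loop would call these helpers repeatedly, record the elapsed times, and compare the 90th percentile against the 1-second criterion.

```python
# Sketch of a query-timing client: one-word query, one-word filter query,
# highlighting on; time the round trip and pick a random word from the
# response to use for the next request. URL and parameters are assumptions.
import random
import re
import time
import urllib.parse
import urllib.request

SOLR_SELECT_URL = "http://localhost:8983/solr/select"  # assumed default

def build_query_url(word, filter_word):
    """Build a select URL with one query word, one fq word, and hl on."""
    params = urllib.parse.urlencode({
        "q": word,
        "fq": filter_word,
        "hl": "true",
        "wt": "json",  # assumes the JSON response writer is enabled
    })
    return SOLR_SELECT_URL + "?" + params

def timed_request(url):
    """Return (elapsed_seconds, response_body) for one request."""
    start = time.time()
    body = urllib.request.urlopen(url).read().decode("utf-8")
    return time.time() - start, body

def pick_word(text):
    """Grab a random word from the response text for the next query."""
    words = re.findall(r"[A-Za-z]{4,}", text)
    return random.choice(words) if words else "solr"
```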