Hi,
I do not have an answer to your questions.
But, I have the same issue/problem you have.

It would be good if the Solr community could agree on and share an
approach for benchmarking Solr. Indeed, it would be good to have a
standard benchmark for "information retrieval" systems. AFAIK there
isn't one. :-/

The content on the wiki [1] is better than nothing, but in practice
more is needed, IMHO.

I have seen JMeter being used with ElasticSearch [2].
Solr could do the same to help users and new adopters get started.

Some guidelines/advice (I know it's hard) would be useful as well.

I ended up writing my own "crappy" multi-threaded benchmarking tool.
One thing to watch out for: in particular when you are hitting the
Solr cache and returning a large number of results, the transfer time
is a significant part of your response time. Tuning Jetty, Tomcat, or
whatever container you use is essential.
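For what it's worth, the core of such a tool is small. Below is a minimal sketch in Python of a multi-threaded benchmark that replays a file of sample queries against Solr (the file name queries.txt, the SOLR_URL value, and all helper names are my own assumptions for illustration, not part of any Solr tooling):

```python
import threading
import time
import urllib.parse
import urllib.request

# Hypothetical default -- point this at your own /select handler.
SOLR_URL = "http://localhost:8983/solr/select"

def split_queries(queries, n):
    # Round-robin split across n threads so each query is sent exactly once.
    return [queries[i::n] for i in range(n)]

def percentile(sorted_latencies, p):
    # p-th percentile of an already-sorted list of latencies.
    idx = min(len(sorted_latencies) - 1,
              int(len(sorted_latencies) * p / 100))
    return sorted_latencies[idx]

def worker(queries, latencies, lock):
    for q in queries:
        url = SOLR_URL + "?" + urllib.parse.urlencode({"q": q})
        start = time.time()
        with urllib.request.urlopen(url) as resp:
            resp.read()  # read the whole body so transfer time is included
        with lock:
            latencies.append(time.time() - start)

def run_benchmark(queries, threads=4):
    latencies, lock = [], threading.Lock()
    ts = [threading.Thread(target=worker, args=(chunk, latencies, lock))
          for chunk in split_queries(queries, threads)]
    for t in ts:
        t.start()
    for t in ts:
        t.join()
    latencies.sort()
    print("requests: %d  median: %.3fs  95th: %.3fs" % (
        len(latencies), percentile(latencies, 50), percentile(latencies, 95)))

# Usage: load your sample queries, one per line, then run:
#   run_benchmark([l.strip() for l in open("queries.txt") if l.strip()])
```

Reading the full response body in the worker matters: if you only measure time-to-first-byte you miss exactly the transfer cost mentioned above.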

Are you using Jetty or Tomcat?
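On the Jetty side, the request thread pool is one of the first things worth checking. A sketch of the relevant jetty.xml fragment for the Jetty 6 bundled with the Solr example (the min/max values are placeholders to illustrate the setting, not recommendations):

```xml
<!-- jetty.xml sketch: size the request thread pool for your load -->
<Set name="ThreadPool">
  <New class="org.mortbay.thread.QueuedThreadPool">
    <Set name="minThreads">10</Set>
    <Set name="maxThreads">200</Set>
  </New>
</Set>
```

Under concurrent benchmark load, an undersized pool shows up as queued requests and inflated response times even when Solr itself is fast.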

I would also be interested in understanding the impact of the slave
polling interval on searches, and the impact of the number of slaves
and the polling interval on updates on the master.

Paolo

[1] http://wiki.apache.org/solr/SolrPerformanceData
[2] http://github.com/elasticsearch/elasticsearch/tree/master/modules/benchmark/jmeter

Blargy wrote:
I am about to deploy Solr into our production environment and I would like to
do some benchmarking to determine how many slaves I will need to set up.
Currently the only way I know how to benchmark is to use Apache Benchmark,
but I would like to be able to send random requests to Solr... not just
one request over and over.

I have a sample data set of 5000 user entered queries and I would like to be
able to use AB to benchmark against all these random queries. Is this
possible?

FYI our current index is ~1.5 gigs with ~5m documents and we will be using
faceting quite extensively. Our average load is ~2m requests per day. We will
be running RHEL with about 8-12g of RAM. Any idea how many slaves might be
required to handle our load?

Thanks
