Ah, wait, you're doing sharding? I am NOT doing sharding, so that could explain our different experiences. Sharding definitely has trade-offs: it makes some things faster and other things slower. So far I've managed to avoid it, to keep things simpler and easier to understand (for me, the developer/Solr manager), and because I think of sharding as a somewhat less mature feature.

With only 1M documents, are you sure you need sharding at all? You could still use replication to "scale out" for query volume; sharding seems more about scaling for the number of documents (or total bytes) in your index. 1M documents is not very large for Solr, in general.

Jonathan

On 3/16/2011 11:51 AM, Toke Eskildsen wrote:
On Wed, 2011-03-16 at 13:05 +0100, Dmitry Kan wrote:
Hello guys. We are using sharded Solr 1.4 for heavy faceted search over the
trigrams field with about 1 million entries in the result set and more
than 100 million entries to facet on in the index. Currently the faceted
search is very slow, taking about 5 minutes per query.

I tried creating an index with 1M documents, each with 100 unique terms
in a field. A search for "*:*" with a facet request for the first 1M
entries in the field took about 20 seconds for the first call and about
1 to 1½ seconds for each subsequent call. This was with Solr trunk. The
complexity of my setup is no doubt a lot simpler and lighter than yours,
but 5 minutes sounds excessive.

My guess is that your performance problem is due to the merging process.
Could you try measuring the performance of a direct request to a single
shard? If that is satisfactory, going to the cloud would not solve your
problem. If you really need 1M entries in your result set, you would be
better off investigating whether your index can be in a single instance.
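
For anyone who wants to try the single-shard measurement suggested above, here is a minimal sketch in Python. It is an illustration, not the setup used by anyone in this thread: the host names, the field name "trigrams" as a literal Solr field, rows=0, and the helper names facet_url and time_request are all assumptions. It only uses the standard Solr request parameters q, rows, facet, facet.field, facet.limit and shards.

```python
import time
from urllib.parse import urlencode
from urllib.request import urlopen


def facet_url(base_url, field, limit, shards=None):
    """Build a Solr select URL for a match-all query with one facet field.

    base_url, field and shards are placeholders; adjust them to your setup.
    """
    params = [
        ("q", "*:*"),
        ("rows", "0"),  # assumption: only the facet counts are of interest
        ("facet", "true"),
        ("facet.field", field),
        ("facet.limit", str(limit)),
    ]
    if shards:
        # With the shards parameter the receiving node fans the request out
        # to every listed shard and merges the responses.
        params.append(("shards", ",".join(shards)))
    return base_url.rstrip("/") + "/select?" + urlencode(params)


def time_request(url, fetch=None):
    """Return the wall-clock seconds needed to fetch url once."""
    if fetch is None:
        fetch = lambda u: urlopen(u).read()
    start = time.perf_counter()
    fetch(url)
    return time.perf_counter() - start


# Hypothetical hosts: the same facet request sent to one shard directly,
# and sent as a distributed request across two shards.
direct = facet_url("http://shard1:8983/solr", "trigrams", 1000000)
distributed = facet_url(
    "http://shard1:8983/solr", "trigrams", 1000000,
    shards=["shard1:8983/solr", "shard2:8983/solr"],
)
```

Calling time_request(direct) and time_request(distributed) a few times each (the first call is typically slower while caches warm up) should show whether the time is spent inside one shard or in merging the shard responses.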
