Re: faceting over ngrams

Jonathan Rochkind Wed, 16 Mar 2011 09:24:20 -0700

Ah, wait, you're doing sharding? Yeah, I am NOT doing sharding, so that could explain our different experiences. It seems like sharding definitely has trade-offs, makes some things faster and other things slower. So far I've managed to avoid it, in the interest of keeping things simpler and easier to understand (for me, the developer/Solr manager), thinking that sharding is also a somewhat less mature feature.

With only 1M documents.... are you sure you need sharding at all? You could still use replication to "scale out" for volume, sharding seems more about scaling for number of documents (or total bytes) in your index. 1M documents is not very large, for Solr, in general.


Jonathan

On 3/16/2011 11:51 AM, Toke Eskildsen wrote:

On Wed, 2011-03-16 at 13:05 +0100, Dmitry Kan wrote:

Hello guys. We are using sharded Solr 1.4 for heavy faceted search over the
trigrams field with about 1 million entries in the result set and more
than 100 million entries to facet on in the index. Currently the faceted
search is very slow, taking about 5 minutes per query.

I tried creating an index with 1M documents, each with 100 unique terms
in a field. A search for "*:*" with a facet request for the first 1M
entries in the field took about 20 seconds for the first call and about
1-1½ seconds for each subsequent call. This was with Solr trunk. My
setup is no doubt a lot simpler and lighter than yours, but 5 minutes
sounds excessive.

My guess is that your performance problem is due to the merging process.
Could you try measuring the performance of a direct request to a single
shard? If that is satisfactory, going to the cloud would not solve your
problem. If you really need 1M entries in your result set, you would be
better off investigating whether your index can be in a single instance.
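
One way to run the single-shard comparison suggested above is sketched below in Python. It is only a rough illustration: the host names, the core path and the field name "trigrams" are placeholders, not values taken from this thread. It uses the standard Solr request parameters facet, facet.field, facet.limit and shards; leaving out shards sends the request to one instance only, with no merging step.

```python
import time
from urllib.parse import urlencode
from urllib.request import urlopen


def build_facet_url(base_url, field, limit, shards=None):
    """Build a Solr select URL that facets on `field`.

    If `shards` is None the request is answered by the single instance
    at `base_url`; otherwise that instance fans the request out to the
    listed shards and merges their facet counts.
    """
    params = [
        ("q", "*:*"),
        ("rows", "0"),            # only facet counts are wanted, no documents
        ("facet", "true"),
        ("facet.field", field),
        ("facet.limit", str(limit)),
        ("wt", "json"),
    ]
    if shards:
        params.append(("shards", ",".join(shards)))
    return base_url + "?" + urlencode(params)


def time_request(url, fetch=urlopen):
    """Return the wall-clock seconds needed to fetch and read `url`."""
    start = time.perf_counter()
    with fetch(url) as response:
        response.read()
    return time.perf_counter() - start


# Hypothetical hosts and field name; replace them with your own setup.
SHARDS = ["shard1.example.org:8983/solr", "shard2.example.org:8983/solr"]
single_url = build_facet_url(
    "http://shard1.example.org:8983/solr/select", "trigrams", 1000000)
merged_url = build_facet_url(
    "http://shard1.example.org:8983/solr/select", "trigrams", 1000000,
    shards=SHARDS)

# To measure, call each URL twice and compare the second (warmed) timings:
#   time_request(single_url); print(time_request(single_url))
#   time_request(merged_url); print(time_request(merged_url))
```

If the direct request comes back in seconds while the merged one takes minutes, the time is going into the merge and not into the faceting on each shard.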
