Re: field collapsing performance in sharded environment

2013-11-19 Thread Otis Gospodnetic
Have a look at https://issues.apache.org/jira/browse/SOLR-5027 + https://wiki.apache.org/solr/CollapsingQParserPlugin Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ On Wed, Nov 13, 2013 at 2:46 PM, David Anthony Troiano < dtr
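For readers following the pointer above, here is a minimal sketch of a request that uses the collapse query parser from the linked wiki page as a filter query. The field name `near_dup_hash`, the host, and the collection name are assumptions for illustration only; none of them appear in the thread. The collapse parser also expects documents sharing a collapse value to live on the same shard, so check that against your routing before relying on it in a sharded setup.

```python
from urllib.parse import urlencode, parse_qs

def collapse_params(query, field):
    """Build request parameters that collapse results on `field`.

    Uses the {!collapse field=...} local-params syntax as a filter query.
    """
    return {"q": query, "fq": "{!collapse field=%s}" % field}

def select_url(base_url, params):
    """Append URL-encoded parameters to a /select handler URL."""
    return base_url + "/select?" + urlencode(params)

if __name__ == "__main__":
    # Hypothetical field, host, and collection names.
    params = collapse_params("*:*", "near_dup_hash")
    print(select_url("http://localhost:8983/solr/collection1", params))
```

Building the parameters as a dict first keeps the local-params string readable and leaves the escaping to `urlencode`.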

Re: field collapsing performance in sharded environment

2013-11-15 Thread Paul Masurel
That's not the way grouping is done. In the first round, all shards return their 10 best groups (represented as their 10 best grouping values). As a result it's a three-round thing instead of the two rounds for regular search, so observing an increase in latency is normal but not in the realm of what
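The rounds described above can be sketched with a toy in-memory simulation. This is not Solr's code: the shard data, the function names, and the exact content of rounds two and three (top documents per winning group, then a stored-field fetch) are assumptions made for illustration. The message itself only spells out the first round.

```python
from collections import defaultdict

# Hypothetical shards: lists of (doc_id, group_value, score).
SHARDS = [
    [("a1", "h1", 0.9), ("a2", "h2", 0.5), ("a3", "h1", 0.4)],
    [("b1", "h3", 0.8), ("b2", "h2", 0.7), ("b3", "h4", 0.1)],
]

def top_groups_on_shard(shard, n):
    """Round 1: a shard reports its n best group values by best score."""
    best = {}
    for _doc_id, group, score in shard:
        best[group] = max(score, best.get(group, float("-inf")))
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)[:n]

def merge_top_groups(per_shard, n):
    """Coordinator: merge the shard lists into the global top n groups."""
    best = {}
    for groups in per_shard:
        for group, score in groups:
            best[group] = max(score, best.get(group, float("-inf")))
    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)[:n]
    return [group for group, _score in ranked]

def top_docs_on_shard(shard, groups, limit):
    """Round 2: a shard reports its best docs inside the winning groups."""
    wanted = set(groups)
    by_group = defaultdict(list)
    for doc_id, group, score in shard:
        if group in wanted:
            by_group[group].append((score, doc_id))
    return {g: sorted(docs, reverse=True)[:limit] for g, docs in by_group.items()}

def distributed_grouping(shards, n=2, limit=1):
    round1 = [top_groups_on_shard(s, n) for s in shards]
    groups = merge_top_groups(round1, n)
    round2 = [top_docs_on_shard(s, groups, limit) for s in shards]
    merged = {}
    for g in groups:
        docs = [d for result in round2 for d in result.get(g, [])]
        merged[g] = [doc_id for _score, doc_id in sorted(docs, reverse=True)[:limit]]
    # Round 3 (not simulated): fetch stored fields for the winning doc ids.
    return merged

if __name__ == "__main__":
    print(distributed_grouping(SHARDS))
```

The extra round exists because the coordinator cannot pick the best documents per group until it knows which groups win globally.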

Re: field collapsing performance in sharded environment

2013-11-14 Thread Erick Erickson
bq: Of the 10k docs, most have a unique near duplicate hash value, so there are about 10k unique values for the field that I'm grouping on. I suspect (but don't know the grouping code well) that this is the issue. You're getting the top N groups, right? But in the general case, you can't ensure

field collapsing performance in sharded environment

2013-11-13 Thread David Anthony Troiano
Hello, I'm hitting a performance issue when using field collapsing in a distributed Solr setup and I'm wondering if others have seen it and if anyone has an idea to work around it. I'm using field collapsing to deduplicate documents that have the same near duplicate hash value, and deduplicating