Yes, the AnalyticsQuery is being called twice in the logs, which is not a good thing. Originally I believe this was not the case, but changes to the QueryComponent in later releases have caused this to happen. The test cases aren't broken by this, so it didn't get caught.

The actual merge of the results from the AnalyticsQuery, which is done in the MergeStrategy, will only happen in the first stage. In the second stage the results from the AnalyticsQuery should be ignored. As a workaround for the double call to the AnalyticsQuery, you can look for the "ids" param in your AnalyticsQuery and skip gathering the analytics if it's present. The ids param is sent in the second phase of a distributed search.

What you're running into here is that the MergeStrategy is not really used in combination with the AnalyticsQuery. There are users that use the MergeStrategy to handle custom merging of documents to produce custom rankings. But the AnalyticsQuery really hasn't been used much with the MergeStrategy that I'm aware of, so this has not been reported before.

I have moved away from using the MergeStrategy for merging custom analytics. I'll give you a little context for how this has evolved. The MergeStrategy was originally introduced for an e-commerce customer that wanted to produce custom rankings. As part of that work the AnalyticsQuery was added to support custom analytics, and the MergeStrategy supported that as well. Later, Streaming Expressions were added, which took control of the merge in a much more elegant way than the MergeStrategy.

So now there are features in Solr that nicely combine an AnalyticsQuery with a merge done through the Streaming Expression framework. The FeatureSelectionStream and the TextLogitStream use this approach. These two streams are in master and branch_6x if you want to see how they operate.

Joel Bernstein
http://joelsolr.blogspot.com/

On Thu, Aug 11, 2016 at 10:29 AM, tedsolr <tsm...@sciquest.com> wrote:

> OK, some more info ... it's not aggregating because the doc values it's using
> for grouping are the unique ID field's. There are some big differences in
> the whole flow between searches against a single shard collection, and
> searches against a multi-shard collection.
> In a single shard collection the
> AnalyticsQuery is called one time, and there's only one pass through the
> delegating collector. If someone could explain what's going on in a
> multi-sharded search that would help a lot I think. My test collection has
> two shards each one has a replica.
>
> For this search
> .../aggr?q=*:*&fl=VENDOR_NAME&sort=VENDOR_NAME+asc
> The user has selected just one field to view, so I make VENDOR_NAME the
> group by field.
>
> This is what I see while debugging:
> 1. custom AnalyticsQuery is instantiated and the "fl" param is VENDOR_NAME +
>    [AggregationStats]
> 2. custom AnalyticsQuery is instantiated (again) and the "fl" param is id +
>    [AggregationStats]
> 3. custom AnalyticsQuery is instantiated (again) and the "fl" param is id +
>    [AggregationStats]
> 4. getAnalyticsCollector() is called (fl is id + [AggregationStats])
> 5. getAnalyticsCollector() is called again (fl is id + [AggregationStats])
> 6. custom DelegatingCollector finish() is called
> 7. custom DelegatingCollector finish() is called
> 8. custom AnalyticsQuery is instantiated and the "fl" param is VENDOR_NAME +
>    [AggregationStats] + id + [AggregationStats]
> 9. custom AnalyticsQuery is instantiated and the "fl" param is VENDOR_NAME +
>    [AggregationStats] + id + [AggregationStats]
>
> And from the log:
>
> INFO - 2016-08-11 09:19:47.245; [ShardTest1 shard1_1 core_node4
> ShardTest1_shard1_1_replica1] org.apache.solr.core.SolrCore;
> [ShardTest1_shard1_1_replica1] webapp=/solr path=/aggr
> params={distrib=false&qt=/aggr&fl=id&shards.purpose=4&
> start=0&fsv=true&sort=VENDOR_NAME+asc&fq={!AggregationPostFilter+count%
> 3DCount+spend%3DINVOICE_AMOUNT}&shard.url=http://localhost:8983/solr/
> ShardTest1_shard1_1_replica1/|http://localhost:8984/solr/
> ShardTest1_shard1_1_replica2/&rows=10&version=2&q=*:*&NOW=
> 1470925120206&isShard=true&wt=javabin&_=1470925120222}
> hits=12096 status=0 QTime=64734
>
> INFO - 2016-08-11 09:19:48.876; [ShardTest1 shard1_0 core_node3
> ShardTest1_shard1_0_replica1] org.apache.solr.core.SolrCore;
> [ShardTest1_shard1_0_replica1] webapp=/solr path=/aggr
> params={distrib=false&qt=/aggr&fl=id&shards.purpose=4&
> start=0&fsv=true&sort=VENDOR_NAME+asc&fq={!AggregationPostFilter+count%
> 3DCount+spend%3DINVOICE_AMOUNT}&shard.url=http://localhost:8983/solr/
> ShardTest1_shard1_0_replica1/|http://localhost:8984/solr/
> ShardTest1_shard1_0_replica2/&rows=10&version=2&q=*:*&NOW=
> 1470925120206&isShard=true&wt=javabin&_=1470925120222}
> hits=12062 status=0 QTime=66365
>
> INFO - 2016-08-11 09:19:50.952; [ShardTest1 shard1_1 core_node4
> ShardTest1_shard1_1_replica1] org.apache.solr.core.SolrCore;
> [ShardTest1_shard1_1_replica1] webapp=/solr path=/aggr
> params={distrib=false&qt=/aggr&fl=VENDOR_NAME&fl=[AggregationStats]&fl=id&
> shards.purpose=64&fq={!AggregationPostFilter+count%
> 3DCount+spend%3DINVOICE_AMOUNT}&shard.url=http://localhost:8983/solr/
> ShardTest1_shard1_1_replica1/|http://localhost:8984/solr/
> ShardTest1_shard1_1_replica2/&version=2&q=*:*&NOW=
> 1470925120206&ids=100713,940122,44812,210965,584851&
> isShard=true&wt=javabin&_=1470925120222}
> status=0 QTime=2070
>
> INFO - 2016-08-11 09:19:53.176; [ShardTest1 shard1_0 core_node3
> ShardTest1_shard1_0_replica1] org.apache.solr.core.SolrCore;
> [ShardTest1_shard1_0_replica1] webapp=/solr path=/aggr
> params={distrib=false&qt=/aggr&fl=VENDOR_NAME&fl=[AggregationStats]&fl=id&
> shards.purpose=64&fq={!AggregationPostFilter+count%
> 3DCount+spend%3DINVOICE_AMOUNT}&shard.url=http://localhost:8983/solr/
> ShardTest1_shard1_0_replica1/|http://localhost:8984/solr/
> ShardTest1_shard1_0_replica2/&version=2&q=*:*&NOW=
> 1470925120206&ids=533737,44864,100672,940123,96752&
> isShard=true&wt=javabin&_=1470925120222}
> status=0 QTime=4293
>
> INFO - 2016-08-11 09:19:53.178; [ShardTest1 shard1_0 core_node3
> ShardTest1_shard1_0_replica1] org.apache.solr.core.SolrCore;
> [ShardTest1_shard1_0_replica1] webapp=/solr path=/aggr
> params={q=*:*&indent=true&fl=VENDOR_NAME&sort=VENDOR_NAME+
> asc&wt=json&_=1470925120222}
> hits=24158 status=0 QTime=72972
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/AnalyticsQuery-fails-on-a-sharded-collection-tp4289274p4291301.html
> Sent from the Solr - User mailing list archive at Nabble.com.
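
P.S. Here is a rough sketch of the "ids" check mentioned as a workaround, in case it helps. It is only an illustration, not code from Solr: the class name IdsParamCheck and the method shouldGatherAnalytics are made up, and a plain Map stands in for the shard request params so the snippet runs without Solr on the classpath. The only thing taken from the thread is that the second-phase shard requests (the shards.purpose=64 ones in the logs) carry an "ids" param and the first-phase ones do not.

```java
import java.util.Map;

// Hypothetical helper, not part of Solr: decides whether a shard request
// should gather analytics, based on the presence of the "ids" param.
final class IdsParamCheck {

    // Param sent to the shards in the second phase of a distributed search.
    static final String IDS = "ids";

    // A plain Map stands in for the shard request params (an assumption made
    // to keep this sketch self-contained). Returns false when "ids" is
    // present, i.e. when the request is the second-phase document fetch.
    static boolean shouldGatherAnalytics(Map<String, String> params) {
        return params.get(IDS) == null;
    }

    public static void main(String[] args) {
        // First phase: no "ids" param, so the analytics are gathered.
        Map<String, String> firstPhase =
                Map.of("q", "*:*", "fl", "id", "fsv", "true");
        // Second phase: "ids" lists the documents to fetch, so skip.
        Map<String, String> secondPhase =
                Map.of("q", "*:*", "ids", "100713,940122,44812");

        System.out.println(shouldGatherAnalytics(firstPhase));  // true
        System.out.println(shouldGatherAnalytics(secondPhase)); // false
    }
}
```

In a real AnalyticsQuery the same test would presumably go at the top of getAnalyticsCollector(), reading the params from the request, and when "ids" is present you would hand back a collector that simply passes documents through. Treat that placement as an assumption and check it against the Solr version you are running.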