I already have your suggested shards.qt set up in another collection for
a different reason, but I'll add that redirect here as well. Thanks for
the confirmation.
On 08/04/2015 10:45 AM, Shawn Heisey wrote:
On 8/4/2015 5:19 AM, David Santamauro wrote:
I have a question about how the stat 'requests' is calculated. I would
really appreciate it if anyone could shed some light on the figures below.
Assumptions:
version: 5.2.0
layout: 8-node SolrCloud, no replicas (node71-node78)
collection: col1
handler: /search
stats request: /col1/admin/mbeans?stats=true&cat=QUERYHANDLER&wt=json
I wrote a simple shell script that grabs the requests stats member from
every node.
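For anyone who wants to reproduce the numbers, the collection step can be
sketched roughly as below. This is not the original shell script, only a
minimal Python equivalent. The port 8983, the /solr URL prefix, and the
flat "solr-mbeans" list layout of the JSON response are assumptions; the
node names, collection, and handler come from the setup above.

```python
import json
from urllib.request import urlopen

NODES = [f"node{n}" for n in range(71, 79)]  # node71..node78, as in the layout above
PORT = 8983  # assumption: default Solr port


def extract_requests(payload, handler="/search"):
    """Return the 'requests' stat for one handler from a parsed mbeans response.

    Assumes "solr-mbeans" is a flat list that alternates a category name
    with a dict of handlers: [category, {handlers}, category, {handlers}, ...].
    """
    mbeans = payload["solr-mbeans"]
    for category, handlers in zip(mbeans[0::2], mbeans[1::2]):
        if category == "QUERYHANDLER" and handler in handlers:
            return handlers[handler]["stats"]["requests"]
    raise KeyError(handler)


def fetch_requests(node, collection="col1", handler="/search"):
    """Fetch the mbeans stats from one node and pull out the request count."""
    url = (
        f"http://{node}:{PORT}/solr/{collection}/admin/mbeans"
        "?stats=true&cat=QUERYHANDLER&wt=json"
    )
    with urlopen(url, timeout=10) as resp:
        return extract_requests(json.load(resp), handler)


def print_all():
    """Print one line per node in the same format as the listings in this message."""
    for node in NODES:
        print(f"{node} -- requests: {fetch_requests(node)}")
```

Calling print_all() before and after each test query gives listings like
the ones that follow.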
After collection reload
node 71 -- requests: 2
node 72 -- requests: 2
node 73 -- requests: 2
node 74 -- requests: 2
node 75 -- requests: 2
node 76 -- requests: 2
node 77 -- requests: 2
node 78 -- requests: 2
* I assume these are the auto-warm searches
After submitting 1 request (q=*:*)
node 71 -- requests: 4
node 72 -- requests: 3
node 73 -- requests: 3
node 74 -- requests: 3
node 75 -- requests: 3
node 76 -- requests: 4
node 77 -- requests: 3
node 78 -- requests: 3
After resubmitting the same request
node 71 -- requests: 6
node 72 -- requests: 4
node 73 -- requests: 4
node 74 -- requests: 4
node 75 -- requests: 4
node 76 -- requests: 5
node 77 -- requests: 5
node 78 -- requests: 4
If that wasn't strange enough, things get out of control if I add
facet.pivot parameter(s).
Fresh after reload (see above, 2 for every node)
Total after a facet.pivot on two fields
node 71 -- requests: 13
node 72 -- requests: 15
node 73 -- requests: 14
node 74 -- requests: 12
node 75 -- requests: 14
node 76 -- requests: 12
node 77 -- requests: 14
node 78 -- requests: 12
I imagine I'm seeing the internal cross-talk between nodes and if so,
how can one reliably keep stats on the number of "real" requests?
A query on a distributed index turns the one request that you make into
a request to every shard, to check for relevant documents. If relevant
documents are found, a second call is made to those specific shards to
retrieve those documents. So if you have 5 shards in your index, there
could be up to 11 requests counted for a single query: the original
request, 5 first-pass shard requests, and up to 5 second-pass requests.
If all the shards are on separate nodes, then for that 11-request query,
the node that received the original request would count three requests
and the others would count two.
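The counting described above can be written out as a small model. This is
only a sketch of the arithmetic, not of Solr internals: it assumes one
shard per node, that the node receiving the client request also hosts a
shard, and that the shards returning documents are simply the first few
(which shards they are is arbitrary here). The function name is made up
for illustration.

```python
def per_node_counts(num_shards, shards_with_hits=None):
    """Model how many requests each node counts for one client query.

    Index 0 is the node that received the client request.
    shards_with_hits is how many shards get the second (retrieval) call;
    it defaults to all of them, which is the worst case.
    """
    if shards_with_hits is None:
        shards_with_hits = num_shards
    counts = [0] * num_shards
    counts[0] += 1                      # the client request itself
    for i in range(num_shards):
        counts[i] += 1                  # first pass: ask every shard
    for i in range(shards_with_hits):
        counts[i] += 1                  # second pass: fetch the documents
    return counts


counts = per_node_counts(5)
print(counts, sum(counts))  # [3, 2, 2, 2, 2] 11
```

With 5 shards this gives 1 + 5 + 5 = 11 in total, three on the receiving
node and two on each of the others.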
I know what I'm going to say next would work on an index that is
distributed but *not* SolrCloud, and I think it will work in SolrCloud too.
If you add a "shards.qt" parameter to defaults in your main request
handler (usually /select) that points at another, identically configured
handler (perhaps named "/shards") that is also in solrconfig.xml, then
that other handler should receive the distributed requests and the main
handler should only count the "real" requests. You would be able to
track those numbers separately.
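As a variation on the above, shards.qt can also be sent as an ordinary
request parameter instead of being placed in the handler defaults, which
is handy for trying the idea out before editing solrconfig.xml. A rough
sketch of building such a request URL follows. The helper name, the host
and port, and the "/shards" handler name are illustrative assumptions,
and the "/shards" handler still has to exist in solrconfig.xml.

```python
from urllib.parse import urlencode


def build_query_url(base, collection, handler, params, shards_handler="/shards"):
    """Build a query URL that routes the per-shard sub-requests to shards_handler.

    base is something like "http://node71:8983/solr" (assumed, not from the
    original message). The client-facing handler then only sees the request
    made by the client itself.
    """
    query = dict(params)
    query["shards.qt"] = shards_handler  # per-shard requests go to this handler
    return f"{base}/{collection}{handler}?{urlencode(query)}"


url = build_query_url("http://node71:8983/solr", "col1", "/search", {"q": "*:*"})
print(url)
```

The request count on the main handler should then reflect only the
client requests, while the "/shards" handler accumulates the
distributed ones.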
Thanks,
Shawn