: Is there any debug settings to see where the time is taken during a
: distributed search?

I don't think so.  The existing timing code will show you how much 
time each search component took, but I don't think anything breaks it down 
to isolate the remote requests.

: When I query all shards together, I get:
: 
: http://host:8880/solr/select/?shards=host1:8881/solr,host2:8882/solr,host3:8883/solr,host4:8884/solr,host5:8885/solr,host6:8886/solr,host7:8887/solr&q=cancer
: 428 then 287
: 
: If I isolate each shard like this:
: http://host:8880/solr/select/?shards=host1:8881/solr&q=cancer
: 195,146,844,230,51,48,43
: 
: Then going directly gets this:
: http://host1:8881/solr/select/?q=cancer
: 0,1,0,1,1,1,1

FWIW: hitting a single shard directly, as in this last URL, doesn't give 
you an accurate representation of what happens when that shard is hit via 
the second URL.  Different code paths are triggered on the shards when 
the request originates from the ... I'm not sure what the term is ... 
coordinator? (host:8880 in your example).  If you look at the logs for 
host1:8881 when hitting the second URL you listed, you should see 
multiple requests coming in, representing the multiple passes that are 
involved in a distributed search.

: I can see taking a few sample responses is not conclusive to say one shard
: is slower or faster. However, the query time directly is orders of magnitude
: faster than through shards.

If I remember correctly, distributed searching on one shard (or even a few 
small shards) is almost always going to be slower than having a single 
index, if you can fit that index on one machine.

: My only guess is this is network based and involved in passing the results
: around in order to reduce them.
: 
: Is there any debug or way to confirm and investigate this further?

I would look at the timing info logged on each of the shards, and compute 
some aggregate info about how long the various stages take.  You don't 
necessarily need to correlate that this request to the coordinator 
corresponds exactly to these requests on each shard.  As long as you can 
say "on average a query for cancer takes this long on the coordinator, 
and on average the stage1 query for cancer takes this much time on each 
shard and the stage2 query for cancer takes this much time on each 
shard,..." then you should be able to make some assessments about where 
the time is being spent.
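
As a rough illustration of that kind of aggregation, here is a minimal 
Python sketch.  It rests on assumptions you should check against your own 
logs: that each request line contains "params={...}" followed by 
"QTime=N", that shard-side requests carry an isShard parameter, and that 
the second pass can be recognized by an ids parameter.  The function names 
are made up for this example.

```python
import re
from collections import defaultdict

# Assumed log line shape (check against your own logs):
#   ... path=/select params={q=cancer&isShard=true&fsv=true} hits=12 status=0 QTime=7
LINE_RE = re.compile(r"params=\{(?P<params>[^}]*)\}.*?QTime=(?P<qtime>\d+)")


def classify(params):
    """Guess which kind of request a logged params string represents."""
    keys = {p.split("=", 1)[0] for p in params.split("&") if p}
    if "shards" in keys:
        return "coordinator"      # request that fans out to the shards
    if "isShard" not in keys:
        return "direct"           # plain, non-distributed request
    # Assumption: the second pass fetches documents by id.
    return "stage2" if "ids" in keys else "stage1"


def average_qtimes(lines):
    """Return the mean QTime in ms per request kind; skip unparseable lines."""
    totals = defaultdict(lambda: [0, 0])  # kind -> [sum of QTime, count]
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        bucket = totals[classify(match.group("params"))]
        bucket[0] += int(match.group("qtime"))
        bucket[1] += 1
    return {kind: total / count for kind, (total, count) in totals.items()}
```

Running average_qtimes() over the lines of each shard's log, and over the 
coordinator's log, gives per-stage averages you can compare side by side.  
If the shard-side averages are small but the coordinator average is large, 
that points at time spent outside the shards' own query handling.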


-Hoss
