Hi,
There are a couple of ways of handling this.
One is to do it from the client side, i.e. send a Solr ping to each
shard beforehand to find out which shards, if any, are unavailable. This
may not always work if you use forwarders, proxies, etc.
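A rough sketch of that client-side check, in case it helps. The class and method names are made up for illustration. The HTTP probe assumes each shard is given as host:port/path (as in the shards parameter) and that the ping handler is reachable at /admin/ping under that path, which may not match your setup:

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper, not part of Solr or SolrJ.
final class ShardPinger {
    private ShardPinger() {}

    /** Returns the shards the given check reports as down, in input order. */
    static List<String> findUnavailable(List<String> shards, Predicate<String> isAlive) {
        List<String> down = new ArrayList<>();
        for (String shard : shards) {
            if (!isAlive.test(shard)) {
                down.add(shard);
            }
        }
        return down;
    }

    /**
     * Plain HTTP probe. Assumes the ping handler lives at
     * http://<shard>/admin/ping; any error or non-200 status counts as down.
     */
    static boolean httpPing(String shard) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) URI.create("http://" + shard + "/admin/ping")
                    .toURL().openConnection();
            conn.setConnectTimeout(2000); // milliseconds
            conn.setReadTimeout(2000);
            conn.setRequestMethod("GET");
            return conn.getResponseCode() == 200;
        } catch (IOException | IllegalArgumentException e) {
            return false;
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }
}
```

You would call it as ShardPinger.findUnavailable(shards, ShardPinger::httpPing) before sending the query. Passing the check in as a Predicate lets you swap in a SolrJ ping, or a fake for tests, without touching the loop.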
What we do instead is add the names of all failed shards to the
CommonParams.FAILED_SHARDS parameter in the response header (if
partialResults=true). For each failure we retrieve the current list
(if any) and append the new entry to it.
Excerpt from SearchHandler.java, handleRequestBody():
[code]
log.info("Waiting for shard replies...");
// now wait for replies, but if anyone puts more requests on
// the outgoing queue, send them out immediately (by exiting
// this loop)
while (rb.outgoing.size() == 0) {
  ShardResponse srsp = comm.takeCompletedOrError();
  if (srsp == null) break; // no more requests to wait for

  // If any shard does not respond (ConnectException) we respond with
  // other shards and set partialResults to true
  for (ShardResponse shardRsp : srsp.getShardRequest().responses) {
    Throwable th = shardRsp.getException();
    if (th != null) {
      log.info("Got shard exception for: " + srsp.getShard()
          + " : " + th.getClass().getName() + " cause: " + th.getCause());
      if (th instanceof SolrServerException && th.getCause() instanceof Exception) {
        // Was there an exception while partial results are disabled?
        // If so, abort everything and rethrow
        if (failOnShardFailure) {
          log.info("Not set for partial results. Aborting...");
          comm.cancelAll();
          throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, th);
        }

        // Simple class name of the underlying cause, e.g. ConnectException
        String cause =
            (srsp.getException() != null && srsp.getException().getCause() != null)
                ? srsp.getException().getCause().getClass().getSimpleName()
                : (th instanceof SolrServerException && th.getCause() != null
                    ? th.getCause().getClass().getSimpleName()
                    : th.getClass().getSimpleName());
        // Each entry has the form <shard>|<exception name>
        String entry = shardRsp.getShard() + "|" + cause;

        Object failed = rsp.getResponseHeader().get(CommonParams.FAILED_SHARDS);
        if (failed == null) {
          rsp.getResponseHeader().add(CommonParams.FAILED_SHARDS, entry);
        } else {
          // Append the failed shard to the existing list, delimiting
          // multiple failed shards with ';'
          rsp.getResponseHeader().remove(CommonParams.FAILED_SHARDS);
          rsp.getResponseHeader().add(CommonParams.FAILED_SHARDS,
              failed.toString() + ";" + entry);
        }
        log.error("Connection to shard [" + shardRsp.getShard()
            + "] did not succeed", th.getCause());
      } else {
        comm.cancelAll();
        if (th instanceof SolrException) {
          throw (SolrException) th;
        } else {
          throw new SolrException(SolrException.ErrorCode.SERVER_ERROR,
              srsp.getException());
        }
      }
    }
  }
  rb.finished.add(srsp.getShardRequest());
  // ... (excerpt ends here, still inside the while loop)
[/code]
Note that we also log the failure to the local server's log.
Your client can then extract the CommonParams.FAILED_SHARDS value from
the response header and display and/or process it accordingly.
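On the client, unpacking that value could look roughly like the sketch below. It assumes the format produced by the excerpt above: entries separated by ';', each entry being the shard followed by '|' and the exception's simple class name. The FailedShards class is a made-up helper, not part of Solr or SolrJ:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical client-side helper for the FAILED_SHARDS header value.
final class FailedShards {
    private FailedShards() {}

    /**
     * Parses e.g. "host1:8983/solr|ConnectException;host2:8983/solr|SocketTimeoutException"
     * into a map of shard -> exception name, keeping the order of the entries.
     */
    static Map<String, String> parse(String headerValue) {
        Map<String, String> failed = new LinkedHashMap<>();
        if (headerValue == null || headerValue.trim().isEmpty()) {
            return failed; // no failed shards reported
        }
        for (String entry : headerValue.split(";")) {
            if (entry.isEmpty()) {
                continue;
            }
            // Split on the LAST '|': a shard spec may itself contain '|'
            // when it lists alternative replicas.
            int sep = entry.lastIndexOf('|');
            if (sep < 0) {
                failed.put(entry, ""); // no exception name present
            } else {
                failed.put(entry.substring(0, sep), entry.substring(sep + 1));
            }
        }
        return failed;
    }
}
```

With SolrJ you would read the raw value from the response header (QueryResponse.getResponseHeader()) under whatever key CommonParams.FAILED_SHARDS resolves to in your patched build, then pass its string form to FailedShards.parse(...).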