Chris M. Hostetter created SOLR-14718:
-----------------------------------------
Summary: Multiple flaws in tracking which UpdateCommand is
associated with a given failure logged by
ErrorReportingConcurrentUpdateSolrClient: "cmd=add{,id=(null)}"
Key: SOLR-14718
URL: https://issues.apache.org/jira/browse/SOLR-14718
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Reporter: Chris M. Hostetter
Here's an example, taken from SOLR-13486, of an ERROR logged by
{{ErrorReportingConcurrentUpdateSolrClient}} when a distributed update failure
occurred...
{noformat}
[junit4] 2> 1704143 ERROR
(updateExecutor-6525-thread-1-processing-x:outOfSyncReplicasCannotBecomeLeader-false_shard1_replica_n1
r:core_node2 null n:127.0.0.1:34940_solr
c:outOfSyncReplicasCannotBecomeLeader-false s:shard1) [n:127.0.0.1:34940_solr
c:outOfSyncReplicasCannotBecomeLeader-false s:shard1 r:core_node2
x:outOfSyncReplicasCannotBecomeLeader-false_shard1_replica_n1]
o.a.s.u.ErrorReportingConcurrentUpdateSolrClient Error when calling
SolrCmdDistributor$Req: cmd=add{,id=(null)}; node=StdNode:
http://127.0.0.1:40376/solr/outOfSyncReplicasCannotBecomeLeader-false_shard1_replica_n5/
to
http://127.0.0.1:40376/solr/outOfSyncReplicasCannotBecomeLeader-false_shard1_replica_n5/
[junit4] 2> => java.io.IOException: java.net.ConnectException:
Connection refused
{noformat}
In this case the underlying cause was a ConnectException, but the same
ERROR msg format is used regardless of the underlying Exception that was thrown,
and it's the result of these two bits of code...
{code:java}
// ErrorReportingConcurrentUpdateSolrClient.handleError
log.error("Error when calling {} to {}", req, req.node.getUrl(), ex);
// Req.toString()...
public String toString() {
  StringBuilder sb = new StringBuilder();
  sb.append("SolrCmdDistributor$Req: cmd=").append(cmd.toString());
  sb.append("; node=").append(String.valueOf(node));
  return sb.toString();
}
{code}
I was recently asked why the {{UpdateCommand cmd}} reported by the
{{Req.toString()}} was *ALWAYS* showing up as {{add\{,id=(null)};}} (ie: an
"empty" {{AddUpdateCommand}} ) instead of correctly identifying which document
was failing.
In the above case of a ConnectException this may not matter, but the same
problem exists if an individual document has a problem, perhaps due to schema
conflicts detected by the leader when some other node forwards TOLEADER.
Based on an audit of the code, there appear to be at least 2 different bugs in
Solr that can cause the "cmd" reported in these error situations to be wrong:
* UpdateCommand re-use in JavabinLoader
* ErrorReportingConcurrentUpdateSolrClient in StreamingSolrClients
...full notes to follow in comment.
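
As a rough illustration of how command re-use alone could produce the {{add\{,id=(null)}}} symptom, here is a minimal self-contained sketch. It is NOT Solr code: {{AddCmd}}, {{Req}} and {{describeAfterReuse}} are hypothetical stand-ins, and the actual mechanism in JavabinLoader may differ from what is shown. It assumes only that a request keeps a reference to a mutable command and formats it later, after the producer has cleared and re-used that same instance.

```java
import java.util.ArrayList;
import java.util.List;

public class ReusedCommandSketch {

    /** Hypothetical mutable command; a stand-in, not Solr's AddUpdateCommand. */
    static final class AddCmd {
        String id; // null until a document is attached

        void clear() { id = null; }

        AddCmd copy() {
            AddCmd c = new AddCmd();
            c.id = id;
            return c;
        }

        @Override
        public String toString() {
            return "add{,id=" + (id == null ? "(null)" : id) + "}";
        }
    }

    /** Hypothetical queued request that formats its command lazily, at log time. */
    static final class Req {
        final AddCmd cmd;
        final String node;

        Req(AddCmd cmd, String node) {
            this.cmd = cmd;
            this.node = node;
        }

        @Override
        public String toString() {
            return "Req: cmd=" + cmd + "; node=" + node;
        }
    }

    /**
     * Queues two documents using one shared command instance, clears that
     * instance, and only then renders the requests (as an error handler would).
     * With snapshot=true each request gets its own copy of the command.
     */
    static List<String> describeAfterReuse(boolean snapshot) {
        AddCmd shared = new AddCmd();
        List<Req> queued = new ArrayList<>();
        for (String id : new String[] {"doc1", "doc2"}) {
            shared.clear();
            shared.id = id;
            queued.add(new Req(snapshot ? shared.copy() : shared, "nodeA"));
        }
        shared.clear(); // producer resets the shared instance when it is done

        List<String> rendered = new ArrayList<>();
        for (Req r : queued) {
            rendered.add(r.toString());
        }
        return rendered;
    }

    public static void main(String[] args) {
        System.out.println(describeAfterReuse(false)); // shared instance
        System.out.println(describeAfterReuse(true));  // per-request copies
    }
}
```

In this sketch the shared-instance run renders every request with {{id=(null)}}, while the per-request-copy run keeps {{doc1}} and {{doc2}}. Whether copying the command or capturing its description eagerly is the right fix for Solr is a separate question for the follow-up notes.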