Chris M. Hostetter created SOLR-14718: -----------------------------------------
Summary: Multiple flaws in tracking which UpdateCommand is associated with a given failure logged by ErrorReportingConcurrentUpdateSolrClient: "cmd=add{,id=(null)}"
Key: SOLR-14718
URL: https://issues.apache.org/jira/browse/SOLR-14718
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Reporter: Chris M. Hostetter

Here's an example, taken from SOLR-13486, of an ERROR logged by {{ErrorReportingConcurrentUpdateSolrClient}} when a distributed update failure occurred...

{noformat}
[junit4] 2> 1704143 ERROR (updateExecutor-6525-thread-1-processing-x:outOfSyncReplicasCannotBecomeLeader-false_shard1_replica_n1 r:core_node2 null n:127.0.0.1:34940_solr c:outOfSyncReplicasCannotBecomeLeader-false s:shard1) [n:127.0.0.1:34940_solr c:outOfSyncReplicasCannotBecomeLeader-false s:shard1 r:core_node2 x:outOfSyncReplicasCannotBecomeLeader-false_shard1_replica_n1] o.a.s.u.ErrorReportingConcurrentUpdateSolrClient Error when calling SolrCmdDistributor$Req: cmd=add{,id=(null)}; node=StdNode: http://127.0.0.1:40376/solr/outOfSyncReplicasCannotBecomeLeader-false_shard1_replica_n5/ to http://127.0.0.1:40376/solr/outOfSyncReplicasCannotBecomeLeader-false_shard1_replica_n5/
[junit4] 2> => java.io.IOException: java.net.ConnectException: Connection refused
{noformat}

In this case the underlying cause was a ConnectException, but the same ERROR message format is used regardless of the underlying Exception that was thrown. It is the result of these two bits of code...

{code:java}
// ErrorReportingConcurrentUpdateSolrClient.handleError
log.error("Error when calling {} to {}", req, req.node.getUrl(), ex);

// Req.toString()...
public String toString() {
  StringBuilder sb = new StringBuilder();
  sb.append("SolrCmdDistributor$Req: cmd=").append(cmd.toString());
  sb.append("; node=").append(String.valueOf(node));
  return sb.toString();
}
{code}

I was recently asked why the {{UpdateCommand cmd}} reported by {{Req.toString()}} was *ALWAYS* showing up as {{add\{,id=(null)};}} (i.e. an "empty" {{AddUpdateCommand}}) instead of correctly identifying which document was failing. In the above case of a "ConnectException" this may not matter, but the same problem exists if an individual document has a problem, perhaps due to schema conflicts detected by the leader when some other node forwards TOLEADER.

Based on an audit of the code, there appear to be at least 2 different bugs in Solr that can cause the "cmd" reported in these error situations to be wrong:

* UpdateCommand re-use in JavabinLoader
* ErrorReportingConcurrentUpdateSolrClient in StreamingSolrClients

...full notes to follow in comment.
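
As an illustration of how re-using a mutable command object could produce an "empty" command in a log message that is formatted later, here is a minimal, self-contained sketch. It is not Solr code: {{FakeAddCommand}}, {{FakeReq}}, {{run}} and the node name are hypothetical stand-ins, and whether this is the exact mechanism in JavabinLoader is an assumption until the full notes are posted. The sketch only shows the general hazard: a request that keeps a reference to a shared command, and calls {{toString()}} on it at error time, reports whatever state the command has by then.

```java
import java.util.ArrayList;
import java.util.List;

class ReusedCommandDemo {
    // Hypothetical stand-in for a mutable update command (not Solr's AddUpdateCommand).
    static class FakeAddCommand {
        String id;

        void clear() {
            id = null;
        }

        @Override
        public String toString() {
            return "add{,id=" + (id == null ? "(null)" : id) + "}";
        }
    }

    // Hypothetical stand-in for a queued request: it stores a reference to the
    // command and only formats it when toString() is called.
    static class FakeReq {
        final FakeAddCommand cmd;
        final String node;

        FakeReq(FakeAddCommand cmd, String node) {
            this.cmd = cmd;
            this.node = node;
        }

        @Override
        public String toString() {
            return "Req: cmd=" + cmd + "; node=" + node;
        }
    }

    // Queues one request per document, then formats the requests afterwards,
    // the way an asynchronous error handler would.
    static List<String> run(boolean copyPerRequest) {
        FakeAddCommand shared = new FakeAddCommand();
        List<FakeReq> pending = new ArrayList<>();
        for (String id : new String[] {"doc1", "doc2"}) {
            shared.id = id;
            FakeAddCommand forReq = shared;
            if (copyPerRequest) {
                // Give each request its own snapshot of the command.
                forReq = new FakeAddCommand();
                forReq.id = id;
            }
            pending.add(new FakeReq(forReq, "nodeA"));
        }
        // The producer resets its shared command once it is done with it.
        shared.clear();

        List<String> messages = new ArrayList<>();
        for (FakeReq req : pending) {
            messages.add(req.toString());
        }
        return messages;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // every request reports id=(null)
        System.out.println(run(true));  // each request reports its own id
    }
}
```

With the shared instance, both messages show {{add{,id=(null)}}}; with a per-request copy, they show {{doc1}} and {{doc2}}. Capturing the identifying fields when the request is created is one possible remedy in this toy model, not necessarily the fix appropriate for Solr.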