[ 
https://issues.apache.org/jira/browse/SOLR-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17007784#comment-17007784
 ] 

Erick Erickson commented on SOLR-14159:
---------------------------------------

I'm baffled. What I know now:

1> I ran into this while trying to figure out SOLR-13486. Near as I can tell, 
it's a completely different issue.
2> I can reproduce this reasonably regularly on my MBP, but not on my Mac Pro.
3> The failure is in TestCloudConsistency.assertDocExists and it's "connection 
refused". The cases I've seen all seem to involve a follower, not the leader.
4> I have no clue whether the follower being queried is up to date since the 
connection refused error short-circuits the check.
5> The port is not a proxy. The attached log file shows failures on port 49186.
6> This failure happens when beasting a single test in TestCloudConsistency, so 
no other test can be polluting the test space.
7> If I'm reading the log file correctly, the recovery process completes 
successfully for the node that we can't connect to.
8> live_nodes contains the problem node and the cluster state looks like it's 
active.
9> Just for yucks, I tried creating a new HttpSolrServer in the failure case 
and that doesn't seem to make any difference.
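
Points 3> through 5> leave open whether the node's port is refusing 
connections at the TCP level or whether the failure comes from a layer above 
it. One way to tell the two apart is a raw socket probe that bypasses the HTTP 
client entirely. The following is only a minimal sketch under that assumption: 
PortProbe, probe and Result are hypothetical names, not part of Solr or its 
test framework, and the host and port would have to come from the failing run.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Solr: classifies a plain TCP connect attempt.
public class PortProbe {

    enum Result { ACCEPTING, REFUSED, OTHER_FAILURE }

    // Attempts a bare TCP connection and reports how it ended.
    static Result probe(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return Result.ACCEPTING;      // something is listening on the port
        } catch (ConnectException e) {
            return Result.REFUSED;        // typically "Connection refused"
        } catch (IOException e) {
            return Result.OTHER_FAILURE;  // timeout, unreachable host, etc.
        }
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.out.println("usage: java PortProbe <host> <port>");
            return;
        }
        Result result = probe(args[0], Integer.parseInt(args[1]), 1000);
        System.out.println(args[0] + ":" + args[1] + " -> " + result);
    }
}
```

If such a probe reported REFUSED at the moment assertDocExists fails, nothing 
would be listening on that port, which would point at Jetty or the test 
framework rather than at the client.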

The attached failure log has a ton of additional debug information in it. The 
attached patch is what I'm using; it has no substantive changes beyond the 
additional logging.

The log file is very confusing since it has all this additional logging output. 
So far I can't find any smoking guns.

I suspect this is something in our underlying test framework, or something in 
Jetty that I'm not seeing. Perhaps the underlying cause has wider implications 
for other tests, but that's speculation.

Any help appreciated, I'm pretty much at my wit's end.

> Fix errors in TestCloudConsistency
> ----------------------------------
>
>                 Key: SOLR-14159
>                 URL: https://issues.apache.org/jira/browse/SOLR-14159
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Erick Erickson
>            Assignee: Erick Erickson
>            Priority: Major
>         Attachments: SOLR-14159_debug.patch, stdout
>
>
> Moving over here from SOLR-13486 as per Hoss.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
