I am setting up Solr in an environment where the hostnames look like foo_bar, and I got this strange error:
org.apache.solr.client.solrj.SolrServerException:IOException occured when talking to server at: http://foo/bar:18984_solr
Solr cannot handle a hostname that contains '_'. The cause is the getBaseUrlForNodeName function.
The logic of this function is:
- split the input into two parts at the first '_' character
- URL decode the second part
- return first part + '/' + second part
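Here is a minimal sketch of those three steps, to show how the hostname gets mangled. It is not the actual Solr source: the class and method names are hypothetical, and the URL scheme prefix is left out for brevity.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

public class FirstUnderscoreSplitSketch {
    // Hypothetical helper that mirrors the steps listed above; not the real Solr method.
    static String baseUrlSplitAtFirst(String nodeName) {
        int idx = nodeName.indexOf('_');               // first '_' in the node name
        if (idx < 0) {
            throw new IllegalArgumentException("no '_' in node name: " + nodeName);
        }
        String hostAndPort = nodeName.substring(0, idx);
        // URL-decode only the part after the split point
        String path = URLDecoder.decode(nodeName.substring(idx + 1), StandardCharsets.UTF_8);
        return hostAndPort + "/" + path;
    }

    public static void main(String[] args) {
        // The hostname foo_bar is cut in the middle
        System.out.println(baseUrlSplitAtFirst("foo_bar:18984_solr"));      // foo/bar:18984_solr
        // A hostname without '_' works as intended
        System.out.println(baseUrlSplitAtFirst("127.0.0.1:61631_l_%2Fig")); // 127.0.0.1:61631/l_/ig
    }
}
```

The first result matches the broken address in the error message above.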
Solr would understand the hostname if the split point were changed to the last '_', but that breaks this test:
input: 127.0.0.1:61631_l_%2Fig
output: 127.0.0.1:61631/l_/ig
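To make the trade-off concrete, here is a rough sketch of splitting at the last '_' instead. The class and method names are hypothetical and the URL scheme is omitted; this is an illustration, not the code in the PR.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

public class LastUnderscoreSplitSketch {
    // Hypothetical variant: split at the LAST '_' rather than the first one.
    static String baseUrlSplitAtLast(String nodeName) {
        int idx = nodeName.lastIndexOf('_');
        if (idx < 0) {
            throw new IllegalArgumentException("no '_' in node name: " + nodeName);
        }
        String hostAndPort = nodeName.substring(0, idx);
        String path = URLDecoder.decode(nodeName.substring(idx + 1), StandardCharsets.UTF_8);
        return hostAndPort + "/" + path;
    }

    public static void main(String[] args) {
        // A hostname with '_' now survives
        System.out.println(baseUrlSplitAtLast("foo_bar:18984_solr"));      // foo_bar:18984/solr
        // But a context path containing '_' is now cut in the wrong place
        System.out.println(baseUrlSplitAtLast("127.0.0.1:61631_l_%2Fig")); // 127.0.0.1:61631_l//ig
    }
}
```

So the last-'_' split fixes hostnames like foo_bar but moves the ambiguity into context paths that contain '_', which is exactly what the test above exercises.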
It looks like someone already relies on this behavior, so changing it would break some existing code.
Erick advises that the split point should be '_solr', but that breaks even more unit tests.
I want to ask: does anyone depend on the current behavior, and may I change it?
Changing the split point to the last '_' looks like the least disruptive option.
If someone really does depend on the current behavior, I think the best way is to document that hostnames must conform to RFC 1035.
PR at https://github.com/apache/lucene-solr/pull/2164
