samuelgmartinez commented on a change in pull request #1480: URL: https://github.com/apache/lucene-solr/pull/1480#discussion_r421106586
########## File path: solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpSolrClient.java ########## @@ -568,12 +568,18 @@ private HttpEntityEnclosingRequestBase fillContentStream(SolrRequest request, Co // Read the contents entity = response.getEntity(); respBody = entity.getContent(); - Header ctHeader = response.getLastHeader("content-type"); - String contentType; - if (ctHeader != null) { - contentType = ctHeader.getValue(); - } else { - contentType = ""; + String mimeType = null; + Charset charset = null; + String charsetName = null; + + ContentType contentType = ContentType.get(entity); + if (contentType != null) { + mimeType = contentType.getMimeType().trim().toLowerCase(Locale.ROOT); + charset = contentType.getCharset(); + + if (charset != null) { + charsetName = charset.name(); + } } Review comment: I made the change already and was about to push it when I just wondered if it's the right decision. Here is my thinking: * The 626 log line wouldn't cause an NPE but just concat `null`. I agree that it should not be like that and should log a proper value. * I'm afraid of taking the decision of the default encoding in this class as 'null' encoding for most operations would mean: "either default on Charset.defaultCharset() or take the decision yourself based on whatever you know you are parsing", specially regarding the `ResponseParser`s. For example, the Json parser is defaulting to UTF-8 when the encoding is passed as null, but it could be that any other parser defaults to JVM `file.encoding` or an XML parser defaults to `ISO-8859-1`. In the original PR I just replaced the hardcoded references to `UTF-8` to the properly referenced `FALLBACK_CHARSET`, while I kept the response's original encoding (whether it's null or not) usages; like the one in line 634, `processor.processResponse` is passing `null` in the original code also. Do you think that we can safely take the decision at this component layer (and hardcoded it to UTF-8) instead of delegating it to the `ResponseParser`s? ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org