mikemccand commented on issue #60: URL: https://github.com/apache/lucene-jira-archive/issues/60#issuecomment-1193291103
Indeed, I can see that we converted it "correctly" (preserving the U+0010 character):

```
"body": "Patch is available but it involves a binary file change.\u0010\nSo no easy to review and not easy to check with different OS\n\n[Legacy Jira: Alessandro Benedetti (@alessandrobenedetti) on [May 23 2018](https://issues.apache.org/jira/browse/LUCENE-8329?focusedCommentId=16487052&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16487052)]\n",
```

Maybe GitHub's import API does not allow "exotic" (yet correctly encoded) Unicode characters and remaps some of them to U+FFFD?

I tried searching for � LOL in GitHub's search, but it matches nothing. I guess they do not index that character as a separate token ;)
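
For anyone who wants to double-check this on other converted issues, here is a minimal Python sketch of the check. It only confirms that a `\u0010` escape in the converted JSON decodes to a real U+0010 character and is re-escaped when serialized again. It says nothing about what GitHub's import API does with that character afterwards. The helper name `find_control_chars` and the shortened sample body are made up for illustration and are not part of the migration scripts.

```python
import json
import unicodedata


def find_control_chars(text):
    """Return (index, 'U+XXXX') pairs for control characters in text.

    Newline, carriage return, and tab are skipped because they are
    expected in ordinary comment bodies.
    """
    allowed = {"\n", "\r", "\t"}
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cc" and ch not in allowed
    ]


# Shortened, hypothetical sample modeled on the converted comment above.
raw = (
    '{"body": "Patch is available but it involves a binary file change.'
    '\\u0010\\nSo no easy to review"}'
)

body = json.loads(raw)["body"]   # decodes \u0010 into the actual character
hits = find_control_chars(body)  # positions of unexpected control characters
reencoded = json.dumps(body)     # the default encoder escapes it again

print(hits)
print(reencoded)
```

If `hits` is non-empty for a converted body, the control character survived the conversion, so any U+FFFD seen later would have been introduced further downstream.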