Solr handles UTF-8, so it should be able to index and search that character. The problem you'll have is getting the UTF-8 characters through all the various transport encodings intact. For example, if you search from a browser, you need to encode the character so the browser passes it through. If you search through SolrJ, it needs to be encoded at that level. If you use cURL, it needs another…
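As a rough illustration of the browser/HTTP case, here is a small Python sketch that percent-encodes U+FFFD as UTF-8 before putting it in a query URL. The host, collection name (`mycollection`), and field name (`title_s`) are made up for the example, and it assumes the field is a plain string field; whether the query actually matches will depend on your schema and analyzers.

```python
from urllib.parse import quote, urlencode

# U+FFFD REPLACEMENT CHARACTER; its UTF-8 encoding is the bytes EF BF BD.
REPLACEMENT_CHAR = "\ufffd"


def build_select_url(base_url, field, term):
    """Build a /select URL with the query percent-encoded as UTF-8.

    base_url and field are hypothetical placeholders, not values
    taken from the original question.
    """
    params = urlencode({"q": f"{field}:{term}"})
    return f"{base_url}/select?{params}"


# Percent-encode the character on its own to see what goes on the wire.
encoded = quote(REPLACEMENT_CHAR)

# Hypothetical Solr endpoint and field name, for illustration only.
url = build_select_url(
    "http://localhost:8983/solr/mycollection", "title_s", REPLACEMENT_CHAR
)

print(encoded)
print(url)
```

The same percent-encoded form is what you would paste into a browser address bar or pass to cURL if the shell or terminal does not send the raw character as UTF-8.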
> On Dec 1, 2020, at 12:30 AM, Eran Buchnick <buchni...@gmail.com> wrote:
>
> Hi community,
> During integration tests with new data source I have noticed weird scenario
> where replacement character can't be searched, though, seems to be stored.
> I mean, honestly, I don't want that irrelevant data stored in my index but
> I wondered if solr can index replacement character (U+FFFD �) as string, if
> so, how to search it?
> And in general, is there any built-in char filtration?!
>
> Thanks