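Merlin,

Ü encodes to two bytes in UTF-8 (%C3%9C) and to one byte in ISO-8859-1 (%DC), so it looks like there is a charset mismatch somewhere: the spider's request is presumably ISO-8859-1 encoded, while the rest of the chain expects UTF-8.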
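
One way to work around such a mismatch on the PHP side is to check whether the decoded term is valid UTF-8 and, if not, convert it. The following is only a minimal sketch: the function name decode_search_term() is made up for illustration, it assumes the mbstring extension is available, and it assumes that any non-UTF-8 input is ISO-8859-1, which may not hold for every spider.

```php
<?php
// Hypothetical helper (not part of the original setup): decode a
// percent-encoded search term and make sure the result is UTF-8.
function decode_search_term(string $raw): string
{
    $term = trim(urldecode($raw));

    // Assumption: a term that is not valid UTF-8 was sent as ISO-8859-1.
    if (!mb_check_encoding($term, 'UTF-8')) {
        $term = mb_convert_encoding($term, 'UTF-8', 'ISO-8859-1');
    }

    return $term;
}

// The ISO-8859-1 form (%DC) and the UTF-8 form (%C3%9C) of "Übersetzung"
// should both end up as the same UTF-8 string.
$fromSpider  = decode_search_term('%DCbersetzung');
$fromBrowser = decode_search_term('%C3%9Cbersetzung');
var_dump($fromSpider === $fromBrowser);
```

Note that this is a heuristic: some ISO-8859-1 byte sequences also happen to be valid UTF-8, and those would be passed through unchanged.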

Cheers
François

On Aug 27, 2011, at 6:34 AM, Merlin Morgenstern wrote:

> Hello,
>
> I am having problems with searches that are issued from spiders that contain
> the ASCII encoded character "ü"
>
> For example in: "Übersetzung"
>
> The solr log shows the following query request: /suche/%DCbersetzung
> which has been translated into solr query: q=?ersetzung
>
> If you enter the search term directly as a user into the search box it will
> result in:
> /suche/Übersetzung which returns perfect results.
>
> I am decoding the URL within PHP: $term = trim(urldecode($q));
>
> Somehow urldecode() translates the character Ü (%DC) into a ? which is an
> illegal first character in Solr.
>
> I tried it without urldecode(), with rawurldecode() and with utf8_decode()
> but none of those helped.
>
> Thank you for any help or hint on how to solve that problem.
>
> Regards, Merlin