I'm not entirely sure about the fine points, but consider the filters that fold all the diacritics into their low-ASCII equivalents (in Solr that's ISOLatin1AccentFilterFactory, or ASCIIFoldingFilterFactory in newer releases). Using that filter at *both* index and search time on the English index might do the trick.
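A minimal sketch of what such a fieldType could look like in schema.xml, assuming ASCIIFoldingFilterFactory is available in your Solr version (older releases ship ISOLatin1AccentFilterFactory instead); the type name and tokenizer choice here are just illustrative:

```xml
<!-- Hypothetical field type: fold diacritics at both index and query time,
     before stemming, so 'münchen' is indexed and queried as 'munchen'. -->
<fieldType name="text_en_folded" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
  </analyzer>
</fieldType>
```

The order matters: folding has to run before the Snowball stemmer so the stemmer sees the folded form, and the same chain must be used for both index and query analyzers or the tokens won't match.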
In your example, both would be 'munchen'. Straight English would be unaffected by the filter, but any German words with diacritics that crept in would be folded into their low-ASCII "equivalents". This would also work at index time, just in case you indexed English text that had some German words.

NOTE: My experience is more on the Lucene side than the Solr side, but I'm sure the filters are available.

Best
Erick

On Wed, Jan 28, 2009 at 5:21 PM, Julian Davchev <j...@drun.net> wrote:
> Hi,
> I currently have two indexes with solr. One for english version and one
> with german version. They use respectively english/german2 snowball
> factory.
> Right now depending on which language is website currently I query
> corresponding index.
> There is requirement though that stuff is found regardless in which
> language is found.
> So for example if searching for muenchen (will be caught correctly by
> german snowball factory as münchen) in english index it should be found.
> Right now
> it is not as I suppose english factory doesn't really care about umlauts.
>
> Any pointers are more than welcome. I am considering synonyms but this
> will be kinda to heavy to follow/create.
> Cheers,
> JD
>