I'm shooting a bit in the dark here, but I'd guess that these are actually understandable results.
If you replace then stem, the stemming algorithm works on the exact same word, and you get the results you expect. If you stem then replace, the inputs to the stemmer are different, so the fact that your outputs are different isn't a surprise.

That is, your implicit assumption, it seems to me, is that 'wärme' and 'waerme' should go through the stemmer and become 'wärm' and 'waerm', on which you can then do the substitution and produce the same output. I don't think that's a valid assumption.

You could probably check the actual contents of your index with Luke and verify whether your assumptions are correct or not.

Best
Erick

On Thu, Jul 2, 2009 at 9:27 AM, Michael Lackhoff <mich...@lackhoff.de> wrote:

> In Germany we have a strange habit of seeing some sort of equivalence
> between Umlaut letters and a two letter representation. Example 'ä' and
> 'ae' are expected to give the same search results. To achieve this I
> added this filter to the "text" fieldtype definition:
>     <filter class="solr.PatternReplaceFilterFactory"
>             pattern="ä" replacement="ae" replace="all"
>     />
> to both index and query analyzers (and more for the other umlauts).
>
> This works well when I search for a name (a word not stemmed) but not
> e.g. with the word "Wärme".
> search for 'wärme' works
> search for 'waerme' does not work
> search for 'waerm' works if I move the EnglishPorterFilterFactory after
> the PatternReplaceFilterFactory.
>
> DebugQuery for "waerme" gives a parsedquery FS:waerm.
> What I don't understand is why the (existing) records are not found. If
> I understand it right, there should be 'waerm' in the index as well.
>
> By the way, the reason why I keep the EnglishPorterFilterFactory is that
> the records are in many languages and the English stemming gives good
> results in many cases and I don't want (yet) to multiply my fields to
> have language specific versions.
> But even if the stemming is not right because the language is not
> English I think records should be found as long as the analyzers are the
> same for index and query.
>
> This is with Solr 1.3.
>
> Can someone shed some light on what is going on and how I can achieve my
> goal?
>
> -Michael
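
To make the point about filter order concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: toy_stem is a made-up rule, not the Porter algorithm and not what EnglishPorterFilterFactory actually does, and replace_umlaut only mimics the single 'ä' pattern from the filter quoted above. The one property it borrows is that an English-oriented stemmer may well not treat 'ä' as a vowel, so the two spellings can take different paths through it.

```python
# Toy illustration only: NOT the Porter algorithm and NOT Solr code.
# Both functions below are hypothetical stand-ins for the two filters.

ASCII_VOWELS = set("aeiou")


def replace_umlaut(token: str) -> str:
    """Stand-in for the pattern-replace filter (only the 'ä' rule)."""
    return token.replace("ä", "ae")


def toy_stem(token: str) -> str:
    """Made-up stemming rule: drop a trailing 'e', but only if the
    remainder contains an ASCII vowel.

    Assumption: the stemmer knows only ASCII vowels, so 'ä' is
    handled like a consonant.
    """
    if token.endswith("e") and any(ch in ASCII_VOWELS for ch in token[:-1]):
        return token[:-1]
    return token


def replace_then_stem(token: str) -> str:
    """Analyzer chain with the replace filter BEFORE the stemmer."""
    return toy_stem(replace_umlaut(token))


def stem_then_replace(token: str) -> str:
    """Analyzer chain with the stemmer BEFORE the replace filter."""
    return replace_umlaut(toy_stem(token))


if __name__ == "__main__":
    for word in ("wärme", "waerme"):
        print(word, replace_then_stem(word), stem_then_replace(word))
```

With this toy rule, replace-then-stem maps both spellings to the same term, while stem-then-replace does not, because the stemmer sees two different inputs. Whether the real stemmer behaves the same way on your data is exactly what looking at the index with Luke would tell you.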