All, I don't know if this change was intended, but it feels like a bug to me...
    TokenFilterFactory[] filters = new TokenFilterFactory[2];
    filters[0] = new LowerCaseFilterFactory(Collections.EMPTY_MAP);
    filters[1] = new ASCIIFoldingFilterFactory(Collections.EMPTY_MAP);
    TokenizerChain chain = new TokenizerChain(
        new MockTokenizerFactory(Collections.EMPTY_MAP), filters);
    System.out.println("NORMALIZE: "
        + chain.normalize("text0", "f\u00F6\u00F6Ba").utf8ToString());
    System.out.println("NORMALIZE with multiterm: "
        + chain.getMultiTermAnalyzer().normalize("text0", "f\u00F6\u00F6Ba").utf8ToString());

Output:

    NORMALIZE: fooba
    NORMALIZE with multiterm: fööBa

If this is a bug and not the desired behavior, the source of the problem is that the Analyzer returned by TokenizerChain's getMultiTermAnalyzer() has no override of Analyzer#normalize(String fieldName, TokenStream in). Without that override, the default no-op normalization is used, so the multi-term analyzer returned by TokenizerChain doesn't actually normalize multi-term queries at all. If this is a bug, I'll open a ticket.
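For what it's worth, a fix along these lines might work: inside the anonymous Analyzer that getMultiTermAnalyzer() returns, add a normalize(String, TokenStream) override that runs the chain's filter factories through TokenFilterFactory#normalize(TokenStream). This is only a sketch against my reading of the code, not a tested patch; the field name "filters" is whatever TokenizerChain actually calls its TokenFilterFactory array:

```java
// Hypothetical override inside the Analyzer returned by
// TokenizerChain.getMultiTermAnalyzer(). "filters" is assumed to be
// the TokenizerChain's TokenFilterFactory[] member.
@Override
protected TokenStream normalize(String fieldName, TokenStream in) {
  TokenStream result = in;
  // TokenFilterFactory#normalize is a no-op by default; factories such as
  // LowerCaseFilterFactory and ASCIIFoldingFilterFactory override it to
  // apply their filter at normalization time.
  for (TokenFilterFactory filter : filters) {
    result = filter.normalize(result);
  }
  return result;
}
```

With an override like this in place, I'd expect the second println in my snippet above to print "fooba" as well.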