Otis Gospodnetic wrote:
>
> I haven't used the German analyzer (either Snowball or the one we have in
> Lucene's contrib), but have you checked if that does the trick of keeping
> words together?
>

I'm not sure how that could work for words that are separated by spaces, especially since we use a WhitespaceTokenizer first in the filter chain.
I solved the problem for now by applying the following filter:

    import java.io.IOException;
    import java.util.LinkedList;
    import java.util.Queue;

    import org.apache.lucene.analysis.Token;
    import org.apache.lucene.analysis.TokenFilter;
    import org.apache.lucene.analysis.TokenStream;

    // Passes every token through unchanged; once the input is exhausted,
    // emits the concatenation of each pair of adjacent tokens.
    public class ConcatFilter extends TokenFilter {

        private Token _last;
        private final Queue<Token> _concatVersions = new LinkedList<Token>();

        public ConcatFilter(TokenStream input) {
            super(input);
        }

        @Override
        public Token next() throws IOException {
            final Token next = input.next();
            if (next != null) {
                if (_last != null) {
                    // Queue "previous + current" without a separating space.
                    // Note: the offsets span only the concatenated string,
                    // not its position in the source text.
                    final String concatStr = _last.termText() + next.termText();
                    _concatVersions.add(new Token(concatStr, 0, concatStr.length()));
                }
                _last = next;
                return next;
            } else if (!_concatVersions.isEmpty()) {
                // Input is exhausted: drain the queued concatenations.
                return _concatVersions.poll();
            }
            return null;
        }
    }
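For anyone who wants to check the emission order without setting up Lucene, here is a minimal stand-alone sketch of the same idea on plain strings. ConcatSketch and concatAdjacent are made-up names for illustration only, not part of Lucene or Solr, and the sketch ignores offsets and position increments entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not a Lucene or Solr class.
final class ConcatSketch {

    // Returns the input terms followed by the concatenation of each
    // adjacent pair, in the same order the filter above emits them.
    static List<String> concatAdjacent(List<String> terms) {
        List<String> out = new ArrayList<String>(terms);
        for (int i = 1; i < terms.size(); i++) {
            out.add(terms.get(i - 1) + terms.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [foo, bar, baz, foobar, barbaz]
        System.out.println(concatAdjacent(Arrays.asList("foo", "bar", "baz")));
    }
}
```

As in the filter, the concatenated terms only appear after all of the original terms, so they come out at the end of the stream rather than next to the words they were built from.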