Is there a tokenizer that supports providing variants of the tokens at index time? I'm looking for something that could take a syntax like:

International|I Business|B Machines|M

It would take each pipe-delimited token and preserve its position so that phrase queries work properly. The above would match queries for "International Business Machines" as well as "I B M" or any variants.

The point is that the variants would be generated externally as part of the indexing process, so they may not be as simple as the above.

Any ideas, or do I have to write a custom tokenizer to do this?

Thanks,
Paul
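
P.S. To make the intended behaviour concrete, here is a minimal Python sketch that is not tied to any particular search library. It assumes tokens are separated by whitespace and variants by a pipe, and the function names (parse_variants, build_index, phrase_matches) are made up for illustration. Every variant of a token is stored at the same position, which is what lets a phrase query match any combination.

```python
from collections import defaultdict


def parse_variants(text, sep="|"):
    """Return (term, position) pairs; all variants of a token share one position."""
    postings = []
    for position, group in enumerate(text.split()):
        for variant in group.split(sep):
            if variant:  # skip empty variants such as "IBM|"
                postings.append((variant, position))
    return postings


def build_index(postings):
    """Map each term to the set of positions at which it occurs."""
    index = defaultdict(set)
    for term, position in postings:
        index[term].add(position)
    return dict(index)


def phrase_matches(index, phrase):
    """True if the phrase terms occur at consecutive positions."""
    terms = phrase.split()
    if not terms:
        return False
    return any(
        all(start + offset in index.get(term, ()) for offset, term in enumerate(terms))
        for start in index.get(terms[0], ())
    )


postings = parse_variants("International|I Business|B Machines|M")
index = build_index(postings)
print(phrase_matches(index, "International Business Machines"))
print(phrase_matches(index, "I B M"))
print(phrase_matches(index, "Business International"))
```

With the positions shared like this, a mixed phrase such as "I Business M" also matches, which may or may not be desirable depending on the variants.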