Thanks Robert, that worked perfectly for the index side of the house.  Now
on the query side I have a similar Tokenizer, but it's not behaving
quite the way I want.  The query tokenizer generates the tokens
properly, but I end up with a phrase query, i.e. field:"1 2 3 4",
when what I really want is field:1 OR field:2 OR field:3 OR field:4.
Is there something that needs to be set in the tokenizer to generate
this type of query, or is it something in the query parser?
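
To make the target concrete, here is a minimal sketch in plain Java of
the query string I'm after. It makes no Lucene calls. The class and
method names (OrQuerySketch, toOrQuery) are made up for illustration,
and splitting on whitespace is only a stand-in for whatever the real
query tokenizer emits.

```java
import java.util.ArrayList;
import java.util.List;

class OrQuerySketch {
    // Hypothetical helper: builds one "field:token" clause per token and
    // joins the clauses with OR. Whitespace splitting is an assumption
    // standing in for the custom tokenizer; tokens are not escaped for
    // query-parser special characters.
    static String toOrQuery(String field, String input) {
        List<String> clauses = new ArrayList<>();
        for (String token : input.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                clauses.add(field + ":" + token);
            }
        }
        return String.join(" OR ", clauses);
    }

    public static void main(String[] args) {
        // prints: field:1 OR field:2 OR field:3 OR field:4
        System.out.println(toOrQuery("field", "1 2 3 4"));
    }
}
```

Building the string by hand like this would only be a workaround. What
I'd prefer is for the parser to produce the equivalent query from the
tokenizer's output on its own.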

On Thu, Feb 9, 2012 at 9:02 PM, Robert Muir <rcm...@gmail.com> wrote:
> On Thu, Feb 9, 2012 at 8:54 PM, Jamie Johnson <jej2...@gmail.com> wrote:
>> Again, thanks.  I'll take a stab at that.  Are you aware of any
>> resources/examples of how to do this?  I figured I'd start with
>> WhitespaceTokenizer but wasn't sure if there was a simpler place to
>> start.
>>
>
> Well, easiest is if you can build what you need out of existing resources...
>
> But if you need to write your own, and if your input is not massive
> documents (i.e. you have no problem processing the whole field in RAM
> at once), you could try looking at PatternTokenizer for an example:
>
> http://svn.apache.org/repos/asf/lucene/dev/trunk/modules/analysis/common/src/java/org/apache/lucene/analysis/pattern/PatternTokenizer.java
>
> --
> lucidimagination.com
