: so the only ones I can utilize are EdgeNGramTokenizerFactory and
: NGramTokenizerFactory.
:
: I've done some playing around with them but the best result I've gotten so far
: is a field-type that enables searching for specific letters, for example I can
: search for an item that contains the letters a and x, but it returns a hit no
: matter where these letters are in the text, they don't have to be next to each
: other, and that's not the result I was going for. If the field contains
: "monitor" I want a hit on a search for "onit" but not on "rint" for example.

NGramTokenizerFactory should work fine for this ... the key is to use it
at indexing time with the appropriate min and max gram sizes to meet your
needs. At query time, don't use it at all (use the keyword or whitespace
tokenizer instead). Since the query term is not broken up, the max gram
size has to be at least as long as the longest substring you expect users
to search for.

So the word "monitor" will be indexed as these tokens (but not necessarily
in this order)...

   m o n i t o r mo on ni it to or mon oni nit ... onit ...

...and at search time, when the user gives you "onit", that term will exist
in the index. "rint" never occurs as a contiguous run of characters in
"monitor", so it is never indexed and will not match.

: I've never attempted to construct a new field-type of my own before and I'm
: finding the available documentation somewhat incomplete and not very helpful

FWIW: creating a new FieldType is almost never what you need if you are
dealing with text .. creating new FieldTypes is something that typically
only needs to be done in cases where you want specialized encoding or
sorting.

-Hoss
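
PS: to illustrate the index-time vs. query-time split described above, here is a minimal Python sketch. It is not Solr code and does not claim to reproduce what NGramTokenizerFactory does internally. The names char_ngrams and matches are hypothetical helpers made up for this example, and the gram sizes 1 and 7 are assumed values chosen only to cover the word "monitor".

```python
def char_ngrams(text, min_gram, max_gram):
    """Return the set of contiguous character n-grams of `text` whose
    length is between min_gram and max_gram (inclusive)."""
    grams = set()
    for size in range(min_gram, max_gram + 1):
        # Slide a window of `size` characters across the text.
        for start in range(len(text) - size + 1):
            grams.add(text[start:start + size])
    return grams


def matches(indexed_value, query, min_gram=1, max_gram=7):
    """Index side: break the value into n-grams.
    Query side: keep the term whole and look it up as-is."""
    return query in char_ngrams(indexed_value, min_gram, max_gram)


index_terms = char_ngrams("monitor", 1, 7)
print("onit" in index_terms)  # True: "onit" is a contiguous substring
print("rint" in index_terms)  # False: the letters exist, but not adjacent
```

Note that with a smaller max gram size (say 3) the 4-character term "onit" would never be produced at index time, so the lookup would fail even though the substring is present. That is why the gram sizes need to be picked with the expected query lengths in mind.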