: I can put "chinese cuisines" and "chinese cuisine" as two different tokens.
: But I was wondering if there is better way to do it, like tweaking
: different FilterFactory.
: 
: My problem is, if it's not exact match, when I search "cuisine", that would
: match both, I don't want that happen.

this is where lots of specifics matter...

it sounds like what *you* mean by "exact match" is...

 * i want to be able to use analyzers that do stemming (and maybe 
lowercasing, and maybe stopwords, and maybe synonyms)
 * i want queries to match documents only if all of the query words are in 
the doc field in order
 * i only want documents to match if there are no other words in the 
doc field besides the words in the query.

does that sound about right?

If so, then KeywordTokenizer isn't going to help you.

i think the simplest way to do what you want is what you alluded to about 
inserting marker tokens at the beginning and end of your field values when 
indexing, and then do "phrase queries" that include those marker tokens.

you could probably use something like PatternReplaceCharFilter to inject 
the start/end tokens fairly easily in both your index & query analyzers -- 
just make sure you don't pick something that would get removed by another 
tokenfilter later.
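
as a rough, untested sketch of that idea: the field type name "text_exact", 
the marker tokens "xxstartxx" / "xxendxx", and the particular tokenizer and 
filters below are all placeholders i picked for illustration, not anything 
from your schema. the single <analyzer> (no type attribute) is used at both 
index and query time, so the markers get added on both sides.

```xml
<!-- Sketch only: the name, marker tokens, and filter chain are assumptions. -->
<fieldType name="text_exact" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Wrap the whole input in marker tokens before it is tokenized.
         Assumes single-line values; "." does not match newlines by default. -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="^(.*)$" replacement="xxstartxx $1 xxendxx"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Stemming makes "cuisines" and "cuisine" produce the same term. -->
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
```

the markers are plain letters so that a tokenizer is less likely to split 
them, but you should still run a sample value through the analysis screen 
to confirm they come out the other end of your real filter chain unchanged.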

once you have that in place, using something like the "FieldQParser" may 
be the easiest way to generate the phrase queries...

   q={!field f=name}chinese cuisine&debugQuery=true


-Hoss
