Query by phrase is a core feature of tokenized text in Lucene and Solr, so
there is no need to use a pattern token filter for that purpose. And yes,
doing so pretty much breaks most token filters, which assume that the
text has been tokenized.
-- Jack Krupansky
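
To make the point above concrete, here is a minimal Python sketch of how a
phrase query can work over whitespace-tokenized text with synonyms stacked at
the same position. It is a toy model, not Lucene or Solr code: the SYNONYMS
map and the analyze/phrase_match helpers are hypothetical names invented for
this illustration.

```python
# Toy model only: this is NOT how Lucene stores its index, just the idea.
# Hypothetical synonym map, invented for this illustration.
SYNONYMS = {
    "ft": {"fort"}, "fort": {"ft"},
    "st": {"saint"}, "saint": {"st"},
}

def analyze(text):
    """Lowercase, split on whitespace, and stack synonyms at each position."""
    return [{tok} | SYNONYMS.get(tok, set()) for tok in text.lower().split()]

def phrase_match(query, doc):
    """Return True if the query words occur at consecutive positions in doc."""
    q = query.lower().split()
    if not q:
        return False
    positions = analyze(doc)
    for start in range(len(positions) - len(q) + 1):
        if all(q[i] in positions[start + i] for i in range(len(q))):
            return True
    return False

print(phrase_match("ft st john", "Fort Saint John"))   # True
print(phrase_match("john fort", "Fort Saint John"))    # False
```

In this model the text is still split into words, yet a query only matches
when all of its words appear next to each other and in order.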
-----Original Message-----
From: solr-user
Sent: Wednesday, April 2, 2014 12:46 PM
To: solr-user@lucene.apache.org
Subject: Re: how do I get search for "fort st john" to match "ft saint john"
Hi Eric.
No, that doesn't fix the problem either (I have tested this previously and
did so again just now).
Since the PatternTokenizerFactory is not tokenizing on whitespace (by design,
since I want the user to search by phrase), the phrase "marina former fort
ord" (for example) does not get turned into four tokens ("marina", "former",
"fort" and "ord"), and so the SynonymFilterFactory does not create synonyms
for them (by design).
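
A small Python sketch of the effect described above, assuming a per-word
synonym map (the SYNONYMS entries and the expand helper are hypothetical and
only imitate the behaviour, they are not the SynonymFilterFactory itself):

```python
# Hypothetical per-word synonym map, for illustration only.
SYNONYMS = {"fort": ["ft"], "saint": ["st"]}

def expand(tokens):
    """Append synonyms for every token that has an exact entry in the map."""
    out = []
    for tok in tokens:
        out.append(tok)
        out.extend(SYNONYMS.get(tok, []))
    return out

phrase = "marina former fort ord"
single_token = [phrase]        # whole phrase kept as one token
word_tokens = phrase.split()   # whitespace tokenization

print(expand(single_token))  # ['marina former fort ord']
print(expand(word_tokens))   # ['marina', 'former', 'fort', 'ft', 'ord']
```

The single token "marina former fort ord" has no entry of its own in the map,
so nothing is added; only the whitespace-tokenized form exposes "fort" to the
lookup.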
The original question remains: is there a tokenizer/plugin that will allow
me to apply synonyms to words in an unbroken phrase?
Note: the reason I don't want to tokenize the data by whitespace is that it
would cause way too many results to be returned if I, for example, search on
"new" or "st" ... However, I still want to be able to include "fort saint
john" in the results if the user searches for "ft st john" or "fort st john"
or ...