The standard, keyword-oriented query parsers all treat unquoted,
unescaped white space as a term delimiter and then discard it. There is
no way to bypass that behavior. So your regex will never even see the white
space unless you enclose the text and white space in quotes or use a
backslash to escape each white space character.
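To make the effect concrete, here is a rough Python sketch of what happens to the "nospaces" field type quoted below. It is only a simplified model, not Solr code: the function names are made up for illustration, the parser is reduced to a plain white space split, and re.ASCII is an assumption so that \w behaves like Java's default \w.

```python
import re

# Mirrors pattern="[^\w]+" replacement="" replace="all" from the schema below.
# re.ASCII is an assumption so that \w matches only [A-Za-z0-9_], as in Java.
NON_WORD = re.compile(r"[^\w]+", re.ASCII)


def analyze(text):
    """Keyword tokenizer (one token), lowercase, then strip non-word chars."""
    return NON_WORD.sub("", text.lower())


def parse_then_analyze(query):
    """Simplified model: the parser splits on bare white space first,
    then each term goes through the analyzer on its own."""
    return [analyze(term) for term in query.split()]


indexed = analyze("East Enders")
print(indexed)                              # eastenders
print(parse_then_analyze("east end ers"))   # ['east', 'end', 'ers']
print(analyze("east end ers") == indexed)   # True
```

None of the three separately analyzed terms equals the indexed token, which is why nothing matches. Only when the whole string reaches the analyzer in one piece does it collapse to the same token.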
You can use the "field" and "term" query parsers to pass a query string as
if it were fully enclosed in quotes, but that only handles a single term and
does not allow for multiple terms or any query operators. For example:
{!field f=myfield}Foo Bar
See:
http://wiki.apache.org/solr/QueryParser
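As a rough client-side sketch of the options above (quoting, backslash escaping, and the {!field} prefix), the q value could be built like this in Python. The helper names are hypothetical and not part of any Solr client library, and "myfield" is just the placeholder field name from the example above.

```python
import re
from urllib.parse import urlencode


def escape_whitespace(text):
    """Put a backslash in front of every white space character."""
    return re.sub(r"(\s)", r"\\\1", text)


def quote_phrase(text):
    """Wrap the text in double quotes, escaping embedded backslashes and quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def field_parser_query(field, text):
    """Prefix the text with {!field f=...} local params."""
    return "{!field f=%s}%s" % (field, text)


print(escape_whitespace("east end ers"))          # east\ end\ ers
print(quote_phrase("east end ers"))               # "east end ers"
print(field_parser_query("myfield", "Foo Bar"))   # {!field f=myfield}Foo Bar

# URL-encode the q parameter before appending it to the select URL.
print(urlencode({"q": field_parser_query("myfield", "Foo Bar")}))
```

Whichever form you pick, the query string still has to be URL-encoded when it is sent over HTTP, which is what the last line does.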
You can also pre-configure the field query parser with the defType=field
parameter.
-- Jack Krupansky
-----Original Message-----
From: Srinivasa7
Sent: Thursday, January 30, 2014 6:37 AM
To: solr-user@lucene.apache.org
Subject: Re: KeywordTokenizerFactory - trouble with "exact" matches
Hi,
I have a similar problem: I want to search for words that contain spaces,
and I want to match them with all the spaces stripped.
I have used the following schema for that:
<fieldType name="nospaces" class="solr.TextField"
           autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="[^\w]+" replacement="" replace="all"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="[^\w]+" replacement="" replace="all"/>
  </analyzer>
</fieldType>
And
<field name="text_nospaces" type="nospaces" indexed="true" stored="true"
omitNorms="true" />
<copyField source="text" dest="text_nospaces" />
But it does not find the right terms, even though we strip the spaces and
index lowercase values.
For example, the indexed value is: East Enders
When I search for the text 'east end ers', it returns nothing and reports
that no documents were found.
I realised that Solr runs the query parser before passing the query string to
the query analyzer defined in the schema.
The query parser tokenizes the query string provided in the query, so
it sends each token separately to the query analyzer defined in the schema.
So is there any way that I can bypass this query parser, or use a different
query parser that treats the entire string as a single phrase?
At the moment I am using the dismax query parser.
Any suggestions would be much appreciated.
Thanks
Srinivasa
--
View this message in context:
http://lucene.472066.n3.nabble.com/KeywordTokenizerFactory-trouble-with-exact-matches-tp4114193p4114432.html