Ideally, such a text search should be done using tokenized text and a span query. The "surround" query parser might be able to do it, and the LucidWorks Search query parser should:

"this is" BEFORE:1 ("good" OR "excellent")

But since you are using a keyword tokenizer, which indexes each value as a single term with its embedded white space, you should be able to write a Lucene regex query against the raw text, something like [untested!]:

/this\\s+is\\s+(\\w+\\s+)?(good|excellent)/

That would be "contains".

Starts with:

/^this\\s+is\\s+(\\w+\\s+)?(good|excellent)/

Ends with:

/this\\s+is\\s+(\\w+\\s+)?(good|excellent)$/

Exact match:

/^this\\s+is\\s+(\\w+\\s+)?(good|excellent)$/

Caveat: such character-level regex matching is NOT guaranteed to be speedy and really should only be used for relatively small datasets.
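
To make the four match modes concrete, here is a minimal sketch in Python's `re` module rather than in Lucene itself. It assumes the optional middle group is meant to match one whole word (`\w+`), and it imitates the keyword tokenizer plus trim and lowercase filters with a hypothetical `analyze` helper. It only illustrates the intended semantics. As far as I know, Lucene's own regex dialect is matched against the entire term and may not accept `\s`, `\w`, `^`, or `$` in every version, so the patterns would still need to be tested against a real index.

```python
import re

# Core pattern: "this is", an optional single word, then "good" or "excellent".
CORE = r"this\s+is\s+(\w+\s+)?(good|excellent)"


def analyze(value):
    """Hypothetical stand-in for KeywordTokenizer + Trim + LowerCase:
    the whole value stays one term, trimmed and lowercased."""
    return value.strip().lower()


def classify(value):
    """Report which of the four match modes the value satisfies."""
    term = analyze(value)
    return {
        "contains": re.search(CORE, term) is not None,
        "starts_with": re.match(CORE, term) is not None,
        "ends_with": re.search(CORE + r"$", term) is not None,
        "exact": re.fullmatch(CORE, term) is not None,
    }


if __name__ == "__main__":
    samples = [
        "This is good",
        "This is also good",
        "This is excellent",
        "I think this is good",
        "This is good enough",
    ]
    for sample in samples:
        print(sample, classify(sample))
```

The three values from the question satisfy all four modes, while the last two samples separate "ends with" from "starts with".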

-- Jack Krupansky

-----Original Message-----
From: kobe.free.wo...@gmail.com
Sent: Saturday, May 18, 2013 6:30 AM
To: solr-user@lucene.apache.org
Subject: Re: Searching for terms having embedded white spaces like "word1 word2"

Thank you so very much Jack for your prompt reply. Your solution worked for
us.

I have another issue in querying fields having values of the sort
<string>This is good</string><string>This is also good</string><string>This
is excellent</string>. I want to perform "StartsWith" as well as "Contains"
searches on this field. The field definition is as follows:

<fieldType name="cust_str" class="solr.TextField"
           positionIncrementGap="100" sortMissingLast="true">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Please suggest how to perform the above-mentioned searches.



--
View this message in context: http://lucene.472066.n3.nabble.com/Searching-for-terms-having-embedded-white-spaces-like-word1-word2-tp4064170p4064355.html
Sent from the Solr - User mailing list archive at Nabble.com.
