Re: Exact match

Erick Erickson Mon, 02 Dec 2019 14:29:36 -0800

There are two different interpretations of “exact match” going on here, so
don’t be confused!


Emir’s version is “the text has to match the _entire_ input”. So a field with
“a b c d” will NOT match “a b”, “a b c”, or “b c”, but only “a b c d”.

David’s version is “The text has to contain some sequence of words that exactly 
matches my query”, so a field with “a b c d” _would_ match “a b”, “a b c”, “a b 
c d”, “b c”, “c d”, etc.

Both are entirely valid use-cases, depending on what you mean by “exact match”.
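
To make the difference concrete, here is a rough sketch in plain Python. It is
not Solr code and not how Lucene matches internally; the function names are
made up for illustration, and tokenizing is just lowercasing plus a whitespace
split, with no stop words or stemming. Under those assumptions, the first
function behaves like the whole-field match and the second like a phrase match
against a tokenized field.

```python
def tokens(text):
    # Stand-in for an analyzer chain: lowercase, then split on whitespace.
    return text.lower().split()


def matches_whole_field(field, query):
    # Whole-field interpretation: the query must equal the entire field.
    return tokens(field) == tokens(query)


def matches_phrase(field, query):
    # Phrase interpretation: the query tokens must appear in the field
    # as one contiguous run, in the same order.
    f, q = tokens(field), tokens(query)
    if not q:
        return False
    return any(f[i:i + len(q)] == q for i in range(len(f) - len(q) + 1))


if __name__ == "__main__":
    field = "a b c d"
    for query in ["a b", "b c", "a b c d", "a c"]:
        print(query, matches_whole_field(field, query), matches_phrase(field, query))
```

With the field “a b c d”, only the query “a b c d” passes the whole-field
check, while “a b”, “b c” and “a b c d” all pass the phrase check and “a c”
passes neither.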

Best,
Erick

> On Dec 2, 2019, at 4:38 PM, Emir Arnautović <emir.arnauto...@sematext.com> 
> wrote:
> 
> Hi Omer,
> From a performance perspective, it is best to index title as a single 
> token: KeywordTokenizer + LowerCaseFilter
> 
> If you need to query that field in some other way, you can index it 
> differently as some other field using copyField.
> 
> HTH,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> 
> 
> 
>> On 2 Dec 2019, at 21:43, OTH <omer.t....@gmail.com> wrote:
>> 
>> Hello,
>> 
>> What would be the best way to get exact matches (if any) to a query?
>> 
>> E.g.:  Let's say the document text is:  "united states of america".
>> Currently, any query containing one or more of the three words "united",
>> "states", or "america" will match with the above document.  I would like a
>> way so that the document matches if and only if the query is also
>> "united states of america" (case-insensitive).
>> 
>> Document field type:  TextField
>> Index Analyzer: TokenizerChain
>> Index Tokenizer: StandardTokenizerFactory
>> Index Token Filters: StopFilterFactory, LowerCaseFilterFactory,
>> SnowballPorterFilterFactory
>> The Query Analyzer / Tokenizer / Token Filters are the same as the Index
>> ones above.
>> 
>> FYI I'm relatively novice at Solr / Lucene / Search.
>> 
>> Much appreciated
>> Omer
>
