Re: Exact matching without using new fields

David R Tue, 19 Jan 2021 09:22:04 -0800

We had the same requirement. Just to echo back your requirements, I understand 
your case to be this. Given these 2 doc titles:


doc 1: "information retrieval"
doc 2: "Advanced information retrieval with Solr"

You want a phrase search for "information retrieval" to find both documents, 
but an EXACT phrase search for "information retrieval" to find doc #1 only.

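If that's true, and case-sensitive search isn't a requirement, here is what I 
did. I indexed the titles as the token streams below, lowercased and wrapped 
in START and END marker tokens, with all tokens at adjacent positions of course.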

START information retrieval END
START advanced information retrieval with solr END

And with our custom query parser, when an EXACT operator is found, I tokenize 
the query to match the first case, i.e. START information retrieval END. 
Otherwise the phrase is passed through without the markers.

This needs custom analyzers on the query and index sides to generate the 
correct token sequences.
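
To make the idea concrete, here is a minimal sketch in plain Python. It is a 
toy simulation, not Solr or Lucene code: the function names (index_tokens, 
query_tokens, phrase_match, search) are made up for illustration, and the 
"analysis" is just lowercasing plus whitespace splitting. In a real setup the 
marker tokens would have to be something that cannot occur in the content.

```python
START, END = "START", "END"  # marker tokens, named as in the example above

def index_tokens(title):
    """Index-side analysis: lowercase, split on whitespace, wrap in markers."""
    return [START] + title.lower().split() + [END]

def query_tokens(phrase, exact=False):
    """Query-side analysis: add the markers only for an EXACT search."""
    tokens = phrase.lower().split()
    return [START] + tokens + [END] if exact else tokens

def phrase_match(doc_tokens, q_tokens):
    """True if q_tokens occur at consecutive positions in doc_tokens."""
    n = len(q_tokens)
    return any(doc_tokens[i:i + n] == q_tokens
               for i in range(len(doc_tokens) - n + 1))

# Toy "index": doc id -> token stream
docs = {
    1: index_tokens("information retrieval"),
    2: index_tokens("Advanced information retrieval with Solr"),
}

def search(phrase, exact=False):
    """Return the sorted ids of documents whose token stream contains the phrase."""
    q = query_tokens(phrase, exact)
    return sorted(doc_id for doc_id, toks in docs.items()
                  if phrase_match(toks, q))
```

Here search("information retrieval") finds both documents, while 
search("information retrieval", exact=True) finds only doc 1, because only 
doc 1 has START immediately before and END immediately after the phrase.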

It's worked out well for our case.

Dave



________________________________
From: gnandre <arnoldbron...@gmail.com>
Sent: Tuesday, January 19, 2021 4:07 PM
To: solr-user@lucene.apache.org <solr-user@lucene.apache.org>
Subject: Exact matching without using new fields

Hi,

I am aware that to do exact matching (only whatever is provided inside
double quotes should be matched) in Solr, we can copy existing fields with
the help of copyFields into new fields that have very minimal tokenization
or no tokenization (e.g. using KeywordTokenizer or using string field type).

However this solution is expensive in terms of index size because it might
almost double the size of the existing index.

Is there any inexpensive way of achieving exact matches from the query
side, e.g. boosting the original tokens more at query time compared to their
tokens?
