Hi,

You can try the search query -> "+information +retrieval"
This means the document must contain both keywords. Doc 5 will also be in the results.

https://lucene.apache.org/solr/guide/8_7/the-standard-query-parser.html#the-boolean-operator

- Mohandoss.

On Wed, Jan 20, 2021 at 1:38 AM gnandre <arnoldbron...@gmail.com> wrote:

> Thanks for replying, Dave.
>
> I am afraid that I am looking for a non-index-time, i.e. query-time, solution.
>
> Actually in my case I am expecting both documents to be returned from your
> example. I am just trying to avoid returning documents which contain
> tokenized versions of the provided search query when it is enclosed within
> double quotes to indicate exact matching expectation.
>
> e.g.
> search query -> "information retrieval"
>
> This should match documents like the following:
> doc 1: "information retrieval"
> doc 2: "Advanced information retrieval with Solr"
>
> but should NOT match documents like
> doc 3: "informed retrieval"
> doc 4: "information extraction" (considering 'extraction' was a specified
> synonym of 'retrieval')
> doc 5: "INFORMATION RETRIEVAL"
>
> etc
>
> I am also ok with these documents showing up as long as they show up at
> the bottom. Also, a query-time solution is a must.
>
> On Tue, Jan 19, 2021 at 12:22 PM David R <davidtr...@hotmail.com> wrote:
>
> > We had the same requirement. Just to echo back your requirements, I
> > understand your case to be this. Given these 2 doc titles:
> >
> > doc 1: "information retrieval"
> > doc 2: "Advanced information retrieval with Solr"
> >
> > You want a phrase search for "information retrieval" to find both
> > documents, but an EXACT phrase search for "information retrieval" to find
> > doc #1 only.
> >
> > If that's true, and case-sensitive search isn't a requirement, I indexed
> > this in the token stream, with adjacent positions of course.
> >
> > START information retrieval END
> > START advanced information retrieval with solr END
> >
> > And with our custom query parser, when an EXACT operator is found, I
> > tokenize the query to match the first case. Otherwise pass it through.
> >
> > Needs custom analyzers on the query and index sides to generate the
> > correct token sequences.
> >
> > It's worked out well for our case.
> >
> > Dave
> >
> > ________________________________
> > From: gnandre <arnoldbron...@gmail.com>
> > Sent: Tuesday, January 19, 2021 4:07 PM
> > To: solr-user@lucene.apache.org <solr-user@lucene.apache.org>
> > Subject: Exact matching without using new fields
> >
> > Hi,
> >
> > I am aware that to do exact matching (only whatever is provided inside
> > double quotes should be matched) in Solr, we can copy existing fields with
> > the help of copyFields into new fields that have very minimal tokenization
> > or no tokenization (e.g. using KeywordTokenizer or using string field type)
> >
> > However this solution is expensive in terms of index size because it might
> > almost double the size of the existing index.
> >
> > Is there any inexpensive way of achieving exact matches from the query
> > side. e.g. boost the original tokens more at query time compared to their
> > tokens?
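
For anyone who wants to see the START/END idea from Dave's message in miniature, here is a rough sketch in plain Python. It is not Solr or Lucene code and not Dave's actual analyzers or query parser; the function names (index_tokens, query_tokens, phrase_match) are made up for illustration, and lowercasing plus whitespace splitting stand in for whatever analysis chain is really configured. It only shows why wrapping the token stream in sentinels lets an ordinary phrase match behave as a whole-field match. Note that this is still an index-time change, so it illustrates Dave's approach rather than the query-time solution gnandre asked for.

```python
from typing import List

# Sentinel tokens as written in the message. They are uppercase while all
# real tokens are lowercased, so they cannot collide with document words.
START, END = "START", "END"


def index_tokens(title: str) -> List[str]:
    """Index side: lowercase, split on whitespace, wrap in sentinels."""
    return [START] + title.lower().split() + [END]


def query_tokens(phrase: str, exact: bool) -> List[str]:
    """Query side: add sentinels only when the (hypothetical) EXACT operator is used."""
    tokens = phrase.lower().split()
    return [START] + tokens + [END] if exact else tokens


def phrase_match(doc: List[str], query: List[str]) -> bool:
    """True if the query tokens occur at adjacent positions in the document."""
    n = len(query)
    if n == 0:
        return False
    return any(doc[i:i + n] == query for i in range(len(doc) - n + 1))


docs = {
    1: index_tokens("information retrieval"),
    2: index_tokens("Advanced information retrieval with Solr"),
}

# Plain phrase search: matches any document containing the phrase.
phrase_hits = [d for d, toks in docs.items()
               if phrase_match(toks, query_tokens("information retrieval", exact=False))]

# EXACT phrase search: the sentinels force the phrase to span the whole field.
exact_hits = [d for d, toks in docs.items()
              if phrase_match(toks, query_tokens("information retrieval", exact=True))]

print(phrase_hits, exact_hits)
```

With the two titles from the thread, the plain phrase search finds both documents and the EXACT search finds only doc 1, which matches the behaviour Dave describes.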