Thanks Peter, Charlie, Shawn.
Makes perfect sense now. I had left the filter out of the index analyzer; it
was present only in the query analyzer. I have also removed preserveOriginal.

Thanks & Best Regards,
Lulu Paul

-----Original Message-----
From: Shawn Heisey [mailto:apa...@elyograg.org]
Sent: 29 March 2018 15:21
To: solr-user@lucene.apache.org
Subject: Re: Query redg : diacritics in keyword search

On 3/29/2018 5:02 AM, Paul, Lulu wrote:
> The keyword search Carré  returns values Carré and Carre (this works
> well as I added the tokenizer <filter
> class="solr.ASCIIFoldingFilterFactory" preserveOriginal="true"/> in
> the schema config to enable returning of both sets of values)
>
> Now it looks like we want Carre to return both Carré and Carre (and this
> doesn't work; Solr only returns Carre). Any ideas on how this scenario
> can be achieved?

Charlie Hull has hit the nail on the head regarding searching.  I actually 
would remove the preserveOriginal flag from that filter.  If the filter is run 
at both index and query time, you don't need preserveOriginal.
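
A minimal sketch of what "run at both index and query time" could look like in
the schema. The field type name "text_folded" and the choice of tokenizer and
lowercase filter are assumptions for illustration, not the configuration used
in this thread; only the ASCIIFoldingFilterFactory line comes from the
original message.

```xml
<!-- Hypothetical field type: the name and the tokenizer/lowercase choices are assumptions. -->
<fieldType name="text_folded" class="solr.TextField" positionIncrementGap="100">
  <!-- Index time: "Carré" is tokenized, lowercased and folded to "carre". -->
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
  <!-- Query time: the same chain, so both "Carré" and "Carre" become "carre". -->
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
</fieldType>
```

Because both sides fold to the same term, either spelling in the query matches
either spelling in the documents, and preserveOriginal is not needed. Existing
documents have to be reindexed after a change to the index analyzer.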

If you're talking about what's displayed in your search results, that is 
completely unaffected by analysis.  Analysis only affects queries and the data 
that goes into the 'indexed="true"' part of the index.  Search *results* are
almost always exactly what was sent to Solr.

There is UpdateProcessor functionality that can sit between the values sent to 
Solr and what actually goes into stored/indexed/docValues. Things that happen 
during update processing ARE visible in search results.

https://lucene.apache.org/solr/guide/6_6/update-request-processors.html
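
As a rough illustration of the update processing mentioned above, a chain in
solrconfig.xml might look like the sketch below. The chain name "trim-titles"
and the field name "title" are hypothetical; the processor classes are ones
that ship with Solr. This one trims leading and trailing whitespace before the
value is stored, so the change is visible in search results.

```xml
<!-- Hypothetical chain: the chain name and field name are assumptions. -->
<updateRequestProcessorChain name="trim-titles">
  <!-- Modifies the incoming value itself, so stored and indexed data both change. -->
  <processor class="solr.TrimFieldUpdateProcessorFactory">
    <str name="fieldName">title</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <!-- RunUpdateProcessorFactory must come last; it performs the actual update. -->
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

A chain like this is selected per request with the update.chain parameter, for
example update.chain=trim-titles on the update request.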

Thanks,
Shawn


