Thanks Peter, Charlie, Shawn.

Makes perfect sense now. I had left the ASCIIFoldingFilterFactory filter out of the index analyzer; it was present only in the query analyzer. I have also got rid of preserveOriginal.
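
As an illustration of the fix described above, a field type along the following lines applies the folding filter at both index and query time, so that Carre and Carré reduce to the same indexed term. This is only a sketch: the field type name, the tokenizer, and the lowercase filter are assumptions and are not taken from the actual schema.

```xml
<!-- Hypothetical field type; the name "text_folded" is an assumption -->
<fieldType name="text_folded" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- Tokenizer and lowercase filter are assumed, not from the original schema -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Folds é to e when documents are indexed; preserveOriginal is not set -->
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- The same folding at query time, so both spellings match the same term -->
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
</fieldType>
```

Changing the index analyzer only affects newly indexed documents, so existing documents would need to be reindexed before both spellings match.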

Thanks & Best Regards,
Lulu Paul

-----Original Message-----
From: Shawn Heisey [mailto:apa...@elyograg.org]
Sent: 29 March 2018 15:21
To: solr-user@lucene.apache.org
Subject: Re: Query redg : diacritics in keyword search

On 3/29/2018 5:02 AM, Paul, Lulu wrote:
> The keyword search Carré returns values Carré and Carre (this works
> well as I added the tokenizer <filter
> class="solr.ASCIIFoldingFilterFactory" preserveOriginal="true"/> in
> the schema config to enable returning of both sets of values)
>
> Now looks like we want Carre to return both Carré and Carre (and this doesn't
> work. Solr only returns Carre) – any ideas on how this scenario can be
> achieved?

Charlie Hull has hit the nail on the head regarding searching.

I actually would remove the preserveOriginal flag from that filter. If the filter is run at both index and query time, you don't need preserveOriginal.

If you're talking about what's displayed in your search results, that is completely unaffected by analysis. Analysis only affects queries and the data that goes into the 'indexed="true"' part of the index. Search *results* are almost always exactly what was sent to Solr.

There is UpdateProcessor functionality that can sit between the values sent to Solr and what actually goes into stored/indexed/docValues. Things that happen during update processing ARE visible in search results.

https://lucene.apache.org/solr/guide/6_6/update-request-processors.html

Thanks,
Shawn