Hi there,

I tried the solution described at https://lucidworks.com/blog/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/. That solution works when the indexed data contains no alphanumeric terms or special characters, but in my case the synonyms look something like the following:

T-MAZ 20
POLYOXYETHYLENE (20) SORBITAN MONOLAURATE
SORBITAN MONODODECANOATE POLY(OXY-1,2-ETHANEDIYL) DERIVATIVE
POLYOXYETHYLENE SORBITAN MONOLAURATE
POLYSORBATE 20 [MART.]
SORBIMACROGOL LAURATE 300
POLYSORBATE 20 [FHFI]
FEMA NO. 2915

They contain alphanumerics, special characters, spaces, etc. Is there a way to implement synonyms even in such a case?

Thanks,
Kaushik

On Mon, Apr 20, 2015 at 11:03 AM, Davis, Daniel (NIH/NLM) [C] <daniel.da...@nih.gov> wrote:

> Handling MESH descriptor preferred terms and such is similar. I
> encountered this during evaluation of Solr for a project here at NLM. We
> decided to use Solr for different projects instead. I considered the
> following approaches:
> - use a custom tokenizer at index time that indexed all of the multiple
>   term alternatives.
> - index the data, and then have an enrichment process that queries on
>   each source synonym, and generates an update to add the target synonyms.
>   Follow this with an optimize.
> - During the indexing process, but before sending the data to Solr,
>   process the data to tokenize and add synonyms to another field.
>
> Both the custom tokenizer and enrichment process share the feature that
> they use Solr's own tokenizer rather than duplicate it. The enrichment
> process seems to me only workable in environments where you can re-index
> all data periodically, so no continuous stream of data to index that needs
> to be handled relatively quickly once it is generated. The last method
> of pre-processing the data seems the least desirable to me from a blue-sky
> perspective, but is probably the easiest to implement and the most
> independent of Solr.
>
> Hope this helps,
>
> Dan Davis, Systems/Applications Architect (Contractor),
> Office of Computer and Communications Systems,
> National Library of Medicine, NIH
>
> -----Original Message-----
> From: Kaushik [mailto:kaushika...@gmail.com]
> Sent: Monday, April 20, 2015 10:47 AM
> To: solr-user@lucene.apache.org
> Subject: Mutli term synonyms
>
> Hello,
>
> Reading up on synonyms it looks like there is no real solution for multi
> term synonyms. Is that right? I have a use case where I need to map one
> multi term phrase to another. i.e. Tween 20 needs to be translated to
> Polysorbate 40.
>
> Any thoughts as to how this can be achieved?
>
> Thanks,
> Kaushik
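
For reference, the last approach in the quoted message (pre-process the data before sending it to Solr and add the synonyms to another field) might look roughly like the sketch below. It is only an illustration under assumptions: the field names "name" and "name_synonyms" are hypothetical, the synonym group is a subset of the names listed in this thread, matching is done on the raw phrase rather than with Solr's tokenizer, and the step that actually sends documents to Solr is left out.

```python
import re

# Hypothetical synonym group: a subset of the names listed in this thread.
SYNONYM_GROUPS = [
    [
        "T-MAZ 20",
        "POLYOXYETHYLENE (20) SORBITAN MONOLAURATE",
        "POLYSORBATE 20 [MART.]",
        "FEMA NO. 2915",
    ],
]


def normalize(text):
    """Lowercase and collapse whitespace. Punctuation is kept on purpose so
    that names such as 'POLYSORBATE 20 [MART.]' are matched as written."""
    return re.sub(r"\s+", " ", text).strip().lower()


def build_lookup(groups):
    """Map every normalized name to the full group it belongs to."""
    lookup = {}
    for group in groups:
        for name in group:
            lookup[normalize(name)] = group
    return lookup


def find_synonyms(text, lookup):
    """Return all names from every group that has a member occurring in text."""
    haystack = normalize(text)
    found = []
    for key, group in lookup.items():
        # Require non-alphanumeric boundaries so 'T-MAZ 20' does not
        # match inside 'T-MAZ 200'.
        pattern = r"(?<![0-9a-z])" + re.escape(key) + r"(?![0-9a-z])"
        if re.search(pattern, haystack):
            for name in group:
                if name not in found:
                    found.append(name)
    return found


def enrich(doc, lookup, source_field="name", target_field="name_synonyms"):
    """Return a copy of doc with the synonyms added to a separate field."""
    enriched = dict(doc)
    enriched[target_field] = find_synonyms(doc.get(source_field, ""), lookup)
    return enriched


if __name__ == "__main__":
    lookup = build_lookup(SYNONYM_GROUPS)
    print(enrich({"id": "1", "name": "Blend of T-MAZ 20 and water"}, lookup))
```

The enriched documents would then be indexed as usual, with the synonym field presumably defined as a multi-valued field in the schema. Because the phrases are matched before Solr sees them, special characters and spaces in the names do not depend on how the analyzer tokenizes them.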