Hi there,

I tried the solution described at:
https://lucidworks.com/blog/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/

That solution works when the indexed data does not contain alphanumeric
terms or special characters. In my case, however, the synonyms look
something like the ones below.


  T-MAZ 20
  POLYOXYETHYLENE (20) SORBITAN MONOLAURATE
  SORBITAN MONODODECANOATE
  POLY(OXY-1,2-ETHANEDIYL) DERIVATIVE
  POLYOXYETHYLENE SORBITAN MONOLAURATE
  POLYSORBATE 20 [MART.]
  SORBIMACROGOL LAURATE 300
  POLYSORBATE 20 [FHFI]
  FEMA NO. 2915

They contain alphanumeric terms, special characters, spaces, etc. Is there
a way to implement synonyms even in such cases?

Thanks,
Kaushik

On Mon, Apr 20, 2015 at 11:03 AM, Davis, Daniel (NIH/NLM) [C] <
daniel.da...@nih.gov> wrote:

> Handling MeSH descriptor preferred terms and such is similar. I
> encountered this during evaluation of Solr for a project here at NLM. We
> decided to use Solr for different projects instead. I considered the
> following approaches:
>  - use a custom tokenizer at index time that indexed all of the multiple
> term alternatives.
>  - index the data, and then have an enrichment process that queries on
> each source synonym, and generates an update to add the target synonyms.
>    Follow this with an optimize.
>  - During the indexing process, but before sending the data to Solr,
> process the data to tokenize and add synonyms to another field.
>
> Both the custom tokenizer and the enrichment process share the feature
> that they use Solr's own tokenizer rather than duplicating it. The
> enrichment process seems to me workable only in environments where you can
> re-index all data periodically, that is, where there is no continuous
> stream of new data that has to be indexed relatively quickly once it is
> generated. The last method, pre-processing the data, seems the least
> desirable to me from a blue-sky perspective, but is probably the easiest
> to implement and the most independent of Solr.
>
> Hope this helps,
>
> Dan Davis, Systems/Applications Architect (Contractor),
> Office of Computer and Communications Systems,
> National Library of Medicine, NIH
>
> -----Original Message-----
> From: Kaushik [mailto:kaushika...@gmail.com]
> Sent: Monday, April 20, 2015 10:47 AM
> To: solr-user@lucene.apache.org
> Subject: Mutli term synonyms
>
> Hello,
>
> Reading up on synonyms, it looks like there is no real solution for
> multi-term synonyms. Is that right? I have a use case where I need to map
> one multi-term phrase to another. For example, Tween 20 needs to be
> translated to Polysorbate 40.
>
> Any thoughts as to how this can be achieved?
>
> Thanks,
> Kaushik
>

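A minimal sketch of the third approach quoted above (pre-processing the data
before it is sent to Solr and adding synonyms to another field), written in
Python. Everything named here is an assumption made for illustration: the
field names "name" and "name_synonyms", the helper functions, and the choice
to treat a few of the terms from this thread as one synonym group. It does
not use Solr's own tokenizer; it matches whole phrases with escaped regular
expressions so that characters such as ( ) [ ] . - are taken literally.

```python
import re

# Hypothetical synonym group built from terms listed in this thread.
# Treating them as mutually equivalent is an assumption for the example.
SYNONYM_GROUPS = [
    [
        "T-MAZ 20",
        "POLYSORBATE 20 [MART.]",
        "POLYOXYETHYLENE (20) SORBITAN MONOLAURATE",
        "FEMA NO. 2915",
    ],
]


def normalize(text):
    """Uppercase the text and collapse runs of whitespace to one space."""
    return " ".join(text.upper().split())


def build_matchers(groups):
    """Compile one whole-phrase pattern per synonym term."""
    matchers = []
    for group in groups:
        for term in group:
            # re.escape makes punctuation in the term match literally.
            # The lookarounds keep "T-MAZ 20" from matching "T-MAZ 200".
            pattern = re.compile(
                r"(?<!\w)" + re.escape(normalize(term)) + r"(?!\w)"
            )
            matchers.append((pattern, group))
    return matchers


def enrich(doc, matchers, source_field="name", target_field="name_synonyms"):
    """Return a copy of doc with the matched synonym groups added."""
    text = normalize(doc.get(source_field, ""))
    found = []
    for pattern, group in matchers:
        if pattern.search(text):
            for term in group:
                if term not in found:
                    found.append(term)
    enriched = dict(doc)
    if found:
        enriched[target_field] = found
    return enriched


if __name__ == "__main__":
    matchers = build_matchers(SYNONYM_GROUPS)
    doc = {"id": "1", "name": "Contains t-maz 20 as emulsifier"}
    print(enrich(doc, matchers))
```

The enriched documents would then be indexed as usual, with the extra field
searched alongside the original one. Because the matching happens outside
Solr, it may not line up exactly with how Solr analyzes the same text at
query time, which is the trade-off Dan points out for this approach.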