Sure thing. Post back with what you find! Good luck.

On Sun, Mar 26, 2017 at 3:36 PM Sanjana Sridhar <sanjana.srid...@flipp.com> wrote:

> Hi John,
>
> Thanks for letting me know what works for you. I'm going to try that out.
> Sounds like a suitable solution to my problem.
>
> Best,
> Sanjana
>
> On Sun, Mar 26, 2017 at 12:30 PM, John Blythe <j...@curvolabs.com> wrote:
>
> > I use the keyword tokenizer and then pattern replace to transform
> > multi-word phrases into underscore-connected tokens. For instance,
> > "Burger Joint" transforms to "burger_joint", which then looks in my
> > synonym filter for underscored synonyms. When it matches, I then
> > replace underscores with spaces or just toss it over to the word
> > delimiter filter factory before further processing.
> >
> > On Sun, Mar 26, 2017 at 11:53 AM Sanjana Sridhar <sanjana.srid...@wishabi.com> wrote:
> >
> > > Hello,
> > >
> > > Does anyone have a good solution for working with multi-word synonyms?
> > > I've been reading a lot about this online and haven't really found a
> > > great solution to it. I use the SynonymFilterFactory at index time, but
> > > words don't really get matched to the appropriate multi-word synonyms,
> > > even though the Analysis tool shows that they should be matched.
> > >
> > > Examples:
> > >
> > > coke, coca cola
> > >
> > > This is the configuration I have on text fields:
> > >
> > > <fieldType name="text_icu_english" class="solr.TextField"
> > >            positionIncrementGap="100" multiValued="true">
> > >   <analyzer type="index">
> > >     <!-- The white space tokenizer splits on white space but preserves
> > >          the tokens so that it can be used by the next filter -->
> > >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> > >     <filter class="solr.SynonymFilterFactory" ignoreCase="true"
> > >             expand="true" synonyms="synonyms.txt"/>
> > >     <!-- This filter splits a word on punctuation, preserves the
> > >          original, concatenates the split words and also stems English
> > >          possessive nouns -->
> > >     <filter class="solr.WordDelimiterFilterFactory"
> > >             generateWordParts="0" generateNumberParts="0"
> > >             splitOnCaseChange="0" preserveOriginal="1" catenateWords="1"/>
> > >     <filter class="solr.LowerCaseFilterFactory"/>
> > >     <filter class="solr.EnglishMinimalStemFilterFactory"/>
> > >     <filter class="solr.ICUFoldingFilterFactory"/>
> > >     <filter class="solr.PatternReplaceFilterFactory"
> > >             pattern="(.*[\*].*)" replacement=""/>
> > >     <filter class="solr.TrimFilterFactory"/>
> > >     <filter class="solr.LengthFilterFactory" min="1" max="100"/>
> > >     <filter class="solr.ClassicFilterFactory"/>
> > >   </analyzer>
> > >   <analyzer type="query">
> > >     <!-- The white space tokenizer splits on white space but preserves
> > >          the tokens so that it can be used by the next filter -->
> > >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> > >     <!-- This filter splits a word on punctuation, preserves the
> > >          original, concatenates the split words and also stems English
> > >          possessive nouns -->
> > >     <filter class="solr.WordDelimiterFilterFactory"
> > >             generateWordParts="0" generateNumberParts="0"
> > >             splitOnCaseChange="0" preserveOriginal="1" catenateWords="1"/>
> > >     <filter class="solr.LowerCaseFilterFactory"/>
> > >     <filter class="solr.EnglishMinimalStemFilterFactory"/>
> > >     <filter class="solr.ICUFoldingFilterFactory"/>
> > >     <filter class="solr.ClassicFilterFactory"/>
> > >   </analyzer>
> > >   <similarity class="solr.BM25SimilarityFactory">
> > >     <float name="b">0.0</float>
> > >   </similarity>
> > > </fieldType>
> > >
> > > Greatly appreciate any help y'all can offer.
> > >
> > > Thanks,
> > > Sanjana
> > >
> > > --
> > > IMPORTANT NOTICE: This message, including any attachments (hereinafter
> > > collectively referred to as "Communication"), is intended only for the
> > > addressee(s) named above. This Communication may include information
> > > that is privileged, confidential and exempt from disclosure under
> > > applicable law. If the recipient of this Communication is not the
> > > intended recipient, or the employee or agent responsible for delivering
> > > this Communication to the intended recipient, you are notified that any
> > > dissemination, distribution or copying of this Communication is
> > > strictly prohibited. If you have received this Communication in error,
> > > please notify the sender immediately by phone or email and permanently
> > > delete this Communication from your computer without making a copy.
> > > Thank you.
> >
> > --
> > *John Blythe*
> > Product Manager & Lead Developer
> >
> > 251.605.3071 | j...@curvolabs.com
> > www.curvolabs.com
> >
> > 58 Adams Ave
> > Evansville, IN 47713
>
> --
> <http://corp.flipp.com/>
>
> Sanjana Sridhar
> Flipp Corporation
>
> p: 647-217-3599
> e: sanjana.srid...@flipp.com

--
*John Blythe*
Product Manager & Lead Developer

251.605.3071 | j...@curvolabs.com
www.curvolabs.com

58 Adams Ave
Evansville, IN 47713
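
For anyone finding this thread later, the underscore approach described above could be sketched as a field type roughly like the one below. This is only a sketch under assumptions: the field type name `text_multiword_syn`, the synonyms file name `synonyms_underscored.txt`, and the exact filter order are made up for illustration and are not taken from John's actual schema. It also assumes the field holds short values (a phrase or a product name), since the keyword tokenizer emits the entire field value as a single token.

```xml
<!-- Hypothetical field type; the name and the synonyms file are illustrative only -->
<fieldType name="text_multiword_syn" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Emit the whole field value as one token, e.g. "Burger Joint" -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Join words with underscores: "burger joint" becomes "burger_joint" -->
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="\s+" replacement="_" replace="all"/>
    <!-- Assumed synonyms file with underscored entries, e.g.: coke, coca_cola -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms_underscored.txt"
            ignoreCase="true" expand="true"/>
    <!-- Turn underscores back into spaces once synonyms have been applied -->
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="_" replacement=" " replace="all"/>
  </analyzer>
</fieldType>
```

With a setup like this, a synonyms entry such as `coke, coca_cola` is matched against a single underscored token, so the synonym filter never has to line up a phrase across several tokens. One caveat worth checking in the Analysis tool: depending on the query parser and Solr version, whitespace-separated query terms may be split before they reach the analyzer, so multi-word queries may need to be quoted or handled in a separate query-time chain.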