Many thanks Joe! I'll follow the instructions on the linked webpage.

On Tue, May 31, 2016 at 4:05 PM, Joe Lawson <jlaw...@opensourceconnections.com> wrote:
> The docs are out of date for the synonym_edismax but it does work. Check
> out the tests for working examples. I'll try to update it soon. I've run
> the plugin on Solr 5 and 6, solrcloud and standalone. For running in
> SolrCloud make sure you follow
>
> https://cwiki.apache.org/confluence/display/solr/Adding+Custom+Plugins+in+SolrCloud+Mode
>
> On May 31, 2016 5:13 PM, "John Bickerstaff" <j...@johnbickerstaff.com> wrote:
>
> > All --
> >
> > I'm now attempting to use the hon_lucene_synonyms project from github.
> >
> > I found the documents that were inferred by the dead links on the readme in
> > the repository -- however, given that I'm using Solr 5.4.x, I no longer
> > have the need to integrate into a war file (as far as I can see).
> >
> > The suggestion on the readme is that I can drop the hon_lucene_synonyms jar
> > file into the $SOLR_HOME directory, but this does not seem to be working -
> > I'm getting class not found exceptions.
> >
> > Does anyone on this list have direct experience with getting this plugin to
> > work in Solr 5.x?
> >
> > Thanks in advance...
> >
> > On Mon, May 30, 2016 at 6:57 PM, MaryJo Sminkey <mjsmin...@gmail.com> wrote:
> >
> > > It's been awhile since I installed it so I really can't say. I'm more of a
> > > code monkey than a server gal (particularly Linux... I'm amazed I got Solr
> > > installed in the first place, LOL!) So I had asked our network guy to look
> > > it over recently and see if it looked like I did it okay. He said since it
> > > shows up in the list of jars in the Solr admin that it's installed.... if
> > > that's not necessarily true, I probably need to point him in the right
> > > direction for what else to do since he really doesn't know Solr well
> > > either.
> > >
> > > Mary Jo
> > >
> > > On Mon, May 30, 2016 at 7:49 PM, John Bickerstaff <j...@johnbickerstaff.com> wrote:
> > >
> > > > Thanks for the comment Mary Jo...
> > > >
> > > > The error loading the class rings a bell - did you find and follow
> > > > instructions for adding that to the WAR file? I vaguely remember seeing
> > > > something about that.
> > > >
> > > > I'm going to try my own tests on the auto phrasing one.. If I'm
> > > > successful, I'll post back.
> > > >
> > > > On Mon, May 30, 2016 at 3:45 PM, MaryJo Sminkey <mjsmin...@gmail.com> wrote:
> > > >
> > > > > This is a very timely discussion for me as well as we're trying to tackle
> > > > > the multi term synonym issue as well and have not been able to get the
> > > > > hon-lucene plugin to work, the jar shows up as installed but when we set up
> > > > > the sample request handler it throws this error:
> > > > >
> > > > > org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> > > > > Error loading class
> > > > > 'com.github.healthonnet.search.SynonymExpandingExtendedDismaxQParserPlugin'
> > > > >
> > > > > I have tried the auto-phrasing one as well (I did set up a field using copy
> > > > > to configure it on) but when testing it didn't seem to return the synonyms
> > > > > as expected. So gave up on that one too (am willing to give it another try
> > > > > though, that was awhile ago). Would definitely like to hear what other
> > > > > people have found works on the latest versions of Solr 5.x and/or 6. Just
> > > > > sucks that this issue has never been fixed in the core product such that
> > > > > you still need to mess with plugins and patches to get such a basic
> > > > > functionality working properly.
> > > > >
> > > > > *Mary Jo Sminkey*
> > > > > *Senior ColdFusion Developer*
> > > > >
> > > > > *CF Webtools*
> > > > > You Dream It... We Build It.
> > > > > <https://www.cfwebtools.com/>
> > > > > 11204 Davenport Suite 100
> > > > > Omaha, Nebraska 68154
> > > > > O: 402.408.3733 x128
> > > > > E: maryjo.smin...@cfwebtools.com
> > > > > Skype: maryjos.cfwebtools
> > > > >
> > > > > On Mon, May 30, 2016 at 5:02 PM, John Bickerstaff <j...@johnbickerstaff.com> wrote:
> > > > >
> > > > > > So I'm looking at the solution mentioned here:
> > > > > >
> > > > > > https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
> > > > > >
> > > > > > The thing that's troubling me slightly is that the way it's documented it
> > > > > > seems to be missing a small but important link...
> > > > > >
> > > > > > What exactly causes the results listed to be returned?
> > > > > >
> > > > > > Here's my thought process:
> > > > > >
> > > > > > 1. The entry for /autophrase searchHandler does not specify a default
> > > > > > search field.
> > > > > > 2. The field type "text_autophrase" is set up as the one with the
> > > > > > AutoPhrasingFilterFactory as part of its indexing
> > > > > >
> > > > > > There isn't any mention (perhaps because it's too obvious) of the need to
> > > > > > copy or otherwise get data into the "text_autophrase" field at index time.
> > > > > >
> > > > > > There isn't any explicit listing of "text_autophrase" as the default search
> > > > > > field in the /autophrase search handler
> > > > > >
> > > > > > There isn't any explicit statement of "df=text_autophrase" in the query
> > > > > > statement: [/autophrase?q=New+York]
> > > > > >
> > > > > > Therefore it seems to me that if someone tries to implement this, they're
> > > > > > going to be disappointed in the results unless they:
> > > > > > a.
> > > > > >    copy or otherwise get ALL the text they're interested in -- into the
> > > > > > "text_autophrase" field as part of the schema.xml setup (to happen at index
> > > > > > time)
> > > > > > b. somehow explicitly declare "text_autophrase" as the default search field
> > > > > > - either in the searchHandler or wherever else the default field is
> > > > > > configured.
> > > > > >
> > > > > > If anyone out there has done this specific approach - could you validate
> > > > > > whether my thought process is correct and / or if I'm missing something?
> > > > > > Yes - I get that I can set it all up and try - but it's what I don't know I
> > > > > > don't know that bothers me...
> > > > > >
> > > > > > On Fri, May 27, 2016 at 11:57 AM, John Bickerstaff <j...@johnbickerstaff.com> wrote:
> > > > > >
> > > > > > > Thank you Steve -- very helpful.
> > > > > > >
> > > > > > > I can see that whatever implementation I decide to try, some testing will
> > > > > > > be in order. If anyone is aware of significant gotchas with this synonym
> > > > > > > thing that are not mentioned in the already-listed URLs, please feel free
> > > > > > > to comment.
> > > > > > >
> > > > > > > On Fri, May 27, 2016 at 10:28 AM, Steve Rowe <sar...@gmail.com> wrote:
> > > > > > >
> > > > > > >> I’m working on addressing problems using multi-term synonyms at query
> > > > > > >> time in Lucene and Solr.
> > > > > > >>
> > > > > > >> I recommend these two blogs for understanding the issues (the second one
> > > > > > >> was mentioned earlier in this thread):
> > > > > > >>
> > > > > > >> <http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html>
> > > > > > >> <https://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/>
> > > > > > >>
> > > > > > >> In addition to the already-mentioned projects, there is also:
> > > > > > >>
> > > > > > >> <https://issues.apache.org/jira/browse/SOLR-5379>
> > > > > > >>
> > > > > > >> All of these projects try in various ways to work around the fact that
> > > > > > >> Lucene’s QueryParser splits on whitespace before sending text to analysis,
> > > > > > >> one token at a time, so in a synonym filter, multi-word synonyms can never
> > > > > > >> match and add alternatives. See
> > > > > > >> <https://issues.apache.org/jira/browse/LUCENE-2605>, where I’ve posted a
> > > > > > >> patch to directly address that problem - note that it’s still a work in
> > > > > > >> progress.
> > > > > > >>
> > > > > > >> Once LUCENE-2605 has been fixed, there is still work to do getting
> > > > > > >> (e)dismax to work with the modified Lucene QueryParser, and addressing
> > > > > > >> problems with how queries are constructed from Lucene’s “sausagized” token
> > > > > > >> stream.
> > > > > > >>
> > > > > > >> --
> > > > > > >> Steve
> > > > > > >> www.lucidworks.com
> > > > > > >>
> > > > > > >> > On May 26, 2016, at 2:21 PM, John Bickerstaff <j...@johnbickerstaff.com> wrote:
> > > > > > >> >
> > > > > > >> > Thanks Chris --
> > > > > > >> >
> > > > > > >> > The two projects I'm aware of are:
> > > > > > >> >
> > > > > > >> > https://github.com/healthonnet/hon-lucene-synonyms
> > > > > > >> >
> > > > > > >> > and the one referenced from the Lucidworks page here:
> > > > > > >> >
> > > > > > >> > https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
> > > > > > >> >
> > > > > > >> > ... which is here : https://github.com/LucidWorks/auto-phrase-tokenfilter
> > > > > > >> >
> > > > > > >> > Is there anything else out there that you would recommend I look at?
> > > > > > >> >
> > > > > > >> > On Thu, May 26, 2016 at 12:01 PM, Chris Morley <ch...@depahelix.com> wrote:
> > > > > > >> >
> > > > > > >> >> Chris Morley here, from Wayfair. (Depahelix = my domain)
> > > > > > >> >>
> > > > > > >> >> Suyash Sonawane and I have worked on multiple word synonyms at Wayfair.
> > > > > > >> >> We worked mostly off of Ted Sullivan's work and also off of some
> > > > > > >> >> suggestions from Koorosh Vakhshoori. We have gotten to a point where we
> > > > > > >> >> have a more sophisticated internal implementation, however, we've found
> > > > > > >> >> that it is very difficult to make it do what you want it to do, and also be
> > > > > > >> >> sufficiently performant. Watch out for exceptional situations with mm
> > > > > > >> >> (minimum should match).
> > > > > > >> >>
> > > > > > >> >> Trey Grainger (now at Lucidworks) and Simon Hughes of Dice.com have also
> > > > > > >> >> done work in this area.
> > > > > > >> >>
> > > > > > >> >> It should be very possible to get this kind of thing working on
> > > > > > >> >> SolrCloud. I haven't tried it yet but I think theoretically, it should
> > > > > > >> >> just work. The synonyms stuff is mostly about doing things at index time
> > > > > > >> >> and query time. The index time stuff should translate to SolrCloud
> > > > > > >> >> directly, while the query time stuff might pose some issues, but probably
> > > > > > >> >> not too bad, if there are any issues at all.
> > > > > > >> >>
> > > > > > >> >> I've had decent luck porting our various plugins from 4.10.x to 5.5.0
> > > > > > >> >> because a lot of stuff is just Java, and it still works within the Jetty
> > > > > > >> >> context.
> > > > > > >> >>
> > > > > > >> >> -Chris.
> > > > > > >> >>
> > > > > > >> >> ----------------------------------------
> > > > > > >> >> From: "John Bickerstaff" <j...@johnbickerstaff.com>
> > > > > > >> >> Sent: Thursday, May 26, 2016 1:51 PM
> > > > > > >> >> To: solr-user@lucene.apache.org
> > > > > > >> >> Subject: Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser
> > > > > > >> >>
> > > > > > >> >> Hey Jeff (or anyone interested in multi-word synonyms) here are some
> > > > > > >> >> potentially interesting links...
> > > > > > >> >>
> > > > > > >> >> http://wiki.apache.org/solr/QueryParser (search the page for
> > > > > > >> >> synonym_edismax)
> > > > > > >> >>
> > > > > > >> >> https://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/ (blog
> > > > > > >> >> post about what became the synonym_edismax Query Parser)
> > > > > > >> >>
> > > > > > >> >> https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
> > > > > > >> >>
> > > > > > >> >> This last was useful for lots of reasons and contains links to other
> > > > > > >> >> interesting, related web pages...
> > > > > > >> >>
> > > > > > >> >> On Thu, May 26, 2016 at 11:45 AM, Jeff Wartes <jwar...@whitepages.com> wrote:
> > > > > > >> >>
> > > > > > >> >>> Oh, interesting. I've certainly encountered issues with multi-word
> > > > > > >> >>> synonyms, but I hadn't come across this. If you end up using it with a
> > > > > > >> >>> recent solr version, I'd be glad to hear your experience.
> > > > > > >> >>>
> > > > > > >> >>> I haven't used it, but I am aware of one other project in this vein that
> > > > > > >> >>> you might be interested in looking at:
> > > > > > >> >>> https://github.com/LucidWorks/auto-phrase-tokenfilter
> > > > > > >> >>>
> > > > > > >> >>> On 5/26/16, 9:29 AM, "John Bickerstaff" <j...@johnbickerstaff.com> wrote:
> > > > > > >> >>>
> > > > > > >> >>>> Ahh - for question #3 I may have spoken too soon. This line from the
> > > > > > >> >>>> github repository readme suggests a way.
> > > > > > >> >>>>
> > > > > > >> >>>> Update: We have tested to run with the jar in $SOLR_HOME/lib as well, and
> > > > > > >> >>>> it works (Jetty).
> > > > > > >> >>>>
> > > > > > >> >>>> I'll try that and only respond back if that doesn't work.
> > > > > > >> >>>>
> > > > > > >> >>>> Questions 1 and 2 still stand of course... If anyone on the list has
> > > > > > >> >>>> experience in this area...
> > > > > > >> >>>>
> > > > > > >> >>>> Thanks.
> > > > > > >> >>>>
> > > > > > >> >>>> On Thu, May 26, 2016 at 10:25 AM, John Bickerstaff <j...@johnbickerstaff.com> wrote:
> > > > > > >> >>>>
> > > > > > >> >>>>> Hi all,
> > > > > > >> >>>>>
> > > > > > >> >>>>> I'm creating a Solr Cloud that will index and search medical text.
> > > > > > >> >>>>> Multi-word synonyms are a pretty important factor.
> > > > > > >> >>>>>
> > > > > > >> >>>>> I find that there are some challenges around multi-word synonyms and I
> > > > > > >> >>>>> also found on the wiki that there is a recommended 3rd-party parser
> > > > > > >> >>>>> (synonym_edismax parser) created by Nolan Lawson and found here:
> > > > > > >> >>>>> https://github.com/healthonnet/hon-lucene-synonyms
> > > > > > >> >>>>>
> > > > > > >> >>>>> Here's the thing - the instructions on the github site involve bringing
> > > > > > >> >>>>> the jar file into the war file - which is not applicable any more... at
> > > > > > >> >>>>> least I think it's not...
> > > > > > >> >>>>>
> > > > > > >> >>>>> I have three questions:
> > > > > > >> >>>>>
> > > > > > >> >>>>> 1. Is this still a good solution for multi-word synonyms (I.e. Solr Cloud
> > > > > > >> >>>>> doesn't break it in some way)
> > > > > > >> >>>>> 2.
> > > > > > >> >>>>>    Is there a tool or plug-in out there that the contributors would
> > > > > > >> >>>>> recommend above this one?
> > > > > > >> >>>>> 3. Assuming 1 = yes and 2 = no, can anyone tell me an updated procedure
> > > > > > >> >>>>> for bringing it in to Solr Cloud (I'm running 5.4.x)
> > > > > > >> >>>>>
> > > > > > >> >>>>> Thanks
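
For anyone landing on this thread later: the core problem Steve describes in the quoted text (the query parser splits on whitespace before analysis, so a synonym filter only ever sees one token at a time and a multi-word rule can never match) can be illustrated with a toy model. This is plain Python, not Solr or Lucene code. The synonym table and the names `expand`, `analyze_per_token` and `analyze_whole` are all made up for the illustration, and the matching is a simple greedy longest-match that only approximates what a real synonym filter does.

```python
# Toy model only: the table and function names are hypothetical,
# not Solr/Lucene APIs.
SYNONYMS = {
    ("heart", "attack"): ["myocardial infarction"],
    ("mi",): ["myocardial infarction"],
}
MAX_RULE_LEN = max(len(key) for key in SYNONYMS)


def expand(tokens):
    """Greedy longest-match synonym expansion over one token list."""
    out, i = [], 0
    while i < len(tokens):
        # Try the longest possible rule first, then shorter ones.
        for n in range(min(MAX_RULE_LEN, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in SYNONYMS:
                out.append(" ".join(key))
                out.extend(SYNONYMS[key])
                i += n
                break
        else:
            # No rule starts at this position: keep the token unchanged.
            out.append(tokens[i])
            i += 1
    return out


def analyze_per_token(query):
    """Split on whitespace first, then analyze each piece on its own."""
    out = []
    for piece in query.lower().split():
        out.extend(expand([piece]))
    return out


def analyze_whole(query):
    """Analyze the whole query at once, so multi-word rules can match."""
    return expand(query.lower().split())
```

With this table, `analyze_per_token("heart attack")` returns `['heart', 'attack']` because the two-word rule never sees both words together, while `analyze_whole("heart attack")` returns `['heart attack', 'myocardial infarction']`. Single-word rules such as `mi` expand either way, which is why the problem only shows up with multi-word synonyms.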