Hi Chris,

It's all a bit early in the morning for this mind :-)
The question, asked in good faith, was whether Solr supports, or can be extended to implement, a thesaurus. It looks like it does not, which is fine. It does support synonyms and synonym rings, which again is fine. The ski example was an illustration in response to a follow-up question asking for more explanation of what a thesaurus is.

An attempt at answering "why a thesaurus?" is below.

Use case 1: improve facets

Motivation
Unstructured lists of labels in facets offer a very poor user experience. As with tag clouds, users find them arbitrary, unfocused and often overwhelming. Facet labels that are grouped in ways meaningful to the user increase engagement, perceived relevance and user satisfaction.

Solution
A thesaurus of term relationships could be used to group facet labels.

Implementation (er, completely out of my depth at this point)
Thesaurus relationships are defined in a simple text file:

    term, bt=>term,term nt=> term, term rt=>term, term, pt=>term

If a search specifies a facet to be returned, the field terms are sorted into groups (broader terms, narrower terms, related terms etc.) by reading the thesaurus. These groups are returned as part of the response, so the UI can display facet labels as broader, narrower, related terms etc.

Use case 2: increase synonym search precision

Motivation
Synonym rings do not allow differences between synonyms to be identified. Synonyms are rarely exactly equivalent, and treating them as if they were decreases search precision.

Solution
Boost queries based on the thesaurus relationships of the search term.

Implementation (again, completely out of my depth here)
Allow terms in the index to be identified as BT, NT, ... terms of the search term. Allow the query parser to boost terms differentially based on these thesaurus relationships.

As for the X and Y stuff, I'm not sure; like I say, it's quite early in the morning for me. I'm sure there may well be a different way of achieving the above (but note it is more than a hierarchy).
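For what it's worth, here is a rough Python sketch of how the two use cases might hang together outside Solr. Everything in it is an assumption for illustration: the one-line-per-term file layout, the function names (parse_thesaurus, group_facet_labels, boosted_query) and the boost values are made up, and none of this is an existing Solr feature. The only Solr-side piece it leans on is the standard term^boost query syntax. The sample data is the ski example from the earlier message.

```python
from collections import defaultdict

# Relationship codes used in this thread: preferred, broader, narrower, related.
RELATIONS = ("pt", "bt", "nt", "rt")

# Hypothetical file layout (an assumption, not a Solr format):
#   term: rel=>target, target; rel=>target
SAMPLE = """\
# ski example from earlier in the thread
ski: pt=>skiing; bt=>ski and snow boarding, mountain sports, sports; nt=>down hill skiing, telemark, cross country; rt=>boarding, snow boarding, winter holidays
"""


def parse_thesaurus(text):
    """Return {term: {"pt": [...], "bt": [...], "nt": [...], "rt": [...]}}."""
    thesaurus = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        term, _, rest = line.partition(":")
        entry = {rel: [] for rel in RELATIONS}
        for clause in rest.split(";"):
            rel, sep, targets = clause.partition("=>")
            rel = rel.strip().lower()
            if not sep or rel not in entry:
                continue  # ignore malformed or unknown relationships
            entry[rel].extend(
                t.strip().lower() for t in targets.split(",") if t.strip()
            )
        thesaurus[term.strip().lower()] = entry
    return thesaurus


def group_facet_labels(term, facet_labels, thesaurus):
    """Use case 1: bucket facet labels by their relationship to the search term."""
    entry = thesaurus.get(term.lower(), {})
    groups = defaultdict(list)
    for label in facet_labels:
        key = label.lower()
        for rel in RELATIONS:
            if key in entry.get(rel, []):
                groups[rel].append(label)
                break
        else:
            groups["other"].append(label)  # label not in the thesaurus entry
    return dict(groups)


def boosted_query(term, thesaurus, boosts):
    """Use case 2: build an OR query with a different boost per relationship.

    Relationships missing from `boosts` are left out of the query.
    """
    entry = thesaurus.get(term.lower(), {})
    clauses = ['"%s"^%s' % (term, boosts.get("self", 1.0))]
    for rel in RELATIONS:
        if rel in boosts:
            clauses.extend(
                '"%s"^%s' % (t, boosts[rel]) for t in entry.get(rel, [])
            )
    return " OR ".join(clauses)


if __name__ == "__main__":
    thesaurus = parse_thesaurus(SAMPLE)
    print(group_facet_labels("Ski", ["Sports", "Telemark", "Cheese"], thesaurus))
    # Illustrative boost values only; real values would need tuning.
    print(boosted_query("ski", thesaurus, {"pt": 2.0, "nt": 1.5, "rt": 0.5}))
```

The grouping would run in the application after Solr returns the facet counts, and the boosted query string would be built before the request is sent, so neither half needs changes inside Solr itself.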
However, the librarians have been doing this for 50 years now.

Again though, just to repeat: this is hardly a killer for us. We've looked at Solr for a project; created a prototype; generated tons of questions; had them answered, in the main by the docs and some on this list; and been amazed at the fantastic results Solr has given us. In fact, with a combination of keepwords and synonyms we have got a pretty nice, simple set of facet labels anyway (my motivation for the original question), so our corpus at the moment does not really need a thesaurus! :-)

Thanks
Lee

On 9 December 2010 23:38, Chris Hostetter <hossman_luc...@fucit.org> wrote:

>
> : a term can have a Preferred Term (PT), many Broader Terms (BT), Many Narrower
> : Terms (NT) Related Terms (RT) etc
>       ...
> : User supplied Term is say : Ski
> :
> : Preferred term: Skiing
> : Broader terms could be : Ski and Snow Boarding, Mountain Sports, Sports
> : Narrower terms: down hill skiing, telemark, cross country
> : Related terms: boarding, snow boarding, winter holidays
>
> I'm still lost.
>
> You've described a black box with some sample input ("Ski") and some
> corresponding sample output (PT=..., BT=..., NT=..., RT=....) -- but you
> haven't explained what you want to do with that black box. Assuming such a
> black box existed in solr what are you expecting/hoping to do with it?
> how would such a black box modify solr's user experience? what is your
> goal?
>
> Smells like an XY Problem...
> http://people.apache.org/~hossman/#xyproblem
>
> Your question appears to be an "XY Problem" ... that is: you are dealing
> with "X", you are assuming "Y" will help you, and you are asking about "Y"
> without giving more details about the "X" so that we can understand the
> full issue. Perhaps the best solution doesn't involve "Y" at all?
> See Also: http://www.perlmonks.org/index.pl?node_id=542341
>
>
> -Hoss
>