Shawn,

Have you looked at
http://www.sematext.com/products/dym-researcher/index.html
as a solution to the ZeroHits problem?

If that doesn't work, then yes, offline word/phrase co-occurrence may work.
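
As a rough illustration of the offline route, here is a minimal Python sketch
that counts short word sequences across metadata strings exported from the
database. The names (tokenize, common_phrases), the tokenization rule, and the
min_count threshold are all assumptions made for this example and would need
tuning against the real data; it is not taken from any Solr or Lucene API.

```python
import re
from collections import Counter

# Assumed tokenization: lowercase runs of letters, digits and apostrophes.
TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Split one metadata string into lowercase tokens."""
    return TOKEN_RE.findall(text.lower())


def common_phrases(docs, n=2, min_count=2):
    """Return (phrase, document_count) pairs for n-word phrases that
    appear in at least min_count documents, most frequent first."""
    counts = Counter()
    for doc in docs:
        tokens = tokenize(doc)
        # Count each distinct phrase once per document so that a single
        # long caption cannot dominate the totals.
        grams = {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
        counts.update(grams)
    return [(phrase, c) for phrase, c in counts.most_common() if c >= min_count]


if __name__ == "__main__":
    sample = [
        "Tom Cruise and Nicole Kidman at premiere",
        "Nicole Kidman arrives",
        "Tom Cruise on set",
    ]
    for phrase, count in common_phrases(sample):
        print(count, phrase)
```

Because this runs against a database export rather than the index, it adds no
load or size to the Solr side. The resulting phrase list could then serve as
the raw material for a suggestion dictionary, after the same kind of cleanup
you plan for the single-word list.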

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/


>________________________________
>From: Shawn Heisey <s...@elyograg.org>
>To: solr-user@lucene.apache.org
>Sent: Wednesday, October 5, 2011 4:06 PM
>Subject: Offering search suggestions - a discussion of multi-term phrases
>
>I am trying to figure out how we can begin offering search suggestions to 
>people, especially when a user types in something that results in few or zero 
>results.  For background, we have an archive of about 60 million objects, most 
>of which are photographs.  There are also a number of text articles, and most 
>recently, videos.  The metadata is kept in a database, and the database is 
>used as the import source for Solr.
>
>The first thing we're going to try is spellcheck, using the terms component to 
>generate a wordlist from our catchall field and then doing what we can with 
>a program to remove undesirable words.  I do not anticipate running into much 
>trouble with this part.
>
>Another idea we have is search suggestions.  One aspect is autocomplete, the 
>other is similar to the spell-check, but more sophisticated.  It would do 
>things like offer "Nicole Kidman" if the user typed in "Tom Cruise" and didn't 
>get many search results.
>
>The problem I can see with all of these things is that single terms will not 
>really be enough, and single terms are all I can get out of the index.  Our 
>distributed index is already quite a bit larger than the available RAM on the 
>machines that contain it, and it's growing steadily.  Adding analysis 
>complexity or copyFields to the index is not much of an option, because we 
>have no budget available for new hardware, but I won't completely rule it out.
>
>Is there any way, even if it's offline analysis of either the index or the 
>database, to come up with common short phrases specific to our data?  If there 
>is, perhaps I can then give it to Solr and let it make suggestions with it.
>
>Thanks,
>Shawn
>
>
>
>
