On Feb 24, 2010, at 1:17 AM, Sachin wrote:

> Hi All,
>
> I am trying to setup autosuggest using solr 1.4 for my site and needed some
> pointers on that. Basically, we provide autosuggest for user typed in
> characters in the searchbox. The autosuggest index is created with older user
> typed in search queries which returned > 0 results. We do some lazy writing
> to store this information into the db and then export it to solr on a nightly
> basis. As far as I know, there are 3 ways (apart from wild card search) of
> achieving autosuggest using solr 1.4:
>
> 1. Use EdgeNGrams
> 2. Use shingles and prefix query.
> 3. Use the new Terms component.

Another option you did not consider is the faceting approach I recommend in my book (p. 156). There's a poor example of this on the wiki:
http://wiki.apache.org/solr/SimpleFacetParameters#Facet_prefix_.28term_suggest.29

> I am for now more inclined towards using the EdgeNGrams (no method to
> madness) and just wanted to know is there any recommended approach out of the
> 3 in terms of performance, since the user expects the suggestions to be
> almost instantaneous? We do some heavy caching at our end to avoid hitting
> solr every time but is any of these 3 approaches faster than the other?

The Terms component should be the fastest since it has the most direct access to the underlying data. But I don't understand why people use it for auto-suggest, because it ignores the context of the query, i.e. the words before the right-most term. However, if you use KeywordTokenizer with EdgeNGram together with the Terms component, that addresses this somewhat... You don't seem interested in the case where someone once queried "a b c" and a user now types "b c"; apparently you don't want that to match. Personally, that would bug me.

I like the faceting approach, but admittedly I have not used it at scale.

~ David Smiley
Author: http://www.packtpub.com/solr-1-4-enterprise-search-server/

> Also, I would also like to return the suggestion even if the user typed in
> query matches in between: for instance if I have the query "chicken pasta" in
> my index and the user types in "pasta", I would also like this query to be
> returned as part of the suggestion (ala Yahoo!).
> Below is my field definition:
>
> <fieldType name="suggest" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="50" />
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> I tried changing the KeywordTokenizerFactory with LetterTokenizerFactory, and
> though it works great for the above scenario (does an in-between match), it
> has the side-effect of removing everything that is not a letter, so if the
> user types in "123" he gets absolutely no suggestions. Is there anything that
> I'm missing in my configuration, is this even achievable by using EdgeNGrams
> or shall I look at using perhaps the TermsComponent after applying the regex
> patch from 1.5 and maybe do something like ".*user-typed-in-chars.*"?
>
> Thanks!
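
To make the tokenizer trade-off in the quoted field definition concrete, here is a rough Python simulation of the index-side analysis. It is only a sketch, not Solr code: the function names are made up for illustration, the typed text is treated as a single lowercased term (as the quoted query analyzer does), and the whitespace tokenizer is an assumption about what might be tried next (Solr ships one as solr.WhitespaceTokenizerFactory), not something tested in this thread.

```python
import re

def edge_ngrams(token, min_gram=2, max_gram=50):
    """Front-edge n-grams of one token, approximating EdgeNGramFilterFactory."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]

def keyword_tokenizer(text):
    # The whole input is one token (like KeywordTokenizerFactory).
    return [text]

def letter_tokenizer(text):
    # Runs of letters only (like LetterTokenizerFactory); digits are dropped.
    return re.findall(r"[^\W\d_]+", text)

def whitespace_tokenizer(text):
    # Split on whitespace only, so digits survive (assumed alternative).
    return text.split()

def index_terms(text, tokenizer):
    """Tokenize, lowercase, then expand every token into its edge n-grams."""
    terms = set()
    for token in tokenizer(text):
        terms.update(edge_ngrams(token.lower()))
    return terms

def suggests(stored_query, typed, tokenizer):
    """True if the typed text, taken as one lowercased term, hits an indexed term."""
    return typed.lower() in index_terms(stored_query, tokenizer)

# Keyword tokenizer: only prefixes of the whole stored query match.
print(suggests("chicken pasta", "chi", keyword_tokenizer))    # True
print(suggests("chicken pasta", "pasta", keyword_tokenizer))  # False

# Letter tokenizer: in-between words match, but digits vanish.
print(suggests("chicken pasta", "pas", letter_tokenizer))     # True
print(suggests("route 123", "123", letter_tokenizer))         # False

# Whitespace tokenizer: in-between words match and digits are kept.
print(suggests("route 123", "123", whitespace_tokenizer))     # True
```

One caveat of this sketch: once the index side is split into words, a multi-word input such as "chicken pa" no longer matches as a single term, so the query analyzer would have to be changed as well.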