Just FYI, we have also implemented a Trie approach (outside of solr, even though our mail search uses solr) at the link in the signature.
You can try out the auto-completion on the comparison tool on the home page.

- nishant
www.reviewgist.com

----- Original Message ----
From: Vaijanath N. Rao <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Tuesday, May 6, 2008 12:43:25 PM
Subject: Re: Your valuable suggestion on autocomplete

Hi Rantjil Bould,

I would suggest you consider the Trie data structure, which is commonly used for auto-complete. Hitting Solr for every prefix looks like a time-consuming job, but I might be wrong. I have a Trie implementation and it works very fast (of course, it is an in-memory data structure, unlike the Solr index, which lies on disk).

--Thanks and Regards
Vaijanath

Rantjil Bould wrote:
> Hi Group,
> I have already got some valuable suggestions from the group. Based
> on that, I have come up with the following process to finally implement
> an autocomplete-like feature in my system:
> 1- Index the whole documents
> 2- Extract all terms using indexReader's terms() method
>
> I am getting terms like vl,vla,vlan,vlana,vlanan,vlanand. But I would like
> to get absolute terms i.e. vlanand.
> The field definition in Solr is:
>
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"></tokenizer>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt" enablePositionIncrements="true"></filter>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1" catenateWords="1"
>             catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"></filter>
>     <filter class="solr.LowerCaseFilterFactory"></filter>
>     <filter class="solr.EnglishPorterFilterFactory"
>             protected="protwords.txt"></filter>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"></filter>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"></tokenizer>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>             ignoreCase="true" expand="true"></filter>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt"></filter>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1" catenateWords="0"
>             catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"></filter>
>     <filter class="solr.LowerCaseFilterFactory"></filter>
>     <filter class="solr.EnglishPorterFilterFactory"
>             protected="protwords.txt"></filter>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"></filter>
>   </analyzer>
> </fieldType>
>
> I would appreciate your input on how to get absolute terms.
>
> 3- For each term, extract the documents containing that term using the
> termDocs() method
> 4- Create one more index with the fields term, frequency and docNo. This index
> would be used for the autocomplete feature.
> 5- For any letter typed by the user in the search field, use an Ajax script
> (like scriptaculous or jQuery) to extract all terms using a prefix query.
> 6- Based on the search term selected by the user, keep track of the document
> nos in which this term occurs.
> 7- For the next search term selection, use those document nos to select all
> terms, excluding the currently selected term.
>
> This somehow works. As I am new to Solr and also to Lucene, I would like to know
> whether it can be improved.
>
> - RB
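
For anyone following the thread, here is a rough idea of what an in-memory Trie for prefix completion can look like. This is only a minimal sketch, not the ReviewGist code or Vaijanath's implementation: the names `TrieNode`, `Trie`, `insert` and `complete` are made up for illustration, and it assumes the complete terms (e.g. vlanand rather than vl, vla, ...) have already been extracted and fit in memory.

```python
from typing import Dict, List


class TrieNode:
    """One node per character; is_term marks the end of a complete term."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_term = False


class Trie:
    """Hypothetical in-memory trie holding the terms to auto-complete."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, term: str) -> None:
        # Walk down the tree, creating nodes for characters not seen yet.
        node = self.root
        for ch in term:
            node = node.children.setdefault(ch, TrieNode())
        node.is_term = True

    def complete(self, prefix: str, limit: int = 10) -> List[str]:
        # Step 1: follow the prefix; give up if any character is missing.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        # Step 2: depth-first walk below the prefix node, collecting terms.
        results: List[str] = []
        stack = [(node, prefix)]
        while stack and len(results) < limit:
            current, text = stack.pop()
            if current.is_term:
                results.append(text)
            # Push children in reverse order so they pop alphabetically.
            for ch in sorted(current.children, reverse=True):
                stack.append((current.children[ch], text + ch))
        return results


if __name__ == "__main__":
    trie = Trie()
    for term in ["vlan", "vlanand", "video"]:
        trie.insert(term)
    print(trie.complete("vl"))
```

Each keystroke then costs one walk over the typed prefix plus a bounded traversal, with no round trip to the index. Ranking suggestions by term frequency (step 4 above) would need a count stored on each terminal node, which this sketch leaves out.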