Thanks, Avlesh, for sharing the info. I will try it! In the meantime, I also found another solution: http://metaoptimize.com/qa/questions/17/stemming-problems-when-writing-search-auto-complete
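
For anyone following the thread, here is a minimal sketch of how the two EdgeNGram fields described in Avlesh's message below might behave against my sample phrases. It is plain Python that imitates the analysis chains, not Solr itself. The gram sizes, the boost values, the rule that every query token must match in "autocomplete_token", and all function names are my own assumptions for illustration; the real setup on askme.in may differ.

```python
from typing import Dict, List, Set

# Sample phrases taken from the original question in this thread.
PHRASES = [
    "Lorem ipsum dolor sit amet",
    "tincidunt ut laoreet",
    "dolore eu feugiat nulla facilisis at vero eros et",
    "te feugait nulla facilisi",
    "Claritas est etiam processus",
    "anteposuerit litterarum formas humanitatis",
    "fiant sollemnes in futurum",
    "Hieyed ddi lorem ipsum dolor",
    "test lorem ipsume",
    "test xyz lorem ipslili",
]

# Hypothetical boosts: a whole-phrase prefix match outranks a per-token match.
FULL_BOOST = 10.0
TOKEN_BOOST = 1.0


def edge_ngrams(token: str, min_gram: int = 1, max_gram: int = 50) -> Set[str]:
    """Front-edge n-grams of a token, e.g. 'ips' -> {'i', 'ip', 'ips'}."""
    return {token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)}


def index_phrase(phrase: str) -> Dict[str, Set[str]]:
    """Imitate the two analysis chains (lowercasing + edge n-grams)."""
    lowered = phrase.lower()
    # Keyword-style tokenizing: the whole phrase is a single token.
    full_terms = edge_ngrams(lowered)
    # Whitespace-style tokenizing: each word is its own token.
    token_terms: Set[str] = set()
    for tok in lowered.split():
        token_terms |= edge_ngrams(tok)
    return {"autocomplete_full": full_terms, "autocomplete_token": token_terms}


def suggest(query: str, phrases: List[str]) -> List[str]:
    """Return matching phrases, highest score first, ties in input order."""
    q = " ".join(query.lower().split())
    if not q:
        return []
    q_tokens = q.split()
    scored = []
    for position, phrase in enumerate(phrases):
        terms = index_phrase(phrase)
        score = 0.0
        # The whole query is a prefix of the whole phrase.
        if q in terms["autocomplete_full"]:
            score += FULL_BOOST
        # Assumption: every query token must be a prefix of some word.
        if all(t in terms["autocomplete_token"] for t in q_tokens):
            score += TOKEN_BOOST
        if score > 0:
            scored.append((-score, position, phrase))
    return [phrase for _, _, phrase in sorted(scored)]
```

With these assumptions, `suggest("lorem ipsl", PHRASES)` returns only "test xyz lorem ipslili", and `suggest("lorem", PHRASES)` returns the four expected phrases with "Lorem ipsum dolor sit amet" ranked first because it also matches on the full-phrase field. The per-token match ignores word order, which would explain why [business c] can bring up [Consultant Business] on askme.in.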

Kind regards.

On 8/4/2010 9:13 PM, Avlesh Singh wrote:
> I preferred to answer this question privately earlier. But I have received
> innumerable requests to unveil the architecture. For the benefit of all, I
> am posting it here (after hiding as much info as I should, in my company's
> interest).
>
> The context: Auto-suggest feature on http://askme.in
>
> *Solr setup*: Below are some of the salient features -
>
>    1. TermsComponent is NOT used.
>    2. The index is made up of 4 fields of the following types -
>    "autocomplete_full", "autocomplete_token", "string" and "text".
>    3. "autocomplete_full" uses KeywordTokenizerFactory and
>    EdgeNGramFilterFactory. "autocomplete_token" uses
>    WhitespaceTokenizerFactory and EdgeNGramFilterFactory. Both of these
>    are Solr text fields with standard filters like LowerCaseFilterFactory
>    etc. applied during querying and indexing.
>    4. The standard DataImportHandler and a bunch of SQL procedures are used
>    to "derive" all suggestable phrases from the system and index them in
>    the above-mentioned fields.
>
> *Controller setup*: The controller (to handle suggest queries) is a typical
> Java servlet using Solr as its backend (connecting via solrj). Based on the
> incoming query string, a Lucene query is created. It is a BooleanQuery
> comprising TermQuery clauses across all the above-mentioned fields. The
> boost factor on each of these term queries determines (to an extent) what
> kind of matches you prefer to show up first. JSON is used as the data
> exchange format.
>
> *Frontend setup*: It is home-grown JS that addresses some specific use cases
> of the project in question. One simple exercise with Firebug will spill all
> the beans. However, I strongly recommend using jQuery to build (and extend)
> the UI component.
>
> Any help beyond this is available, but off the list.
>
> Cheers
> Avlesh
> @avlesh<http://twitter.com/avlesh> | http://webklipper.com
>
> On Tue, Aug 3, 2010 at 10:04 AM, Bhavnik Gajjar <bhavnik.gaj...@gatewaynintec.com> wrote:
>
>> Whoops!
>>
>> The table still does not look OK :(
>>
>> Trying to send it once again:
>>
>> lorem         Lorem ipsum dolor sit amet
>>               Hieyed ddi lorem ipsum dolor
>>               test lorem ipsume
>>               test xyz lorem ipslili
>>
>> lorem ip      Lorem ipsum dolor sit amet
>>               Hieyed ddi lorem ipsum dolor
>>               test lorem ipsume
>>               test xyz lorem ipslili
>>
>> lorem ipsl    test xyz lorem ipslili
>>
>> On 8/3/2010 10:00 AM, Bhavnik Gajjar wrote:
>>
>> Avlesh,
>>
>> Thanks for responding.
>>
>> The table mentioned below looks like this:
>>
>> lorem         Lorem ipsum dolor sit amet
>>               Hieyed ddi lorem ipsum dolor
>>               test lorem ipsume
>>               test xyz lorem ipslili
>>
>> lorem ip      Lorem ipsum dolor sit amet
>>               Hieyed ddi lorem ipsum dolor
>>               test lorem ipsume
>>               test xyz lorem ipslili
>>
>> lorem ipsl    test xyz lorem ipslili
>>
>> Yes, [http://askme.in] looks good!
>>
>> I would like to know its design, Solr configuration, etc. Can you
>> please provide me with a detailed view of it?
>>
>> In [http://askme.in], there is one thing to be noted. Search text like
>> [business c] populates [Business Centre], which looks OK, but [Consultant
>> Business] looks a bit odd. In general, though, the pointer you suggested
>> is a great place to start.
>>
>> On 8/2/2010 8:39 PM, Avlesh Singh wrote:
>>
>> From whatever I could read in your broken table of sample use cases, I
>> think you are looking for something similar to what has been done here -
>> http://askme.in; if this is what you are looking for, do let me know.
>>
>> Cheers
>> Avlesh
>> @avlesh<http://twitter.com/avlesh> | http://webklipper.com
>>
>> On Mon, Aug 2, 2010 at 8:09 PM, Bhavnik Gajjar <bhavnik.gaj...@gatewaynintec.com> wrote:
>>
>> Hi,
>>
>> I'm looking for a solution related to an auto-complete feature for one
>> application.
>>
>> Below is a list of texts from which auto-complete results would be
>> populated.
>>
>> Lorem ipsum dolor sit amet
>> tincidunt ut laoreet
>> dolore eu feugiat nulla facilisis at vero eros et
>> te feugait nulla facilisi
>> Claritas est etiam processus
>> anteposuerit litterarum formas humanitatis
>> fiant sollemnes in futurum
>> Hieyed ddi lorem ipsum dolor
>> test lorem ipsume
>> test xyz lorem ipslili
>>
>> Consider the table below. The first column is the value entered by the
>> user and the second column is the expected result (the list of
>> auto-complete terms that should be populated from Solr).
>>
>> User input    Expected suggestions
>>
>> lorem         *Lorem* ipsum dolor sit amet
>>               Hieyed ddi *lorem* ipsum dolor
>>               test *lorem* ipsume
>>               test xyz *lorem* ipslili
>>
>> lorem ip      *Lorem ip*sum dolor sit amet
>>               Hieyed ddi *lorem ip*sum dolor
>>               test *lorem ip*sume
>>               test xyz *lorem ip*slili
>>
>> lorem ipsl    test xyz *lorem ipsl*ili
>>
>> Can anyone share ideas on how this can be achieved with Solr? I have
>> already tried various tokenizers and filter factories, such as
>> WhitespaceTokenizer, KeywordTokenizer, EdgeNGramFilterFactory and
>> ShingleFilterFactory, but no luck so far.
>>
>> Note that it would be excellent if the terms populated from Solr could be
>> highlighted using Highlighting or any other component/mechanism of Solr.
>>
>> *Note:* Standard autocomplete (like
>> facet.field=AutoComplete&f.AutoComplete.facet.prefix=<user entered term>&f.AutoComplete.facet.limit=10&facet.sort&rows=0)
>> is already working fine in the application, but we are now looking to
>> enhance the existing auto-complete with the above requirement.
>>
>> Any thoughts?
>>
>> Thanks in advance

-- 
Regards,
*Bhavnik Gajjar*
www.gatewaynintec.com
*Mobile:* +91-9998436253
*Phone:* +91 79 2685 2554 / 5 / 6
*MSN:* bhavnik.gaj...@gatewaynintec.com