Re: Your valuable suggestion on autocomplete

Walter Underwood Tue, 06 May 2008 21:24:46 -0700

Query logs are full of junk. We fill from the correct values in the
search index. We used to fill directly from the DB, but there were
updates in the DB that weren't in Solr.


Every two hours, the loader runs a search for "type:movie" and
retrieves the title field for every match. Those titles are loaded
into the ternary search tree, so the search box completes movie
titles. Very helpful for Ratatouille or Koyaanisqatsi.
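
For readers who have not seen one, here is a minimal sketch of a ternary
search tree used as a prefix map. It is not the production code, only an
illustration. The class name, the case-insensitive matching, and the
"limit" parameter are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Illustrative ternary search tree: lowercased key -> original title. */
class TernarySearchTree {
    private static final class Node {
        final char ch;
        Node lo, eq, hi;
        String title; // non-null only where a complete title ends
        Node(char ch) { this.ch = ch; }
    }

    private Node root;

    /** Adds a title. Matching is case-insensitive (an assumption); empty titles are ignored. */
    void insert(String title) {
        String key = title.toLowerCase(Locale.ROOT);
        if (key.isEmpty()) return;
        root = insert(root, key, 0, title);
    }

    private Node insert(Node node, String key, int i, String title) {
        char c = key.charAt(i);
        if (node == null) node = new Node(c);
        if (c < node.ch) node.lo = insert(node.lo, key, i, title);
        else if (c > node.ch) node.hi = insert(node.hi, key, i, title);
        else if (i + 1 < key.length()) node.eq = insert(node.eq, key, i + 1, title);
        else node.title = title;
        return node;
    }

    /** Returns up to limit titles whose key starts with prefix, in key order. */
    List<String> complete(String prefix, int limit) {
        List<String> out = new ArrayList<>();
        String key = prefix.toLowerCase(Locale.ROOT);
        if (key.isEmpty() || limit <= 0) return out;
        Node node = root;
        int i = 0;
        // Walk down to the node that matches the last character of the prefix.
        while (node != null) {
            char c = key.charAt(i);
            if (c < node.ch) node = node.lo;
            else if (c > node.ch) node = node.hi;
            else if (++i == key.length()) break;
            else node = node.eq;
        }
        if (node == null) return out; // no title has this prefix
        if (node.title != null) out.add(node.title);
        collect(node.eq, limit, out);
        return out;
    }

    /** In-order walk of a subtree, stopping once limit titles are collected. */
    private void collect(Node node, int limit, List<String> out) {
        if (node == null || out.size() >= limit) return;
        collect(node.lo, limit, out);
        if (node.title != null && out.size() < limit) out.add(node.title);
        collect(node.eq, limit, out);
        collect(node.hi, limit, out);
    }
}
```

With titles inserted via insert("Koyaanisqatsi") and so on, a call such as
complete("koy", 10) returns the matching titles without touching Solr.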

You can try it on the non-member pages at www.netflix.com: click
the "Browse" tab instead of signing up. It would be OK if you signed
up, of course.

The number of hits per request is sized to match the max cached
request in our middle-tier HTTP server. We have over twenty front-end
webapps and five back-end Solr servers.

wunder 

On 5/6/08 9:50 AM, "Otis Gospodnetic" <[EMAIL PROTECTED]> wrote:

> Hi Wunder,
> 
> ----- Original Message ----
>> From: Walter Underwood <[EMAIL PROTECTED]>
>> To: solr-user@lucene.apache.org
>> Sent: Tuesday, May 6, 2008 11:21:31 AM
>> Subject: Re: Your valuable suggestion on autocomplete
>> 
>> I wrote a prefix map (ternary search tree) in Java and load it with
>> queries to Solr every two hours. That keeps the autocomplete and
>> search index in sync.
> 
> What do you mean by the two staying in sync?  If you fill the TST with info
> from query logs, how does that make it stay in sync with the index?  Or do you
> mean you look for queries with >N hits (maybe even N=1) and only feed those
> into TST, thus ensuring autocomplete always suggests queries that yield hits?
> 
> Thanks,
> Otis
> 
>> Our autocomplete gets over 25M hits per day, so we don't really
>> want to send all that traffic to Solr.
>> 
>> wunder
>> 
>> On 5/6/08 2:37 AM, "Nishant Soni"  wrote:
>> 
>>> Just FYI, we have also implemented a Trie approach (outside of solr, even
>>> though our mail search uses solr) at the link in the signature.
>>> 
>>> You can try out the auto-completion working on the comparison tool on the
>>> home page.
>>> 
>>> - nishant
>>> 
>>> www.reviewgist.com
>>> 
>>> 
>>>  
>>> 
>>> 
>>> ----- Original Message ----
>>> From: Vaijanath N. Rao
>>> To: solr-user@lucene.apache.org
>>> Sent: Tuesday, May 6, 2008 12:43:25 PM
>>> Subject: Re: Your valuable suggestion on autocomplete
>>> 
>>> Hi Rantjil Bould,
>>> 
>>> I would suggest you consider the Trie data structure, which is commonly
>>> used for auto-complete. Hitting Solr for every prefix looks like a
>>> time-consuming job, but I might be wrong. I have a Trie implementation
>>> and it works very fast (of course, it is an in-memory data structure,
>>> unlike the Solr index, which lies on disk).
>>> 
>>> --Thanks and Regards
>>> Vaijanath
>>> 
>>> 
>>> 
>>> Rantjil Bould wrote:
>>>> Hi Group,
>>>> I have already got some valuable suggestions from the group. Based
>>>> on those, I have come up with the following process to finally implement
>>>> an autocomplete-like feature in my system:
>>>> 1- Index all the documents
>>>> 2- Extract all terms using IndexReader's terms() method
>>>> 
>>>> I am getting terms like vl, vla, vlan, vlana, vlanan, vlanand, but I would
>>>> like to get only the complete terms, i.e. vlanand. The field definition in
>>>> Solr is:
>>>> 
>>>> [field definition: the XML element names were lost in the archive; only
>>>> the following attribute fragments survive]
>>>> ... words="stopwords.txt" enablePositionIncrements="true">
>>>> ... generateWordParts="1" generateNumberParts="1" catenateWords="1"
>>>> catenateNumbers="1" catenateAll="0" splitOnCaseChange="0">
>>>> ... protected="protwords.txt">
>>>> ... ignoreCase="true" expand="true">
>>>> ... words="stopwords.txt">
>>>> ... generateWordParts="1" generateNumberParts="1" catenateWords="0"
>>>> catenateNumbers="0" catenateAll="0" splitOnCaseChange="1">
>>>> ... protected="protwords.txt">
>>>> 
>>>> I would appreciate your input on how to get only the complete terms.
>>>> 
>>>> 3- For each term, extract the documents containing that term using the
>>>> termDocs() method
>>>> 4- Create one more index with the fields term, frequency and docNo. This
>>>> index would be used for the autocomplete feature.
>>>> 5- For each letter typed by the user in the search field, use an Ajax
>>>> script (like scriptaculous or jQuery) to fetch all matching terms using
>>>> a prefix query.
>>>> 6- Based on the search term selected by the user, keep track of the
>>>> numbers of the documents that contain this term.
>>>> 7- For the next search term selection, use those document numbers to
>>>> select all terms excluding the currently selected term.
>>>> 
>>>> This somehow works. As I am new to Solr and also to Lucene, I would like
>>>> to know whether it can be improved.
>>>> 
>>>> - RB
>>>> 
>>>>  
>>> 
>>> 
>> 
>> 
> 
>
