Re: autocomplete with popularity

2011-09-22 Thread Otis Gospodnetic
Eugeny, I think you want something more useful and less problematic, as Wunder already pointed out. Wouldn't you want your suggestions to be ordered by how close a match they are? And do you really want them to be purely prefix-based like in your example? What if people are searching for Mic

Re: autocomplete with popularity

2011-09-20 Thread Markus Jelsma
> The original request was for suggestions ranked purely by request count. > You have designed something more complicated that probably works better. > > When I built query completion at Netflix, I used the movie rental rates to > rank suggestions. That was simple and very effective. We didn't ne

Re: autocomplete with popularity

2011-09-20 Thread Walter Underwood
The original request was for suggestions ranked purely by request count. You have designed something more complicated that probably works better. When I built query completion at Netflix, I used the movie rental rates to rank suggestions. That was simple and very effective. We didn't need a more

Re: autocomplete with popularity

2011-09-20 Thread Markus Jelsma
> Of course you can fight spam. And the spammers can fight back. I prefer > algorithms that don't require an arms race with spammers. > > There are other problems with using query frequency. What about all the > legitimate users that type "google" or "facebook" into the query box > instead of int

Re: autocomplete with popularity

2011-09-20 Thread Walter Underwood
Of course you can fight spam. And the spammers can fight back. I prefer algorithms that don't require an arms race with spammers. There are other problems with using query frequency. What about all the legitimate users that type "google" or "facebook" into the query box instead of into the loca

Re: autocomplete with popularity

2011-09-20 Thread Markus Jelsma
A query log parser can be written to detect spam. At first you can use cookies (e.g. sessions) and IP-addresses to detect term spam. You can also limit a popularity spike to a reasonable mean size over a longer period. And you can limit rates using logarithms. There are many ways to deal with s
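
The two ideas above (count each client once per term, then dampen with a logarithm) might be sketched roughly as follows. This is only an illustration under assumptions: the `(client_id, query)` event shape and the name `damped_popularity` are hypothetical, and `client_id` stands in for whatever cookie, session, or IP address the log actually records.

```python
import math
from collections import defaultdict

def damped_popularity(events):
    """Score queries by log(1 + number of distinct clients).

    events: iterable of (client_id, query) pairs (hypothetical log shape).
    """
    clients_per_query = defaultdict(set)
    for client_id, query in events:
        # A client repeating the same query is counted only once.
        clients_per_query[query].add(client_id)
    # The logarithm flattens spikes: 10x more clients adds a constant, not 10x.
    return {q: math.log1p(len(c)) for q, c in clients_per_query.items()}

# One bot hammering "spam" 1000 times scores lower than three real users.
events = [("bot", "spam")] * 1000 + [("a", "solr"), ("b", "solr"), ("c", "solr")]
scores = damped_popularity(events)
```

A bot that rotates cookies or addresses would still get through, so this only raises the cost of spamming; it does not end the arms race.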

Re: autocomplete with popularity

2011-09-20 Thread Walter Underwood
Ranking suggestions based on query count would be trivially easy to spam. Have a bot make my preferred queries over and over again, and "boom" they are the most-preferred. wunder On Sep 20, 2011, at 3:41 PM, Markus Jelsma wrote: > At least, i assumed this is what the user asked for when i read

Re: autocomplete with popularity

2011-09-20 Thread Markus Jelsma
At least, I assumed this is what the user asked for when I read "which counts requests and sorts suggestions according to this count" > No. The spellchecker and suggester only operate on the index (tf*idf) and > do not account for user generated input which is what the user asks for. > > You nee

Re: autocomplete with popularity

2011-09-20 Thread Markus Jelsma
No. The spellchecker and suggester only operate on the index (tf*idf) and do not account for user-generated input, which is what the user asks for. You need to parse the query logs periodically and index the query strings and #occurrences in the query logs as a float value (or use ExternalFileField) to o
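
The log-parsing step could look something like the sketch below. It assumes a log with one raw query string per line, which is a simplification, and the field names `query` and `popularity` are hypothetical placeholders for whatever the schema defines. Sending the documents to Solr is left out.

```python
from collections import Counter

def count_queries(log_lines):
    """Return {normalized query: number of occurrences as a float}."""
    counts = Counter()
    for line in log_lines:
        # Normalize case and whitespace so "Solr " and "solr" count together.
        query = " ".join(line.lower().split())
        if query:
            counts[query] += 1
    return {q: float(n) for q, n in counts.items()}

def to_documents(counts):
    """Build one document per query string, most frequent first."""
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [{"query": q, "popularity": n} for q, n in ranked]

docs = to_documents(count_queries(["Microsoft", "microsoft ", "mic", ""]))
# docs == [{"query": "microsoft", "popularity": 2.0},
#          {"query": "mic", "popularity": 1.0}]
```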

Re: autocomplete with popularity

2011-09-20 Thread O. Klein
From http://wiki.apache.org/solr/Suggester : spellcheck.onlyMorePopular=true - if this parameter is set to true then the suggestions will be sorted by weight ("popularity") - the count parameter will effectively limit this to a top-N list of best suggestions. If this is set to false then suggesti
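
The behaviour the wiki text describes for the true case (sort by weight, then cut to a top-N list) can be illustrated in plain Python. This is not Solr's implementation; `suggest`, `weights`, and the sample terms are hypothetical.

```python
def suggest(weights, prefix, count=5):
    """Return up to `count` terms starting with `prefix`, heaviest first."""
    matches = [(term, w) for term, w in weights.items() if term.startswith(prefix)]
    # Sort by descending weight; break ties alphabetically for stable output.
    matches.sort(key=lambda tw: (-tw[1], tw[0]))
    return [term for term, _ in matches[:count]]

weights = {"microsoft": 120.0, "microwave": 45.0, "michigan": 80.0, "java": 300.0}
top = suggest(weights, "mic", count=2)
# top == ["microsoft", "michigan"]
```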