Eugeny,
I think you want something more useful and less problematic, as Wunder already
pointed out.
Wouldn't you want your suggestions to be ordered by how close a match they
are? And do you really want them to be purely prefix-based like in your
example?
What if people are searching for Mic
The original request was for suggestions ranked purely by request count. You
have designed something more complicated that probably works better.
When I built query completion at Netflix, I used the movie rental rates to rank
suggestions. That was simple and very effective. We didn't need a more
Of course you can fight spam. And the spammers can fight back. I prefer
algorithms that don't require an arms race with spammers.
There are other problems with using query frequency. What about all the
legitimate users that type "google" or "facebook" into the query box instead of
into the loca
A query log parser can be written to detect spam. At first you can use cookies
(e.g. sessions) and IP-addresses to detect term spam. You can also limit a
popularity spike to a reasonable mean size over a longer period. And you can
limit rates using logarithms.
There are many ways to deal with s
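To make two of the ideas above concrete (counting each cookie or IP address only once per query, and damping counts with a logarithm), here is a minimal sketch. The function name and the `(client_id, query)` log shape are assumptions for illustration, not anything from Solr; limiting a spike against a longer-term mean is not covered here.

```python
import math
from collections import defaultdict


def damped_query_scores(log_entries):
    """Score queries by the log of their distinct-client count.

    log_entries is an iterable of (client_id, query) pairs, where client_id
    is a hypothetical session cookie or IP address taken from the query log.
    """
    clients_per_query = defaultdict(set)
    for client_id, query in log_entries:
        # Normalize case and whitespace so trivial variants are merged.
        normalized = " ".join(query.lower().split())
        if normalized:
            clients_per_query[normalized].add(client_id)
    # log1p damps large counts; repeats from one client add nothing.
    return {q: math.log1p(len(c)) for q, c in clients_per_query.items()}


# One client repeating a query 1000 times versus three distinct clients.
log = [("ip-1", "solr suggester")] * 1000
log += [("ip-2", "Solr  Suggester"), ("ip-3", "lucene"),
        ("ip-4", "lucene"), ("ip-5", "lucene")]
scores = damped_query_scores(log)
```

In this sample "lucene" outranks "solr suggester" even though the latter has far more raw hits, because only distinct clients are counted. A bot that rotates IP addresses would still get through, so this only raises the cost of spamming.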
Ranking suggestions based on query count would be trivially easy to spam. Have
a bot make my preferred queries over and over again, and "boom" they are the
most-preferred.
wunder
On Sep 20, 2011, at 3:41 PM, Markus Jelsma wrote:
At least, I assumed this is what the user asked for when I read "which counts
requests and sorts suggestions according to this count"
No. The spellchecker and suggester only operate on the index (tf*idf) and do
not account for user generated input which is what the user asks for.
You need to parse the query logs periodically and index the query strings and
their number of occurrences in the query logs as a float value (or use
ExternalFileField) to
o
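The log-parsing step described above could look roughly like this. It is only a sketch: it assumes a log with one raw query string per line, and the `key=value` output is the layout I recall ExternalFileField reading, so check the Solr documentation before relying on it. The function names are made up for the example.

```python
from collections import Counter


def count_queries(log_lines):
    """Count occurrences of each normalized query string."""
    counts = Counter()
    for line in log_lines:
        # Merge case and whitespace variants; skip empty lines.
        query = " ".join(line.lower().split())
        if query:
            counts[query] += 1
    return counts


def to_key_value_lines(counts):
    """Render counts as 'query=float' lines, most frequent first."""
    return [f"{query}={float(n)}" for query, n in counts.most_common()]


sample_log = ["Microsoft", "microsoft ", "netflix", "", "microsoft"]
lines = to_key_value_lines(count_queries(sample_log))
```

Queries that contain "=" would need escaping or a different key, which this sketch does not handle.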
From http://wiki.apache.org/solr/Suggester :
spellcheck.onlyMorePopular=true - if this parameter is set to true then the
suggestions will be sorted by weight ("popularity") - the count parameter
will effectively limit this to a top-N list of best suggestions. If this is
set to false then suggesti
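For reference, a request using that parameter might be built like this. The host, port, and `/suggest` handler path are assumptions about a local setup, and I am reading "the count parameter" in the wiki text as `spellcheck.count`.

```python
from urllib.parse import urlencode


def suggest_url(base_url, prefix, count=5):
    """Build a suggester request URL sorted by weight ("popularity")."""
    params = {
        "q": prefix,
        "spellcheck": "true",
        "spellcheck.onlyMorePopular": "true",
        # Assumed to be the "count" the wiki refers to: top-N cutoff.
        "spellcheck.count": str(count),
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical local Solr instance with a /suggest request handler.
url = suggest_url("http://localhost:8983/solr/suggest", "mic")
```

Note that the weights here still come from the index or a dictionary file, not from user queries, so this alone does not give ranking by request count.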