To avoid the "users only see the first page" problem, one solution is: if
the result set spans more than one page and the top scores are all close
together, scramble them.
That is, if the top 20 results range in score from 19.0 to 20.0, they really
are all about the same relevance, so just card-shuffle them.
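The shuffle described above can be sketched in a few lines. The band width and the (doc_id, score) result shape are illustrative assumptions, not anything Solr provides out of the box:

```python
import random

def shuffle_score_band(results, band=1.0, seed=None):
    """Shuffle the leading results whose scores all sit within `band`
    of the top score, leaving the rest of the ranking untouched.

    `results` is a list of (doc_id, score) pairs sorted by score
    descending, e.g. the top page of a search response."""
    if not results:
        return results
    top_score = results[0][1]
    cut = 0
    while cut < len(results) and top_score - results[cut][1] <= band:
        cut += 1
    head = results[:cut]
    random.Random(seed).shuffle(head)  # seedable for reproducible tests
    return head + results[cut:]

# Top three scores fall in [19.0, 20.0], so only they get shuffled.
ranked = [("a", 20.0), ("b", 19.5), ("c", 19.0), ("d", 12.0)]
reshuffled = shuffle_score_band(ranked, band=1.0, seed=7)
```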
Yes, applying a boost would be a good addition.
patches are always welcome ;)
On Jan 30, 2009, at 10:56 AM, Matthew Runo wrote:
I've thought about patching the QueryElevationComponent to apply
boosts rather than a specific sort. Then the file might look like..
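Purely as a sketch, a boost-based variant of the elevation file might look something like this. The `boost` attribute is hypothetical; the stock QueryElevationComponent only supports a fixed ordering:

```xml
<!-- Hypothetical format: "boost" is a made-up attribute a patched
     QueryElevationComponent might accept instead of forcing rank. -->
<elevate>
  <query text="charlie brown">
    <doc id="prod-1234" boost="3.0"/>
    <doc id="prod-5678" boost="1.5"/>
  </query>
</elevate>
```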
And I could write a script that looks at click data once a day to fill
out this file.
Thanks for your time!
Matthew Runo
Software Engineer, Zappos.c
Matthew Runo wrote:
Which papers did you see that actually talked about using clicks? I
don't see those, beyond "Addressing Malicious Noise in Clickthrough
Data" by Filip Radlinski and also his "Query Chains: Learning to Rank
from Implicit Feedback" - but neither is really on topic.
Here are t
It may not be as fine-grained as you want, but also check the
QueryElevationComponent. This takes a preconfigured list of what the
top results should be for a given query and makes those documents the
top results.
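For reference, the stock elevate.xml format looks like this (the doc ids are from the Solr example config):

```xml
<!-- elevate.xml: pins the listed documents to the top for the query -->
<elevate>
  <query text="ipod">
    <doc id="MA147LL/A"/>            <!-- forced to rank first -->
    <doc id="IW-02" exclude="true"/> <!-- dropped from results entirely -->
  </query>
</elevate>
```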
Presumably, you could use click logs to determine what the top result
shou
"A Decision Theoretic Framework for Ranking using Implicit Feedback"
uses clicks, but the best part of that paper is all the side comments
about difficulties in evaluation. For example, if someone clicks on
three results, is that three times as good or two failures and a
success? We have to know th
Agreed, it seems that a lot of the algorithms in these papers would
almost be a whole new RequestHandler a la DisMax. Luckily a lot of them
seem to be built on Lucene (at least the ones that I looked at that
had code samples).
Which papers did you see that actually talked about using clicks?
Thanks, I didn't know there was so much research in this area.
Most of the papers at those workshops are about tuning the
entire ranking algorithm with machine learning techniques.
I am interested in adding one more feature, click data, to an
existing ranking algorithm. In my case, I have enough d
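One lightweight way to add click data as a single extra feature on top of an existing ranking algorithm is a multiplicative boost from historical click-through rate. The weighting and the smoothing prior below are assumptions for illustration, not anything from this thread:

```python
def click_boosted_score(base_score, clicks, impressions,
                        weight=0.5, prior_ctr=0.05, prior_strength=20):
    """Blend an existing relevance score with a smoothed click-through
    rate. The Bayesian-style prior keeps rarely shown documents from
    being dominated by a handful of lucky clicks."""
    smoothed_ctr = (clicks + prior_ctr * prior_strength) \
                   / (impressions + prior_strength)
    return base_score * (1.0 + weight * smoothed_ctr)

# A document with no click history just gets the prior:
unboosted = click_boosted_score(10.0, clicks=0, impressions=0)
# A document clicked 50 times in 100 impressions gets a real lift:
boosted = click_boosted_score(10.0, clicks=50, impressions=100)
```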
OK I've implemented this before, written academic papers and patents
related to this task.
Here are some hints:
- you're on the right track with the editorial boosting elevators
- http://wiki.apache.org/solr/UserTagDesign
- be darn careful about assuming that one click is enough evidence
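On that last hint: one way to avoid over-trusting a single click is to boost only by a pessimistic estimate of the true click rate, such as the lower bound of the Wilson score interval. A sketch, with an arbitrary choice of confidence level:

```python
import math

def wilson_lower_bound(clicks, impressions, z=1.96):
    """Lower bound of the Wilson score interval for a binomial
    proportion: a pessimistic estimate of the true click rate.
    z=1.96 corresponds to roughly 95% confidence."""
    if impressions == 0:
        return 0.0
    p = clicks / impressions
    denom = 1 + z * z / impressions
    centre = p + z * z / (2 * impressions)
    margin = z * math.sqrt(p * (1 - p) / impressions
                           + z * z / (4 * impressions ** 2))
    return (centre - margin) / denom

# One click out of one impression looks like a 100% click rate,
# but the lower bound exposes how weak that evidence is:
print(wilson_lower_bound(1, 1))     # ~0.21
print(wilson_lower_bound(90, 100))  # ~0.83
```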
I've been thinking about the same thing. We have a set of queries
that defy straightforward linguistics and ranking, like figuring
out how to match "charlie brown" to "It's the Great Pumpkin,
Charlie Brown" in October and to "A Charlie Brown Christmas"
in December.
I don't have any solutions yet,
Hello folks!
We've been thinking about ways to improve organic search results for a
while (really, who hasn't?) and I'd like to get some ideas on ways to
implement a feedback system that uses user behavior as input.
Basically, it'd work on the premise that what the user actually
clicked o