For relevance, I would also look at retention metrics. They're harder to tie back to a specific search, but they capture what happens after the conversion. Did the user purchase the product and hate it? Or did they come back for more? Retention metrics say a lot about the whole experience, and for many search-heavy applications search is 90% of that experience. Was a result really relevant if the user purchased the product but was dissatisfied? Did search make a promise that wasn't delivered on? This is something I personally noodle about and not something I have a canned solution for.
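To make that concrete, here's a rough Python sketch of one retention-style metric: of the users whose searches converted, how many came back within 30 days? The event shapes and field names here are entirely made up -- adapt them to whatever your logs actually look like:

    from datetime import timedelta

    def search_retention_rate(search_conversions, later_events, window_days=30):
        """Fraction of search-converted users seen again within window_days.

        search_conversions: list of (user_id, converted_at) datetimes
        later_events:       list of (user_id, event_at) for any later activity
        """
        window = timedelta(days=window_days)

        # Index all later activity by user for quick lookup.
        events_by_user = {}
        for user_id, ts in later_events:
            events_by_user.setdefault(user_id, []).append(ts)

        # "Retained" = any activity after the conversion, within the window.
        retained = sum(
            1
            for user_id, converted_at in search_conversions
            if any(converted_at < ts <= converted_at + window
                   for ts in events_by_user.get(user_id, ()))
        )
        return retained / len(search_conversions) if search_conversions else 0.0

It doesn't tell you *why* they did or didn't come back, but trending it alongside your relevance changes is at least a start.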
There's an obsession with what I think of as "engagement metrics" or "session metrics". Engagement metrics like CTR are handy because they're easy to tie to a search: search, search, search <click>, search, search, <click> <purchase>. Still, I'm always cautious of click-through metrics. Beware of the biases in your click-through data:
http://opensourceconnections.com/blog/2014/10/08/when-click-scoring-can-hurt-search-relevance-a-roadmap-to-better-signals-processing-in-search/

Another reason to be cautious is that user behavioral data can require domain-specific interpretation. A good book on recommender systems can say more about interpreting user behavior to decide whether an item was relevant. For example, Practical Recommender Systems by Kim Falk (a Manning MEAP) spends a great deal of time talking through gathering evidence on whether a user actually liked the thing they clicked on. Did the user click a movie and go back immediately? Start watching a movie and back out after 5 minutes, indicating they hated it? Or watch the movie all the way through?

Related to interpreting behavior: understand the kinds of searchers out there and the sort of user experience you've built. Informational searchers doing research will look at every item and evaluate it; a paralegal searching a legal application may need to examine every result carefully. Navigational searchers want to hunt for one specific thing. Everyday e-commerce searchers clicking on every result is probably disastrous, yet the purchasing dept of an organization MIGHT look at every result, and that might be ok.

Beware of search's long tail. You can gather metrics on all your searches and where users are clicking, but search has a notorious long tail. Many of my clients have meaningful metrics over perhaps the top 50 queries before quickly going off into the obscurity of statistical insignificance per search. How much this hurts depends entirely on the type of search application you're developing. Some kind of niche product with a handful of searches per day? Or a giant e-commerce site?
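If it helps, here's the shape of that long-tail computation as a quick Python sketch: per-query CTR, but only trusting queries with enough impressions. The (query, clicked) event tuples are hypothetical; the interesting output is usually how much of your traffic falls below the threshold:

    from collections import defaultdict

    def per_query_ctr(events, min_impressions=50):
        """events: iterable of (query, clicked) pairs, clicked a bool.

        Returns ({query: ctr} for queries with enough traffic to trust,
        plus the share of total impressions below the threshold).
        """
        impressions = defaultdict(int)
        clicks = defaultdict(int)
        for query, clicked in events:
            impressions[query] += 1
            clicks[query] += int(clicked)

        # Only report CTR where the sample size is big enough to mean anything.
        trusted = {q: clicks[q] / n
                   for q, n in impressions.items() if n >= min_impressions}

        total = sum(impressions.values())
        tail = sum(n for n in impressions.values() if n < min_impressions)
        return trusted, (tail / total if total else 0.0)

On most sites I've seen, that tail share comes back uncomfortably large, which is exactly why per-query metrics go statistically insignificant so fast.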
Sometimes what's simpler is to do usability testing, or to sit with an expert user and gather relevance judgments: grades on what's relevant and what's not (this is what we do with Quepid). This works particularly well for these niche, expert search subjects.

Anyway, there's still quite a bit of art to interpreting search metrics. I would argue for keeping the human and domain expert in the loop, understanding and interpreting the metrics. But it's a yin-and-yang: you also need to be able to tell that supposed domain expert when they're wrong.

Sorry for the long-winded email, but these topics dominate my dreams/nightmares these days :)

Best,
-Doug

On Wed, Feb 24, 2016 at 11:20 AM, Walter Underwood <wun...@wunderwood.org> wrote:

> Click-through rate (CTR) is fundamental. That is easy to understand and
> integrates well with other business metrics like conversion. CTR is at
> least one click anywhere in the result set (first page, second page, …).
> Count multiple clicks as a single success. The metric is "at least one
> click".
>
> No-hit rate is sort of useful, but you need to know which queries are
> getting no hits, so you can fix them.
>
> For latency metrics, look at the 90th or 95th percentile. Average is
> useless because response time is a one-sided distribution, so it will be
> thrown off by outliers. Percentiles have a direct customer-satisfaction
> interpretation: 90% of searches were under one second, for example.
> Median response time should be very, very fast because of caching in Solr.
> During busy periods, our median response time is about 1.5 ms.
>
> Number of different queries per conversion is a good way to look at how
> query assistance is working. Things like autosuggest, fuzzy, etc.
>
> About 10% of queries will be misspelled, so you do need to deal with that.
>
> Finding underperforming queries is trickier. I really need to write an
> article on that.
>
> "Search Analytics for Your Site" by Lou Rosenfeld is a good introduction:
> http://rosenfeldmedia.com/books/search-analytics-for-your-site/
>
> Sea Urchin is doing some good work in search metrics: https://seaurchin.io/
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/ (my blog)
> Search Guy, Chegg
>
> > On Feb 24, 2016, at 2:38 AM, Emir Arnautovic <emir.arnauto...@sematext.com> wrote:
> >
> > Hi Bill,
> > You can take a look at Sematext's search analytics
> > (https://sematext.com/search-analytics). It provides some of the metrics
> > you mentioned, plus some additional ones (top queries, CTR, click stats,
> > paging stats, etc.). In combination with Sematext's performance metrics
> > (https://sematext.com/spm) you can have a full picture of your search
> > infrastructure.
> >
> > Regards,
> > Emir
> >
> > --
> > Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> > Solr & Elasticsearch Support * http://sematext.com/
> >
> > On 24.02.2016 04:07, William Bell wrote:
> >> How do others look at search metrics?
> >>
> >> 1. Search conversion? Do you look at searches, and if the user does not
> >> click on a result and reruns the search, is that a failure?
> >>
> >> 2. How to measure autocomplete success metrics?
> >>
> >> 3. Facets/filters could be considered negative, since we did not find the
> >> results that the user wanted, and now they are filtering -- how to measure?
> >>
> >> 4. One easy metric is searches with 0 results. We could auto-expand the geo
> >> distance or ask the user "did you mean"?
> >>
> >> 5. Another easy one would be tech performance: "time it takes in seconds to
> >> get a result".
> >>
> >> 6. How to measure fuzzy? How do you know you need more synonyms? How to
> >> measure?
> >>
> >> 7. How many searches it takes before the user clicks on a result?
> >>
> >> Other ideas? Is there a video or presentation on search metrics that would
> >> be useful?

--
Doug Turnbull | Search Relevance Consultant | OpenSource Connections <http://opensourceconnections.com>, LLC | 240.476.9983
Author: Relevant Search <http://manning.com/turnbull>
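P.S. A toy illustration of Walter's percentile point above, with made-up numbers: a handful of slow cache misses wrecks the average, while the 90th percentile still describes what most users actually saw.

    import statistics

    # Made-up sample: 95% of requests hit the cache, 5% are slow misses.
    latencies_ms = [3] * 95 + [5000] * 5

    print(statistics.mean(latencies_ms))
    # 252.85 -- nothing like what a typical request experienced

    print(statistics.quantiles(latencies_ms, n=100)[89])
    # 3.0 -- the 90th percentile: "90% of searches were this fast or faster"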