Re: Input Delay Metric proposal

Ehsan Akhgari Thu, 20 Sep 2018 07:31:36 -0700

Hi Randell,

The last part of your email where you suggested this metric could be
proposed for the Performance API WG confused me a bit about its goals.  It
seemed to me from the description earlier that the goal behind MID is to
provide a metric useful for browser engineers who would like to measure how
the performance of the browser varies over time in an automated lab
environment, not as a metric useful for measuring the performance of live
web applications used by real users in the wild.  Is my understanding
correct?


I'm asking since it seems to me that FID actually does a pretty good job of
measuring things "at the right time" if the goal is to measure the
performance of a web application used by real users in the wild.  That, of
course, makes it fairly hard for browser engineers to use when evaluating
the performance of the browser as we make code changes, because, as you
mentioned, we don't know when to capture it, and MID sounds like a good way
to address that problem.  However, I have a hard time seeing why web
developers would be interested in MID over FID, since FID correctly
captures the pain of the user when they first try to interact with their
pages upon loading, and it seems unnecessary to then try to guess when
would be a good time to measure how bad the pain would be _if_ the user had
interacted with the page at that time...

Thanks,
Ehsan

On Wed, Sep 19, 2018 at 2:45 PM Randell Jesup <rjesup.n...@jesup.org> wrote:

> Problem:
> Various measures have been tried to capture user frustration with having
> to wait to interact with a site they're loading (or to see the site
> data).  This includes:
>
> FID - First Input Delay --
> https://developers.google.com/web/updates/2018/05/first-input-delay
> TTI - Time To Interactive --
> https://developers.google.com/web/fundamentals/performance/user-centric-performance-metrics#time_to_interactive
> related to: FCP - First Contentful Paint and FMP - First Meaningful Paint --
> https://developers.google.com/web/fundamentals/performance/user-centric-performance-metrics#first_paint_and_first_contentful_paint
> TTVC (Time To Visually Complete), etc.
>
> None of these do a great job capturing the reality around pageload and
> interactivity.  FID is the latest suggestion, but it's very much based
> on watching user actions and reporting on them, and thus depends on how
> much they think the page is ready to interact with, and dozens of other
> things. It's only good for field measurements in bulk of a specific
> site, by the site author.  In particular, FID cannot reasonably be used
> in automation (or before wide deployment).
>
> Proposal:
>
> We should define a new measure based on FID named MID, for Median Input
> Delay, which is measurable in automation and captures the expected delay
> a user experiences during a load.  We can run this in automation against
> a set of captured pages, while also measuring related values like FCP
> and TTI, and dump this into a set of per-page graphs (perhaps on
> "areweinteractiveyet.com" :-) ).
>
> While FID depends on measuring the delay when the user *happens* to
> click, MID would measure the median (etc) delay that would be
> experienced at any point between (suggestion) FCP and TTI.  I.e. it
> would be based on "if a user input event were generated this
> millisecond, how long would it be before it ran?"  This would measure
> delay in the input event queue (probably 0 for this case) plus the time
> remaining until the currently running event on the main thread finishes.
>
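The calculation described above can be sketched roughly as follows. This is
only an illustration, not an agreed definition: the function names, the
(start_ms, end_ms) task list, and the 1 ms sampling step are all assumptions,
and it assumes an empty input queue and that an input event runs as soon as
the currently running task finishes.

```python
from statistics import median


def hypothetical_delay(t_ms, tasks):
    """Delay (ms) before an input event generated at t_ms would start running.

    tasks: non-overlapping (start_ms, end_ms) main-thread busy intervals.
    Assumption: the input queue is empty, so the delay is just the time
    remaining in whatever task is running at t_ms (0 if the thread is idle).
    """
    for start_ms, end_ms in tasks:
        if start_ms <= t_ms < end_ms:
            return end_ms - t_ms
    return 0


def median_input_delay(tasks, fcp_ms, tti_ms, step_ms=1):
    """Median hypothetical delay, sampled every step_ms in [fcp_ms, tti_ms)."""
    samples = [hypothetical_delay(t, tasks)
               for t in range(fcp_ms, tti_ms, step_ms)]
    return median(samples)
```

For example, with a single 100 ms task starting at time 0,
hypothetical_delay(40, [(0, 100)]) is 60: an event generated 40 ms into the
task waits for the remaining 60 ms.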
> This inherently assumes we measure TTI and FCP (or something
> approximating it).  This is somewhat problematic, as TTI is very noisy.
> I have a first cut at TTI measurement (fed into profiler markers) in
> bug 1299118 (without the "no more than 2 connections in flight" part).
>
> Value calculation:
> Median seems to be the best measure, but once we have data we can look
> at the distributions on real sites and our test harness and decide what
> has the most correlation to user experience.  We could also measure the
> 95% point, for example.  In automation, there might be some advantage to
> recording/reporting more data, like median and 95%, or median, average,
> and 95%, and max.
>
> Another issue with the calculation is that it won't capture burstiness
> in the results well (a distribution would).
>
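As a rough sketch of reporting more than one value, the summary below
computes median, average, 95% point, and max over a list of delay samples.
The function names and the nearest-rank percentile definition are
assumptions made for illustration; other percentile definitions would give
slightly different numbers.

```python
import math
from statistics import mean, median


def percentile(samples, pct):
    """Nearest-rank percentile of samples; pct is in (0, 100]."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def summarize(samples):
    """Return the summary values mentioned above for a list of delays (ms)."""
    return {
        "median": median(samples),
        "mean": mean(samples),
        "p95": percentile(samples, 95),
        "max": max(samples),
    }
```

A bursty load shows why the median alone can mislead: if 90 samples are 0 ms
and 10 samples are 200 ms, the median is 0 while the mean is 20 and both the
95% point and the max are 200.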
> Range measured over:
> We could modify the starting point to be when the first object that
> could be interacted with is rendered (input object, link, adding a key
> event handler, etc).  This would be a more-accurate measure for web
> developers, and would matter only a little for our use.  Note that
> getting content on the screen earlier might in some cases hurt you by
> starting the measurement "early" when the MainThread is presumably busy.
>
> Likewise, there might very well be alternatives to TTI for the end-point
> (and on some pages, you never get to TTI, or it's a Long Time).  Using
> TTI does imply we must collect data until 5 seconds after the last "Long
> Task", and since some sites will never go 5 seconds without a long
> task, we'll need to upper-bound it (or progressively reduce the 5
> seconds over time, which may help).   Alternatively, we could use a
> shorter window, or put an arbitrary limit on it (5 seconds past
> 'loaded', or just to 'loaded'), etc.
>
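One way to picture the end-point choice is the sketch below. It is an
illustration under assumptions, not the TTI definition: the function name is
made up, the 30-second cap is an arbitrary placeholder for the upper bound
discussed above, and the trace is assumed to extend far enough past the last
task to confirm the quiet window.

```python
LONG_TASK_MS = 50  # "Long Task" threshold (50+ms event)


def measurement_end(tasks, start_ms, quiet_ms=5000, cap_ms=30000):
    """Pick the end of the measured range.

    tasks: (start_ms, end_ms) main-thread tasks, in any order.
    Returns the end of the last long task that is followed by quiet_ms
    without another long task, but never later than start_ms + cap_ms
    (cap_ms is a hypothetical upper bound for pages that never go quiet).
    """
    long_tasks = sorted(
        (s, e) for s, e in tasks
        if e - s >= LONG_TASK_MS and e > start_ms
    )
    candidate = start_ms
    for task_start, task_end in long_tasks:
        if task_start - candidate >= quiet_ms:
            break  # found a quiet window beginning at candidate
        candidate = max(candidate, task_end)
    return min(candidate, start_ms + cap_ms)
```

Reducing quiet_ms, or calling this with a smaller cap_ms, corresponds to the
"shorter window" and "arbitrary limit" alternatives.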
> Issues:
>
> Defining the start and stop point, and the details around the exact way
> we calculate the result (I hand-waved about it above).  Note that
> "longer" endpoints will generally result in better scores, since the
> measurement would then probably include a longer tail where less is
> happening.  OTOH if it ends at TTI on a "Long Task" (50+ms event),
> that rather implies that it was at least intermittently busy until then.
>
> If we want to start when something interact-able is rendered, there may
> be some work to figure that out.
>
> Note that this inherently is measuring the delay until the input event
> *starts* processing, not how long it takes to process (since there is no
> actual input event here).
>
> Once we have some experience with this, we could propose it for the
> Performance API WG.
>
> --
> Randell Jesup, Mozilla Corp
> remove "news" for personal email
> _______________________________________________
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
>


-- 
Ehsan