Sorry for the slow reply, I've had to focus on some other things in the
last month. The reply below covers a few overlapping topics that were
brought up in this thread by Yoav Weiss and Daniel Bratell.

On Fri, 28 Jul 2023 at 13:57, Yoav Weiss <[email protected]> wrote:

> Hey Mingyu & Fergal,
>
> Given the fact that getting this heuristic wrong can have *significant*
> negative implications on impacted users, I think we need to be very careful
> here.
> I'd love to see a plan to validate the heuristic we'd pick here such that
> its coverage is as close to 100% as we can in terms of hitting the
> "excessive cache limits" bucket (and missing the "user sensitive data"
> bucket). Alternatively, maybe we could use a mix of outreach and a new API
> (or fix Clear-Site-Data) to help us fish out the "excessive cache limits"
> crowd?
>

> Do we have a way to validate the heuristics y'all came up with and
> understand what would be their hit rate vs. miss rate? What makes us
> confident that no sensitive data would be caught in those heuristics as
> cacheable content?

It's important to clarify something. Our goal is not to avoid restoring
pages containing *any* sensitive data. Our goal is to avoid restoring pages
containing sensitive data that the user no longer has access to. If cookies
have not changed, then the browser's HTTP request will be the same as last
time, and our assumption is that the decisions about access will also be the
same. We know this assumption is not entirely correct: resources could be
deleted or access could be changed on the server, hence our shorter
timeout. We can break the problem down as follows:

   - Changes on the client result in loss of access (e.g.
     cookies/access-tokens changed/deleted) - we believe this is covered.
   - Changes on the server result in loss of access - this breaks down
     further:
      - The site has implemented a way to immediately reflect those changes
        to open pages, e.g. using EventSource - this will continue to work,
        either by causing them to be evicted or by delivering events as soon
        as they are restored from BFCache.
      - The site has not implemented a way to immediately reflect those
        changes to open pages - in this case users can hold onto this data
        indefinitely. We are adding an additional window where this data
        could also reappear from BFCache. It would be surprising for a site
        to be concerned about the latter but not about the former.



XMLHttpRequest, fetch, WebTransport, WebRTC and WebSocket complicate this,
as they give a route for sensitive data to appear in the page outside of
navigation-driven requests. That is why we block restoring if we see any
of those (for XHR and fetch we only block if the response is CCNS; for the
others we simply have to assume that all data is sensitive).
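
To make the rules above concrete, here is a minimal sketch of the restore
decision for a CCNS page. It is illustrative only and is not Chromium's
implementation: `PageState`, its field names and `may_restore_from_bfcache`
are hypothetical, and the shorter timeout mentioned earlier is left out.

```python
from dataclasses import dataclass


@dataclass
class PageState:
    """Hypothetical summary of what a CCNS page did before it was stored."""
    cookies_changed: bool = False            # any cookie modified since storing
    used_websocket: bool = False
    used_webtransport: bool = False
    used_webrtc: bool = False
    ccns_subresource_response: bool = False  # an XHR/fetch response carried CCNS


def may_restore_from_bfcache(state: PageState) -> bool:
    """Return True if none of the blocking conditions described above apply."""
    # Client-side change: the next HTTP request would no longer look the same.
    if state.cookies_changed:
        return False
    # For these channels we have to assume all data is sensitive.
    if state.used_websocket or state.used_webtransport or state.used_webrtc:
        return False
    # XHR and fetch only block when the response itself was CCNS.
    if state.ccns_subresource_response:
        return False
    return True
```

A page with no cookie changes and only non-CCNS fetches passes this check; a
single WebSocket connection or a cookie change does not.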

Given all that, we don't have any way to estimate how often it would happen
that sensitive content that the user has lost access to would appear from
BFCache. If we had a way to measure it, we would probably use that as a
heuristic to prevent BFCaching. We believe it would only happen if e.g. a
server-side change removed the user's access to the data as discussed above.

Daniel:
> Even if cache-control: no-store is being badly overused, and the numbers
> you list seem to indicate that is the case, hasn't there been a promise to
> web developers that such a resource will be forever gone once the page is
> no longer shown, and is that a promise that can reasonably be broken?

There is no explicit promise that CCNS prevents BFCaching. The CCNS header,
like all Cache-control directives, is intended to control HTTP caching, so
the explicit promise is about the HTTP cache. BFCache is not part of the
HTTP cache, and developers should not interpret the CCNS header as a
promise that the page will not be BFCached. Indeed, there is a
consensus <https://github.com/whatwg/html/issues/5744> about not providing
any explicit way of preventing BFCache to avoid abuse. This web.dev article
<https://web.dev/bfcache/#minimize-use-of-cache-control-no-store> also
touches on the difference between BFCache and HTTP cache with the
Cache-control header.

There is an implicit promise that comes from the fact that CCNS has, until
now, blocked BFCache. It's inevitable that some sites have come to rely on
the CCNS header blocking BFCache. Some sites will malfunction when they
start to be restored from BFCache. This was also true when we launched
BFCache for non-CCNS sites.

We believe that BFCache is a huge user benefit. We launched it knowing that
some pages would malfunction. The benefit to users outweighs the cost of
some devs having to update their sites to work with BFCache. We don't think
CCNS sites should get a special pass. The concern with CCNS-sites is that
they often deal with sensitive information and that is why we have put so
much effort into making sure that that kind of information is not
inappropriately resurfaced as a result of BFCache. There may be other types
of malfunction due to BFCache. It seems like those kinds of problems should
be subject to the same calculus of user benefit vs dev cost as went into
the original BFCache launch.

F




> Cheers :)
> Yoav
>
> On Thu, Jul 27, 2023 at 3:05 PM 'Mingyu Lei' via blink-dev <
> [email protected]> wrote:
>
>> Hi Mike,
>>
>> Following our previous response, we would like to share the usage data
>> that we have collected from the beta channel. 18.76% of history navigations
>> are not restored from BFCache because of the CCNS header only. The
>> following are the breakdowns:
>>
>>    - No RPC containing CCNS response header: 8.63%
>>       - *No cookie modification: 6.70%*
>>       - With non-HttpOnly cookie modifications only: 1.38%
>>       - With HttpOnly or non-HttpOnly cookie modifications: 0.55%
>>    - With RPC containing CCNS response header: 10.13%
>>       - No cookie modification: 1.01%
>>       - With non-HttpOnly cookie modifications only: 7.86%
>>       - With HttpOnly or non-HttpOnly cookie modifications: 1.26%
>>
>> Based on these figures, we will update the proposal to evict the BFCache
>> entry on any cookie modification for the current phase. This should give
>> us a 6.70% improvement in cache hit rate.
>>
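
As a quick consistency check on the figures quoted above, the sketch below
sums the three cookie categories in each bucket and compares them with the
reported subtotals and the 18.76% total. The numbers are copied from the
quoted message; the variable and function names are only illustrative.

```python
from typing import Mapping

# Hundredths of a percent of history navigations, copied from the quoted
# breakdown. Integers avoid floating-point rounding noise.
NO_CCNS_RPC = {
    "no cookie modification": 670,
    "non-HttpOnly cookie modifications only": 138,
    "HttpOnly or non-HttpOnly cookie modifications": 55,
}
WITH_CCNS_RPC = {
    "no cookie modification": 101,
    "non-HttpOnly cookie modifications only": 786,
    "HttpOnly or non-HttpOnly cookie modifications": 126,
}


def subtotal(bucket: Mapping[str, int]) -> int:
    """Sum one bucket, in hundredths of a percent."""
    return sum(bucket.values())


def as_percent(hundredths: int) -> str:
    """Format hundredths of a percent as a percentage string."""
    return f"{hundredths / 100:.2f}%"


total = subtotal(NO_CCNS_RPC) + subtotal(WITH_CCNS_RPC)
print(as_percent(subtotal(NO_CCNS_RPC)))    # reported subtotal: 8.63%
print(as_percent(subtotal(WITH_CCNS_RPC)))  # reported subtotal: 10.13%
print(as_percent(total))                    # reported total: 18.76%
```

The sub-items add up to the reported subtotals, and the 6.70% figure is the
"no CCNS RPC, no cookie modification" entry on its own.
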
>> We could continue the HttpOnly cookie discussion in the future.
>>
>> On Sat, Jul 22, 2023 at 12:46 AM Mingyu Lei <[email protected]> wrote:
>>
>>> Hi Mike,
>>>
>>> Thanks for the comments. We have discussed the concerns you raised;
>>> please find our replies below.
>>>
>>>
>>>> https://github.com/fergald/explainer-bfcache-ccns/blob/main/README.md#secure-cookies
>>>> references "HTTPS-only" cookies, as well as "secure" vs "insecure" cookies.
>>>> By "HTTPS-only", do you mean a cookie that sets the "secure" attribute
>>>> (including "__Secure-" prefixes), _and_ sets "HttpOnly"? Or something else?
>>>>
>>>
>>>
>>>> Later in
>>>> https://github.com/fergald/explainer-bfcache-ccns/blob/main/README.md#allow-ccns-documents-to-be-bfcached-without-the-api,
>>>> the proposal is that CCNS pages are safe to bfcache if no "HTTP-only"
>>>> cookies have changed. Are these cookies setting only the "HttpOnly"
>>>> attribute, or is this intended to say "HTTPS-only" as above?
>>>>
>>>
>>> The short answer is that we will only monitor HttpOnly cookies,
>>> regardless of whether they are secure or not. The terms in the explainer
>>> were unclear, and we will fix them.
>>>
>>>> I see that
>>>> https://github.com/w3ctag/design-reviews/issues/786#issuecomment-1515742477
>>>> references this work. Did we learn anything from experimentation in the
>>>> wild (not sure if y'all ran an experiment)?
>>>>
>>>> I'm curious if y'all have looked at stats on the uptake of
>>>> secure/httponly cookies vs "non-secure" cookies being set by pages returned
>>>> from RPCs sent with an Authorization header (though I wouldn't be surprised
>>>> if we don't have UMA for that... perhaps just globally would be useful to
>>>> consider).
>>>>
>>>
>>> We are currently running a Finch experiment to collect the hit rate
>>> on beta, and the data will be available next week. We will share it with
>>> you once we have it.
>>>
>>> With that data, we will be able to tell the percentage of page loads
>>> that observe HttpOnly cookie changes, any cookie changes, or no cookie
>>> changes. There will also be another dimension: whether the page had
>>> sent out an RPC with a CCNS response. There is no pre-existing UMA for
>>> this, but we have recorded the reasons why pages are not restored from
>>> BFCache.
>>>
>>>> My only concern (which may not be grounded in reality) would be for
>>>> sites not following best practices...
>>>
>>>
>>> We expect that there will be cases where pages are restored
>>> inappropriately where sites are not following good practice. We don't have
>>> an idea of the size of this problem. We will have data from the beta
>>> channel soon that will tell us what the difference would be in terms of
>>> BFCache hit-rate between *monitoring all cookies* and *only monitoring
>>> HttpOnly cookies*. Our thought process looks like this:
>>>
>>> If *monitoring all cookies* already gives us a large hit-rate
>>> improvement, *or* the difference between *monitoring all cookies* and *only
>>> monitoring HttpOnly cookies* is small, then we are happy to just be
>>> conservative and go with *monitoring all cookies*. Otherwise*, we would
>>> like to discuss this further.
>>>
>>>
>>> ** "Otherwise" means monitoring all cookies will only give us a negligible
>>> cache hit-rate improvement, and monitoring HttpOnly cookies will give us a
>>> much larger increase.*
>>>
>>> Thanks,
>>> Mingyu
>>>
>>> On Wed, Jul 12, 2023 at 12:11 AM Mike Taylor <[email protected]>
>>> wrote:
>>>
>>>> On 7/11/23 2:19 AM, 'Fergal Daly' via blink-dev wrote:
>>>>
>>>> [BCC [email protected]]
>>>>
>>>> On Tue, 11 Jul 2023 at 15:16, Mingyu Lei <[email protected]> wrote:
>>>>
>>>>> +chrome-bfcache <[email protected]>
>>>>>
>>>>> On Tue, Jul 11, 2023 at 1:08 PM Mingyu Lei <[email protected]> wrote:
>>>>>
>>>>>> Contact emails [email protected], [email protected],
>>>>>> [email protected], [email protected]
>>>>>>
>>>>>> Specification
>>>>>> https://github.com/fergald/explainer-bfcache-ccns/blob/main/README.md
>>>>>>
>>>>>> Design docs
>>>>>> https://docs.google.com/document/d/1qX1w6L6laTzpFTh78dvT7wwC1060Z3he2Azp4BAwsUE/edit?usp=sharing
>>>>>> https://github.com/fergald/explainer-bfcache-ccns/blob/main/README.md
>>>>>>
>>>> This is a really well-written explainer, thank you!
>>>>
>>>> One point of clarification:
>>>>
>>>>
>>>> https://github.com/fergald/explainer-bfcache-ccns/blob/main/README.md#secure-cookies
>>>> references "HTTPS-only" cookies, as well as "secure" vs "insecure" cookies.
>>>> By "HTTPS-only", do you mean a cookie that sets the "secure" attribute
>>>> (including "__Secure-" prefixes), _and_ sets "HttpOnly"? Or something else?
>>>>
>>>> Later in
>>>> https://github.com/fergald/explainer-bfcache-ccns/blob/main/README.md#allow-ccns-documents-to-be-bfcached-without-the-api,
>>>> the proposal is that CCNS pages are safe to bfcache if no "HTTP-only"
>>>> cookies have changed. Are these cookies setting only the "HttpOnly"
>>>> attribute, or is this intended to say "HTTPS-only" as above?
>>>>
>>>>>> Summary
>>>>>>
>>>>>> A behavior change to safely store (and restore) pages in the
>>>>>> Back/Forward Cache despite the presence of a "Cache-control: no-store"
>>>>>> HTTP header on HTTPS pages. This would allow pages to enter BFCache and
>>>>>> be restored as long as there are no changes to cookies or to RPCs using
>>>>>> the `Authorization:` header.
>>>>>>
>>>>>>
>>>>>> Blink component UI>Browser>Navigation>BFCache
>>>>>> <https://bugs.chromium.org/p/chromium/issues/list?q=component:UI%3EBrowser%3ENavigation%3EBFCache>
>>>>>>
>>>>>> Search tags bfcache <https://chromestatus.com/features#tags:bfcache>
>>>>>>
>>>>>> TAG review
>>>>>>
>>>> I see that
>>>> https://github.com/w3ctag/design-reviews/issues/786#issuecomment-1515742477
>>>> references this work. Did we learn anything from experimentation in the
>>>> wild (not sure if y'all ran an experiment)?
>>>>
>>>>
>>>>>>
>>>>>> TAG review status Not applicable
>>>>>>
>>>>>> Risks
>>>>>>
>>>>>>
>>>>>> Interoperability and Compatibility
>>>>>>
>>>> I'm curious if y'all have looked at stats on the uptake of
>>>> secure/httponly cookies vs "non-secure" cookies being set by pages returned
>>>> from RPCs sent with an Authorization header (though I wouldn't be surprised
>>>> if we don't have UMA for that... perhaps just globally would be useful to
>>>> consider).
>>>>
>>>> My only concern (which may not be grounded in reality) would be for
>>>> sites not following best practices...
>>>>
>>>>
>>>>>>
>>>>>> *Gecko*: No signal
>>>>>>
>>>>>> *WebKit*: No signal
>>>>>>
>>>> Can we request signals?
>>>>
>>>>
>>>>>> *Web developers*: No signals
>>>>>>
>>>>>> *Other signals*:
>>>>>>
>>>>>> WebView application risks
>>>>>>
>>>>>> Does this intent deprecate or change behavior of existing APIs, such
>>>>>> that it has potentially high risk for Android WebView-based applications?
>>>>>>
>>>>>> None
>>>>>>
>>>>>>
>>>>>> Debuggability
>>>>>>
>>>>>> Will this feature be supported on all six Blink platforms (Windows,
>>>>>> Mac, Linux, Chrome OS, Android, and Android WebView)? No
>>>>>>
>>>>>> BFCache is not supported on WebView, so this change has no impact
>>>>>> there.
>>>>>>
>>>>>>
>>>>>> Is this feature fully tested by web-platform-tests
>>>>>> <https://chromium.googlesource.com/chromium/src/+/main/docs/testing/web_platform_tests.md>
>>>>>> ? No
>>>>>>
>>>>>> Flag name on chrome://flags
>>>>>>
>>>>>> Finch feature name CacheControlNoStoreEnterBackForwardCache
>>>>>>
>>>>>> Requires code in //chrome? False
>>>>>>
>>>>>> Tracking bug
>>>>>> https://bugs.chromium.org/p/chromium/issues/detail?id=1228611
>>>>>>
>>>>>> Launch bug https://launch.corp.google.com/launch/4251651
>>>>>>
>>>>>> Estimated milestones
>>>>>> DevTrial on desktop 116
>>>>>> DevTrial on Android 116
>>>>>>
>>>>>> Anticipated spec changes
>>>>>>
>>>>>> Open questions about a feature may be a source of future web compat
>>>>>> or interop issues. Please list open issues (e.g. links to known github
>>>>>> issues in the project for the feature specification) whose resolution may
>>>>>> introduce web compat/interop risk (e.g., changing to naming or structure
>>>>>> of the API in a non-backward-compatible way).
>>>>>> None
>>>>>>
>>>>>> Link to entry on the Chrome Platform Status
>>>>>> https://chromestatus.com/feature/6705326844805120
>>>>>>
>>>>>> Links to previous Intent discussions
>>>>>>
>>>>>> This intent message was generated by Chrome Platform Status
>>>>>> <https://chromestatus.com/>.
>>>>>>
>>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "blink-dev" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAAozHLkbL7vmubNOsrA2PKngz4xeV%3DXyuLN73oS4XBea50Xe9A%40mail.gmail.com
>>>> <https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAAozHLkbL7vmubNOsrA2PKngz4xeV%3DXyuLN73oS4XBea50Xe9A%40mail.gmail.com?utm_medium=email&utm_source=footer>
>>>> .
>>>>
>
