Dear Mike,

Thank you for taking the time to review this document.
On Mon, Apr 13, 2026 at 07:04:50PM -0700, Mike Bishop via Datatracker wrote:
> -------------------------------------------------------------------------------
> DISCUSS
> -------------------------------------------------------------------------------
>
> Section 2, paragraph 7
> >    object in large files can be parsed independently. Files MAY be
> >    compressed with GZIP [RFC1952].
>
> What is the motivation for specifying GZIP only? You're using HTTP --
> why not do normal content negotiation and allow clients and servers to
> select any mutually acceptable content encoding? (See
> https://www.rfc-editor.org/rfc/rfc9110.html#name-accept-encoding)

Gzip compression is used in this protocol for a number of reasons.

The data is very large "at rest", and it compresses very well. For example, RIPE NCC's Snapshot file is 5.0 GB in uncompressed form, but only 353 MB Gzip-compressed. Because of the unwieldy size of the data in uncompressed form, very early in the NRTMv4 design process the implementers learned that materializing the data in compressed form substantially reduces disk I/O.

An unfortunate reality is that opportunistic compression through HTTP content-encoding negotiation is not ubiquitously available throughout the ecosystem. Some examples: Akamai's "NodeBalancer" product seems to strip off incoming Accept-Encoding headers, thereby removing any opportunity for content compression. Amazon's "CloudFront" CDN product appears not to perform on-the-fly compression if the origin object is larger than roughly 10 megabytes. HTTP compression is disabled by default in popular HTTP server implementations such as Nginx and Apache. And so on. To me it seems that, by and large, the design of HTTP intermediary elements is based on an assumption that a large resource will already be (somewhat) compressed at the application layer.
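To illustrate the point about compressing at rest, here is a minimal Python sketch using the stdlib gzip module. The sample text is made-up, repetitive RPSL-like data standing in for a Snapshot file, not real registry contents; real Snapshots compress well for the same reason, since objects share a great deal of structure:

```python
import gzip

# Hypothetical, highly repetitive RPSL-like text standing in for an
# NRTMv4 Snapshot file (illustrative only, not real registry data).
sample = ("route: 192.0.2.0/24\norigin: AS64500\nsource: EXAMPLE\n" * 10000).encode()

compressed = gzip.compress(sample)
print(f"uncompressed: {len(sample)} bytes, "
      f"gzip: {len(compressed)} bytes, "
      f"ratio: {len(compressed) / len(sample):.4f}")

# Decompression round-trips losslessly, so clients parse as usual.
assert gzip.decompress(compressed) == sample
```

Materializing `compressed` on disk (rather than `sample`) is what saves the server the recurring storage and disk I/O cost, independent of whatever the HTTP layer later negotiates.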
I would agree with anyone pointing out that nowadays algorithms exist which yield even better compression ratios than Gzip, but the recurring cost of storage and repeated transfer of the data in uncompressed form far exceeds any potential savings from relying on opportunistic compression in the HTTP layer to shave off an extra single-digit percentage in hopes of negotiating a modern algorithm. It simply is too costly to even risk handling the data in uncompressed form. Therefore, I believe it is a sound protocol design decision not to rely on opportunistic compression in lower layers of the stack when it is known beforehand that the data will be very large.

I base the above on extensive operational experience with RRDP [RFC8182], which has some similarities to NRTMv4 (except that in RRDP the 'snapshot' is not natively compressed). In the course of operating a large fleet of RRDP clients (perhaps one of the world's largest, www.rpkiviews.org), I've encountered numerous situations where I had to actively reach out to RRDP server operators and ask them to enable HTTP Gzip compression. And, as mentioned before, because compression is not enabled by default in most HTTP server implementations, getting it enabled properly appears to be a somewhat fragile process. Quite often I had to go back and forth with RRDP server operators: "yes, compression is now enabled on IPv4, but not on the IPv6 endpoints" (the operator had forgotten to deploy the config change to all frontends), or "yes, compression seems to be enabled, but only appears to be working for small objects", and sometimes "sorry, we cannot enable compression because $vendor doesn't support it", etc. See section 4.4 of https://doi.org/10.48550/arXiv.2512.16369 for a recent analysis of the RPKI ecosystem in this regard.

Gzip also happened to be the de facto standard in previous versions of NRTM for the 'fetch a full dump' functionality, which is analogous to NRTMv4 Snapshots.
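The kind of back-and-forth described above boils down to a simple check: did the server honour Accept-Encoding? A hedged Python sketch of such a probe (the function names are mine, and a thorough check would repeat this per address family and against both small and large objects, for exactly the failure modes mentioned above):

```python
import urllib.request

def honours_gzip(content_encoding):
    """True if a Content-Encoding response header value indicates gzip."""
    return "gzip" in (content_encoding or "").lower()

def probe(url, timeout=10.0):
    """Fetch `url` while advertising gzip support and report whether the
    server actually compressed the response. Hypothetical helper: a real
    check would probe the IPv4 and IPv6 endpoints separately and test
    objects of varying sizes, since partial deployments are common."""
    req = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return honours_gzip(resp.headers.get("Content-Encoding"))
```

Note that this only tells you whether opportunistic compression happened to work at that moment; NRTMv4's choice to compress the file itself removes the need to run such probes at all.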
Finally, of all the IETF-standardized compression algorithms, good ol' Gzip arguably is the most widely deployed, with implementations available for any and all systems and architectures. So if one has to pick _something_ as the fixed algorithm, Gzip seems a reasonable choice.

Kind regards,

Job

_______________________________________________
GROW mailing list -- [email protected]
To unsubscribe send an email to [email protected]
