Dear Mike,

Thank you for taking the time to review this document.

On Mon, Apr 13, 2026 at 07:04:50PM -0700, Mike Bishop via Datatracker wrote:
> -------------------------------------------------------------------------------
> DISCUSS
> -------------------------------------------------------------------------------
> 
> Section 2, paragraph 7
> >    object in large files can be parsed independently.  Files MAY be
> >    compressed with GZIP [RFC1952].
> 
> What is the motivation for specifying GZIP only? You're using HTTP
> -- why not do normal content negotiation and allow clients and
> servers to select any mutually acceptable content encoding? (See
> https://www.rfc-editor.org/rfc/rfc9110.html#name-accept-encoding)

Gzip compression is used in this protocol for a number of reasons.

The data is very large at rest, and it compresses very well: RIPE
NCC's Snapshot file, for example, is 5.0 GB uncompressed but only
353 MB in Gzip-compressed form. Because of the unwieldy size of the
uncompressed data, very early in the NRTM v4 design process the
implementers learned that materializing the data in compressed form
substantially reduces disk I/O.
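As a rough illustration (a hypothetical sketch, not from the draft:
the filename and the one-JSON-object-per-line layout are assumptions
based on the quoted draft text about independently parsable objects),
a client can stream-parse a Gzip-compressed Snapshot without ever
writing the uncompressed form to disk:

```python
import gzip
import json

def stream_objects(path):
    """Yield JSON objects from a Gzip-compressed, line-delimited file.

    The snapshot is decompressed on the fly; the full 5 GB
    uncompressed form never needs to exist on disk or in memory.
    """
    with gzip.open(path, mode="rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)
```
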

An unfortunate reality is that opportunistic compression through HTTP
content-encoding negotiation is not ubiquitously available throughout
the ecosystem. Some examples: Akamai's "NodeBalancer" product seems to
strip incoming Accept-Encoding headers, thereby neutering any
opportunity for content compression; Amazon's "CloudFront" CDN product
appears not to perform on-the-fly compression if the origin object is
larger than ~10 megabytes; and HTTP compression is disabled by default
in popular HTTP server implementations such as Nginx and Apache. And
so on.
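To illustrate the "disabled by default" point: in Nginx, for example,
an operator has to explicitly add something along these lines (a
sketch with illustrative values, not a recommended configuration):

```nginx
# gzip is off by default in Nginx; each directive must be set explicitly.
gzip on;
gzip_types application/json;   # gzip_types defaults to text/html only
gzip_min_length 1024;          # illustrative size threshold
gzip_vary on;                  # emit "Vary: Accept-Encoding" for caches
```

If any one of these steps is missed on any one frontend, clients
silently fall back to uncompressed transfers.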

To me it seems that, by and large, the design of HTTP intermediary
elements is based on the assumption that a large resource will already
be (somewhat) compressed at the application layer.

I would agree with anyone pointing out that nowadays algorithms exist
which yield even better compression ratios than Gzip, but the
recurring cost of storing and repeatedly transferring the data in
uncompressed form far exceeds any potential savings from relying on
opportunistic compression in the HTTP layer to shave off an extra
single-digit percentage in the hope of negotiating a more modern
algorithm. It simply is too costly to even risk handling the data in
uncompressed form.

Therefore, I believe it is a sound protocol design decision not to
rely on opportunistic compression in lower layers of the stack when it
is known beforehand that the data will be very large.

I base the above on extensive operational experience with RRDP
[RFC8182], which has some similarities to NRTM v4 (except that in RRDP
the 'snapshot' is not natively compressed). In the course of operating
a large fleet of RRDP clients (perhaps one of the world's largest,
www.rpkiviews.org) I've encountered numerous situations where I had to
actively reach out to RRDP server operators and ask them to enable
HTTP Gzip compression. And, as mentioned before, because compression
is not enabled by default in most HTTP server implementations, getting
it enabled properly appears to be a somewhat fragile process. Quite
often I had to go back and forth with RRDP server operators: "yes,
compression is now enabled on IPv4, but not on the IPv6 endpoints"
(the operator had forgotten to deploy the config change to all
frontends); or "yes, compression seems to be enabled, but it only
appears to work for small objects"; and sometimes "sorry, we cannot
enable compression because $vendor doesn't support it", etc.
See section 4.4 of https://doi.org/10.48550/arXiv.2512.16369 for a
recent analysis of the RPKI ecosystem in this regard.

Gzip also happened to be the de facto standard in previous versions of
NRTM for the 'fetch a full dump' functionality, which is analogous to
NRTMv4 Snapshots.

Finally, of all the IETF-standardized compression algorithms, good
ol' Gzip arguably is the most widely deployed, with implementations
available for virtually any system and architecture. So if one has to
pick _something_ as a fixed algorithm, Gzip seems a reasonable choice.

Kind regards,

Job

_______________________________________________
GROW mailing list -- [email protected]
To unsubscribe send an email to [email protected]