Hello,

After a few months, I am resending the patch that enables static 
compression in httpd using "location {}" directives.

I have been using it without any issue on my own website and to serve 
https://webzine.pufy.cafe.
Has anyone else tried it?

I want to emphasize that it is the admin's responsibility to enable 
this feature or not, and the webmaster's to deliver gzipped files.
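
To illustrate the lookup outside of httpd, here is a minimal standalone
sketch. It is not the patch itself: the helper names accepts_gzip() and
choose_path() are made up for this example, and it uses snprintf(3)
instead of strlcpy/strlcat only so that it also builds on systems
lacking those functions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Return 1 if the Accept-Encoding value mentions gzip, 0 otherwise.
 * Like the patch, this is a plain substring match; it does not parse
 * q-values, so a value such as "gzip;q=0" would also match.
 */
static int
accepts_gzip(const char *accept_encoding)
{
	return (accept_encoding != NULL &&
	    strstr(accept_encoding, "gzip") != NULL);
}

/*
 * Write the path to serve into out (hypothetical helper).
 * Returns 1 if "path.gz" was chosen (client accepts gzip and the file
 * is readable), 0 if the original path was kept, and -1 if the path
 * does not fit into out.
 */
static int
choose_path(const char *path, const char *accept_encoding,
    char *out, size_t outlen)
{
	int n;

	assert(path != NULL && out != NULL && outlen > 0);

	if (accepts_gzip(accept_encoding)) {
		n = snprintf(out, outlen, "%s.gz", path);
		if (n >= 0 && (size_t)n < outlen && access(out, R_OK) == 0)
			return (1);
	}

	/* fall back to the file that was actually requested */
	n = snprintf(out, outlen, "%s", path);
	if (n < 0 || (size_t)n >= outlen)
		return (-1);
	return (0);
}
```

When choose_path() returns 1, the caller would also add a
"Content-Encoding: gzip" header to the response, while the media type
is still the one of the file that was originally requested.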

Regards.

prx


* Ingo Schwarze <schwarzeusta!de> le [05-11-2021 13:37:15 +0000]:
> Hi Theo,
> 
> Theo de Raadt wrote on Thu, Nov 04, 2021 at 08:27:47AM -0600:
> > prx <p...@si3t.ch> wrote:
> >> On 2021/11/04 14:21, prx wrote:
> 
> >>> The attached patch adds support for static gzip compression.
> >>> 
> >>> In other words, if a client supports gzip compression, when "file" is
> >>> requested, httpd will check if "file.gz" is available to serve.
> 
> >> This diff doesn't compress "on the fly".
> >> It's up to the webmaster to compress files **before** serving them.
> 
> > Does any other program work this way?
> 
> Yes.  The man(1) program does.  At least on the vast majority of
> Linux systems (at least those using the man-db implementation
> of man(1)), on FreeBSD, and on DragonFly BSD.
> 
> Those systems store most manual pages as gzipped man(7) and mdoc(7)
> files, and man(1) decompresses them every time a user wants to look
> at one of them.  You say "man ls", and what you get is actually
> /usr/share/man/man1/ls.1.gz or something like that.
> 
> For man(1), that is next to useless because du -sh /usr/share/man =
> 42.6M uncompressed.  But it has repeatedly caused bugs in the past.
> I would love to remove the feature from mandoc, but even though it is
> rarely used in OpenBSD (some ports installed gzipped manuals in the
> past, but i think the ports tree has been clean now for quite some
> time; you might still need the feature when installing software
> or unpacking foreign manual page packages without using ports)
> it would be a bad idea to remove it because it is too widely used
> elsewhere.  Note that even the old BSD man(1) supported it.
> 
> > Where you request one filename, and it gives you another?
> 
> You did not ask what web servers do, but we are discussing a patch to
> a webserver.  For this reason, let me note in passing that even some
> otherwise extremely useful sites get it very wrong the other way round:
> 
>  $ ftp https://manpages.debian.org/bullseye/coreutils/ls.1.en.gz
> Trying 130.89.148.77...
> Requesting https://manpages.debian.org/bullseye/coreutils/ls.1.en.gz
> 100% |**************************************************|  8050       00:00   
>  
> 8050 bytes received in 0.00 seconds (11.74 MB/s)
>  $ file ls.1.en.gz
> ls.1.en.gz: troff or preprocessor input text
>  $ grep -A 1 '^.SH NAME' ls.1.en.gz  
> .SH NAME
> ls \- list directory contents
>  $ gunzip ls.1.en.gz                                            
> gunzip: ls.1.en.gz: unrecognized file format
> 
> > I have a difficult time understanding why gzip has to sneak its way
> > into everything.
> > 
> > I always prefer software that does precisely what I expect it to do.
> 
> Certainly.
> 
> I have no strong opinion whether a webserver qualifies as "everything",
> though, nor did i look at the diff.  While all manpages are small in the
> real world, some web servers may have to store huge amounts of data that
> clients might request, so disk space might occasionally be an issue on
> a web server even in 2021.  Also, some websites deliver huge amounts of
> data to the client even when the user merely asked for some text (not sure
> such sites would consider running OpenBSD httpd(8), but whatever :) - when
> browsing the web, bandwidth is still occasionally an issue even in 2021,
> even though that is a rather absurd fact.
> 
> Yours,
>   Ingo
Index: httpd.conf.5
===================================================================
RCS file: /cvs/src/usr.sbin/httpd/httpd.conf.5,v
retrieving revision 1.119
diff -u -p -r1.119 httpd.conf.5
--- httpd.conf.5        24 Oct 2021 16:01:04 -0000      1.119
+++ httpd.conf.5        5 Nov 2021 14:04:22 -0000
@@ -425,6 +425,10 @@ A variable that is set to a comma separa
 features in use
 .Pq omitted when TLS client verification is not in use .
 .El
+.It Ic gzip_static
+Enable static gzip compression.
+.Pp
+When a file is requested, serves the file with .gz added to its path if it exists.
 .It Ic hsts Oo Ar option Oc
 Enable HTTP Strict Transport Security.
 Valid options are:
Index: httpd.h
===================================================================
RCS file: /cvs/src/usr.sbin/httpd/httpd.h,v
retrieving revision 1.158
diff -u -p -r1.158 httpd.h
--- httpd.h     24 Oct 2021 16:01:04 -0000      1.158
+++ httpd.h     5 Nov 2021 14:04:22 -0000
@@ -87,6 +87,7 @@
 #define SERVER_DEF_TLS_LIFETIME        (2 * 3600)
 #define SERVER_MIN_TLS_LIFETIME        (60)
 #define SERVER_MAX_TLS_LIFETIME        (24 * 3600)
+#define SERVER_DEFAULT_GZIP_STATIC 0
 
 #define MEDIATYPE_NAMEMAX      128     /* file name extension */
 #define MEDIATYPE_TYPEMAX      64      /* length of type/subtype */
@@ -546,6 +547,7 @@ struct server_config {
        struct server_fcgiparams fcgiparams;
        int                      fcgistrip;
        char                     errdocroot[HTTPD_ERRDOCROOT_MAX];
+       int                      gzip_static;
 
        TAILQ_ENTRY(server_config) entry;
 };
Index: parse.y
===================================================================
RCS file: /cvs/src/usr.sbin/httpd/parse.y,v
retrieving revision 1.127
diff -u -p -r1.127 parse.y
--- parse.y     24 Oct 2021 16:01:04 -0000      1.127
+++ parse.y     5 Nov 2021 14:04:22 -0000
@@ -141,7 +141,7 @@ typedef struct {
 %token TIMEOUT TLS TYPE TYPES HSTS MAXAGE SUBDOMAINS DEFAULT PRELOAD REQUEST
 %token ERROR INCLUDE AUTHENTICATE WITH BLOCK DROP RETURN PASS REWRITE
 %token CA CLIENT CRL OPTIONAL PARAM FORWARDED FOUND NOT
-%token ERRDOCS
+%token ERRDOCS GZIPSTATIC
 %token <v.string>      STRING
 %token  <v.number>     NUMBER
 %type  <v.port>        port
@@ -553,6 +553,7 @@ serveroptsl : LISTEN ON STRING opttls po
                | logformat
                | fastcgi
                | authenticate
+               | gzip_static
                | filter
                | LOCATION optfound optmatch STRING     {
                        struct server           *s;
@@ -1217,6 +1218,14 @@ fcgiport : NUMBER                {
                }
                ;
 
+gzip_static    : NO GZIPSTATIC                 {
+                       srv->srv_conf.gzip_static = SERVER_DEFAULT_GZIP_STATIC;
+               }
+               | GZIPSTATIC {
+                       srv->srv_conf.gzip_static = 1;
+               }
+               ;
+
 tcpip          : TCP '{' optnl tcpflags_l '}'
                | TCP tcpflags
                ;
@@ -1441,6 +1450,7 @@ lookup(char *s)
                { "fastcgi",            FCGI },
                { "forwarded",          FORWARDED },
                { "found",              FOUND },
+               { "gzip_static",        GZIPSTATIC },
                { "hsts",               HSTS },
                { "include",            INCLUDE },
                { "index",              INDEX },
Index: server_file.c
===================================================================
RCS file: /cvs/src/usr.sbin/httpd/server_file.c,v
retrieving revision 1.70
diff -u -p -r1.70 server_file.c
--- server_file.c       29 Apr 2021 18:23:07 -0000      1.70
+++ server_file.c       5 Nov 2021 14:04:22 -0000
@@ -229,20 +229,49 @@ server_file_request(struct httpd *env, s
                goto abort;
        }
 
+       media = media_find_config(env, srv_conf, path);
+
        if ((ret = server_file_modified_since(clt->clt_descreq, st)) != -1) {
                /* send the header without a body */
-               media = media_find_config(env, srv_conf, path);
                if ((ret = server_response_http(clt, ret, media, -1,
                    MINIMUM(time(NULL), st->st_mtim.tv_sec))) == -1)
                        goto fail;
                goto done;
        }
 
+       /* change path to path.gz if necessary. */
+       if (srv_conf->gzip_static) {
+               struct http_descriptor  *req = clt->clt_descreq;
+               struct http_descriptor  *resp = clt->clt_descresp;
+               struct stat             gzst;
+               struct kv               *r, key;
+               char                    gzpath[PATH_MAX];
+
+               /* check Accept-Encoding header */
+               key.kv_key = "Accept-Encoding";
+               r = kv_find(&req->http_headers, &key);
+
+               if (r != NULL) {
+                       if (strstr(r->kv_value, "gzip") != NULL) {
+                               /* append ".gz" to path and check existence */
+                               strlcpy(gzpath, path, sizeof(gzpath));
+                               strlcat(gzpath, ".gz", sizeof(gzpath));
+
+                               if ((access(gzpath, R_OK) == 0) &&
+                                       (stat(gzpath, &gzst) == 0)) {
+                                       path = gzpath;
+                                       st = &gzst;
+                                       kv_add(&resp->http_headers,
+                                               "Content-Encoding", "gzip");
+                               }
+                       }
+               }
+       }
+
        /* Now open the file, should be readable or we have another problem */
        if ((fd = open(path, O_RDONLY)) == -1)
                goto abort;
 
-       media = media_find_config(env, srv_conf, path);
        ret = server_response_http(clt, 200, media, st->st_size,
            MINIMUM(time(NULL), st->st_mtim.tv_sec));
        switch (ret) {
