Hi Mark,

First of all, sorry for the nonsensical original report.

Mark Brown wrote:
> Oh, I doubt anything in gzio.c will ever be touched again if it can
> possibly be avoided on the basis that it's been working for people for
> years and every attempt to change things causes some kind of
> interoperability issue.

That makes sense.  Here’s what I found from skimming over the new
implementation.  Some of these things may be worth fixing, some not:

gz_open (can segfault if it runs out of memory for path)
gz_comp (reports partial writes with "slow" devices as errors.
        This is wrong in two ways: it discards data, and errno
        contains random nonsense.)
gz_load (blocks in non-blocking situations with "slow" devices)
gzprintf (uses gz_zero and gz_comp, inherently nonresumable)
gzclose_r (does not report I/O errors!)
gzclose_w (uses gz_zero, gz_comp; inherently nonresumable)
gzclose (uses gzclose_r, gzclose_w)

gz{_zero,setparams,write,put[sc],flush} (use gz_comp)
gz{_{avail,head,next4,decomp,make,skip},read,gets,{un,}getc} (use gz_load)

Of these issues, the most important to me is the gz_comp one.  I
would like not to discard data.  For my uses, it would be fine to
restart write() to finish up any partial write.
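Concretely, the restart I have in mind is the usual retry loop around write(): advance past whatever a partial write consumed and retry on EINTR. A sketch (the helper name is made up, this is not zlib code and is untested against it):

```c
#include <errno.h>
#include <unistd.h>

/* Write all len bytes of buf to fd, resuming after partial writes
   and EINTR instead of treating them as errors.  Returns 0 on
   success, -1 on a real error (errno set by write()). */
static int write_all(int fd, const unsigned char *buf, size_t len)
{
    while (len > 0) {
        ssize_t got = write(fd, buf, len);
        if (got < 0) {
            if (errno == EINTR)
                continue;           /* interrupted: just retry */
            return -1;              /* genuine error */
        }
        buf += got;                 /* partial write: advance ... */
        len -= (size_t)got;         /* ... and loop for the rest */
    }
    return 0;
}
```

The downside, as noted below, is that this loop can block for a long time on a slow device, which is why reporting enough state to resume later is the more flexible option.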

Doing that would break applications that expect compression not to
block for so long, either because they used gzdopen on a file
descriptor with O_NONBLOCK or because the write to a "slow" device was
interrupted by a signal.  Luckily, it is possible to make both these
people and me happy without introducing new API or ABI.

The most flexible thing is to report enough information to allow
resuming after a partial write.  This means:

 - A field is added to gz_state to represent whether a flush is pending.
   gz_comp deals with any pending flush before starting other I/O.
 - gzwrite returns the number of input bytes consumed.  If not all bytes
   have reached the file yet, that’s fine, so there is no need to distinguish
   successful writes from partial writes that consumed all input.
 - This would make gzputc and gzputs automatically behave
   appropriately.
 - A short write in gzflush must not go undetected.  gzflush should set
   errno = EINTR in this case and return Z_ERRNO so the application
   knows it can be resumed.
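To illustrate what the application side would look like under these proposed semantics, here is a toy model (everything here is hypothetical: partial_write() stands in for a gzwrite() that may consume only part of its input per call):

```c
#include <stddef.h>
#include <string.h>

#define CHUNK 3                 /* artificially small, to force partial writes */

static char sink[64];           /* stands in for the compressed output file */
static size_t sink_len;

/* Toy stand-in for the proposed gzwrite(): consumes at most CHUNK
   input bytes per call and returns how many it consumed. */
static int partial_write(const char *buf, size_t len)
{
    size_t n = len < CHUNK ? len : CHUNK;
    memcpy(sink + sink_len, buf, n);
    sink_len += n;
    return (int)n;
}

/* The caller's resume loop: keep calling, advancing by the number
   of bytes consumed, until all input has been accepted. */
static int write_resumed(const char *buf, size_t len)
{
    size_t done = 0;
    while (done < len) {
        int n = partial_write(buf + done, len - done);
        if (n < 0)
            return -1;          /* real error */
        done += (size_t)n;      /* resume after a partial write */
    }
    return 0;
}
```

The point is that the caller never has to distinguish a short write from a complete one; it only needs the consumed-byte count to know where to resume.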

A fix for the gz_open bug mentioned above follows (warning: untested).

-- %< --
Subject: Check for malloc() failure in gz_open()

---
 gzlib.c |   12 ++++++++----
 1 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/gzlib.c b/gzlib.c
index bcef6c2..b74cb6b 100644
--- a/gzlib.c
+++ b/gzlib.c
@@ -153,6 +153,14 @@ local gzFile gz_open(path, fd, mode, use64)
         return NULL;
     }
 
+    /* save the path name for error messages */
+    state->path = malloc(strlen(path) + 1);
+    if (state->path == NULL) {
+        free(state);
+        return NULL;
+    }
+    strcpy(state->path, path);
+
     /* open the file with the appropriate mode (or just use fd) */
     state->fd = fd != -1 ? fd :
         open(path,
@@ -176,10 +184,6 @@ local gzFile gz_open(path, fd, mode, use64)
     if (state->mode == GZ_APPEND)
         state->mode = GZ_WRITE;         /* simplify later checks */
 
-    /* save the path name for error messages */
-    state->path = malloc(strlen(path) + 1);
-    strcpy(state->path, path);
-
     /* save the current position for rewinding (only if reading) */
     if (state->mode == GZ_READ) {
         state->start = LSEEK(state->fd, 0, SEEK_CUR);


