Package: pv
Version: 1.6.6-1
Severity: normal
Tags: upstream patch

Hello,

This was reported upstream some time ago but it doesn't seem new pv
versions are coming out so reporting here instead, maybe we can do
something about it in debian.

This was found on ubuntu initially, sorry I haven't tested debian per se
but from what I can see it's affected as well as package sources are
identical.

Consider following scenario:

I was running a really long task that generates a 800Mb file at about
50k/s. Something like:

   while true ; do ls -l /* ; sleep 1 ; done > /tmp/file

Sometime after it started I thought of using pv to keep track of
progress with:

   tail -n +1 -f /tmp/file | pv -s 800m >/dev/null

I was interested in ETA mostly, unfortunately if you do this after a few
mb have been generated the ETA is wildly wrong for a very long time:
transfer rate is really fast initially and fools pv into thinking we're
going to get something like that over time, and it takes a really long
time for the average rate to drop back to 50k/s which is all we're
getting actually.

It'd be nice to have an ETA based on current rate average here, say over
last 30s or last 10s, instead of global rate average.

I've been playing with this patch on top of pv 1.6.6
(also at https://github.com/lemonsqueeze/pv)

Which keeps track of history periodically over a given time window and uses
that to compute current average rate for ETA. Looks like it works pretty
well in this case. Also added a --rate-window option to change time window
(default: last 10s) (name isn't great, maybe could find a better one).


-- System Information:
Debian Release: stretch/sid
  APT prefers xenial-updates
  APT policy: (500, 'xenial-updates'), (500, 'xenial-security'), (500, 
'xenial'), (100, 'xenial-backports')
Architecture: i386 (i686)

Kernel: Linux 4.4.0-174-generic (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages pv depends on:
ii  libc6  2.23-0ubuntu11

pv recommends no packages.

Versions of packages pv suggests:
pn  doc-base  <none>

-- no debconf information
diff --git a/doc/quickref.1.in b/doc/quickref.1.in
index 5245e3f..3a0835c 100644
--- a/doc/quickref.1.in
+++ b/doc/quickref.1.in
@@ -263,6 +263,11 @@ Wait
 seconds between updates.  The default is to update every second.
 Note that this can be a decimal such as 0.1.
 .TP
+.B \-m SEC, \-\-rate-window SEC
+Compute current average rate over a
+.B SEC
+seconds window for ETA (default 10s).
+.TP
 .B \-w WIDTH, \-\-width WIDTH
 Assume the terminal is
 .B WIDTH
diff --git a/src/include/options.h b/src/include/options.h
index 0f8029e..c2834bb 100644
--- a/src/include/options.h
+++ b/src/include/options.h
@@ -42,6 +42,7 @@ struct opts_s {           /* structure describing run-time 
options */
        double delay_start;            /* delay before first display */
        unsigned int watch_pid;        /* process to watch fds of */
        int watch_fd;                  /* fd to watch */
+       unsigned int rate_window;      /* time window in seconds for current 
average rate */
        unsigned int width;            /* screen width */
        unsigned int height;           /* screen height */
        char *name;                    /* process name, if any */
diff --git a/src/include/pv-internal.h b/src/include/pv-internal.h
index 46d7496..9b49ba3 100644
--- a/src/include/pv-internal.h
+++ b/src/include/pv-internal.h
@@ -44,7 +44,13 @@ extern "C" {
 
 #define MAXIMISE_BUFFER_FILL   1
 
+       
+typedef struct pvhistory {
+       long long   total_bytes;
+       long double elapsed_sec;
+} pvhistory_t;         
 
+       
 /*
  * Structure for holding PV internal state. Opaque outside the PV library.
  */
@@ -113,6 +119,15 @@ struct pvstate_s {
        long double prev_elapsed_sec;
        long double prev_rate;
        long double prev_trans;
+
+       /* Keep track of progress over last intervals to compute current 
average rate. */
+       pvhistory_t *history;            /* state at previous intervals 
(circular buffer) */
+       int history_len;                 /* total size */
+       int history_interval;            /* seconds between each history entry 
*/
+       int history_first;
+       int history_last;
+       long double current_avg_rate;    /* current average rate over last 
history intervals */
+       
        unsigned long long initial_offset;
        char *display_buffer;
        long display_buffer_size;
diff --git a/src/include/pv.h b/src/include/pv.h
index 7a3970c..a93faf4 100644
--- a/src/include/pv.h
+++ b/src/include/pv.h
@@ -93,6 +93,7 @@ extern void pv_state_name_set(pvstate_t, const char *);
 extern void pv_state_format_string_set(pvstate_t, const char *);
 extern void pv_state_watch_pid_set(pvstate_t, unsigned int);
 extern void pv_state_watch_fd_set(pvstate_t, int);
+extern void pv_state_rate_window_set(pvstate_t, int);
 
 extern void pv_state_inputfiles(pvstate_t, int, const char **);
 
diff --git a/src/main/help.c b/src/main/help.c
index d33bdec..857182b 100644
--- a/src/main/help.c
+++ b/src/main/help.c
@@ -40,6 +40,8 @@ void display_help(void)
                 N_("show data transfer rate counter")},
                {"-a", "--average-rate", 0,
                 N_("show data transfer average rate counter")},
+               {"-m", "--rate-window", N_("SEC"),
+                N_("compute current average rate over a SEC seconds window for 
ETA (default 10s)")},
                {"-b", "--bytes", 0,
                 N_("show number of bytes transferred")},
                {"-T", "--buffer-percent", 0,
diff --git a/src/main/main.c b/src/main/main.c
index 24366ee..815d92a 100644
--- a/src/main/main.c
+++ b/src/main/main.c
@@ -202,6 +202,7 @@ int main(int argc, char **argv)
        pv_state_format_string_set(state, opts->format);
        pv_state_watch_pid_set(state, opts->watch_pid);
        pv_state_watch_fd_set(state, opts->watch_fd);
+       pv_state_rate_window_set(state, opts->rate_window);
 
        pv_state_set_format(state, opts->progress, opts->timer, opts->eta,
                            opts->fineta, opts->rate, opts->average_rate,
diff --git a/src/main/options.c b/src/main/options.c
index 847d63c..6fecaac 100644
--- a/src/main/options.c
+++ b/src/main/options.c
@@ -80,12 +80,13 @@ opts_t opts_parse(int argc, char **argv)
                {"remote", 1, 0, 'R'},
                {"pidfile", 1, 0, 'P'},
                {"watchfd", 1, 0, 'd'},
+               {"rate-window", 1, 0, 'm'},
                {0, 0, 0, 0}
        };
        int option_index = 0;
 #endif
        char *short_options =
-           "hVpteIrabTA:fnqcWD:s:l0i:w:H:N:F:L:B:CESR:P:d:";
+           "hVpteIrabTA:fnqcWD:s:l0i:w:H:N:F:L:B:CESR:P:d:m:";
        int c, numopts;
        unsigned int check_pid;
        int check_fd;
@@ -124,6 +125,7 @@ opts_t opts_parse(int argc, char **argv)
        opts->delay_start = 0;
        opts->watch_pid = 0;
        opts->watch_fd = -1;
+       opts->rate_window = 10;
 
        do {
 #ifdef HAVE_GETOPT_LONG
@@ -147,6 +149,7 @@ opts_t opts_parse(int argc, char **argv)
                case 'L':
                case 'B':
                case 'R':
+               case 'm':
                        if (pv_getnum_check(optarg, PV_NUMTYPE_INTEGER) !=
                            0) {
                                fprintf(stderr, "%s: -%c: %s\n",
@@ -311,6 +314,9 @@ opts_t opts_parse(int argc, char **argv)
                        sscanf(optarg, "%u:%d", &(opts->watch_pid),
                               &(opts->watch_fd));
                        break;
+               case 'm':
+                       opts->rate_window = pv_getnum_i(optarg);
+                       break;
                default:
 #ifdef HAVE_GETOPT_LONG
                        fprintf(stderr,
diff --git a/src/pv/display.c b/src/pv/display.c
index 2627152..e2ca524 100644
--- a/src/pv/display.c
+++ b/src/pv/display.c
@@ -74,16 +74,14 @@ static long pv__calc_percentage(long long so_far, const 
long long total)
  * number of seconds until completion.
  */
 static long pv__calc_eta(const long long so_far, const long long total,
-                        const long elapsed)
+                        const long rate)
 {
        long long amount_left;
 
        if (so_far < 1)
                return 0;
 
-       amount_left = total - so_far;
-       amount_left *= (long long) elapsed;
-       amount_left /= so_far;
+       amount_left = (total - so_far) / rate;
 
        return (long) amount_left;
 }
@@ -405,6 +403,38 @@ static long bound_long(long x, long min, long max)
        return x < min ? min : x > max ? max : x;
 }
 
+/* Update history and current average rate */
+static void update_history_avg_rate(pvstate_t state, long long total_bytes,
+                                   long double elapsed_sec, long double rate)
+{
+       int first = state->history_first;
+       int last = state->history_last;
+       long double last_elapsed = state->history[last].elapsed_sec;
+
+       if (!(last_elapsed == 0.0 ||  /* Empty */
+             elapsed_sec > last_elapsed + state->history_interval))
+               return;
+       
+       if (last_elapsed) { /* Not empty, add new entry in circular buffer */
+               int len = state->history_len;
+               state->history_last = last = (last + 1) % len;
+               if (last == first)
+                       state->history_first = first = (first + 1) % len;
+       }
+               
+       state->history[last].elapsed_sec = elapsed_sec;
+       state->history[last].total_bytes = total_bytes;
+       
+       if (first == last)
+               state->current_avg_rate = rate;
+       else {
+               long long bytes = (state->history[last].total_bytes -
+                                  state->history[first].total_bytes);
+               long double sec = (state->history[last].elapsed_sec -
+                                  state->history[first].elapsed_sec);
+               state->current_avg_rate = bytes / sec;
+       }
+}
 
 /*
  * Return a pointer to a string (which must not be freed), containing status
@@ -465,23 +495,9 @@ static char *pv__format(pvstate_t state,
        }
        state->prev_rate = rate;
 
-       /*
-        * We only calculate the overall average rate if this is the last
-        * update or if the average rate display is enabled. Otherwise it's
-        * not worth the extra CPU cycles.
-        */
-       if ((bytes_since_last < 0)
-           || ((state->components_used & PV_DISPLAY_AVERAGERATE) != 0)) {
-               /* Sanity check to avoid division by zero */
-               if (elapsed_sec < 0.000001)
-                       elapsed_sec = 0.000001;
-               average_rate =
-                   (((long double) total_bytes) -
-                    ((long double) state->initial_offset)) /
-                   (long double) elapsed_sec;
-               if (bytes_since_last < 0)
-                       rate = average_rate;
-       }
+       /* Update history and current average rate for ETA. */
+       update_history_avg_rate(state, total_bytes, elapsed_sec, rate);
+       average_rate = state->current_avg_rate;
 
        if (state->size <= 0) {
                /*
@@ -654,7 +670,7 @@ static char *pv__format(pvstate_t state,
                eta =
                    pv__calc_eta(total_bytes - state->initial_offset,
                                 state->size - state->initial_offset,
-                                elapsed_sec);
+                                state->current_avg_rate);
 
                /*
                 * Bounds check, so we don't overrun the suffix buffer. This
@@ -706,7 +722,7 @@ static char *pv__format(pvstate_t state,
                eta =
                    pv__calc_eta(total_bytes - state->initial_offset,
                                 state->size - state->initial_offset,
-                                elapsed_sec);
+                                state->current_avg_rate);
 
                /*
                 * Bounds check, so we don't overrun the suffix buffer. This
diff --git a/src/pv/state.c b/src/pv/state.c
index 5673c5b..6ed7137 100644
--- a/src/pv/state.c
+++ b/src/pv/state.c
@@ -7,6 +7,7 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
+#include <assert.h>
 
 
 /*
@@ -56,11 +57,28 @@ void pv_state_free(pvstate_t state)
                free(state->transfer_buffer);
        state->transfer_buffer = NULL;
 
+       if (state->history)
+               free(state->history);
+       state->history = NULL;
+       
        free(state);
 
        return;
 }
 
+/* alloc / realloc history buffer */
+static void pv_alloc_history(pvstate_t state)
+{
+       if (state->history)
+               free(state->history);
+       
+       assert(state->history_len);
+       assert(state->history_interval);
+
+       state->history = calloc(state->history_len, sizeof(state->history[0]));
+       state->history_first = state->history_last = 0;
+       state->history[0].elapsed_sec = 0.0;  /* to be safe, memset() not 
recommended for doubles */
+}
 
 /*
  * Set the formatting string, given a set of old-style formatting options.
@@ -205,6 +223,19 @@ void pv_state_watch_fd_set(pvstate_t state, int val)
        state->watch_fd = val;
 };
 
+void pv_state_rate_window_set(pvstate_t state, int val)
+{      
+       if (val >= 20) {
+               state->history_len = val / 5 + 1;
+               state->history_interval = 5;
+       }
+       else {
+               state->history_len = val + 1;
+               state->history_interval = 1;            
+       }
+       pv_alloc_history(state);
+};
+
 
 /*
  * Set the array of input files.

Reply via email to