Hi,
HAProxy 2.7.4 was released on 2023/03/10. It added 110 new commits
after version 2.7.3.
The vast majority of the commits are HTTP3/QUIC updates. However, as
indicated in the 2.8-dev5 announce, this version fixes a concurrency bug
introduced in 2.5 that may cause freezes and crashes when some HTTP/1
backend connections are closed by the server at exactly the same time
they're about to be reused by another thread. Another bug affecting idle
connections since 2.2 was also fixed; it could cause an occasional crash.
One possible work-around if you've faced such issues recently is to
disable inter-thread connection reuse with this directive in the global
section:
    tune.idle-pool.shared off
But beware that this may increase the total number of connections kept
established with your backend servers, depending on the reuse frequency
and the number of threads.
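For those unsure where this goes, here is a minimal sketch of a global
section with the work-around applied. The "nbthread" value is only an
assumption for the example, keep whatever you already have there:

```haproxy
global
    # illustrative thread count, not a recommendation
    nbthread 8
    # work-around: do not share idle connections between threads
    tune.idle-pool.shared off
```

The setting can be dropped again once you're running a fixed version.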
In master-worker mode, when upgrading from an old version (before 1.9)
to a newer one (>=2.5), the HAPROXY_PROCESSES environment variable was
missing, and this combined with a missing element in an internal
structure representing old processes resulted in a null-deref that
crashed the master process after the reload. It's very unlikely to hit
this one, except during migration attempts where it can make one think
the new version doesn't work, and encourage rolling back to the older
one. The reported uptime for processes was also fixed so that wall clock
time is used instead of the internal timer.
A few issues affecting the Lua mapping of the HTTP client were addressed.
One of them is a small memory leak of a few bytes per request, which
could become problematic under heavy use. Another one is a concurrency
issue with Lua's garbage collector, which didn't sufficiently lock other
threads' items while trying to free them.
A bug in the watchdog in 2.7 could occasionally cause the wrong thread
to be measured, which could sometimes trigger it on highly asymmetric
loads, such as when frontends are bound to different thread sets and one
saturates the process while the other one remains fully idle.
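To illustrate the kind of setup meant here, a sketch of two frontends
bound to different thread sets could look like this. Names, ports,
addresses and thread ranges are all made up for the example:

```haproxy
global
    nbthread 8

defaults
    mode http
    timeout connect 5s
    timeout client 30s
    timeout server 30s

frontend fe_busy
    # this one may saturate threads 1 to 4
    bind :8080 thread 1-4
    default_backend be_app

frontend fe_quiet
    # threads 5 to 8 may remain fully idle at the same time
    bind :8081 thread 5-8
    default_backend be_app

backend be_app
    server app1 192.0.2.10:8000
```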
It was found that the low-latency scheduling of TLS handshakes can
degenerate during extreme loads, and take a long time to recover. The
problem is that in order to prevent TLS handshakes from causing high
latency spikes to the rest of the traffic, they're placed in a dedicated
scheduling class that executes one of them per polling loop. But if there
are too many pending due to a big burst, the extra latency caused to the
pending ones can make clients give up and try again, reaching the point
where none of the processed tasks yields anything useful since they were
already abandoned. Now the number of handshakes per loop will grow as
the number of pending ones grows, and this addresses the problem without
adding extra latency even under extreme loads.
There were as usual a significant number of QUIC updates, aiming at
addressing some issues reported by users, and to continue to improve
reliability and interoperability. Among the visible ones, the client-fin
timeout is now honored, which allows connections to be closed faster
after the last response was sent if the client disappeared. This could
help reduce the number of apparent concurrent connections. In addition,
some improvements were made to memory usage. Some failures to connect
at high rates (such as from
h2load) were finally fixed. The soft-stop is now fully functional when
"tune.quic.socket-owner" is set to "connection". The old process will
then continue to handle its connections and the new one will have its own
connections. Low-level errors are now better handled, with some errors
such as ICMP port unreachable reported by the stack causing an immediate
termination of the connection since it indicates the client has closed
(e.g. clicked stop in the browser, or pressed Ctrl-C in curl).
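As an illustration, the two settings mentioned above could be combined
roughly like this. The certificate path, port, address and timeout values
are assumptions made for the sake of the example:

```haproxy
global
    # one socket per QUIC connection; the soft-stop behavior described
    # above applies to this mode
    tune.quic.socket-owner connection

defaults
    mode http
    timeout connect 5s
    timeout client 30s
    timeout server 30s
    # now also honored on QUIC connections
    timeout client-fin 5s

frontend fe_h3
    bind quic4@:443 ssl crt /etc/haproxy/site.pem alpn h3
    default_backend be_app

backend be_app
    server app1 192.0.2.10:8000
```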
The cache failed to cache a response for a request that had the "no-cache"
directive (typically a forced reload). This prevented the cache from
being refreshed this way; this is now fixed.
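For reference, a minimal cache setup concerned by this fix might look like
the sketch below. The cache name, sizes and server address are made up:

```haproxy
cache mycache
    # total size in megabytes
    total-max-size 64
    # maximum object lifetime in seconds
    max-age 60

backend be_app
    mode http
    http-request cache-use mycache
    http-response cache-store mycache
    server app1 192.0.2.10:8000
```

With the fix, the response to a request carrying "Cache-Control: no-cache"
is stored again and thus refreshes the cached object.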
An infinite loop could happen on limited listeners (rate-limited or limited
by their maxconn value) due to the loss of some volatile casts during an
API cleanup in 2.7.
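For clarity, "limited listeners" here means something like the following
sketch, where names, ports and both limits are arbitrary values chosen
for the example:

```haproxy
frontend fe_limited
    mode http
    # listener limited by its own maxconn value
    bind :8080 maxconn 1000
    # frontend limited to 100 new sessions per second
    rate-limit sessions 100
    default_backend be_app

backend be_app
    mode http
    server app1 192.0.2.10:8000
```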
In some rare cases it was possible to freeze a compressing stream if there
was exactly one byte left at the end of the buffer, which was insufficient
to place a new HTX block and prevented any progress from being made. This
has been the case since 2.5 so it doesn't seem easy to trigger!
Layer7 retries no longer worked on the "empty-response" condition due to
a change that was made in 2.4.
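As a reminder, this condition is the one enabled by "retry-on" in a
configuration like the sketch below. The retry count, the extra
"conn-failure" condition and the server address are only assumptions for
the example:

```haproxy
backend be_app
    mode http
    retries 3
    # retry when the server closes without sending any response
    retry-on empty-response conn-failure
    server app1 192.0.2.10:8000
```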
The dump of the supported config language keywords with -dK incorrectly
attributed some of the crt-list specific keywords to "bind ... ssl", which
could cause confusion for those designing config parsers or generators by
regularly checking for new stuff there. Now an explicit "crt-list" sub-
section is dumped and "bind ssl" only dumps keywords really supported on
"bind" lines.
The global directive "no numa-cpu-mapping", which forces haproxy to bind
to multiple CPU sockets even if that results in lower performance, was
lost across reloads in master-worker mode, because the master in wait
mode doesn't see it and thus applies the restriction to itself, and that
restriction is inherited by subsequent masters that pass it to their
workers.
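The affected setups are those combining both settings, roughly as in this
minimal sketch (anything else in your global section stays unchanged):

```haproxy
global
    master-worker
    # let threads spread over all CPU sockets
    no numa-cpu-mapping
```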
And a few other minor updates aside, that's about all. Those with high
request rates or who already noticed crashes or strange errors are strongly
encouraged to update and try again. Those heavily using QUIC as well,
though I suspect that many of them are often on 2.8-dev. 2.7.4 catches up
with 2.8-dev on the QUIC front so if you want something more stable, that's
the way to go.
As mentioned in the 2.8-dev5 announce, we noticed a regression affecting
all 2.7 versions. If you connect to the CLI over a UNIX socket and the
client closes the input channel, the connection will not be closed. Given
that the number of connections on the CLI is limited to 10 by default, it
can quickly happen that the CLI becomes unusable. We'll work on it next
week, but in the meantime it can be prudent to increase that limit a
little bit in your global section:
    stats maxconn 100   # for 2.8 <= 2.8-dev5 or 2.7 <= 2.7.4
Just keep this in mind if you're thinking about upgrading from 2.6 to
2.7: better wait for 2.7.5 for the final deployment in this case. If
you're already on 2.7 and did not notice anything, no need to worry.
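In context, the temporary limit could sit next to the CLI socket
declaration as in this sketch. The socket path, mode and level are
assumptions for the example, use your own:

```haproxy
global
    stats socket /var/run/haproxy.sock mode 600 level admin
    # temporarily raised from the default of 10
    stats maxconn 100
```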
Please find the usual URLs below :
Site index : https://www.haproxy.org/
Documentation : https://docs.haproxy.org/
Wiki : https://github.com/haproxy/wiki/wiki
Discourse : https://discourse.haproxy.org/
Slack channel : https://slack.haproxy.org/
Issue tracker : https://github.com/haproxy/haproxy/issues
Sources : https://www.haproxy.org/download/2.7/src/
Git repository : https://git.haproxy.org/git/haproxy-2.7.git/
Git Web browsing : https://git.haproxy.org/?p=haproxy-2.7.git
Changelog : https://www.haproxy.org/download/2.7/src/CHANGELOG
Dataplane API : https://github.com/haproxytech/dataplaneapi/releases/latest
Pending bugs : https://www.haproxy.org/l/pending-bugs
Reviewed bugs : https://www.haproxy.org/l/reviewed-bugs
Code reports : https://www.haproxy.org/l/code-reports
Latest builds : https://www.haproxy.org/l/dev-packages
Willy
---
Complete changelog :
Amaury Denoyelle (29):
MINOR: h3/hq-interop: handle no data in decode_qcs() with FIN set
BUG/MINOR: mux-quic: transfer FIN on empty STREAM frame
MINOR: h3: add traces on decode_qcs callback
MINOR: quic: adjust request reject when MUX is already freed
BUG/MINOR: quic: also send RESET_STREAM if MUX released
BUG/MINOR: quic: acknowledge STREAM frame even if MUX is released
BUG/MINOR: h3: prevent hypothetical demux failure on int overflow
MEDIUM: h3: enforce GOAWAY by resetting higher unhandled stream
MINOR: mux-quic: define qc_shutdown()
MINOR: mux-quic: define qc_process()
MINOR: mux-quic: implement client-fin timeout
MEDIUM: mux-quic: properly implement soft-stop
MINOR: quic: mark quic-conn as jobs on socket allocation
MEDIUM: quic: trigger fast connection closing on process stopping
MEDIUM: quic: improve fatal error handling on send
MINOR: quic: consider EBADF as critical on send()
MINOR: quic: simplify return path in send functions
MINOR: quic: implement qc_notify_send()
MINOR: quic: purge txbuf before preparing new packets
MEDIUM: quic: implement poller subscribe on sendto error
MINOR: quic: notify on send ready
BUG/MEDIUM: quic: properly handle duplicated STREAM frames
BUG/MINOR: cli: fix CLI handler "set anon global-key" call
BUG/MEDIUM: quic: do not crash when handling STREAM on released MUX
BUG/MINOR: mux-quic: properly init STREAM frame as not duplicated
MINOR: h3: add traces on h3_init_uni_stream() error paths
MINOR: quic: create a global list dedicated for closing QUIC conns
MINOR: quic: handle new closing list in show quic
MEDIUM: quic: release closing connections on stopping
Aurelien DARRAGON (3):
BUG/MINOR: lua/httpclient: missing free in hlua_httpclient_send()
BUG/MEDIUM: httpclient/lua: fix a race between lua GC and hlua_ctx_destroy
BUG/MEDIUM: fd: avoid infinite loops in fd_add_to_fd_list and
fd_rm_from_fd_list
Christopher Faulet (13):
BUG/MEDIUM: stconn: Don't rearm the read expiration date if EOI was
reached
DOC: config: Fix description of options about HTTP connection modes
DOC: config: Add the missing tune.fail-alloc option from global listing
REGTESTS: Fix ssl_errors.vtc script to wait for connections close
BUG/MEDIUM: h1-htx: Never copy more than the max data allowed during
parsing
DOC: config: Clarify the meaning of 'hold' in the 'resolvers' section
BUG/MEDIUM: connection: Clear flags when a conn is removed from an idle
list
BUG/MINOR: mux-h1: Don't report an error on an early response close
BUG/MINOR: http-check: Don't set HTX_SL_F_BODYLESS flag with a log-format
body
BUG/MINOR: http-check: Skip C-L header for empty body when it's not
mandatory
BUG/MINOR: http-ana: Don't increment conn_retries counter before the L7
retry
BUG/MINOR: http-ana: Do a L7 retry on read error if there is no response
BUG/MINOR: mxu-h1: Report a parsing error on abort with pending data
Frédéric Lécaille (38):
BUG/MINOR: quic: Possible unexpected counter incrementation on send*()
errors
MINOR: quic: Add new traces about by connection RX buffer handling
MINOR: quic: Move code to wakeup the timer task to avoid anti-amplication
deadlock
BUG/MINOR: quic: Really cancel the connection timer from qc_set_timer()
MINOR: quic: Simplication for qc_set_timer()
MINOR: quic: Kill the connections on ICMP (port unreachable) packet
receipt
MINOR: quic: Add traces to qc_kill_conn()
MINOR: quic: Make qc_dgrams_retransmit() return a status.
BUG/MINOR: quic: Missing call to task_queue() in qc_idle_timer_do_rearm()
MINOR: quic: Add a trace to identify connections which sent Initial
packet.
MINOR: quic: Add <pto_count> to the traces
BUG/MINOR: quic: Do not probe with too little Initial packets
BUG/MINOR: quic: Wrong initialization for io_cb_wakeup boolean
BUG/MINOR: quic: Do not drop too small datagrams with Initial packets
BUG/MINOR: quic: Missing padding for short packets
BUG/MEDIUM: quic: Missing TX buffer draining from qc_send_ppkts()
BUILD: quic: 32-bits compilation issue with %zu in quic_rx_pkts_del()
BUILD: thead: Fix several 32 bits compilation issues with uint64_t
variables
BUG/MINOR: quic: Do not send too small datagrams (with Initial packets)
MINOR: quic: Add a BUG_ON_HOT() call for too small datagrams
BUG/MINOR: quic: Ensure to be able to build datagrams to be retransmitted
BUG/MINOR: quic: v2 Initial packets decryption failed
MINOR: quic: Add traces about QUIC TLS key update
BUG/MINOR: quic: Remove force_ack for Initial,Handshake packets
BUG/MINOR: quic: Ensure not to retransmit packets with no ack-eliciting
frames
BUG/MINOR: quic: Do not resend already acked frames
BUG/MINOR: quic: Missing detections of amplification limit reached
MINOR: quic: Send PING frames when probing Initial packet number space
MINOR: quic: Do not accept wrong active_connection_id_limit values
MINOR: quic: Store the next connection IDs sequence number in the
connection
MINOR: quic: Typo fix for ACK_ECN frame
MINOR: quic: RETIRE_CONNECTION_ID frame handling (RX)
MINOR: quic: Useless TLS context allocations in qc_do_rm_hp()
MINOR: quic: Add spin bit support
MINOR: quic: Add transport parameters to "show quic"
BUG/MINOR: quic: Wrong RETIRE_CONNECTION_ID sequence number check
MINOR: quic: Do not stress the peer during retransmissions of lost packets
BUG/MINOR: quic: Missing listener accept queue tasklet wakeups
Michael Prokop (1):
DOC/CLEANUP: fix typos
Remi Tricot-Le Breton (3):
BUG/MINOR: cache: Cache response even if request has "no-cache" directive
BUG/MINOR: cache: Check cache entry is complete in case of Vary
BUG/MINOR: ssl: Use 'date' instead of 'now' in ocsp stapling callback
William Lallemand (8):
BUG/MINOR: mworker: stop doing strtok directly from the env
BUG/MEDIUM: mworker: prevent inconsistent reload when upgrading from old
versions
BUG/MEDIUM: mworker: don't register mworker_accept_wrapper() when master
FD is wrong
MINOR: startup: HAPROXY_STARTUP_VERSION contains the version used to start
BUG/MINOR: mworker: prevent incorrect values in uptime
MINOR: ssl: rename confusing ssl_bind_kws
BUG/MINOR: config: crt-list keywords mistaken for bind ssl keywords
BUG/MINOR: mworker: use MASTER_MAXCONN as default maxconn value
Willy Tarreau (15):
BUG/MEDIUM: wdt: fix wrong thread being checked for sleeping
BUG/MINOR: sched: properly report long_rq when tasks remain in the queue
BUG/MEDIUM: sched: allow a bit more TASK_HEAVY to be processed when needed
MINOR: mux-h2/traces: do not log h2s pointer for dummy streams
MINOR: mux-h2/traces: add a missing TRACE_LEAVE() in
h2s_frt_handle_headers()
BUG/MINOR: ring: do not realign ring contents on resize
BUG/MINOR: fd: used the update list from the fd's group instead of tgid
BUG/MEDIUM: fd: make fd_delete() support being called from a different
group
CLEANUP: listener: only store conn counts for local threads
BUG/MAJOR: fd/thread: fix race between updates and closing FD
MINOR: fd/cli: report the polling mask in "show fd"
BUG/MINOR: init: properly detect NUMA bindings on large systems
BUG/MINOR: thread: report thread and group counts in the correct order
BUG/MAJOR: fd/threads: close a race on closing connections after takeover
MINOR: quic_sock: un-statify quic_conn_sock_fd_iocb()
---