Hi,
HAProxy 2.8-dev5 was released on 2023/03/10. It added 199 new commits
after version 2.8-dev4.
This version got a bit delayed due to spending a full week on a difficult
concurrency bug that was introduced during the 2.5 development cycle and
that can definitely explain some of the occasional strange reports we've
seen from time to time (connections not dying, crashes or CPU usage peaks).
The issue happened when the file descriptor management API was reworked to
introduce support for thread groups. A tiny race that was not possible
before was introduced and could occasionally permit a file descriptor to
be processed immediately after it was migrated to another thread in case
an I/O happened at the same moment (in fact almost exclusively a persistent
connection to a backend server that was dropped by the server during its
migration). Due to the way FDs are allocated, it could even happen quite
often that the FD was immediately reassigned for another (or the same)
outgoing server or for an incoming request, possibly also leading to
wrong events being reported there (e.g. connection errors being delivered
on the next connection). Interestingly, the support for thread groups in
2.7 that required refcounting offered more possibilities to fix it but
increased the effect of the race so that it's easy to see frozen
connections. However, not having this refcounting prior to 2.7 made it
almost impossible to fix the issue in 2.5 or 2.6, so some of these
mechanisms had to be backported there in order to close the race (more on
that in the respective announcement messages).
In short, I'll ask that those who face request timeouts on servers, or
abnormal CPU peaks from time to time, or even strange crashes whose
backtrace shows fd_update_events() try again on the updated versions.
Other old issues were addressed such as a possible infinite loop when
a listener gets rate-limited or is used at its maxconn at a high rate.
Enough speaking of the bugs, a total of 74 were fixed in this version
anyway.
Among the structural changes heading for 2.8, Christopher's changes
to continue to improve error reporting between internal layers got another
update. As usual, lots of testing, no regressions expected, etc etc, but
please report anything strange (e.g. change of status flags in logs or
connections staying in CLOSE_WAIT).
Aurélien addressed some structural limitations of how listeners are
suspended and resumed during a failed reload. For example if an abns
listener couldn't be resumed (since they don't support pause and need
to be stopped), this could trigger a crash, which is not exactly what
you want when a new process failed to start given that it may indicate
a faulty config. Some of these might be backported to stable versions
after some observation time.
QUIC's error handling was improved, including at the socket level. There
are now fewer losses thanks to the sender now subscribing to the poller.
There are also a number of small improvements that I'm totally unable to
explain, but which resulted in both the interop and quic tracker tools
reporting even fewer failures. Now we're at a point where haproxy is among
the most successful stacks on both sites, this is great!
The config predicates used with .if or -cc on the command line now got
two new functions, "enabled(name)" to test if a runtime feature is enabled
(e.g. "SPLICE" etc), and "strstr(subject,pattern)" that is convenient to
check for the presence of some patterns in environment variables. By the
way a new config-time environment variable $HAPROXY_BRANCH now contains
the current branch. This is helpful during migrations to switch certain
options to one version or the other.
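To give an idea, here is a minimal sketch of how these could be used in a
config. The messages and the $DEPLOY_ENV variable are made up for the
example, and the "2.8" value for $HAPROXY_BRANCH is an assumption about
what the variable contains on this branch; only "enabled()", "strstr()"
and $HAPROXY_BRANCH come from the changes described above:

    # enabled(): true when the named feature is enabled at run time
    .if enabled(SPLICE)
        .notice "splicing is enabled"
    .else
        .notice "splicing is disabled"
    .endif

    # strstr(): true when the pattern is found in the subject.
    # DEPLOY_ENV is a hypothetical variable set by the init script.
    .if strstr("$DEPLOY_ENV",staging)
        .notice "running with staging settings"
    .endif

    # HAPROXY_BRANCH is assumed to read "2.8" here
    .if streq("$HAPROXY_BRANCH",2.8)
        .notice "using 2.8-specific options"
    .endif

The same expressions should also be usable with -cc on the command line
(for example -cc 'enabled(SPLICE)'), in which case the exit status reflects
the result of the test.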
JWT now supports RSA-PSS signatures, which could report an "Unmanaged
algorithm" error before.
Also, "option httpclose" had been causing trouble for some time, since
the union of the frontend's and the backend's settings was long used when
deciding how to handle a backend connection. While this used to make sense
before 1.8, where the same stream was reset and recycled for all subsequent
requests, it has become completely counter-intuitive now to imagine that
"option httpclose" in a frontend will result in the backend connection
being killed after the response. Given that explanations
starting with "well, this is for historical reasons" are generally wrong,
it was about time to address this one and do what users think it does (and
update the doc to reflect this and remove the exception).
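As an illustration of the behavior described above, consider the fragment
below. The proxy names, the port and the server address are placeholders,
and the comments only restate the intent of the change, they are not taken
from the updated doc:

    frontend fe_main
        mode http
        bind :8080
        # only meant to affect the client-side connection now
        option httpclose
        default_backend be_app

    backend be_app
        mode http
        # no "option httpclose" here, so the frontend's setting is no
        # longer expected to force the server connection to be closed
        server app1 192.0.2.10:8080

Those who really want server connections closed after each response would
set "option httpclose" in the backend (or in a defaults section that the
backend inherits from).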
Rémi merged some OCSP update patches as well. There are status counters
and info that are dumped in "show ssl crt-list". Also, the new "show ssl
ocsp-updates" command reports new info. All these automated updates will
now have their own log format.
Finally some new regtests were added and others updated.
Oh, by the way, we noticed a regression that affects 2.7 and 2.8. If you
connect to the CLI over a UNIX socket and the client closes the input
channel, the connection will not be closed. Given that the number of
connections on the CLI is limited to 10 by default, it can quickly happen
that the CLI becomes unusable. We'll work on it next week, but in the
meantime it can be prudent to increase that limit a little bit in your
global section:
    stats maxconn 100    # for 2.8 <= 2.8-dev5 or 2.7 <= 2.7.10
Some closing words, we're already in March, time flies. If you have started
changes that you would like to see merged, please at least make them public
before the end of the month so that we can use the remaining two months to
stabilize everything once integrated together.
Please find the usual URLs below :
   Site index       : https://www.haproxy.org/
   Documentation    : https://docs.haproxy.org/
   Wiki             : https://github.com/haproxy/wiki/wiki
   Discourse        : https://discourse.haproxy.org/
   Slack channel    : https://slack.haproxy.org/
   Issue tracker    : https://github.com/haproxy/haproxy/issues
   Sources          : https://www.haproxy.org/download/2.8/src/
   Git repository   : https://git.haproxy.org/git/haproxy.git/
   Git Web browsing : https://git.haproxy.org/?p=haproxy.git
   Changelog        : https://www.haproxy.org/download/2.8/src/CHANGELOG
   Dataplane API    : https://github.com/haproxytech/dataplaneapi/releases/latest
   Pending bugs     : https://www.haproxy.org/l/pending-bugs
   Reviewed bugs    : https://www.haproxy.org/l/reviewed-bugs
   Code reports     : https://www.haproxy.org/l/code-reports
   Latest builds    : https://www.haproxy.org/l/dev-packages
Willy
---
Complete changelog :
Amaury Denoyelle (30):
MINOR: h3/hq-interop: handle no data in decode_qcs() with FIN set
BUG/MINOR: mux-quic: transfer FIN on empty STREAM frame
MINOR: h3: add traces on decode_qcs callback
MINOR: quic: adjust request reject when MUX is already freed
BUG/MINOR: quic: also send RESET_STREAM if MUX released
BUG/MINOR: quic: acknowledge STREAM frame even if MUX is released
BUG/MINOR: h3: prevent hypothetical demux failure on int overflow
MEDIUM: h3: enforce GOAWAY by resetting higher unhandled stream
MINOR: mux-quic: define qc_shutdown()
MINOR: mux-quic: define qc_process()
MINOR: mux-quic: implement client-fin timeout
MEDIUM: mux-quic: properly implement soft-stop
MINOR: quic: mark quic-conn as jobs on socket allocation
MEDIUM: quic: trigger fast connection closing on process stopping
MEDIUM: quic: improve fatal error handling on send
MINOR: quic: consider EBADF as critical on send()
MINOR: quic: simplify return path in send functions
MINOR: quic: implement qc_notify_send()
MINOR: quic: purge txbuf before preparing new packets
MEDIUM: quic: implement poller subscribe on sendto error
MINOR: quic: notify on send ready
BUG/MEDIUM: quic: properly handle duplicated STREAM frames
BUG/MINOR: cli: fix CLI handler "set anon global-key" call
BUG/MEDIUM: quic: do not crash when handling STREAM on released MUX
BUG/MEDIUM: dns: ensure ring offset is properly reajusted to head
BUG/MINOR: mux-quic: properly init STREAM frame as not duplicated
MINOR: h3: add traces on h3_init_uni_stream() error paths
MINOR: quic: create a global list dedicated for closing QUIC conns
MINOR: quic: handle new closing list in show quic
MEDIUM: quic: release closing connections on stopping
Aurelien DARRAGON (23):
BUG/MINOR: lua/httpclient: missing free in hlua_httpclient_send()
BUG/MEDIUM: httpclient/lua: fix a race between lua GC and hlua_ctx_destroy
BUG/MINOR: proto_ux: report correct error when bind_listener fails
BUG/MINOR: protocol: fix minor memory leak in protocol_bind_all()
MINOR: proto_uxst: add resume method
MINOR: listener/api: add lli hint to listener functions
MINOR: listener: add relax_listener() function
MINOR: listener: workaround for closing a tiny race between
resume_listener() and stopping
MINOR: listener: make sure we don't pause/resume bypassed listeners
BUG/MEDIUM: listener: fix pause_listener() suspend return value handling
BUG/MINOR: listener: fix resume_listener() resume return value handling
BUG/MEDIUM: resume from LI_ASSIGNED in default_resume_listener()
MINOR: listener: pause_listener() becomes suspend_listener()
BUG/MEDIUM: listener/proxy: fix listeners notify for proxy resume
BUG/MINOR: sock_unix: match finalname with tempname in sock_unix_addrcmp()
MEDIUM: proto_ux: properly suspend named UNIX listeners
MINOR: proto_ux: ability to dump ABNS names in error messages
MINOR: haproxy: always protocol unbind on startup error path
BUG/MEDIUM: fd: avoid infinite loops in fd_add_to_fd_list and
fd_rm_from_fd_list
MINOR: http_ext: adding some documentation, forgot to inline function
BUG/MEDIUM: sink/forwarder: ensure ring offset is properly readjusted to
head
BUG/MINOR: dns: fix ring offset calculation on first read
BUG/MINOR: dns: fix ring offset calculation in dns_resolve_send()
Christopher Faulet (52):
BUG/MEDIUM: http-ana: Detect closed SC on opposite side during body
forwarding
BUG/MEDIUM: stconn: Don't rearm the read expiration date if EOI was
reached
MINOR: global: Add an option to disable the data fast-forward
MINOR: haproxy: Add an command option to disable data fast-forward
REGTESTS: Remove unsupported feature command in http_splicing.vtc
DEBUG: stream: Add a BUG_ON to never exit process_stream with an expired
task
DOC: config: Fix description of options about HTTP connection modes
MINOR: proxy: Only consider backend httpclose option for server
connections
BUG/MINOR: haproxy: Fix option to disable the fast-forward
DOC: config: Add the missing tune.fail-alloc option from global listing
MINOR: cfgcond: Implement strstr condition expression
MINOR: cfgcond: Implement enabled condition expression
REGTESTS: Skip http_splicing.vtc script if fast-forward is disabled
REGTESTS: Fix ssl_errors.vtc script to wait for connections close
MEDIUM: channel: Remove CF_READ_NOEXP flag
MAJOR: channel: Remove flags to report READ or WRITE errors
DEBUG: stream/trace: Add sedesc flags in trace messages
MINOR: channel/stconn: Move rto/wto from the channel to the stconn
MEDIUM: channel/stconn: Move rex/wex timer from the channel to the sedesc
MEDIUM: stconn: Don't requeue the stream's task after I/O
MEDIUM: stconn: Replace read and write timeouts by a unique I/O timeout
MEDIUM: stconn: Add two date to track successful reads and blocked sends
MINOR: applet/stconn: Add a SE flag to specify an endpoint does not
expect data
MAJOR: stream: Use SE descriptor date to detect read/write timeouts
MINOR: stream: Dump the task expiration date in trace messages
MINOR: stream: Report rex/wex value using the sedesc date in trace
messages
MINOR: stream: Use relative expiration date in trace messages
MINOR: stconn: Always report READ/WRITE event on shutr/shutw
CLEANUP: stconn: Remove old read and write expiration dates
MINOR: stconn: Set half-close timeout using proxy settings
MINOR: stconn: Remove half-closed timeout
REGTESTS: cache: Use rxresphdrs to only get headers for 304 responses
MINOR: stconn: Add functions to set/clear SE_FL_EXP_NO_DATA flag from
endpoint
BUG/MEDIUM: h1-htx: Never copy more than the max data allowed during
parsing
BUG/MINOR: stream: Remove BUG_ON about the task expiration in
process_stream()
MINOR: stream: Handle stream's timeouts in a dedicated function
MEDIUM: stream: Eventually handle stream timeouts when exiting
process_stream()
MINOR: stconn: Report a send activity when endpoint is willing to consume
data
BUG/MEDIUM: stconn: Report a blocked send if some output data are not
consumed
MEDIUM: mux-h1: Don't expect data from server as long as request is
unfinished
MEDIUM: mux-h2: Don't expect data from server as long as request is
unfinished
MEDIUM: mux-quic: Don't expect data from server as long as request is
unfinished
DOC: config: Clarify the meaning of 'hold' in the 'resolvers' section
DOC: config: Replace TABs by spaces
BUG/MEDIUM: connection: Clear flags when a conn is removed from an idle
list
BUG/MINOR: mux-h1: Don't report an error on an early response close
BUG/MINOR: http-check: Don't set HTX_SL_F_BODYLESS flag with a log-format
body
BUG/MINOR: http-check: Skip C-L header for empty body when it's not
mandatory
BUG/MINOR: http-ana: Don't increment conn_retries counter before the L7
retry
BUG/MINOR: http-ana: Do a L7 retry on read error if there is no response
BUG/MEDIUM: http-ana: Don't close request side when waiting for response
BUG/MINOR: mxu-h1: Report a parsing error on abort with pending data
Frédéric Lécaille (38):
BUG/MINOR: quic: Possible unexpected counter incrementation on send*()
errors
MINOR: quic: Add new traces about by connection RX buffer handling
MINOR: quic: Move code to wakeup the timer task to avoid anti-amplication
deadlock
BUG/MINOR: quic: Really cancel the connection timer from qc_set_timer()
MINOR: quic: Simplication for qc_set_timer()
MINOR: quic: Kill the connections on ICMP (port unreachable) packet
receipt
MINOR: quic: Add traces to qc_kill_conn()
MINOR: quic: Make qc_dgrams_retransmit() return a status.
BUG/MINOR: quic: Missing call to task_queue() in qc_idle_timer_do_rearm()
MINOR: quic: Add a trace to identify connections which sent Initial
packet.
MINOR: quic: Add <pto_count> to the traces
BUG/MINOR: quic: Do not probe with too little Initial packets
BUG/MINOR: quic: Wrong initialization for io_cb_wakeup boolean
BUG/MINOR: quic: Do not drop too small datagrams with Initial packets
BUG/MINOR: quic: Missing padding for short packets
BUG/MEDIUM: quic: Missing TX buffer draining from qc_send_ppkts()
BUILD: quic: 32-bits compilation issue with %zu in quic_rx_pkts_del()
BUILD: thead: Fix several 32 bits compilation issues with uint64_t
variables
BUG/MINOR: quic: Do not send too small datagrams (with Initial packets)
MINOR: quic: Add a BUG_ON_HOT() call for too small datagrams
BUG/MINOR: quic: Ensure to be able to build datagrams to be retransmitted
BUG/MINOR: quic: v2 Initial packets decryption failed
MINOR: quic: Add traces about QUIC TLS key update
BUG/MINOR: quic: Remove force_ack for Initial,Handshake packets
BUG/MINOR: quic: Ensure not to retransmit packets with no ack-eliciting
frames
BUG/MINOR: quic: Do not resend already acked frames
BUG/MINOR: quic: Missing detections of amplification limit reached
MINOR: quic: Send PING frames when probing Initial packet number space
MINOR: quic: Do not accept wrong active_connection_id_limit values
MINOR: quic: Store the next connection IDs sequence number in the
connection
MINOR: quic: Typo fix for ACK_ECN frame
MINOR: quic: RETIRE_CONNECTION_ID frame handling (RX)
MINOR: quic: Useless TLS context allocations in qc_do_rm_hp()
MINOR: quic: Add spin bit support
MINOR: quic: Add transport parameters to "show quic"
BUG/MINOR: quic: Wrong RETIRE_CONNECTION_ID sequence number check
MINOR: quic: Do not stress the peer during retransmissions of lost packets
BUG/MINOR: quic: Missing listener accept queue tasklet wakeups
Michael Prokop (1):
DOC/CLEANUP: fix typos
Oto Valek (2):
BUG/MINOR: http-fetch: recognize IPv6 addresses in square brackets in
req.hdr_ip()
REGTEST: added tests covering smp_fetch_hdr_ip()
Remi Tricot-Le Breton (21):
BUG/MINOR: cache: Cache response even if request has "no-cache" directive
BUG/MINOR: cache: Check cache entry is complete in case of Vary
MINOR: ssl: Destroy ocsp update http_client during cleanup
MINOR: ssl: Reinsert ocsp update entries later in case of unknown error
MINOR: ssl: Add ocsp update success/failure counters
MINOR: ssl: Store specific ocsp update errors in response and update ctx
MINOR: ssl: Add certificate's path to certificate_ocsp structure
MINOR: ssl: Add 'show ssl ocsp-updates' CLI command
MINOR: ssl: Add sample fetches related to OCSP update
MINOR: ssl: Use dedicated proxy and log-format for OCSP update
MINOR: ssl: Reorder struct certificate_ocsp members
MINOR: ssl: Increment OCSP update replay delay in case of failure
MINOR: ssl: Add way to dump ocsp response in base64
MINOR: ssl: Add global options to modify ocsp update min/max delay
REGTESTS: ssl: Fix ocsp update crt-lists
REGTESTS: ssl: Add test for new ocsp update cli commands
MINOR: ssl: Add ocsp-update information to "show ssl crt-list"
BUG/MINOR: ssl: Fix ocsp-update when using "add ssl crt-list"
MINOR: ssl: Replace now.tv_sec with date.tv_sec in ocsp update task
BUG/MINOR: ssl: Use 'date' instead of 'now' in ocsp stapling callback
MINOR: jwt: Add support for RSA-PSS signatures (PS256 algorithm)
Sébastien Gross (1):
MINOR: config: add HAPROXY_BRANCH environment variable
William Lallemand (8):
MINOR: ssl: rename confusing ssl_bind_kws
BUG/MINOR: config: crt-list keywords mistaken for bind ssl keywords
BUG/MINOR: mworker: prevent incorrect values in uptime
BUG/MINOR: mworker: stop doing strtok directly from the env
BUG/MEDIUM: mworker: prevent inconsistent reload when upgrading from old
versions
BUG/MEDIUM: mworker: don't register mworker_accept_wrapper() when master
FD is wrong
MINOR: startup: HAPROXY_STARTUP_VERSION contains the version used to start
BUG/MINOR: mworker: use MASTER_MAXCONN as default maxconn value
Willy Tarreau (23):
BUG/MEDIUM: wdt: fix wrong thread being checked for sleeping
BUG/MINOR: sched: properly report long_rq when tasks remain in the queue
BUG/MEDIUM: sched: allow a bit more TASK_HEAVY to be processed when needed
MINOR: threads: add flags to know if a thread is started and/or running
MINOR: mux-h2/traces: do not log h2s pointer for dummy streams
MINOR: mux-h2/traces: add a missing TRACE_LEAVE() in
h2s_frt_handle_headers()
MINOR: compiler: add a TOSTR() macro to turn a value into a string
BUG/MINOR: ring: do not realign ring contents on resize
MEDIUM: ring: make the offset relative to the head/tail instead of
absolute
CLEANUP: ring: remove the now unused ring's offset
BUG/MINOR: fd: used the update list from the fd's group instead of tgid
BUG/MEDIUM: fd: make fd_delete() support being called from a different
group
CLEANUP: listener: only store conn counts for local threads
MINOR: tinfo: make thread_set functions return nth group/mask instead of
first
BUG/MAJOR: fd/thread: fix race between updates and closing FD
MINOR: fd/cli: report the polling mask in "show fd"
CLEANUP: sock: always perform last connection updates before wakeup
BUG/MINOR: init: properly detect NUMA bindings on large systems
BUG/MINOR: thread: report thread and group counts in the correct order
BUG/MAJOR: fd/threads: close a race on closing connections after takeover
MINOR: debug: add random delay injection with "debug dev delay-inj"
MINOR: quic_sock: un-statify quic_conn_sock_fd_iocb()
DOC: config: fix typo "dependeing" in bind thread description
---