When LTO is enabled, GCC inlines through the entire soring call chain
from test code into the ring element copy functions. With always_inline,
the compiler is forced to inline __rte_ring_dequeue_elems_128(), which
copies 32 bytes (two 16-byte elements) per memcpy() call. GCC's static
analysis then warns about a potential buffer overflow because it cannot
prove that the 128-bit element path is unreachable when the ring is
configured for 4-byte elements:

  warning: writing 32 bytes into a region of size 0 [-Wstringop-overflow=]

By using plain inline instead of always_inline on the soring enqueue
and dequeue functions, the compiler regains discretion over inlining
decisions. This introduces an analysis boundary that prevents GCC from
connecting the test's buffer sizes to the unreachable 128-bit code path,
eliminating the false positive warning.
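
The shape of the problem can be sketched in standalone C. This is only an
illustration, not the DPDK code: every demo_* name is hypothetical, the
macro is a stand-in for __rte_always_inline, and whether GCC actually
emits the warning depends on compiler version, optimization level, and
LTO settings.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for DPDK's __rte_always_inline (GCC/Clang). */
#define demo_always_inline inline __attribute__((always_inline))

/* A 16-byte (128-bit) element. */
typedef struct { uint64_t lo, hi; } demo_u128;

/* 128-bit path: copies two elements (32 bytes) per memcpy() call. */
static inline void
demo_copy_elems_128(void *dst, const void *src, uint32_t n)
{
	demo_u128 *d = dst;
	const demo_u128 *s = src;
	uint32_t i;

	for (i = 0; i < (n & ~1u); i += 2)
		memcpy(d + i, s + i, 32);
	if (n & 1)
		memcpy(d + i, s + i, 16);
}

/* Run-time dispatch on element size; only one branch is ever valid
 * for a given ring, but the compiler may not be able to prove it. */
static inline void
demo_copy_elems(void *dst, const void *src, uint32_t esize, uint32_t n)
{
	if (esize == 16)
		demo_copy_elems_128(dst, src, n);
	else
		memcpy(dst, src, (size_t)n * esize);
}

/* Variant A: inlining is forced, so the caller's buffer sizes become
 * visible to the analysis of every branch, reachable or not. */
static demo_always_inline uint32_t
demo_dequeue_forced(void *objs, const void *ring, uint32_t esize, uint32_t n)
{
	demo_copy_elems(objs, ring, esize, n);
	return n;
}

/* Variant B: plain inline, so the compiler decides whether to inline. */
static inline uint32_t
demo_dequeue_plain(void *objs, const void *ring, uint32_t esize, uint32_t n)
{
	demo_copy_elems(objs, ring, esize, n);
	return n;
}
```

Both variants behave identically at run time; for example,
demo_dequeue_plain(dst, src, 4, 4) with a uint32_t dst[4] copies 16 bytes
through the generic path. The only difference is how much freedom the
compiler has when deciding what to inline.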

Performance impact is expected to be negligible. At -O2/-O3, the
compiler will still inline these small, hot functions based on its
own heuristics. The difference only matters in debug builds or with
-Os, where slightly less aggressive inlining is acceptable.

Signed-off-by: Stephen Hemminger <[email protected]>
---
 lib/ring/soring.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/ring/soring.c b/lib/ring/soring.c
index 797484d6bf..3b90521bdb 100644
--- a/lib/ring/soring.c
+++ b/lib/ring/soring.c
@@ -249,7 +249,7 @@ __rte_soring_stage_move_head(struct soring_stage_headtail *d,
        return n;
 }
 
-static __rte_always_inline uint32_t
+static inline uint32_t
 soring_enqueue(struct rte_soring *r, const void *objs,
        const void *meta, uint32_t n, enum rte_ring_queue_behavior behavior,
        uint32_t *free_space)
@@ -278,7 +278,7 @@ soring_enqueue(struct rte_soring *r, const void *objs,
        return n;
 }
 
-static __rte_always_inline uint32_t
+static inline uint32_t
 soring_dequeue(struct rte_soring *r, void *objs, void *meta,
        uint32_t num, enum rte_ring_queue_behavior behavior,
        uint32_t *available)
-- 
2.51.0
