cijiugechu opened a new pull request, #2840:
URL: https://github.com/apache/iggy/pull/2840

   ## Which issue does this PR close?
   
   
   None.
   
   ## Rationale
   
   Avoid unnecessary O(n) lookup in `IggyMessagesBatch::last_offset()` to 
improve throughput.
   
   Bench (single run each; Apple M4, macOS 15.7.4; pinned-producer/TCP; 8 
producers, 8 streams; 1000B msgs; 1000 msgs/batch; total data 5GB):
   
   ### Results:
   
   Throughput delta: ~88.75Mb/s
   
   #### master: 
   
   <img width="3200" height="2400" alt="Throughput - Pinned Producer Benchmark (master)" src="https://github.com/user-attachments/assets/8de1f604-315a-4564-a37b-40b5664aca10" />
   
   
   #### PR: 
   <img width="3200" height="2400" alt="Throughput - Pinned Producer Benchmark (last_offset_o1)" src="https://github.com/user-attachments/assets/8c6d7fa3-d02a-4c4f-9634-9e1f24000dc6" />
   
   
   ## What changed?
   
   `IggyMessagesBatch::last_offset()` previously used `iter().last()`, which 
walks the whole batch and makes the call O(n) even though the position of the 
last element is known. In producer hot paths with large `messages_per_batch`, 
that extra scan is on the critical path.
   
   The method now reads the last message via indexed `get(count - 1)` (O(1)), 
with an iterator fallback only if the indexed lookup fails.
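
The before/after shape can be sketched roughly as follows. This is a minimal illustration, not the actual diff: `Batch`, `BatchIter`, the `offsets` field and `last_offset_scan` are hypothetical stand-ins, and only the names `last_offset`, `iter`, `get` and `count` are taken from the description above.

```rust
// Hypothetical stand-in for a message batch (NOT the real IggyMessagesBatch).
struct Batch {
    offsets: Vec<u64>, // one offset per message, in batch order
}

// Forward-only iterator. It implements only `next`, so the default
// `Iterator::last` has to consume every element to find the final one.
struct BatchIter<'a> {
    batch: &'a Batch,
    pos: usize,
}

impl<'a> Iterator for BatchIter<'a> {
    type Item = u64;

    fn next(&mut self) -> Option<u64> {
        let item = self.batch.get(self.pos)?;
        self.pos += 1;
        Some(item)
    }
}

impl Batch {
    fn count(&self) -> usize {
        self.offsets.len()
    }

    // Indexed access; returns None when the index is out of range.
    fn get(&self, index: usize) -> Option<u64> {
        self.offsets.get(index).copied()
    }

    fn iter(&self) -> BatchIter<'_> {
        BatchIter { batch: self, pos: 0 }
    }

    // Before: O(n), visits every message to reach the last one.
    fn last_offset_scan(&self) -> Option<u64> {
        self.iter().last()
    }

    // After: O(1) indexed lookup, falling back to the scan only if the
    // indexed lookup unexpectedly returns None.
    fn last_offset(&self) -> Option<u64> {
        let count = self.count();
        if count == 0 {
            return None;
        }
        self.get(count - 1).or_else(|| self.iter().last())
    }
}
```

Both methods return the same value for any batch; only the amount of work differs, which matters when the call sits on a per-batch hot path.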
   
   ## Local Execution
   
   - Passed
   - Pre-commit hooks ran 
   
   ## AI Usage
   
   None.
   

