praveenc7 commented on PR #18262: URL: https://github.com/apache/pinot/pull/18262#issuecomment-4436671906
> Thanks for the contribution Praveen — the four-class layering is clean and the test coverage for null/non-null round-trips is thorough. Before this merges I'd like to discuss two things: the overall merge strategy, and the ownership model for off-heap memory. > > **Merge strategy** > > Small, reviewable PRs are the right way to develop this incrementally, and we're happy to review them that way. However, we'd prefer not to merge partial Arrow infrastructure into `apache/pinot` master until the design is complete and proven end-to-end. Dead code in master (classes that exist but nothing uses) creates maintenance burden and makes it harder to reason about the codebase. > > A better workflow would be: > > * Create a long-lived integration branch — either in `apache/pinot` or your own fork > * Stack your incremental PRs against that branch so each one is reviewable in isolation > * Open a single draft PR from that branch into `apache/pinot` master, so we have full visibility into the end state > > This way we get the review ergonomics of small PRs without shipping partial infrastructure to master prematurely. > > **Ownership model: off-heap memory requires explicit release** > > With heap-backed blocks (`RowHeapDataBlock`, `SerializedDataBlock`) the GC handles reclamation automatically. Arrow blocks are different — the off-heap memory is invisible to the GC and must be explicitly freed. This means the design needs a clear answer to: who releases each block, and when? > > **Why a simple single-owner model is insufficient** > > Most of the operator pipeline is fine with single-owner linear transfer — a block is consumed, rows extracted, and then released. The case that breaks this is the **local broadcast / singleton exchange**: the same block is enqueued into multiple local mailbox queues without copying. The first consumer to release the block would free memory that other consumers are still reading. > > Note: for hashing distribution this is not a problem — new disjoint blocks are created per bucket, each with a single owner. > > **Proposed ownership model: retain / release** > > The right primitive here is reference counting: > > * Initial refcount = 1 (the producer holds the first reference) > * `retain()` — increments refcount; throws if already 0 (use-after-free guard) > * `release()` — decrements; frees the underlying `VectorSchemaRoot` only when it reaches 0 > > A few notes on API design: > > * The primary API should be `retain()` / `release()`, not `close()`. `close()` implies "destroy unconditionally", which conflicts with ref counting semantics. > * I'd recommend **not** implementing `AutoCloseable` on `ArrowBlock`. `try-with-resources` calls `close()` unconditionally regardless of refcount, and callers that share a block would silently release it early. Making the release explicit forces the caller to reason about ownership rather than relying on scope. > * `VectorSchemaRoot` in Arrow Java doesn't have block-level ref counting built in; you'd implement it with an `AtomicInteger` on `ArrowBlock`, similar to Netty's `ReferenceCounted` pattern. > > **Mailbox implications** > > For the local mailbox send path, the sender would call `block.retain()` once per local recipient before enqueuing, then `release()` its own producer reference after all enqueues complete. The remote path serializes to bytes (a copy), so no retain is needed — the block can be released after serialization. Each consumer calls `release()` when done. > > This is worth designing explicitly before the operator and transport PRs are written, so the contract is clear from the start. > > Happy to discuss — wanted to raise this while the API surface is still small. Thanks for looking into this. I will get back to you by this week on the details -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
