andygrove opened a new issue, #3873:
URL: https://github.com/apache/datafusion-comet/issues/3873

   ## Describe the bug
   
   When Spark's memory manager is under pressure and calls `spill()` on Comet's
   `NativeMemoryConsumer`, the call returns 0, so Spark cannot reclaim any memory
   from Comet's native operators. This prevents cross-task memory eviction and
   causes Comet to require significantly more off-heap memory than necessary at
   scale.
   
   From `CometTaskMemoryManager.java`:
   
   ```java
   private class NativeMemoryConsumer extends MemoryConsumer {
       public long spill(long size, MemoryConsumer trigger) {
           return 0; // No spilling
       }
   }
   ```
   
   ## Impact
   
   When multiple concurrent tasks share a constrained off-heap pool (e.g., 16
   tasks sharing 16 GB), each task's shuffle writer greedily buffers data until
   `try_grow()` fails. Since Spark cannot reclaim memory from any of them, tasks
   compete for the shared pool without coordination, which leads to OOM errors at
   lower memory settings.
   
   Benchmarking TPC-H SF100 Q9 with `local[4]` showed Comet's memory usage growing
   elastically with the off-heap budget (a 448 MB increase when going from 4g to
   8g), while Spark's native operators stayed flat because they participate in
   the spill protocol.
   
   ## Expected behavior
   
   `NativeMemoryConsumer.spill()` should signal the native memory pool to apply 
backpressure, causing DataFusion operators (Sort, Aggregate, Shuffle) to spill 
their internal state to disk. The actual bytes freed should be returned to 
Spark.
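
The contract described above can be illustrated with a toy model. This is only a sketch: the `Consumer` trait, the `NonSpilling` and `Spilling` types, and the `reclaim` function are hypothetical names invented for illustration and do not correspond to Spark's or Comet's actual APIs. It contrasts a consumer whose `spill` always returns 0 with one that reports the bytes it actually freed.

```rust
/// Toy stand-in for a Spark-style memory consumer (hypothetical, for illustration).
pub trait Consumer {
    /// Try to free up to `size` bytes and return the bytes actually freed.
    fn spill(&mut self, size: usize) -> usize;
}

/// Models the current behavior: holds memory but never gives any back.
pub struct NonSpilling {
    pub held: usize,
}

impl Consumer for NonSpilling {
    fn spill(&mut self, _size: usize) -> usize {
        0 // nothing is released, whatever the caller asks for
    }
}

/// Models the expected behavior: releases buffered state and reports the amount.
pub struct Spilling {
    pub held: usize,
}

impl Consumer for Spilling {
    fn spill(&mut self, size: usize) -> usize {
        let freed = size.min(self.held);
        self.held -= freed;
        freed
    }
}

/// Ask each consumer in turn to spill until `needed` bytes have been reclaimed.
/// Returns the total number of bytes reclaimed, which may be less than `needed`.
pub fn reclaim(consumers: &mut [Box<dyn Consumer>], needed: usize) -> usize {
    let mut reclaimed = 0;
    for consumer in consumers.iter_mut() {
        if reclaimed >= needed {
            break;
        }
        reclaimed += consumer.spill(needed - reclaimed);
    }
    reclaimed
}
```

With only non-spilling consumers, `reclaim` returns 0 no matter how much memory they hold. As soon as a consumer reports real freed bytes, the caller can make progress.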
   
   ## Proposed approach
   
   Add a `SpillState` struct with atomics and a condvar for cross-thread 
coordination:
   
   1. When Spark calls `spill(size)`, call into native code over JNI to set spill pressure.
   2. The pool's `try_grow()` checks the pressure and returns `ResourcesExhausted`.
   3. DataFusion operators catch this error and spill internally.
   4. As operators call `shrink()`, the freed bytes are tracked and returned to Spark.
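
The four steps above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the proposed implementation: the fields and methods of `SpillState`, the `Pool` type, the `PoolError` enum, and the timeout parameter are all hypothetical, and the JNI entry point is represented by a plain `request_spill` call. DataFusion's real `MemoryPool` trait and error types are deliberately left out so that the example stays self-contained.

```rust
use std::sync::atomic::{AtomicUsize, Ordering::SeqCst};
use std::sync::{Arc, Condvar, Mutex};
use std::time::Duration;

/// Hypothetical state shared by the JNI spill entry point and the pool.
#[derive(Default)]
pub struct SpillState {
    pressure: AtomicUsize, // bytes Spark asked for; 0 means no pressure
    freed: AtomicUsize,    // bytes released while pressure was set
    lock: Mutex<()>,       // pairs with `cond` to wake the waiting caller
    cond: Condvar,
}

impl SpillState {
    /// Step 1: set pressure, wait up to `timeout` for operators to release
    /// memory, then clear pressure and return the bytes actually freed.
    pub fn request_spill(&self, size: usize, timeout: Duration) -> usize {
        self.freed.store(0, SeqCst);
        self.pressure.store(size, SeqCst);
        let guard = self.lock.lock().unwrap();
        let (guard, _timed_out) = self
            .cond
            .wait_timeout_while(guard, timeout, |_| self.freed.load(SeqCst) < size)
            .unwrap();
        drop(guard);
        self.pressure.store(0, SeqCst);
        self.freed.swap(0, SeqCst)
    }

    /// Step 2 helper: true while Spark is waiting for memory to be returned.
    pub fn under_pressure(&self) -> bool {
        self.pressure.load(SeqCst) > 0
    }

    /// Step 4 helper: count released bytes and wake the waiting caller.
    pub fn record_freed(&self, bytes: usize) {
        if self.under_pressure() {
            self.freed.fetch_add(bytes, SeqCst);
            // Take the lock before notifying so the wakeup cannot be lost.
            let _guard = self.lock.lock().unwrap();
            self.cond.notify_all();
        }
    }
}

#[derive(Debug, PartialEq)]
pub enum PoolError {
    ResourcesExhausted,
}

/// Hypothetical fixed-size pool that consults the spill state.
pub struct Pool {
    pub limit: usize,
    pub used: AtomicUsize,
    pub spill: Arc<SpillState>,
}

impl Pool {
    /// Step 2: refuse new reservations while spill pressure is set.
    pub fn try_grow(&self, bytes: usize) -> Result<(), PoolError> {
        if self.spill.under_pressure() {
            return Err(PoolError::ResourcesExhausted);
        }
        let prev = self.used.fetch_add(bytes, SeqCst);
        if prev + bytes > self.limit {
            self.used.fetch_sub(bytes, SeqCst);
            return Err(PoolError::ResourcesExhausted);
        }
        Ok(())
    }

    /// Step 4: release a reservation and report it to the spill state.
    pub fn shrink(&self, bytes: usize) {
        self.used.fetch_sub(bytes, SeqCst);
        self.spill.record_freed(bytes);
    }
}
```

Step 3 is the operator's side: on `ResourcesExhausted` it writes its state to disk and calls `shrink()` with the bytes it held. The timeout is an assumption added here so that the thread calling `request_spill` cannot block forever if no operator reacts. In that case it returns whatever was freed so far, possibly 0.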
   
   See `docs/source/contributor-guide/memory-management.md` for full analysis 
including comparison with Gluten's approach.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

