morningman opened a new pull request, #61732:
URL: https://github.com/apache/doris/pull/61732

   ### What problem does this PR solve?
   
   Issue Number: close #xxx
   
   Problem Summary: When using `INSERT INTO local("backend_id" = "X" ...)`, the 
data should be written only to the BE node specified by `backend_id`. However, 
the Coordinator schedules the sink fragment to an arbitrary backend because the 
fragment is `UNPARTITIONED`, which causes 
`SimpleScheduler.getHost()` to pick any available BE. File creation then fails 
whenever the target directory exists only on the intended BE.
   
   **Root Cause:**
   - The read path (`SELECT FROM local(...)`) correctly handles this via 
`TVFScanNode.initBackendPolicy()`, restricting the scan to the specified 
backend.
   - The write path (`INSERT INTO local(...)`) had no equivalent logic. 
`PhysicalPlanTranslator.visitPhysicalTVFTableSink()` creates the fragment as 
`UNPARTITIONED`, and `Coordinator.computeFragmentHosts()` assigns it to an 
arbitrary BE.
   
   **Fix:**
   Added `backend_id`-aware scheduling in `Coordinator.computeFragmentHosts()` 
for the local `TVFTableSink`, forcing the sink fragment to execute on the 
designated backend. This is consistent with the existing `DictionarySink` 
pattern, which also overrides fragment scheduling for specific sink types.
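
The scheduling decision described above can be sketched roughly as follows. This is a simplified, self-contained illustration and not the actual `Coordinator` code: the `Backend` and `TvfSink` classes, the `chooseHost` method, the `-1` "not set" convention, and the choice to throw when the target backend is unavailable are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LocalSinkSchedulingSketch {
    // Hypothetical stand-in for a backend node; not the Doris Backend class.
    static final class Backend {
        final long id;
        final boolean alive;

        Backend(long id, boolean alive) {
            this.id = id;
            this.alive = alive;
        }
    }

    // Hypothetical view of the sink: TVF name plus parsed backend_id (-1 means "not set").
    static final class TvfSink {
        final String tvfName;
        final long backendId;

        TvfSink(String tvfName, long backendId) {
            this.tvfName = tvfName;
            this.backendId = backendId;
        }
    }

    // Pick the host for an UNPARTITIONED sink fragment.
    static Backend chooseHost(TvfSink sink, Map<Long, Backend> backends) {
        // Check first: a local TVF sink with an explicit backend_id is pinned to that backend.
        if (sink != null && "local".equals(sink.tvfName) && sink.backendId >= 0) {
            Backend target = backends.get(sink.backendId);
            if (target == null || !target.alive) {
                // Assumption: fail fast rather than silently fall back to another BE.
                throw new IllegalStateException("backend " + sink.backendId + " is not available");
            }
            return target;
        }
        // Fallback: first alive backend, standing in for the default "any available BE" choice.
        for (Backend be : backends.values()) {
            if (be.alive) {
                return be;
            }
        }
        throw new IllegalStateException("no alive backend");
    }

    public static void main(String[] args) {
        Map<Long, Backend> backends = new LinkedHashMap<>();
        backends.put(10001L, new Backend(10001L, true));
        backends.put(10002L, new Backend(10002L, true));
        Backend chosen = chooseHost(new TvfSink("local", 10002L), backends);
        System.out.println(chosen.id); // prints 10002
    }
}
```

Placing the check ahead of the default selection keeps the non-local sink types on their existing scheduling path.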
   
   **Changes:**
   1. `TVFTableSink.java` - Added `getTvfName()` and `getBackendId()` accessor 
methods
   2. `Coordinator.java` - Added a check before `UNPARTITIONED` scheduling: if the 
sink is a local `TVFTableSink` with a specific `backend_id`, force the fragment 
onto that backend
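
For illustration, accessors of this kind might look like the sketch below. Only the method names `getTvfName()` and `getBackendId()` and the `backend_id` property key come from this PR description; the class name, the properties map, the `-1` return value for a missing `backend_id`, and the error handling are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class TvfSinkAccessorsSketch {
    private final String tvfName;
    private final Map<String, String> properties;

    TvfSinkAccessorsSketch(String tvfName, Map<String, String> properties) {
        this.tvfName = tvfName;
        this.properties = new HashMap<>(properties);
    }

    public String getTvfName() {
        return tvfName;
    }

    // Returns the parsed backend_id, or -1 when the property is absent (assumed convention).
    public long getBackendId() {
        String raw = properties.get("backend_id");
        if (raw == null || raw.trim().isEmpty()) {
            return -1L;
        }
        try {
            return Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("backend_id must be an integer: " + raw, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("backend_id", "10002");
        TvfSinkAccessorsSketch sink = new TvfSinkAccessorsSketch("local", props);
        System.out.println(sink.getTvfName() + " -> " + sink.getBackendId()); // prints local -> 10002
    }
}
```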
   
   ### Release note
   
   Fixed `INSERT INTO local(...)` to correctly route writes to the backend 
specified by the `backend_id` parameter.
   
   ### Check List (For Author)
   
   - Test: Regression test (test_insert_into_local_tvf.groovy)
   - Behavior changed: No
   - Does this need documentation: No


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

