Re: [PR] [improve](streaming-job) make from-to streaming task timeout progress-aware [doris]

via GitHub Tue, 16 Jun 2026 19:41:58 -0700


liaoxin01 commented on PR #64301:
URL: https://github.com/apache/doris/pull/64301#issuecomment-4725407818


   Suggestion on the timeout path: `processTimeoutTasks` calls 
`fetchProgress()` (a blocking brpc call to cdc_client) on **every** tick, but 
the progress only matters once the task already looks timed out. It would be 
better to do the cheap local check first and fetch only to confirm:
   
   ```
   // Cheap local check: now - lastProgressMs <= timeoutMs, no RPC.
   if (!runningMultiTask.budgetExceeded()) {
       return;
   }
   // Fetch only at the boundary.
   StreamingTaskProgress progress = runningMultiTask.fetchProgress();
   // then under writeLock: isTimeout(progress) -> renew or kill
   ```
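
To make the intended behavior concrete, here is a minimal, self-contained sketch of the lazy check. It is not the PR's code: `LazyTimeoutCheck`, `ProgressFetcher`, `Outcome`, and `tick` are hypothetical names, the clock is passed in explicitly so the logic can be exercised without sleeping, and the real code would do the renew-or-kill step under the write lock.

```java
/** Hypothetical sketch of the lazy timeout check; not the actual Doris classes. */
final class LazyTimeoutCheck {
    /** Stand-in for the blocking RPC; returns the remote last-progress timestamp in ms. */
    interface ProgressFetcher {
        long fetchLastProgressMs();
    }

    enum Outcome { SKIPPED, RENEWED, TIMED_OUT }

    private final long timeoutMs;
    private final ProgressFetcher fetcher;
    private long lastProgressMs;
    private int rpcCount = 0;

    LazyTimeoutCheck(long timeoutMs, long startMs, ProgressFetcher fetcher) {
        this.timeoutMs = timeoutMs;
        this.lastProgressMs = startMs;
        this.fetcher = fetcher;
    }

    /** Cheap local check, no RPC involved. */
    boolean budgetExceeded(long nowMs) {
        return nowMs - lastProgressMs > timeoutMs;
    }

    /** One scheduler tick: the RPC is issued only when the local budget is exceeded. */
    Outcome tick(long nowMs) {
        if (!budgetExceeded(nowMs)) {
            return Outcome.SKIPPED;
        }
        rpcCount++;
        long remote = fetcher.fetchLastProgressMs();
        if (remote > lastProgressMs) {
            lastProgressMs = remote; // progress was made, so renew the budget
        }
        return budgetExceeded(nowMs) ? Outcome.TIMED_OUT : Outcome.RENEWED;
    }

    int rpcCount() {
        return rpcCount;
    }
}
```

With a 300s timeout and a 10s tick, the first 30 ticks return `SKIPPED` without touching the fetcher; the RPC happens only on the tick that crosses the boundary.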
   
   Why it matters: the running tick fires every `max_interval` (default 
**10s**) on the shared `insert-task-execute` pool 
(`job_insert_task_consumer_thread_num`, default **10**, shared by all 
INSERT/streaming jobs), and each RPC blocks for up to 
`streaming_cdc_light_rpc_timeout_sec` (**90s**). With the current code, every 
running job issues an unconditional `fetchProgress` every 10s, and 
`detectTaskFailure` adds a second unconditional `getFailReason` RPC per tick. 
If cdc_client degrades, a single tick can hold a pool thread for ~180s (two 
RPCs at up to 90s each), and enough such jobs can starve the whole pool.
   
   With the lazy check, `fetchProgress` drops from once every 10s to at most 
once per timeout window (>=300s). Consider throttling the `getFailReason` call 
in `detectTaskFailure` similarly (e.g. run it every N ticks, or piggyback the 
fail reason on the existing `fetchMeta` response) so that the two RPCs do not 
stack on that bounded pool.
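
As an illustration of the "every N ticks" option only, a throttle could look roughly like the sketch below. `TickThrottle` and `shouldRun` are hypothetical names, not existing Doris code, and the sketch assumes it is called from a single scheduler thread.

```java
/** Hypothetical every-N-ticks gate for an expensive per-tick call such as getFailReason. */
final class TickThrottle {
    private final int everyNTicks;
    private long tickCount = 0;

    TickThrottle(int everyNTicks) {
        if (everyNTicks < 1) {
            throw new IllegalArgumentException("everyNTicks must be >= 1");
        }
        this.everyNTicks = everyNTicks;
    }

    /** Returns true on the first tick and on every N-th tick after it. */
    boolean shouldRun() {
        return (tickCount++ % everyNTicks) == 0;
    }
}
```

The caller would wrap the RPC in `if (throttle.shouldRun()) { ... }`. The trade-off is detection latency: with N = 6 and a 10s tick, a failure could go unnoticed for up to about 60s, which is why piggybacking on `fetchMeta` may be the better option if that response can carry the fail reason.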


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

