Copilot commented on code in PR #61693:
URL: https://github.com/apache/doris/pull/61693#discussion_r2985439678


##########
be/src/exec/scan/olap_scanner.cpp:
##########
@@ -640,6 +640,11 @@ Status OlapScanner::close(RuntimeState* state) {
 }
 
 void OlapScanner::update_realtime_counters() {
+    if (!_has_prepared) {
+        // Counter update need prepare successfully, or it maybe core. For 
example, olap scanner
+        // will open tablet reader during prepare, if not prepare 
successfully, tablet reader == nullptr.

Review Comment:
   Comment says “it maybe core”, which is ambiguous; if the intent is “it may 
crash / cause a core dump”, consider wording it explicitly to avoid confusion 
for readers unfamiliar with the term.
   ```suggestion
           // Counter update requires successful preparation; otherwise it may 
crash (e.g. core dump).
           // For example, olap scanner will open tablet reader during prepare; 
if prepare does not
           // succeed, tablet reader == nullptr.
   ```



##########
be/src/exec/scan/scanner_scheduler.cpp:
##########
@@ -152,28 +152,36 @@ void 
ScannerScheduler::_scanner_scan(std::shared_ptr<ScannerContext> ctx,
         Thread::set_thread_nice_value();
     }
 #endif
+
+    // we set and get counter according below order, to make sure the counter 
is updated before get_block, and the time of get_block is recorded in the 
counter.
+    // 1. update_wait_worker_timer to make sure the time of waiting for worker 
thread is recorded in the timer
+    // 2. start_scan_cpu_timer to make sure the cpu timer include the time of 
open and get_block, which is the real cpu time of scanner
+    // 3. update_scan_cpu_timer when defer, to make sure the cpu timer include 
the time of open and get_block, which is the real cpu time of scanner
+    // 4. start_wait_worker_timer when defer, to make sure the time of waiting 
for worker thread is recorded in the timer
+
     MonotonicStopWatch max_run_time_watch;
     max_run_time_watch.start();
     scanner->update_wait_worker_timer();
     scanner->start_scan_cpu_timer();
 
-    // Counter update need prepare successfully, or it maybe core. For 
example, olap scanner
-    // will open tablet reader during prepare, if not prepare successfully, 
tablet reader == nullptr.
-    bool need_update_profile = scanner->has_prepared();
+    bool need_update_profile = true;
     auto update_scanner_profile = [&]() {
         if (need_update_profile) {
             scanner->update_scan_cpu_timer();
             scanner->update_realtime_counters();
             need_update_profile = false;
         }
     };
-    Defer defer_scanner([&] {
-        // WorkloadGroup Policy will check cputime realtime, so that should 
update the counter
-        // as soon as possible, could not update it on close.
-        update_scanner_profile();
-    });
+
     Status status = Status::OK();
     bool eos = false;
+    Defer defer_scanner([&] {
+        if (status.ok() && !eos) {
+            // if status is not ok, it means the scanner is failed, and the 
counter may be not updated correctly, so no need to update counter again. if 
eos is true, it means the scanner is finished successfully, and the counter is 
updated correctly, so no need to update counter again.

Review Comment:
   `defer_scanner` no longer calls `update_scan_cpu_timer()` (or 
`update_realtime_counters()`), so CPU time and realtime counters are only 
updated on the final `eos` path. For scanners that yield blocks and get 
rescheduled, this drops CPU usage from earlier slices and can break 
workload-group realtime accounting. Consider calling `update_scanner_profile()` 
in the defer (and keep `start_wait_worker_timer()` conditional) so each scan 
slice records its CPU/counters before rescheduling.
   ```suggestion
               // if status is not ok, it means the scanner is failed, and the 
counter may be not updated correctly, so no need to update counter again. if 
eos is true, it means the scanner is finished successfully, and the counter is 
updated correctly, so no need to update counter again.
               update_scanner_profile();
   ```



##########
be/src/exec/scan/scanner_scheduler.cpp:
##########
@@ -152,28 +152,36 @@ void 
ScannerScheduler::_scanner_scan(std::shared_ptr<ScannerContext> ctx,
         Thread::set_thread_nice_value();
     }
 #endif
+
+    // we set and get counter according below order, to make sure the counter 
is updated before get_block, and the time of get_block is recorded in the 
counter.
+    // 1. update_wait_worker_timer to make sure the time of waiting for worker 
thread is recorded in the timer
+    // 2. start_scan_cpu_timer to make sure the cpu timer include the time of 
open and get_block, which is the real cpu time of scanner
+    // 3. update_scan_cpu_timer when defer, to make sure the cpu timer include 
the time of open and get_block, which is the real cpu time of scanner
+    // 4. start_wait_worker_timer when defer, to make sure the time of waiting 
for worker thread is recorded in the timer

Review Comment:
   The new timing-order comment is hard to read and has several grammar issues 
(e.g. “according below order”, “cpu timer include”). Consider 
rewording/splitting it into shorter sentences so the intended sequencing is 
unambiguous for future maintainers.
   ```suggestion
       // The following timer operations must be performed in this exact order 
to ensure
       // that all scanner time is correctly accounted:
       // 1. Call update_wait_worker_timer() before get_block() so the time 
spent waiting
       //    for a worker thread is recorded in the timer.
       // 2. Call start_scan_cpu_timer() so the CPU timer includes the time of 
open() and
       //    get_block(), which together represent the scanner's actual CPU 
time.
       // 3. In the defer block, call update_scan_cpu_timer() so the CPU timer 
still
       //    includes the CPU time used by open() and get_block().
       // 4. In the defer block, call start_wait_worker_timer() so any 
subsequent time
       //    spent waiting for a worker thread is also recorded in the timer.
   ```



##########
be/src/exec/scan/scanner_scheduler.cpp:
##########
@@ -152,28 +152,36 @@ void 
ScannerScheduler::_scanner_scan(std::shared_ptr<ScannerContext> ctx,
         Thread::set_thread_nice_value();
     }
 #endif
+
+    // we set and get counter according below order, to make sure the counter 
is updated before get_block, and the time of get_block is recorded in the 
counter.
+    // 1. update_wait_worker_timer to make sure the time of waiting for worker 
thread is recorded in the timer
+    // 2. start_scan_cpu_timer to make sure the cpu timer include the time of 
open and get_block, which is the real cpu time of scanner
+    // 3. update_scan_cpu_timer when defer, to make sure the cpu timer include 
the time of open and get_block, which is the real cpu time of scanner
+    // 4. start_wait_worker_timer when defer, to make sure the time of waiting 
for worker thread is recorded in the timer
+
     MonotonicStopWatch max_run_time_watch;
     max_run_time_watch.start();
     scanner->update_wait_worker_timer();
     scanner->start_scan_cpu_timer();
 
-    // Counter update need prepare successfully, or it maybe core. For 
example, olap scanner
-    // will open tablet reader during prepare, if not prepare successfully, 
tablet reader == nullptr.
-    bool need_update_profile = scanner->has_prepared();
+    bool need_update_profile = true;
     auto update_scanner_profile = [&]() {
         if (need_update_profile) {
             scanner->update_scan_cpu_timer();
             scanner->update_realtime_counters();
             need_update_profile = false;

Review Comment:
   `need_update_profile` was changed to always `true`, and 
`update_scanner_profile()` calls `scanner->update_realtime_counters()`. If 
`prepare()` fails, `eos` becomes true and the `eos` block still invokes 
`update_scanner_profile()`, which can dereference uninitialized members in 
scanners that don’t guard their realtime counters (e.g. 
`FileScanner::update_realtime_counters()` uses 
`_file_reader_stats`/`_file_cache_statistics`). Either restore the 
`scanner->has_prepared()` gating here, or make all `update_realtime_counters()` 
implementations safe when not prepared (similar to the new OlapScanner guard).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to