Copilot commented on code in PR #61693:
URL: https://github.com/apache/doris/pull/61693#discussion_r2985439678
##########
be/src/exec/scan/olap_scanner.cpp:
##########
@@ -640,6 +640,11 @@ Status OlapScanner::close(RuntimeState* state) {
}
void OlapScanner::update_realtime_counters() {
+ if (!_has_prepared) {
+ // Counter update need prepare successfully, or it maybe core. For
example, olap scanner
+ // will open tablet reader during prepare, if not prepare
successfully, tablet reader == nullptr.
Review Comment:
Comment says “it maybe core”, which is ambiguous; if the intent is “it may
crash / cause a core dump”, consider wording it explicitly to avoid confusion
for readers unfamiliar with the term.
```suggestion
// Counter update requires successful preparation; otherwise it may
crash (e.g. core dump).
// For example, olap scanner will open tablet reader during prepare;
if prepare does not
// succeed, tablet reader == nullptr.
```
##########
be/src/exec/scan/scanner_scheduler.cpp:
##########
@@ -152,28 +152,36 @@ void
ScannerScheduler::_scanner_scan(std::shared_ptr<ScannerContext> ctx,
Thread::set_thread_nice_value();
}
#endif
+
+ // we set and get counter according below order, to make sure the counter
is updated before get_block, and the time of get_block is recorded in the
counter.
+ // 1. update_wait_worker_timer to make sure the time of waiting for worker
thread is recorded in the timer
+ // 2. start_scan_cpu_timer to make sure the cpu timer include the time of
open and get_block, which is the real cpu time of scanner
+ // 3. update_scan_cpu_timer when defer, to make sure the cpu timer include
the time of open and get_block, which is the real cpu time of scanner
+ // 4. start_wait_worker_timer when defer, to make sure the time of waiting
for worker thread is recorded in the timer
+
MonotonicStopWatch max_run_time_watch;
max_run_time_watch.start();
scanner->update_wait_worker_timer();
scanner->start_scan_cpu_timer();
- // Counter update need prepare successfully, or it maybe core. For
example, olap scanner
- // will open tablet reader during prepare, if not prepare successfully,
tablet reader == nullptr.
- bool need_update_profile = scanner->has_prepared();
+ bool need_update_profile = true;
auto update_scanner_profile = [&]() {
if (need_update_profile) {
scanner->update_scan_cpu_timer();
scanner->update_realtime_counters();
need_update_profile = false;
}
};
- Defer defer_scanner([&] {
- // WorkloadGroup Policy will check cputime realtime, so that should
update the counter
- // as soon as possible, could not update it on close.
- update_scanner_profile();
- });
+
Status status = Status::OK();
bool eos = false;
+ Defer defer_scanner([&] {
+ if (status.ok() && !eos) {
+ // if status is not ok, it means the scanner is failed, and the
counter may be not updated correctly, so no need to update counter again. if
eos is true, it means the scanner is finished successfully, and the counter is
updated correctly, so no need to update counter again.
Review Comment:
`defer_scanner` no longer calls `update_scan_cpu_timer()` (or
`update_realtime_counters()`), so CPU time and realtime counters are only
updated on the final `eos` path. For scanners that yield blocks and get
rescheduled, this drops CPU usage from earlier slices and can break
workload-group realtime accounting. Consider calling `update_scanner_profile()`
in the defer (and keep `start_wait_worker_timer()` conditional) so each scan
slice records its CPU/counters before rescheduling.
```suggestion
// if status is not ok, it means the scanner is failed, and the
counter may be not updated correctly, so no need to update counter again. if
eos is true, it means the scanner is finished successfully, and the counter is
updated correctly, so no need to update counter again.
update_scanner_profile();
```
##########
be/src/exec/scan/scanner_scheduler.cpp:
##########
@@ -152,28 +152,36 @@ void
ScannerScheduler::_scanner_scan(std::shared_ptr<ScannerContext> ctx,
Thread::set_thread_nice_value();
}
#endif
+
+ // we set and get counter according below order, to make sure the counter
is updated before get_block, and the time of get_block is recorded in the
counter.
+ // 1. update_wait_worker_timer to make sure the time of waiting for worker
thread is recorded in the timer
+ // 2. start_scan_cpu_timer to make sure the cpu timer include the time of
open and get_block, which is the real cpu time of scanner
+ // 3. update_scan_cpu_timer when defer, to make sure the cpu timer include
the time of open and get_block, which is the real cpu time of scanner
+ // 4. start_wait_worker_timer when defer, to make sure the time of waiting
for worker thread is recorded in the timer
Review Comment:
The new timing-order comment is hard to read and has several grammar issues
(e.g. “according below order”, “cpu timer include”). Consider
rewording/splitting it into shorter sentences so the intended sequencing is
unambiguous for future maintainers.
```suggestion
// The following timer operations must be performed in this exact order
to ensure
// that all scanner time is correctly accounted:
// 1. Call update_wait_worker_timer() before get_block() so the time
spent waiting
// for a worker thread is recorded in the timer.
// 2. Call start_scan_cpu_timer() so the CPU timer includes the time of
open() and
// get_block(), which together represent the scanner's actual CPU
time.
// 3. In the defer block, call update_scan_cpu_timer() so the CPU timer
still
// includes the CPU time used by open() and get_block().
// 4. In the defer block, call start_wait_worker_timer() so any
subsequent time
// spent waiting for a worker thread is also recorded in the timer.
```
##########
be/src/exec/scan/scanner_scheduler.cpp:
##########
@@ -152,28 +152,36 @@ void
ScannerScheduler::_scanner_scan(std::shared_ptr<ScannerContext> ctx,
Thread::set_thread_nice_value();
}
#endif
+
+ // we set and get counter according below order, to make sure the counter
is updated before get_block, and the time of get_block is recorded in the
counter.
+ // 1. update_wait_worker_timer to make sure the time of waiting for worker
thread is recorded in the timer
+ // 2. start_scan_cpu_timer to make sure the cpu timer include the time of
open and get_block, which is the real cpu time of scanner
+ // 3. update_scan_cpu_timer when defer, to make sure the cpu timer include
the time of open and get_block, which is the real cpu time of scanner
+ // 4. start_wait_worker_timer when defer, to make sure the time of waiting
for worker thread is recorded in the timer
+
MonotonicStopWatch max_run_time_watch;
max_run_time_watch.start();
scanner->update_wait_worker_timer();
scanner->start_scan_cpu_timer();
- // Counter update need prepare successfully, or it maybe core. For
example, olap scanner
- // will open tablet reader during prepare, if not prepare successfully,
tablet reader == nullptr.
- bool need_update_profile = scanner->has_prepared();
+ bool need_update_profile = true;
auto update_scanner_profile = [&]() {
if (need_update_profile) {
scanner->update_scan_cpu_timer();
scanner->update_realtime_counters();
need_update_profile = false;
Review Comment:
`need_update_profile` was changed to always `true`, and
`update_scanner_profile()` calls `scanner->update_realtime_counters()`. If
`prepare()` fails, `eos` becomes true and the `eos` block still invokes
`update_scanner_profile()`, which can dereference uninitialized members in
scanners that don’t guard their realtime counters (e.g.
`FileScanner::update_realtime_counters()` uses
`_file_reader_stats`/`_file_cache_statistics`). Either restore the
`scanner->has_prepared()` gating here, or make all `update_realtime_counters()`
implementations safe when not prepared (similar to the new OlapScanner guard).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]