Copilot commented on code in PR #61886:
URL: https://github.com/apache/doris/pull/61886#discussion_r3008504003
##########
regression-test/suites/cloud_p0/tablets/test_clean_stale_rs_file_cache.groovy:
##########
@@ -22,6 +22,11 @@ suite('test_clean_stale_rs_file_cache', 'docker') {
if (!isCloudMode()) {
return;
}
+
+ // Randomly enable or disable packed_file to test both scenarios
+ def enablePackedFile = new Random().nextBoolean()
Review Comment:
This makes the suite nondeterministic: `enable_packed_file` is chosen
randomly, so a given CI run may not exercise the packed-file path (the focus of
this PR), and any failure will be hard to reproduce locally. Prefer a
deterministic toggle (e.g., drive it from `context.config`/env var, or run the
suite twice with `enable_packed_file=true/false`).
```suggestion
// Deterministically enable or disable packed_file via environment
variable
def enablePackedFileEnv = System.getenv("DORIS_ENABLE_PACKED_FILE")
def enablePackedFile = (enablePackedFileEnv != null) ?
enablePackedFileEnv.toBoolean() : false
```
##########
regression-test/suites/cloud_p0/balance/test_warmup_rebalance.groovy:
##########
@@ -22,6 +22,11 @@ suite('test_warmup_rebalance_in_cloud', 'multi_cluster,
docker') {
if (!isCloudMode()) {
return;
}
+
+ // Randomly enable or disable packed_file to test both scenarios
+ def enablePackedFile = new Random().nextBoolean()
Review Comment:
This makes the suite nondeterministic: `enable_packed_file` is chosen
randomly, so a given CI run may not exercise the packed-file path (the focus of
this PR), and any failure will be hard to reproduce locally. Prefer a
deterministic toggle (e.g., drive it from `context.config`/env var, or run the
suite twice with `enable_packed_file=true/false`).
```suggestion
// Deterministically enable or disable packed_file, configurable via env
var
def enablePackedFileEnv = System.getenv("DORIS_ENABLE_PACKED_FILE")
def enablePackedFile = (enablePackedFileEnv != null ?
enablePackedFileEnv.toBoolean() : false)
```
##########
regression-test/suites/cloud_p0/cache/test_topn_broadcast.groovy:
##########
@@ -21,6 +21,9 @@ import org.apache.doris.regression.suite.ClusterOptions
suite("test_topn_broadcast", "docker") {
+ // Randomly enable or disable packed_file to test both scenarios
+ def enablePackedFile = new Random().nextBoolean()
Review Comment:
This makes the suite nondeterministic: `enable_packed_file` is chosen
randomly, so a given CI run may not exercise the packed-file path (the focus of
this PR), and any failure will be hard to reproduce locally. Prefer a
deterministic toggle (e.g., drive it from `context.config`/env var, or run the
suite twice with `enable_packed_file=true/false`).
```suggestion
// Determine enable_packed_file from environment for deterministic
behavior
def enablePackedFileEnv = System.getenv("ENABLE_PACKED_FILE")
def enablePackedFile = enablePackedFileEnv != null ?
Boolean.parseBoolean(enablePackedFileEnv) : false
```
##########
regression-test/suites/cloud_p0/tablets/test_clean_tablet_when_drop_force_table.groovy:
##########
@@ -22,6 +22,11 @@ suite('test_clean_tablet_when_drop_force_table', 'docker') {
if (!isCloudMode()) {
return;
}
+
+ // Randomly enable or disable packed_file to test both scenarios
+ def enablePackedFile = new Random().nextBoolean()
+ logger.info("Running test with enable_packed_file=${enablePackedFile}")
Review Comment:
This makes the suite nondeterministic: `enable_packed_file` is chosen
randomly, so a given CI run may not exercise the packed-file path (the focus of
this PR), and any failure will be hard to reproduce locally. Prefer a
deterministic toggle (e.g., drive it from `context.config`/env var, or run the
suite twice with `enable_packed_file=true/false`).
```suggestion
// Deterministically enable or disable packed_file via environment
variable.
// Set DORIS_ENABLE_PACKED_FILE=true/false in CI to exercise each path
explicitly.
def enablePackedFileEnv = System.getenv('DORIS_ENABLE_PACKED_FILE')
def enablePackedFile = enablePackedFileEnv != null ?
enablePackedFileEnv.toBoolean() : false
logger.info("Running test with enable_packed_file=${enablePackedFile}
(env: ${enablePackedFileEnv})")
```
##########
be/src/cloud/cloud_tablet.cpp:
##########
@@ -1683,10 +1693,17 @@ void
CloudTablet::_submit_inverted_index_download_task(const RowsetSharedPtr& rs
// clang-format off
const auto& rowset_meta = rs->rowset_meta();
auto self = std::dynamic_pointer_cast<CloudTablet>(shared_from_this());
+ // Use rowset_meta->fs() instead of storage_resource->fs to support packed
file for idx files.
+ auto file_system = rowset_meta->fs();
+ if (!file_system) {
+ LOG(WARNING) << "failed to get file system for tablet_id=" <<
_tablet_meta->tablet_id()
+ << ", rowset_id=" << rowset_meta->rowset_id();
+ return;
+ }
Review Comment:
Same concern as segment warmup: calling `rowset_meta->fs()` per
inverted-index download task may repeatedly rebuild the
PackedFileSystem/encryption wrapper. Consider reusing a single wrapped
filesystem instance per rowset across all downloads within the warmup loop.
##########
regression-test/suites/cloud_p0/balance/test_balance_warm_up_use_peer_cache.groovy:
##########
@@ -22,6 +22,11 @@ suite('test_balance_warm_up_use_peer_cache', 'docker') {
if (!isCloudMode()) {
return;
}
+
+ // Randomly enable or disable packed_file to test both scenarios
+ def enablePackedFile = new Random().nextBoolean()
+ logger.info("Running test with enable_packed_file=${enablePackedFile}")
Review Comment:
This makes the suite nondeterministic: `enable_packed_file` is chosen
randomly, so a given CI run may not exercise the packed-file path (the focus of
this PR), and any failure will be hard to reproduce locally. Prefer a
deterministic toggle (e.g., drive it from `context.config`/env var, or run the
suite twice with `enable_packed_file=true/false`).
```suggestion
// Determine enable_packed_file deterministically (env override with
default)
def enablePackedFileEnv = System.getenv("ENABLE_PACKED_FILE")
def enablePackedFile = enablePackedFileEnv != null ?
Boolean.parseBoolean(enablePackedFileEnv) : true
logger.info("Running test with enable_packed_file=${enablePackedFile}
(from env: ${enablePackedFileEnv})")
```
##########
regression-test/suites/cloud_p0/balance/test_balance_metrics.groovy:
##########
@@ -22,6 +22,11 @@ suite('test_balance_metrics', 'docker') {
if (!isCloudMode()) {
return;
}
+
+ // Randomly enable or disable packed_file to test both scenarios
+ def enablePackedFile = new Random().nextBoolean()
+ logger.info("Running test with enable_packed_file=${enablePackedFile}")
Review Comment:
This makes the suite nondeterministic: `enable_packed_file` is chosen
randomly, so a given CI run may not exercise the packed-file path (the focus of
this PR), and any failure will be hard to reproduce locally. Prefer a
deterministic toggle (e.g., drive it from `context.config`/env var, or run the
suite twice with `enable_packed_file=true/false`).
```suggestion
// Deterministically enable or disable packed_file based on environment
def enablePackedFileEnv = System.getenv("ENABLE_PACKED_FILE")
def enablePackedFile = (enablePackedFileEnv != null) ?
Boolean.parseBoolean(enablePackedFileEnv) : false
logger.info("Running test with enable_packed_file=${enablePackedFile},
env ENABLE_PACKED_FILE='${enablePackedFileEnv}'")
```
##########
regression-test/suites/cloud_p0/balance/test_balance_warm_up_with_compaction_use_peer_cache.groovy:
##########
@@ -22,6 +22,11 @@ suite('test_balance_warm_up_with_compaction_use_peer_cache',
'docker') {
if (!isCloudMode()) {
return;
}
+
+ // Randomly enable or disable packed_file to test both scenarios
+ def enablePackedFile = new Random().nextBoolean()
Review Comment:
This makes the suite nondeterministic: `enable_packed_file` is chosen
randomly, so a given CI run may not exercise the packed-file path (the focus of
this PR), and any failure will be hard to reproduce locally. Prefer a
deterministic toggle (e.g., drive it from `context.config`/env var, or run the
suite twice with `enable_packed_file=true/false`).
```suggestion
// Deterministically enable or disable packed_file, controlled via
environment variable.
// Use ENABLE_PACKED_FILE env var if set ("true"/"false"), otherwise
default to true.
def enablePackedFileEnv = System.getenv("ENABLE_PACKED_FILE")
def enablePackedFile = (enablePackedFileEnv == null) ? true :
enablePackedFileEnv.equalsIgnoreCase("true")
```
##########
regression-test/suites/cloud_p0/balance/test_alter_compute_group_properties.groovy:
##########
@@ -22,12 +22,20 @@ suite('test_alter_compute_group_properties', 'docker') {
if (!isCloudMode()) {
return;
}
+
+ // Randomly enable or disable packed_file to test both scenarios
+ def enablePackedFile = new Random().nextBoolean()
Review Comment:
This makes the suite nondeterministic: `enable_packed_file` is chosen
randomly, so a given CI run may not exercise the packed-file path (the focus of
this PR), and any failure will be hard to reproduce locally. Prefer a
deterministic toggle (e.g., drive it from `context.config`/env var, or run the
suite twice with `enable_packed_file=true/false`).
```suggestion
// Deterministically enable or disable packed_file, configurable via env
var
def enablePackedFileEnv = System.getenv('ENABLE_PACKED_FILE')
def enablePackedFile = enablePackedFileEnv != null ?
Boolean.parseBoolean(enablePackedFileEnv) : false
```
##########
regression-test/suites/cloud_p0/balance/test_balance_warm_up.groovy:
##########
@@ -22,6 +22,11 @@ suite('test_balance_warm_up', 'docker') {
if (!isCloudMode()) {
return;
}
+
+ // Randomly enable or disable packed_file to test both scenarios
+ def enablePackedFile = new Random().nextBoolean()
Review Comment:
This makes the suite nondeterministic: `enable_packed_file` is chosen
randomly, so a given CI run may not exercise the packed-file path (the focus of
this PR), and any failure will be hard to reproduce locally. Prefer a
deterministic toggle (e.g., drive it from `context.config`/env var, or run the
suite twice with `enable_packed_file=true/false`).
```suggestion
// Determine enable_packed_file from config or environment to keep tests
deterministic
def enablePackedFileConfig =
context.config.otherConfigs.get("enable_packed_file")
if (enablePackedFileConfig == null) {
enablePackedFileConfig = System.getenv("ENABLE_PACKED_FILE")
}
def enablePackedFile =
enablePackedFileConfig?.toString()?.equalsIgnoreCase("true") ?: false
```
##########
be/src/cloud/cloud_tablet.cpp:
##########
@@ -1645,10 +1645,20 @@ void CloudTablet::_submit_segment_download_task(const
RowsetSharedPtr& rs,
// clang-format off
const auto& rowset_meta = rs->rowset_meta();
auto self = std::dynamic_pointer_cast<CloudTablet>(shared_from_this());
+ // Use rowset_meta->fs() instead of storage_resource->fs to support packed
file.
+ // RowsetMeta::fs() wraps the underlying FileSystem with PackedFileSystem
when
+ // packed_slice_locations is not empty, which correctly maps segment file
paths
+ // to their actual locations within packed files.
+ auto file_system = rowset_meta->fs();
+ if (!file_system) {
+ LOG(WARNING) << "failed to get file system for tablet_id=" <<
_tablet_meta->tablet_id()
+ << ", rowset_id=" << rowset_meta->rowset_id();
+ return;
Review Comment:
`rowset_meta->fs()` is invoked each time a segment warmup task is submitted.
`RowsetMeta::fs()` builds a PackedFileSystem wrapper by iterating
`packed_slice_locations`, so calling it once per segment can be unnecessarily
expensive. Consider computing the wrapped `file_system` once per rowset (in the
caller loop) and passing/reusing it for all segment/index download submissions.
##########
regression-test/suites/cloud_p0/balance/test_peer_read_async_warmup.groovy:
##########
@@ -22,6 +22,11 @@ suite('test_peer_read_async_warmup', 'docker') {
if (!isCloudMode()) {
return;
}
+
+ // Randomly enable or disable packed_file to test both scenarios
+ def enablePackedFile = new Random().nextBoolean()
+ logger.info("Running test with enable_packed_file=${enablePackedFile}")
Review Comment:
This makes the suite nondeterministic: `enable_packed_file` is chosen
randomly, so a given CI run may not exercise the packed-file path (the focus of
this PR), and any failure will be hard to reproduce locally. Prefer a
deterministic toggle (e.g., drive it from `context.config`/env var, or run the
suite twice with `enable_packed_file=true/false`).
```suggestion
// Deterministically enable or disable packed_file using an environment
variable.
// Set DORIS_ENABLE_PACKED_FILE=true/false to control this; default is
false if unset.
def enablePackedFileEnv = System.getenv("DORIS_ENABLE_PACKED_FILE")
def enablePackedFile = enablePackedFileEnv != null ?
enablePackedFileEnv.toBoolean() : false
logger.info("Running test with enable_packed_file=${enablePackedFile}
(env=${enablePackedFileEnv})")
```
##########
be/src/io/cache/block_file_cache_downloader.cpp:
##########
@@ -210,6 +210,16 @@ void FileCacheBlockDownloader::download_file_cache_block(
LOG(WARNING) << storage_resource.error();
return;
}
+ // Use RowsetMeta::fs() instead of storage_resource->fs to support
packed file.
+ // RowsetMeta::fs() wraps the underlying FileSystem with
PackedFileSystem when
+ // packed_slice_locations is not empty, which correctly maps segment
file paths
+ // to their actual locations within packed files.
+ auto file_system = find_it->second->fs();
+ if (!file_system) {
+ LOG(WARNING) << "download_file_cache_block: failed to get file
system for tablet_id="
+ << meta.tablet_id() << ", rowset_id=" <<
meta.rowset_id();
+ return;
+ }
Review Comment:
`RowsetMeta::fs()` rebuilds the wrapped filesystem (packed index map +
encryption wrapper) on each call. Since this method is called inside the
per-block `metas` loop, this can become a hot path and add significant overhead
when many blocks belong to the same rowset. Consider caching the
`FileSystemSPtr` per `rowset_id` (or per `RowsetMetaSharedPtr`) within
`download_file_cache_block()` and reusing it across metas for that rowset.
##########
regression-test/suites/cloud_p0/balance/test_balance_warm_up_sync_global_config.groovy:
##########
@@ -22,6 +22,11 @@ suite('test_balance_warm_up_sync_cache', 'docker') {
if (!isCloudMode()) {
return;
}
+
+ // Randomly enable or disable packed_file to test both scenarios
+ def enablePackedFile = new Random().nextBoolean()
+ logger.info("Running test with enable_packed_file=${enablePackedFile}")
Review Comment:
This makes the suite nondeterministic: `enable_packed_file` is chosen
randomly, so a given CI run may not exercise the packed-file path (the focus of
this PR), and any failure will be hard to reproduce locally. Prefer a
deterministic toggle (e.g., drive it from `context.config`/env var, or run the
suite twice with `enable_packed_file=true/false`).
```suggestion
// Deterministically enable or disable packed_file based on env var to
make tests reproducible
def enablePackedFileEnv = System.getenv("DORIS_TEST_ENABLE_PACKED_FILE")
def enablePackedFile = enablePackedFileEnv != null ?
enablePackedFileEnv.toBoolean() : false
logger.info("Running test with enable_packed_file=${enablePackedFile},
DORIS_TEST_ENABLE_PACKED_FILE='${enablePackedFileEnv}'")
```
##########
regression-test/suites/cloud_p0/tablets/test_clean_tablet_when_rebalance.groovy:
##########
@@ -22,6 +22,11 @@ suite('test_clean_tablet_when_rebalance', 'docker') {
if (!isCloudMode()) {
return;
}
+
+ // Randomly enable or disable packed_file to test both scenarios
+ def enablePackedFile = new Random().nextBoolean()
+ logger.info("Running test with enable_packed_file=${enablePackedFile}")
Review Comment:
This makes the suite nondeterministic: `enable_packed_file` is chosen
randomly, so a given CI run may not exercise the packed-file path (the focus of
this PR), and any failure will be hard to reproduce locally. Prefer a
deterministic toggle (e.g., drive it from `context.config`/env var, or run the
suite twice with `enable_packed_file=true/false`).
```suggestion
// Determine enable_packed_file from env var for deterministic behavior.
// Use env var DORIS_ENABLE_PACKED_FILE; default to false if unset.
def enablePackedFileEnv = System.getenv('DORIS_ENABLE_PACKED_FILE')
def enablePackedFile = (enablePackedFileEnv != null) ?
enablePackedFileEnv.toBoolean() : false
logger.info("Running test with enable_packed_file=${enablePackedFile}
(env DORIS_ENABLE_PACKED_FILE='${enablePackedFileEnv}')")
```
##########
regression-test/suites/cloud_p0/tablets/test_clean_stale_rs_index_file_cache.groovy:
##########
@@ -22,6 +22,11 @@ suite('test_clean_stale_rs_index_file_cache', 'docker') {
if (!isCloudMode()) {
return;
}
+
+ // Randomly enable or disable packed_file to test both scenarios
+ def enablePackedFile = new Random().nextBoolean()
+ logger.info("Running test with enable_packed_file=${enablePackedFile}")
Review Comment:
This makes the suite nondeterministic: `enable_packed_file` is chosen
randomly, so a given CI run may not exercise the packed-file path (the focus of
this PR), and any failure will be hard to reproduce locally. Prefer a
deterministic toggle (e.g., drive it from `context.config`/env var, or run the
suite twice with `enable_packed_file=true/false`).
```suggestion
// Determine enable_packed_file from environment to keep tests
deterministic.
// CI can run this suite twice with DORIS_ENABLE_PACKED_FILE=true/false
to cover both paths.
def enablePackedFileEnv = System.getenv('DORIS_ENABLE_PACKED_FILE')
def enablePackedFile = enablePackedFileEnv != null &&
enablePackedFileEnv.equalsIgnoreCase('true')
logger.info("Running test with enable_packed_file=${enablePackedFile}
(DORIS_ENABLE_PACKED_FILE='${enablePackedFileEnv}')")
```
##########
regression-test/suites/cloud_p0/cache/test_load_cache.groovy:
##########
@@ -33,6 +33,10 @@ changes to statistics are possible, only a reasonable range
is required.
*/
suite('test_load_cache', 'docker') {
+ // Randomly enable or disable packed_file to test both scenarios
+ def enablePackedFile = new Random().nextBoolean()
Review Comment:
This makes the suite nondeterministic: `enable_packed_file` is chosen
randomly, so a given CI run may not exercise the packed-file path (the focus of
this PR), and any failure will be hard to reproduce locally. Prefer a
deterministic toggle (e.g., drive it from `context.config`/env var, or run the
suite twice with `enable_packed_file=true/false`).
```suggestion
// Determine enable_packed_file from environment for deterministic
behavior.
// Set ENABLE_PACKED_FILE=true or false in CI/local runs to control this.
def enablePackedFileEnv = System.getenv("ENABLE_PACKED_FILE")
def enablePackedFile = (enablePackedFileEnv != null) ?
enablePackedFileEnv.toBoolean() : false
```
##########
regression-test/suites/cloud_p0/balance/test_balance_use_compute_group_properties.groovy:
##########
@@ -22,6 +22,11 @@ suite('test_balance_use_compute_group_properties', 'docker')
{
if (!isCloudMode()) {
return;
}
+
+ // Randomly enable or disable packed_file to test both scenarios
+ def enablePackedFile = new Random().nextBoolean()
+ logger.info("Running test with enable_packed_file=${enablePackedFile}")
Review Comment:
This makes the suite nondeterministic: `enable_packed_file` is chosen
randomly, so a given CI run may not exercise the packed-file path (the focus of
this PR), and any failure will be hard to reproduce locally. Prefer a
deterministic toggle (e.g., drive it from `context.config`/env var, or run the
suite twice with `enable_packed_file=true/false`).
```suggestion
// Determine enable_packed_file from environment for deterministic
behavior.
// Set ENABLE_PACKED_FILE=true/false in the environment to override;
default is false.
def envEnablePackedFile = System.getenv('ENABLE_PACKED_FILE')
def enablePackedFile = (envEnablePackedFile != null) ?
envEnablePackedFile.toBoolean() : false
logger.info("Running test with enable_packed_file=${enablePackedFile}
(env ENABLE_PACKED_FILE='${envEnablePackedFile}')")
```
##########
regression-test/suites/cloud_p0/balance/test_balance_warm_up_task_abnormal.groovy:
##########
@@ -22,6 +22,11 @@ suite('test_balance_warm_up_task_abnormal', 'docker') {
if (!isCloudMode()) {
return;
}
+
+ // Randomly enable or disable packed_file to test both scenarios
+ def enablePackedFile = new Random().nextBoolean()
Review Comment:
This makes the suite nondeterministic: `enable_packed_file` is chosen
randomly, so a given CI run may not exercise the packed-file path (the focus of
this PR), and any failure will be hard to reproduce locally. Prefer a
deterministic toggle (e.g., drive it from `context.config`/env var, or run the
suite twice with `enable_packed_file=true/false`).
```suggestion
// Determine enable_packed_file from environment for deterministic
behavior
def enablePackedFileEnv = System.getenv('ENABLE_PACKED_FILE')
def enablePackedFile = enablePackedFileEnv != null ?
enablePackedFileEnv.toBoolean() : false
```
##########
regression-test/suites/cloud_p0/balance/test_expanding_node_balance.groovy:
##########
@@ -23,6 +23,10 @@ suite('test_expanding_node_balance', 'docker') {
return;
}
+ // Randomly enable or disable packed_file to test both scenarios
+ def enablePackedFile = new Random().nextBoolean()
+ logger.info("Running test with enable_packed_file=${enablePackedFile}")
Review Comment:
This makes the suite nondeterministic: `enable_packed_file` is chosen
randomly, so a given CI run may not exercise the packed-file path (the focus of
this PR), and any failure will be hard to reproduce locally. Prefer a
deterministic toggle (e.g., drive it from `context.config`/env var, or run the
suite twice with `enable_packed_file=true/false`).
```suggestion
// Determine enable_packed_file from environment for deterministic
behavior
def enablePackedFileEnv = System.getenv("DORIS_ENABLE_PACKED_FILE")
def enablePackedFile = (enablePackedFileEnv != null) ?
Boolean.parseBoolean(enablePackedFileEnv) : false
logger.info("Running test with enable_packed_file=${enablePackedFile},
DORIS_ENABLE_PACKED_FILE='${enablePackedFileEnv}'")
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]