Copilot commented on code in PR #61775:
URL: https://github.com/apache/doris/pull/61775#discussion_r2994454780
##########
fe/fe-core/src/main/java/org/apache/doris/fs/obj/S3ObjStorage.java:
##########
@@ -584,6 +585,28 @@ private GlobListResult globListInternal(String remotePath,
List<RemoteFile> resu
}
bucket = uri.getBucket();
+
+ // Optimization: For deterministic paths (no wildcards like *, ?),
+ // use HEAD requests instead of listing to avoid requiring ListBucket permission.
+ // This is useful when only GetObject permission is granted.
+ // Controlled by config: s3_skip_list_for_deterministic_path
+ // Note: Skip when using path style because path-style parsing of virtual-host URLs
+ // can produce accidental HEAD successes where LIST would correctly fail.
+ // (e.g., http://bucket.endpoint/key with path_style=true: HEAD URL coincidentally
+ // matches the correct virtual-host URL, while LIST URL format is different and fails)
+ String keyPattern = uri.getKey();
+ if (Config.s3_skip_list_for_deterministic_path
+ && !isUsePathStyle
+ && S3Util.isDeterministicPattern(keyPattern)
+ && !hasLimits && startFile == null) {
+ GlobListResult headResult = globListByHeadRequests(
+ bucket, keyPattern, result, fileNameOnly, startTime);
+ if (headResult != null) {
+ return headResult;
+ }
+ // If headResult is null, fall through to use listing
+ }
Review Comment:
The HEAD fast path returns early, but the `finally` block of the outer
`globListInternal()` still logs `elementCnt`/`matchCnt` from the LIST code path,
which remain 0. This makes debug logs misleading whenever the HEAD optimization
is used. Consider tracking a `usedHeadPath` flag and either skipping the
LIST-path metrics log or updating the counters from the HEAD results before
returning.
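
A minimal sketch of the `usedHeadPath` idea, assuming the method keeps its counters as locals and logs them in `finally`. The class, the method body, and the log format here are hypothetical stand-ins, not the actual `S3ObjStorage` code; the point is only that `finally` still runs on the early return and can label which path produced the numbers.

```java
import java.util.ArrayList;
import java.util.List;

final class GlobMetricsSketch {
    // Captured log lines, so the behaviour can be checked without a logging framework.
    static final List<String> LOG_LINES = new ArrayList<>();

    // Hypothetical stand-in for globListInternal(): returns the number of matched files.
    static int globList(boolean headPathApplies, List<String> result) {
        long elementCnt = 0;
        long matchCnt = 0;
        boolean usedHeadPath = false;
        try {
            if (headPathApplies) {
                usedHeadPath = true;
                // Stand-in for the HEAD probes: one expanded path checked, one object found.
                result.add("s3://bucket/file1.csv");
                elementCnt = 1;
                matchCnt = result.size();
                return result.size();
            }
            // Stand-in for the LIST loop: two keys listed, one matches the glob.
            elementCnt = 2;
            result.add("s3://bucket/file1.csv");
            matchCnt = 1;
            return result.size();
        } finally {
            // Runs on the early return too, so the log line names the path that was taken.
            LOG_LINES.add(String.format("path=%s elementCnt=%d matchCnt=%d",
                    usedHeadPath ? "HEAD" : "LIST", elementCnt, matchCnt));
        }
    }

    public static void main(String[] args) {
        globList(true, new ArrayList<>());
        globList(false, new ArrayList<>());
        LOG_LINES.forEach(System.out::println);
    }
}
```

Updating the counters before the early return keeps a single log statement; skipping the log when `usedHeadPath` is true would be the other option mentioned above.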
##########
fe/fe-core/src/main/java/org/apache/doris/common/util/S3Util.java:
##########
@@ -433,4 +433,236 @@ public static void validateAndTestEndpoint(String
endpoint) throws UserException
SecurityChecker.getInstance().stopSSRFChecking();
}
}
+
+ /**
+ * Check if a path pattern is deterministic, meaning all file paths can be
determined
+ * without listing. A pattern is deterministic if it contains no true
wildcard characters
+ * (*, ?) but may contain brace patterns ({...}) and non-negated bracket
patterns ([abc], [0-9])
+ * which can be expanded to concrete paths.
+ *
+ * Negated bracket patterns ([!abc], [^abc]) are NOT deterministic because
they match
+ * any character except those listed, requiring a listing to discover
matches.
+ *
+ * This allows skipping S3 ListBucket operations when only GetObject
permission is available.
+ *
+ * @param pathPattern Path that may contain glob patterns
+ * @return true if the pattern is deterministic (expandable without
listing)
+ */
+ public static boolean isDeterministicPattern(String pathPattern) {
+ // Check for wildcard characters that require listing
+ // Note: '{' is NOT a wildcard - it's a brace expansion pattern that
can be deterministically expanded
+ // Note: '[' is conditionally deterministic - [abc] can be expanded,
but [!abc]/[^abc] cannot
+ char[] wildcardChars = {'*', '?'};
+ for (char c : wildcardChars) {
+ if (pathPattern.indexOf(c) != -1) {
+ return false;
+ }
+ }
+ // Check for escaped characters which indicate complex patterns
+ if (pathPattern.indexOf('\\') != -1) {
+ return false;
+ }
+ // Check bracket patterns: [abc] and [0-9] are deterministic, [!abc]
and [^abc] are not
+ if (!areBracketPatternsDeterministic(pathPattern)) {
+ return false;
+ }
+ return true;
+ }
+
+ /**
+ * Check if all bracket patterns in the path are deterministic
(non-negated).
+ * - [abc], [0-9], [a-zA-Z] are deterministic (can be expanded to finite
character sets)
+ * - [!abc], [^abc] are non-deterministic (negation requires listing)
+ * - Malformed brackets (no closing ]) are non-deterministic
+ */
+ private static boolean areBracketPatternsDeterministic(String pattern) {
+ int i = 0;
+ while (i < pattern.length()) {
+ if (pattern.charAt(i) == '[') {
+ int end = pattern.indexOf(']', i + 1);
+ if (end == -1) {
+ // Malformed bracket - no closing ], treat as
non-deterministic
+ return false;
+ }
+ int contentStart = i + 1;
+ if (contentStart == end) {
+ // Empty brackets [] - malformed, treat as
non-deterministic
+ return false;
+ }
+ // Check for negation
+ char first = pattern.charAt(contentStart);
+ if (first == '!' || first == '^') {
+ return false;
+ }
+ i = end + 1;
+ } else {
+ i++;
+ }
+ }
+ return true;
+ }
+
+ /**
+ * Expand bracket character class patterns to brace patterns.
+ * This converts [abc] to {a,b,c} and [0-9] to {0,1,2,...,9} so that
+ * the existing brace expansion can handle them.
+ *
+ * Only call this on patterns already verified as deterministic by
isDeterministicPattern()
+ * (i.e., no negated brackets like [!...] or [^...]).
+ *
+ * Examples:
+ * - "file[abc].csv" => "file{a,b,c}.csv"
+ * - "file[0-9].csv" => "file{0,1,2,3,4,5,6,7,8,9}.csv"
+ * - "file[a-cX].csv" => "file{a,b,c,X}.csv"
+ * - "file.csv" => "file.csv" (no brackets)
+ *
+ * @param pathPattern Path with optional bracket patterns (must not
contain negated brackets)
+ * @return Path with brackets converted to brace patterns
+ */
+ public static String expandBracketPatterns(String pathPattern) {
+ StringBuilder result = new StringBuilder();
+ int i = 0;
+ while (i < pathPattern.length()) {
+ if (pathPattern.charAt(i) == '[') {
+ int end = pathPattern.indexOf(']', i + 1);
+ if (end == -1) {
+ // Malformed, keep as-is
+ result.append(pathPattern.charAt(i));
+ i++;
+ continue;
+ }
+ String content = pathPattern.substring(i + 1, end);
+ List<Character> chars = expandBracketContent(content);
+ result.append('{');
+ for (int j = 0; j < chars.size(); j++) {
+ if (j > 0) {
+ result.append(',');
+ }
+ result.append(chars.get(j));
+ }
+ result.append('}');
+ i = end + 1;
+ } else {
+ result.append(pathPattern.charAt(i));
+ i++;
+ }
+ }
+ return result.toString();
+ }
+
+ private static List<Character> expandBracketContent(String content) {
+ List<Character> chars = new ArrayList<>();
+ int i = 0;
+ while (i < content.length()) {
+ if (i + 2 < content.length() && content.charAt(i + 1) == '-') {
+ // Range like a-z or 0-9
+ char start = content.charAt(i);
+ char end = content.charAt(i + 2);
+ if (start <= end) {
+ for (char c = start; c <= end; c++) {
+ if (!chars.contains(c)) {
+ chars.add(c);
+ }
+ }
+ } else {
+ for (char c = start; c >= end; c--) {
+ if (!chars.contains(c)) {
+ chars.add(c);
+ }
+ }
+ }
+ i += 3;
+ } else {
+ char c = content.charAt(i);
+ if (!chars.contains(c)) {
+ chars.add(c);
+ }
+ i++;
+ }
+ }
+ return chars;
+ }
+
+ /**
+ * Expand brace patterns in a path to generate all concrete file paths.
+ * Handles nested and multiple brace patterns.
+ *
+ * Examples:
+ * - "file{1,2,3}.csv" => ["file1.csv", "file2.csv", "file3.csv"]
+ * - "data/part{1..3}/file.csv" => ["data/part1/file.csv",
"data/part2/file.csv", "data/part3/file.csv"]
+ * - "file.csv" => ["file.csv"] (no braces)
+ *
+ * @param pathPattern Path with optional brace patterns (already processed
by extendGlobs)
+ * @return List of expanded concrete paths
+ */
+ public static List<String> expandBracePatterns(String pathPattern) {
+ List<String> result = new ArrayList<>();
+ expandBracePatternsRecursive(pathPattern, result);
+ return result;
+ }
Review Comment:
`expandBracePatterns()` always fully expands into a `List` with no way to
cap the number of generated paths. Since this is used to guard
HEAD/getProperties fan-out, consider adding a limit-aware variant (e.g.
`expandBracePatterns(pattern, maxPaths)`) or stopping recursion once a
threshold is exceeded, to avoid large allocations on patterns that will later
be rejected.
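
One possible shape for such a limit-aware variant is sketched below. It is an illustration only: the class name and the `expand(pattern, maxPaths)` signature are hypothetical, it handles comma-separated brace groups (including nested ones) but not `{1..3}` ranges, which the PR appears to normalize earlier via `extendGlobs`, and it signals "too many paths" with an empty `Optional` rather than whatever convention the real code would choose.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

final class BoundedBraceExpansionSketch {
    // Expands comma-style brace groups, giving up as soon as more than maxPaths paths are needed.
    static Optional<List<String>> expand(String pattern, int maxPaths) {
        List<String> out = new ArrayList<>();
        return expandInto(pattern, maxPaths, out) ? Optional.of(out) : Optional.empty();
    }

    private static boolean expandInto(String pattern, int maxPaths, List<String> out) {
        int open = pattern.indexOf('{');
        int close = open < 0 ? -1 : matchingClose(pattern, open);
        if (close < 0) {
            // No balanced brace group left: this is a concrete path.
            if (out.size() >= maxPaths) {
                return false; // adding it would exceed the limit, so stop the whole expansion
            }
            out.add(pattern);
            return true;
        }
        String prefix = pattern.substring(0, open);
        String suffix = pattern.substring(close + 1);
        for (String alt : splitTopLevel(pattern.substring(open + 1, close))) {
            if (!expandInto(prefix + alt + suffix, maxPaths, out)) {
                return false; // propagate the early stop up the recursion
            }
        }
        return true;
    }

    // Index of the '}' that closes the '{' at position open, or -1 if unbalanced.
    private static int matchingClose(String s, int open) {
        int depth = 0;
        for (int i = open; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '{') {
                depth++;
            } else if (c == '}' && --depth == 0) {
                return i;
            }
        }
        return -1;
    }

    // Splits a brace body on commas that are not inside a nested brace group.
    private static List<String> splitTopLevel(String body) {
        List<String> parts = new ArrayList<>();
        int depth = 0;
        int start = 0;
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c == '{') {
                depth++;
            } else if (c == '}') {
                depth--;
            } else if (c == ',' && depth == 0) {
                parts.add(body.substring(start, i));
                start = i + 1;
            }
        }
        parts.add(body.substring(start));
        return parts;
    }
}
```

With this shape a caller could treat an empty result as "fall back to LIST", and the list never grows past `maxPaths` entries.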
##########
fe/fe-core/src/main/java/org/apache/doris/fs/obj/AzureObjStorage.java:
##########
@@ -436,6 +453,86 @@ public Status globList(String remotePath, List<RemoteFile>
result, boolean fileN
return st;
}
+ /**
+ * Get file metadata using getProperties requests for deterministic paths.
+ * This avoids requiring list permission when only read permission is
granted.
+ *
+ * @param bucket Azure container name
+ * @param keyPattern The key pattern (may contain {..} brace or [...]
bracket patterns but no wildcards)
+ * @param result List to store matching RemoteFile objects
+ * @param fileNameOnly If true, only store file names; otherwise store
full paths
+ * @param startTime Start time for logging duration
+ * @return Status if successful, null if should fall back to listing
+ */
+ private Status globListByGetProperties(String bucket, String keyPattern,
+ List<RemoteFile> result, boolean fileNameOnly, long startTime) {
+ try {
+ // First expand [...] brackets to {...} braces, then expand {..}
ranges, then expand braces
+ String expandedPattern = S3Util.expandBracketPatterns(keyPattern);
+ expandedPattern = S3Util.extendGlobs(expandedPattern);
+ List<String> expandedPaths =
S3Util.expandBracePatterns(expandedPattern);
+
+ // Fall back to listing if too many paths to avoid overwhelming
Azure with requests
+ // Controlled by config: s3_head_request_max_paths
+ if (expandedPaths.size() > Config.s3_head_request_max_paths) {
+ LOG.info("Expanded path count {} exceeds limit {}, falling
back to LIST",
+ expandedPaths.size(),
Config.s3_head_request_max_paths);
+ return null;
+ }
+
+ if (LOG.isDebugEnabled()) {
+ LOG.debug("Using getProperties requests for deterministic path
pattern, expanded to {} paths",
+ expandedPaths.size());
+ }
+
+ BlobContainerClient containerClient =
getClient().getBlobContainerClient(bucket);
+ long matchCnt = 0;
+ for (String key : expandedPaths) {
+ String fullPath = constructS3Path(key, bucket);
+ try {
+ BlobClient blobClient = containerClient.getBlobClient(key);
+ BlobProperties props = blobClient.getProperties();
+
+ matchCnt++;
+ RemoteFile remoteFile = new RemoteFile(
+ fileNameOnly ?
Paths.get(key).getFileName().toString() : fullPath,
+ true, // isFile
+ props.getBlobSize(),
+ props.getBlobSize(),
+ props.getLastModified() != null
+ ? props.getLastModified().toEpochSecond()
: 0
Review Comment:
`RemoteFile` modification times elsewhere in FE are generally in epoch
**milliseconds** (e.g. HDFS `FileStatus#getModificationTime()`, S3
`toEpochMilli()`), and `FileGroupInfo` passes this value through to scan ranges.
Using `toEpochSecond()` here makes Azure deterministic-path results
inconsistent with the other backends and too small by a factor of 1000. Prefer
`props.getLastModified().toInstant().toEpochMilli()` (and consider aligning the
existing Azure LIST path too).
```suggestion
? props.getLastModified().toInstant().toEpochMilli() : 0
```
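
The unit difference can be seen with `java.time` alone. This sketch assumes, as in the Azure SDK for Java, that `getLastModified()` yields an `OffsetDateTime`; the class and helper names are hypothetical and the timestamp is an arbitrary example value.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

final class EpochUnitsSketch {
    // What the current code stores: seconds since the epoch.
    static long seconds(OffsetDateTime t) {
        return t.toEpochSecond();
    }

    // What the suggestion stores: milliseconds since the epoch.
    static long millis(OffsetDateTime t) {
        return t.toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        OffsetDateTime t = OffsetDateTime.of(2024, 1, 1, 0, 0, 0, 0, ZoneOffset.UTC);
        System.out.println(seconds(t)); // 1704067200
        System.out.println(millis(t));  // 1704067200000
    }
}
```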
##########
fe/fe-core/src/main/java/org/apache/doris/fs/obj/AzureObjStorage.java:
##########
@@ -436,6 +453,86 @@ public Status globList(String remotePath, List<RemoteFile>
result, boolean fileN
return st;
}
+ /**
+ * Get file metadata using getProperties requests for deterministic paths.
+ * This avoids requiring list permission when only read permission is
granted.
+ *
+ * @param bucket Azure container name
+ * @param keyPattern The key pattern (may contain {..} brace or [...]
bracket patterns but no wildcards)
+ * @param result List to store matching RemoteFile objects
+ * @param fileNameOnly If true, only store file names; otherwise store
full paths
+ * @param startTime Start time for logging duration
+ * @return Status if successful, null if should fall back to listing
+ */
+ private Status globListByGetProperties(String bucket, String keyPattern,
+ List<RemoteFile> result, boolean fileNameOnly, long startTime) {
+ try {
+ // First expand [...] brackets to {...} braces, then expand {..}
ranges, then expand braces
+ String expandedPattern = S3Util.expandBracketPatterns(keyPattern);
+ expandedPattern = S3Util.extendGlobs(expandedPattern);
+ List<String> expandedPaths =
S3Util.expandBracePatterns(expandedPattern);
+
+ // Fall back to listing if too many paths to avoid overwhelming
Azure with requests
+ // Controlled by config: s3_head_request_max_paths
+ if (expandedPaths.size() > Config.s3_head_request_max_paths) {
+ LOG.info("Expanded path count {} exceeds limit {}, falling
back to LIST",
+ expandedPaths.size(),
Config.s3_head_request_max_paths);
+ return null;
+ }
+
+ if (LOG.isDebugEnabled()) {
+ LOG.debug("Using getProperties requests for deterministic path
pattern, expanded to {} paths",
+ expandedPaths.size());
+ }
+
+ BlobContainerClient containerClient =
getClient().getBlobContainerClient(bucket);
+ long matchCnt = 0;
+ for (String key : expandedPaths) {
+ String fullPath = constructS3Path(key, bucket);
+ try {
+ BlobClient blobClient = containerClient.getBlobClient(key);
+ BlobProperties props = blobClient.getProperties();
+
+ matchCnt++;
+ RemoteFile remoteFile = new RemoteFile(
+ fileNameOnly ?
Paths.get(key).getFileName().toString() : fullPath,
+ true, // isFile
+ props.getBlobSize(),
+ props.getBlobSize(),
+ props.getLastModified() != null
+ ? props.getLastModified().toEpochSecond()
: 0
+ );
+ result.add(remoteFile);
+
+ if (LOG.isDebugEnabled()) {
+ LOG.debug("getProperties success for {}: size={}",
fullPath, props.getBlobSize());
+ }
+ } catch (BlobStorageException e) {
+ if (e.getStatusCode() == HttpStatus.SC_NOT_FOUND
+ ||
BlobErrorCode.BLOB_NOT_FOUND.equals(e.getErrorCode())) {
+ // File does not exist, skip it (this is expected for
some expanded patterns)
+ if (LOG.isDebugEnabled()) {
+ LOG.debug("File does not exist (skipped): {}",
fullPath);
+ }
+ } else {
+ throw e;
+ }
+ }
+ }
+
+ if (LOG.isDebugEnabled()) {
+ long duration = System.nanoTime() - startTime;
+ LOG.debug("Deterministic path getProperties requests: checked
{} paths, found {} files, took {} ms",
+ expandedPaths.size(), matchCnt, duration / 1000 /
1000);
+ }
+
+ return Status.OK;
+ } catch (Exception e) {
+ LOG.warn("Failed to use getProperties requests, falling back to
listing: {}", e.getMessage());
Review Comment:
The fallback log drops the exception stack trace (`LOG.warn(...,
e.getMessage())`), which makes it hard to understand why getProperties failed
(permission vs missing blob vs transient errors). Log the throwable (and
include container/keyPattern) so operational debugging is possible.
```suggestion
LOG.warn("Failed to use getProperties requests for container '{}' and keyPattern '{}', "
        + "falling back to listing", bucket, keyPattern, e);
```
##########
fe/fe-core/src/main/java/org/apache/doris/fs/obj/S3ObjStorage.java:
##########
@@ -718,6 +741,91 @@ private GlobListResult globListInternal(String remotePath,
List<RemoteFile> resu
}
}
+ /**
+ * Get file metadata using HEAD requests for deterministic paths.
+ * This avoids requiring ListBucket permission when only GetObject
permission is granted.
+ *
+ * @param bucket S3 bucket name
+ * @param keyPattern The key pattern (may contain {..} brace or [...]
bracket patterns but no wildcards)
+ * @param result List to store matching RemoteFile objects
+ * @param fileNameOnly If true, only store file names; otherwise store
full S3 paths
+ * @param startTime Start time for logging duration
+ * @return GlobListResult if successful, null if should fall back to
listing
+ */
+ private GlobListResult globListByHeadRequests(String bucket, String
keyPattern,
+ List<RemoteFile> result, boolean fileNameOnly, long startTime) {
+ try {
+ // First expand [...] brackets to {...} braces, then expand {..}
ranges, then expand braces
+ String expandedPattern = S3Util.expandBracketPatterns(keyPattern);
+ expandedPattern = S3Util.extendGlobs(expandedPattern);
+ List<String> expandedPaths =
S3Util.expandBracePatterns(expandedPattern);
+
+ // Fall back to listing if too many paths to avoid overwhelming S3
with HEAD requests
+ // Controlled by config: s3_head_request_max_paths
+ if (expandedPaths.size() > Config.s3_head_request_max_paths) {
+ LOG.info("Expanded path count {} exceeds limit {}, falling
back to LIST",
+ expandedPaths.size(),
Config.s3_head_request_max_paths);
+ return null;
+ }
Review Comment:
`expandedPaths` is fully materialized before checking
`s3_head_request_max_paths`. Large brace/range expansions (e.g. `{1..100000}`
or multi-brace cartesian products) can cause high CPU/memory usage even though
you later fall back to LIST. To make `s3_head_request_max_paths` actually
protective, consider adding an early-stop/limit-aware expansion API in `S3Util`
(or short-circuit during recursion) that stops once the limit is exceeded.
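
An alternative to an early-stopping expander is to count the expansions arithmetically before materializing anything: a brace group contributes the sum of its alternatives, and consecutive groups multiply. The sketch below is hypothetical (the class and `count(pattern, cap)` are not part of `S3Util`), handles only comma-style groups, and saturates at `cap + 1` so the product cannot overflow. It gives an upper bound, since duplicate alternatives are counted separately.

```java
final class ExpansionCountSketch {
    // Number of paths the pattern would expand to, saturated at cap + 1 ("more than cap").
    static long count(String pattern, int cap) {
        long limit = (long) cap + 1;
        long total = 1;
        int i = 0;
        while (i < pattern.length()) {
            if (pattern.charAt(i) != '{') {
                i++;
                continue;
            }
            int close = matchingClose(pattern, i);
            if (close < 0) {
                break; // unbalanced brace: treat the rest as literal text
            }
            long group = 0;   // sum over the alternatives of this group
            int depth = 0;    // nesting depth inside the group body
            int start = i + 1;
            for (int j = i + 1; j <= close; j++) {
                char c = pattern.charAt(j);
                if (c == '{') {
                    depth++;
                } else if (c == '}' && j < close) {
                    depth--;
                }
                if ((c == ',' && depth == 0) || j == close) {
                    // One alternative ends here; it may itself contain nested groups.
                    group = Math.min(limit, group + count(pattern.substring(start, j), cap));
                    start = j + 1;
                }
            }
            total = Math.min(limit, total * group);
            if (total == limit) {
                return limit; // already over the cap, no need to scan further
            }
            i = close + 1;
        }
        return total;
    }

    // Index of the '}' that closes the '{' at position open, or -1 if unbalanced.
    private static int matchingClose(String s, int open) {
        int depth = 0;
        for (int i = open; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '{') {
                depth++;
            } else if (c == '}' && --depth == 0) {
                return i;
            }
        }
        return -1;
    }
}
```

A caller could compare `count(expandedPattern, Config.s3_head_request_max_paths)` against the limit and only run the full expansion when it fits. The scan is linear in the pattern length, so it does not help if an earlier step has already turned a large range into a very long comma list.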
##########
fe/fe-core/src/main/java/org/apache/doris/fs/obj/S3ObjStorage.java:
##########
@@ -718,6 +741,91 @@ private GlobListResult globListInternal(String remotePath,
List<RemoteFile> resu
}
}
+ /**
+ * Get file metadata using HEAD requests for deterministic paths.
+ * This avoids requiring ListBucket permission when only GetObject
permission is granted.
+ *
+ * @param bucket S3 bucket name
+ * @param keyPattern The key pattern (may contain {..} brace or [...]
bracket patterns but no wildcards)
+ * @param result List to store matching RemoteFile objects
+ * @param fileNameOnly If true, only store file names; otherwise store
full S3 paths
+ * @param startTime Start time for logging duration
+ * @return GlobListResult if successful, null if should fall back to
listing
+ */
+ private GlobListResult globListByHeadRequests(String bucket, String
keyPattern,
+ List<RemoteFile> result, boolean fileNameOnly, long startTime) {
+ try {
+ // First expand [...] brackets to {...} braces, then expand {..}
ranges, then expand braces
+ String expandedPattern = S3Util.expandBracketPatterns(keyPattern);
+ expandedPattern = S3Util.extendGlobs(expandedPattern);
+ List<String> expandedPaths =
S3Util.expandBracePatterns(expandedPattern);
+
+ // Fall back to listing if too many paths to avoid overwhelming S3
with HEAD requests
+ // Controlled by config: s3_head_request_max_paths
+ if (expandedPaths.size() > Config.s3_head_request_max_paths) {
+ LOG.info("Expanded path count {} exceeds limit {}, falling
back to LIST",
+ expandedPaths.size(),
Config.s3_head_request_max_paths);
+ return null;
+ }
+
+ if (LOG.isDebugEnabled()) {
+ LOG.debug("Using HEAD requests for deterministic path pattern,
expanded to {} paths",
+ expandedPaths.size());
+ }
+
+ long matchCnt = 0;
+ for (String key : expandedPaths) {
+ String fullPath = "s3://" + bucket + "/" + key;
+ try {
+ HeadObjectResponse headResponse = getClient()
+ .headObject(HeadObjectRequest.builder()
+ .bucket(bucket)
+ .key(key)
+ .build());
+
+ matchCnt++;
+ RemoteFile remoteFile = new RemoteFile(
+ fileNameOnly ?
Paths.get(key).getFileName().toString() : fullPath,
+ true, // isFile
+ headResponse.contentLength(),
+ headResponse.contentLength(),
+ headResponse.lastModified() != null
+ ?
headResponse.lastModified().toEpochMilli() : 0
+ );
+ result.add(remoteFile);
+
+ if (LOG.isDebugEnabled()) {
+ LOG.debug("HEAD success for {}: size={}", fullPath,
headResponse.contentLength());
+ }
+ } catch (NoSuchKeyException e) {
+ // File does not exist, skip it (this is expected for some
expanded patterns)
+ if (LOG.isDebugEnabled()) {
+ LOG.debug("File does not exist (skipped): {}",
fullPath);
+ }
+ } catch (S3Exception e) {
+ if (e.statusCode() == HttpStatus.SC_NOT_FOUND) {
+ if (LOG.isDebugEnabled()) {
+ LOG.debug("File does not exist (skipped): {}",
fullPath);
+ }
+ } else {
+ throw e;
+ }
+ }
+ }
+
+ if (LOG.isDebugEnabled()) {
+ long duration = System.nanoTime() - startTime;
+ LOG.debug("Deterministic path HEAD requests: checked {} paths,
found {} files, took {} ms",
+ expandedPaths.size(), matchCnt, duration / 1000 /
1000);
+ }
+
+ return new GlobListResult(Status.OK, "", bucket, "");
+ } catch (Exception e) {
+ LOG.warn("Failed to use HEAD requests, falling back to listing:
{}", e.getMessage());
Review Comment:
The fallback log drops the stack trace (`LOG.warn(..., e.getMessage())`),
which makes diagnosing why HEAD failed (auth vs throttling vs networking) much
harder. Log the exception as the throwable (and ideally include
bucket/keyPattern) so operators can debug without reproducing.
```suggestion
LOG.warn("Failed to use HEAD requests for bucket={}, keyPattern={}, falling back to listing",
        bucket, keyPattern, e);
```
##########
fe/fe-core/src/main/java/org/apache/doris/fs/obj/AzureObjStorage.java:
##########
@@ -357,8 +358,24 @@ public Status globList(String remotePath, List<RemoteFile>
result, boolean fileN
try {
remotePath =
AzurePropertyUtils.validateAndNormalizeUri(remotePath);
S3URI uri = S3URI.create(remotePath, isUsePathStyle,
forceParsingByStandardUri);
- String globPath = S3Util.extendGlobs(uri.getKey());
String bucket = uri.getBucket();
+
+ // Optimization: For deterministic paths (no wildcards like *, ?),
+ // use getProperties requests instead of listing to avoid
requiring list permission.
+ // Controlled by config: s3_skip_list_for_deterministic_path
+ // Note: Skip when using path style (see S3ObjStorage for detailed
explanation)
+ String keyPattern = uri.getKey();
+ if (Config.s3_skip_list_for_deterministic_path
+ && !isUsePathStyle
+ && S3Util.isDeterministicPattern(keyPattern)) {
+ Status headStatus = globListByGetProperties(bucket,
keyPattern, result, fileNameOnly, startTime);
+ if (headStatus != null) {
+ return headStatus;
+ }
+ // If headStatus is null, fall through to use listing
+ }
Review Comment:
The early return for the deterministic-path optimization means the `finally`
block still logs LIST-path counters (`elementCnt/matchCnt`) which remain 0.
This makes the INFO log misleading when the optimization is taken. Consider
skipping the LIST metrics log when returning from `globListByGetProperties()`
or updating the counters based on `expandedPaths`/`result` size.
##########
fe/fe-core/src/main/java/org/apache/doris/fs/obj/AzureObjStorage.java:
##########
@@ -436,6 +453,86 @@ public Status globList(String remotePath, List<RemoteFile>
result, boolean fileN
return st;
}
+ /**
+ * Get file metadata using getProperties requests for deterministic paths.
+ * This avoids requiring list permission when only read permission is
granted.
+ *
+ * @param bucket Azure container name
+ * @param keyPattern The key pattern (may contain {..} brace or [...]
bracket patterns but no wildcards)
+ * @param result List to store matching RemoteFile objects
+ * @param fileNameOnly If true, only store file names; otherwise store
full paths
+ * @param startTime Start time for logging duration
+ * @return Status if successful, null if should fall back to listing
+ */
+ private Status globListByGetProperties(String bucket, String keyPattern,
+ List<RemoteFile> result, boolean fileNameOnly, long startTime) {
+ try {
+ // First expand [...] brackets to {...} braces, then expand {..}
ranges, then expand braces
+ String expandedPattern = S3Util.expandBracketPatterns(keyPattern);
+ expandedPattern = S3Util.extendGlobs(expandedPattern);
+ List<String> expandedPaths =
S3Util.expandBracePatterns(expandedPattern);
+
+ // Fall back to listing if too many paths to avoid overwhelming
Azure with requests
+ // Controlled by config: s3_head_request_max_paths
+ if (expandedPaths.size() > Config.s3_head_request_max_paths) {
+ LOG.info("Expanded path count {} exceeds limit {}, falling
back to LIST",
+ expandedPaths.size(),
Config.s3_head_request_max_paths);
+ return null;
Review Comment:
Same as the S3 implementation: `expandedPaths` is fully expanded into a list
before comparing to `s3_head_request_max_paths`. Very large brace/range
expansions can consume significant CPU/memory even though you then fall back to
LIST. Consider implementing a limit-aware expansion that stops once the
threshold is exceeded.
```suggestion
// First expand [...] brackets to {...} braces, then expand {..} ranges.
// NOTE: We intentionally avoid unbounded brace expansion here to prevent
// potential CPU/memory exhaustion on very large patterns.
String expandedPattern = S3Util.expandBracketPatterns(keyPattern);
expandedPattern = S3Util.extendGlobs(expandedPattern);
final List<String> expandedPaths;
if (expandedPattern.contains("{")) {
    // Pattern still contains brace/range expressions. To avoid potentially
    // huge expansions, conservatively fall back to LIST-based globbing.
    LOG.info("Pattern '{}' contains brace/range expressions after normalization; "
            + "skipping deterministic getProperties optimization and falling back to LIST",
            expandedPattern);
    return null;
} else {
    // Deterministic single path; no brace expansion needed.
    expandedPaths = new ArrayList<>(1);
    expandedPaths.add(expandedPattern);
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]