github-actions[bot] commented on code in PR #61518:
URL: https://github.com/apache/doris/pull/61518#discussion_r3342311626
##########
fe/be-java-extensions/java-common/src/main/java/org/apache/doris/common/jni/vec/VectorColumn.java:
##########
@@ -1605,6 +1605,7 @@ public byte[] getBytesVarbinary(int rowId) {
}
public void updateMeta(VectorColumn meta) {
+ meta.appendLong(isConst ? 1L : 0L);
Review Comment:
This changes the per-column metadata layout that BE reads from
`VectorTable.getMetaAddress()`, but the BE reader still expects the first slot
for each column to be the null-map pointer. `JniDataBridge::fill_column()`
calls `address.next_meta_as_ptr()` as `null_map_ptr` before reading any
type-specific data, so a non-const column now sends `0` and BE returns
`Unsupported type ... in java side`; a const column sends address `1`. Please
update the BE metadata reader/writer contract in the same change, or do not
prepend this field on the Java-to-BE path.
##########
fe/fe-core/src/main/java/org/apache/doris/catalog/SchemaTable.java:
##########
@@ -738,6 +738,8 @@ public class SchemaTable extends Table {
.column("TABLET_ID",
ScalarType.createType(PrimitiveType.BIGINT))
.column("SIZE",
ScalarType.createType(PrimitiveType.BIGINT))
.column("TYPE",
ScalarType.createStringType())
+ .column("TABLE_NAME",
ScalarType.createStringType())
Review Comment:
Adding these columns in the middle of `information_schema.file_cache_info`
shifts the ordinal positions of the existing `REMOTE_PATH`, `CACHE_PATH`, and
`BE_ID` columns. `SELECT *` consumers and clients reading by column index will
now see different data in previously stable positions, and mixed FE/BE
expectations are harder during upgrades. Please append the new columns after
the existing columns instead.
##########
fe/fe-core/src/main/java/org/apache/doris/datasource/FileScanNode.java:
##########
@@ -93,12 +94,93 @@ protected void toThrift(TPlanNode planNode) {
TFileScanNode fileScanNode = new TFileScanNode();
fileScanNode.setTupleId(desc.getId().asInt());
if (desc.getTable() != null) {
- fileScanNode.setTableName(desc.getTable().getName());
+
fileScanNode.setTableName(desc.getTable().getNameWithFullQualifiers());
}
planNode.setFileScanNode(fileScanNode);
super.toThrift(planNode);
}
+ public static void fillTablePartitionContext(TFileRangeDesc range, TableIf
table, String partitionName) {
+ if (range == null) {
+ return;
+ }
+ if (partitionName != null && !partitionName.isEmpty()) {
+ range.setPartitionName(partitionName);
+ }
+ }
+
+ public static String buildPartitionName(List<String> partitionKeys,
List<String> partitionValues) {
+ List<TPartitionKeyValue> partitionKeyValues =
buildPartitionKeyValues(partitionKeys, partitionValues);
+ if (partitionKeyValues.isEmpty()) {
+ return "";
+ }
+ return buildPartitionName(partitionKeyValues);
+ }
+
+ public static List<TPartitionKeyValue> buildPartitionKeyValues(
+ List<String> partitionKeys, List<String> partitionValues) {
+ if (partitionKeys == null || partitionValues == null) {
+ return Collections.emptyList();
+ }
+ if (partitionKeys.isEmpty() || partitionKeys.size() !=
partitionValues.size()) {
+ return Collections.emptyList();
+ }
+ List<TPartitionKeyValue> partitionKeyValues =
Lists.newArrayListWithCapacity(partitionKeys.size());
+ for (int i = 0; i < partitionKeys.size(); i++) {
Review Comment:
The list-based partition-value path does not handle null values, unlike the
map-based overload below. `PluginDrivenSplit.getPartitionValuesInKeyOrder()`
can add `partValues.get(partitionKey)` directly, including null, and this
constructor sets `TPartitionKeyValue.value` (a required thrift string) without
setting `is_null`. That can either serialize a null required field or make
partition columns/cache partition names treat SQL NULL as an ordinary
empty/missing value. Please normalize nulls here the same way as the map
overload (`value = ""`, `is_null = true`).
##########
fe/fe-filesystem/fe-filesystem-cos/src/main/java/org/apache/doris/filesystem/cos/CosObjStorage.java:
##########
@@ -111,6 +111,18 @@ static Map<String, String> toS3Props(Map<String, String>
cosProps) {
return s3Props;
}
+ private static Map<String, String> toValidatedS3Props(Map<String, String>
cosProps) {
+ Map<String, String> s3Props = toS3Props(cosProps);
Review Comment:
This validation runs before `S3FileSystemProperties.of()` normalizes aliases
and derives a region from the endpoint. As a result, COS configurations using
S3-compatible aliases accepted by the parent storage path, such as `s3.region`
or `region`, fail here because `toS3Props()` only maps `COS_REGION` to
`AWS_REGION`. Please validate through the normalized `S3FileSystemProperties`
result or include the same aliases/derivation rules, otherwise previously valid
COS/S3-compatible properties are rejected at construction time.
##########
fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/source/IcebergScanNode.java:
##########
@@ -362,21 +365,29 @@ private void setIcebergParams(TFileRangeDesc rangeDesc,
IcebergSplit icebergSpli
tableFormatFileDesc.setIcebergParams(fileDesc);
Map<String, String> partitionValues =
icebergSplit.getIcebergPartitionValues();
if (partitionValues != null) {
- List<String> fromPathKeys = new ArrayList<>();
- List<String> fromPathValues = new ArrayList<>();
- List<Boolean> fromPathIsNull = new ArrayList<>();
- for (Map.Entry<String, String> entry : partitionValues.entrySet())
{
- fromPathKeys.add(entry.getKey());
- fromPathValues.add(entry.getValue() != null ? entry.getValue()
: "");
- fromPathIsNull.add(entry.getValue() == null);
- }
- rangeDesc.setColumnsFromPathKeys(fromPathKeys);
Review Comment:
This new cache partition-context fill only runs when
`icebergSplit.getIcebergPartitionValues()` was populated, but
`createIcebergSplit()` only calls `setIcebergPartitionValues()` inside
`sessionVariable.isEnableRuntimeFilterPartitionPrune()`. If that session
variable is false, partitioned Iceberg scans still read the same files but
never send `partition_values`, and `FileQueryScanNode` cannot recover them from
the path because Iceberg path partition keys are empty. File-cache
observability should not depend on enabling runtime-filter partition pruning;
please populate identity partition values for the cache context independently
of that optimization flag.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]