wuchong commented on code in PR #2763:
URL: https://github.com/apache/fluss/pull/2763#discussion_r2969292154


##########
fluss-client/src/main/java/org/apache/fluss/client/utils/ClientRpcMessageUtils.java:
##########
@@ -555,8 +555,12 @@ public static List<PartitionInfo> 
toPartitionInfos(ListPartitionInfosResponse re
                         pbPartitionInfo ->
                                 new PartitionInfo(
                                         pbPartitionInfo.getPartitionId(),
-                                        toResolvedPartitionSpec(
-                                                
pbPartitionInfo.getPartitionSpec())))
+                                        
toResolvedPartitionSpec(pbPartitionInfo.getPartitionSpec()),
+                                        // For backward compatibility, results 
returned by old
+                                        // clusters do not include the remote 
data dir
+                                        pbPartitionInfo.hasRemoteDataDir()

Review Comment:
   When callers use `Admin#listPartitionInfos()` to discover per-partition 
storage roots, this new client field is never populated. 
`RpcServiceBase.listPartitionInfos()` still builds the response from a 
`Map<String, Long>` via 
`ServerRpcMessageUtils.toListPartitionInfosResponse(...)`, and that helper only 
serializes `partition_id` plus `partition_spec`, so updated clusters still 
return `PartitionInfo` objects whose `remoteDataDir` is always `null`.



##########
fluss-server/src/main/java/org/apache/fluss/server/zk/ZooKeeperClient.java:
##########
@@ -576,7 +586,11 @@ public void registerTable(
     /** Get the table in ZK. */
     public Optional<TableRegistration> getTable(TablePath tablePath) throws 
Exception {
         Optional<byte[]> bytes = getOrEmpty(TableZNode.path(tablePath));
-        return bytes.map(TableZNode::decode);
+        Optional<TableRegistration> tableRegistration = 
bytes.map(TableZNode::decode);
+        // Set the default remote data dir for a node generated by an older 
version which does not
+        // have remote data dir
+        return tableRegistration.map(

Review Comment:
   On upgraded clusters with tables created before `remote_data_dir` existed, 
this fallback only covers `getTable()`. `getTables()` still decodes 
`TableZNode` directly, and `MetadataManager.getTables()` uses that batch path 
during coordinator startup and `RpcServiceBase` metadata fetches, so legacy 
tables still reach `toPbTableMetadata().setRemoteDataDir(...)` with a null 
value.



##########
fluss-server/src/test/java/org/apache/fluss/server/zk/data/PartitionRegistrationJsonSerdeTest.java:
##########
@@ -0,0 +1,44 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.fluss.server.zk.data;
+
+import org.apache.fluss.utils.json.JsonSerdeTestBase;
+
+/** Test for {@link PartitionRegistrationJsonSerde}. */
+class PartitionRegistrationJsonSerdeTest extends 
JsonSerdeTestBase<PartitionRegistration> {

Review Comment:
   Add a compatibility test to verify that parsing a legacy `TablePartition` 
JSON result into `PartitionRegistration` functions correctly. Refer to 
`org.apache.fluss.server.zk.data.lake.LakeTableJsonSerdeTest#testVersion1Compatibility`
 for an example of such a compatibility test.



##########
fluss-server/src/test/java/org/apache/fluss/server/zk/data/TableRegistrationJsonSerdeTest.java:
##########
@@ -98,8 +102,8 @@ protected TableRegistration[] createObjects() {
     protected String[] expectedJsons() {
         return new String[] {
             
"{\"version\":1,\"table_id\":1234,\"comment\":\"first-table\",\"partition_key\":[\"a\",\"b\"],"
-                    + 
"\"bucket_key\":[\"b\",\"c\"],\"bucket_count\":16,\"properties\":{},\"custom_properties\":{\"custom-3\":\"\\\"300\\\"\"},\"created_time\":1735538268,\"modified_time\":1735538270}",
-            
"{\"version\":1,\"table_id\":1234,\"comment\":\"second-table\",\"bucket_count\":32,\"properties\":{\"option-3\":\"300\"},\"custom_properties\":{},\"created_time\":-1,\"modified_time\":-1}",
+                    + 
"\"bucket_key\":[\"b\",\"c\"],\"bucket_count\":16,\"properties\":{},\"custom_properties\":{\"custom-3\":\"\\\"300\\\"\"},\"remote_data_dir\":\"file://local/remote\",\"created_time\":1735538268,\"modified_time\":1735538270}",

Review Comment:
   Add a compatibility test to verify that parsing a legacy `TableRegistration` 
JSON result (without `remote_data_dir`) still works. 



##########
fluss-common/src/main/java/org/apache/fluss/metadata/PartitionInfo.java:
##########
@@ -22,19 +22,22 @@
 import java.util.Objects;
 
 /**
- * Information of a partition metadata, includes the partition's name and the 
partition id that
- * represents the unique identifier of the partition.
+ * Information of a partition metadata, includes partition id (unique 
identifier of the partition),
+ * partition name, remote data dir for partitioned data storage, etc.
  *
  * @since 0.2
  */
 @PublicEvolving
 public class PartitionInfo {
     private final long partitionId;
     private final ResolvedPartitionSpec partitionSpec;
+    private final String remoteDataDir;

Review Comment:
   This field is nullable, add `@Nullable` to member fields, constructor 
parameter and `getRemoteDataDir()` method. 



##########
fluss-server/src/main/java/org/apache/fluss/server/utils/ServerRpcMessageUtils.java:
##########
@@ -588,6 +589,14 @@ private static TableMetadata 
toTableMetaData(PbTableMetadata pbTableMetadata) {
                         tableId,
                         pbTableMetadata.getSchemaId(),
                         
TableDescriptor.fromJsonBytes(pbTableMetadata.getTableJson()),
+                        // For backword capability. When an older Coordinator 
sends an

Review Comment:
   ```suggestion
                           // For backward capability. When an older 
Coordinator sends an
   ```



##########
fluss-common/src/main/java/org/apache/fluss/metadata/TableInfo.java:
##########
@@ -59,6 +59,7 @@ public final class TableInfo {
     private final Configuration properties;
     private final TableConfig tableConfig;
     private final Configuration customProperties;
+    private final String remoteDataDir;

Review Comment:
   This field is nullable, add `@Nullable` to member fields, constructor 
parameter and `getRemoteDataDir()` method. 



##########
fluss-server/src/main/java/org/apache/fluss/server/zk/data/TableRegistration.java:
##########
@@ -52,6 +52,17 @@ public class TableRegistration {
     public final int bucketCount;
     public final Map<String, String> properties;
     public final Map<String, String> customProperties;
+
+    /**
+     * The remote data directory of the table. It is null if and only if it is 
deserialized by
+     * {@link TableRegistrationJsonSerde} from an existing node produced by an 
older version that
+     * does not support multiple remote paths. But immediately after that, we 
will set it as the
+     * default remote file path configured by {@link 
ConfigOptions#REMOTE_DATA_DIR} (see {@link
+     * org.apache.fluss.server.zk.ZooKeeperClient#getTable}). This unifies 
subsequent usage and
+     * eliminates the need to account for differences between versions.
+     */
+    public final String remoteDataDir;

Review Comment:
   This field is nullable, add `@Nullable` to member fields, constructor 
parameter and `getRemoteDataDir()` method. 



##########
fluss-common/src/main/java/org/apache/fluss/config/FlussConfigUtils.java:
##########
@@ -59,6 +59,25 @@ public static boolean isAlterableTableOption(String key) {
         return ALTERABLE_TABLE_OPTIONS.contains(key);
     }
 
+    /**
+     * Returns the default remote data directory from the configuration. Used 
as a fallback for
+     * tables or partitions that do not contain remote data directory metadata.
+     *
+     * @param conf the Fluss configuration
+     * @return the default remote data directory path, never {@code null} if 
the configuration is
+     *     valid (i.e., at least one of {@code remote.data.dir} or {@code 
remote.data.dirs} is set)
+     * @see ConfigOptions#REMOTE_DATA_DIR
+     * @see ConfigOptions#REMOTE_DATA_DIRS
+     */
+    public static String getDefaultRemoteDataDir(Configuration conf) {
+        List<String> remoteDataDirs = conf.get(ConfigOptions.REMOTE_DATA_DIRS);
+        if (!remoteDataDirs.isEmpty()) {
+            return remoteDataDirs.get(0);
+        }
+
+        return conf.get(ConfigOptions.REMOTE_DATA_DIR);

Review Comment:
   Throws exception if the value of `conf.get(ConfigOptions.REMOTE_DATA_DIR)` 
is null. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to