liurenjie1024 commented on code in PR #1625:
URL: https://github.com/apache/iceberg-rust/pull/1625#discussion_r2303487259


##########
crates/iceberg/src/writer/file_writer/location_generator.rs:
##########
@@ -21,14 +21,24 @@ use std::sync::Arc;
 use std::sync::atomic::AtomicU64;
 
 use crate::Result;
-use crate::spec::{DataFileFormat, TableMetadata};
+use crate::spec::{DataFileFormat, PartitionKey, PartitionKeyExt, 
TableMetadata};
 
 /// `LocationGenerator` used to generate the location of data file.
 pub trait LocationGenerator: Clone + Send + 'static {
-    /// Generate an absolute path for the given file name.
-    /// e.g.
-    /// For file name "part-00000.parquet", the generated location maybe 
"/table/data/part-00000.parquet"
-    fn generate_location(&self, file_name: &str) -> String;
+    /// Generate an absolute path for the given file name that includes the 
partition path.
+    ///
+    /// # Arguments
+    ///
+    /// * `partition_key` - The partition key of the file. If None, generate a 
non-partitioned path.
+    /// * `file_name` - The name of the file
+    ///
+    /// # Returns
+    ///
+    /// An absolute path that includes the partition path, e.g.,
+    /// "/table/data/id=1/name=alice/part-00000.parquet"
+    /// or non-partitioned path:
+    /// "/table/data/part-00000.parquet"
+    fn generate_location(&self, partition_key: Option<&PartitionKey>, 
file_name: &str) -> String;

Review Comment:
   This is a breaking change, but it seems reasonable.



##########
crates/iceberg/src/spec/partition.rs:
##########
@@ -176,6 +176,46 @@ impl PartitionSpec {
     }
 }
 
+/// A partition key represents a specific partition in a table, containing the 
partition spec,
+/// schema, and the actual partition values.
+#[derive(Clone, Debug)]
+pub struct PartitionKey {
+    /// The partition spec that contains the partition fields.
+    spec: PartitionSpec,
+    /// The schema to which the partition spec is bound.
+    schema: SchemaRef,
+    /// Partition fields' values in struct.
+    data: Struct,
+}
+
+impl PartitionKey {
+    /// Creates a new partition key with the given spec, schema, and data.
+    pub fn new(spec: PartitionSpec, schema: SchemaRef, data: Struct) -> Self {
+        Self { spec, schema, data }
+    }
+
+    /// Generates a partition path based on the partition values.
+    pub fn to_path(&self) -> String {
+        self.spec.partition_to_path(&self.data, self.schema.clone())
+    }
+}
+
+/// Extension to help check if a partition key is effectively none.
+pub trait PartitionKeyExt {

Review Comment:
   This is a little over design. What I mean was just like following:
   ```
   impl PartitionKey {
        fn is_effectively_none(input: Option<&PartitionKey>) -> bool {
       ...
       }
   }
   ```



##########
crates/iceberg/src/writer/file_writer/location_generator.rs:
##########
@@ -136,8 +159,17 @@ pub(crate) mod test {
     }
 
     impl LocationGenerator for MockLocationGenerator {
-        fn generate_location(&self, file_name: &str) -> String {
-            format!("{}/{}", self.root, file_name)
+        fn generate_location(&self, partition: Option<&PartitionKey>, 
file_name: &str) -> String {

Review Comment:
   Do we really need this `MockLocationGenerator`? Why not just add a method 
for `DefaultLocationGenerator` such as `with_data_location` so that all the  
tests cover using it?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to