amogh-jahagirdar commented on code in PR #12886:
URL: https://github.com/apache/iceberg/pull/12886#discussion_r2076728051


##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/Spark3Util.java:
##########
@@ -946,7 +946,7 @@ public static List<SparkPartition> getPartitions(
                         Object catalystValue = 
partition.values().get(fieldIndex, field.dataType());
                         Object value =
                             
CatalystTypeConverters.convertToScala(catalystValue, field.dataType());
-                        values.put(field.name(), String.valueOf(value));
+                        values.put(field.name(), (value == null) ? null : 
value.toString());

Review Comment:
   Ok while I think in the long run I'd prefer to refactor all this 
SparkPartition code to just store a `PartitionData` which is a struct like, and 
ultimately build the ICeberg data file from real partition values instead of 
hive path inference, I think for the fix we're trying to achieve it's probably 
more work than it's worth at the moment.  
   
   I think an effective alternative is to treat the null case as a 
[HIVE_DEFAULT_PARTITION] 
(https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/types/Conversions.java#L40)
 in the existing partition values map. When building the data file metadata 
from the path, that will ultimately also get treated as null and is logically 
consistent since we're talking about the add files/import from hive tables case 
here. It also makes it clear to anything consuming the spark partition values 
that it's null since that's the convention at least for the hive case. 
   
   Let me know if that makes sense @hariuserx ? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to