Re: [PR] [SPARK] Fix add_files type conversion exception and incorrect partition value when handling null partitions [iceberg]

via GitHub Fri, 02 May 2025 17:58:23 -0700


amogh-jahagirdar commented on code in PR #12886:
URL: https://github.com/apache/iceberg/pull/12886#discussion_r2072266707



##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/Spark3Util.java:
##########
@@ -946,7 +946,7 @@ public static List<SparkPartition> getPartitions(
                         Object catalystValue = 
partition.values().get(fieldIndex, field.dataType());
                         Object value =
                             
CatalystTypeConverters.convertToScala(catalystValue, field.dataType());
-                        values.put(field.name(), String.valueOf(value));
+                        values.put(field.name(), (value == null) ? null : 
value.toString());

Review Comment:
   I definitely agree that this is a problem since these should be considered 
as null identity partition values in Iceberg but I feel like this solution is 
kind of brittle since we're essentially relying on the map having null values. 
   
   The awkwardness of that is some downstream consumer taking the map<String, 
string> and checking the partition value in the map cannot differentiate if the 
partition value is literally null or it's not even a partition field to begin 
with (not an entry in the map) without an explicit containsKey.
   
   Ideally, we can actually start to deprecate the use of Map<String, String> 
in the SparkPartition and actually just include a StructLike of the identity 
partition values. I know that's a larger change just to fix this particular 
problem but it feels like the right thing to do in general. Let me think about 
this a bit more



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] [SPARK] Fix add_files type conversion exception and incorrect partition value when handling null partitions [iceberg]

Reply via email to