ahmedabu98 opened a new issue, #11899:
URL: https://github.com/apache/iceberg/issues/11899

   ### Apache Iceberg version
   
   1.7.1 (latest release)
   
   ### Query engine
   
   Other
   
   ### Please describe the bug 🐞
   
   We've been developing an Iceberg connector at [Apache Beam](https://github.com/apache/beam/) using the Java API, and I noticed some rough edges around partitioning with the time-based transforms (i.e. year, month, day, or hour).
   
   See the following code:
   ```java
   org.apache.iceberg.Schema schema =
       new org.apache.iceberg.Schema(
           Types.NestedField.required(1, "year", Types.TimestampType.withoutZone()),
           Types.NestedField.required(2, "day", Types.TimestampType.withoutZone()));
   PartitionSpec spec = PartitionSpec.builderFor(schema)
           .year("year")
           .day("day").build();
   Table table = catalog.createTable(TableIdentifier.parse("db.table"), schema, spec);
   PartitionKey pk = new PartitionKey(spec, schema);
   
   LocalDateTime val = LocalDateTime.parse("2024-10-08T13:18:20.053");
   Record rec = GenericRecord.create(schema).copy(
           ImmutableMap.of(
                   "year", val, 
                   "day", val));
   pk.partition(rec);
   ```
   
   I'm applying a simple partition to my original record and would expect it to work, but the last line (`pk.partition(rec)`) fails with the following error:
   ```
   java.lang.IllegalStateException: Not an instance of java.lang.Long: 2024-10-08T13:18:20.053
        at org.apache.iceberg.data.GenericRecord.get(GenericRecord.java:123)
        at org.apache.iceberg.Accessors$PositionAccessor.get(Accessors.java:71)
        at org.apache.iceberg.Accessors$PositionAccessor.get(Accessors.java:58)
        at org.apache.iceberg.StructTransform.wrap(StructTransform.java:78)
        at org.apache.iceberg.PartitionKey.wrap(PartitionKey.java:30)
        at org.apache.iceberg.PartitionKey.partition(PartitionKey.java:64)
   ```
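
For context on why a `Long` is expected: the Iceberg table spec stores `timestamp` values as microseconds since 1970-01-01T00:00:00, and the `year` and `day` transforms yield years and days since 1970-01-01. The trace suggests the partition accessor asks the record for that internal `Long` form rather than a `LocalDateTime`. The JDK-only sketch below shows that conversion for the value used in the example; `TimestampMicros` and its methods are hypothetical names for illustration, not part of the Iceberg API.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical helper for illustration only; not part of the Iceberg API.
final class TimestampMicros {
  private static final LocalDateTime EPOCH = LocalDateTime.of(1970, 1, 1, 0, 0);
  private static final long MICROS_PER_DAY = 86_400_000_000L;

  // Microseconds between the epoch and a zone-less timestamp.
  static long toMicros(LocalDateTime ts) {
    return ChronoUnit.MICROS.between(EPOCH, ts);
  }

  // Whole days since 1970-01-01; floorDiv keeps pre-epoch values correct.
  static long daysSinceEpoch(long micros) {
    return Math.floorDiv(micros, MICROS_PER_DAY);
  }

  // Whole calendar years since 1970.
  static int yearsSinceEpoch(long micros) {
    return LocalDate.ofEpochDay(daysSinceEpoch(micros)).getYear() - 1970;
  }

  public static void main(String[] args) {
    long micros = toMicros(LocalDateTime.parse("2024-10-08T13:18:20.053"));
    System.out.println("micros = " + micros);
    System.out.println("days   = " + daysSinceEpoch(micros));
    System.out.println("years  = " + yearsSinceEpoch(micros));
  }
}
```

A record whose timestamp fields already hold such `long` values would satisfy the `Long` check that fails in the trace above.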
   
   We've been able to work around it with [this logic](https://github.com/apache/beam/blob/18ec3317e500a6fee72fc8c24552c21808437bef/sdks/java/io/iceberg/src/main/java/org/apache/beam/sdk/io/iceberg/RecordWriterManager.java#L211-L230), replicated below:
   <details>
   <summary><b>Work-around</b></summary>
   
   ```java
   private Record getPartitionableRecord(
       Record record, PartitionSpec spec, org.apache.iceberg.Schema schema) {
     if (spec.isUnpartitioned()) {
       return record;
     }
     Record output = GenericRecord.create(schema);
     for (PartitionField partitionField : spec.fields()) {
       Transform<?, ?> transform = partitionField.transform();
       Types.NestedField field = schema.findField(partitionField.sourceId());
       String name = field.name();
       Object value = record.getField(name);
       @Nullable Literal<Object> literal = Literal.of(value.toString()).to(field.type());
       if (literal == null || transform.isVoid() || transform.isIdentity()) {
         output.setField(name, value);
       } else {
         output.setField(name, literal.value());
       }
     }
     return output;
   }
   ```
   </details>
   
   So that instead we have this:
   
   ```java
   Record partitionableRec = getPartitionableRecord(rec, spec, schema);
   pk.partition(partitionableRec);
   ```
   
   This feels a little hacky and I would expect the Iceberg API to handle this 
by itself. Let me know if I'm missing something!
   
   ### Willingness to contribute
   
   - [ ] I can contribute a fix for this bug independently
   - [X] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
   - [ ] I cannot contribute a fix for this bug at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

