armitage420 opened a new issue, #12395:
URL: https://github.com/apache/iceberg/issues/12395

   ### Feature Request / Improvement
   
   **Description**
   Currently, Iceberg's Avro `DataReader` does not support Avro's 
`timestamp-millis` logical type. This causes failures when migrating Avro 
tables created with Hive 4 (which may use the `timestamp-millis` logical type) 
to Iceberg tables. Supporting `timestamp-millis` would improve compatibility 
and ease migration for users.
   
   **Current behavior**
   After an in-place migration of a Hive 4 Avro table that contains a 
timestamp column to an Iceberg table, SELECT operations throw an 
`IllegalArgumentException`. The error occurs when Iceberg attempts to map the 
Avro schema to the Iceberg table schema.
   
   **Error message**
   The query fails with `IllegalArgumentException: Unknown logical type: org.apache.hive.iceberg.org.apache.avro.LogicalTypes$TimestampMillis`.
   
   **Steps to reproduce**
   1. Create an Avro table in Hive with a timestamp column:
   ```
   CREATE EXTERNAL TABLE hive_test(`id` int, `name` string, `dt` timestamp) STORED AS AVRO;
   ```
   
   2. Insert test data:
   ```
   INSERT INTO hive_test VALUES (1, "test name", CAST('2024-08-09 14:08:26.326107' AS TIMESTAMP));
   ```
   
   3. Verify the data:
   ```
   SELECT * FROM hive_test;
   ```
   
   4. Migrate the table to Iceberg:
   ```
   ALTER TABLE hive_test SET TBLPROPERTIES (
     'storage_handler'='org.apache.iceberg.mr.hive.HiveIcebergStorageHandler',
     'format-version'='2');
   ```
   
   5. Attempt to query the migrated table:
   ```
   SELECT * FROM hive_test;
   ```
   Step 5 fails with the `IllegalArgumentException` shown above.
   
   **Additional context**
   Debugging the Iceberg code shows that `DataReader` supports timestamps in 
microseconds but not in milliseconds.
   In Iceberg's `TypeToSchema.java`, timestamps are converted to the 
`timestamp-micros` logical type:
   ```
   private static final Schema TIMESTAMP_SCHEMA =
       LogicalTypes.timestampMicros().addToSchema(Schema.create(Schema.Type.LONG));
   private static final Schema TIMESTAMPTZ_SCHEMA =
       LogicalTypes.timestampMicros().addToSchema(Schema.create(Schema.Type.LONG));
   ```
   This issue may not occur for tables originally created in Iceberg, but it 
affects migration of Hive Avro tables to Iceberg.
   Engines that use Iceberg connectors (e.g., Hive) may encounter this issue 
during table migration.
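   
   To make the unit mismatch concrete, here is a minimal sketch of the conversion a millisecond-aware read path would need. It is not Iceberg code: the class `TimestampMillisSketch` and the method `millisToMicros` are hypothetical names used only for illustration, and the sample value assumes the timestamp from step 2 was stored at millisecond precision.
   
   ```java
   import java.time.LocalDateTime;
   import java.time.ZoneOffset;

   // Hypothetical helper, not part of Iceberg: it only illustrates the unit conversion.
   final class TimestampMillisSketch {
     private TimestampMillisSketch() {}

     // Avro timestamp-millis stores milliseconds since the epoch as a long;
     // timestamp-micros stores microseconds, so the value is scaled by 1000.
     static long millisToMicros(long epochMillis) {
       // multiplyExact throws ArithmeticException instead of silently overflowing.
       return Math.multiplyExact(epochMillis, 1000L);
     }

     public static void main(String[] args) {
       // The value from step 2, assumed to be truncated to millisecond precision.
       LocalDateTime dt = LocalDateTime.of(2024, 8, 9, 14, 8, 26, 326_000_000);
       long millis = dt.toInstant(ZoneOffset.UTC).toEpochMilli();
       long micros = millisToMicros(millis);
       System.out.println(millis + " ms -> " + micros + " us");
     }
   }
   ```
   
   An actual fix would presumably plug a conversion like this into the reader that is chosen when the file schema carries `timestamp-millis`; where exactly that belongs in `DataReader` is left open here.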
   
   
   ### Query engine
   
   Hive
   
   ### Willingness to contribute
   
   - [x] I can contribute this improvement/feature independently
   - [x] I would be willing to contribute this improvement/feature with 
guidance from the Iceberg community
   - [ ] I cannot contribute this improvement/feature at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

