hemanth-saal opened a new issue, #6502:
URL: https://github.com/apache/iceberg/issues/6502
### Feature Request / Improvement
With Iceberg 1.0.0, spark.sql("INSERT OVERWRITE ...") worked fine even when the DataFrame's column order was not the same as the table's column order. After upgrading to Iceberg 1.1.0, we started facing this issue: it appears the DataFrame columns are no longer mapped to the table columns by name, but by position. Once I reordered the DataFrame columns to match the table, it worked fine in 1.1.0 as well (a sketch of that workaround follows).
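A minimal PySpark sketch of that reordering workaround; the names `target_table` and `new_data_df` are illustrative placeholders, not the actual identifiers from our job:

```python
# Hedged sketch: align the DataFrame's column order with the target table so
# the positional mapping used by INSERT OVERWRITE lines up.
target_table = "datawarehouse.myschema.mytable"  # hypothetical table name

table_cols = spark.table(target_table).columns   # column order of the table
aligned_df = new_data_df.select(*table_cols)     # reorder DataFrame columns by name

aligned_df.createOrReplaceTempView("new_data")
spark.sql(f"INSERT OVERWRITE {target_table} SELECT * FROM new_data")
```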
Issue details: the positions of vendor_published_at (timestamp column) and source_name (string column) differ between the DataFrame and the table.

Spark DataFrame column order:
['year', 'month', 'region_id', 'region_name', 'country_id', 'country_name', 'product_id', 'product_name', 'value', 'scenario_id', 'units', 'data_quality_id', 'data_quality_name', 'vendor_published_at', 'source_name', 'ingested_at']

Table column order:
['year', 'month', 'region_id', 'region_name', 'country_id', 'country_name', 'product_id', 'product_name', 'value', 'scenario_id', 'units', 'data_quality_id', 'data_quality_name', 'source_name', 'vendor_published_at', 'ingested_at']
spark.sql("Insert Overwrite Statement: INSERT OVERWRITE
datawarehouse.<schema>.<table> SELECT * from new_data")
22/12/29 05:56:36 INFO BaseMetastoreCatalog: Table loaded by catalog: datawarehouse.<schema>.<table>
22/12/29 05:56:36 ERROR Cannot write incompatible data to table 'datawarehouse.<schema>.<table>':
- Cannot safely cast 'vendor_published_at': string to timestamp
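A possible alternative that avoids positional mapping entirely is Spark's DataFrameWriterV2 API, which resolves write columns by name; this is only a sketch with an illustrative table name, assuming a dynamic partition overwrite is what the job needs:

```python
# Hedged sketch: DataFrameWriterV2 resolves write columns by name, so the
# DataFrame's column order should not matter. The table name is hypothetical.
new_data_df.writeTo("datawarehouse.myschema.mytable").overwritePartitions()
```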
### Query engine
Spark