adrians commented on issue #12554:
URL: https://github.com/apache/iceberg/issues/12554#issuecomment-3113576724

   I hit a similar problem when querying, from Impala, a table that was 
migrated using `rewrite_table_path`.
   
   Impala reads the manifest-list to get the manifest-files and their 
declared sizes. The manifest-files were rewritten for the new path, so their 
sizes differ from those in the original table (the file content has changed), 
but the manifest-list still declares the old sizes. Impala then throws an 
error because the actual manifest-file size doesn't match the size declared 
in the manifest-list.
   
   Hive and Spark query-engines don't check the manifest-file length, but 
Impala does.
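
A minimal Python sketch of that kind of length check, to make the mismatch 
concrete. It assumes the manifest-files are on a local filesystem and that the 
manifest-list rows have already been decoded. `ManifestEntry` and 
`find_length_mismatches` are hypothetical names, not Iceberg or Impala APIs; 
only the `manifest_path` and `manifest_length` field names come from the 
manifest-list schema in the Iceberg spec.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ManifestEntry:
    """One decoded manifest-list row (hypothetical container type)."""
    manifest_path: str
    manifest_length: int  # size declared in the manifest-list, in bytes


def find_length_mismatches(entries):
    """Return (path, declared, actual) for every manifest-file whose on-disk
    size differs from the size declared in the manifest-list."""
    mismatches = []
    for entry in entries:
        actual = os.path.getsize(entry.manifest_path)
        if actual != entry.manifest_length:
            mismatches.append((entry.manifest_path, entry.manifest_length, actual))
    return mismatches
```

Running something like this over a migrated table's manifest-lists should flag 
the same files that Impala refuses to open.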
   
   Impala error-message:
   ```
   AnalysisException: Failed to load metadata for table: '...'
   CAUSED BY: TableLoadingException: Could not load table ... from catalog
   CAUSED BY: TException: 
TGetPartialCatalogObjectResponse(status:TStatus(status_code:GENERAL, 
error_msgs:[IcebergTableLoadingException: Error loading metadata for Iceberg 
table ...
   CAUSED BY: UncheckedIOException: Failed to open input stream for file 
...-m2.avro:
   java.io.IOException:
   Expected to read 730455 bytes, but only 729310 bytes read.
   CAUSED BY:
   IOException:
   Expected to read 730455 bytes, but only 729310 bytes read.]), 
lookup_status:OK)
   ```
   
   It seems that the logical flow should be:
   ```mermaid
   flowchart LR
   A["<b>Rewrite delete files</b><br>- Update paths for data-files"]-->B["<b>Rewrite the manifest files</b><br>- Update paths for data-files<br>- Update paths and sizes for delete-files"]-->C["<b>Rewrite the manifest-lists</b><br>- Update paths and sizes for manifest-files"]
   ```
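
The bottom-up ordering can be sketched in Python as follows. This is a toy 
model and not how `rewrite_table_path` is implemented: a dict stands in for 
storage, JSON stands in for Avro, and `rewrite_path`, `rewrite_level`, and 
the `path`/`length` row keys are all hypothetical. The point is only that a 
parent file is rewritten after its children, so it can record their new sizes.

```python
import json


def rewrite_path(path, old_prefix, new_prefix):
    """Swap the table-location prefix on a single path."""
    if not path.startswith(old_prefix):
        raise ValueError(f"{path!r} is not under {old_prefix!r}")
    return new_prefix + path[len(old_prefix):]


def rewrite_level(files, parent_path, old_prefix, new_prefix):
    """Rewrite one parent file (a manifest or a manifest-list).

    `files` maps path -> bytes and stands in for storage. Every child must
    already be stored under its new path so that its final size is known.
    Returns the parent's new path.
    """
    rows = json.loads(files[parent_path])
    for row in rows:
        row["path"] = rewrite_path(row["path"], old_prefix, new_prefix)
        row["length"] = len(files[row["path"]])  # size of the rewritten child
    new_parent = rewrite_path(parent_path, old_prefix, new_prefix)
    files[new_parent] = json.dumps(rows).encode()
    return new_parent
```

Calling `rewrite_level` on each manifest first and on the manifest-list last 
means the manifest-list picks up the post-rewrite manifest sizes instead of 
the stale ones.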


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

