andreaschiappacasse opened a new issue, #10294:
URL: https://github.com/apache/iceberg/issues/10294

   ### Apache Iceberg version
   
   None
   
   ### Query engine
   
   Athena
   
   ### Please describe the bug 🐞
   
   Hello everyone, today my team ran into a very strange bug using Iceberg 
via Athena. I'll describe the steps we used to reproduce the error below:
   
   **1. We create an Iceberg table with an "id" column and 321 other columns 
with random strings - in the example below we use awswrangler to create the 
table, but the same happens when the table is created using Athena directly.**
   
   ``` python
   import random
   import string
   
   import awswrangler as wr
   import pandas as pd
   
   NUM_COLS = 322  # "id" plus 321 randomly named columns
   
   def get_random_string(length):
       """Return a random lowercase ASCII string of the given length."""
       letters = string.ascii_lowercase
       return ''.join(random.choice(letters) for _ in range(length))
   
   columns = ['id'] + [get_random_string(5) for _ in range(NUM_COLS - 1)]
   # Single row whose values are the column names themselves
   data = pd.DataFrame(data=[columns], columns=columns)
   
   wr.athena.to_iceberg(
       data,
       workgroup="my-workgroup",
       database="my_database",
       table="iceberg_limits_322",
       table_location="s3://my_bucket/iceberg_limits",
   )
   ```
   
   **2. We then run the following query in Athena to insert a random value:**
   
   ``` sql
   MERGE INTO my_database.iceberg_limits_322 as existing 
   using (
        SELECT 'something' as id
   ) as new on existing.id = new.id
   WHEN NOT MATCHED
   THEN INSERT (id) VALUES (new.id)
   WHEN MATCHED THEN DELETE
   
   ```
   
   **3. This results in the error:**
   
   `[ErrorCode: INTERNAL_ERROR_QUERY_ENGINE] Amazon Athena experienced an 
internal error while executing this query. Please contact AWS support for 
further assistance. You will not be charged for this query. We apologize for 
the inconvenience.`
   
   
   Notice that the error **only occurs when multiple WHEN clauses are used in 
the MERGE INTO query!** If only one WHEN clause is used (just to insert or just 
to delete records), everything works fine, and the table can be used normally.
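
To help isolate the failure, the two single-WHEN variants described above can be derived from the failing statement. The following is only a minimal sketch under stated assumptions: the helper name `single_when_variants` is hypothetical, the table and column names are taken from the example above, and the code only builds SQL strings (nothing is sent to Athena).

```python
def single_when_variants(table, key="id", value="something"):
    """Build the two single-WHEN MERGE statements from the failing query.

    Hypothetical helper for illustration; it only formats SQL text.
    """
    # Shared prefix: same source row and join condition as the original MERGE
    head = (
        f"MERGE INTO {table} AS existing\n"
        f"USING (\n"
        f"    SELECT '{value}' AS {key}\n"
        f") AS new ON existing.{key} = new.{key}\n"
    )
    # Variant with only the insert clause
    insert_only = head + f"WHEN NOT MATCHED\nTHEN INSERT ({key}) VALUES (new.{key})"
    # Variant with only the delete clause
    delete_only = head + "WHEN MATCHED THEN DELETE"
    return {"insert_only": insert_only, "delete_only": delete_only}


variants = single_when_variants("my_database.iceberg_limits_322")
for name, sql in variants.items():
    print(f"-- {name}\n{sql}\n")
```

Running the two statements one after the other is not equivalent to the combined MERGE (a row inserted by the first would be deleted by the second), so they are only meant to confirm that each clause works on its own against the same table.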
   
   We can replicate this behaviour on multiple AWS accounts and with different 
tables/databases/S3 locations.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

