peach12345 commented on issue #13406:
URL: https://github.com/apache/iceberg/issues/13406#issuecomment-3096838696

   Hi @mxm, 
   Thank you so much for your reply! 
   
   > Are you referring to something like the following?
   
   Yes. To achieve that, we need to traverse the entire list to collect all 
attributes, and it gets much more complex when there are multiple nested lists. 
It is possible, but more complicated, and it will impact performance.
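   To illustrate the traversal, here is a minimal sketch in plain Python. It 
is not our actual code and does not use any Iceberg API: events are assumed to 
be JSON-like dicts, and `infer_type` and `merge` are hypothetical helper names. 
Every element of every list has to be visited, because any one of them may 
carry an attribute the others lack.

```python
def merge(a, b):
    """Merge two inferred types; None means 'not seen yet'."""
    if a is None:
        return b
    if b is None:
        return a
    if a == b:
        return a
    if isinstance(a, dict) and isinstance(b, dict):
        if "struct" in a and "struct" in b:
            # Union of the fields seen on either side, merged recursively.
            fields = dict(a["struct"])
            for name, field_type in b["struct"].items():
                fields[name] = merge(fields.get(name), field_type)
            return {"struct": fields}
        if "list" in a and "list" in b:
            return {"list": merge(a["list"], b["list"])}
    raise ValueError(f"conflicting types: {a!r} vs {b!r}")


def infer_type(value):
    """Infer a simple type description for one JSON-like value."""
    if value is None:
        return None
    if isinstance(value, dict):
        return {"struct": {k: infer_type(v) for k, v in value.items()}}
    if isinstance(value, list):
        # Visit every element: each one may contribute new attributes.
        element = None
        for item in value:
            element = merge(element, infer_type(item))
        return {"list": element}
    if isinstance(value, bool):  # check bool before int (bool is an int subclass)
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    return "string"


event = {"items": [{"a": 1}, {"b": "x", "tags": [{"k": 1}, {"v": 2.0}]}]}
schema = infer_type(event)
```

Here the element type of `items` ends up with the fields `a`, `b`, and `tags`, 
although no single element contains all three. The cost grows with the total 
number of nested elements, which is the performance impact mentioned above.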
   
   > That sounds like you are replacing functionality of the Dynamic Sink. The 
first part, schema generation, should be handled by you, but merging with the 
existing table schema should be handled by the Dynamic Sink.
   
   We actually ran into issues where IDs newly generated after a change in the 
event structure conflicted with the IDs already defined in the table schema. In 
some cases, the Dynamic Sink attempted to add a field to a non-struct/object 
field, which caused problems. 
   So now we match the existing IDs and generate new ones only for new fields. 
That is working for us.
   If we find time, we can try to reproduce this issue and provide an example. 
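   Roughly, the ID matching looks like the following sketch. It is plain 
Python with hypothetical names (`max_id`, `assign_ids`) and a simplified dict 
layout, not the Iceberg API and not our production code: fields that already 
exist by name keep their ID, new fields get IDs above the highest one in use, 
and a struct/non-struct mismatch is rejected up front.

```python
import itertools


def max_id(fields):
    """Return the highest field ID used anywhere in a nested schema."""
    highest = 0
    for field in fields.values():
        highest = max(highest, field["id"], max_id(field.get("fields") or {}))
    return highest


def assign_ids(incoming, existing, counter):
    """Reuse IDs of fields known by name; allocate fresh IDs for new fields.

    incoming: name -> None (leaf) or nested dict (struct)
    existing: name -> {"id": int, "fields": {...}} ("fields" only on structs)
    """
    result = {}
    for name, children in incoming.items():
        known = existing.get(name)
        if known is not None and (children is None) != ("fields" not in known):
            # For example: trying to add sub-fields to a non-struct column.
            raise ValueError(f"field {name!r} changed between struct and non-struct")
        field = {"id": known["id"] if known else next(counter)}
        if children is not None:
            known_children = (known or {}).get("fields") or {}
            field["fields"] = assign_ids(children, known_children, counter)
        result[name] = field
    return result


table = {"id": {"id": 1}, "user": {"id": 2, "fields": {"name": {"id": 3}}}}
incoming = {"user": {"name": None, "email": None}, "id": None, "ts": None}

counter = itertools.count(max_id(table) + 1)
merged = assign_ids(incoming, table, counter)
```

In this example `user`, `user.name`, and `id` keep their IDs 2, 3, and 1, while 
the new fields `user.email` and `ts` receive 4 and 5.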
   
   > Field order should be handled automatically by the Dynamic Sink. We match 
fields by name.
   
   We weren’t entirely sure about that, so we took the safest route and 
verified that the order of the schema matches the order of the row data. But it 
might be safe to skip that part now. Thanks again for your advice! 😊


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


