beryllw commented on PR #2779:
URL: https://github.com/apache/fluss/pull/2779#issuecomment-4072060741

   `INSERT INTO t (partial columns) VALUES ...` is converted into multiple 
values sources, one per row, and executed as a UNION ALL. Since the Flink 
runtime does not guarantee left-to-right ordering of UNION ALL inputs, and 
Fluss assigns auto-increment IDs based on server-side arrival order, we can 
occasionally observe non-deterministic ordering in test results.
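
To make the effect concrete, here is a minimal sketch in plain Python. It is a model of the behavior described above, not Fluss or Flink code: `assign_ids` is a hypothetical helper, and starting the IDs at 1 is an assumption made only for illustration.

```python
import itertools


def assign_ids(arrival_order):
    """Assign auto-increment IDs in the order rows reach the server (IDs start at 1 here)."""
    return {row: next_id for next_id, row in enumerate(arrival_order, start=1)}


# Three rows from one INSERT, each emitted by its own UNION ALL branch.
rows = ("a", "b", "c")

# Without an ordering guarantee, any interleaving of the branches can occur,
# and each interleaving yields a different row-to-ID mapping.
outcomes = {
    tuple(sorted(assign_ids(order).items()))
    for order in itertools.permutations(rows)
}
```

A test that expects row `a` to always get the first ID only holds for the interleavings in which `a` arrives first, which would explain why the failure shows up only occasionally.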
   
   1. A partial-column write with `INSERT INTO t (partial columns) 
VALUES ...;` goes through Flink's PreValidateReWriter, which rewrites the 
statement before validation.
   2. PreValidateReWriter.rewriteValues() pads each row in the VALUES 
clause into a complete row, filling missing columns with CAST(NULL AS type).
   
https://github.com/apache/flink/blob/f624c8b2ae3035089e46223f4926cfdb50b7bed6/flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/calcite/PreValidateReWriter.scala#L100-L103
   3. Since CAST(...) is not a SqlLiteral, SqlToRelConverter.convertRowValues() 
cannot build a single VALUES node, and the VALUES clause falls back to a 
row-by-row UNION ALL.
   
https://github.com/apache/flink/blob/f624c8b2ae3035089e46223f4926cfdb50b7bed6/flink-table/flink-table-planner/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L1922-L1926
   4. The Flink runtime does not guarantee the ordering of UNION ALL inputs, 
meaning records from different union branches may arrive in a non-deterministic 
order.
   
https://github.com/apache/flink/blob/f624c8b2ae3035089e46223f4926cfdb50b7bed6/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/io/StreamMultipleInputProcessor.java#L127-L143
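
Steps 2 and 3 can be sketched as a small Python model. This is a simplification of what the steps above describe, not the actual Flink or Calcite code: `Literal`, `CastNull`, `pad_row`, and `plan_values` are hypothetical names, and the schema is made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """Stands in for a SqlLiteral cell."""
    value: object


@dataclass(frozen=True)
class CastNull:
    """Stands in for a CAST(NULL AS type) cell, which is not a literal."""
    type_name: str


def pad_row(row, schema):
    """Step 2: pad a partial row to the full schema with CAST(NULL AS type)."""
    return [row.get(column, CastNull(type_name)) for column, type_name in schema.items()]


def plan_values(rows):
    """Step 3: keep one VALUES node only if every cell is a literal."""
    if all(isinstance(cell, Literal) for row in rows for cell in row):
        return ("VALUES", rows)
    # Otherwise fall back to one single-row branch per row, joined by UNION ALL.
    return ("UNION ALL", [("VALUES", [row]) for row in rows])


# Hypothetical table: only `name` is written, `id` and `age` are padded.
schema = {"id": "BIGINT", "name": "STRING", "age": "INT"}
padded = [
    pad_row({"name": Literal("alice")}, schema),
    pad_row({"name": Literal("bob")}, schema),
]
plan = plan_values(padded)
```

In this model, `plan` is a UNION ALL with one branch per row, while an INSERT that supplies every column as a literal stays a single VALUES node and keeps its row order.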
   
   The issue can be reproduced with this commit: 
https://github.com/beryllw/fluss/commit/001a1b97a7b9f77a0861ee6a96d69fa8fdfeb206


