myandpr commented on PR #21032: URL: https://github.com/apache/datafusion/pull/21032#issuecomment-4172741676
@alamb @Jefffrey sorry for the delayed reply. I spent some time digging into this further and also looked through the discussion in #20070.

I updated the implementation to keep the `OneOf` fix in the shared coercion path, while restoring more actionable diagnostics where they can be determined reliably.

The current approach is:

- keep the `OneOf` logic in the shared coercion path rather than special-casing individual functions there
- preserve concrete `arity` errors instead of falling back to a generic mismatch
- preserve concrete type errors for unique matching branches
- preserve hinted errors such as the `Binary -> String` cast hint
- combine same-arity `Coercible` mismatches into a concrete error instead of falling back to a generic one
- keep function-specific semantic diagnostics only where the generic `OneOf` logic cannot express the intended error well, such as `sum(Boolean)` / `avg(Boolean)`
- narrow the public `signature()` for `first_value` / `last_value` / `nth_value` so validation, candidate signatures, and introspection all reflect the real public contract

I also put together a few representative before/after examples from actual runs on `origin/main` vs this branch:

| Case | Why this case | SQL | Before (`origin/main`) | After (current branch) |
|---|---|---|---|---|
| hex_arity | arity mismatch on Spark function mentioned in review | `select hex(1, 2);` | DataFusion error: Error during planning: Internal error: Function 'hex' failed to match any signature, errors: Error during planning: Function 'hex' expects 1 arguments but received 2,Error during planning: Function 'hex' expects 1 arguments but received 2,Error during planning: Function 'hex' expects 1 arguments but received 2. This issue was likely caused by a bug in DataFusion's code. Please help us to resolve this by filing a bug report in our issue tracker: https://github.com/apache/datafusion/issues. No function matches the given name and argument types 'hex(Int64, Int64)'. You might need to add explicit type casts. Candidate functions: hex(Int64) hex(String) hex(Binary) | DataFusion error: Error during planning: Function 'hex' expects 1 arguments but received 2. No function matches the given name and argument types 'hex(Int64, Int64)'. You might need to add explicit type casts. Candidate functions: hex(Int64) hex(String) hex(Binary) |
| substr_type | unique same-arity type mismatch | `SELECT substr(1, 3);` | Error: Error during planning: Internal error: Function 'substr' failed to match any signature, errors: Error during planning: Function 'substr' requires String, but received Int64 (DataType: Int64).,Error during planning: Function 'substr' expects 3 arguments but received 2. This issue was likely caused by a bug in DataFusion's code. Please help us to resolve this by filing a bug report in our issue tracker: https://github.com/apache/datafusion/issues. No function matches the given name and argument types 'substr(Int64, Int64)'. You might need to add explicit type casts. Candidate functions: substr(str: String, start_pos: Int64) substr(str: String, start_pos: Int64, length: Int64) | Error: Error during planning: Function 'substr' requires String, but received Int64 (DataType: Int64).. No function matches the given name and argument types 'substr(Int64, Int64)'. You might need to add explicit type casts. Candidate functions: substr(str: String, start_pos: Int64) substr(str: String, start_pos: Int64, length: Int64) |
| substr_binary_hint | hinted type mismatch | `SELECT substr(arrow_cast('foo', 'Binary'), 1);` | Error: Error during planning: Internal error: Function 'substr' failed to match any signature, errors: Error during planning: Function 'substr' requires String, but received Binary (DataType: Binary). Hint: Binary types are not automatically coerced to String. Use CAST(column AS VARCHAR) to convert Binary data to String.,Error during planning: Function 'substr' expects 3 arguments but received 2. This issue was likely caused by a bug in DataFusion's code. Please help us to resolve this by filing a bug report in our issue tracker: https://github.com/apache/datafusion/issues. No function matches the given name and argument types 'substr(Binary, Int64)'. You might need to add explicit type casts. Candidate functions: substr(str: String, start_pos: Int64) substr(str: String, start_pos: Int64, length: Int64) | Error: Error during planning: Function 'substr' requires String, but received Binary (DataType: Binary). Hint: Binary types are not automatically coerced to String. Use CAST(column AS VARCHAR) to convert Binary data to String.. No function matches the given name and argument types 'substr(Binary, Int64)'. You might need to add explicit type casts. Candidate functions: substr(str: String, start_pos: Int64) substr(str: String, start_pos: Int64, length: Int64) |
| sum_bool | aggregate semantic diagnostic | `SELECT sum(bool_col) FROM (VALUES (true), (false), (null)) AS t(bool_col);` | Error: Error during planning: Internal error: Function 'sum' failed to match any signature, errors: Error during planning: Function 'sum' requires Decimal, but received Boolean (DataType: Boolean).,Error during planning: Function 'sum' requires UInt64, but received Boolean (DataType: Boolean).,Error during planning: Function 'sum' requires Int64, but received Boolean (DataType: Boolean).,Error during planning: Function 'sum' requires Float64, but received Boolean (DataType: Boolean).,Error during planning: Function 'sum' requires Duration, but received Boolean (DataType: Boolean).. This issue was likely caused by a bug in DataFusion's code. Please help us to resolve this by filing a bug report in our issue tracker: https://github.com/apache/datafusion/issues. No function matches the given name and argument types 'sum(Boolean)'. You might need to add explicit type casts. Candidate functions: sum(Decimal) sum(UInt64) sum(Int64) sum(Float64) sum(Duration) | Error: Error during planning: Sum not supported for Boolean. No function matches the given name and argument types 'sum(Boolean)'. You might need to add explicit type casts. Candidate functions: sum(Decimal) sum(UInt64) sum(Int64) sum(Float64) sum(Duration) |
| nth_value_arity | window function arity contract | `SELECT nth_value(c1, 2, 3) OVER (ORDER BY c1) FROM (VALUES (1), (2), (3)) AS t(c1);` | Error: Error during planning: Internal error: Function 'nth_value' failed to match any signature, errors: Error during planning: The function 'nth_value' expected zero argument but received 3,Error during planning: The function 'nth_value' expected 1 arguments but received 3,Error during planning: The function 'nth_value' expected 2 arguments but received 3. This issue was likely caused by a bug in DataFusion's code. Please help us to resolve this by filing a bug report in our issue tracker: https://github.com/apache/datafusion/issues. No function matches the given name and argument types 'nth_value(Int64, Int64, Int64)'. You might need to add explicit type casts. Candidate functions: nth_value(NullAry()) nth_value(Any) nth_value(Any, Any) | Error: Error during planning: The function 'nth_value' expected 2 arguments but received 3. No function matches the given name and argument types 'nth_value(Int64, Int64, Int64)'. You might need to add explicit type casts. Candidate functions: nth_value(Any, Any) |
| log_same_arity | ambiguous same-arity overload | `SELECT log(1, '');` | Error: Error during planning: Internal error: Function 'log' failed to match any signature, errors: Error during planning: Function 'log' expects 1 arguments but received 2,Error during planning: Function 'log' expects 1 arguments but received 2,Error during planning: Function 'log' requires Decimal, but received String (DataType: Utf8).,Error during planning: Function 'log' requires Float, but received String (DataType: Utf8).. This issue was likely caused by a bug in DataFusion's code. Please help us to resolve this by filing a bug report in our issue tracker: https://github.com/apache/datafusion/issues. No function matches the given name and argument types 'log(Int64, Utf8)'. You might need to add explicit type casts. Candidate functions: log(Decimal) log(Float) log(Float, Decimal) log(Float, Float) | Error: Error during planning: Function 'log' requires one of Decimal, Float, but received String (DataType: Utf8).. No function matches the given name and argument types 'log(Int64, Utf8)'. You might need to add explicit type casts. Candidate functions: log(Decimal) log(Float) log(Float, Decimal) log(Float, Float) |
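
For readers skimming the table, here is a rough sketch of how the selection rules in the approach list could be modelled. It is not the code in this PR: `BranchError` and `pick_diagnostic` are hypothetical names, the messages are simplified (no `DataType:` suffix, no hints), and it only covers the arity, unique-type, and combined same-arity cases.

```rust
/// Hypothetical per-branch failure; not a DataFusion type.
#[derive(Debug, Clone, PartialEq)]
enum BranchError {
    /// The branch accepts a different number of arguments.
    Arity { expected: usize },
    /// The arity matched, but an argument had the wrong type.
    Type { expected: String, received: String },
}

/// Pick one user-facing message from the failures of all `OneOf` branches.
fn pick_diagnostic(func: &str, received_args: usize, errors: &[BranchError]) -> String {
    // Distinct expected types from branches whose arity matched, in order.
    let mut wanted: Vec<&str> = Vec::new();
    let mut got: Option<&str> = None;
    for err in errors {
        if let BranchError::Type { expected, received } = err {
            if !wanted.contains(&expected.as_str()) {
                wanted.push(expected.as_str());
            }
            if got.is_none() {
                got = Some(received.as_str());
            }
        }
    }
    if let Some(got) = got {
        return match wanted.as_slice() {
            // A unique same-arity branch: keep its concrete type error.
            [only] => format!("Function '{func}' requires {only}, but received {got}"),
            // Several same-arity branches: combine them into one message.
            many => format!(
                "Function '{func}' requires one of {}, but received {got}",
                many.join(", ")
            ),
        };
    }

    // No branch matched on arity: report the count only if all branches agree.
    let mut counts: Vec<usize> = Vec::new();
    for err in errors {
        if let BranchError::Arity { expected } = err {
            if !counts.contains(expected) {
                counts.push(*expected);
            }
        }
    }
    match counts.as_slice() {
        [n] => format!("Function '{func}' expects {n} arguments but received {received_args}"),
        _ => format!("Function '{func}' failed to match any signature"),
    }
}
```

Under these assumptions, three identical arity failures (the `hex(1, 2)` case) collapse into one arity message, and two same-arity type failures (the `log(1, '')` case) become a single "requires one of" message.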
