lriggs opened a new issue, #49985: URL: https://github.com/apache/arrow/issues/49985
### Describe the bug, including details regarding any error messages, version, and platform.

### Problem

It is possible to register different Gandiva functions with the same alias and parameters but different return types, resulting in confusing function overloads.

For example, **DATE_EXTRACTION_TRUNCATION_FNS** in [cpp/src/gandiva/function_registry_datetime.cc] was invoked twice with the same SQL alias lists: once for extract* (returns int64) and once for date_trunc_* (returns the input date/timestamp type):

```
DATE_EXTRACTION_TRUNCATION_FNS(EXTRACT_SAFE_NULL_IF_NULL, extract)
DATE_EXTRACTION_TRUNCATION_FNS(TRUNCATE_SAFE_NULL_IF_NULL, date_trunc_)
```

As a result the registry contained four entries for day(...) where there should have been two:

```
int64     day(timestamp)  → extractDay_timestamp
int64     day(date)       → extractDay_date64
timestamp day(timestamp)  → date_trunc_Day_timestamp
date      day(date)       → date_trunc_Day_date64
```

The same problem existed for every calendar-unit alias: year, month, quarter, week, weekofyear, yearweek, dayofmonth, hour, minute, second. Resolution behavior depended on the caller's inferred return type, which is not the SQL semantics anyone expects from day(timestamp_col).

FunctionRegistry::Add was silently allowing these registrations: unordered_map::emplace keeps the first entry and discards subsequent ones with no warning.

### Component(s)

Gandiva

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
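
As an illustration of the conflict described in this issue, here is a minimal sketch of a registry that rejects a second registration with the same name and parameter types, whatever its return type. This is not Gandiva's code: `Signature`, `OverloadKey`, `ToyRegistry`, and `CountRejectedDayRegistrations` are hypothetical names, and keying the map on name plus parameter types only is an assumption made for the example. It relies on the documented behavior of `std::unordered_map::emplace`, which leaves the map unchanged and reports `false` in `.second` when the key already exists.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a function signature.
// It is NOT Gandiva's real signature class.
struct Signature {
  std::string name;
  std::vector<std::string> param_types;
  std::string return_type;
};

// Builds a key from the name and parameter types only (assumption for this
// sketch), so two overloads that differ only in return type collide.
std::string OverloadKey(const Signature& sig) {
  std::string key = sig.name + "(";
  for (std::size_t i = 0; i < sig.param_types.size(); ++i) {
    if (i > 0) key += ",";
    key += sig.param_types[i];
  }
  return key + ")";
}

// Hypothetical toy registry that surfaces conflicts instead of hiding them.
class ToyRegistry {
 public:
  // Returns false when an entry with the same name and parameters exists.
  bool Add(const Signature& sig, const std::string& impl_name) {
    // emplace never overwrites: on a duplicate key the map is unchanged
    // and result.second is false, which the caller can act on.
    auto result = entries_.emplace(OverloadKey(sig),
                                   std::make_pair(sig.return_type, impl_name));
    return result.second;
  }

  std::size_t size() const { return entries_.size(); }

 private:
  std::unordered_map<std::string, std::pair<std::string, std::string>> entries_;
};

// One registration attempt: a signature plus the implementation it maps to.
struct Registration {
  Signature sig;
  std::string impl_name;
};

// Replays the four day(...) registrations listed in the issue and counts
// how many the toy registry rejects.
int CountRejectedDayRegistrations() {
  ToyRegistry registry;
  const std::vector<Registration> registrations = {
      {{"day", {"timestamp"}, "int64"}, "extractDay_timestamp"},
      {{"day", {"date"}, "int64"}, "extractDay_date64"},
      {{"day", {"timestamp"}, "timestamp"}, "date_trunc_Day_timestamp"},
      {{"day", {"date"}, "date"}, "date_trunc_Day_date64"},
  };
  int rejected = 0;
  for (const Registration& reg : registrations) {
    if (!registry.Add(reg.sig, reg.impl_name)) ++rejected;
  }
  return rejected;
}
```

In this sketch the two date_trunc_ registrations are the ones rejected, because the extract variants were added first. A real registry would presumably report the rejection as an error status rather than a bare `bool`.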
