andygrove commented on code in PR #4105:
URL: https://github.com/apache/datafusion-comet/pull/4105#discussion_r3155735753
##########
spark/src/main/scala/org/apache/comet/serde/strings.scala:
##########
@@ -84,6 +84,20 @@ object CometLength extends CometScalarFunction[Length]("length") {
}
}
+object CometLevenshtein extends CometScalarFunction[Levenshtein]("levenshtein") {
Review Comment:
The current `CometLevenshtein` inherits `CometScalarFunction.convert`, which
does not set `return_type` on the proto. When the native planner sees no return
type, it falls back to `session_ctx.udf("levenshtein")`, which finds
DataFusion's built-in 2-arg `LevenshteinFunc` and fails signature validation as
soon as a third arg is present. That is what this error in the 3.5 / 4.0 logs
is telling us:
`CometNativeException: Error from DataFusion: Function 'levenshtein' expects 2 arguments but received 3.`
Could you override `convert` and use
`scalarFunctionExprToProtoWithReturnType("levenshtein", IntegerType, false,
...)` so the planner skips the registry lookup, the same way `CometStringSplit`
and `CometGetJsonObject` do?
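
To make the failure mode concrete, here is a minimal, self-contained sketch of the fallback described above. It is a toy model, not Comet or DataFusion code: `ScalarFuncProto`, `RegisteredUdf`, `registry`, and `plan` are hypothetical stand-ins, and the real planner does more than this once a return type is present. It only illustrates why a missing return type leads to the 2-arg registry entry and the arity error, while an explicit return type avoids the lookup.

```scala
// Toy model only: every type and name here is a hypothetical stand-in,
// not part of the Comet or DataFusion APIs.
object ReturnTypeFallbackSketch {
  // Stand-in for the serialized scalar function call; the return type is optional.
  final case class ScalarFuncProto(name: String, argCount: Int, returnType: Option[String])

  // Stand-in for a registered function with a fixed arity,
  // like the built-in 2-arg levenshtein mentioned above.
  final case class RegisteredUdf(name: String, arity: Int)

  val registry: Map[String, RegisteredUdf] =
    Map("levenshtein" -> RegisteredUdf("levenshtein", 2))

  // With a return type, skip the registry lookup entirely.
  // Without one, look the name up and validate the argument count.
  def plan(proto: ScalarFuncProto): Either[String, String] =
    proto.returnType match {
      case Some(rt) =>
        Right(s"${proto.name}/${proto.argCount} -> $rt")
      case None =>
        registry.get(proto.name) match {
          case Some(udf) if udf.arity == proto.argCount =>
            Right(s"${proto.name}/${proto.argCount} via registry")
          case Some(udf) =>
            Left(s"Function '${proto.name}' expects ${udf.arity} arguments " +
              s"but received ${proto.argCount}")
          case None =>
            Left(s"Function '${proto.name}' not found")
        }
    }

  def main(args: Array[String]): Unit = {
    // No return type and three args: the registry path rejects the call.
    println(plan(ScalarFuncProto("levenshtein", 3, None)))
    // Explicit return type: the registry is never consulted.
    println(plan(ScalarFuncProto("levenshtein", 3, Some("Int32"))))
  }
}
```

In this model the 2-arg call still works through the registry, which matches the observation that only the 3-arg form fails.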
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]