andygrove commented on code in PR #4105:
URL: https://github.com/apache/datafusion-comet/pull/4105#discussion_r3155735753


##########
spark/src/main/scala/org/apache/comet/serde/strings.scala:
##########
@@ -84,6 +84,20 @@ object CometLength extends CometScalarFunction[Length]("length") {
   }
 }
 
+object CometLevenshtein extends CometScalarFunction[Levenshtein]("levenshtein") {

Review Comment:
   `CometLevenshtein` currently inherits `CometScalarFunction.convert`, which
does not set `return_type` on the proto. When the native planner sees no return
type, it falls back to `session_ctx.udf("levenshtein")`, which finds
DataFusion's built-in two-argument `LevenshteinFunc` and fails signature
validation as soon as a third argument is present. That is what
`CometNativeException: Error from DataFusion: Function 'levenshtein' expects 2
arguments but received 3.` is telling us in the 3.5 / 4.0 logs. Could you
override `convert` and use
`scalarFunctionExprToProtoWithReturnType("levenshtein", IntegerType, false,
...)` so the planner skips the registry lookup, the same way `CometStringSplit`
and `CometGetJsonObject` do?
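
To make the failure mode concrete, here is a toy model of the dispatch described above. It is only a sketch: `ScalarFuncProto`, `RegisteredUdf`, `registry`, and `plan` are hypothetical names invented for illustration and are not Comet or DataFusion APIs. The model also assumes that a call carrying a return type is resolved by some other path, which it does not attempt to reproduce.

```scala
// Toy model only: none of these types are Comet or DataFusion APIs.
object ReturnTypeFallbackSketch {
  // Stand-in for the serialized scalar function call; the return type is optional.
  final case class ScalarFuncProto(name: String, argCount: Int, returnType: Option[String])

  // Stand-in for a function registered under a name with a fixed arity.
  final case class RegisteredUdf(name: String, arity: Int)

  // Hypothetical registry that holds only a two-argument levenshtein.
  val registry: Map[String, RegisteredUdf] =
    Map("levenshtein" -> RegisteredUdf("levenshtein", 2))

  // With a return type, the registry is never consulted.
  // Without one, the call is validated against whatever the registry holds.
  def plan(call: ScalarFuncProto): Either[String, String] =
    call.returnType match {
      case Some(rt) =>
        Right(s"${call.name}/${call.argCount} -> $rt (registry lookup skipped)")
      case None =>
        registry.get(call.name) match {
          case Some(udf) if udf.arity == call.argCount =>
            Right(s"${call.name}/${call.argCount} (resolved from registry)")
          case Some(udf) =>
            Left(s"Function '${call.name}' expects ${udf.arity} arguments but received ${call.argCount}")
          case None =>
            Left(s"Unknown function '${call.name}'")
        }
    }

  def main(args: Array[String]): Unit = {
    // Three arguments, no return type: falls back to the registry and fails.
    println(plan(ScalarFuncProto("levenshtein", 3, None)))
    // Three arguments with a return type: the registry is bypassed.
    println(plan(ScalarFuncProto("levenshtein", 3, Some("Int32"))))
  }
}
```

In this model the two-argument call still resolves through the registry, which would match the failure only showing up once a third argument is passed.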
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

