andygrove opened a new issue, #4153:
URL: https://github.com/apache/datafusion-comet/issues/4153

   ### What is the problem the feature request solves?
   
   Just a crazy idea, but I was thinking about regexp expression support. Java 
and Rust have different regexp engines with different features and behavior, so 
we'll never be able to be fully compatible with a native implementation.
   
   In Spark RAPIDS, I spent significant time working on a regexp transpiler to 
try and translate Java regexp into a format that would be compatible in native 
code (cuDF in that case). This was a huge effort and did not reach full 
compatibility.
   
   When we think about accelerating expressions in Comet, we really mean "write 
a native implementation", but it doesn't really have to be this way in all 
cases. We could also implement Comet expressions in Scala.
   
   Rather than fall back to Spark for a projection or predicate with a regexp 
expr, we could have Comet call the same Java code that Spark calls to 
evaluate the regexp expr, but do this over elements in arrays rather than over 
rows, avoiding the conversion costs.
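   
   As a rough illustration of the idea (a sketch under assumptions, not a 
proposed design), the loop below applies `java.util.regex` to every element of 
a column in one pass. The class name `ColumnarRegexSketch` and the helper 
`rlike` are hypothetical, a plain `String[]` stands in for an Arrow string 
vector, and the use of `find()` (a match anywhere in the value) with null 
propagation is an assumption about the semantics the expression would need.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical sketch: evaluate a regexp predicate over a whole column at once,
// using the JVM regex engine rather than a native one.
final class ColumnarRegexSketch {

    // A String[] stands in for an Arrow string vector; a null element is a null row.
    static Boolean[] rlike(String[] column, String regex) {
        // Compile the pattern once per batch rather than once per row.
        Pattern pattern = Pattern.compile(regex);
        Boolean[] result = new Boolean[column.length];
        for (int i = 0; i < column.length; i++) {
            String value = column[i];
            if (value == null) {
                // Assumed semantics: null input yields null output.
                result[i] = null;
            } else {
                // find() reports a match anywhere in the value.
                result[i] = pattern.matcher(value).find();
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] column = {"abc123", null, "xyz"};
        Boolean[] matches = rlike(column, "\\d+");
        System.out.println(Arrays.toString(matches)); // prints [true, null, false]
    }
}
```

   Because the regex engine is the JVM's own, pattern syntax and matching 
behavior stay identical to what Spark would produce; only the iteration moves 
from rows to array elements.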
   
   This is not a well thought out idea yet, but I'll try and come up with a 
more concrete proposal.
   
   ### Describe the potential solution
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

