wzymumon opened a new issue, #20267: URL: https://github.com/apache/doris/issues/20267
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues.

### Description

## Background

Doris currently supports three kinds of user-defined functions: [Native UDF](https://doris.apache.org/zh-CN/docs/dev/ecosystem/udf/native-user-defined-function), [Remote UDF](https://doris.apache.org/zh-CN/docs/dev/ecosystem/udf/remote-user-defined-function) and [Java UDF](https://doris.apache.org/zh-CN/docs/dev/ecosystem/udf/java-user-defined-function).

Native UDF is written in C++ and has the best performance, but it is harder to write and debug, and it can depend on specific versions of system libraries (such as libc) that may become incompatible after an upgrade.

Remote UDF solves the language problem well: in theory, the UDF logic can be written in any language. The disadvantage is that users need to implement their own high-performance UDF service, and the RPC overhead makes it less efficient.

Java UDF is the main user-defined function solution in Doris. It reduces the migration cost for users coming from the big data ecosystem, since systems such as Hive and Spark already have a large number of ready-made UDFs. Java UDF is implemented by starting a JVM and calling the UDF logic from the BE side through JNI.

## Motivation

Support for Wasm UDF is motivated by the following points:

1. Embeddable in multiple programming languages. Users can write function logic in multiple programming languages (Rust, C/C++, Golang, Java, TypeScript, Haskell).
2. Secure by default. No file, network, or environment access, unless explicitly enabled.
3. High performance. A WebAssembly engine compiles bytecode into native machine code instead of interpreting it, which greatly improves execution efficiency and can approach the speed of native execution.
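
To make motivation point 1 concrete, here is a minimal sketch of what the body of a Wasm UDF could look like when written in C++. The function names `add_one` and `celsius_to_fahrenheit` and their signatures are hypothetical and are not part of any Doris API. The sketch only assumes that a UDF is an exported function whose parameters and return value are basic Wasm value types such as `i32` and `f64`.

```cpp
#include <cstdint>

// Hypothetical UDF body (illustrative name and signature, not a Doris API).
// extern "C" keeps the symbol name unmangled, so a Wasm runtime could look
// the function up by name once this file is compiled to a Wasm module.
extern "C" int32_t add_one(int32_t x) {
    // int32_t corresponds to the Wasm i32 value type.
    return x + 1;
}

// Hypothetical UDF over floating-point input.
// double corresponds to the Wasm f64 value type.
extern "C" double celsius_to_fahrenheit(double c) {
    return c * 9.0 / 5.0 + 32.0;
}
```

The same source compiles as ordinary C++ for local testing. Producing a Wasm module from it would need a Wasm-capable toolchain, for example Emscripten or clang's wasm32 target.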

## How to implement

The BE needs to start a Wasm runtime and call the relevant UDF logic through wasmtime-c-api.

## Scheduling

I have a preliminary plan to support Wasm UDF in Doris.

Phase I: complete the Create-Function-Statement for Wasm UDF.

Phase II: add the wasmtime lib and wasmtime-c-api to the BE, and implement the basic data types based on the Wasm basic ABI types.

### Use case

_No response_

### Related issues

Remote UDF: https://github.com/apache/doris/pull/7519
Lua UDF: https://github.com/apache/doris/pull/5979
Java UDF: https://github.com/apache/doris/issues/8389

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
