Shreyaskr1409 opened a new issue, #15214: URL: https://github.com/apache/datafusion/issues/15214
### Is your feature request related to a problem or challenge?

DataFusion has benchmarks that measure the performance of the system as a whole, but there has been little work so far on benchmarking and analyzing individual operators. We can currently monitor DataFusion's overall performance metrics, but if a single operator is responsible for a slowdown, finding that operator can be difficult.

This is something I discussed briefly in Slack and in https://github.com/apache/datafusion/issues/5504.

### Describe the solution you'd like

Like the existing `SortPreservingMerge` benchmark (https://github.com/apache/datafusion/blob/main/datafusion/core/benches/spm.rs), we could build a set of benchmarks for different operators. We could start with a few, such as `Filter`, joins such as `HashJoin`, `Projection`, and so on.

More can be added over time, but for now I would like to hear suggestions from the community.

### Describe alternatives you've considered

_No response_

### Additional context

I will keep updating this issue as more information comes in. This would be a rather large project in itself, so we could split it into further tickets to gradually add benchmarks for each operator.
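To make the idea of a per-operator benchmark concrete, here is a minimal sketch using only the Rust standard library. It is not the approach taken in `spm.rs` and it does not use any DataFusion API: `filter_operator` and `bench_operator` are hypothetical names, and the filter is a plain stand-in for a real operator such as `Filter`. The point is only the shape: build the input once, then time one operator in isolation.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a single operator: keeps values above `threshold`.
/// This is NOT DataFusion's filter operator; it only marks what is being measured.
fn filter_operator(input: &[i64], threshold: i64) -> Vec<i64> {
    input.iter().copied().filter(|v| *v > threshold).collect()
}

/// Hypothetical helper: runs `op` `iterations` times and returns the mean
/// wall-clock time per run.
fn bench_operator<F, R>(iterations: u32, mut op: F) -> Duration
where
    F: FnMut() -> R,
{
    assert!(iterations > 0, "need at least one iteration");

    // One warm-up run so one-time costs are not included in the measurement.
    std::hint::black_box(op());

    let start = Instant::now();
    for _ in 0..iterations {
        // black_box keeps the compiler from optimizing the call away.
        std::hint::black_box(op());
    }
    start.elapsed() / iterations
}
```

With input prepared outside the timed closure, a call such as `bench_operator(100, || filter_operator(&data, 500))` would report the mean time of the operator alone. A real benchmark would more likely use a harness that handles warm-up and statistics itself, which this sketch only approximates.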
