Shreyaskr1409 opened a new issue, #15214:
URL: https://github.com/apache/datafusion/issues/15214

   ### Is your feature request related to a problem or challenge?
   
   DataFusion has benchmarks that measure the overall performance of the 
system as a whole, but there has been little work so far on testing and 
analyzing each operator individually.
   
   We can currently monitor DataFusion's overall performance metrics, but 
if a single operator is responsible for a slowdown, finding that operator 
can be difficult. I discussed this briefly in Slack and in 
https://github.com/apache/datafusion/issues/5504 .
   
   ### Describe the solution you'd like
   
   Following the existing `SortPreservingMerge` benchmark: 
https://github.com/apache/datafusion/blob/main/datafusion/core/benches/spm.rs , 
we could build a set of benchmarks for individual operators.
   We could start with a few, such as `Filter`, `Projection`, and joins like 
`HashJoin`.
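
To make the per-operator idea concrete, here is a minimal sketch of what timing one operator in isolation could look like. It uses only the Rust standard library and is not DataFusion code: `filter_operator`, `projection_operator`, `bench_operator`, and `make_input` are hypothetical names invented for illustration, and the "operators" are plain functions over `Vec<i64>` rather than real execution plans. An in-tree benchmark would presumably run actual operators such as `Filter` or `HashJoin` under the project's existing benchmark setup.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a filter operator: keeps values greater than
/// `threshold`. This is NOT DataFusion's filter implementation.
fn filter_operator(input: &[i64], threshold: i64) -> Vec<i64> {
    input.iter().copied().filter(|v| *v > threshold).collect()
}

/// Hypothetical stand-in for a projection operator: computes `v * 2 + 1`
/// for every row.
fn projection_operator(input: &[i64]) -> Vec<i64> {
    input
        .iter()
        .map(|v| v.wrapping_mul(2).wrapping_add(1))
        .collect()
}

/// Builds a deterministic input of `rows` consecutive integers starting at 0,
/// so that every run of a benchmark sees the same data.
fn make_input(rows: usize) -> Vec<i64> {
    (0..rows as i64).collect()
}

/// Runs `op` once as a warm-up, then `iterations` more times, and returns the
/// mean wall-clock time per timed run. `black_box` keeps the compiler from
/// optimizing the unused result away.
fn bench_operator<F, R>(iterations: u32, mut op: F) -> Duration
where
    F: FnMut() -> R,
{
    assert!(iterations > 0, "iterations must be non-zero");
    black_box(op()); // warm-up run, not timed
    let start = Instant::now();
    for _ in 0..iterations {
        black_box(op());
    }
    start.elapsed() / iterations
}
```

Each operator is timed separately on the same input, for example `bench_operator(100, || filter_operator(&input, 500))` and `bench_operator(100, || projection_operator(&input))`, so a regression in one shows up in its own number instead of being averaged into a whole-query result.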
   
   Much more could be added over time, but for now I would like to hear 
suggestions from the community.
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   I will keep updating this issue as more information becomes available. 
This would be a fairly large project in itself, so we could open separate 
tickets to gradually add benchmarks for each operator.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

