GitHub user leborchuk added a comment to the discussion: Introducing the [perfmon] Extension for Cloudberry Database Monitoring
Hi @fanfuxiaoran, I have a number of ideas about how to improve diagnostics. May I ask you a few quick questions first? You can answer them in detail or with a simple yes/no. Your answers will help me understand what you consider important and what you don't.

1. If I rewrite your code, can you get rid of SIGAR? We do not use it in https://github.com/powa-team/pg_stat_kcache or in https://github.com/open-gpdb/yagp_hooks_collector/blob/YAGP-0.0.2-WIP/src/ProcStats.cpp
2. Are you going to gather PostgreSQL internal statistics? Here is the list of statistics for PG14: https://github.com/apache/cloudberry/blob/main/contrib/pg_stat_statements/pg_stat_statements--1.8--1.9.sql#L29
3. How are you going to gather network statistics? We gather motion statistics (https://github.com/open-gpdb/yagp_hooks_collector/blob/bb3afb1605aec70a736a0fd21358537a5a50aef8/src/ProtoUtils.cpp#L150) and interconnect statistics, but only UDPIfc metrics (https://github.com/open-gpdb/yagp_hooks_collector/blob/bb3afb1605aec70a736a0fd21358537a5a50aef8/src/ProtoUtils.cpp#L186).
4. Are you going to take nested queries and the nesting level into account? Here is how it is done in pg_stat_statements: https://github.com/apache/cloudberry/blob/main/contrib/pg_stat_statements/pg_stat_statements.c#L998 and here are the newer tests: https://github.com/postgres/postgres/blob/master/contrib/pg_stat_statements/sql/level_tracking.sql
5. What are you going to do with queries that have no execution plan (usually those executed in utility mode)? Here are some examples: https://github.com/postgres/postgres/blob/master/contrib/pg_stat_statements/sql/utility.sql
6. Are you going to track the resources consumed per session, in order to show not only the top queries but also the top sessions?
7. How are you going to store historical data? pg_stat_statements aggregates data, but that should not be done for long-running queries (our users demand the complete data).
8. pg_stat_statements calculates a query identifier. Since we store the execution plan too, are you going to calculate a plan identifier?
   Here is how I did it for PostgreSQL: https://github.com/postgredients/pg_stat_query_plans
9. What type of execution plan are you going to store in the historical data? We could gather the EXPLAIN output without statistics in the ExecutorStart hook and the EXPLAIN ANALYZE statistics in the ExecutorEnd hook.
10. How are you going to measure the performance penalty?
11. How should the system behave if gpsmon/gpmmon is unavailable? How can the memory usage of these processes be limited?
12. Is it acceptable for some of the processes to be written (or replaced) in another programming language, e.g. Go instead of C?

GitHub link: https://github.com/apache/cloudberry/discussions/1087#discussioncomment-13143466

----
This is an automatically sent email for dev@cloudberry.apache.org.
To unsubscribe, please send an email to: dev-unsubscr...@cloudberry.apache.org