Re: proposal: schema variables
Yes, a lot of newly arriving sessions were running "select" statements, with SQL
parsing/planning going on there, including some partitioned tables in the queries.
But there were other sessions doing DML on these tables at the same time too.

Laurenz Albe wrote on Fri, Jul 19, 2024 at 7:41 PM:

> On Sat, 2021-04-10 at 08:58 +0200, Pavel Stehule wrote:
> > I am sending a strongly updated patch for schema variables.
> >
> > I rewrote an execution of a LET statement. In the previous implementation I hacked
> > STMT_SELECT. Now, I introduced a new statement STMT_LET, and I implemented a new
> > executor node SetVariable. Now I think this implementation is much cleaner.
> > Implementation with its own executor node reduces the necessary work on the PL side,
> > and allows the LET statement to be prepared, which is important from a security view.
> >
> > I'll try to write a second implementation based on a cleaner implementation like a
> > utility command too. I expect this version will be simpler, but utility commands
> > cannot be prepared, and probably there would have to be special support in every PL.
> > I hope a cleaner implementation can help to move this patch.
> >
> > We can choose one variant in the next step and this variant can be finalized.
> >
> > Notes, comments?
>
> Thank you!
>
> I tried to give the patch a spin, but it doesn't apply any more,
> and there are too many conflicts for me to fix manually.
>
> So I had a look at the documentation:
>
> > --- a/doc/src/sgml/advanced.sgml
> > +++ b/doc/src/sgml/advanced.sgml
>
> > +The value of a schema variable is local to the current session. Retrieving
> > +a variable's value returns either a NULL or a default value, unless its value
> > +is set to something else in the current session with a LET command. The content
> > +of a variable is not transactional. This is the same as in regular variables
> > +in PL languages.
> > +
> > +Schema variables are retrieved by the SELECT SQL command.
> > +Their value is set with the LET SQL command.
> > +While schema variables share properties with tables, their value cannot be updated
> > +with an UPDATE command.
>
> "PL languages" -> "procedural languages". Perhaps a link to the "Procedural Languages"
> chapter would be a good idea.
> I don't think we should say "regular" variables: are there irregular variables?
>
> My feeling is that "SQL statement XY" is better than "XY SQL command".
>
> I think the last sentence should go. The properties they share with tables are
> that they live in a schema and can be used with SELECT.
> Also, it is not necessary to mention that they cannot be UPDATEd. They cannot
> be TRUNCATEd or CALLed either, so why mention UPDATE specifically?
>
> > --- a/doc/src/sgml/catalogs.sgml
> > +++ b/doc/src/sgml/catalogs.sgml
>
> > + varisnotnull
> > + boolean
> > +
> > + True if the schema variable doesn't allow null value. The default value is false.
>
> I think the attribute should be called "varnotnull", similar to "attnotnull".
> This attribute determines whether the variable is NOT NULL or not, not about
> its current setting.
>
> There is a plural missing: "doesn't allow null valueS".
>
> > + vareoxaction
> > + char
> > +
> > + n = no action, d = drop the variable,
> > + r = reset the variable to its default value.
>
> Perhaps the name "varxactendaction" would be better.
>
> A descriptive sentence is missing.
>
> > --- /dev/null
> > +++ b/doc/src/sgml/ref/create_variable.sgml
>
> > + The value of a schema variable is local to the current session. Retrieving
> > + a variable's value returns either a NULL or a default value, unless its value
> > + is set to something else in the current session with a LET command. The content
> > + of a variable is not transactional. This is the same as in regular variables in PL languages.
>
> "regular variables in PL languages" -> "variables in procedural languages"
>
> > + Schema variables are retrieved by the SELECT SQL command.
> > + Their value is set with the LET SQL command.
> > + While schema variables share properties with tables, their value cannot be updated
> > + with an UPDATE command.
>
> That's just a literal copy from the tutorial section. I have the same comments
> as there.
>
> > +NOT NULL
> > +
> > + The NOT NULL clause forbids to set the variable to
> > + a null value. A variable created as NOT NULL and without an explicitly
> > + declared default value cannot be read until it is initialized by a LET
> > + command. This obliges the user to explicitly initialize the variable
> > + content before reading it.
>
> What is the reason for that behavior? I'd have expected that a NOT NULL
> variable ne
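For readers following the quoted review, a minimal sketch of the commands under
discussion; this is the syntax proposed by the schema-variables patch (not yet part of
stock PostgreSQL), with made-up names and values:

CREATE VARIABLE public.var_user_id AS integer NOT NULL DEFAULT 0;
LET public.var_user_id = 42;      -- set the session-local value
SELECT public.var_user_id;        -- read it back with plain SELECT
DROP VARIABLE public.var_user_id;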
Re: proposal: schema variables
Sorry, I sent this to the wrong email thread. Please ignore.

James Pang wrote on Fri, Oct 25, 2024 at 3:58 PM:

> Yes, a lot of newly arriving sessions were running "select" statements, with SQL
> parsing/planning going on there, including some partitioned tables in the queries.
> But there were other sessions doing DML on these tables at the same time too.
Re: lwlock:LockManager wait_events
On Fri, 2024-10-25 at 14:36 +0800, James Pang wrote:
> we faced into a lot of lwlock:LockManager wait-events; all of these queries are
> "select ...", and there are several other sessions doing DML (insert/update/delete)
> on the same table. Did these DML transactions holding "transactionid" and "tuple"
> locks block the "select" on lwlock:LockManager?

This is not about waiting for a lock. Rather, it means that very many sessions are
trying to take a lock. They have to grab the lock manager to take a lock, and the
competition for that resource is the bottleneck.

It is hard to say what the root cause is without further analysis, but very often the
cause is that you have too many connections to the database. Using an effective
connection pool *might* solve that particular problem.

Yours,
Laurenz Albe
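If it helps, a quick way to see how widespread this is at any moment is to group the
current wait events (a sketch; on recent releases the event is reported as
LWLock/LockManager, older releases call it lock_manager):

SELECT wait_event_type, wait_event, state, count(*)
FROM pg_stat_activity
WHERE wait_event_type = 'LWLock'
  AND wait_event = 'LockManager'
GROUP BY 1, 2, 3;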
Unexpected Performance for the Function simplify_function
Dear PostgreSQL Community,
For query 10 of the TPC-H benchmark:
select
c_custkey,
c_name,
sum(l_extendedprice * (1 - l_discount)) as revenue,
c_acctbal,
n_name,
c_address,
c_phone,
c_comment
from
CUSTOMER,
ORDERS,
LINEITEM,
NATION
where
c_custkey = o_custkey
and l_orderkey = o_orderkey
and o_orderdate >= date '1993-08-01'
and o_orderdate < date '1993-08-01' + interval '3' month
and l_returnflag = 'R'
and c_nationkey = n_nationkey
group by
c_custkey,
c_name,
c_acctbal,
c_phone,
n_name,
c_address,
c_comment
order by
revenue desc
limit
20;
Its query plan is:
                                   QUERY PLAN
------------------------------------------------------------------------------------------
 Limit  (cost=200479.71..200479.76 rows=20 width=205) (actual time=506.558..510.594 rows=20 loops=1)
   ->  Sort  (cost=200479.71..200622.37 rows=57064 width=205) (actual time=506.557..510.591 rows=20 loops=1)
         Sort Key: (sum((lineitem.l_extendedprice * ('1'::numeric - lineitem.l_discount)))) DESC
         Sort Method: top-N heapsort  Memory: 34kB
         ->  Finalize GroupAggregate  (cost=191629.63..198961.25 rows=57064 width=205) (actual time=441.132..501.986 rows=37925 loops=1)
               Group Key: customer.c_custkey, nation.n_name
               ->  Gather Merge  (cost=191629.63..197772.41 rows=47554 width=205) (actual time=441.124..474.623 rows=37925 loops=1)
                     Workers Planned: 2
                     Workers Launched: 2
                     ->  Partial GroupAggregate  (cost=190629.61..191283.48 rows=23777 width=205) (actual time=437.497..464.923 rows=12642 loops=3)
                           Group Key: customer.c_custkey, nation.n_name
                           ->  Sort  (cost=190629.61..190689.05 rows=23777 width=185) (actual time=437.485..441.339 rows=38183 loops=3)
                                 Sort Key: customer.c_custkey, nation.n_name
                                 Sort Method: external merge  Disk: 7184kB
                                 Worker 0:  Sort Method: external merge  Disk: 7448kB
                                 Worker 1:  Sort Method: external merge  Disk: 7264kB
                                 ->  Hash Join  (cost=181606.66..186706.85 rows=23777 width=185) (actual time=385.555..418.269 rows=38183 loops=3)
                                       Hash Cond: (customer.c_nationkey = nation.n_nationkey)
                                       ->  Parallel Hash Join  (cost=181605.09..186632.29 rows=23777 width=160) (actual time=385.484..411.936 rows=38183 loops=3)
                                             Hash Cond: (customer.c_custkey = orders.o_custkey)
                                             ->  Parallel Seq Scan on customer  (cost=0.00..4225.00 rows=62500 width=148) (actual time=0.028..9.805 rows=5 loops=3)
                                             ->  Parallel Hash  (cost=181307.88..181307.88 rows=23777 width=16) (actual time=385.060..385.063 rows=38183 loops=3)
                                                   Buckets: 131072 (originally 65536)  Batches: 1 (originally 1)  Memory Usage: 7648kB
                                                   ->  Parallel Hash Join  (cost=35809.22..181307.88 rows=23777 width=16) (actual time=69.608..371.381 rows=38183 loops=3)
                                                         Hash Cond: (lineitem.l_orderkey = orders.o_orderkey)
                                                         ->  Parallel Seq Scan on lineitem  (cost=0.00..143863.66 rows=622855 width=16) (actual time=0.024..255.818 rows=492957 loops=3)
                                                               Filter: (l_returnflag = 'R'::bpchar)
                                                               Rows Removed by Filter: 1507448
                                                         ->  Parallel Hash  (cost=35511.00..35511.00 rows=23858 width=8) (actual time=68.857..68.858 rows=19046 loops=3)
                                                               Buckets: 65536  Batches: 1  Memory Usage: 2816kB
                                                               ->  Parallel Seq Scan on orders  (cost=0.00..35511.00 rows=23858 width=8) (actual time=0.033..62.907 rows=19046 loops=3)
                                                                     Filter: ((o_orderdate >= '1993-08-01'::date) AND (o_orderdate < '1993-11-01 00:00:00'::timestamp without time zone))
                                                                     Rows Removed by Filter: 480954
                                       ->  Hash  (cost=1.25..1.25 rows=25 width=33) (actual time=0.037..0.037 rows=25 loops=3
Re: lwlock:LockManager wait_events
You may find this helpful; the advice is not specific to RDS:
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/wait-event.lw-lock-manager.html

HTH.

On Fri, 25 Oct 2024 at 18:36, Laurenz Albe wrote:
> On Fri, 2024-10-25 at 14:36 +0800, James Pang wrote:
> > we faced into a lot of lwlock:LockManager wait-events; all of these queries are
> > "select ...", and there are several other sessions doing DML (insert/update/delete)
> > on the same table. Did these DML transactions holding "transactionid" and "tuple"
> > locks block the "select" on lwlock:LockManager?
>
> This is not about waiting for a lock. Rather, it means that very many sessions are
> trying to take a lock. They have to grab the lock manager to take a lock, and the
> competition for that resource is the bottleneck.
>
> It is hard to say what the root cause is without further analysis, but very often the
> cause is that you have too many connections to the database. Using an effective
> connection pool *might* solve that particular problem.
>
> Yours,
> Laurenz Albe
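The advice on that page mostly comes down to taking fewer relation locks per query
(partitions, indexes) and running fewer concurrent backends. The reason is that each
backend has only a limited number of fast-path lock slots (historically 16); locks that
spill past them go through the shared lock table, which is what the LockManager LWLock
protects. A sketch for checking how many locks per backend are spilling:

SELECT pid,
       count(*) FILTER (WHERE NOT fastpath) AS shared_locktable_entries,
       count(*) FILTER (WHERE fastpath)     AS fastpath_entries
FROM pg_locks
WHERE locktype = 'relation'
GROUP BY pid
ORDER BY shared_locktable_entries DESC;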
Re: Unexpected Performance for the Function simplify_function
On 10/25/24 16:26, Ba Jinsheng wrote:
>> So, I wonder if you could analyse the path-choosing logic, determine the costs of
>> competing paths, and explain why NestLoop wasn't chosen.
>
> To be honest, it is a bit challenging for me. I guess the better query plan is not
> considered when comparing the cost of paths?

I personally print all the paths (by adding an elog message inside the code) and their
costs for the join in question. Some paths are filtered out at earlier stages and never
even make it into the pathlist.

Basically, Postgres already has some extended logging for the optimiser paths - use the
-DOPTIMIZER_DEBUG compilation flag.

--
regards, Andrei Lepikhov
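Without recompiling, a rough way to make the planner show the cost it assigns to the
competing path is to disable its preferred strategy for one session and re-run EXPLAIN.
A sketch only; which enable_* switch to flip depends on which alternative you want costed,
and the query text is the one from the first post:

BEGIN;
SET LOCAL enable_hashjoin = off;          -- experiment only: force the planner to price other join paths
EXPLAIN (ANALYZE, BUFFERS) SELECT ... ;   -- paste the query 10 text from the first post here
ROLLBACK;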
Re: Unexpected Performance for the Function simplify_function
Hello,

The query plans and results you shared illustrate the unexpected performance differences
between using and bypassing the simplify_function() logic in PostgreSQL's optimizer.
Here's an in-depth analysis and thoughts on optimizing this scenario:

*Overview of the Problem*

The purpose of simplify_function() in PostgreSQL's optimizer is to replace function calls
with constant values when possible, thus simplifying expressions. However, as demonstrated
in your results, this simplification appears to have an adverse effect on the performance
of this specific workload. Disabling simplify_function() in this case leads to a notable
reduction in both the estimated query cost and the execution time.

*Performance Gains Observed*

• Estimated cost reduction: 39.69%
• Execution time reduction: 32.54%

*Key Insights from the Execution Plans*

1. Hash joins and parallel execution: In both execution plans, PostgreSQL uses hash joins
   and parallel execution to distribute the workload across multiple workers. The Gather
   Merge phase in both plans confirms that multiple parallel workers are involved.

2. Sorting and finalization: Both query plans include a final group aggregation step
   followed by sorting on the computed revenue (sum(l_extendedprice * (1 - l_discount))).
   The main difference observed is in how the simplified or unsimplified revenue
   calculation affects the query cost.

3. Disk I/O reduction: The plan without simplify_function() seems to use the disk more
   efficiently, especially in the external merge sorts. The reduction in disk usage
   indicates that PostgreSQL can better leverage in-memory operations in the absence of
   function simplification.

*Why Might simplify_function() Be Causing Problems?*

The exact issue seems to lie in how simplify_function() transforms expressions. Multiple
factors could be at play:

1. Expression over-simplification: simplify_function() may produce a less optimal
   execution plan due to over-simplification, which reduces PostgreSQL's ability to
   leverage index scans, joins, or parallelism effectively.

2. Changes in cost estimation: When the expression is simplified, the cost estimates may
   change, causing the planner to choose a different join order or sorting strategy that
   is less optimal.

3. Overhead of repeated computation: The simplification might cause PostgreSQL to
   repeatedly compute certain expressions that could have been computed once within a
   more complex expression.

*Potential Strategies to Address This Problem*

There are a few avenues to explore within the PostgreSQL planner and optimizer:

1. Investigate and optimize the simplification logic:
   • Understand the specific cases where simplify_function() introduces overhead or leads
     to suboptimal execution plans. This could involve reviewing whether the
     simplification inadvertently forces PostgreSQL to choose less efficient join
     strategies, index scans, or aggregations.
   • Analyze the functions targeted by simplify_function() and consider how their
     simplification affects plan selection.

2. Fine-tune cost estimation:
   • PostgreSQL's cost estimation formulas could be missing cost elements introduced by
     the simplified expressions. Reviewing and refining the cost model for simplified
     expressions could lead to more accurate plan choices.

3. Expression memoization or pre-aggregation:
   • Examine whether simplifying repeated expressions leads to recomputation
     inefficiencies. Instead of simplifying certain functions, PostgreSQL could cache or
     pre-aggregate expressions where appropriate, avoiding unnecessary recomputation.

*Recommendations to Move Forward*

Since the results you presented indicate a substantial improvement when bypassing
simplify_function(), there seems to be room for optimization in this specific area of the
PostgreSQL optimizer.

1. Perform more tests with varying workloads: Test whether disabling simplify_function()
   yields consistent benefits across different workloads or whether it is specific to
   this TPC-H query 10 pattern.

2. Community discussion: The results suggest that this may warrant a discussion with the
   PostgreSQL development community, especially if it is not an isolated case.
   Optimizations in core PostgreSQL logic should go through community review and testing.

3. Patch submission: If you have identified specific changes to simplify_function() that
   improve efficiency, prepare a patch for submission to the PostgreSQL community,
   ensuring that the patch is thoroughly reviewed and benchmarked.

*Conclusion*

The primary takeaway is that simplify_function() seems to be triggering less efficient
query plans in this case, likely due to over-simplification of expressions. The solution
should involve a detailed examination of the expression transformations and their
interaction with PostgreSQL's planner.
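For reference, the visible effect of this simplification in query 10 is constant folding
of the date arithmetic: simplify_function() is invoked during constant-folding, and the
plan in the first post already shows the folded value. A small sketch against the TPC-H
orders table that makes this explicit:

EXPLAIN (COSTS OFF)
SELECT count(*)
FROM orders
WHERE o_orderdate >= date '1993-08-01'
  AND o_orderdate < date '1993-08-01' + interval '3' month;
-- The Filter line in the output shows the pre-computed constant instead of the arithmetic:
--   (o_orderdate < '1993-11-01 00:00:00'::timestamp without time zone)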
Re: Unexpected Performance for the Function simplify_function
Ba Jinsheng writes:
>> It looks like the better plan involves a nestloop with inner indexscan on lineitem,
>> which is something whose estimated cost depends enormously on random_page_cost.
>> You've given us exactly zero detail about your test conditions, so it's hard to say
>> more than that.
> I used the default configuration in the file
> src/backend/utils/misc/postgresql.conf.sample
> So the random_page_cost = 4.0

You're still admitting to nothing as to the hardware you are running this test on.
However, 4.0 is a number we chose decades ago based on typical performance of
spinning-rust storage. It's not very appropriate for SSD or similar storage --
numbers just a bit above 1 are probably the most appropriate thing for that kind of
storage. (There are ongoing discussions about changing the setting's default value,
but so far not backed by any great deal of hard evidence.)

regards, tom lane
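A quick session-local experiment along these lines (1.1 is just an assumed SSD-ish value
for illustration, not a tuned recommendation):

SET random_page_cost = 1.1;
-- re-run EXPLAIN (ANALYZE) on query 10 here and compare the chosen plan and its cost
RESET random_page_cost;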
Re: Unexpected Performance for the Function simplify_function
> So, I wonder if you could analyse the path-choosing logic, determine the costs of
> competing paths, and explain why NestLoop wasn't chosen.

To be honest, it is a bit challenging for me. I guess the better query plan is not
considered when comparing the cost of paths?

Best regards,
Jinsheng Ba

From: Andrei Lepikhov
Sent: Friday, October 25, 2024 4:13 AM
To: Ba Jinsheng
Cc: [email protected]
Subject: Re: Unexpected Performance for the Function simplify_function

On 10/25/24 02:43, Ba Jinsheng wrote:
> I am not proposing a fixing patch, as the patch is incorrect. Instead, I just want to
> show that disabling the simplify_function() function brings a performance benefit,
> and that seems unexpected. I am wondering whether we can optimize simplify_function()
> to make the performance better for this workload?

I also looked into your case. Using AQO and settling the correct cardinalities in each
node, I found that the plan doesn't change at all. So, I wonder if you could analyse the
path-choosing logic, determine the costs of competing paths, and explain why NestLoop
wasn't chosen. Maybe there is some kind of early selectivity estimation error, or
something even deeper: a specific distribution of tuples across the blocks of the heap
table.

--
regards, Andrei Lepikhov
Re: Unexpected Performance for the Function simplify_function
On Fri, 25 Oct 2024 at 22:26, Ba Jinsheng wrote:
> I guess the better query plan is not considered when comparing the cost of paths?

You might want to check that effective_cache_size is set high enough. Something like
50-75% of RAM is likely fine.

David
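A sketch of applying that, assuming roughly 16 GB of RAM on the test machine (adjust the
value to your hardware):

ALTER SYSTEM SET effective_cache_size = '12GB';
SELECT pg_reload_conf();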
Re: Unexpected Performance for the Function simplify_function
> It looks like the better plan involves a nestloop with inner indexscan on lineitem,
> which is something whose estimated cost depends enormously on random_page_cost.
> You've given us exactly zero detail about your test conditions, so it's hard to say
> more than that.

I used the default configuration in the file
src/backend/utils/misc/postgresql.conf.sample, so random_page_cost = 4.0.

Best regards,
Jinsheng Ba
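For completeness, the values actually in effect for the planner-relevant settings, and
where they come from, can be checked with a query like this (a sketch):

SELECT name, setting, unit, source
FROM pg_settings
WHERE name IN ('random_page_cost', 'effective_cache_size', 'shared_buffers', 'work_mem');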
Re: lwlock:LockManager wait_events
On Fri, 25 Oct 2024, 14:36 James Pang wrote:
> experts,
> we faced into a lot of lwlock:LockManager wait-events; all of these queries are
> "select ...", and there are several other sessions doing DML (insert/update/delete)
> on the same table. Did these DML transactions holding "transactionid" and "tuple"
> locks block the "select" on lwlock:LockManager?

Which version of PostgreSQL?

Are these all the same query, or variants of the same query (with different parameters)?

What's the schema (you can redact column names) for the tables involved in the select
query which is blocked/waiting on lock_manager?

> Thanks,
>
> James
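A sketch for gathering that information; 'my_partitioned_table' is only a placeholder for
one of the tables referenced by the waiting queries:

SELECT version();

SELECT count(*) AS direct_partitions
FROM pg_inherits
WHERE inhparent = 'my_partitioned_table'::regclass;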
