Re: Leading comments and client applications
Hi, >I was able to narrow this down to something in SQLAlchemy behavior. Fine :) Something about the way TextClause changes the raw SQL string causes the behavior I’m seeing, although we didn’t notice it at the time of the changeover. I don’t know what exactly it’s doing yet, but when I switch back to passing a DDLElement to execute(), my SQL function is created as I expected. Alternate option if you want continue to use TextClause: use /* comment */ for first prefix comment. Comment is logged and query executed (tested on Java ( not on SQLAlchemy )). We use it to track back the request id executed like that query = em.createNativeQuery("/*requete_enregistree_num_" + requete.getId() + "*/ " + requete.getReqRequete().trim()); Philippe On 25/03/2022 19:05, Philip Semanchuk wrote: On Mar 25, 2022, at 11:59 AM, Tom Lane wrote: Philip Semanchuk writes: I'm trying to understand a behavior where, with our Postgres client, a leading comment in a SQL script causes the CREATE FUNCTION statement following it to be not executed. I can't figure out if this is a bug somewhere or just a misunderstanding on my part. I would appreciate some help understanding. Are you certain there's actually a newline after the comment? The easiest explanation for this would be if something in the SQLAlchemy code path were munging the newline. I verified that there is a newline after the comment. But yes, thanks to your suggestion and others, I was able to narrow this down to something in SQLAlchemy behavior. In case anyone else comes across this and is wondering -- In addition to accepting a plain string, execute() accepts a number of different SQLAlchemy data types, including TextClause and DDLElement. We used to pass a DDLElement to execute(), but a few months ago we switched to passing a TextClause because DDLElement interprets % signs anywhere in SQL scripts as Python string interpolation markers and that was causing us headaches in some scripts. Something about the way TextClause changes the raw SQL string causes the behavior I’m seeing, although we didn’t notice it at the time of the changeover. I don’t know what exactly it’s doing yet, but when I switch back to passing a DDLElement to execute(), my SQL function is created as I expected. https://docs.sqlalchemy.org/en/13/core/connections.html#sqlalchemy.engine.Connection.execute As David J pointed out, execute() is deprecated as of version 1.4. We’re still on 1.3 but we’ll have to move away from this code eventually so maybe this is a good inspiration to move away from execute() now and reduce the number of deprecation warnings we have to deal with in the future. As far as the comparison behavior goes, psql's parser strips comments that start with double dashes, for $obscure_reasons. The server is perfectly capable of ignoring those by itself, though. (Awhile back I tried to remove that psql behavior, but it caused too much churn in our regression tests.) Thanks, this is most helpful. I use psql to double check I think SQLAlchemy is doing something odd. It’s good to know that psql's behavior in this case is a choice and not required behavior for clients. Peter J. Holzer’s psycopg2 example could have showed me the same; I wish I had thought of that. I appreciate all the help! Cheers Philip -- 📌 Le nom de domaine de nos adresses mails évolue et devient @arche-mc2.fr. My Logo arche-mc2.fr <https://cloud.letsignit.com/collect/bc/61f021ab21488edd71fe55fe?p=T6Q_2TuhlnOa6Vsx6-MT2IBatnqI7vmjUul2-_EjYgUpYaHQ5q1LqjIR9-pxUHfikOYW7y2eM8WGWMEw6PlnMfKnTaOZP-y2iUHa7jlBrUK_DYDxjdcFR32CefiLPg-hrxT5OaW-c3bv9zu5BEdrLA==> Philippe DOUSSOT ARCHITECTE TECHNIQUE DIRECTION DES SOLUTIONS ARCHE MC2 DOMICILE philippe.doussot@arche‑mc2.fr
update non-indexed value is slow if some non-related index/fk are enabled
Hi all, I'm look for some hint about this behaviour : One UPDATE of one boolean value which is non-indexed take some time ( for many tuples .. 3 Millions ). But if we disable all indexes/fk the UPDATE is 10x faster and do 10x less shared hit . I don't understand why disabling all index from the table speed up the update because the boolean column is not indexed For exemple , disabling the index was done like this : UPDATE pg_index SET indisready=false WHERE indexrelid in (620809,620837,620839,620841,620854,618764,620855,620790,620790,620840); In both UPDATE with or without index enabled/disable, the PLAN is the same. Postgres 9.5 : Explain with index : slow Update on public.t_acte_entite (cost=0.00..52924.83 rows=2758149 width=45) (actual time=120908.338..120908.338 rows=0 loops=1) Buffers: shared hit=101546408 read=164106 dirtied=171996 written=9529 -> Seq Scan on public.t_acte_entite (cost=0.00..52924.83 rows=2758149 width=45) (actual time=0.028..1783.946 rows=3033768 loops=1) Output: id_acte_entite, fk_acte, fk_rubrique, fk_client, fk_salarie, fk_cycldet, fk_planning, act_user_modified, fk_ssiad_visite, fk_ssiad_visite_planning, true, acte_realise, acte_realise_modifie, ctid Filter: ((NOT t_acte_entite.acte_prevu) AND ((t_acte_entite.fk_ssiad_visite_planning IS NOT NULL) OR (t_acte_entite.fk_ssiad_visite IS NOT NULL))) Rows Removed by Filter: 16115 Buffers: shared hit=10 read=22416 written=9441 Planning time: 0.133 ms Execution time: 120908.387 ms Explain with index disabled : Update on public.t_acte_entite (cost=0.00..52924.83 rows=2757021 width=45) (actual time=9007.353..9007.353 rows=0 loops=1) Buffers: shared hit=9145912 read=44740 dirtied=44620 written=60 -> Seq Scan on public.t_acte_entite (cost=0.00..52924.83 rows=2757021 width=45) (actual time=2.608..1121.776 rows=3033768 loops=1) Output: id_acte_entite, fk_acte, fk_rubrique, fk_client, fk_salarie, fk_cycldet, fk_planning, act_user_modified, fk_ssiad_visite, fk_ssiad_visite_planning, true, acte_realise, acte_realise_modifie, ctid Filter: ((NOT t_acte_entite.acte_prevu) AND ((t_acte_entite.fk_ssiad_visite_planning IS NOT NULL) OR (t_acte_entite.fk_ssiad_visite IS NOT NULL))) Rows Removed by Filter: 16115 Buffers: shared hit=1 read=22425 written=60 Planning time: 5.835 ms Execution time: 9007.400 ms You can see that the Buffers: shared hit= drop from 101 Millions to 10Million without the indexes But as the query plan only do a Sec Scan and don't acces an index I'm little surprised. Same on postgres 13 : //without explain to know how many update are done : test=# update t_acte_entite set acte_prevu = true where (fk_ssiad_visite_planning is not null or fk_ssiad_visite is not null); UPDATE 3049406 //First explain ( with index enabled ) test=# explain (analyse, COSTS, buffers, verbose, FORMAT JSON) update t_acte_entite set acte_prevu = true where (fk_ssiad_visite_planning is not null or fk_ssiad_visite is not null); [ { "Plan": { "Node Type": "ModifyTable", "Operation": "Update", "Parallel Aware": false, "Relation Name": "t_acte_entite", "Schema": "public", "Alias": "t_acte_entite", "Startup Cost": 0.00, "Total Cost": 87603.92, "Plan Rows": 2748711, "Plan Width": 46, "Actual Startup Time": 74072.048, "Actual Total Time": 74072.049, "Actual Rows": 0, "Actual Loops": 1, "Shared Hit Blocks": 89724572, "Shared Read Blocks": 115370, "Shared Dirtied Blocks": 101855, "Shared Written Blocks": 54351, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Read Blocks": 0, "Temp Written Blocks": 0, "Plans": [ { "Node Type": "Seq Scan", "Parent Relationship": "Member", "Parallel Aware": false, "Relation Name": "t_acte_entite", "Schema": "public", "Alias": "t_acte_entite", "Startup Cost": 0.00, "Total Cost": 87603.92, "Plan Rows": 2748711, "Plan Width": 46, "Actual Startup Time": 0.047, "Actual Total Time": 1040.544, "Actual Rows": 3049406, "Actual Loops": 1, "Output": ["id_acte_entite", "fk_acte", "fk_rubrique", "fk_client", "fk_salarie", "fk_cycldet", "fk_planning", "act_user_modified", "fk_ssiad_visite", "fk_ssiad_visite_planning", "true", "acte_realise", "acte_realise_modifie", "ctid"], "Filter": "((t_acte_entite.fk_ssiad_visite_planning IS NOT NULL) OR (t_acte_entite.fk_ssiad_visite IS NOT NULL))", "Rows Removed by Filter": 477, "Shared Hit Blocks": 4334, "Shared Read Blocks": 53032, "Shared Dirtied Blocks": 0, "Shared Written Blocks": 14709, "Local Hit Blocks": 0, "Local Read Blocks": 0, "Local Dirtied Blocks": 0, "Local Written Blocks": 0, "Temp Rea
Re: update non-indexed value is slow if some non-related index/fk are enabled
On 06/09/2021 10:21, rob stone wrote: On Sun, 2021-09-05 at 11:21 -0400, Tom Lane wrote: "David G. Johnston" writes: On Friday, September 3, 2021, Philippe Doussot < philippe.dous...@up.coop> wrote: I don't understand why disabling all index from the table speed up the update because the boolean column is not indexed Index entries point to physical records. You just deleted one physical record and added another. The indexes need to be updated with that information. Yeah. The OP's mental model is apparently update-in-place, but that's not how Postgres does things. The index-update overhead is avoided if the update is "HOT", which requires that (a) no indexed column changes and (b) there is room on the same page for the new copy of the row. Ensuring (b) requires running with a fairly low fill-factor, which bloats your table and thereby creates its own costs. Still, that might be worth doing depending on your particular circumstances. regards, tom lane If the DDL for that table had the column defined like this:- my_boolean BOOLEAN, instead of:- my_boolean BOOLEAN NOT NULL DEFAULT FALSE/TRUE, (whichever is convenient) then that column would contain either 'f' or 't' on insert instead of null. Then even if a fillfactor was not specified for that table, an update of that single column (which does not appear in an index) would merely swap the values. Surely that would write it back in place? Also, having boolean columns containing a null makes it difficult for the getter's of that table deciding if 'null' is true or false. Just an observation. Rob Yes my columns are NOT NULL DEFAULT FALSE but the update always change the ctid ( new row in page ) I whas hopping the same optimisation as you: Write in place. For boolean it is maybe doable because the value is fixed in size but for variable length .. I was also expecting no row rewrite if value don't change .. easy for boolean but not for bigger fields Philippe
Re: update non-indexed value is slow if some non-related index/fk are enabled
On 05/09/2021 17:21, Tom Lane wrote: "David G. Johnston" writes: On Friday, September 3, 2021, Philippe Doussot wrote: I don't understand why disabling all index from the table speed up the update because the boolean column is not indexed Index entries point to physical records. You just deleted one physical record and added another. The indexes need to be updated with that information. Yeah. The OP's mental model is apparently update-in-place, but that's not how Postgres does things. The index-update overhead is avoided if the update is "HOT", which requires that (a) no indexed column changes and (b) there is room on the same page for the new copy of the row. Ensuring (b) requires running with a fairly low fill-factor, which bloats your table and thereby creates its own costs. Still, that might be worth doing depending on your particular circumstances. regards, tom lane Hi, Thank you both David and Tom for this precious information. Yes, I was thinking that the update was made in-place because a boolean true or false take the same size :). I did't expect the ctid was changed. The main reason is effectivly > (b) there is (no) room on the same page for the new copy of the row. I now see that the default TABLE fillfactor is 100. No room for update in same page by default (our case). The CTID is changing for each update (even if I UPDATE many time the same boolean column with same value false, false, false .. maybe a room for optimisation here ) And with some room (lower fillfactor) I see the HOT working well with the help from https://habr.com/en/company/postgrespro/blog/483768/ , Thanks Егор Рогов @erogov Disabing index during update obviously put index out of sync with the new row location which require an reindex. I can now better explain to my team why this update is slow ( without lower fillfactor). Why it is quick with index disabled. Why disabling index without reindexing it after enabling it is a very bad idea .. the ctid as changed (without HOT update) and index is out-of-date. Many thanks Philippe
Re: update non-indexed value is slow if some non-related index/fk are enabled
On 06/09/2021 16:13, David G. Johnston wrote: On Monday, September 6, 2021, Philippe Doussot mailto:philippe.dous...@up.coop>> wrote: I whas hopping the same optimisation as you: Write in place. How exactly would you expect “update-in-place” to work given the nature of MVCC? David J. Yes right, it is probably not possible due to concurrency. I don't know enough about it. My expectations was naive Again, thanks David for pointing me to the fact that the raw record was new on each update. Philippe
idle_in_transaction_session_timeout
Hi, Just want to say THANK YOU all (Robert Haas, Vik Fearing, ..) for these feature I just discover available since 9.6 ! although I am still in version 9.5 Philippe