Re: Leading comments and client applications

2022-03-28 Thread Philippe Doussot

Hi,

>I was able to narrow this down to something in SQLAlchemy behavior.

Fine :)


Something about the way TextClause changes the raw SQL string causes the 
behavior I’m seeing, although we didn’t notice it at the time of the changeover.
I don’t know what exactly it’s doing yet, but when I switch back to passing a DDLElement to execute(), my SQL function is created as I expected. 


Alternate option if you want continue to use  TextClause:

use /* comment */ for first prefix comment.

Comment is logged and query executed (tested on Java ( not on SQLAlchemy )).
We use it to track back the request id executed like that

query = em.createNativeQuery("/*requete_enregistree_num_" + requete.getId() + "*/ 
" + requete.getReqRequete().trim());


Philippe

On 25/03/2022 19:05, Philip Semanchuk wrote:



On Mar 25, 2022, at 11:59 AM, Tom Lane  wrote:

Philip Semanchuk  writes:

I'm trying to understand a behavior where, with our Postgres client, a leading 
comment in a SQL script causes the CREATE FUNCTION statement following it to be 
not executed. I can't figure out if this is a bug somewhere or just a 
misunderstanding on my part. I would appreciate some help understanding.

Are you certain there's actually a newline after the comment?
The easiest explanation for this would be if something in the
SQLAlchemy code path were munging the newline.

I verified that there is a newline after the comment. But yes, thanks to your 
suggestion and others, I was able to narrow this down to something in 
SQLAlchemy behavior. In case anyone else comes across this and is wondering --

In addition to accepting a plain string, execute() accepts a number of 
different SQLAlchemy data types, including TextClause and DDLElement. We used 
to pass a DDLElement to execute(), but a few months ago we switched to passing 
a TextClause because DDLElement interprets % signs anywhere in SQL scripts as 
Python string interpolation markers and that was causing us headaches in some 
scripts. Something about the way TextClause changes the raw SQL string causes 
the behavior I’m seeing, although we didn’t notice it at the time of the 
changeover. I don’t know what exactly it’s doing yet, but when I switch back to 
passing a DDLElement to execute(), my SQL function is created as I expected.

https://docs.sqlalchemy.org/en/13/core/connections.html#sqlalchemy.engine.Connection.execute

As David J pointed out, execute() is deprecated as of version 1.4. We’re still 
on 1.3 but we’ll have to move away from this code eventually so maybe this is a 
good inspiration to move away from execute() now and reduce the number of 
deprecation warnings we have to deal with in the future.



As far as the comparison behavior goes, psql's parser strips
comments that start with double dashes, for $obscure_reasons.
The server is perfectly capable of ignoring those by itself,
though.  (Awhile back I tried to remove that psql behavior,
but it caused too much churn in our regression tests.)


Thanks, this is most helpful. I use psql to double check I think SQLAlchemy is 
doing something odd. It’s good to know that psql's behavior in this case is a 
choice and not required behavior for clients. Peter J. Holzer’s psycopg2 
example could have showed me the same; I wish I had thought of that.


I appreciate all the help!

Cheers
Philip






--

📌 Le nom de domaine de nos adresses mails évolue et devient @arche-mc2.fr.


My Logo

arche-mc2.fr 
<https://cloud.letsignit.com/collect/bc/61f021ab21488edd71fe55fe?p=T6Q_2TuhlnOa6Vsx6-MT2IBatnqI7vmjUul2-_EjYgUpYaHQ5q1LqjIR9-pxUHfikOYW7y2eM8WGWMEw6PlnMfKnTaOZP-y2iUHa7jlBrUK_DYDxjdcFR32CefiLPg-hrxT5OaW-c3bv9zu5BEdrLA==>






Philippe DOUSSOT


ARCHITECTE TECHNIQUE


DIRECTION DES SOLUTIONS ARCHE MC2 DOMICILE


philippe.doussot@arche‑mc2.fr




update non-indexed value is slow if some non-related index/fk are enabled

2021-09-05 Thread Philippe Doussot

Hi all,

I'm look for some hint about this behaviour :


One UPDATE of one boolean value which is non-indexed take some time ( for many 
tuples .. 3 Millions ).

But if we disable all indexes/fk the UPDATE is 10x faster and do 10x less 
shared hit .

I don't understand why disabling all index from the table speed up the update 
because the boolean column is not indexed


For exemple , disabling the index was done like this :

UPDATE pg_index SET indisready=false WHERE indexrelid in 
(620809,620837,620839,620841,620854,618764,620855,620790,620790,620840);


In both UPDATE with or without index enabled/disable, the PLAN is the same.


Postgres 9.5 :
Explain with index : slow
Update on public.t_acte_entite (cost=0.00..52924.83 rows=2758149 width=45) 
(actual time=120908.338..120908.338 rows=0 loops=1)
Buffers: shared hit=101546408 read=164106 dirtied=171996 written=9529
-> Seq Scan on public.t_acte_entite (cost=0.00..52924.83 rows=2758149 width=45) 
(actual time=0.028..1783.946 rows=3033768 loops=1)
Output: id_acte_entite, fk_acte, fk_rubrique, fk_client, fk_salarie, 
fk_cycldet, fk_planning, act_user_modified, fk_ssiad_visite, 
fk_ssiad_visite_planning, true, acte_realise, acte_realise_modifie, ctid
Filter: ((NOT t_acte_entite.acte_prevu) AND 
((t_acte_entite.fk_ssiad_visite_planning IS NOT NULL) OR 
(t_acte_entite.fk_ssiad_visite IS NOT NULL)))
Rows Removed by Filter: 16115
Buffers: shared hit=10 read=22416 written=9441
Planning time: 0.133 ms
Execution time: 120908.387 ms

Explain with index disabled :
Update on public.t_acte_entite (cost=0.00..52924.83 rows=2757021 width=45) 
(actual time=9007.353..9007.353 rows=0 loops=1)
Buffers: shared hit=9145912 read=44740 dirtied=44620 written=60
-> Seq Scan on public.t_acte_entite (cost=0.00..52924.83 rows=2757021 width=45) 
(actual time=2.608..1121.776 rows=3033768 loops=1)
Output: id_acte_entite, fk_acte, fk_rubrique, fk_client, fk_salarie, 
fk_cycldet, fk_planning, act_user_modified, fk_ssiad_visite, 
fk_ssiad_visite_planning, true, acte_realise, acte_realise_modifie, ctid
Filter: ((NOT t_acte_entite.acte_prevu) AND 
((t_acte_entite.fk_ssiad_visite_planning IS NOT NULL) OR 
(t_acte_entite.fk_ssiad_visite IS NOT NULL)))
Rows Removed by Filter: 16115
Buffers: shared hit=1 read=22425 written=60
Planning time: 5.835 ms
Execution time: 9007.400 ms

You can see that the Buffers: shared hit= drop from 101 Millions to 10Million 
without the indexes
But as the query plan only do a Sec Scan and don't acces an index I'm little 
surprised.


Same on postgres 13 :

//without explain to know how many update are done :
test=# update t_acte_entite set acte_prevu = true where 
(fk_ssiad_visite_planning is not null or fk_ssiad_visite is not null);
UPDATE 3049406

//First explain ( with index enabled )
test=# explain (analyse, COSTS, buffers, verbose, FORMAT JSON) update 
t_acte_entite set acte_prevu = true where (fk_ssiad_visite_planning is not null 
or fk_ssiad_visite is not null);
[
  {
    "Plan": {
  "Node Type": "ModifyTable",
  "Operation": "Update",
  "Parallel Aware": false,
  "Relation Name": "t_acte_entite",
  "Schema": "public",
  "Alias": "t_acte_entite",
  "Startup Cost": 0.00,
  "Total Cost": 87603.92,
  "Plan Rows": 2748711,
  "Plan Width": 46,
  "Actual Startup Time": 74072.048,
  "Actual Total Time": 74072.049,
  "Actual Rows": 0,
  "Actual Loops": 1,
  "Shared Hit Blocks": 89724572,
  "Shared Read Blocks": 115370,
  "Shared Dirtied Blocks": 101855,
  "Shared Written Blocks": 54351,
  "Local Hit Blocks": 0,
  "Local Read Blocks": 0,
  "Local Dirtied Blocks": 0,
  "Local Written Blocks": 0,
  "Temp Read Blocks": 0,
  "Temp Written Blocks": 0,
  "Plans": [
    {
  "Node Type": "Seq Scan",
  "Parent Relationship": "Member",
  "Parallel Aware": false,
  "Relation Name": "t_acte_entite",
  "Schema": "public",
  "Alias": "t_acte_entite",
  "Startup Cost": 0.00,
  "Total Cost": 87603.92,
  "Plan Rows": 2748711,
  "Plan Width": 46,
  "Actual Startup Time": 0.047,
  "Actual Total Time": 1040.544,
  "Actual Rows": 3049406,
  "Actual Loops": 1,
  "Output": ["id_acte_entite", "fk_acte", "fk_rubrique", "fk_client", "fk_salarie", "fk_cycldet", "fk_planning", 
"act_user_modified", "fk_ssiad_visite", "fk_ssiad_visite_planning", "true", "acte_realise", "acte_realise_modifie", "ctid"],
  "Filter": "((t_acte_entite.fk_ssiad_visite_planning IS NOT NULL) OR 
(t_acte_entite.fk_ssiad_visite IS NOT NULL))",
  "Rows Removed by Filter": 477,
  "Shared Hit Blocks": 4334,
  "Shared Read Blocks": 53032,
  "Shared Dirtied Blocks": 0,
  "Shared Written Blocks": 14709,
  "Local Hit Blocks": 0,
  "Local Read Blocks": 0,
  "Local Dirtied Blocks": 0,
  "Local Written Blocks": 0,
  "Temp Rea

Re: update non-indexed value is slow if some non-related index/fk are enabled

2021-09-06 Thread Philippe Doussot

On 06/09/2021 10:21, rob stone wrote:


On Sun, 2021-09-05 at 11:21 -0400, Tom Lane wrote:

"David G. Johnston"  writes:

On Friday, September 3, 2021, Philippe Doussot <
philippe.dous...@up.coop>
wrote:

I don't understand why disabling all index from the table speed
up the
update because the boolean column is not indexed

Index entries point to physical records.  You just deleted one
physical
record and added another.  The indexes need to be updated with that
information.

Yeah.  The OP's mental model is apparently update-in-place, but
that's
not how Postgres does things.

The index-update overhead is avoided if the update is "HOT", which
requires that (a) no indexed column changes and (b) there is room
on the same page for the new copy of the row.  Ensuring (b) requires
running with a fairly low fill-factor, which bloats your table and
thereby creates its own costs.  Still, that might be worth doing
depending on your particular circumstances.

 regards, tom lane



If the DDL for that table had the column defined like this:-

my_boolean  BOOLEAN,

instead of:-

my_boolean  BOOLEAN NOT NULL DEFAULT FALSE/TRUE, (whichever is
convenient)

then that column would contain either 'f' or 't' on insert instead of
null.

Then even if a fillfactor was not specified for that table, an update
of that single column (which does not appear in an index) would merely
swap the values.
Surely that would write it back in place?

Also, having boolean columns containing a null makes it difficult for
the getter's of that table deciding if 'null' is true or false.

Just an observation.

Rob


Yes my columns are NOT NULL DEFAULT FALSE

but the update always change the ctid ( new row in page )

I whas hopping the same optimisation as you: Write in place.

For boolean it is maybe doable because the value is fixed in size but for 
variable length ..

I was also expecting no row rewrite if value don't change .. easy for boolean 
but not for bigger fields


Philippe




Re: update non-indexed value is slow if some non-related index/fk are enabled

2021-09-06 Thread Philippe Doussot

On 05/09/2021 17:21, Tom Lane wrote:

"David G. Johnston"  writes:

On Friday, September 3, 2021, Philippe Doussot
wrote:

I don't understand why disabling all index from the table speed up the
update because the boolean column is not indexed

Index entries point to physical records.  You just deleted one physical
record and added another.  The indexes need to be updated with that
information.

Yeah.  The OP's mental model is apparently update-in-place, but that's
not how Postgres does things.

The index-update overhead is avoided if the update is "HOT", which
requires that (a) no indexed column changes and (b) there is room
on the same page for the new copy of the row.  Ensuring (b) requires
running with a fairly low fill-factor, which bloats your table and
thereby creates its own costs.  Still, that might be worth doing
depending on your particular circumstances.

regards, tom lane

Hi,

Thank you both David and Tom for this precious information.

Yes, I was thinking that the update was made in-place because a boolean true or 
false take the same size :).
I did't expect the ctid was changed.

The main reason is effectivly
> (b) there is (no) room on the same page for the new copy of the row.

I now see that the default TABLE fillfactor is 100. No room for update in same 
page by default (our case).

The CTID is changing for each update (even if I UPDATE many time the same 
boolean column with same value false, false, false .. maybe a room for 
optimisation here )

And with some room (lower fillfactor) I see the HOT working well with the help 
from https://habr.com/en/company/postgrespro/blog/483768/ , Thanks Егор Рогов 
@erogov


Disabing index during update obviously put index out of sync with the new row 
location which require an reindex.

I can now better explain to my team why this update is slow ( without lower 
fillfactor).
Why it is quick with index disabled.
Why disabling index without reindexing it after enabling it is a very bad idea 
.. the ctid as changed (without HOT update) and index is out-of-date.

Many thanks

Philippe




Re: update non-indexed value is slow if some non-related index/fk are enabled

2021-09-07 Thread Philippe Doussot


On 06/09/2021 16:13, David G. Johnston wrote:

On Monday, September 6, 2021, Philippe Doussot mailto:philippe.dous...@up.coop>> wrote:

I whas hopping the same optimisation as you: Write in place.



How exactly would you expect “update-in-place” to work given the nature of MVCC?

David J.


Yes right, it is probably not possible due to concurrency.

I don't know enough about it.

My expectations was naive


Again, thanks David for pointing me to the fact that the raw record was new on 
each update.


Philippe



idle_in_transaction_session_timeout

2021-10-06 Thread Philippe Doussot

Hi,


Just want to say THANK YOU all (Robert Haas, Vik Fearing, ..) for these feature 
I just discover available since 9.6 !


although I am still in version 9.5


Philippe