Performance of UPDATE operation

2023-02-13 Thread Mkrtchyan, Tigran

Dear Postgres Folks,

Typically we expect that UPDATE is a slow operation in PostgreSQL, however,
there are cases where it's hard to understand why. In particular, I have a 
table like

```
CREATE SEQUENCE t_inodes_inumber_seq
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1;


CREATE TABLE t_inodes (
inumber bigint PRIMARY KEY,
icrtime timestamp with time zone NOT NULL,
igeneration bigint NOT NULL
);
```

and a transaction that inserts and then updates an entry in that table:

```
BEGIN;
INSERT INTO t_inodes (inumber, icrtime, igeneration)
   VALUES (nextval('t_inodes_inumber_seq'), now(), 0) RETURNING inumber \gset

UPDATE t_inodes SET igeneration = igeneration + 1 where  inumber = :inumber;
END;
```

pgbench shows the following result:

```
$ pgbench -h localhost -n -r  -f update.sql -t 1 -c 64 -j 64 testdb
pgbench (15.0 (Debian 15.0-1.pgdg110+1))
transaction type: update.sql
scaling factor: 1
query mode: simple
number of clients: 64
number of threads: 64
maximum number of tries: 1
number of transactions per client: 1
number of transactions actually processed: 64/64
number of failed transactions: 0 (0.000%)
latency average = 11.559 ms
initial connection time = 86.038 ms
tps = 5536.736898 (without initial connection time)
statement latencies in milliseconds and failures:
 0.524   0  BEGIN;
 0.819   0  INSERT INTO t_inodes (inumber, icrtime, igeneration)
 0.962   0  UPDATE t_inodes SET igeneration = igeneration + 1 where  inumber = :inumber;
 9.203   0  END;
```

My naive expectation would be that updating the newly inserted record should
cost nothing. Are there ways to make it less expensive?

Best regards,
   Tigran.



Re: Performance of UPDATE operation

2023-02-13 Thread Laurenz Albe
On Mon, 2023-02-13 at 16:09 +0100, Mkrtchyan, Tigran wrote:
> My naive expectation would be that updating the newly inserted record should
> cost nothing. Are there ways to make it less expensive?

Updating a newly inserted row is about as expensive as inserting the row in the
first place.

You can reduce the overall impact somewhat by creating the table with a
"fillfactor" below 100; in your case 90 would probably be enough.  That won't
speed up the UPDATE itself, but it should greatly reduce the need for VACUUM.
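For concreteness, here is a sketch of that suggestion. fillfactor is a
standard storage parameter; the ALTER TABLE variant only affects pages
written after the change:

```sql
-- Sketch: leave 10% free space on each heap page so that updates can
-- often be done as HOT (heap-only tuple) updates within the same page,
-- avoiding index maintenance and reducing VACUUM work.
CREATE TABLE t_inodes (
    inumber     bigint PRIMARY KEY,
    icrtime     timestamp with time zone NOT NULL,
    igeneration bigint NOT NULL
) WITH (fillfactor = 90);

-- For an existing table (only affects newly written pages):
ALTER TABLE t_inodes SET (fillfactor = 90);
```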

Yours,
Laurenz Albe




Re: Performance of UPDATE operation

2023-02-13 Thread Oluwatobi Ogunsola
Maybe reconsider your expectation.
Note: every UPDATE has to "select" the row before modifying it.
Even if the page is already in memory, there is still work: reading the tuple,
acquiring locks, modifying it, and requesting the write to disk.
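As a rough illustration (assuming the schema from the original post), the
lookup step described above is visible in the plan of the UPDATE itself:

```sql
-- Sketch: the UPDATE first has to find the row (here via the primary key
-- index) before it can lock and modify it; EXPLAIN makes that step visible.
-- The inumber value 42 is just an arbitrary example.
EXPLAIN (ANALYZE, BUFFERS)
UPDATE t_inodes SET igeneration = igeneration + 1 WHERE inumber = 42;
```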


Regards,
Tobi

> On 13 Feb 2023, at 18:48, Laurenz Albe  wrote:
> 
> Updating a newly inserted row is about as expensive as inserting the row in
> the first place.
> 
> You can reduce the overall impact somewhat by creating the table with a
> "fillfactor" below 100; in your case 90 would probably be enough.  That won't
> speed up the UPDATE itself, but it should greatly reduce the need for VACUUM.
> 
> Yours,
> Laurenz Albe
> 
> 





Re: For loop execution times in PostgreSQL 12 vs 15

2023-02-13 Thread Andres Freund
Hi,

On 2023-02-10 20:45:39 +0100, Pavel Stehule wrote:
> But for significant improvements it needs some form of JIT (Postgres has JIT
> for SQL expressions, but it is not used for PLpgSQL expressions). On second
> hand, PL/pgSQL is not designed (and usually) not used for extensive numeric
> calculations like this. But if somebody try to enhance performance, (s)he
> will be welcome every time (I think so there is some space for 2x better
> performance - but it requires JIT).

I think there's a *lot* of performance gain to be had before JIT is
required. Or before JIT really can do a whole lot.

We do a lot of work for each plpgsql statement / expr. Most of the time
typically isn't spent actually evaluating expressions, but doing setup /
invalidation work.

E.g. here's a profile of the test() function from upthread:

  Overhead  Command   Shared Object Symbol
+   17.31%  postgres  plpgsql.so[.] exec_stmts
+   15.43%  postgres  postgres  [.] ExecInterpExpr
+   14.29%  postgres  plpgsql.so[.] exec_eval_expr
+   11.79%  postgres  plpgsql.so[.] exec_assign_value
+7.06%  postgres  plpgsql.so[.] plpgsql_param_eval_var
+6.58%  postgres  plpgsql.so[.] exec_assign_expr
+4.82%  postgres  postgres  [.] recomputeNamespacePath
+3.90%  postgres  postgres  [.] CachedPlanIsSimplyValid
+3.45%  postgres  postgres  [.] dtoi8
+3.02%  postgres  plpgsql.so[.] exec_stmt_fori
+2.88%  postgres  postgres  [.] OverrideSearchPathMatchesCurrent
+2.76%  postgres  postgres  [.] EnsurePortalSnapshotExists
+2.16%  postgres  postgres  [.] float8mul
+1.62%  postgres  postgres  [.] MemoryContextReset

Some of this is a bit distorted due to inlining (e.g. exec_eval_simple_expr()
is attributed to exec_eval_expr()).


Most of the checks we do ought to be done once, at the start of plpgsql
evaluation, rather than be done over and over, during evaluation.

For things like simple exprs, we likely could gain a lot by pushing more of
the work into ExecEvalExpr(), rather than calling ExecEvalExpr() multiple
times.

The memory layout of plpgsql statements should be improved, there's a lot of
unnecessary indirection. That's what e.g. hurts exec_stmts() a lot.

Greetings,

Andres




Re: Performance of UPDATE operation

2023-02-13 Thread Jeff Janes
On Mon, Feb 13, 2023 at 10:09 AM Mkrtchyan, Tigran wrote:

>
>  0.524   0  BEGIN;
>  0.819   0  INSERT INTO t_inodes (inumber, icrtime, igeneration)
>  0.962   0  UPDATE t_inodes SET igeneration = igeneration + 1 where  inumber = :inumber;
>  9.203   0  END;
> ```
>
> My naive expectation would be that updating the newly inserted record
> should cost nothing


It takes less than 1/10 of the total time.  That is pretty close to
nothing.  Why would you expect it to be truly free?


> ... Are there ways
> to make it less expensive?
>

Obviously here you could just insert the correct value in the first place
and not do the update at all.
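A minimal sketch of that idea, assuming the generation bump is unconditional:
insert the post-update value directly and drop the UPDATE statement:

```sql
-- Sketch: if igeneration is always incremented right after the insert,
-- insert the final value (1) in the first place and skip the UPDATE.
BEGIN;
INSERT INTO t_inodes (inumber, icrtime, igeneration)
   VALUES (nextval('t_inodes_inumber_seq'), now(), 1) RETURNING inumber \gset
END;
```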

Cheers,

Jeff


Re: For loop execution times in PostgreSQL 12 vs 15

2023-02-13 Thread Pavel Stehule
On Mon, 13 Feb 2023 at 22:22, Andres Freund wrote:

> Hi,
>
>
> I think there's a *lot* of performance gain to be had before JIT is
> required. Or before JIT really can do a whole lot.
>
> We do a lot of work for each plpgsql statement / expr. Most of the time
> typically isn't spent actually evaluating expressions, but doing setup /
> invalidation work.
>

And that is the reason why I think JIT can help.

You repeatedly read variables and branch on whether a variable has a fixed
length or is a varlena, and whether it is a native composite or a PL/pgSQL
composite; every time you check whether the target is mutable, and every time
you check whether the expression type is the same as the target type. The
PL/pgSQL compiler is very "lazy": lots of checks are executed (or repeated) at
runtime. Another question is the cost of the V1 calling convention. These
functions require some environment, and preparing this environment is
expensive. The SQL executor has a lot of parameters, and its setup is not
cheap.

There are cases where an expression uses only built-in stable or immutable
functions, operators, and types, and these types are immutable; such checks
could be done once. Maybe this could be extended with caching for different
search_paths, and then it would not be limited to built-in objects.




>
> E.g. here's a profile of the test() function from upthread:
>
>   Overhead  Command   Shared Object Symbol
> +   17.31%  postgres  plpgsql.so[.] exec_stmts
> +   15.43%  postgres  postgres  [.] ExecInterpExpr
> +   14.29%  postgres  plpgsql.so[.] exec_eval_expr
> +   11.79%  postgres  plpgsql.so[.] exec_assign_value
> +7.06%  postgres  plpgsql.so[.] plpgsql_param_eval_var
> +6.58%  postgres  plpgsql.so[.] exec_assign_expr
> +4.82%  postgres  postgres  [.] recomputeNamespacePath
> +3.90%  postgres  postgres  [.] CachedPlanIsSimplyValid
> +3.45%  postgres  postgres  [.] dtoi8
> +3.02%  postgres  plpgsql.so[.] exec_stmt_fori
> +2.88%  postgres  postgres  [.]
> OverrideSearchPathMatchesCurrent
> +2.76%  postgres  postgres  [.] EnsurePortalSnapshotExists
> +2.16%  postgres  postgres  [.] float8mul
> +1.62%  postgres  postgres  [.] MemoryContextReset
>
> Some of this is a bit distorted due to inlining (e.g.
> exec_eval_simple_expr()
> is attributed to exec_eval_expr()).
>
>
> Most of the checks we do ought to be done once, at the start of plpgsql
> evaluation, rather than be done over and over, during evaluation.
>
> For things like simple exprs, we likely could gain a lot by pushing more of
> the work into ExecEvalExpr(), rather than calling ExecEvalExpr() multiple
> times.
>
> The memory layout of plpgsql statements should be improved, there's a lot
> of
> unnecessary indirection. That's what e.g. hurts exec_stmts() a lot.
>
> Greetings,
>
> Andres
>