date:20230203

Re: Sequence vs UUID

2023-02-03 Thread Miles Elam

On Thu, Feb 2, 2023 at 11:47 AM veem v  wrote:

> Tested the UUIDv7 generator for postgres as below.
>
> With regards to performance , It's still way behind the sequence. I was
> expecting the insert performance of UUID v7 to be closer to the sequence ,
> but it doesn't seem so, as it's 500ms vs 3000ms. And the generation takes a
> lot longer time as compared to sequence too i.e. 59ms vs 1700ms. Read time
> or the index scan looks close i.e. 2.3ms vs 2.6ms.
>

Thank you for taking the effort in testing and measuring this.

Those numbers make some intuitive sense to me. The function is written in
plpgsql, not C, and is dependent on generating a UUIDv4 and then modifying
it to include the timestamp and version change. While I suspect it will
never beat a bigint by virtue of 64-bits will always be half the size of
128-bit, the read time on the index scan after it is generated is
encouraging with a strong suggestion there's a lot of low-hanging fruit for
improvement.

Also, like UUIDv4, v7 can be generated by clients, ameliorating the
generation bottleneck.

Once again, thank you for following up with good quality analysis.

Fwd: Sequence vs UUID

2023-02-03 Thread Dominique Devienne

Copying the list...

-- Forwarded message -
From: Dominique Devienne 
Date: Fri, Feb 3, 2023 at 4:57 PM
Subject: Re: Sequence vs UUID
To: veem v 

On Thu, Feb 2, 2023 at 8:47 PM veem v  wrote:

> Tested the UUIDv7 generator for postgres as below.
> With regards to performance , It's still way behind the sequence. [...]
> explain analyze  select count(nextval('myseq') ) from
> generate_series(1,10);
> Execution Time: 59.687 ms
>
> explain analyze  select count(gen_random_uuid()) from
> generate_series(1,100'000);
> Execution Time: 904.868 ms
>
> explain analyze  select count(uuid_generate_v7()) from
> generate_series(1,10);
> Execution Time: 1711.187 ms
>

Something's off regarding Guid generations IMHO...

You generate 100K Guids in ~1s. While we generate (in C++, Windows Release,
using Boost) 16M of them in +/- the same time:

Enabling Performance tests
>>>
>>

> generate 16'000'000 guids in 0.980s (user: 0.984s) 12 MB
>>>
>> generate 16'000'000 guids in parallel on 4 CPUs in 0.309s (user: 1.188s)
>>> 12 MB
>>>
>>
That's 2 orders of magnitude faster. Sure there's some overhead from the
SQL, but still. Something seems fishy.
And that's on a 2.5y old desktop. --DD

Re: Sequence vs UUID

2023-02-03 Thread veem v

Actually I did the testing by connecting to "https://dbfiddle.uk/";
postgres version -15.

PostgreSQL 15.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0
20210514 (Red Hat 8.5.0-10), 64-bit

Am I doing it wrong, please confirm?


On Fri, 3 Feb 2023 at 21:28, Dominique Devienne  wrote:

> Copying the list...
>
> -- Forwarded message -
> From: Dominique Devienne 
> Date: Fri, Feb 3, 2023 at 4:57 PM
> Subject: Re: Sequence vs UUID
> To: veem v 
>
> On Thu, Feb 2, 2023 at 8:47 PM veem v  wrote:
>
>> Tested the UUIDv7 generator for postgres as below.
>> With regards to performance , It's still way behind the sequence. [...]
>> explain analyze  select count(nextval('myseq') ) from
>> generate_series(1,10);
>> Execution Time: 59.687 ms
>>
>> explain analyze  select count(gen_random_uuid()) from
>> generate_series(1,100'000);
>> Execution Time: 904.868 ms
>>
>> explain analyze  select count(uuid_generate_v7()) from
>> generate_series(1,10);
>> Execution Time: 1711.187 ms
>>
>
> Something's off regarding Guid generations IMHO...
>
> You generate 100K Guids in ~1s. While we generate (in C++, Windows
> Release, using Boost) 16M of them in +/- the same time:
>
> Enabling Performance tests

>>>
>
>> generate 16'000'000 guids in 0.980s (user: 0.984s) 12 MB

>>> generate 16'000'000 guids in parallel on 4 CPUs in 0.309s (user: 1.188s)
 12 MB

>>>
> That's 2 orders of magnitude faster. Sure there's some overhead from the
> SQL, but still. Something seems fishy.
> And that's on a 2.5y old desktop. --DD
>

Re: Sequence vs UUID

2023-02-03 Thread Dominique Devienne

On Fri, Feb 3, 2023 at 5:48 PM veem v  wrote:

> Actually I did the testing by connecting to "https://dbfiddle.uk/";
> postgres version -15.
>
> PostgreSQL 15.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0
> 20210514 (Red Hat 8.5.0-10), 64-bit
>
> Am I doing it wrong, please confirm?
>
>
No clue. But even SQLite can generate 1M uuid (16 bytes random blobs are
equivalent) is 1/3 of the time, so 30x faster.
So your timings of generating 100K uuids and counting them seems way too
slow to me. --DD

sqlite> select count(randomblob(16)) from generate_series(1,1000*1000);
QUERY PLAN
`--SCAN generate_series VIRTUAL TABLE INDEX 3:
addr  opcode p1p2p3p4 p5  comment
  -        -  --  -
0 Init   0 1500   Start at 15
1 Null   0 1 10   r[1..1]=NULL
2 VOpen  0 0 0 vtab:274D3E0   0
3 Integer1 4 00   r[4]=1
4 Multiply   6 6 50   r[5]=r[6]*r[6]
5 Integer3 2 00   r[2]=3
6 Integer2 3 00   r[3]=2
7 VFilter0 1120   iplan=r[2]
zplan=''
8   Function   1 8 7 randomblob(1)  0   r[7]=func(r[8])
9   AggStep0 7 1 count(1)   1   accum=r[1]
step(r[7])
10VNext  0 8 00
11AggFinal   1 1 0 count(1)   0   accum=r[1] N=1
12Copy   1 9 00   r[9]=r[1]
13ResultRow  9 1 00   output=r[9]
14Halt   0 0 00
15Transaction0 0 1 0  1   usesStmtJournal=0
16Integer1000  6 00   r[6]=1000
17Integer168 00   r[8]=16
18Goto   0 1 00
┌───┐
│ count(randomblob(16)) │
├───┤
│ 100   │
└───┘
Run Time: real 0.278 user 0.25 sys 0.00


> On Fri, 3 Feb 2023 at 21:28, Dominique Devienne 
> wrote:
>
>> Something's off regarding Guid generations IMHO...
>> You generate 100K Guids in ~1s. While we generate (in C++, Windows
>> Release, using Boost) 16M of them in +/- the same time:
>>
>

Switching identity column to serial

2023-02-03 Thread Erik Wienhold

I was wondering if it's possible to drop a column identity (not the column
itself) while keeping the attached sequence.  This would avoid recreating
an identical sequence (especially with a correct start value and owner).

Changing the sequence owner to NONE before dropping identity is not allowed.
Also changing pg_class.relowner to some role did not help.  The sequence is
still dropped together with the column identity.

But I managed it by clearing pg_attribute.attidentity.  See the following
psql session:

test=# create table t (id int generated always as identity, x char);
CREATE TABLE

test=# insert into t (x) values ('a'), ('b') returning *;
 id | x
+---
  1 | a
  2 | b
(2 rows)

INSERT 0 2

test=# select pg_get_serial_sequence('t', 'id');
 pg_get_serial_sequence

 public.t_id_seq
(1 row)

test=# update pg_attribute set attidentity = '' where (attrelid, 
attname) = ('t'::regclass, 'id');
UPDATE 1

test=# alter table t alter id drop identity;
psql:1.sql:6: ERROR:  column "id" of relation "t" is not an identity 
column

test=# select pg_get_serial_sequence('t', 'id');
 pg_get_serial_sequence

 public.t_id_seq
(1 row)

test=# alter table t alter id set default nextval('t_id_seq');
ALTER TABLE

test=# insert into t (x) values ('c'), ('d') returning *;
 id | x
+---
  3 | c
  4 | d
(2 rows)

INSERT 0 2

test=# insert into t (id, x) values (-1, 'e') returning *;
 id | x
+---
 -1 | e
(1 row)

INSERT 0 1

test=# select * from t;
 id | x
+---
  1 | a
  2 | b
  3 | c
  4 | d
 -1 | e
(5 rows)

Is this sufficient or am I missing some detail and messing around with
pg_catalog is not enough (in addition to being risky)?

Some context:

I have to change identity columns to a form that resembles a definition as
serial.  Creating a new column and migrating the primary key constraint is
not an option.

Why is this change necessary?

My team is importing data with QGIS which fails to properly handle identity
columns.  QGIS uses INSERT with OVERRIDING SYSTEM VALUE but tries to insert
NULL although identity columns imply NOT NULL (also it's the primary key).
QGIS tries to generate an ID with nextval but does not use the qualified
sequence name although search_path does not contain the namespace.  It's
weird that QGIS thinks that it should generate the ID instead of delegating
this to the database, yet it uses RETURNING id.  Maybe it needs the ID in
advance for reference.  I don't know.

The "serial" style with nextval as column default works as expected.
Probably because QGIS just uses the column default expression which should
reference the correct sequence.  Oh, did I mention yet that QGIS generates
the ID before issuing an INSERT with RETURNING id?

I'll still open a bug ticket with QGIS but right now there's no other way
than ditching identity columns.

--
Erik

Re: Switching identity column to serial

2023-02-03 Thread Erik Wienhold

> On 04/02/2023 01:54 CET Erik Wienhold  wrote:
>
> I was wondering if it's possible to drop a column identity (not the column
> itself) while keeping the attached sequence.  This would avoid recreating
> an identical sequence (especially with a correct start value and owner).
>
> Changing the sequence owner to NONE before dropping identity is not allowed.
> Also changing pg_class.relowner to some role did not help.  The sequence is
> still dropped together with the column identity.
>
> But I managed it by clearing pg_attribute.attidentity.  See the following
> psql session:

Forgot to mention: tested on 12.13 and 15.1.

>
>   test=# create table t (id int generated always as identity, x char);
>   CREATE TABLE
>
>   test=# insert into t (x) values ('a'), ('b') returning *;
>id | x
>   +---
> 1 | a
> 2 | b
>   (2 rows)
>
>   INSERT 0 2
>
>   test=# select pg_get_serial_sequence('t', 'id');
>pg_get_serial_sequence
>   
>public.t_id_seq
>   (1 row)
>
>   test=# update pg_attribute set attidentity = '' where (attrelid, 
> attname) = ('t'::regclass, 'id');
>   UPDATE 1
>
>   test=# alter table t alter id drop identity;
>   psql:1.sql:6: ERROR:  column "id" of relation "t" is not an identity 
> column
>
>   test=# select pg_get_serial_sequence('t', 'id');
>pg_get_serial_sequence
>   
>public.t_id_seq
>   (1 row)
>
>   test=# alter table t alter id set default nextval('t_id_seq');
>   ALTER TABLE
>
>   test=# insert into t (x) values ('c'), ('d') returning *;
>id | x
>   +---
> 3 | c
> 4 | d
>   (2 rows)
>
>   INSERT 0 2
>
>   test=# insert into t (id, x) values (-1, 'e') returning *;
>id | x
>   +---
>-1 | e
>   (1 row)
>
>   INSERT 0 1
>
>   test=# select * from t;
>id | x
>   +---
> 1 | a
> 2 | b
> 3 | c
> 4 | d
>-1 | e
>   (5 rows)
>
> Is this sufficient or am I missing some detail and messing around with
> pg_catalog is not enough (in addition to being risky)?
>
> Some context:
>
> I have to change identity columns to a form that resembles a definition as
> serial.  Creating a new column and migrating the primary key constraint is
> not an option.
>
> Why is this change necessary?
>
> My team is importing data with QGIS which fails to properly handle identity
> columns.  QGIS uses INSERT with OVERRIDING SYSTEM VALUE but tries to insert
> NULL although identity columns imply NOT NULL (also it's the primary key).
> QGIS tries to generate an ID with nextval but does not use the qualified
> sequence name although search_path does not contain the namespace.  It's
> weird that QGIS thinks that it should generate the ID instead of delegating
> this to the database, yet it uses RETURNING id.  Maybe it needs the ID in
> advance for reference.  I don't know.
>
> The "serial" style with nextval as column default works as expected.
> Probably because QGIS just uses the column default expression which should
> reference the correct sequence.  Oh, did I mention yet that QGIS generates
> the ID before issuing an INSERT with RETURNING id?
>
> I'll still open a bug ticket with QGIS but right now there's no other way
> than ditching identity columns.
>
> --
> Erik

Re: Switching identity column to serial

2023-02-03 Thread Ron


On 2/3/23 18:54, Erik Wienhold wrote:

I was wondering if it's possible to drop a column identity (not the column
itself) while keeping the attached sequence.  This would avoid recreating
an identical sequence (especially with a correct start value and owner).


Why doesn't this work?
BEGIN;
DROP SEQUENCE t_id;
CREATE SEQUENCE new_t_id_seq AS INTEGER OWNED BY t.id;
ALTER SEQUENCE new_t_id_seq OWNER TO new_owner;
SELECT setval('new_t_id', (SELECT MAX(id) FROM t));
SELECT nextval('new_t_id');
COMMIT;


Changing the sequence owner to NONE before dropping identity is not allowed.
Also changing pg_class.relowner to some role did not help.  The sequence is
still dropped together with the column identity.


Manually diigging around the system catalog is /never/ recommended.

--
Born in Arizona, moved to Babylonia.

Re: Switching identity column to serial

2023-02-03 Thread Ron


On 2/3/23 22:41, Ron wrote:

On 2/3/23 18:54, Erik Wienhold wrote:

I was wondering if it's possible to drop a column identity (not the column
itself) while keeping the attached sequence.  This would avoid recreating
an identical sequence (especially with a correct start value and owner).


Why doesn't this work?
BEGIN;
DROP SEQUENCE t_id;
CREATE SEQUENCE new_t_id_seq AS INTEGER OWNED BY t.id;
ALTER SEQUENCE new_t_id_seq OWNER TO new_owner;
SELECT setval('new_t_id', (SELECT MAX(id) FROM t));
SELECT nextval('new_t_id');
COMMIT;


Note: this was not tested...

--
Born in Arizona, moved to Babylonia.

Re: Sequence vs UUID

Fwd: Sequence vs UUID

Re: Sequence vs UUID

Re: Sequence vs UUID

Switching identity column to serial

Re: Switching identity column to serial

Re: Switching identity column to serial

Re: Switching identity column to serial

8 matches

Site Navigation

Mail list logo

Footer information