Re: PostgreSQLv14 TPC-H performance GCC vs Clang

2022-01-18 Thread arjun shetty
Hi All,

I checked with LLVM/CLang 14.0 on arch x86-64-O3 in the Mac/AMD EPYC
environment , but I see  GCC performs better than Clang14.
Clang14-https://github.com/llvm/llvm-project(main branch and pull or
commitID:3f3fe4a5cfa1797..)
[image: image.png]
pre analysis GCC vs Clang
 (1) GCC more inlined functionality compared to Clang in PostgreSQL
 (2) in few functions  GCC are not inlined but Clang consider inline
   postgresqlv14/src/include/utlis/float.h: float8_mul(),float8_div
(arithmetic functions).v
  postgresqlv14/src/backend/adt/geo_ops.c : point_xxx().
(3) GCC performs better than clang on datatype Int128(need to cross check
on instruction level/assembly code on Hardware).
(4) as point(2) without inline(remove inline in source code ) on those
functions in file's float.h and geo_ops.c and observed performance
improvement 6% compared to  within inline in Clang.

regards,
Arjun


On Fri, Dec 10, 2021 at 11:51 PM Imre Samu  wrote:

> > GCC vs Clang
>
> related:
> As I see - with LLVM/Clang 14.0 ( X86_64 -O3 )   ~12% performance increase
> expected with the new optimisation ( probably adapted from gcc  )
> - https://twitter.com/djtodoro/status/1466808507240386560
> -
> https://www.phoronix.com/scan.php?page=news_item&px=LLVM-Clang-14-Hoist-Load
>
> regards,
>  Imre
>
>
>
> arjun shetty  ezt írta (időpont: 2021. nov.
> 16., K, 11:10):
>
>> Yes, currently focusing affects queries as well.
>> In meanwhile on analysis(hardware level) and sample examples noticed
>> 1. GCC performance  better than Clang on int128 .
>> 2. Clang performance better than GCC on long long
>>  the reference example
>> https://stackoverflow.com/questions/63029428/why-is-int128-t-faster-than-long-long-on-x86-64-gcc
>>
>> 3.GCC enabled with “ fexcess-precision=standard” (precision cast for
>> floating point ).
>>
>> Is these 3 points can make performance  difference GCC vs Clang in
>> PostgreSQLv14 in Apple/AMD/()environment(intel environment need to check).
>> In these environment int128 enabled wrt PostgreSQLv14.
>>
>> On Friday, November 5, 2021, Tomas Vondra 
>> wrote:
>>
>>> Hi,
>>>
>>> IMO this thread provides so little information it's almost impossible to
>>> answer the question. There's almost no information about the hardware,
>>> scale of the test, configuration of the Postgres instance, the exact build
>>> flags, differences in generated asm code, etc.
>>>
>>> I find it hard to believe merely switching from clang to gcc yields 22%
>>> speedup - that's way higher than any differences we've seen in the past.
>>>
>>> In my experience, the speedup is unlikely to be "across the board".
>>> There will be a handful of affected queries, while most remaining queries
>>> will be about the same. In that case you need to focus on those queries,
>>> see if the plans are the same, do some profiling, etc.
>>>
>>>
>>> regards
>>>
>>> --
>>> Tomas Vondra
>>> EnterpriseDB: http://www.enterprisedb.com
>>> The Enterprise PostgreSQL Company
>>>
>>


Unique constraint blues

2022-01-18 Thread Mladen Gogala
Postgres version is 13.5, platform is Oracle Linux 8.5, x86_64. Here is 
the problem:


mgogala=# create table test1(col1 integer,col2 varchar(10));
CREATE TABLE
mgogala=# alter table test1 add constraint test1_uq unique(col1,col2);
ALTER TABLE
mgogala=# insert into test1 values(1,null);
INSERT 0 1
mgogala=# insert into test1 values(1,null);
INSERT 0 1
mgogala=# select * from test1;
 col1 | col2
--+--
    1 |
    1 |
(2 rows)

So, my unique constraint doesn't work if one of the columns is null. 
Bruce Momjian to the rescue: 
https://blog.toadworld.com/2017/07/12/allowing-only-one-null


Let's see what happens:

mgogala=# truncate table test1;
TRUNCATE TABLE
mgogala=# alter table test1 drop constraint test1_uq;
ALTER TABLE
mgogala=# create unique index test1_uq on test1(col1,(col2 is null)) 
where col2 is null;

CREATE INDEX
mgogala=# insert into test1 values(1,null);
INSERT 0 1
mgogala=# insert into test1 values(1,null);
ERROR:  duplicate key value violates unique constraint "test1_uq"
DETAIL:  Key (col1, (col2 IS NULL))=(1, t) already exists.


So, this allows only a single NULL value, just what I wanted. However, 
there is a minor issue: this doesn't work for the general case:


mgogala=# insert into test1 values(1,'test1');
INSERT 0 1
mgogala=# insert into test1 values(1,'test1');
INSERT 0 1
mgogala=# select * from test1;
 col1 | col2
--+---
    1 |
    1 | test1
    1 | test1
(3 rows)

I can insert the same row twice, which defeats the purpose. So, let's 
make the 3d modification:


mgogala=# truncate table test1;
TRUNCATE TABLE
mgogala=# drop index test1_uq;
DROP INDEX
mgogala=# create unique index test1_uq on test1(col1,coalesce(col2,'*** 
EMPTY ***'));


Using "coalesce" enforces the constraint just the way I need:

mgogala=# insert into test1 values(1,null);
INSERT 0 1
mgogala=# insert into test1 values(1,null);
ERROR:  duplicate key value violates unique constraint "test1_uq"
DETAIL:  Key (col1, COALESCE(col2, '*** EMPTY ***'::character 
varying))=(1, *** EMPTY ***) already exists.

mgogala=# insert into test1 values(1,'test1');
INSERT 0 1
mgogala=# insert into test1 values(1,'test1');
ERROR:  duplicate key value violates unique constraint "test1_uq"
DETAIL:  Key (col1, COALESCE(col2, '*** EMPTY ***'::character 
varying))=(1, test1) already exists.

mgogala=#

Now comes the greatest mystery of them all:

explain (analyze,verbose) select * from test1 where col1=1 and col2='test1';
   QUERY PLAN


-
 Bitmap Heap Scan on mgogala.test1  (cost=1.70..7.52 rows=1 width=42) 
(actual ti

me=0.023..0.024 rows=1 loops=1)
   Output: col1, col2
   Recheck Cond: (test1.col1 = 1)
   Filter: ((test1.col2)::text = 'test1'::text)
   Rows Removed by Filter: 1
   Heap Blocks: exact=1
   ->  Bitmap Index Scan on test1_uq  (cost=0.00..1.70 rows=6 width=0) 
(actual t

ime=0.015..0.016 rows=2 loops=1)
 Index Cond: (test1.col1 = 1)
 Planning Time: 1.184 ms
 Execution Time: 0.407 ms
(10 rows)

How come that the index is used for search without the "coalesce" 
function? The unique index is a function based index and, in theory, it 
shouldn't be usable for searches without the function. I don't 
understand why is this working. I am porting application from Oracle to 
Postgres and Oracle behaves like this:


SQLcl: Release 21.3 Production on Tue Jan 18 11:39:43 2022

Copyright (c) 1982, 2022, Oracle.  All rights reserved.

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.13.0.0.0

Elapsed: 00:00:00.001
SQL> create table test1(col1 integer,col2 varchar2(10));

Table TEST1 created.

Elapsed: 00:00:00.050

SQL> alter table test1 add constraint test1_uq unique(col1,col2);

Table TEST1 altered.

Elapsed: 00:00:00.139
SQL> insert into test1 values(1,null);

1 row inserted.

Elapsed: 00:00:00.026
SQL> insert into test1 values(1,null);

Error starting at line : 1 in command -
insert into test1 values(1,null)
Error report -
ORA-1: unique constraint (SCOTT.TEST1_UQ) violated

Elapsed: 00:00:00.033

Oracle is rejecting the same row twice, regardless of whether it 
contains NULL values or not. As in  Postgres, the resulting index can be 
used for searches. However, Oracle index is not a function-based index 
because it doesn't contain the coalesce function.



--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com





Re: Unique constraint blues

2022-01-18 Thread David G. Johnston
On Tue, Jan 18, 2022 at 10:13 AM Mladen Gogala 
wrote:

>
> mgogala=# create unique index test1_uq on test1(col1,coalesce(col2,'***
> EMPTY ***'));
>
> ->  Bitmap Index Scan on test1_uq  (cost=0.00..1.70 rows=6 width=0)

 ...
>Index Cond: (test1.col1 = 1)
>
> How come that the index is used for search without the "coalesce"
> function?


Only the second column is an expression.  The first (leading) column is
perfectly usable all by itself.  It is less efficient, hence the parent
node's:

Recheck Cond: (test1.col1 = 1)
Filter: ((test1.col2)::text = 'test1'::text)

but usable.

If you are willing to create partial unique indexes you probably should
just create two of them.  One where col2 is null and one where it isn't.

If the coalesce version is acceptable you should consider declaring the
column not null and put the sentinel value directly into the record.

David J.