RE: How to solve my slow disk i/o throughput during index scan

2024-07-11 Thread FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION GENERALE TGV / DM RMP YIELD MANAGEMENT)

Hello, and thank you again for your example!

Sorry for my late reply; I was working on a patch for our queries. I still 
don't completely understand what is happening, though. Here is the plan of a 
query where I split the calls with OR as you suggested, which seems to have 
enabled parallel scans.

https://explain.dalibo.com/plan/gfa1cf9fffd01bcg#plan/node/1

But I still wonder: why was my query that slow? My current understanding of 
what happened is:


  *   When PostgreSQL does an Index Scan, it runs a loop (which is not 
parallel): it asks for a chunk of data, processes it, and waits until that 
chunk is processed before asking for the next one, instead of loading the 
whole index into RAM (which, I suppose, would be much faster, but also not 
feasible if the index is too big and the RAM too small, so PostgreSQL does 
not do it). Hence the 2 MB/s.
  *   When it does a Bitmap Index Scan, it can parallelize disk interactions, 
and does not use the processor to discard rows, hence a much faster index 
load and processing.
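One way to check this on a given query (a sketch; the table, index, and 
predicate below are hypothetical placeholders) is to force each scan type in 
turn and compare the timings reported by EXPLAIN:

```sql
-- Plain Index Scan (bitmap scans disabled for this session):
SET enable_bitmapscan = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM my_table WHERE id = ANY ('{1,2,3}');

-- Bitmap Index/Heap Scan (index scans disabled instead):
RESET enable_bitmapscan;
SET enable_indexscan = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM my_table WHERE id = ANY ('{1,2,3}');

-- Bitmap heap scans can prefetch heap pages; this setting bounds how many
-- concurrent read requests PostgreSQL issues:
SHOW effective_io_concurrency;
```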

I might be completely wrong, and would really like to understand the details, 
in order to explain them to my team, and to others who might encounter the 
same problem.

Again, thank you very much for your help; we were really struggling with those 
slow queries!

Simon FREYBURGER




From: Andrei Lepikhov 
Sent: Friday, 5 July 2024 04:05
To: FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION GENERALE TGV / DM RMP YIELD 
MANAGEMENT) ; [email protected]; 
Peter Geoghegan 
Subject: Re: How to solve my slow disk i/o throughput during index scan


On 7/4/24 22:23, FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION GENERALE
TGV / DM RMP YIELD MANAGEMENT) wrote:
> Hello,
>
> Thank you, splitting in “OR” query definitely enables bitmap heap scans,
> and thus parallelized read to disk 😃! I though did not understand your
> second point, what is parallel append, and how to enable it ?

Just for example:



DROP TABLE IF EXISTS t CASCADE;
CREATE TABLE t (id int not null, payload text) PARTITION BY RANGE (id);
CREATE TABLE p1 PARTITION OF t FOR VALUES FROM (0) TO (1000);
CREATE TABLE p2 PARTITION OF t FOR VALUES FROM (1000) TO (2000);
CREATE TABLE p3 PARTITION OF t FOR VALUES FROM (2000) TO (3000);
CREATE TABLE p4 PARTITION OF t FOR VALUES FROM (3000) TO (4000);
INSERT INTO t SELECT x % 4000, repeat('a',128) || x
  FROM generate_series(1,1E5) AS x;
ANALYZE t;

SET enable_parallel_append = on;
SET parallel_setup_cost = 0.1;
SET parallel_tuple_cost = 0.1;
SET max_parallel_workers_per_gather = 8;
SET min_parallel_table_scan_size = 0;
SET min_parallel_index_scan_size = 0;

EXPLAIN (COSTS OFF)
SELECT t.id, t.payload FROM t WHERE t.id % 2 = 0
GROUP BY t.id, t.payload;

 Group
   Group Key: t.id, t.payload
   ->  Gather Merge
         Workers Planned: 6
         ->  Sort
               Sort Key: t.id, t.payload
               ->  Parallel Append
                     ->  Parallel Seq Scan on p1 t_1
                           Filter: ((id % 2) = 0)
                     ->  Parallel Seq Scan on p2 t_2
                           Filter: ((id % 2) = 0)
                     ->  Parallel Seq Scan on p3 t_3
                           Filter: ((id % 2) = 0)
                     ->  Parallel Seq Scan on p4 t_4
                           Filter: ((id % 2) = 0)

Here the table is scanned in parallel. It also works with IndexScan.



--

regards, Andrei Lepikhov


---
This message and any attachments are intended solely for the addressees and are 
confidential. SNCF may not be held responsible for their contents whose 
accuracy and completeness cannot be guaranteed over the Internet. Unauthorized 
use, disclosure, distribution, copying, or any part thereof is strictly 
prohibited. If you are not the intended recipient of this message, please 
notify the sender immediately and delete it.


RE: How to solve my slow disk i/o throughput during index scan

2024-07-11 Thread FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION GENERALE TGV / DM RMP YIELD MANAGEMENT)
Also, it might not be related, but I see suspiciously similar slow reads when 
I am inserting into the database; could it be related?

See e.g. : https://explain.dalibo.com/plan/43d37de5870e1651

The table I am inserting into looks like:

-- public."F_TDOJ_SC_HIST_2" definition

-- Drop table

-- DROP TABLE public."F_TDOJ_SC_HIST_2";

CREATE TABLE public."F_TDOJ_SC_HIST_2" (
   "ID_TRAIN" int4 NOT NULL,
   "ID_JOUR" int4 NOT NULL,
   "ID_OD" int4 NOT NULL,
   "JX" int4 NOT NULL,
   "RES" int4 NULL,
   "REV" float4 NULL,
   "RES_SC1" int4 NULL,
   "RES_SC2" int4 NULL,
   "RES_SC3" int4 NULL,
   "RES_SC4" int4 NULL,
   "RES_SC5" int4 NULL,
   "RES_SC6" int4 NULL,
   "RES_SC7" int4 NULL,
   "RES_SC8" int4 NULL,
   "RES_SC9" int4 NULL,
   "RES_SC10" int4 NULL,
   "RES_SC11" int4 NULL,
   "RES_SC12" int4 NULL,
   "RES_SC13" int4 NULL,
   "RES_SC14" int4 NULL,
   "RES_SC15" int4 NULL,
   "RES_SC16" int4 NULL,
   "RES_SC17" int4 NULL,
   "RES_SC18" int4 NULL,
   "AUT_SC1" int4 NULL,
   "AUT_SC2" int4 NULL,
   "AUT_SC3" int4 NULL,
   "AUT_SC4" int4 NULL,
   "AUT_SC5" int4 NULL,
   "AUT_SC6" int4 NULL,
   "AUT_SC7" int4 NULL,
   "AUT_SC8" int4 NULL,
   "AUT_SC9" int4 NULL,
   "AUT_SC10" int4 NULL,
   "AUT_SC11" int4 NULL,
   "AUT_SC12" int4 NULL,
   "AUT_SC13" int4 NULL,
   "AUT_SC14" int4 NULL,
   "AUT_SC15" int4 NULL,
   "AUT_SC16" int4 NULL,
   "AUT_SC17" int4 NULL,
   "AUT_SC18" int4 NULL,
   "DSP_SC1" int4 NULL,
   "DSP_SC2" int4 NULL,
   "DSP_SC3" int4 NULL,
   "DSP_SC4" int4 NULL,
   "DSP_SC5" int4 NULL,
   "DSP_SC6" int4 NULL,
   "DSP_SC7" int4 NULL,
   "DSP_SC8" int4 NULL,
   "DSP_SC9" int4 NULL,
   "DSP_SC10" int4 NULL,
   "DSP_SC11" int4 NULL,
   "DSP_SC12" int4 NULL,
   "DSP_SC13" int4 NULL,
   "DSP_SC14" int4 NULL,
   "DSP_SC15" int4 NULL,
   "DSP_SC16" int4 NULL,
   "DSP_SC17" int4 NULL,
   "DSP_SC18" int4 NULL,
   "REV_SC1" float4 NULL,
   "REV_SC2" float4 NULL,
   "REV_SC3" float4 NULL,
   "REV_SC4" float4 NULL,
   "REV_SC5" float4 NULL,
   "REV_SC6" float4 NULL,
   "REV_SC7" float4 NULL,
   "REV_SC8" float4 NULL,
   "REV_SC9" float4 NULL,
   "REV_SC10" float4 NULL,
   "REV_SC11" float4 NULL,
   "REV_SC12" float4 NULL,
   "REV_SC13" float4 NULL,
   "REV_SC14" float4 NULL,
   "REV_SC15" float4 NULL,
   "REV_SC16" float4 NULL,
   "REV_SC17" float4 NULL,
   "REV_SC18" float4 NULL,
   CONSTRAINT "F_TDOJ_SC_HIST_2_pkey" PRIMARY KEY 
("ID_TRAIN","ID_JOUR","ID_OD","JX")
)
PARTITION BY RANGE ("ID_JOUR");


-- public."F_TDOJ_SC_HIST_2" foreign keys

ALTER TABLE public."F_TDOJ_SC_HIST_2" ADD CONSTRAINT 
"F_TDOJ_SC_HIST_2_ID_JOUR_fkey" FOREIGN KEY ("ID_JOUR") REFERENCES 
public."D_JOUR"("ID_JOUR");
ALTER TABLE public."F_TDOJ_SC_HIST_2" ADD CONSTRAINT 
"F_TDOJ_SC_HIST_2_ID_OD_fkey" FOREIGN KEY ("ID_OD") REFERENCES 
public."D_OD"("ID_OD");
ALTER TABLE public."F_TDOJ_SC_HIST_2" ADD CONSTRAINT 
"F_TDOJ_SC_HIST_2_ID_TRAIN_fkey" FOREIGN KEY ("ID_TRAIN") REFERENCES 
public."D_TRAIN"("ID_TRAIN");
ALTER TABLE public."F_TDOJ_SC_HIST_2" ADD CONSTRAINT "F_TDOJ_SC_HIST_2_JX_fkey" 
FOREIGN KEY ("JX") REFERENCES public."D_JX"("JX");

I’m using a three-step process to insert my rows into the table:

  *   COPY into a temporary table
  *   DELETE FROM on the perimeter I will be inserting into
  *   INSERT … INTO mytable SELECT … FROM temporarytable ON CONFLICT DO NOTHING

Is it possible to parallelize the scans during the modify step?
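In SQL, the three steps above look roughly like this (a sketch; the staging 
table name, the file path, and the choice of "ID_JOUR" as the reloaded 
perimeter are assumptions):

```sql
-- 1. COPY into a temporary table shaped like the target:
CREATE TEMP TABLE staging (LIKE public."F_TDOJ_SC_HIST_2" INCLUDING DEFAULTS);
COPY staging FROM '/path/to/file.csv' WITH (FORMAT csv);

-- 2. DELETE the perimeter that is about to be reloaded:
DELETE FROM public."F_TDOJ_SC_HIST_2" t
USING (SELECT DISTINCT "ID_JOUR" FROM staging) s
WHERE t."ID_JOUR" = s."ID_JOUR";

-- 3. INSERT from the temporary table, ignoring any remaining duplicates:
INSERT INTO public."F_TDOJ_SC_HIST_2"
SELECT * FROM staging
ON CONFLICT DO NOTHING;
```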

Regards

Simon FREYBURGER




From: FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION GENERALE TGV / DM RMP YIELD 
MANAGEMENT) 
Sent: Thursday, 11 July 2024 16:59
To: Andrei Lepikhov ; 
[email protected]; Peter Geoghegan 
Subject: RE: How to solve my slow disk i/o throughput during index scan



Re: How to solve my slow disk i/o throughput during index scan

2024-07-11 Thread David G. Johnston
On Thursday, July 11, 2024, FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION
GENERALE TGV / DM RMP YIELD MANAGEMENT)  wrote:

> Also, It might not be related, but I have suspiciously similar slow reads
> when I am inserting in database, could it be related ?
> I’m using a 3 steps process to insert my lines in the table :
>
>- COPY into a temporary table
>- DELETE FROM on the perimeter I will be inserting into
>- INSERT … INTO mytable SELECT … FROM temporarytable ON CONFLICT DO
>NOTHING
>
>
>
> Is it possible to parallelize the scans during the modify step ?
>
>
>
This tells you when parallelism is used:


https://www.postgresql.org/docs/current/when-can-parallel-query-be-used.html

David J.


Re: How to solve my slow disk i/o throughput during index scan

2024-07-11 Thread Andrei Lepikhov
On 11/7/2024 22:09, FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION 
GENERALE TGV / DM RMP YIELD MANAGEMENT) wrote:
> Is it possible to parallelize the scans during the modify step ?
Temporary tables can't be used inside a query with parallel workers 
involved, because such a table is local to a single process.

As for your question - I'm not sure without the whole data set. But you 
can get the maximum speedup by disabling as many constraints as possible 
- ideally, fill each partition individually, with no constraints or 
indexes at all, before uniting them into one partitioned table.
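For example (a sketch against the table shown earlier; the batch table name 
and the 100-200 "ID_JOUR" range are placeholders): load a standalone table 
with no constraints or indexes, add them afterwards, then attach it as a 
partition.

```sql
-- Load into a plain table first (no PK, no FKs, no indexes):
CREATE TABLE hist_batch (LIKE public."F_TDOJ_SC_HIST_2");
COPY hist_batch FROM '/path/to/batch.csv' WITH (FORMAT csv);

-- Build constraints/indexes once, after the bulk load:
ALTER TABLE hist_batch
  ADD PRIMARY KEY ("ID_TRAIN","ID_JOUR","ID_OD","JX");

-- A CHECK constraint matching the partition bounds lets PostgreSQL
-- skip the full validation scan during ATTACH:
ALTER TABLE hist_batch ADD CONSTRAINT hist_batch_range
  CHECK ("ID_JOUR" >= 100 AND "ID_JOUR" < 200);
ALTER TABLE public."F_TDOJ_SC_HIST_2"
  ATTACH PARTITION hist_batch FOR VALUES FROM (100) TO (200);
```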


--
regards, Andrei Lepikhov





Re: How to solve my slow disk i/o throughput during index scan

2024-07-11 Thread Andrei Lepikhov
On 11/7/2024 21:59, FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION 
GENERALE TGV / DM RMP YIELD MANAGEMENT) wrote:
> Hello, and thank you again for your example!
>
> Sorry for my late reply; I was working on a patch for our queries. I
> still don't completely understand what is happening, though. Here is
> the plan of a query where I split the calls with OR as you suggested,
> which seems to have enabled parallel scans.

Thanks for the feedback!
Generally, I don't understand why you needed to transform ANY -> OR at 
all to get a BitmapScan. Can you just disable IndexScan (and possibly 
SeqScan) to see whether this is a hard transformation limit or a 
mistake in cost estimation?
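That check can be sketched like this (the SELECT is a placeholder for the 
original ANY() query):

```sql
SET enable_indexscan = off;  -- planner avoids plain Index Scans
SET enable_seqscan = off;    -- optionally, also sequential scans
EXPLAIN (ANALYZE, BUFFERS)
SELECT ...;                  -- the original ANY() query goes here
RESET enable_indexscan;
RESET enable_seqscan;
```

If the planner still refuses a BitmapScan with both disabled, it is a 
transformation limit rather than a cost-estimation choice.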


--
regards, Andrei Lepikhov





inequality predicate not pushed down in JOIN?

2024-07-11 Thread Paul George
Hey!

[version: PostgreSQL 16.3]

In the example below, I noticed that the JOIN predicate "t1.a<1" is not
pushed down to the scan over "t2", though it superficially seems like it
should be.

create table t as (select 1 a);
analyze t;
explain (costs off) select * from t t1 join t t2 on t1.a=t2.a and t1.a<1;
  QUERY PLAN
---
 Hash Join
   Hash Cond: (t2.a = t1.a)
   ->  Seq Scan on t t2
   ->  Hash
 ->  Seq Scan on t t1
   Filter: (a < 1)
(6 rows)

The same is true for the predicate "t1.a in (0, 1)". For comparison, the
predicate "t1.a=1" does get pushed down to both scans.

explain (costs off) select * from t t1 join t t2 on t1.a=t2.a and t1.a=1;
   QUERY PLAN
-
 Nested Loop
   ->  Seq Scan on t t1
 Filter: (a = 1)
   ->  Seq Scan on t t2
 Filter: (a = 1)
(5 rows)


-Paul-


Re: inequality predicate not pushed down in JOIN?

2024-07-11 Thread Andrei Lepikhov

On 12/7/2024 06:31, Paul George wrote:
> In the example below, I noticed that the JOIN predicate "t1.a<1" is not
> pushed down to the scan over "t2", though it superficially seems like it
> should be.

This was already discussed at least a couple of years ago; see [1]. 
Summarising: it is more complicated than the equivalence case, and more 
probably wastes CPU cycles than it helps.




create table t as (select 1 a);
analyze t;
explain (costs off) select * from t t1 join t t2 on t1.a=t2.a and t1.a<1;
           QUERY PLAN
---
  Hash Join
    Hash Cond: (t2.a = t1.a)
    ->  Seq Scan on t t2
    ->  Hash
          ->  Seq Scan on t t1
                Filter: (a < 1)
(6 rows)

The same is true for the predicate "t1.a in (0, 1)". For comparison, the 
predicate "t1.a=1" does get pushed down to both scans.


explain (costs off) select * from t t1 join t t2 on t1.a=t2.a and t1.a=1;
        QUERY PLAN
-
  Nested Loop
    ->  Seq Scan on t t1
          Filter: (a = 1)
    ->  Seq Scan on t t2
          Filter: (a = 1)
(5 rows)


[1] Condition pushdown: why (=) is pushed down into join, but BETWEEN or 
>= is not?

https://www.postgresql.org/message-id/flat/CAFQUnFhqkWuPCwQ1NmHYrisHJhYx4DoJak-dV%2BFcjyY6scooYA%40mail.gmail.com

--
regards, Andrei Lepikhov





Re: inequality predicate not pushed down in JOIN?

2024-07-11 Thread Paul George
Cool! Thanks for the speedy reply, link, and summary! I'm not sure how I
missed this, but apologies for the noise.

-Paul-

On Thu, Jul 11, 2024 at 4:49 PM Andrei Lepikhov  wrote:

> On 12/7/2024 06:31, Paul George wrote:
> > In the example below, I noticed that the JOIN predicate "t1.a<1" is not
> > pushed down to the scan over "t2", though it superficially seems like it
> > should be.
> This was already discussed at least a couple of years ago; see [1].
> Summarising: it is more complicated than the equivalence case, and more
> probably wastes CPU cycles than it helps.
>
> >
> > create table t as (select 1 a);
> > analyze t;
> > explain (costs off) select * from t t1 join t t2 on t1.a=t2.a and t1.a<1;
> >QUERY PLAN
> > ---
> >   Hash Join
> > Hash Cond: (t2.a = t1.a)
> > ->  Seq Scan on t t2
> > ->  Hash
> >   ->  Seq Scan on t t1
> > Filter: (a < 1)
> > (6 rows)
> >
> > The same is true for the predicate "t1.a in (0, 1)". For comparison, the
> > predicate "t1.a=1" does get pushed down to both scans.
> >
> > explain (costs off) select * from t t1 join t t2 on t1.a=t2.a and t1.a=1;
> > QUERY PLAN
> > -
> >   Nested Loop
> > ->  Seq Scan on t t1
> >   Filter: (a = 1)
> > ->  Seq Scan on t t2
> >   Filter: (a = 1)
> > (5 rows)
>
> [1] Condition pushdown: why (=) is pushed down into join, but BETWEEN or
>  >= is not?
>
> https://www.postgresql.org/message-id/flat/CAFQUnFhqkWuPCwQ1NmHYrisHJhYx4DoJak-dV%2BFcjyY6scooYA%40mail.gmail.com
>
> --
> regards, Andrei Lepikhov
>
>


Re: inequality predicate not pushed down in JOIN?

2024-07-11 Thread Jerry Brenner
While applying transitivity to non-equality conditions is less frequently
beneficial than applying it to equality conditions, it can be very helpful,
especially with third-party apps and dynamically changing data. One possible
implementation that avoids the mentioned overhead would be to mark the
internally generated predicate(s) as potentially redundant, discard them on
the inner table of the join after planning, and enhance the optimizer to
recognize redundant predicates and adjust accordingly when costing.

Jerry

On Thu, Jul 11, 2024 at 5:16 PM Paul George  wrote:

> Cool! Thanks for the speedy reply, link, and summary! I'm not sure how I
> missed this, but apologies for the noise.
>
> -Paul-
>
> On Thu, Jul 11, 2024 at 4:49 PM Andrei Lepikhov  wrote:
>
>> On 12/7/2024 06:31, Paul George wrote:
>> > In the example below, I noticed that the JOIN predicate "t1.a<1" is not
>> > pushed down to the scan over "t2", though it superficially seems like
>> it
>> > should be.
>> This was already discussed at least a couple of years ago; see [1].
>> Summarising: it is more complicated than the equivalence case, and more
>> probably wastes CPU cycles than it helps.
>>
>> >
>> > create table t as (select 1 a);
>> > analyze t;
>> > explain (costs off) select * from t t1 join t t2 on t1.a=t2.a and
>> t1.a<1;
>> >QUERY PLAN
>> > ---
>> >   Hash Join
>> > Hash Cond: (t2.a = t1.a)
>> > ->  Seq Scan on t t2
>> > ->  Hash
>> >   ->  Seq Scan on t t1
>> > Filter: (a < 1)
>> > (6 rows)
>> >
>> > The same is true for the predicate "t1.a in (0, 1)". For comparison,
>> the
>> > predicate "t1.a=1" does get pushed down to both scans.
>> >
>> > explain (costs off) select * from t t1 join t t2 on t1.a=t2.a and
>> t1.a=1;
>> > QUERY PLAN
>> > -
>> >   Nested Loop
>> > ->  Seq Scan on t t1
>> >   Filter: (a = 1)
>> > ->  Seq Scan on t t2
>> >   Filter: (a = 1)
>> > (5 rows)
>>
>> [1] Condition pushdown: why (=) is pushed down into join, but BETWEEN or
>>  >= is not?
>>
>> https://www.postgresql.org/message-id/flat/CAFQUnFhqkWuPCwQ1NmHYrisHJhYx4DoJak-dV%2BFcjyY6scooYA%40mail.gmail.com
>>
>> --
>> regards, Andrei Lepikhov
>>
>>