Hi!
My PostgreSQL version is 11.8.
The query I am running refers to a number of foreign tables. The first
one (table1) has two IDs, let's say ID_A and ID_B. While ID_A is unique, ID_B is
not. In my case, I am pulling information for a value of ID_B for which about 800
rows (with unique ID_A) exist. I found:
While
select * from my_view where id_b='some value';
seemingly runs "forever" (I cancelled execution after a few hours), the
following completes in about 1 hr:
select * from my_view where id_a in (
select id_a from table1 where id_b='some value'
);
So, I tried smaller chunks of ID_A and found the execution time is non-linear
with respect to the number of IDs. For, e.g., 50 ID_A's, it completed in about
12 seconds.
Next, I split the ~800 ID_A's into chunks of 50 and submitted these 16
queries one after another. They each completed in about 12 seconds.
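A minimal sketch of that manual batching (table, view, and column names are the
thread's own; the literal IDs are placeholders for the real values):

```sql
-- Step 1: collect the ~800 ID_A values for the given ID_B once:
SELECT id_a FROM table1 WHERE id_b = 'some value';

-- Step 2: for each chunk of 50 IDs taken from that result, run:
SELECT *
FROM my_view
WHERE id_a IN ('id001', 'id002', /* ... */ 'id050');  -- 50 IDs per query
```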
I then found the option fetch_size, e.g. ALTER SERVER some_server OPTIONS
(fetch_size '50000'). A chunk of 50 now executes in 2 seconds (instead of 12
before).
So, I found the "size" of the query has a serious impact on the execution time.
I don't really understand why executing 16*50 IDs takes only 16*2 seconds, but
executing 1*800 takes about 3000 seconds...
Regards,
Ingolf
-----Original Message-----
From: Sebastian Dressler [mailto:[email protected]]
Sent: 30 January 2021 11:45
To: Markhof, Ingolf <[email protected]>
Cc: [email protected]
Subject: [E] Re: Foreign table performance issue / PostgreSQK vs. ORACLE
Hi Ingolf,
> On 29. Jan 2021, at 13:56, Markhof, Ingolf <[email protected]>
> wrote:
>
> Hi!
>
> I am struggling with the slow performance when running queries referring to
> foreign tables. – Yes, I know… - Please read the whole story!
Done and it rings a bell or two.
> The set-up basically is a production database and a reporting database. As
> names indicate, the production database is used for production, the reporting
> database is for analysis. On the reporting database, the only way to access
> product data is via foreign tables that link to the related production tables.
>
> Now, while some queries on the reporting service run fine, some don't even
> return any data after hours.
>
> However, the same set-up worked fine in Oracle before. Reporting wasn't
> always fast, but it delivered results in acceptable time. A query executed on
> the Oracle reporting server returns data in e.g. 30 seconds. But running the
> query translated to PostgreSQL on the PostgreSQL DB does not deliver a single
> row after hours (!) of run time.
>
> So, I wonder: Is there a fundamental difference between Oracle database links
> and foreign tables in PostgreSQL that could explain the different run times?
> Could there be some tuning option in PostgreSQL to make queries via foreign
> tables faster (e.g. I heard about option fetch_size)?
You did not explicitly mention it, but I assume you are using postgres_fdw to
connect from reporting (R) to production (P). Thomas and Tom already mentioned
incomplete/non-existing/non-applicable filter pushdowns. I want to add another
probable root cause to the list explaining the behavior you experience.
The postgres_fdw uses a CURSOR on P to execute the query. While this guarantees
transaction safety, it also prohibits parallelism (PostgreSQL server-side
cursors enforce a sequential plan).
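Concretely, postgres_fdw issues something like the following on P (simplified;
the cursor name, column list, and transaction details are illustrative). Because
the query runs through a cursor, P's planner will not use parallel workers for
it:

```sql
-- Simplified version of what postgres_fdw runs on the remote server:
START TRANSACTION;
DECLARE c1 CURSOR FOR
    SELECT id_a, id_b FROM table1 WHERE id_b = 'some value';
FETCH 50000 FROM c1;   -- repeated; fetch_size rows per round trip
CLOSE c1;
COMMIT;
```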
As a result, depending on the size of tables, indexes, and filters pushed down
(or not), this probably results in slow-running queries. IMO, the worst-case
scenario is a sequential table scan without any filtering, run by a single
worker on the target.
Of course, you can try to optimize schemas on P and queries on R, enabling more
filter pushdown and eventually faster execution. However, I believe this will
not cover your entire workload, i.e. there will always be performance gaps.
The parallelism issue is theoretically fixable by utilizing partitions on P. R
then connects to P with multiple postgres_fdw-backed child tables. However,
this will only work with a patch to postgres_fdw to implement
"IsForeignScanParallelSafe" (see [1] for a possible implementation). Without
this method, there is no parallelism: the partition scans still occur
sequentially, showing no performance gain.
I want to mention there are proprietary options available (re-)enabling
PostgreSQL parallelism with cursors. Such an extension can potentially fix your
performance issue. However, I have not tried it so far with a setup similar to
yours.
Cheers,
Sebastian
[1]: https://github.com/swarm64/parallel-postgres-fdw-patch
--
Sebastian Dressler, Solution Architect, Swarm64 AS
+49 30 994 0496 72 | [email protected]
Verizon Deutschland GmbH - Sebrathweg 20, 44149 Dortmund, Germany - Amtsgericht
Dortmund, HRB 14952 - Geschäftsführer: Detlef Eppig - Vorsitzender des
Aufsichtsrats: Francesco de Maio