Oracle to postgres migration

2019-04-08 Thread Daulat Ram
Hi team,

Please confirm: can we migrate an Oracle 12c database (12.1.0.1.0) running on 
Solaris to PostgreSQL 11.2 on Linux (Ubuntu)? Also, please suggest the tools 
and prerequisites.

Regards,
Daulat




Re: Oracle to postgres migration

2019-04-08 Thread Julien Rouhaud
On Mon, Apr 8, 2019 at 1:49 PM Daulat Ram  wrote:
>
> Please confirm ! Can we migrate Oracle 12c database (12.1.0.1.0) running on 
> Solaris to PostgreSQL 11.2 on  Linux (Ubuntu). Also, please suggest the tools 
> and pre-requisites.
A database migration is likely feasible, but might require quite a lot
of work depending on what features you're using, and the amount of PL
code.  Also, obviously migrating the database is only a part of the
overall migration process, as you'll also need to take care of the
application(s), the backup/restore, monitoring and all other tools you
need.

Concerning the database migration, the best tool is probably Gilles
Darold's ora2pg.  The tool also provides a migration cost assessment
report, to evaluate the difficulty of the database migration.  More
information on http://ora2pg.darold.net/




Re: Oracle to postgres migration

2019-04-08 Thread Rick Otten
On Mon, Apr 8, 2019 at 8:04 AM Julien Rouhaud  wrote:

> On Mon, Apr 8, 2019 at 1:49 PM Daulat Ram 
> wrote:
> >
> > Please confirm ! Can we migrate Oracle 12c database (12.1.0.1.0) running
> on Solaris to PostgreSQL 11.2 on  Linux (Ubuntu). Also, please suggest the
> tools and pre-requisites.
> A database migration is likely feasible, but might require quite a lot
> of work depending on what features you're using, and the amount of PL
> code.  Also, obviously migrating the database is only a part of the
> overall migration process, as you'll also need to take care of the
> application(s), the backup/restore, monitoring and all other tools you
> need.
>
> Concerning the database migration, the best tool is probably Gilles
> Darold's ora2pg.  The tool also provides a migration cost assessment
> report, to evaluate the difficulty of the database migration.  More
> information on http://ora2pg.darold.net/
>
>
>
The last big Oracle to PG migration that I did was several years ago.  We
stood up the PostgreSQL instance(s) and then used SymmetricDS to
synchronize the Oracle and PG databases.  After tuning and testing the
PostgreSQL side, we cut over the applications live - with minimal downtime
- by releasing the updated application code and configuration.  If we
needed to fail back, it was also pretty easy to undo the release and
configuration changes.

Another approach you can play with is to leverage Foreign Data Wrappers.
In that scenario, you can run queries on your Oracle database from within
PostgreSQL.  You can use those queries to copy data directly into new
tables without any interim files, or as a hybrid transition while you get
the new database set up.
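
For illustration, a rough sketch of that FDW approach with the oracle_fdw
extension (the server name, connect string, credentials and table names
below are placeholders):

  -- assumes oracle_fdw and the Oracle client libraries are installed
  CREATE EXTENSION oracle_fdw;

  CREATE SERVER ora_src FOREIGN DATA WRAPPER oracle_fdw
      OPTIONS (dbserver '//oracle-host:1521/ORCL');   -- placeholder connect string

  CREATE USER MAPPING FOR CURRENT_USER SERVER ora_src
      OPTIONS (user 'scott', password 'tiger');       -- placeholder credentials

  -- expose the Oracle schema as foreign tables, then copy the data
  -- straight into native tables, with no interim files
  CREATE SCHEMA ora_stage;
  IMPORT FOREIGN SCHEMA "SCOTT" FROM SERVER ora_src INTO ora_stage;
  CREATE TABLE public.customers AS SELECT * FROM ora_stage.customers;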

At the time I was working on that migration, we had too many
data edge cases for ora2pg to be very useful.  It has come a long way
since then.  I'm not sure it can do a live cutover, so you may need to plan
a bit of downtime if you have a lot of data to move into the new database.

Note that you will also almost certainly want to use a connection pooler
like PgBouncer or Pgpool-II (or both at the same time), so be sure to
include that in your plans from the beginning.

That said, none of this is on topic for the performance mailing list.
Please try to direct your questions to the right group next time.


Planning performance problem (67626.278ms)

2019-04-08 Thread Krzysztof Plocharz

Hi

We have a very strange query planning problem. Long story short, it 
takes 67626.278ms just to plan; query execution takes 12ms.


The query has 7 joins and 2 subselects.
It looks like the issue is not deterministic; sometimes it takes only a few ms 
to plan the query.


One of the tables has 550,485,942 live tuples and 743,504,012 dead 
tuples. Running ANALYZE on that table solves the problem only temporarily.


The question is: how can we debug what is going on?
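
(For reference, one way to see the planning/execution split from psql;
"some_table" below is just a placeholder for the real query:)

  \timing on
  -- plain EXPLAIN only plans the query, so with \timing its duration
  -- is essentially pure planning time
  EXPLAIN SELECT * FROM some_table;

  -- EXPLAIN (ANALYZE) additionally reports "Planning time" and
  -- "Execution time" lines (PostgreSQL 9.4 and later)
  EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM some_table;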

Best Regards,
Krzysztof Płocharz




RE: Planning performance problem (67626.278ms)

2019-04-08 Thread Igor Neyman

-Original Message-
From: Krzysztof Plocharz [mailto:[email protected]] 
Sent: Monday, April 08, 2019 10:11 AM
To: [email protected]
Subject: Planning performance problem (67626.278ms)

Hi

We have some very strange query planning problem. Long story short it takes 
67626.278ms just to plan. Query execution takes 12ms.

Query has 7 joins and 2 subselects.
It looks like the issue is not deterministic, sometimes is takes few ms to plan 
the query.

One of the tables has 550,485,942 live tuples and 743,504,012 dead tuples. 
Running ANALYZE on that tables solves the problem only temporarily.

Question is how can we debug what is going on?

Best Regards,
Krzysztof Płocharz

___

Why do you have to run Analyze? Did you turn off Autovacuum?
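
(A quick way to check, with a placeholder table name:)

  -- shows when (auto)vacuum/analyze last ran and how many dead tuples remain
  SELECT relname, last_vacuum, last_autovacuum, last_analyze,
         last_autoanalyze, n_live_tup, n_dead_tup
  FROM pg_stat_user_tables
  WHERE relname = 'some_table';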

Regards,
Igor Neyman


Re: Planning performance problem (67626.278ms)

2019-04-08 Thread Krzysztof Plocharz

No, Autovacuum is running.

On 2019/04/08 16:18, Igor Neyman wrote:


-Original Message-
From: Krzysztof Plocharz [mailto:[email protected]]
Sent: Monday, April 08, 2019 10:11 AM
To: [email protected]
Subject: Planning performance problem (67626.278ms)

Hi

We have some very strange query planning problem. Long story short it takes 
67626.278ms just to plan. Query execution takes 12ms.

Query has 7 joins and 2 subselects.
It looks like the issue is not deterministic, sometimes is takes few ms to plan 
the query.

One of the tables has 550,485,942 live tuples and 743,504,012 dead tuples. 
Running ANALYZE on that tables solves the problem only temporarily.

Question is how can we debug what is going on?

Best Regards,
Krzysztof Płocharz

___

Why do you have to run Analyze? Did you turn off Autovacuum?

Regards,
Igor Neyman






Re: Planning performance problem (67626.278ms)

2019-04-08 Thread Pavel Stehule
po 8. 4. 2019 v 16:11 odesílatel Krzysztof Plocharz 
napsal:

> Hi
>
> We have some very strange query planning problem. Long story short it
> takes 67626.278ms just to plan. Query execution takes 12ms.
>
> Query has 7 joins and 2 subselects.
> It looks like the issue is not deterministic, sometimes is takes few ms
> to plan the query.
>
> One of the tables has 550,485,942 live tuples and 743,504,012 dead
> tuples. Running ANALYZE on that tables solves the problem only temporarily.
>
> Question is how can we debug what is going on?
>

Please check your indexes for bloat. The planner gets the min and max from
the indexes, and this operation is slow on bloated indexes.
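
(One way to check that, a sketch assuming the pgstattuple contrib extension
is available; the index name is a placeholder:)

  CREATE EXTENSION IF NOT EXISTS pgstattuple;

  -- avg_leaf_density and leaf_fragmentation give a rough picture of
  -- how bloated a btree index is
  SELECT * FROM pgstatindex('some_table_pkey');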

But 67 sec is really slow - it could be some other problem. Is it a real
computer or a virtual one?



>
> Best Regards,
> Krzysztof Płocharz
>
>
>


Re: Planning performance problem (67626.278ms)

2019-04-08 Thread Justin Pryzby
On Mon, Apr 08, 2019 at 04:33:34PM +0200, Pavel Stehule wrote:
> po 8. 4. 2019 v 16:11 odesílatel Krzysztof Plocharz  
> napsal:
> 
> > We have some very strange query planning problem. Long story short it
> > takes 67626.278ms just to plan. Query execution takes 12ms.
> >
> > Query has 7 joins and 2 subselects.
> > It looks like the issue is not deterministic, sometimes is takes few ms
> > to plan the query.
> >
> > One of the tables has 550,485,942 live tuples and 743,504,012 dead
> > tuples. Running ANALYZE on that tables solves the problem only temporarily.
> >
> > Question is how can we debug what is going on?
> 
> please check your indexes against bloating. Planner get min and max from
> indexes and this operation is slow on bloat indexes.

I think that's from get_actual_variable_range(), right?

If it's due to bloating, I think the first step would be to 1) vacuum right
now; and, 2) set more aggressive auto-vacuum, like ALTER TABLE t SET
(AUTOVACUUM_VACUUM_SCALE_FACTOR=0.005).
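
(Spelled out, with a placeholder table name:)

  -- 1) reclaim dead tuples and refresh statistics right away
  VACUUM (VERBOSE, ANALYZE) some_table;

  -- 2) make autovacuum kick in much earlier on this table
  --    (0.5% of rows changed instead of the default 20%)
  ALTER TABLE some_table SET (autovacuum_vacuum_scale_factor = 0.005);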

What version is the postgres server?

Justin




Re: Planning performance problem (67626.278ms)

2019-04-08 Thread Krzysztof Plocharz




On 2019/04/08 16:42, Justin Pryzby wrote:

On Mon, Apr 08, 2019 at 04:33:34PM +0200, Pavel Stehule wrote:

po 8. 4. 2019 v 16:11 odesílatel Krzysztof Plocharz  
napsal:


We have some very strange query planning problem. Long story short it
takes 67626.278ms just to plan. Query execution takes 12ms.

Query has 7 joins and 2 subselects.
It looks like the issue is not deterministic, sometimes is takes few ms
to plan the query.

One of the tables has 550,485,942 live tuples and 743,504,012 dead
tuples. Running ANALYZE on that tables solves the problem only temporarily.

Question is how can we debug what is going on?


please check your indexes against bloating. Planner get min and max from
indexes and this operation is slow on bloat indexes.


Yes, we thought about this; there are over 700,000,000 dead tuples. But, 
as you said, that alone should not result in 67-second planning...




I think that's from get_actual_variable_range(), right ?

If it's due to bloating, I think the first step would be to 1) vacuum right
now; and, 2) set more aggressive auto-vacuum, like ALTER TABLE t SET
(AUTOVACUUM_VACUUM_SCALE_FACTOR=0.005).



We did pg_repack and it did help, but is it possible for 
get_actual_variable_range to take over 60 seconds?

Is there any other workaround for this besides pg_repack/vacuum?

Is there any way to actually debug this?


What version postgres server ?

Justin







On 2019/04/08 16:33, Pavel Stehule wrote:
>
> po 8. 4. 2019 v 16:11 odesílatel Krzysztof Plocharz
> <[email protected]> napsal:
>
> Hi
>
> We have some very strange query planning problem. Long story short it
> takes 67626.278ms just to plan. Query execution takes 12ms.
>
> Query has 7 joins and 2 subselects.
> It looks like the issue is not deterministic, sometimes is takes 
few ms

> to plan the query.
>
> One of the tables has 550,485,942 live tuples and 743,504,012 dead
> tuples. Running ANALYZE on that tables solves the problem only
> temporarily.
>
> Question is how can we debug what is going on?
>
>
> please check your indexes against bloating. Planner get min and max from
> indexes and this operation is slow on bloat indexes.
>
Yes, we thought about this; there are over 700,000,000 dead tuples. But, 
as you said, that alone should not result in 67-second planning...


> but 67 sec is really slow - it can be some other other problem - it is
> real computer or virtual?
>
real, with pretty good specs: NVMe drives, a six-core AMD Opteron, 64 GB of 
RAM. During testing the system was mostly idle.



>
> Best Regards,
> Krzysztof Płocharz
>
>




Re: Planning performance problem (67626.278ms)

2019-04-08 Thread Krzysztof Plocharz




On 2019/04/08 16:33, Pavel Stehule wrote:



po 8. 4. 2019 v 16:11 odesílatel Krzysztof Plocharz 
mailto:[email protected]>> napsal:


Hi

We have some very strange query planning problem. Long story short it
takes 67626.278ms just to plan. Query execution takes 12ms.

Query has 7 joins and 2 subselects.
It looks like the issue is not deterministic, sometimes is takes few ms
to plan the query.

One of the tables has 550,485,942 live tuples and 743,504,012 dead
tuples. Running ANALYZE on that tables solves the problem only
temporarily.

Question is how can we debug what is going on?


please check your indexes against bloating. Planner get min and max from 
indexes and this operation is slow on bloat indexes.




Yes, we thought about this; there are over 700,000,000 dead tuples. But, 
as you said, that alone should not result in 67-second planning...


but 67 sec is really slow - it can be some other other problem - it is 
real computer or virtual?




real, with pretty good specs: NVMe drives, a six-core AMD Opteron, 64 GB of 
RAM. During testing the system was mostly idle.




Best Regards,
Krzysztof Płocharz







Re: Planning performance problem (67626.278ms)

2019-04-08 Thread Pavel Stehule
po 8. 4. 2019 v 16:55 odesílatel Krzysztof Plocharz 
napsal:

>
>
> On 2019/04/08 16:42, Justin Pryzby wrote:
> > On Mon, Apr 08, 2019 at 04:33:34PM +0200, Pavel Stehule wrote:
> >> po 8. 4. 2019 v 16:11 odesílatel Krzysztof Plocharz <
> [email protected]> napsal:
> >>
> >>> We have some very strange query planning problem. Long story short it
> >>> takes 67626.278ms just to plan. Query execution takes 12ms.
> >>>
> >>> Query has 7 joins and 2 subselects.
> >>> It looks like the issue is not deterministic, sometimes is takes few ms
> >>> to plan the query.
> >>>
> >>> One of the tables has 550,485,942 live tuples and 743,504,012 dead
> >>> tuples. Running ANALYZE on that tables solves the problem only
> temporarily.
> >>>
> >>> Question is how can we debug what is going on?
> >>
> >> please check your indexes against bloating. Planner get min and max from
> >> indexes and this operation is slow on bloat indexes.
>
> Yes, we thought about this, there are over 700,000,000 dead tuples. But
> as you said, it should not result in 67 second planning...
>
> >
> > I think that's from get_actual_variable_range(), right ?
> >
> > If it's due to bloating, I think the first step would be to 1) vacuum
> right
> > now; and, 2) set more aggressive auto-vacuum, like ALTER TABLE t SET
> > (AUTOVACUUM_VACUUM_SCALE_FACTOR=0.005).
> >
>
> We did pgrepack and it did help, but is it possible for
> get_actual_variable_range to take over 60 seconds?
> Is there any other workaround for this except for pgrepack/vacuum?
>
> Anyway to actually debug this?
>

you can use perf and get a profile.

https://wiki.postgresql.org/wiki/Profiling_with_perf



> > What version postgres server ?
> >
> > Justin
> >
> >
>
>
>
>
> On 2019/04/08 16:33, Pavel Stehule wrote:>
>  >
>  > po 8. 4. 2019 v 16:11 odesílatel Krzysztof Plocharz
>  > <[email protected]> napsal:
>  >
>  > Hi
>  >
>  > We have some very strange query planning problem. Long story short
> it
>  > takes 67626.278ms just to plan. Query execution takes 12ms.
>  >
>  > Query has 7 joins and 2 subselects.
>  > It looks like the issue is not deterministic, sometimes is takes
> few ms
>  > to plan the query.
>  >
>  > One of the tables has 550,485,942 live tuples and 743,504,012 dead
>  > tuples. Running ANALYZE on that tables solves the problem only
>  > temporarily.
>  >
>  > Question is how can we debug what is going on?
>  >
>  >
>  > please check your indexes against bloating. Planner get min and max from
>  > indexes and this operation is slow on bloat indexes.
>  >
> Yes, we thought about this, there are over 700,000,000 dead tuples. But
> as you said, it should not result in 67 second planning...
>
>  > but 67 sec is really slow - it can be some other other problem - it is
>  > real computer or virtual?
>  >
> real, with pretty good specs: NVME drives, Six-Core AMD Opteron, 64GB of
> ram. During testing system was mostly idle.
>
>
>  >
>  > Best Regards,
>  > Krzysztof Płocharz
>  >
>  >
>
>
>


Block / Page Size Optimization

2019-04-08 Thread Gunther
Hi all, I am sure this should be a FAQ, but I can't see a definitive 
answer, only chatter on various lists and forums.


Default page size of PostgreSQL is 8192 bytes.

Default IO block size in Linux is 4096 bytes.
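
(The PostgreSQL side can be checked from psql, for what it's worth:)

  SHOW block_size;       -- compile-time page size, typically 8192
  SHOW wal_block_size;   -- WAL page size, typically 8192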

I can create an XFS file system with an 8192-byte block size, but then it 
does not mount on Linux, because the VM page size (4096 again) is the limit.


There seems to be no way to change that in (most, common) Linux 
variants. In FreeBSD there appears to be a way to change that.


But then there is a hardware limit as well, as far as VM memory page 
allocation is concerned. Apparently on most i386 / amd64 architectures the 
VM page sizes are 4k, 2M, and 1G. The larger ones, I believe, are called 
"hugepages", and I only ever see that discussed in the PostgreSQL manuals 
for Linux, not for FreeBSD.


People have asked: does it matter? And then there is all that chatter 
about "why don't you run a benchmark and report back to us" -- "OK, will 
do" -- and then it's crickets.


But why is this such a secret?

On Amazon AWS there is the following very simple situation: IO is capped 
by IO operations per second (IOPS). Let's say, on a smallish volume, I 
get 300 IOPS (once my burst balance is used up).


Now my simple theoretical reasoning is this: one IO call transfers 1 
block of 4k size. That means, with a cap of 300 IOPS, I get to send 1.17 
MB per second. That would be the absolute limit. BUT, if I could double 
the transfer size to 8k, I should be able to move 2.34 MB per second. 
Shouldn't I?


That might well depend on whether AWS' virtual device paths would 
support these 8k block sizes.


But something tells me that my reasoning here is totally off, because I 
get better IO throughput than that. Even at 3000 IOPS I would only get 
11 MB per second, and I am sure I am getting more like 50-100 MB/s, no? So 
my simplistic logic is false.


What really is the theoretical issue with the file system block size? 
Where -- in theory -- does the benefit of using an XFS block size of 8 kB 
come from, or even of increasing the PostgreSQL page size to 16 kB and 
the XFS block size to 16 kB as well? I remember having seen standard 
UFS block sizes of 16 kB. But then why is Linux so adamant about refusing to 
mount an 8 kB XFS just because its VM page size is only 4 kB?


Doesn't this all have one straight explanation?

If you have a link that I can just read, I'd appreciate you sharing it. 
I think this should be in a wiki or FAQ somewhere. If I get a quick 
and dirty explanation with some pointers, I can try to write it up into 
a more complete answer that might be added to some documentation or 
FAQ.


thanks & regards,
-Gunther





Re: Block / Page Size Optimization

2019-04-08 Thread Andres Freund
Hi,

On 2019-04-08 11:09:07 -0400, Gunther wrote:
> I can set an XFS file system with 8192 bytes block size, but then it does
> not mount on Linux, because the VM page size is the limit, 4096 again.
> 
> There seems to be no way to change that in (most, common) Linux variants. In
> FreeBSD there appears to be a way to change that.
> 
> But then, there is a hardware limit also, as far as the VM memory page
> allocation is concerned. Apparently most i386 / amd64 architectures the VM
> page sizes are 4k, 2M, and 1G. The latter, I believe, are called "hugepages"
> and I only ever see that discussed in the PostgreSQL manuals for Linux, not
> for FreeBSD.
> 
> People have asked: does it matter? And then there is all that chatter about
> "why don't you run a benchmark and report back to us" -- "OK, will do" --
> and then it's crickets.
> 
> But why is this such a secret?
> 
> On Amazon AWS there is the following very simple situation: IO is capped on
> IO operations per second (IOPS). Let's say, on a smallish volume, I get 300
> IOPS (once my burst balance is used up.)
> 
> Now my simple theoretical reasoning is this: one IO call transfers 1 block
> of 4k size. That means, with a cap of 300 IOPS, I get to send 1.17 MB per
> second. That would be the absolute limit. BUT, if I could double the
> transfer size to 8k, I should be able to move 2.34 MB per second. Shouldn't
> I?

The kernel collapses consecutive write requests. You can see the
average sizes of IO requests using iostat -xm 1. When e.g. bulk loading
into postgres I see:

Device   r/s    w/s     rMB/s  wMB/s   rrqm/s  wrqm/s  %rrqm  %wrqm  r_await  w_await  aqu-sz  rareq-sz  wareq-sz  svctm  %util
sda      4.00   696.00  0.02   471.05  0.00    80.00   0.00   10.31  8.50     7.13     4.64    4.00      693.03    0.98   68.50

so the average write request size was 693.03 kB. Thus I got ~470 MB/sec
despite there being only ~700 IOPS. That's with a 4KB page size, 4KB FS
blocks, and an 8KB postgres block size.


There still might be some benefit from different FS block sizes, but it's
not going to be directly related to IOPS.

Greetings,

Andres Freund




Re: Oracle to postgres migration

2019-04-08 Thread phb07


Le 08/04/2019 à 14:24, Rick Otten a écrit :



On Mon, Apr 8, 2019 at 8:04 AM Julien Rouhaud wrote:


On Mon, Apr 8, 2019 at 1:49 PM Daulat Ram <[email protected]> wrote:
>
> Please confirm ! Can we migrate Oracle 12c database (12.1.0.1.0)
running on Solaris to PostgreSQL 11.2 on  Linux (Ubuntu). Also,
please suggest the tools and pre-requisites.
A database migration is likely feasible, but might require quite a lot
of work depending on what features you're using, and the amount of PL
code.  Also, obviously migrating the database is only a part of the
overall migration process, as you'll also need to take care of the
application(s), the backup/restore, monitoring and all other tools you
need.

Concerning the database migration, the best tool is probably Gilles
Darold's ora2pg.  The tool also provides a migration cost assessment
report, to evaluate the difficulty of the database migration.  More
information on http://ora2pg.darold.net/



The last big Oracle to PG migration that I did was several years ago.  
We stood up the PostgreSQL instance(s) and then used SymmetricDS to 
synchronize the Oracle and PG databases.   After tuning and testing 
the postgresql side, we cut over the applications live - with minimal 
downtime - by releasing the updated application code and 
configuration.   If we needed to fail back, it was also pretty easy to 
undo the release and configuration changes.


Another approach you can play with is to leverage Foreign Data 
Wrappers.  In that scenario, you can run queries on your Oracle 
database from within PostgreSQL.  You can use those queries to copy 
data directly into new tables without any interim files, or as a 
hybrid transition while you get the new database set up.


At the time I was working on that migration, we had too many 
data-edge-cases for ora2pg to be very useful.  It has come a long ways 
since then.  I'm not sure it can do a live cutover, so you may need to 
plan a bit of downtime if you have a lot of data to move into the new 
database.


Note that you will also almost certainly want to use a connection 
pooler like PGBouncer and/or PGPool II (or both at the same time), so 
be sure to include that in your plans from the beginning.


That said, none of this is on topic for the performance mailing list.  
Please try to direct your questions to the right group next time.



Just a few additional pieces of information.
1) A migration from one DBMS to another must always be led as a project 
(because your data are always important ;-)

2) A migration project always has the following main tasks:
- setting up a proper Postgres platform (with all the software, procedures and 
documentation needed to provide a good PostgreSQL service to your 
applications/clients) (you may already have such a platform);
- migrating the data. This concerns both the structure (DDL) and the 
data content;
- migrating the stored procedures, if any. In Oracle migrations, this is 
often a big part of the project's workload;
- adapting the client application. The effort needed here can be huge or 
... null, depending on the languages used, whether the data access APIs 
are compatible, and whether an ORM is used;
- when all this has been prepared, a test phase can start. This is very 
often the most costly part of the project, in particular for mission 
critical databases;
- then, you are ready to switch to Postgres.

3) Do not hesitate to invest in education and external professional support.

4) Before launching such a project, it is highly recommended to perform 
a preliminary study. For this purpose, as Julien said, ora2pg is a 
big help in analysing the Oracle database content. The cost estimates 
are computed pretty well, which very quickly gives you an idea of the 
global cost of the database migration. For the application side, you may 
also have a look at code2pg.


KR. Philippe.


Re: Oracle to postgres migration

2019-04-08 Thread legrand legrand
Rick Otten-2 wrote
> On Mon, Apr 8, 2019 at 8:04 AM Julien Rouhaud <rjuju123@ > wrote:
> 
> [...]
> 
> That said, none of this is on topic for the performance mailing list.
> Please try to direct your questions to the right group next time.

Is "general" the correct one ?
or should a "migration" group be created ;^> 

Regards
PAscal







Re: Planning performance problem (67626.278ms)

2019-04-08 Thread Justin Pryzby
On Mon, Apr 08, 2019 at 04:55:36PM +0200, Krzysztof Plocharz wrote:
> We did pgrepack and it did help, but is it possible for
> get_actual_variable_range to take over 60 seconds?

You have many tables being joined, perhaps with an exhaustive join search, so
maybe that function is being called many times.
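
(Not something mentioned above, but related: the planner's exhaustive join
search can be constrained, which also limits how often such estimation paths
are hit - a sketch:)

  -- defaults: join_collapse_limit = 8, from_collapse_limit = 8, geqo_threshold = 12
  SHOW join_collapse_limit;

  -- lowering it makes the planner follow the written join order more
  -- closely, trading possible plan quality for planning time
  SET join_collapse_limit = 4;   -- session-level experiment only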

What version is the postgres server?

Justin




Re: Oracle to postgres migration

2019-04-08 Thread Gavin Flower

On 09/04/2019 08:02, legrand legrand wrote:

Rick Otten-2 wrote

On Mon, Apr 8, 2019 at 8:04 AM Julien Rouhaud <rjuju123@ > wrote:

[...]

That said, none of this is on topic for the performance mailing list.
Please try to direct your questions to the right group next time.

Is "general" the correct one ?
or should a "migration" group be created ;^>

Regards
PAscal





I think having a specific migration group would also be likely to 
improve the visibility of pg, and of the idea of migrating to pg, as it 
would help pg to appear in more search results.



Cheers,
Gavin






Re: Oracle to postgres migration

2019-04-08 Thread David G. Johnston
On Mon, Apr 8, 2019 at 3:31 PM Gavin Flower 
wrote:

> I think having a specific migration group would also be likely to
> improve the visibility of pg, and the idea of migrating to pg.  As it
> help pg to appear in more search results.
>
>
I presently have qualms about retaining novice, sql, performance, and probably
some others.  I don't think adding yet another specialized low-volume list
is of particular benefit, nor do I think it behooves the core project to
be at the center of migration support anyway.

This discussion can and should be moved to -general

David J.


Re: Planning performance problem (67626.278ms)

2019-04-08 Thread Andres Freund
Hi,

On 2019-04-08 16:10:17 -0700, Jeremy Schneider wrote:
> On 4/8/19 07:42, Justin Pryzby wrote:
> > On Mon, Apr 08, 2019 at 04:33:34PM +0200, Pavel Stehule wrote:
> >> po 8. 4. 2019 v 16:11 odesílatel Krzysztof Plocharz 
> >>  napsal:
> >>
> >>> We have some very strange query planning problem. Long story short it
> >>> takes 67626.278ms just to plan. Query execution takes 12ms.
> >>>
> >>> Query has 7 joins and 2 subselects.
> >>> It looks like the issue is not deterministic, sometimes is takes few ms
> >>> to plan the query.
> >>>
> >>> One of the tables has 550,485,942 live tuples and 743,504,012 dead
> >>> tuples. Running ANALYZE on that tables solves the problem only 
> >>> temporarily.
> >>>
> >>> Question is how can we debug what is going on?
> >>
> >> please check your indexes against bloating. Planner get min and max from
> >> indexes and this operation is slow on bloat indexes.
> > 
> > I think that's from get_actual_variable_range(), right ?
> 
> For what it's worth, I have seen a similar issue on Aurora PG 9.6 where
> query planning took a very long time (multiple minutes). In this
> particular case, there wasn't anything Aurora-specific about the call to
> get_actual_variable_range. We weren't able to distinctly identify the
> root cause or build a reproducible test case -- but we suspect that an
> inefficiency might exist in community PostgreSQL code.
> 
> For debugging, a few ideas:
> 
> 1) capture a stack with pstack or perf record --call-graph
> 
> 2) capture the execution plan of the SQL w slow planning
> 
> 3) capture detailed stats for all relations and objects involved
> 
> 4) capture the usual info for bug reporting (preface section in docs)
> 
> A reproducible test case is the gold standard; I'm keeping my eyes open
> for another case too.
> 
> For the slow planning case that I saw, the slow process was almost
> entirely in this call stack (captured with perf record --call-graph):
> ...
> index_fetch_heap
> index_getnext
> get_actual_variable_range
> ineq_histogram_selectivity
> scalarineqsel
> mergejoinscansel
> initial_cost_mergejoin
> try_mergejoin_path
> add_paths_to_joinrel
> make_join_rel
> join_search_one_level
> standard_join_search
> make_one_rel
> query_planner
> ...

I suspect some of this might be related to versions < 11 not having the
following commit:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=3ca930fc39ccf987c1c22fd04a1e7463b5dd0dfd

Greetings,

Andres Freund