Max_connections limit
Hello team,

We have migrated our database from Oracle 12c to Postgres 11, and I need your suggestions. We have a sessions limit in Oracle = 3024. Do we need to set the same connection limit in Postgres as well? How can we decide the max_connections limit for Postgres? Are there any differences in managing connections in Oracle and Postgres?

SQL> show parameter sessions;

NAME                          TYPE     VALUE
----------------------------- -------- -----
java_max_sessionspace_size    integer  0
java_soft_sessionspace_limit  integer  0
license_max_sessions          integer  0
license_sessions_warning      integer  0
sessions                      integer  3024
shared_server_sessions        integer

SQL>

Regards,
Daulat
Re: Max_connections limit
Daulat Ram wrote:
> We have migrated our database from Oracle 12c to Postgres 11. I need your
> suggestions; we have a sessions limit in Oracle = 3024. Do we need to set
> the same connection limit in Postgres as well? How can we decide the
> max_connections limit for Postgres? Are there any differences in managing
> connections in Oracle and Postgres?

I'd say that is way too high in both Oracle and PostgreSQL.

Set the value to 50 or 100 and get a connection pooler if the
application cannot do that itself.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
Re: Max_connections limit
You know that Postgres doesn't have a shared_pool like Oracle does, and session information (execution plans, etc.) is only available to the current session. Therefore I also highly recommend using a connection pooler, as Laurenz wrote, in order to have a higher chance that some of this is already cached in a shared, reusable session.

Regards
Herve

Sent from my iPhone

> On 26 June 2019 at 11:05, Laurenz Albe wrote:
>
> Daulat Ram wrote:
>> We have migrated our database from Oracle 12c to Postgres 11. I need your
>> suggestions; we have a sessions limit in Oracle = 3024. Do we need to set
>> the same connection limit in Postgres as well? How can we decide the
>> max_connections limit for Postgres? Are there any differences in managing
>> connections in Oracle and Postgres?
>
> I'd say that is way too high in both Oracle and PostgreSQL.
>
> Set the value to 50 or 100 and get a connection pooler if the
> application cannot do that itself.
>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
Re: Max_connections limit
On Wed, Jun 26, 2019 at 5:16 AM Hervé Schweitzer (HER) <[email protected]> wrote:
> You know that Postgres doesn't have a shared_pool like Oracle, and session
> information (execution plans, etc.) is only available to the current
> session. Therefore I also highly recommend using a connection pooler, as
> Laurenz wrote, in order to have a higher chance that some stuff is already
> cached in a shared session.
>
> Regards
> Herve

The most popular stand-alone connection pooler for PostgreSQL is the oddly named "pgbouncer": https://wiki.postgresql.org/wiki/PgBouncer

There are others, of course. PgPool is also very popular: https://www.pgpool.net/mediawiki/index.php/Main_Page

Some applications can also manage a connection pool efficiently entirely within the application itself.

Configuring the maximum number of concurrent connections your database supports incurs significant overhead in the running database. New connections and disconnections also have a high overhead as they occur. By moving the connection/disconnection logic to a connection pooler you remove a lot of overhead and load from the database - letting it focus on the important stuff -- your queries.

It is amazing how many fewer actual connections you need to the database when you configure a pooler. Most connections from applications and users are idle most of the time, even on busy web servers. They just keep that pathway open in case they need to run a query, to save on the overhead of having to open a new one every time. By using a pooler you only need to configure connections for the number of concurrent _queries_ rather than concurrent application and user open-but-idle connections.
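[A quick way to see the active-vs-idle split described above - i.e., the concurrent-_queries_ count you should actually size for - is to group pg_stat_activity by state. This is a sketch; you need a superuser or pg_monitor role to see other users' sessions:]

```sql
-- Snapshot of connection states: 'active' rows are running a query right now;
-- 'idle' rows are open connections doing nothing.
SELECT state, count(*) AS connections
FROM pg_stat_activity
WHERE backend_type = 'client backend'
GROUP BY state
ORDER BY connections DESC;
```

On a typical web workload, the 'active' count is a small fraction of the total, which is the argument for pooling.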
Re: Incorrect index used in few cases..
Hi,

I didn't see my following response get posted on the mailing list, so I'm not sure if this is a duplicate. Sorry for the late reply.

The initial values before the upgrade were seq_page_cost=1, random_page_cost=4; after upgrading, when we started to see the issues ("Seq Scan"), we changed them to seq_page_cost=1, random_page_cost=1.

The issue happens only in production, so making the index invalid would affect service; it isn't something we can do. I have tried rebuilding the PK index to see whether it helps, but it doesn't seem to.

Related to the same issue, we sometimes see the following Seq Scan on an update when querying by PK alone, which appears to be related:

update tc set ... where id = $1 and version < $2
Update on tc (cost=100.00..10003184001.52 rows=1 width=1848)
  -> Seq Scan on tc (cost=100.00..10003184001.52 rows=1 width=1848)
     Filter: ((version < '38'::numeric) AND (id = '53670604704'::numeric))

I was trying to find where the cost=100 is set in the source code but wasn't able to find it; does anyone know where it is set? And if someone can point me to the code that builds the execution plans when SQL is sent, I can try to go through it to see if I can figure out what it is doing behind the scenes in its calculation.

On Sun, Jun 23, 2019 at 8:07 AM AminPG Jaffer wrote:
>
> Sorry for the late reply.
>
> The initial values before the upgrade were seq_page_cost=1, random_page_cost=4;
> after upgrading, when we started to see the issues ("Seq Scan"), we changed
> them to seq_page_cost=1, random_page_cost=1.
>
> The issue happens only in production, so making the index invalid would
> affect service; it isn't something we can do.
> I have tried rebuilding the PK index to see whether it helps, but it
> doesn't seem to.
>
> Related to the same issue, we sometimes see the following Seq Scan on an
> update when querying by PK alone, which appears to be related:
>
> update tc set ... where id = $1 and version < $2
> Update on tc (cost=100.00..10003184001.52 rows=1 width=1848)
>   -> Seq Scan on tc (cost=100.00..10003184001.52 rows=1 width=1848)
>      Filter: ((version < '38'::numeric) AND (id = '53670604704'::numeric))
>
> I was trying to find where the cost=100 is set in the source code
> but wasn't able to find it; does anyone know where it is set?
> And if someone can point me to the code that builds the execution
> plans when SQL is sent, I can try to go through it to see if I can
> figure out what it is doing behind the scenes in its calculation.
>
> Thanks
>
> On Tue, Jun 18, 2019 at 3:23 PM Tom Lane wrote:
>
>> Andres Freund writes:
>> > Are those indexes used for other queries? Any chance they've been
>> > recently created?
>>
>> > SELECT indexrelid::regclass, xmin, indcheckxmin, indisvalid, indisready,
>> > indislive, txid_current(), txid_current_snapshot()
>> > FROM pg_index WHERE indrelid = 'tc'::regclass;
>>
>> > might tell us.
>>
>> Oh, that's a good idea.
>>
>> > Amin, might be worth to see what the query plan is if you disable that
>> > index. I assume it's too big to quickly drop (based on the ?
>>
>> Considering that the "right" query plan would have a cost estimate in
>> the single digits or close to it, I have to suppose that the planner is
>> rejecting that index as unusable, not making a cost-based decision not
>> to use it. (Well, maybe if it's bloated by three orders of magnitude
>> compared to the other indexes, it'd lose on cost. Doesn't seem likely
>> though.)
>>
>> So I think we're looking for a hard "can't use the index" reason, and
>> now we've eliminated datatype mismatch, which'd be the most obvious
>> such reason. But index-isnt-valid or index-isnt-ready might do the
>> trick.
>>
>> regards, tom lane
RE: Max_connections limit
From: Daulat Ram [mailto:[email protected]]
Sent: Wednesday, June 26, 2019 3:14 AM
To: [email protected]
Subject: Max_connections limit

Hello team,

We have migrated our database from Oracle 12c to Postgres 11, and I need your suggestions. We have a sessions limit in Oracle = 3024. Do we need to set the same connection limit in Postgres as well? How can we decide the max_connections limit for Postgres? Are there any differences in managing connections in Oracle and Postgres?

SQL> show parameter sessions;

NAME                          TYPE     VALUE
----------------------------- -------- -----
java_max_sessionspace_size    integer  0
java_soft_sessionspace_limit  integer  0
license_max_sessions          integer  0
license_sessions_warning      integer  0
sessions                      integer  3024
shared_server_sessions        integer

SQL>

Regards,
Daulat

The difference between Oracle and PG is that Oracle has a "built-in" connection pooler, and PG does not. You should use an external pooler (e.g., PgBouncer) and reduce the number of allowed connections in the PG config to about 50, while allowing thousands of client connections when configuring PgBouncer.

Regards,
Igor Neyman
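[A minimal PgBouncer configuration along the lines Igor describes might look like the sketch below. File paths, the database name, and the pool sizes are illustrative assumptions, not tuned recommendations for this workload:]

```ini
; pgbouncer.ini (sketch): thousands of client connections funneled
; into a small number of real PostgreSQL connections.
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = *
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction   ; a server connection is held only per transaction
max_client_conn = 3000    ; clients may still open thousands of connections...
default_pool_size = 50    ; ...but only ~50 of them reach PostgreSQL
```

With transaction pooling, max_connections in postgresql.conf then only needs to cover the pool size plus a few reserved superuser connections.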
Re: scans on table fail to be excluded by partition bounds
On Tue, Jun 25, 2019 at 10:48:01AM +, Steven Winfield wrote:
> > ts=# explain SELECT * FROM eric_enodeb_cell_metrics WHERE start_time
> > BETWEEN '2019-01-01 04:00' AND '2019-01-01 05:00' OR start_time BETWEEN
> > '2019-01-02 04:00' AND '2019-01-02 05:00'
>
> Maybe it's because of the implicit usage of the local timezone when the
> strings are cast to (timestamp with time zone) in the values you give for
> start_time here?
> What happens if you specify it using "TIMESTAMP WITH TIME ZONE '2019-01-01
> 04:00-05'", etc.?

It's the same. The timezone in the constraints is the default timezone, so that's correct.

Justin
Re: Perplexing, regular decline in performance
On Tue, 25 Jun 2019 at 11:55, Benjamin Scherrey wrote:
> Have you done a VACUUM ANALYZE FULL on your database? This needs to be
> done periodically to inform the server of the statistics of how the data
> and relations are distributed across the database. Without this, bad
> assumptions by the planner can cause degradation of performance.

Autovacuum is enabled. As well, we had problems with autovacuum running reliably in 8.2, so we are still running a nightly script that runs VACUUM ANALYZE on the complete database. As for VACUUM ANALYZE FULL, the database underwent a full dump and reload, which, as I understand it, would have rebuilt the indexes, followed by an ANALYZE to update the planner. So I'm not sure a VACUUM ANALYZE FULL would have much effect. I'm also not sure how it bears on the problem stated here, where the planner shouldn't even be looking at this table in the queries we are timing.

> Also, if you are using the default settings in postgres.conf then
> understand those are established to use the absolute minimum amount of
> resources possible, which means not taking advantage of available memory or
> CPUs that may be available in your environment that would make the database
> server more performant.

No, we attempted to tune these, using https://pgtune.leopard.in.ua. The following values are from our install script (hence why they don't look exactly like their .conf versions). And, as someone else asked, transparent huge pages are enabled:

# DB Version: 11
# OS Type: linux
# DB Type: web
# Total Memory (RAM): 128 GB
# CPUs = threads per core * cores per socket * sockets
# CPUs num: 256
# Connections num: 250
# Data Storage: ssd

# Set via sysctl
# 64 GB in 4096 byte pages on our 128GB production system
shmall = 15777216
# 48 GB on our 128GB production system
shmmax = 51539607552

# Huge Pages
# Set via sysctl
huge-pages-alloc = 0

shared-buffers = 32GB
work-mem = 1024kB
maintenance-work-mem = 2GB
max-stack-depth = 4MB
effective-io-concurrency = 200
max-parallel-workers-per-gather = 128
max-parallel-workers = 256

# postgresql-conf-archive
wal-buffers = 16MB
min-wal-size = 1GB
max-wal-size = 2GB
checkpoint-completion-target = 0.7
archive-mode = on
archive-timeout = 900

# postgresql-conf-query
# 75% of production memory
effective-cache-size = 96GB
# SSD drives
random-page-cost = 1.1
default-statistics-target = 100

I'll be providing further details in reply to another message in the thread. Thanks!
Re: Perplexing, regular decline in performance
On Tue, Jun 25, 2019 at 8:49 AM Hugh Ranalli wrote: > What we continued to notice was a milder but still definite trend of > increased query times, during the course of each week, from the mid to high > 200 ms, to the high 300 ms to low 400 ms. Some years ago, someone had noticed > that as the number of "raw_page" columns in a particular table grew, > performance would decline. They wrote a script that once a week locks the > table, deletes the processed large columns (they are not needed after > processing), copies the remaining data to a backup table, truncates the > original table, then copies it back. When this script runs we see an > immediate change in performance, from 380 ms in the hour before the drop, to > 250 ms in the hour of the drop. As rows with these populated columns are > added during the course of a week, the performance drops, steadily, until the > next week's cleaning operation. Each week the performance increase is clear > and significant. Can you show us the definition of the table, including its indexes? Can you describe the data and distribution of values within the columns, particularly where they're indexed? -- Peter Geoghegan
Re: Perplexing, regular decline in performance
On Tue, 25 Jun 2019 at 12:23, Justin Pryzby wrote:
> What kernel? Version? OS?

Ubuntu 18.04; the current kernel is 4.15.0-51-generic.

> If Linux, I wonder if transparent hugepages or KSM are enabled? It seems
> possible that truncating the table is clearing enough RAM to mitigate the
> issue, similar to restarting the DB.
> tail /sys/kernel/mm/ksm/run
> /sys/kernel/mm/transparent_hugepage/khugepaged/defrag
> /sys/kernel/mm/transparent_hugepage/enabled
> /sys/kernel/mm/transparent_hugepage/defrag
>
> https://www.postgresql.org/message-id/20170718180152.GE17566%40telsasoft.com

==> /sys/kernel/mm/ksm/run <==
0

==> /sys/kernel/mm/transparent_hugepage/khugepaged/defrag <==
1

==> /sys/kernel/mm/transparent_hugepage/enabled <==
always [madvise] never

==> /sys/kernel/mm/transparent_hugepage/defrag <==
always defer defer+madvise [madvise] never

From my research in preparing for the upgrade, I understood transparent huge pages were a good thing, and should be enabled. Is this not correct?

> 11.2 would have parallel query, and enabled by default. Are there other
> settings you've changed (or not changed)?
> https://wiki.postgresql.org/wiki/Server_Configuration

I've just posted the parameters we are changing from the default in a previous reply, so I won't repeat them here unless you want me to.

> It's possible that the "administrative" queries are using up lots of your
> shared_buffers, which are (also/more) needed by the customer-facing queries.
> I would install pg_buffercache to investigate. Or, just pause the admin
> queries and see if the issue goes away during that interval?

Pausing the admin queries isn't an option in our environment, especially as the issue reveals itself over the course of days, not minutes or hours.

 ?column?               |  count  | count | datname | coalesce                | toast          | dirtyfrac              | avg
------------------------+---------+-------+---------+-------------------------+----------------+------------------------+--------------------
 0.24904101286779650995 | 1044545 |     0 | mydb    | position                |                | 0.                     | 4.8035517857057379
 0.16701241622795295199 |  700495 |     0 | mydb    | stat_position_click     |                | 0.                     | 1.9870234619804567
 0.09935032779251879171 |  416702 |  6964 | mydb    | pg_toast_19788          | harvested_job  | 0.01671218280689797505 | 1.9346079452462431
 0.06979762146872315533 |  292750 |     0 | mydb    | url                     |                | 0.                     | 4.9627873612297182
 0.03795774662998486745 |  159205 |     0 | mydb    | stat_sponsored_position |                | 0.                     | 1.8412361420809648
 0.02923155381784048663 |  122605 |     0 | mydb    | pg_toast_20174          | page           | 0.                     | 3.0259532645487541
 0.02755283459406156353 |  115564 |     0 | mydb    | location                |                | 0.                     | 4.9953532241874632
 0.02015273698468076320 |   84526 |  1122 | mydb    | harvested_job           |                | 0.01327402219435439983 | 4.9922154130090150
 0.01913348905375406298 |   80251 |     0 | mydb    | pg_toast_20257          | position_index | 0.                     | 4.9880001495308470

harvested_job is the rapidly growing "problematic" table I am talking about. page is the 355 GB table that gets referenced on the public searches. I'll google, but is there a place I should look to understand what I am seeing here? Also, should pg_buffercache perhaps be run at the beginning and end of the week, to see if there is a significant difference?

> Could you send the query plan for the slow (customer-facing) queries?
> https://wiki.postgresql.org/wiki/Slow_Query_Questions#EXPLAIN_.28ANALYZE.2C_BUFFERS.29.2C_not_just_EXPLAIN

I can, but can I ask why this would matter? I'm not looking to optimise the query (although I'm sure it could be; this is a legacy system with lots of barnacles). The problem is that the same query performs increasingly slowly over the course of a week, seemingly in sync with the rows with a large toast column added to one particular table (which, as I mentioned, isn't referenced by the query in question). Wouldn't the plan be the same at both the start of the week (when the problematic table is essentially empty) and at the end (when it is much larger)?

Thanks!
Hugh
Re: Perplexing, regular decline in performance
On Wed, 26 Jun 2019 at 14:52, Peter Geoghegan wrote:
> Can you show us the definition of the table, including its indexes?
> Can you describe the data and distribution of values within the
> columns, particularly where they're indexed?

I'm sorry, but I'm not sure what you mean by the "distribution of values within the columns." Can you clarify or provide a link to an example?

Thanks,
Hugh
Re: Perplexing, regular decline in performance
On Wed, Jun 26, 2019 at 12:02 PM Hugh Ranalli wrote: > I'm sorry, but I'm not sure what you mean by the "distribution of values > within the columns." Can you clarify or provide an link to an example? I would mostly just like to see the schema of the table in question, including indexes, and a high-level description of the nature of the data in the table. Ideally, you would also include pg_stats.* information for all columns in the table. That will actually let us see a summary of the data. Though you should be careful about leaking sensitive information that happens to be contained in the statistics, such as the most common values. -- Peter Geoghegan
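[The per-column summary Peter is describing lives in the pg_stats view. A sketch of the kind of query that would produce it - the table name is taken from earlier in the thread, and as Peter notes, most_common_vals can expose real data:]

```sql
-- Planner statistics for every column of the problem table:
-- fraction of NULLs, number of distinct values, physical ordering
-- correlation, and the most common values with their frequencies.
SELECT attname, null_frac, n_distinct, correlation,
       most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'harvested_job'
ORDER BY attname;
```

Redact most_common_vals before posting if the column holds sensitive values.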
Re: Perplexing, regular decline in performance
On 2019-Jun-26, Hugh Ranalli wrote:

> From my research in preparing for the upgrade, I understood transparent
> huge pages were a good thing, and should be enabled. Is this not correct?

It is not.

> Wouldn't the plan be the same at both
> the start of the week (when the problematic table is essentially empty) and
> at the end (when it is much larger)?

Not necessarily. Though, if a plan change were the culprit, you would probably see a sudden change in performance characteristics rather than a gradual one. Worth making sure, anyway.

--
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Perplexing, regular decline in performance
Alvaro Herrera writes: > On 2019-Jun-26, Hugh Ranalli wrote: >> From my research in preparing for the upgrade, I understood transparent >> huge pages were a good thing, and should be enabled. Is this not correct? > It is not. Yeah ... they would be a good thing perhaps if the quality of the kernel implementation were better. But there are way too many nasty corner cases, at least with the kernel versions people around here have experimented with. You're best off to disable THP and instead manually arrange for Postgres' shared memory to use huge pages. I forget where to look for docs about doing that, but I think we have some. regards, tom lane
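[Following Tom's advice, sizing the explicit huge-page reservation for the shared_buffers = 32GB posted earlier in this thread can be sketched as below. The 2 MB page size is the usual x86-64 Linux default (check Hugepagesize in /proc/meminfo), and the ~10% headroom for the rest of the shared memory segment is an assumption, not a Postgres-documented figure:]

```sql
-- shared_buffers is 32 GB = 32768 MB; default huge page is 2 MB.
-- Reserve roughly 10% extra for the non-buffer parts of shared memory.
SELECT ceil(32768 * 1.10 / 2) AS suggested_nr_hugepages;  -- 18023 pages (~35.2 GB)
-- Then: sysctl -w vm.nr_hugepages=18023 (persist it in /etc/sysctl.conf),
-- and set huge_pages = on (rather than 'try') in postgresql.conf so the
-- server fails loudly at startup if the reservation is missing.
```

huge_pages = on is the setting Tom alludes to; with THP disabled, this gives you the TLB benefit without the kernel's compaction stalls.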
Re: scans on table fail to be excluded by partition bounds
On Tue, 25 Jun 2019 at 05:31, Justin Pryzby wrote:
> ts=# explain SELECT * FROM eric_enodeb_cell_metrics WHERE start_time BETWEEN
> '2019-01-01 04:00' AND '2019-01-01 05:00' OR start_time BETWEEN '2019-01-02
> 04:00' AND '2019-01-02 05:00';
> Append (cost=36.04..39668.56 rows=12817 width=2730)
>   -> Bitmap Heap Scan on eric_enodeb_cell_20190101 (cost=36.04..19504.14 rows=6398 width=2730)
>      Recheck Cond: (((start_time >= '2019-01-01 04:00:00-05'::timestamp with time zone) AND (start_time <= '2019-01-01 05:00:00-05'::timestamp with time zone)) OR ((start_time >= '2019-01-02 04:00:00-05'::timestamp with time zone) AND (start_time <= '2019-01-02 05:00:00-05'::timestamp with time zone)))
>      -> BitmapOr (cost=36.04..36.04 rows=6723 width=0)
>         -> Bitmap Index Scan on eric_enodeb_cell_20190101_idx (cost=0.00..16.81 rows=6465 width=0)
>            Index Cond: ((start_time >= '2019-01-01 04:00:00-05'::timestamp with time zone) AND (start_time <= '2019-01-01 05:00:00-05'::timestamp with time zone))
>         -> Bitmap Index Scan on eric_enodeb_cell_20190101_idx (cost=0.00..16.03 rows=259 width=0)
>            Index Cond: ((start_time >= '2019-01-02 04:00:00-05'::timestamp with time zone) AND (start_time <= '2019-01-02 05:00:00-05'::timestamp with time zone))
>   -> Bitmap Heap Scan on eric_enodeb_cell_20190102 (cost=36.08..20100.34 rows=6419 width=2730)
>      Recheck Cond: (((start_time >= '2019-01-01 04:00:00-05'::timestamp with time zone) AND (start_time <= '2019-01-01 05:00:00-05'::timestamp with time zone)) OR ((start_time >= '2019-01-02 04:00:00-05'::timestamp with time zone) AND (start_time <= '2019-01-02 05:00:00-05'::timestamp with time zone)))
>      -> BitmapOr (cost=36.08..36.08 rows=6982 width=0)
>         -> Bitmap Index Scan on eric_enodeb_cell_20190102_idx (cost=0.00..16.03 rows=259 width=0)
>            Index Cond: ((start_time >= '2019-01-01 04:00:00-05'::timestamp with time zone) AND (start_time <= '2019-01-01 05:00:00-05'::timestamp with time zone))
>         -> Bitmap Index Scan on eric_enodeb_cell_20190102_idx (cost=0.00..16.84 rows=6723 width=0)
>            Index Cond: ((start_time >= '2019-01-02 04:00:00-05'::timestamp with time zone) AND (start_time <= '2019-01-02 05:00:00-05'::timestamp with time zone))
>
> Is there some reason why the partition constraints aren't excluding any of the
> index scans?

Yeah, we don't do anything to remove base quals that are redundant due to the partition constraint. There was a patch [1] to try and fix this, but it's not seen any recent activity.

[1] https://commitfest.postgresql.org/19/1264/

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
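[Until a fix along the lines of that patch lands, one workaround is to split the OR into a UNION ALL by hand, so that each arm prunes to a single partition and its index scan carries only the range that partition can satisfy. A sketch against the table above:]

```sql
-- Each SELECT touches one day's range, so partition pruning leaves one
-- partition per arm and each index is probed with only its own range.
SELECT * FROM eric_enodeb_cell_metrics
WHERE start_time BETWEEN '2019-01-01 04:00' AND '2019-01-01 05:00'
UNION ALL
SELECT * FROM eric_enodeb_cell_metrics
WHERE start_time BETWEEN '2019-01-02 04:00' AND '2019-01-02 05:00';
```

This trades a slightly longer query for avoiding the redundant per-partition BitmapOr arms in the plan above.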
Re: Perplexing, regular decline in performance
On Wed, Jun 26, 2019 at 03:00:43PM -0400, Hugh Ranalli wrote:
> Pausing the admin queries isn't an option in our environment, especially as
> the issue reveals itself over the course of days, not minutes or hours.

Perhaps you can pause it for a short while at EOW and see if there's a dramatic improvement?

>  ?column?               |  count  | count | datname | coalesce                | toast          | dirtyfrac              | avg
> ------------------------+---------+-------+---------+-------------------------+----------------+------------------------+--------------------
>  0.24904101286779650995 | 1044545 |     0 | mydb    | position                |                | 0.                     | 4.8035517857057379
>  0.16701241622795295199 |  700495 |     0 | mydb    | stat_position_click     |                | 0.                     | 1.9870234619804567
>  0.09935032779251879171 |  416702 |  6964 | mydb    | pg_toast_19788          | harvested_job  | 0.01671218280689797505 | 1.9346079452462431
>  0.06979762146872315533 |  292750 |     0 | mydb    | url                     |                | 0.                     | 4.9627873612297182
>  0.03795774662998486745 |  159205 |     0 | mydb    | stat_sponsored_position |                | 0.                     | 1.8412361420809648
>  0.02923155381784048663 |  122605 |     0 | mydb    | pg_toast_20174          | page           | 0.                     | 3.0259532645487541
>  0.02755283459406156353 |  115564 |     0 | mydb    | location                |                | 0.                     | 4.9953532241874632
>  0.02015273698468076320 |   84526 |  1122 | mydb    | harvested_job           |                | 0.01327402219435439983 | 4.9922154130090150
>  0.01913348905375406298 |   80251 |     0 | mydb    | pg_toast_20257          | position_index | 0.                     | 4.9880001495308470
>
> harvested_job is the rapidly growing "problematic" table I am talking
> about. page is the 355 GB table that gets referenced on the public
> searches. I'll google, but is there a place I should look to understand
> what I am seeing here?

I should label the columns:

| buffer_fraction | nbuffers | ndirty | datname | relname | toast | dirtyfrac | avgusage

It looks like harvested_job is possibly being index scanned, and its toast table is using up many buffers. At the EOW, maybe that number comes at the expense of more important data. You could check pg_stat_user_tables/indexes for stats on that. Possibly you could make use of index-only scans using covering indexes (pg11 supports INCLUDE). Or maybe it's just too big (maybe it should be partitioned, or maybe the index should be repacked?).

> Also, should pg_buffercache perhaps be run at the beginning and end of the
> week, to see if there is a significant difference?

Yes; buffercache can be pretty volatile, so I'd save it numerous times each at the beginning and end of the week.

> > Could you send the query plan for the slow (customer-facing) queries?
> > https://wiki.postgresql.org/wiki/Slow_Query_Questions#EXPLAIN_.28ANALYZE.2C_BUFFERS.29.2C_not_just_EXPLAIN
>
> I can, but can I ask why this would matter?

My very tentative guess is that harvested_job itself isn't the issue, but some other, third thing is the issue, which also increases (at least roughly) with time, the same as that table. It'd help to see the buffer cache hit rate for that query (and its different query plan nodes), at the beginning of the week and at EOW.

Justin
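[For reference, a query in the spirit of the one that produced the numbers above - an assumed reconstruction from the labeled columns, not necessarily the exact query Hugh ran. Note that joining pg_class by relfilenode only resolves relations in the current database:]

```sql
-- Per-relation share of shared_buffers, buffer and dirty-buffer counts,
-- dirty fraction, and average usage count, from the pg_buffercache extension.
SELECT count(*)::numeric / (SELECT count(*) FROM pg_buffercache) AS buffer_fraction,
       count(*)                          AS nbuffers,
       count(*) FILTER (WHERE b.isdirty) AS ndirty,
       d.datname,
       c.relname,
       count(*) FILTER (WHERE b.isdirty)::numeric / count(*) AS dirtyfrac,
       avg(b.usagecount)                 AS avgusage
FROM pg_buffercache b
JOIN pg_database d ON d.oid = b.reldatabase
LEFT JOIN pg_class c ON c.relfilenode = b.relfilenode
GROUP BY d.datname, c.relname
ORDER BY buffer_fraction DESC
LIMIT 10;
```

Comparing several such snapshots at the start and end of the week, as Justin suggests, should show whether the toast table is steadily crowding out the page table's buffers.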
