Re: dirty_ratio & dirty_background_ratio settings with huge memory
On 03/06/2018 01:16 PM, Laurenz Albe wrote:
> pinker wrote:
>> I've got a customer with really huge RAM, now it's:
>>
>>                       total       used       free     shared    buffers     cached
>> Mem:                3102111    3052596      49515     208801        992    2961185
>> -/+ buffers/cache:              90418    3011693
>> Swap:                  8191          1       8190
>>
>> (free -m)
>>
>> and before it was twice as much (6TB).
>>
>> I'm trying to figure out how to set the dirty_ratio & dirty_background_ratio
>> parameters. Even for a normal-sized server those defaults are too high, but
>> here they would be ridiculously huge, respectively 1.24TB and 300GB.
>> I'm thinking about 1 percent for dirty_ratio, and not using
>> dirty_background_ratio because it's not possible to set it below 1%, but
>> setting dirty_background_bytes instead to about 1-2GB.
>> Maybe someone has got other experience with RAM of this size and those
>> settings?
>
> Yes, you should set vm.dirty_background_bytes and vm.dirty_bytes
> and not use the *_ratio settings.
>
> 2 GB for vm.dirty_background_bytes and 1 GB for vm.dirty_bytes sounds fine.
>

It should be the other way around: dirty_background_bytes < dirty_bytes.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
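For reference, a minimal sketch of what the corrected values could look like
as kernel settings - the file name is arbitrary and the byte values are only
the figures discussed above, not a recommendation:

    # /etc/sysctl.d/30-pgsql-dirty.conf (example file name)
    vm.dirty_background_bytes = 1073741824   # 1 GB - background writeback starts here
    vm.dirty_bytes = 2147483648              # 2 GB - hard limit; must stay above the background value

    # apply without a reboot:
    #   sysctl -p /etc/sysctl.d/30-pgsql-dirty.conf

Note that setting the *_bytes variants automatically zeroes the corresponding
*_ratio settings.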
Re: Postgresql 10.3, query never completes if LIMIT clause is specified and parallel processing is on
On 03/21/2018 05:09 PM, Alessandro Aste wrote: > Hi there, we are using postgresql 10.3 and we're facing an issue with a > query. The query (full query below) completes only when: > > 1 - LIMIT 10 is removed > or > 2 - show max_parallel_workers_per_gather is set to 0, so parallel > processing is disabled. > > With max_parallel_workers_per_gather set to the default value (8) I'm > not even able to get the query plan. > > Notes: > > * We're experiencing the issue in any server of ours but I've > reproduced the issue in a fresh restored database with full > vacuum/reindex of the tables. > * We didn't touch any parameter concering the parallel processing, > we're running the defaults: > > > cmdstaging=# show max_parallel_workers_per_gather ; > max_parallel_workers_per_gather > - > 8 > (1 row) > > cmdstaging=# show max_worker_processes ; > max_worker_processes > -- > 8 > (1 row) > > cmdstaging=# show max_parallel_workers; > max_parallel_workers > -- > 8 > (1 row) > > > > > The query completes only omitting the LIMIT clause or when I disable > parallel processing: > > id | vendor_id | gaa | pop_name | pop_status | pop_location | > pop_provider_id | pop_provider | pop_street | pop_city | pop > _postal_code | pop_state | pop_country | pop_country_id > +---+-+---++--+-+--+-+--+ > -+---+-+ > 684807 | 12346 | | GTT/POP/LON1T | Active | LON1T | > 12288 | Telehouse UK | 14 Coriander Avenue | London | E14 > 2AA | | GB | 219 > (1 row) > > Time: 4374.759 ms (00:04.375) > cmdstaging=# show max_parallel_workers_per_gather ; > max_parallel_workers_per_gather > - > 0 > (1 row) > > Time: 0.097 ms > > > Otherwise it just keep running for forever. > When you say "running forever" is it actually using CPU, or does it get stuck on something? > > This is the full query: > > > SELECT * FROM ( > SELECT > seg.circuit_id AS id, > vendor_gtt_pop.vendor_id, > CASE WHEN vendor_gtt_pop.push_to_gaa = 1 THEN 'Y' END as gaa, > pop.gii_circuitid AS pop_name, > cst.label AS pop_status, > seg.a_company_name AS pop_location, > seg.vendor_id AS pop_provider_id, > seg.vendor_name AS pop_provider, > cs.address1 AS pop_street, > cs.city AS pop_city, > cs.postal_code AS pop_postal_code, > cs.state AS pop_state, > csc.code AS pop_country, > cs.country_id AS pop_country_id > FROM ( > SELECT c.gii_circuitid, max(so.id <http://so.id>) AS service_order_id > FROM service_order so > join circuit c on c.product_id=so.product_id > join master_service_order mso on mso.id > <http://mso.id>=so.master_service_order_id > WHERE NOT (so.ordertype_id = 2 AND so.status <> 999) AND > NOT (so.ordertype_id = 3 AND so.status <> 999) AND > c.status >= 20 AND > c.status not in (160,999) AND > mso.client_id=11615 AND > c.service_description=28 AND > c.status!=160 > GROUP BY c.gii_circuitid > ) pop > JOIN service_order so ON so.id <http://so.id> = pop.service_order_id > left JOIN client_site cs on cs.id <http://cs.id>=so.a_site_id > left JOIN country csc on csc.id <http://csc.id>=cs.country_id > JOIN circuit c ON so.product_id=c.product_id > JOIN circuit_status cst ON cst.id <http://cst.id>=c.status > JOIN ( > SELECT c.id <http://c.id> AS circuit_id, sg.id <http://sg.id> AS segment_id, > c.pop_support_vendor_id AS vendor_id, > v.name <http://v.name> AS vendor_name, sg.a_company_name > FROM segment sg > JOIN circuit_layout cl ON cl.segment_id = sg.id <http://sg.id> AND cl.ordinal > = 1 > JOIN circuit c ON c.id <http://c.id> = cl.circuit_id > JOIN vendor v ON v.id <http://v.id> = c.pop_support_vendor_id > ) seg ON seg.circuit_id = c.id <http://c.id> > JOIN 
vendor_gtt_pop on vendor_gtt_pop.gtt_pop_id = seg.circuit_id > ) foo where vendor_id = 12346 AND (1=1) ORDER BY pop_name ASC,id LIMIT 10 > > > > Execution plan with max_parallel_workers_per_gather =0 , > max_parallel_workers_per_gather =8 and no LIMIT clause : > > We really need to see the execution plan that causes issues, i.e. max_parallel_workers_per_gather=8 with LIMIT clause. Plain explain (without analyze), at least. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Postgresql 10.3, query never completes if LIMIT clause is specified and parallel processing is on
On 03/21/2018 08:44 PM, Alessandro Aste wrote:
> Thanks for your reply Tomas. The query just got stuck forever. I
> observed no CPU spikes, it is currently running and I see 89 of the CPU
> idle. I'm using a relatively powerful machine (120 GB RAM, 56 CPUs).
>

That doesn't really answer the question, I'm afraid. I suppose "89 of
CPU" means that it's 89% idle in total, but 11% with 56 CPUs still means
about 6 cores 100% busy. But maybe you meant something else?

Is there something else running on the machine? If you look at "top",
are the processes (the one you're connected to and the parallel workers)
doing something on the CPU?

>
> Plain analyze as requested:
>

I don't see anything obviously broken with the query plan, and it's
difficult to compare with the other plans because they are quite
different. But there's one interesting part of the plan:

  Limit  (cost=253523.56..253523.57 rows=1 width=176)
    ->  Sort  (cost=253523.56..253523.57 rows=1 width=176)
          Sort Key: c_2.gii_circuitid, c_1.id
          ->  Nested Loop  (cost=33190.89..253523.55 rows=1 width=176)
                Join Filter: (c_1.id = c.id)
                ->  Nested Loop  (cost=31724.87..31736.29 rows=1 ...)
                      ...
                ->  Gather  (cost=1466.02..221787.23 rows=3 width=75)
                      Workers Planned: 5
                      ->  Hash Join  (cost=466.02..220786.93 rows=1 ...)
                            ...

That is, there's a Gather on the inner side of a Nested Loop. I wonder
if that might cause issues in case of an under-estimate (in which case
we'd be restarting the Gather many times) ...

BTW one of the plans you sent earlier is incomplete, because it ends
like this:

  ->  Nested Loop  (cost=42469.41..42480.82 rows=1 width=85) (...)
        Join Filter: (c.status = cst.id)
  Time: 3016.688 ms (00:03.017)

That is, it's missing the part below the join.

That being said, I'm not sure what the issue is here. Can you prepare a
self-contained test case that we might use to reproduce the issue? For
example by dumping the relevant parts of the tables?

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Postgresql 10.3, query never completes if LIMIT clause is specified and parallel processing is on
On 03/22/2018 11:33 AM, Alessandro Aste wrote:
> Tomas, thank you. This machine is a bare metal server running only a
> staging postgresql 10.3 instance. Nobody is using it beside me.
>
> I'm attaching 4 files.
>
> every_30_seconds_top_stats_during_query.txt - this is a capture of
> the top command every 30 seconds (more or less) for 10+ minutes while
> I'm running the query. Let me know if this helps to answer your question.
> EXPLAIN_WITH_LIMIT_AND_max_parallel_workers_per_gather_SET_TO_0.txt -
> query plan with the full query and max_parallel_workers_per_gather
> forced to 0. Full output.
> EXPLAIN_WITH_LIMIT_AND_DEFAULT_PARALLEL_PROCESSING_SETTINGS.txt -
> query plan with the full query and default parallel processing settings.
> Full output.
> EXPLAIN_WITHOUT_LIMIT_AND_DEFAULT_PARALLEL_PROCESSING_SETTINGS.txt -
> query plan of the query omitting the LIMIT clause and default parallel
> processing settings. Full output.
>

OK. Looking at the top output, I see this:

     PID USER      VIRT  RES   SHR  S %CPU %MEM     TIME+  COMMAND
  104880 postgres 30.8g 1.9g  1.9g  R 92.0  1.5  15:15.60  postmaster
  111732 postgres 30.8g 476m  473m  R 88.2  0.4   0:00.47  postmaster
  111730 postgres 30.8g 473m  470m  R 86.4  0.4   0:00.46  postmaster
  111731 postgres 30.8g 476m  473m  R 86.4  0.4   0:00.46  postmaster
  111733 postgres 30.8g 476m  473m  R 86.4  0.4   0:00.46  postmaster
  111734 postgres 30.8g 476m  473m  R 86.4  0.4   0:00.46  postmaster
  111728 root     15824 1912   828  R  3.8  0.0   0:00.04  top

That means it certainly is not stuck, it's simply doing a lot of work on
the CPU. The question is why and what it's doing ...

Can you collect some CPU profiles using perf? There's a howto here:

  https://wiki.postgresql.org/wiki/Profiling_with_perf

But in short - install perf, install the debuginfo packages for postgres,
and then do

  perf record -a -g -s sleep 60

while running the query. Once the profile data is collected, do

  perf report > report.txt

and share the report.txt with us (well, if it's very large you may need
to send only the first 1000 lines or so).

That should tell us in which functions most of the time is spent. That
will give us some hints, hopefully.

>
> For what concerns the self-contained test case - I'll do my best to
> prepare it.
>

Yeah, that would be helpful.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Postgresql 10.3, query never completes if LIMIT clause is specified and parallel processing is on
On 03/22/2018 11:29 PM, Alessandro Aste wrote: > Thanks Tomas. We're currently building postgres from source. In order to > enable symbols, you want me to re-configure postres with --enable-debug > then run perf? > Yes. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Postgresql 10.3, query never completes if LIMIT clause is specified and parallel processing is on
Hi, the perf profiles look really weird - clearly, there's a lot of lock contention, because the top item is this 13.49%13.40% postmaster postgres [.] LWLockAcquire | ---LWLockAcquire That's a sign of lock contention - not sure which one, though. And then there's heap_hot_search_buffer 12.65% 7.92% postmaster postgres [.] heap_hot_search_buffer | ---heap_hot_search_buffer So either there's a table with many HOT-updated rows (repeatedly), or we search the HOT chains very frequently for some reason. Considering it only affects the non-LIMIT case, I'd guess the latter. There also seems to be quite a few page faults, for some reason. Trying to reproduce this without the actual data is rather futile. We need some sort of reproducer (sample data to test the query on). regards On 03/28/2018 10:30 AM, Alessandro Aste wrote: > Hello, any news ? > > Thank you, > > Alessandro. > > On Fri, Mar 23, 2018 at 8:22 PM, Alessandro Aste > mailto:alessandro.a...@gmail.com>> wrote: > > PS , in the meanwhile I discovered a 2nd workaround(beside disabling > parallel processing) . I added offset 0 to the subquery , and, > according to the documentation, “OFFSET 0 is the same as omitting > the OFFSET clause” > - https://www.postgresql.org/docs/current/static/queries-limit.html > <https://www.postgresql.org/docs/current/static/queries-limit.html> > > cmd3dev=# show max_parallel_workers_per_gather ; > > *max_parallel_workers_per_gather* > > *-* > > *8* > > (1 row) > > > > cmd3dev=# \timing > > Timing is on. > > cmd3dev=# SELECT * FROM (SELECT seg.circuit_id AS id, > vendor_gtt_pop.vendor_id, CASE WHEN vendor_gtt_pop.push_to_gaa = 1 > THEN 'Y' END as gaa, pop.gii_circuitid AS pop_name, cst.label AS > pop_status, seg.a_company_name AS pop_location, seg.vendor_id AS > pop_provider_id, seg.vendor_name AS pop_provider, cs.address1 AS > pop_street, cs.city AS pop_city, cs.postal_code AS pop_postal_code, > cs.state AS pop_state, csc.code AS pop_country, cs.country_id AS > pop_country_id FROM ( SELECT c.gii_circuitid, max(so.id > <http://so.id>) AS service_order_id FROM service_order so join > circuit c on c.product_id=so.product_id join master_service_order > mso on mso.id <http://mso.id>=so.master_service_order_id WHERE NOT > (so.ordertype_id = 2 AND so.status <> 999) AND NOT (so.ordertype_id > = 3 AND so.status <> 999) AND c.status >= 20 AND c.status not in > (160,999) AND mso.client_id=11615 AND c.service_description=28 AND > c.status!=160 GROUP BY c.gii_circuitid ) pop JOIN service_order so > ON so.id <http://so.id> = pop.service_order_id left JOIN client_site > cs on cs.id <http://cs.id>=so.a_site_id left JOIN country csc on > csc.id <http://csc.id>=cs.country_id JOIN circuit c ON > so.product_id=c.product_id JOIN circuit_status cst ON cst.id > <http://cst.id>=c.status JOIN ( SELECT c.id <http://c.id> AS > circuit_id, sg.id <http://sg.id> AS segment_id, > c.pop_support_vendor_id AS vendor_id, v.name <http://v.name> AS > vendor_name, sg.a_company_name FROM segment sg JOIN circuit_layout > cl ON cl.segment_id = sg.id <http://sg.id> AND cl.ordinal = 1 JOIN > circuit c ON c.id <http://c.id> = cl.circuit_id JOIN vendor v ON > v.id <http://v.id> = c.pop_support_vendor_id ) seg ON seg.circuit_id > = c.id <http://c.id> JOIN vendor_gtt_pop on > vendor_gtt_pop.gtt_pop_id = seg.circuit_id offset 0) foo where > vendor_id = 12346 AND (1=1) ORDER BY pop_name ASC,id desc LIMIT 10; > > id | vendor_id | gaa | pop_name | pop_status | > pop_location | pop_provider_id | pop_provider | pop_street > | pop_city | pop > > _postal_code | 
pop_state | pop_country | pop_country_id > > > +---+-+---++--+-+--+-+--+ > > -+---+-+ > > 684807 | 12346 | Y | GTT/POP/LON1T | Active | LON1T > | 12288 | Telehouse UK | 14 Coriander Avenue | London | E14 > > 2AA | | GB | 219 > > (1 row) > > > > *Time: 2245.073 ms (00:02.245)* > > > > > On Fri, Mar 23, 2018 at 9:31 AM, Alessandro Aste > mailto:alessandro.a...@gmail.com>> wrote: >
Re: Seems like there is an issue with reltuples showing twice the number of rows
On 04/03/2018 11:14 AM, Ranjith Ramachandra wrote:
> I am relying on reltuples in my web app to get fast row counts.
>
> This was recommended by this article to get fast approx row counts:
> https://wiki.postgresql.org/wiki/Count_estimate
>
> However for some tables I am getting twice as many values when I try to
> do this. I did some more research and came up with this query.
>
>   select reltuples, n_live_tup, n_dead_tup
>     from pg_stat_user_tables join pg_class using (relname)
>    where relname = 'main_csv_0f247511f5c247c8813ba3cec90c6ae1_yf34fbb38d';
>
> it returns
>
>    reltuples  | n_live_tup | n_dead_tup
>   ------------+------------+------------
>    2.7209e+06 |    1360448 |    1360448
>
> If I run analyze main_csv_0f247511f5c247c8813ba3cec90c6ae1_yf34fbb38d
> and I run the same query again,
>
>    reltuples   | n_live_tup | n_dead_tup
>   -------------+------------+------------
>    1.36045e+06 |    1360448 |    1360448
>
> But after some time the value goes back to being double the value. This
> is causing me a lot of problems since this inaccuracy does not make any
> sense to me.
>

There was a difference between VACUUM and ANALYZE in handling recently
dead rows (essentially deleted rows that can't be removed yet), causing
similar changes to reltuples. Essentially, if you do VACUUM and ANALYZE,
each may set reltuples to a rather different estimate. That is fixed now
and should be in the next minor release.

Hard to say if this is a case of that, but judging by the number of dead
tuples chances are it is.

It's probably better to use n_live_tup instead, though. I'd say that's
closer to the "live tuples" definition.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
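To spell out that last suggestion, a sketch of the approximate-count query
based on n_live_tup rather than reltuples (using the table name from the
example above):

    SELECT n_live_tup
      FROM pg_stat_user_tables
     WHERE relname = 'main_csv_0f247511f5c247c8813ba3cec90c6ae1_yf34fbb38d';

This is the statistics collector's current estimate of live rows, so it is
not subject to the VACUUM vs. ANALYZE reltuples discrepancy described above.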
Re: ERROR: found multixact from before relminmxid
On 04/06/2018 02:09 AM, Alexandre Arruda wrote: > Hi, > > Some time ago, I had this errors frequently showed in logs after some > autovacuum in some tables(pg 9.6). VACUUM FULL or CLUSTER in this tables > show the same and not complete the tasks (showed by some table bloat > select). > Then, I did a full dump/restore into a new version (10.2) and everything > is ok for a couple of months. Now, I have errors like this again: > > db1=# cluster pc44t; > > ERROR: found multixact 134100944 from before relminmxid 192042633 > > > > Like before, the only way to make the errors to desapear is by > dump/reload the whole table. > > > Thanks for any help. > That's going to be hard, unless you still have the cluster around. This surely seems like some sort of data corruption issue, but without being able to inspect the data pages it's nearly impossible to determine what went wrong. We'd also need more information about what happened to the hardware and cluster before the issues started to appear - crashes, hardware issues. And so on. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: ERROR: found multixact from before relminmxid
On 04/06/2018 04:29 PM, Alexandre Arruda wrote: > 2018-04-06 9:39 GMT-03:00 Tomas Vondra <mailto:tomas.von...@2ndquadrant.com>>: > > > > On 04/06/2018 02:09 AM, Alexandre Arruda wrote: > > Hi, > > > > Some time ago, I had this errors frequently showed in logs after some > > autovacuum in some tables(pg 9.6). VACUUM FULL or CLUSTER in this tables > > show the same and not complete the tasks (showed by some table bloat > > select). > > Then, I did a full dump/restore into a new version (10.2) and everything > > is ok for a couple of months. Now, I have errors like this again: > > > > db1=# cluster pc44t; > > > > ERROR: found multixact 134100944 from before relminmxid 192042633 > > > > > > > > Like before, the only way to make the errors to desapear is by > > dump/reload the whole table. > > > > > > Thanks for any help. > > > > That's going to be hard, unless you still have the cluster around. > > This surely seems like some sort of data corruption issue, but without > being able to inspect the data pages it's nearly impossible to determine > what went wrong. > > We'd also need more information about what happened to the hardware and > cluster before the issues started to appear - crashes, hardware issues. > And so on. > > regards > > -- > Tomas Vondra http://www.2ndQuadrant.com > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services > > > Hi Tomas, > The old cluster are gone, unfortunatly. > > This server is a 128GB ECC RAM with a dual redundant hba fiber channel > connect to a sotorage with Raid 6 and I don't have (apparently) any > errors reported. > Yesterday I did a test with one table: some sum aggragates, count(*), > etc, then dump/reload and repeat the tests the results (of querys) are > the same, regarding the vacuum problem > thats disapeared. > I'm not sure I understand correctly. So you can reproduce the issue? If yes, how can you share the scripts/data you use (and how large is it)? If we could reproduce it locally, it would make the investigation much easier. BTW you mentioned you're using PostgreSQL 9.6 - which minor version, specifically? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: ERROR: found multixact from before relminmxid
On 04/09/2018 01:49 PM, Alexandre Arruda wrote: > > > 2018-04-06 13:11 GMT-03:00 Tomas Vondra <mailto:tomas.von...@2ndquadrant.com>>: > > > > On 04/06/2018 04:29 PM, Alexandre Arruda wrote: > > 2018-04-06 9:39 GMT-03:00 Tomas Vondra <mailto:tomas.von...@2ndquadrant.com> > > <mailto:tomas.von...@2ndquadrant.com > <mailto:tomas.von...@2ndquadrant.com>>>: > > > > > > > > On 04/06/2018 02:09 AM, Alexandre Arruda wrote: > > > Hi, > > > > > > Some time ago, I had this errors frequently showed in logs > after some > > > autovacuum in some tables(pg 9.6). VACUUM FULL or CLUSTER in > this tables > > > show the same and not complete the tasks (showed by some > table bloat > > > select). > > > Then, I did a full dump/restore into a new version (10.2) > and everything > > > is ok for a couple of months. Now, I have errors like this > again: > > > > > > db1=# cluster pc44t; > > > > > > ERROR: found multixact 134100944 from before relminmxid > 192042633 > > > > > > > > > > > > Like before, the only way to make the errors to desapear is by > > > dump/reload the whole table. > > > > > > > > > Thanks for any help. > > > > > > > That's going to be hard, unless you still have the cluster around. > > > > This surely seems like some sort of data corruption issue, but > without > > being able to inspect the data pages it's nearly impossible to > determine > > what went wrong. > > > > We'd also need more information about what happened to the > hardware and > > cluster before the issues started to appear - crashes, > hardware issues. > > And so on. > > > > regards > > > > -- > > Tomas Vondra http://www.2ndQuadrant.com > > PostgreSQL Development, 24x7 Support, Remote DBA, Training & > Services > > > > > > Hi Tomas, > > The old cluster are gone, unfortunatly. > > > > This server is a 128GB ECC RAM with a dual redundant hba fiber channel > > connect to a sotorage with Raid 6 and I don't have (apparently) any > > errors reported. > > Yesterday I did a test with one table: some sum aggragates, count(*), > > etc, then dump/reload and repeat the tests the results (of querys) are > > the same, regarding the vacuum problem > > thats disapeared. > > > > I'm not sure I understand correctly. So you can reproduce the issue? If > yes, how can you share the scripts/data you use (and how large is it)? > If we could reproduce it locally, it would make the investigation much > easier. > > BTW you mentioned you're using PostgreSQL 9.6 - which minor version, > specifically? > > > regards > > > Hi Tomas, > > No, I can't reproduce. What I did is a simple way to "validate" the > current table data to see if a dump/reload > preserve them. Old postgresql was 9.6.5. The problem returns now in new > 10.3 installation. > I'm confused. First you say you can't reproduce the issue, then you say it got back on a new installation. So which is it? > There is a way to correct this tables without a dump/reload ? > > I'm thinking to reinstall cluster doing a initdb --data-checksums, > but I'm affraid about a severe performance impact. > The performance impact should be non-measurable for most workloads. It might be a couple of percent on certain workloads/hardware, but that's probably not your case. The bigger question is whether this can actually detect the issue. If it's due to an storage issue, then perhaps yes. But if you only see multixact issues consistently and nothing else, it might easily be a PostgreSQL bug (in which case the checksum will be correct). 
regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Shared buffers increased but cache hit ratio is still 85%
On 07/18/2018 10:43 AM, Andreas Kretschmer wrote:
>
> Am 18.07.2018 um 10:26 schrieb Hans Schou:
>> Am I doing something wrong or should some history be cleared?
>
> Reset the stats for that database. You can check the date of the last
> reset with:
>
>   select stats_reset from pg_stat_database where datname = 'database_name';
>
> and reset it with pg_stat_reset():
>
>   Reset all statistics counters for the current database to zero
>   (requires superuser privileges by default, but EXECUTE for this
>   function can be granted to others.)
>

It might be better to note the current values of the counters somewhere,
and compute a delta later (and use that to compute the cache hit ratio).

The issue is that pg_stat_reset() throws away all sorts of interesting
and important stats, including those driving autovacuum/autoanalyze.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
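A minimal sketch of the "note the counters, compute a delta later" approach -
the snapshot table name is made up for the example:

    -- take a snapshot of the relevant counters
    CREATE TABLE stats_snapshot AS
      SELECT now() AS taken_at, datname, blks_hit, blks_read
        FROM pg_stat_database
       WHERE datname = current_database();

    -- later: cache hit ratio since the snapshot, without resetting anything
    SELECT (d.blks_hit - s.blks_hit)::numeric
           / NULLIF((d.blks_hit - s.blks_hit) + (d.blks_read - s.blks_read), 0)
             AS cache_hit_ratio
      FROM pg_stat_database d
      JOIN stats_snapshot s USING (datname)
     WHERE d.datname = current_database();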
Re: is there any adverse effect on DB if I set autovacuum scale factor to zero?
On 08/13/2018 11:07 AM, Raghavendra Rao J S V wrote:
> Hi All,
>
> We are using postgres 9.2 on a CentOS operating system. We have around
> 1300+ tables. We have the following autovacuum settings enabled. Still,
> a few of the tables (84 tables) which are always busy are not vacuumed.
> Dead tuples in those tables are more than 5000. Due to that the tables
> are bloating and we observed performance degradation in a few areas.
>

You don't say how large the tables are, so it's impossible to say whether
5000 dead tuples is excessive or not. IMHO it's a negligible amount and
should not lead to excessive bloat or issues.

A certain amount of waste is expected - it's a trade-off between immediate
and delayed cleanup. If you delay the cleanup a bit, it's going to be more
efficient overall.

It's also unclear why the tables are not vacuumed - it may easily be due
to all the autovacuum workers being constantly busy, unable to clean up
all tables in a timely manner. In that case lowering the threshold is not
going to help, on the contrary.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
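One way to put those numbers in context before changing any thresholds - a
sketch that lists the busiest tables with their sizes, dead-tuple counts and
last (auto)vacuum times:

    SELECT relname,
           pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
           n_live_tup,
           n_dead_tup,
           last_vacuum,
           last_autovacuum
      FROM pg_stat_user_tables
     ORDER BY n_dead_tup DESC
     LIMIT 20;

If last_autovacuum is old (or NULL) for many of these tables while the
workers are busy elsewhere, the bottleneck is autovacuum throughput rather
than the thresholds.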
Re: is there any adverse effect on DB if I set autovacuum scale factor to zero?
On 08/13/2018 03:41 PM, Raghavendra Rao J S V wrote:
> Hi Tomas,
>
> Thank you very much for your response.
>
> As we know, a table becomes a candidate for autovacuum based on the
> formula below.
>
>   Autovacuum VACUUM threshold for a table =
>     autovacuum_vacuum_scale_factor * number of tuples + autovacuum_vacuum_threshold
>
> Current settings in my database are as follows:
>
>   autovacuum_vacuum_scale_factor = 0.1
>   autovacuum_vacuum_threshold = 40
>
> Due to the above formula the dead tuples are accumulating based on the
> number of live tuples, as shown by the output of the query below.
>
>   select relname, n_live_tup, n_dead_tup, (n_live_tup*.1+40) expected_to_autovacuum, *
>     from pg_stat_user_tables
>    where n_dead_tup > 800
>    order by n_live_tup desc
>    limit 100;
>
> In order to avoid the dead tuple accumulation I would like to change the
> autovacuum settings in "postgresql.conf" as below:
>
>   autovacuum_vacuum_scale_factor = 0.01
>   autovacuum_vacuum_threshold = 100
>

OK, so the tables apparently have enough dead tuples to trigger vacuum.
That means the autovacuum throughput is insufficient to do all the
cleanup. If you lower the scale factor, the amount of cleanup will
*increase* (more tables being eligible for cleanup), making it less
likely autovacuum can keep up. You need to increase the throughput, by
increasing vacuum_cost_limit or something like that.

> Kindly guide me with your views. Does it cause any adverse effect on the DB?
>

Well, it forces the database to do more stuff / more often, so it may
have an adverse impact, of course. It's hard to say if it's going to be
a win overall, because we don't know how serious the bloat is.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
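To see what the current throttling looks like before touching the scale
factor - a sketch (on 9.2 these are changed in postgresql.conf followed by a
reload; there is no ALTER SYSTEM yet):

    SELECT name, setting, unit
      FROM pg_settings
     WHERE name IN ('autovacuum_vacuum_cost_limit',  -- -1 means "use vacuum_cost_limit" (default 200)
                    'vacuum_cost_limit',
                    'autovacuum_vacuum_cost_delay',  -- default 20ms on 9.2
                    'autovacuum_max_workers',
                    'autovacuum_naptime');

Raising autovacuum_vacuum_cost_limit and/or lowering
autovacuum_vacuum_cost_delay is what actually increases the total cleanup
throughput shared by the workers.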
Re: Fwd: is there any adverse effect on DB if I set autovacuum scale factor to zero?
On 08/13/2018 04:24 PM, Ashu Pachauri wrote: + pgsql-general Thanks and Regards, Ashu Pachauri -- Forwarded message - From: *Ashu Pachauri* mailto:ashu210...@gmail.com>> Date: Mon, Aug 13, 2018 at 7:53 PM Subject: Re: is there any adverse effect on DB if I set autovacuum scale factor to zero? To: mailto:raghavendra...@gmail.com>> The way I see *autovacuum_vacuum_scale_factor* is not in terms of absolute number but as the percentage of any table that can consist of updated / deleted tuples to make it eligible for vacuuming. A factor of 0.1 ensures that your tables would be eligible for vacuuming if more than 10% of the tuples are deleted/updated. 1. If you think that 10% is too high for you in terms of storage cost, you can decrease the number or set it to zero. But, I would advise to increase the value of *autovacuum_vacuum_threshold* to something reasonable if you do that, otherwise you pay the CPU cost frequent vacuuming across all tables. 2. However, if your issue is not the fixed 10% overhead but the lack of throughput i.e. you see the number of deleted/updated tuples keeps increasing in an unbounded fashion, the right way to deal with it is a) Having higher value of *autovacuum_max_workers* b) lower value for *autovacuum_naptime*. Increasing autovacuum_max_workers is unlikely to solve the issue with throughput, because all the workers are throttled together - there's a limit on the amount of work that can be done per second. Increasing the number of workers is akin to allowing more cars on a highway, but also lowering the speed limit. You need to increase the limit on amount of work, and lowering naptime is one way to do that. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Code of Conduct
On 09/18/2018 01:47 PM, James Keener wrote: > following a long consultation process It's not a consultation if any dissenting voice is simply ignored. Don't sugar-coat or politicize it like this -- it was rammed down everyone's throats. That is core's right, but don't act as everyone's opinions and concerns were taken into consideration. I respectfully disagree. I'm not sure which dissenting voices you think were ignored, but from what I've observed in the various CoC threads the core team took the time to respond to all comments. That does not necessarily mean the resulting CoC makes everyone happy, but unfortunately that's not quite possible. And it does not mean it was not an honest consultation. IMO the core team did a good job in listening to comments, tweaking the wording and/or explaining the reasoning. Kudos to them. There are a good number of folks who are concerned that this CoC is overreaching and is ripe for abuse. Those concerns were always simply, plainly, and purposely ignored. No, they were not. There were multiple long discussions about exactly these dangers, You may dislike the outcome, but it was not ignored. > Please take time to read and understand the CoC, which is intended to ensure that PostgreSQL remains an open and enjoyable project for anyone to join and participate in. I sincerely hope so, and that it doesn't become a tool to enforce social ideology like in other groups I've been part of. Especially since this is the main place to come to get help for PostgreSQL and not a social club. Ultimately, it's a matter of trust that the CoC committee and core team apply the CoC in a careful and cautious way. Based on my personal experience with most of the people involved in both groups I'm not worried about this part. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Help to understand Actual Rows vs Plan Rows from the query planner output
Plan Rows: 50 > + > Plan Width: 0 > + > Actual Startup Time: 0.048 > + > Actual Total Time: 0.048 > + > Actual Rows: 1 > + > Actual Loops: 1 > + > Index Cond: "(lower((city)::text) = 'melanyfort'::text)" > + > - Node Type: "Bitmap Index Scan" > + > Parent Relationship: "Member" > + > Parallel Aware: false > + > Index Name: "lower_state_users_idx" > + > Startup Cost: 0.00 > + > Total Cost: 5.87 > + > Plan Rows: 211 > + > Plan Width: 0 > + > Actual Startup Time: 0.102 > + > Actual Total Time: 0.102 > + > Actual Rows: 211 > + > Actual Loops: 1 > + > Index Cond: "(lower((state)::text) = 'ohio'::text)" > + > Planning Time: 0.260 > + > Triggers: > + > Execution Time: 0.249 > (1 row) > > aruprakshit=# > > > -- > > In the first node type of "Bitmap Index Scan” on “users_lower_idx”, I > see the plan rows are 50, but actual rows it got 1. In the second node > type of "Bitmap Index Scan” on “ lower_state_users_idx”, I see the plan > rows are 211, and actual rows 211. Both are same. Based on what > conditions planner estimated the planned and actual rows count? > > > In node type “BitmapAnd”, I see again the actual rows 1, then why on the > final plan i.e. Node Type: "Bitmap Heap Scan” again planner estimated > rows 1? How does it counts these? What does the Loops count says us > about the query? > It's not very clear to me whether you're asking how the planner computes estimates in general, or how it computed these particular estimates (or what issues you see there). Perhaps this would give you at least some answers: https://www.postgresql.org/docs/11/static/row-estimation-examples.html regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Why the index is not used ?
Hi,

On 10/07/2018 08:32 PM, ROS Didier wrote:
> Hi Francisco
>
> Thank you for your remark.
> You're right, but it's the only procedure I found to make searches on
> encrypted fields with good response times (using an index)!
>

Unfortunately, that kinda invalidates the whole purpose of in-database
encryption - you'll have encrypted on-disk data in one place, and then
plaintext right next to it. If you're dealing with credit card numbers,
then you presumably care about PCI DSS, and this is likely a direct
violation of that.

> Regarding access to the file system, our servers are in protected
> network areas. Few people can connect to them.
>

Then why do you need encryption at all? If you assume access to the
filesystem / storage is protected, why do you bother with encryption?
What is your threat model?

> It's not the best solution, but we have data encryption needs and good
> performance needs too. I do not know how to do it except with the
> specified procedure.
>
> If anyone has any proposals to put this in place, I'm interested.
>

One thing you could do is hashing the value and then searching by the
hash. So aside from having the encrypted column you'll also have a short
hash, and you may use it in the query *together* with the original
condition. It does not need to be unique (in fact it should not be, to
make it impossible to reverse the hash), but it needs to have enough
distinct values to make the index efficient. Say, 10k values should be
enough, because that means 0.01% selectivity.

So the function might look like this, for example:

  CREATE FUNCTION cchash(text) RETURNS int AS $$
    SELECT abs(hashtext($1)) % 10000;
  $$ LANGUAGE sql IMMUTABLE;  -- must be IMMUTABLE to be usable in an index

and then be used like this:

  CREATE INDEX idx_cartedecredit_cc02 ON cartedecredit(cchash(cc));

and in the query

  SELECT pgp_sym_decrypt(cc, 'motdepasse') FROM cartedecredit
   WHERE pgp_sym_decrypt(cc, 'motdepasse') = 'test value 32'
     AND cchash(cc) = cchash('test value 32');

Obviously, this does not really solve the issues with having to pass the
password to the query, making it visible in pg_stat_activity, various
logs etc. Which is why people generally use FDE for the whole disk, which
is transparent and provides the same level of protection.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Why the index is not used ?
Hi,

On 10/08/2018 04:10 PM, ROS Didier wrote:
> Hi Tomas
>
> Thank you for your answer and recommendation, which is very interesting.
> I'm going to study the PCI DSS document right now.
>
> Here is my answer to your question:
>
>> What is your threat model?
>
> We want to prevent access to sensitive data for everyone except those
> who have the encryption key.
> In case of file theft, backup theft or dump theft, we do not want
> anyone to access sensitive data.
>

The thing is - encryption is not a panacea. The interesting question is
whether this improves security compared to simply using FDE and regular
access rights (which are grantable at the column level). Using those two
pieces properly may very well be a better defense than a poorly designed
encryption scheme - and based on this discussion, it does not seem very
polished / resilient.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: COPY command returns "ERROR: invalid XML content"
On Sun, Oct 06, 2019 at 08:45:40PM -0700, Konstantin Izmailov wrote:
> Hi,
> I'm using libpq (v10) to import lots of xml files into a PG10 table. I
> noticed that if the number of records imported exceeds 2100 then the
> following error is returned:
>
>   ERROR: invalid XML content
>   DETAIL: line 1: Couldn't find end of Start Tag timeBasedFileNamingAndTriggerin line 1
>   logFile.%d{-MM-dd}.%i.html
>

My guess is this is an issue/limitation in libxml2, which we use to parse
and process XML. What libxml2 version do you have installed?

Can you share an example of an XML document that reproduces the issue?

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Install postgres on rhel 7
On Mon, Oct 07, 2019 at 01:00:32AM -0400, Mageshwaran Janarthanam wrote:
> Hi Team,
>
> I have a question on the postgres install process. Tried to read the
> archives but couldn't find it.
>
> I wanted to install the postgres server via rpm and wanted to store the
> binaries under a non-default location like /pgbin.
>

The rpm packages (at least those available at yum.postgresql.org) are not
relocatable:

  rpm -qpi postgresql12-12.0-1PGDG.rhel7.x86_64.rpm | grep Relocations
  Relocations : (not relocatable)

So unfortunately, you'll have to build your own packages, or something
like that.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: PG 12 not yet for mac
On Mon, Oct 07, 2019 at 11:11:43AM -0400, Ravi Krishna wrote:
> https://postgresapp.com/downloads.html
>
> The link which says PG 12 is actually PG 11.
>

Not sure if the link is correct or not (it seems to point to
Postgres-2.3-12.dmg, which seems like it might be v12). But more
importantly, that's not a project/page managed by the PostgreSQL
community - you need to report the issue to the authors (most likely
through GitHub issues).

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: pgcrypto question
On Mon, Oct 07, 2019 at 12:05:16PM -0400, Erik Aronesty wrote:
> Currently, it is my understanding that the pgcrypto library requires the
> user to send a password or private key up to the server for decryption.
>

Correct. In the naive case the key is included in each SQL query, which
does have various issues. Bruce Momjian has a nice extension that allows
you to fix that by loading the key into backend memory:

  http://momjian.us/download/pgcryptokey/

> Is there a notion of a client-side encrypt/decrypt plugin when doing a
> postgres query? For example, a user could query postgres, get back data
> of type "encrypted", and a "libpq" plugin could decode/decrypt those
> columns that are of data type "encrypted" in a manner transparent to the
> user of the client.
>
> Obviously I could write this by intercepting the relevant libpq calls
> using LD_PRELOAD or Microsoft's "Detours" ... but is there a better way
> to do that?
>

AFAIK that's usually done at the application level, i.e. the application
is sending/receiving encrypted data, and the database simply sees bytea
columns. I'm not aware of a driver doing that transparently, but it seems
like an interesting idea - I wonder if it could be done e.g. in psycopg
as an extension, or something like that.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
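To make the first point concrete, this is the naive pgcrypto pattern being
described, with the key spelled out in the SQL itself (table and column names
are made up for the example):

    CREATE EXTENSION IF NOT EXISTS pgcrypto;

    CREATE TABLE secrets (id serial PRIMARY KEY, payload bytea);

    -- the key travels with every statement ...
    INSERT INTO secrets (payload)
    VALUES (pgp_sym_encrypt('some sensitive text', 'my-secret-key'));

    -- ... and again on every read, so it may show up in pg_stat_activity,
    -- server logs, etc. - which is what pgcryptokey works around
    SELECT pgp_sym_decrypt(payload, 'my-secret-key') FROM secrets;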
Re: pgcrypto question
On Mon, Oct 07, 2019 at 02:51:30PM -0400, Erik Aronesty wrote: Good idea for "psycopg". It would be easy for a POC, but I think the only meaningful layer to operate at would be a libpq drop-in replacement that intercepts PQgetvalue, PQprepare, PQexecParams, PQexecPrepared ... etc. That way odbc, python, node, etc would "just work" as long as you used LD_PRELOAD appropriately. It's not clear to me how would that know which columns are encrypted, with what key, etc. Because those encrypted columns are essentially just regular bytea columns, so there's no easy way to distinguish them. I'm no psycopg2 expert, but it does have some infrastructure for casting PostgreSQL types to Python types, and I guess that could be used for the encryption. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: temporary files
On Mon, Oct 07, 2019 at 01:52:41PM -0700, dangal wrote:
> Hello everyone, I have a question to see if you can help me. I have set
> work_mem to 100 MB but I have the following in the pgbadger report:
>
>   Queries generating the most temporary files (N)
>
>   Count   Total size   Min size    Max size    Avg size
>   58      3.24 GiB     57.15 MiB   57.15 MiB   57.15 MiB
>
> How could it be that if it requires 57 MB, 100 MB is not enough?
> Thanks a lot
>

The on-disk and in-memory representations are not the same, and the
on-disk one is often significantly more efficient. Data that needs a 57MB
temporary file may need ~150MB of memory to perform an in-memory sort,
for example.

Yet another reason may be that e.g. a hash join splits the memory into
batches, and each one has to fit into work_mem. And whenever we hit the
limit, we double the number of batches, i.e. we cut the batch size in
half. Thus the size is somewhere between 50 and 100MB, with ~75MB on
average. Furthermore, we also estimate the batch size before running the
query, so we may start with unnecessarily many batches, pushing the
average batch size down even more.

FWIW it's somewhat suspicious you have 58 temp files with almost no
variability in size. That seems as if a single query was executed
repeatedly. Maybe try looking into what query that is.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
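One way to find out which statement keeps spilling to disk - a sketch;
ALTER SYSTEM needs 9.4+ and superuser, otherwise set the parameter in
postgresql.conf and reload:

    ALTER SYSTEM SET log_temp_files = 0;   -- 0 = log every temporary file, with its size
    SELECT pg_reload_conf();

Each temporary file is then logged together with the statement that created
it, so the server log (and the next pgBadger run) will point at the offending
query directly.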
Re: Table locking during backup
On Mon, Oct 07, 2019 at 11:00:08PM +0200, Artur Zając wrote:
> Hi,
>
> I cannot reproduce a scenario I found in my PostgreSQL logs.
>
> I have two connections/processes:
>
> Process 24755 is a standard pg_backup connection with:
>
>   BEGIN;
>   SET TRANSACTION ISOLATION LEVEL REPEATABLE READ, READ ONLY
>   ...
>   LOCK TABLE gm.tableabc;
>   ...
>   COPY FROM gm.tableabc
>
> Process 5969 is a normal connection where I have:
>
>   LOG: process 5969 still waiting for AccessExclusiveLock on relation
>   562888531 of database 16429 after 1000.066 ms
>   DETAIL: Process holding the lock: 24755. Wait queue: 5969
>
> The query that is waiting (from process 5969) is:
>
>   CREATE TEMP TABLE IF NOT EXISTS tableabc (
>     Id BIGINT DEFAULT gm.top()
>   ) INHERITS (gm.tableabc);
>
> I cannot reproduce what pg_dump is doing that causes the wait in process
> 5969 (which lock is acquired and why). When pg_dump ends, the lock is
> released and process 5969 continues.
>
> I know that doing this in two parallel connections:
>
>   BEGIN;
>   CREATE TEMP TABLE IF NOT EXISTS tableabc (
>     Id BIGINT DEFAULT gm.top()
>   ) INHERITS (gm.tableabc);
>
> causes a waiting state on the second connection until the first
> finishes, but the pg_dump connection is read only.
>

Not sure why it would matter that the pg_dump connection is read-only;
this is about locking, because pg_dump needs to ensure the schema does
not change while it's running.

pg_dump does not do

  LOCK TABLE gm.tableabc;

but

  LOCK TABLE gm.tableabc IN ACCESS SHARE MODE;

which should be visible in the pg_locks system view. And ACCESS SHARE
does conflict with the ACCESS EXCLUSIVE mode requested by the second
query.

> Could you suggest which part of pg_dump (which query) might cause that
> behaviour?
>

It's this line:

  https://github.com/postgres/postgres/blob/master/src/bin/pg_dump/pg_dump.c#L6676

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
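For completeness, a quick way to watch the two locks involved while the dump
is running - a sketch, to be run from a third session:

    SELECT l.pid, l.mode, l.granted, a.query
      FROM pg_locks l
      JOIN pg_stat_activity a ON a.pid = l.pid
     WHERE l.relation = 'gm.tableabc'::regclass;

The pg_dump backend should show AccessShareLock with granted = true, and the
CREATE TEMP TABLE ... INHERITS backend AccessExclusiveLock with
granted = false.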
Re: timescaleDB & WAL replication
On Tue, Oct 15, 2019 at 09:28:27AM +0200, basti wrote: Hello, I have a Master / Slave Postgres setup with WAL Replication. Now I want to add timescaleDB. I found this todo: https://docs.timescale.com/latest/tutorials/replication As I understand that in the right way I just need to add the timescaleDB extention on the master side? AFAIK timescale is "just" a regular extension, in which case yes, all you need to do is installing it on the master side. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: jsonb_set() strictness considered harmful to data
On Fri, Oct 18, 2019 at 09:14:09PM -0500, Ariadne Conill wrote: Hello, On Fri, Oct 18, 2019 at 6:52 PM Stephen Frost wrote: Greetings, * Ariadne Conill (aria...@dereferenced.org) wrote: > On Fri, Oct 18, 2019 at 6:01 PM Adrian Klaver wrote: > > https://www.postgresql.org/docs/11/functions-json.html > > " The field/element/path extraction operators return NULL, rather than > > failing, if the JSON input does not have the right structure to match > > the request; for example if no such element exists" > > It is known that the extraction operators return NULL. The problem > here is jsonb_set() returning NULL when it encounters SQL NULL. > > > Just trying to figure why one is worse then the other. > > Any time a user loses data, it is worse. The preference for not > having data loss is why Pleroma uses PostgreSQL as it's database of > choice, as PostgreSQL has traditionally valued durability. If we > should not use PostgreSQL, just say so. Your contention that the documented, clear, and easily addressed behavior of a particular strict function equates to "the database system loses data and isn't durable" is really hurting your arguments here, not helping it. The argument about how it's unintuitive and can cause application developers to misuse the function (which is clearly an application bug, but perhaps an understandable one if the function interface isn't intuitive or is confusing) is a reasonable one and might be convincing enough to result in a change here. I'd suggest sticking to the latter argument when making this case. > > > I believe that anything that can be catastrophically broken by users > > > not following upgrade instructions precisely is a serious problem, and > > > can lead to serious problems. I am sure that this is not the only > > > project using JSONB which have had users destroy their own data in > > > such a completely preventable fashion. Let's be clear here that the issue with the upgrade instructions was that the user didn't follow your *application's* upgrade instructions, and your later code wasn't written to use the function, as documented, properly- this isn't a case of PG destroying your data. It's fine to contend that the interface sucks and that we should change it, but the argument that PG is eating data because the application sent a query to the database telling it, based on our documentation, to eat the data, isn't appropriate. Again, let's have a reasonable discussion here about if it makes sense to make a change here because the interface isn't intuitive and doesn't match what other systems do (I'm guessing it isn't in the SQL standard either, so we unfortunately can't look to that for help; though I'd hardly be surprised if they supported what PG does today anyway). Okay, I will admit that saying PG is eating data is perhaps hyperbolic, My experience is that using such hyperbole is pretty detrimental, even when one is trying to make a pretty sensible case. The problem is that people often respond in a similarly hyperbolic claims, particularly when you hit a nerve. And that's exactly what happened here, becase we're *extremely* sensitive about data corruption issues, so when you claim PostgreSQL is "eating data" people are likely to jump on you, beating you with the documentation stick. It's unfortunate, but it's also entirely predictable. but I will also say that the behaviour of jsonb_set() under this type of edge case is unintuitive and frequently results in unintended data loss. 
So, while PostgreSQL is not actually eating the data, it is putting the
user in a position where they may suffer data loss if they are not
extremely careful.

Here is how other implementations handle this case:

MySQL/MariaDB:

  select json_set('{"a":1,"b":2,"c":3}', '$.a', NULL)
  results in: {"a":null,"b":2,"c":3}

Microsoft SQL Server:

  select json_modify('{"a":1,"b":2,"c":3}', '$.a', NULL)
  results in: {"b":2,"c":3}

Both of these outcomes make sense, given the nature of JSON objects. I am
actually more in favor of what MSSQL does, however; I think that makes
the most sense of all.

I do mostly agree with this. The json[b]_set behavior seems rather
surprising, and I think I've seen a couple of cases running into exactly
this issue. I've solved that with a simple CASE, but maybe changing the
behavior would be better. That's unlikely to be back-patchable, though,
so maybe a better option is to create a non-strict wrapper. But that does
not work when the user is unaware of the behavior :-(

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
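To show the behavior being discussed, and the CASE workaround mentioned above
(the literals are just an example):

    -- jsonb_set is STRICT, so a NULL new value makes the whole result NULL,
    -- and an UPDATE ... SET data = jsonb_set(data, ...) then wipes the column
    SELECT jsonb_set('{"a":1,"b":2}', '{a}', NULL);

    -- the CASE workaround keeps the original document when the value is NULL
    SELECT CASE WHEN new_val IS NULL THEN data
                ELSE jsonb_set(data, '{a}', new_val)
           END
      FROM (VALUES ('{"a":1,"b":2}'::jsonb, NULL::jsonb)) AS t(data, new_val);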
Re: Replication of Replication
On Fri, Oct 18, 2019 at 05:54:34PM -0400, Edilmar Alves wrote: I have 3 servers running CentOS+PG 11 (postgresql11-server-11.5-1PGDG.rhel7.x86_64): - s1: main db + publication - s2: subscription of the main db all works fine until here... Now, I tried to config this: - s2: publication of the same db that was replicated from s1 - s3: subscription of the db from s2 When I did this, replication s1=>s2 stopped, and replication s2=>s3 never worked. Is this not possible? You'll have to share more details - error messages from the server log, how you check that the replication stopped, etc. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: releasing space
On Thu, Oct 17, 2019 at 05:20:09PM +, Julie Nishimura wrote:
> Hello everybody, We are running a PostgreSQL 9.6.2 cluster, master ->
> standby (streaming replication). 22 tb of space (constantly struggling
> with the space, pruning the old data, but not fast enough). The biggest
> db takes 16 tb. So, we've copied it to another server, and now we would
> like to delete it from our original source, to free up the space. What
> would be the right approach for this? Just issue a drop database command
> (16tb)? How long might it take? Should we do it gradually (drop the
> biggest tables first)? Any suggestions? Caveats?
>

Generally speaking, DROP DATABASE simply recursively drops all the
various objects - indexes, tables, etc. It's mostly just deleting the
files, which should not be very expensive (we certainly don't need to
delete all the data or anything), but there's a certain amount of I/O
involved. And it does depend on the OS / filesystem / hardware whether
that's an issue.

So if you want to be on the safe side, you can drop the objects one by
one, with a bit of delay between them, to throttle the I/O a bit.

FWIW the latest minor release for 9.6 is 9.6.15; you're 13 minor versions
(~30 months) of fixes behind. You might want to consider upgrading ...

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
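If you go the gradual route, a sketch of generating the per-table drops,
largest first, to be run one at a time with pauses in between (run it while
connected to the database you want to empty; the schema filter is only an
example):

    SELECT format('DROP TABLE %I.%I;', schemaname, tablename) AS stmt,
           pg_size_pretty(
             pg_total_relation_size(format('%I.%I', schemaname, tablename)::regclass)
           ) AS size
      FROM pg_tables
     WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
     ORDER BY pg_total_relation_size(format('%I.%I', schemaname, tablename)::regclass) DESC;

Once the tables are gone, the final DROP DATABASE has very little left to do.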
Re: Has there been any discussion of custom dictionaries being defined in the database?
On Thu, Oct 17, 2019 at 11:52:39AM +0200, Tom Lane wrote:
> Morris de Oryx writes:
>> Given that Amazon is bragging this week about turning off Oracle, it
>> seems like they could kick some resources towards contributing
>> something to the Postgres project. With that in mind, is the idea of
>> defining dictionaries within a table somehow meritless, or unexpectedly
>> difficult?
>
> Well, it'd just be totally different. I don't think anybody cares to
> provide two separate definitions of common dictionaries (which'd have to
> somehow be kept in sync). As for why we did it with external text files
> in the first place --- for at least some of the dictionary types, the
> point is that you can drop in data files that are available from
> upstream sources, without any modification. Getting the same info into a
> table would require some nonzero amount of data transformation.
>

IMHO being able to load dictionaries from a table would be quite useful,
and not just because of RDS. For example, it's not entirely true we're
just using the upstream dictionaries verbatim - it's quite common to add
new words, particularly in specialized fields. That's way easier when you
can do that through a table and not through a file.

> Having said that ... in the end a dictionary is really just a set of
> functions implementing the dictionary API; where they get their data
> from is their business. So in theory you could roll your own dictionary
> that gets its data out of a table. But the dictionary API would be
> pretty hard to implement except in C, and I bet RDS doesn't let you
> install your own C functions either :-(
>

Not sure. Of course, if we expect the dictionary to work just like the
ispell one, preprocessing the dictionary into shmem, then that requires
C. I don't think that's entirely necessary, though - we could use the
table directly. Yes, that would be slower, but maybe it'd be sufficient.

But I think the idea is ultimately that we'd implement a new dict type in
core, and people would just specify which table to load the data from.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: jsonb_set() strictness considered harmful to data
On Sat, Oct 19, 2019 at 11:26:50AM -0400, Andrew Dunstan wrote:
> ...
> The hyperbole here is misplaced. There is a difference between a bug and
> a POLA violation. This might be the latter, but it isn't the former. So
> please tone it down a bit. It's not the function that's unsafe, but the
> ill-informed use of it.
>
> We invented jsonb_set() (credit to Dmitry Dolgov). And we've had it
> since 9.5. That's five releases ago. So it's a bit late to be coming to
> us telling us it's not safe (according to your preconceptions of what it
> should be doing).
>
> We could change it prospectively (i.e. from release 13 on) if we choose.
> But absent an actual bug (i.e. acting contrary to documented behaviour)
> we do not normally backpatch such changes, especially when there is a
> simple workaround for the perceived problem. And it's that policy that
> is in large measure responsible for Postgres' deserved reputation for
> stability.
>

Yeah.

Incidentally, why is your function written in plpgsql? Wouldn't a simple
SQL wrapper be better?

  create or replace function safe_jsonb_set(target jsonb, path text[],
                                            new_value jsonb,
                                            create_missing boolean default true)
  returns jsonb as
  $func$
    select case
             when new_value is null then target
             else jsonb_set(target, path, new_value, create_missing)
           end
  $func$ language sql;

> And if we were to change it I'm not at all sure that we should do it the
> way that's suggested here, which strikes me as no more intuitive than
> the current behaviour. Rather I think we should possibly fill in a json
> null in the indicated place.
>

Not sure, but that seems rather confusing to me, because it's mixing SQL
NULL and JSON null, i.e. it's not clear to me why

  jsonb_set(..., "...", NULL)

should do the same thing as

  jsonb_set(..., "...", 'null'::jsonb)

I'm not entirely surprised it's what MySQL does ;-) but I'd say treating
it as a deletion of the key (just like MSSQL) is somewhat more sensible.
But I admit it's quite subjective.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: jsonb_set() strictness considered harmful to data
On Sat, Oct 19, 2019 at 11:21:26AM -0400, Stephen Frost wrote:
> Greetings,
>
> * Dmitry Dolgov (9erthali...@gmail.com) wrote:
>> If we want to change it, the question is where to stop? Essentially we
>> have:
>>
>>   update table set data = some_func(data, some_args_with_null);
>>
>> where some_func happened to be jsonb_set, but could be any strict
>> function.
>
> I don't think it makes any sense to try and extrapolate this out to
> other strict functions. Functions should be strict when it makes sense
> for them to be - in this case, it sounds like it doesn't really make
> sense for jsonb_set to be strict, and that's where we stop it.
>

Yeah. I think the issue here is (partially) that other databases adopted
similar functions after us, but decided to use a different behavior. It
might be more natural for the users, but that does not mean we should
change the other strict functions.

Plus I'm not sure if the SQL standard says anything about strict
functions (I found nothing, but I looked only very quickly), but I'm
pretty sure we can't change how the basic operators behave, and we
translate them to function calls (e.g. 1+2 is int4pl(1,2)).

>> I wonder if in this case it makes sense to think about an alternative?
>> For example, there is the generic type subscripting patch, which allows
>> updating a jsonb in the following way:
>>
>>   update table set jsonb_data[key] = 'value';
>>
>> It doesn't look like a function, so it's not a big deal if it handles
>> NULL values differently. And at the same time one can argue that
>> people who are not aware of this caveat with jsonb_set and NULL values
>> will most likely use it due to a bit simpler syntax (more similar to
>> some popular programming languages).
>
> This seems like an entirely independent thing ...
>

Right. Useful, but an entirely separate feature.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: jsonb_set() strictness considered harmful to data
On Sat, Oct 19, 2019 at 12:47:39PM -0400, Andrew Dunstan wrote: On 10/19/19 12:32 PM, David G. Johnston wrote: On Sat, Oct 19, 2019 at 9:19 AM Tomas Vondra mailto:tomas.von...@2ndquadrant.com>> wrote: > >We invented jsonb_set() (credit to Dmitry Dolgov). And we've had it >since 9.5. That's five releases ago. So it's a bit late to be coming to >us telling us it's not safe (according to your preconceptions of what it >should be doing). > There have been numerous complaints and questions about this behavior in those five years; and none of the responses to those defenses has actually made the current behavior sound beneficial but rather have simply said "this is how it works, deal with it". I haven't seen a patch, which for most possible solutions should be fairly simple to code. This is open source. Code speaks louder than complaints. IMHO that might be a bit too harsh - I'm not surprised no one sent a patch when we're repeatedly telling people "you're holding it wrong". Without a clear consensus what the "correct" behavior is, I wouldn't send a patch either. > >We could change it prospectively (i.e. from release 13 on) if we choose. >But absent an actual bug (i.e. acting contrary to documented behaviour) >we do not normally backpatch such changes, especially when there is a >simple workaround for the perceived problem. And it's that policy that >is in large measure responsible for Postgres' deserved reputation for >stability. > Yeah. Agreed, this is v13 material if enough people come on board to support making a change. We have changed such things in the past. But maybe a new function might be a better way to go. I haven't given it enough thought yet. I think the #1 thing we should certainly do is explaining the behavior in the docs. >And if we were to change it I'm not at all sure that we should do it the >way that's suggested here, which strikes me as no more intuitive than >the current behaviour. Rather I think we should possibly fill in a json >null in the indicated place. > Not sure, but that seems rather confusing to me, because it's mixing SQL NULL and JSON null, i.e. it's not clear to me why [...] But I admit it's quite subjective. Providing SQL NULL to this function and asking it to do something with that is indeed subjective - with no obvious reasonable default, and I agree that "return a NULL" while possible consistent is probably the least useful behavior that could have been chosen. We should never have allowed an SQL NULL to be an acceptable argument in the first place, and can reasonably safely and effectively prevent it going forward. Then people will have to explicitly code what they want to do if their data and queries present this invalid unknown data to the function. How exactly do we prevent a NULL being passed as an argument? The only thing we could do would be to raise an exception, I think. That seems like a fairly ugly thing to do, I'd need a h3eck of a lot of convincing. I don't know, but if we don't know what the "right" behavior with NULL is, is raising an exception really that ugly? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
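The "raise an exception" option discussed here can be prototyped today as a user-level wrapper, with no server changes; a minimal sketch (the function name is made up for illustration):

    create or replace function jsonb_set_strict(target jsonb, path text[],
                                                new_value jsonb,
                                                create_missing boolean default true)
    returns jsonb as $func$
    begin
        -- refuse a SQL NULL value instead of silently returning NULL
        if new_value is null then
            raise exception 'jsonb_set_strict: new_value must not be NULL';
        end if;
        return jsonb_set(target, path, new_value, create_missing);
    end;
    $func$ language plpgsql immutable;

Whether an application prefers this, the "keep the target" wrapper above, or deleting the key is exactly the judgment call the thread is about.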
Re: releasing space
On Sat, Oct 19, 2019 at 04:34:32PM -0700, Adrian Klaver wrote: On 10/19/19 4:17 PM, Julie Nishimura wrote: Thank you, Thomas. Do you know if it is safe to replicate 9.6.2 (smaller) db to 9.6.15 (larger capacity) using pg_basebackup? Would it be considered as an upgrade? pg_basebackup backs up an entire Postgres cluster, which will be many databases. So when you say db, do you mean a Postgres cluster or an individual database? My understanding is Julie wants to create a copy of a 9.6.2 cluster using pg_basebackup and then run 9.6.15 on it. That's OK, it's essentially a minor version upgrade. FWIW Julie, please don't top post - it just makes it harder to follow the discussion. Also, this seems like a completely separate question, unrelated to the DROP DATABASE one. It might be better to start a new thread instead of repurposing an existing one. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
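For reference, such a copy is typically taken along these lines; a minimal sketch, where the host, user and target directory are placeholders and a replication-capable role plus a matching pg_hba.conf entry are assumed to exist:

    # run on the new (9.6.15) host, pointed at the 9.6.2 primary;
    # -X stream also fetches the WAL needed to make the copy consistent
    pg_basebackup -h old-primary.example.com -U replication_user \
        -D /var/lib/pgsql/9.6/data -X stream -P

Once the new instance is started from that data directory under the 9.6.15 binaries, the minor-version upgrade is effectively done.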
Re: jsonb_set() strictness considered harmful to data
On Sun, Oct 20, 2019 at 03:48:05PM -0400, Andrew Dunstan wrote: On 10/20/19 1:14 PM, David G. Johnston wrote: On Sun, Oct 20, 2019 at 5:31 AM Andrew Dunstan mailto:andrew.duns...@2ndquadrant.com>> wrote: And yet another is to raise an exception, which is easy to write but really punts the issue back to the application programmer who will have to decide how to ensure they never pass in a NULL parameter. That's kinda the point - if they never pass NULL they won't encounter any problems but as soon as the data and their application don't see eye-to-eye the application developer has to decide what they want to do about it. We are in no position to decide for them and making it obvious they have a decision to make and implement here doesn't seem like a improper position to take. The app dev can avoid this problem today by making sure they don't pass a NULL as the value. Or they can use a wrapper function which does that for them. So frankly this doesn't seem like much of an advance. And, as has been noted, it's not consistent with what either MySQL or MSSQL do. In general I'm not that keen on raising an exception for cases like this. I think the general premise of this thread is that the application developer does not realize that may be necessary, because it's a bit surprising behavior, particularly when having more experience with other databases that behave differently. It's also pretty easy to not notice this issue for a long time, resulting in significant data loss. Let's say you're used to the MSSQL or MySQL behavior, you migrate your application to PostgreSQL or whatever - how do you find out about this behavior? Users are likely to visit https://www.postgresql.org/docs/12/functions-json.html but that says nothing about how jsonb_set works with NULL values :-( You're right raising an exception may not be the "right behavior" for whatever definition of "right". But I kinda agree with David that it's somewhat reasonable when we don't know what the "universally correct" thing is (or when there's no such thing). IMHO that's better than silently discarding some of the data. FWIW I think the JSON/JSONB part of our code base is amazing, and the fact that various other databases adopted something very similar over the last couple of years just confirms that. And if this is the only speck of dust in the API, I think that's pretty amazing. I'm not sure how significant this issue actually is - it's true we got a couple of complaints over the years (judging by a quick search for jsonb_set and NULL in the archives), but I'm not sure that's enough to justify any changes in backbranches. I'd say no, but I have no idea how many people are affected by this but don't know about it ... regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: CPU SPIKE
On Mon, Oct 21, 2019 at 10:45:33AM +0530, Pawan Sharma wrote: Hi All, Having real high CPU issue (95-98%), with SELECT statements and select queries contains multiple AND operator, is it will cause any CPU Spike..??? apps team is using sub-partition, PG11, CPU:24, Mem: 16GB Any query execution uses CPU, so yes - it can cause a spike in CPU usage. The question is whether the query plan is efficient, or whether it's using the CPU unnecessarily. You need to show us EXPLAIN ANALYZE and perhaps explain why you think it shouldn't use this much CPU. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
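Something along these lines is what's being asked for (a sketch; the table and filter columns are hypothetical stand-ins for the real query):

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT count(*)
    FROM orders                               -- hypothetical table
    WHERE status = 'open'
      AND region = 'EU'
      AND created_at > now() - interval '1 day';

The ANALYZE option runs the query and reports actual row counts and timing per node, and BUFFERS shows how much data was read, which together usually make it obvious where the CPU time goes.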
Re: jsonb_set() strictness considered harmful to data
On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote: On 10/20/19 4:18 PM, Tomas Vondra wrote: On Sun, Oct 20, 2019 at 03:48:05PM -0400, Andrew Dunstan wrote: On 10/20/19 1:14 PM, David G. Johnston wrote: On Sun, Oct 20, 2019 at 5:31 AM Andrew Dunstan mailto:andrew.duns...@2ndquadrant.com>> wrote: And yet another is to raise an exception, which is easy to write but really punts the issue back to the application programmer who will have to decide how to ensure they never pass in a NULL parameter. That's kinda the point - if they never pass NULL they won't encounter any problems but as soon as the data and their application don't see eye-to-eye the application developer has to decide what they want to do about it. We are in no position to decide for them and making it obvious they have a decision to make and implement here doesn't seem like a improper position to take. The app dev can avoid this problem today by making sure they don't pass a NULL as the value. Or they can use a wrapper function which does that for them. So frankly this doesn't seem like much of an advance. And, as has been noted, it's not consistent with what either MySQL or MSSQL do. In general I'm not that keen on raising an exception for cases like this. I think the general premise of this thread is that the application developer does not realize that may be necessary, because it's a bit surprising behavior, particularly when having more experience with other databases that behave differently. It's also pretty easy to not notice this issue for a long time, resulting in significant data loss. Let's say you're used to the MSSQL or MySQL behavior, you migrate your application to PostgreSQL or whatever - how do you find out about this behavior? Users are likely to visit https://www.postgresql.org/docs/12/functions-json.html but that says nothing about how jsonb_set works with NULL values :-( We should certainly fix that. I accept some responsibility for the omission. +1 You're right raising an exception may not be the "right behavior" for whatever definition of "right". But I kinda agree with David that it's somewhat reasonable when we don't know what the "universally correct" thing is (or when there's no such thing). IMHO that's better than silently discarding some of the data. I'm not arguing against the idea of improving the situation. But I am arguing against a minimal fix that will not provide much of value to a careful app developer. i.e. I want to do more to support app devs. Ideally they would not need to use wrapper functions. There will be plenty of situations where it is mighty inconvenient to catch an exception thrown by jsonb_set(). And catching exceptions can be expensive. You want to avoid that if possible in your performance-critical plpgsql code. True. And AFAIK catching exceptions is not really possible in some code, e.g. in stored procedures (because we can't do subtransactions, so no exception blocks). FWIW I think the JSON/JSONB part of our code base is amazing, and the fact that various other databases adopted something very similar over the last couple of years just confirms that. And if this is the only speck of dust in the API, I think that's pretty amazing. TY. 
When I first saw the SQL/JSON spec I thought I should send a request to the SQL standards committee for a royalty payment, since it looked so familiar ;-) ;-) I'm not sure how significant this issue actually is - it's true we got a couple of complaints over the years (judging by a quick search for jsonb_set and NULL in the archives), but I'm not sure that's enough to justify any changes in backbranches. I'd say no, but I have no idea how many people are affected by this but don't know about it ... No, no backpatching. As I said upthread, this isn't a bug, but it is arguably a POLA violation, which is why we should do something for release 13. WFM regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: jsonb_set() strictness considered harmful to data
On Mon, Oct 21, 2019 at 08:06:46AM -0700, Adrian Klaver wrote: On 10/20/19 11:07 PM, Tomas Vondra wrote: On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote: True. And AFAIK catching exceptions is not really possible in some code, e.g. in stored procedures (because we can't do subtransactions, so no exception blocks). Can you explain the above to me as I thought there are exception blocks in stored functions and now sub-transactions in stored procedures. Sorry for the confusion - I've not been particularly careful when writing that response. Let me illustrate the issue with this example: CREATE TABLE t (a int); CREATE OR REPLACE PROCEDURE test() LANGUAGE plpgsql AS $$ DECLARE msg TEXT; BEGIN -- SAVEPOINT s1; INSERT INTO t VALUES (1); -- COMMIT; EXCEPTION WHEN others THEN msg := SUBSTR(SQLERRM, 1, 100); RAISE NOTICE 'error: %', msg; END; $$; CALL test(); If you uncomment the SAVEPOINT, you get NOTICE: error: unsupported transaction command in PL/pgSQL because savepoints are not allowed in stored procedures. Fine. If you uncomment the COMMIT, you get NOTICE: error: cannot commit while a subtransaction is active which happens because the EXCEPTION block creates a subtransaction, and we can't commit when it's active. But we can commit outside the exception block: CREATE OR REPLACE PROCEDURE test() LANGUAGE plpgsql AS $$ DECLARE msg TEXT; BEGIN BEGIN INSERT INTO t VALUES (1); EXCEPTION WHEN others THEN msg := SUBSTR(SQLERRM, 1, 100); RAISE NOTICE 'error: %', msg; END; COMMIT; END; $$; regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Barman
On Thu, Oct 31, 2019 at 05:29:34PM +, Daulat Ram wrote: Hi All, We have configured postgres 11.2 in streaming replication (primary & Standby) on docker and I am looking to initiate the Postgres backup using barman. As I know there are few options for taking backup using barman. RSYNC backup Incremental Backups Streaming Backup with continuous WAL streaming Centralized and Catalogued Backups Which is the best option for backup using barman? So that we can keep the database safe in case of disaster? I feel the Incremental Backups are most useful to perform the PITR but I want to know the experts suggestions. You're mixing a number of topics, here. Firstly, all backups done by barman are centralized and catalogued, that's pretty much one of the main purposes of barman. When it comes to backup methods, there are two basic methods: rsync and postgres (which means pg_basebackup). This is about creating the initial base backup. Both methods then can replicate WAL by either streaming or archive_command. So first you need to decide whether to use rsync or postgres (pg_basebackup), where rsync allows advanced features like incremental backup, parallel backup and deduplication. Then you need to decide whether to use archive_command or streaming (i.e. pg_receivexlog). The "right" backup method very much depends on the size of your database, activity, and so on. If in doubt, you should probably go with the default option, described as "scenario 1" in the barman docs, i.e. pg_basebackup (backup_method = postgres) and WAL streaming. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
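A minimal sketch of a barman server section for that "scenario 1" setup; the connection strings are placeholders, and the option names should be double-checked against the barman documentation for the version actually installed:

    ; /etc/barman.d/pg.conf
    [pg]
    description = "Streaming backup of the primary"
    conninfo = host=pg.example.com user=barman dbname=postgres
    streaming_conninfo = host=pg.example.com user=streaming_barman
    ; base backups via pg_basebackup
    backup_method = postgres
    ; WAL via streaming replication (pg_receivexlog / pg_receivewal)
    streaming_archiver = on
    slot_name = barman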
Re: How to import Apache parquet files?
On Tue, Nov 05, 2019 at 04:21:45PM +0100, Softwarelimits wrote: Hi Imre, thanks for the quick response - yes, I found that, but I was not sure if it is already production ready - also I would like to use the data with the timescale extension, that is why I need a full import. Well, we're not in a position to decide if parquet_fdw is production ready, that's something you need to ask the author of the extension (and then also judge yourself). That being said, I think FDW is probably the best way to do this. It's explicitly designed to work with foreign data, so using it to access parquet files seems somewhat natural. The alternative is probably transforming the data into COPY format, and then loading it into Postgres using COPY (either as a file, or stdin). Which of these options is the right one depends on your requirements. FDW is more convenient, but row-based and probably significantly less efficient than COPY. So if you have a lot of these parquet files, I'd probably use COPY. But maybe the ability to query the parquet files directly (with FDW) is useful for you. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
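A sketch of the COPY route; the table definition and file name are made up, and converting parquet to CSV is assumed to be done by some external tool or script using a parquet library:

    -- target table matching the parquet schema (hypothetical columns)
    CREATE TABLE measurements (
        ts     timestamptz,
        device int,
        value  double precision
    );

    -- load a CSV produced from the parquet file, from psql
    \copy measurements FROM 'measurements.csv' WITH (FORMAT csv, HEADER true)

The FDW route instead defines a foreign table over the parquet file and queries it in place, trading load time for per-row overhead at query time.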
Re: BigSQL pgc alternative
On Fri, Dec 20, 2019 at 12:15:06PM -0300, Samuel Teixeira Santos wrote: Hi all. BigSQL still allow to install Postgres and others resources as like a portable install. But today, it's only offer your tool (pgc) for newer Postgresql versions. I would like to install as portable option because it's more easy to config and use in my own user in my development station. What you recommend to do to replace pgc tool as alternative to install postgresql 10 and the respective postgis version as like portable option? I'm not familiar with BigSQL, so I'm not sure what exactly you mean when you say "portable option". Can you explain? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: jsonb_set() strictness considered harmful to data
On Wed, Jan 08, 2020 at 05:24:05PM +1030, Andrew Dunstan wrote: On Wed, Jan 8, 2020 at 7:08 AM Pavel Stehule wrote: Hi po 6. 1. 2020 v 22:34 odesílatel Andrew Dunstan napsal: Updated version including docco and better error message. cheers andrew I think so my objections are solved. I have small objection + errdetail("exception raised due to \"null_value_treatment := 'raise_exception'\""), + errhint("to avoid, either change the null_value_treatment argument or ensure that an SQL NULL is not used"))); "null_value_treatment := 'raise_exception'\"" it use proprietary PostgreSQL syntax for named parameters. Better to use ANSI/SQL syntax "null_value_treatment => 'raise_exception'\"" It is fixed in attached patch source compilation without warnings, compilation docs without warnings check-world passed without any problems I'll mark this patch as ready for commiter Thank you for your work Thanks for the review. I propose to commit this shortly. Now that this was committed, I've updated the patch status accordingly. Thanks! -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
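For readers finding this thread later: if memory serves, the function added by that commit is jsonb_set_lax(), available from PostgreSQL 13 on, which makes the NULL handling an explicit choice; a short sketch (check the PostgreSQL 13 docs for the authoritative signature):

    SELECT jsonb_set_lax('{"a": 1}'::jsonb, '{a}', NULL, true, 'use_json_null');   -- {"a": null}
    SELECT jsonb_set_lax('{"a": 1}'::jsonb, '{a}', NULL, true, 'delete_key');      -- {}
    SELECT jsonb_set_lax('{"a": 1}'::jsonb, '{a}', NULL, true, 'return_target');   -- {"a": 1}
    SELECT jsonb_set_lax('{"a": 1}'::jsonb, '{a}', NULL, true, 'raise_exception'); -- ERROR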
Re: Avoiding out of date statistics / planner
On Wed, Feb 12, 2020 at 10:23:22AM -0700, Michael Lewis wrote: It may also be worth noting that it is possible to make autovacuum/analyze more aggressive, perhaps only on the tables that see large changes in data that might result in a statistics issue. If you could share a query, explain analyze output, and pseudo code or at least description of what sort of bulk operations are being done, then more insight could be offered. Another thing you can do is deploy auto_explain, and log explain plan for long-runnning queries. That won't fix the root cause, but it will help you with confirming the root cause - you'll see the query plan, which should give you enough context. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
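A minimal sketch of such an auto_explain setup (the thresholds are arbitrary placeholders to tune for the workload):

    # postgresql.conf
    shared_preload_libraries = 'auto_explain'   # requires a server restart
    auto_explain.log_min_duration = '5s'        # log plans of statements slower than this
    auto_explain.log_analyze = on               # include actual row counts and timing

With that in place, the plan of any bulk operation or query that runs long because of stale statistics ends up in the server log, which is usually enough to confirm the diagnosis.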
Re: 1 Status of vertical clustered index - 2 Join using (fk_constraint) suggestion - 3 Status of pgsql's parser autonomization
On Sun, Feb 16, 2020 at 10:38:29PM +0100, maxzor wrote: Hello, 1. I was told that M$ SQLServer provides huge performance deltas over PostgreSQL when dealing with index-unaligned queries : create index i on t (a,b, c); select * from t where b=... and c=...; Perhaps index-only scans might help here, but that generally does not work for "SELECT *" queries. Columnar storage has been tried by various companies, CitusData, EnterpriseDB, 2ndQuadrant, Fujitsu Enterprise Postgres. It has been discussed quite a lot, last thread that I was able to find being in 2017, https://www.postgresql.org/message-id/CAJrrPGfaC7WC9NK6PTTy6YN-NN%2BhCy8xOLAh2doYhVg5d6HsAA%40mail.gmail.com where Fujitsu's patch made it quite far. What is the status on such a storage manager extension interface ? I think you're looking for threads about zheap and (especially) zedstore. Those are two "storage manager" implementations various people are currently working on. Neither of those is likely to make it into pg13, though :-( 2. What do you think of adding a new syntax : 'from t join t2 using (fk_constraint)' ? And further graph algorithms to make automatic joins ? Both 'natural join' and 'using (column_name)' are useless when the columns are not the same in source and destination. Plus it is often the case that the fk_constraints are over numerous columns, even though this is usually advised against. But when this case happens there will be a significant writing speedup. I'm not really sure what's the point / benefit here. Initially it seemed you simply propose a syntax saying "do a join using the columns in the FK constraint" but it's unclear to me how this implies any writing speedup? I have been bothered by this to the point that I developed a graphical-query-builder plugin for pgModeler, https://github.com/maxzor/plugins/tree/master/graphicalquerybuilder#automatic-join-mode , but I believe such a syntax would be much better in the core! Hm, maybe. 3. What is the status of making the internal parser of PostgreSQL less coupled to the core, and easier to cherry-pick from outside? It would be great to incorporate it into companion projects : pgAdmin4, pgModeler, pgFormatter... I have no idea what you mean by "less coupled" here. What are the requirements / use cases you're thinking about? FWIW I think it's pretty bad idea to post questions about three very different topics into a single pgsql-hackers thread. That'll just lead to a lot of confusion. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: DB running out of memory issues after upgrade
On Tue, Feb 18, 2020 at 05:46:28PM +, Nagaraj Raj wrote: after upgrade Postgres to v9.6.11 from v9.6.9 DB running out of memory issues no world load has changed before and after upgrade. spec: RAM 16gb,4vCore Any bug reported like this or suggestions on how to fix this issue? I appreciate the response..!! This bug report (in fact, we don't know if it's a bug, but OK) is woefully incomplete :-( The server log is mostly useless, unfortunately - it just says a bunch of processes were killed (by OOM killer, most likely) so the server has to restart. It tells us nothing about why the backends consumed so much memory etc. What would help us is knowing how much memory was the backend (killed by OOM) consuming, which should be in dmesg. And then MemoryContextStats output - you need to connect to a backend consuming a lot of memory using gdb (before it gets killed) and do (gdb) p MemoryContextStats(TopMemoryContext) (gdb) q and show us the output printed into server log. If it's a backend running a query, it'd help knowing the execution plan. It would also help knowing the non-default configuration, i.e. stuff tweaked in postgresql.conf. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Check constraints do not seem to be working!!!
On 11/11/20 10:06 AM, Jitendra Loyal wrote: > Thanks Nikolay > > I read that but is there a way to meet the above requirement. And I will > like to add that IS NULL and IS NOT NULL should evaluate to true/false. > These operators are made for this and should not be returning NULL. > This has nothing to do with IS [NOT] NULL, it's the first part of the expression (b = TRUE) causing trouble. Essentially, the constraint (b = true) and (c is not null) is evaluated in two steps. First we evaluate the two parts individually, and for (null, true) the results would look like this: (b = true) => null (c is not null) => true and then we combine those results using 'AND' null AND true => null which is considered as if the constraint matches. If you want to handle NULL for the first expression, you may do this, for example: (b is not null and b = true) and (c is not null) Or something like that. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
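A short demo of the difference, using a throwaway table (column names follow the example above):

    CREATE TABLE demo (
        b boolean,
        c int,
        -- the NULL-proof version of the constraint
        CONSTRAINT chk CHECK ((b IS NOT NULL AND b = true) AND (c IS NOT NULL))
    );

    -- with the original constraint "(b = true) AND (c IS NOT NULL)" this row is
    -- accepted, because (NULL = true) AND true evaluates to NULL, not false;
    -- with the constraint above it is rejected
    INSERT INTO demo VALUES (NULL, 1);   -- ERROR: violates check constraint "chk"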
Re: Extended statistics for correlated columns, row estimates when values are not in MCVs list
On 12/4/20 11:53 PM, Michael Lewis wrote: > On version 12.5, I have a query similar to the below where I am getting > rather unfortunate row estimates and a sub-optimal index choice as a result. > > SELECT > id > FROM > messages > WHERE > client_id = 1234 > and site_id = 889977 > and message_type_id in ( 59, 62, 102, 162 ) > and sent_on > NOW() - INTERVAL '3 days'; > > I created extended statistics (all types) on client_id and site_id, > analyzed the columns, and in pg_stats_ext.dependencies (4 = client_id, 7 = > site_id) have: > {"4 => 7": 0.002997, "7 => 4": 0.896230} > > When I check row estimates like below, I get results that indicate the > columns are expected to be independent still. > explain SELECT id FROM messages; --889594304 > explain SELECT id FROM messages where client_id = 1234; --133439 > explain SELECT id FROM messages where site_id = 889977; --28800 > explain SELECT id FROM messages where client_id = 1234 and site_id = > 889977; --4 > I was a bit confused about this initially, because why would it still be estimated as independent clauses, with the extended stats defined? But I think the explanation is fairly simple - the combination of values is simply rare enough not to be included in the MCV list, so it probably gets estimated using the "regular" logic as if independent. But we still mark the clauses as estimated, so the functional dependencies are not really considered. (I wonder if the code in PG14 would work better, but that's of little use of course.) > However, I pick a client & site ID pair which show up in the MCVs list, > then I get the same estimate when querying for that site_id with or without > including the client_id. That is great. > I think this is consistent with the above explanation - in this case the MCV actually kicks in, significantly improving the estimate. > Is it reasonable to expect that if the correlation between two columns is > rather high, then the optimizer might figure the columns are not > independent and perhaps would give less weight to the value derived from > independent column statistics? With table statistics, it is possible to set > a static value or ratio for something like ndistinct. Any chance for > something similar on dependency someday? > > Perhaps I am expecting too much or have a poor understanding of what > extended statistics can or someday might do. I deal with under estimates > from correlations between client_id and sites or other similar > dependent objects a fair bit and am hopeful to resolve some of those > planning problems with extended stats, without the maintenance overhead of > migrating everything to a single client per database to get more specific > statistics. > What you might try is defining the statistics with only the functional dependencies. That should consider the column-level correlation even when the combination of values is not in the MCV. It might make the "good" estimate worse, but that's an inherent trade-off. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
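In other words, something along these lines (a sketch reusing the column and table names from the original query):

    -- functional dependencies only, no MCV list
    CREATE STATISTICS messages_client_site_dep (dependencies)
        ON client_id, site_id FROM messages;

    ANALYZE messages;

With only the dependencies statistics defined, the planner applies the column-level correlation even for value combinations that would not have made it into an MCV list.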
Re: Potential BRIN Index Corruption
On 12/9/20 12:07 AM, Huan Ruan wrote: > HI Alvaro > > Unfortunately those records were written a while ago and we no longer keep > their WAL logs. Thanks for your help anyway. > Can you estimate when roughly the records were written? E.g. by using a rough estimate of WAL or XIDs generated per day, or something like that. Maybe there was some I/O issue in that time period and a couple writes got lost, or something like that. Are there any other corrupted indexes on the table? regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Is there a way to dump schema to files on disk and keep them in sync
On 12/13/20 6:34 PM, Adrian Klaver wrote: > On 12/12/20 6:58 PM, Tim Uckun wrote: >> I want to dump my postgres schema to disk in neat directories like >> pgadmin presents. Then I want to be able to edit the files and sync >> changes to the database and ideally if changes were made in the >> database to sync them back to the disk. >> >> Is there a tool that does this? Is there a tool that will dump the >> schema into separate directories and files like pgadmin does? > > pgAdmin does not create directories, it just organizes the contents of > the system catalogs into GUI elements. > > For schema management I would suggest a tool like the one I use > Sqitch(https://sqitch.org/). It will organize the process of schema > creation and management. > Yeah, that was my thought too. Or maybe look at the other schema versioning tools available - we have a list on the wiki: https://wiki.postgresql.org/wiki/Change_management_tools_and_techniques I'm sure it's incomplete, but it's helpful nevertheless. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: PostgreSQL 14.4 ERROR: out of memory issues
On 7/14/22 21:25, Aleš Zelený wrote: > > st 13. 7. 2022 v 2:20 odesílatel Michael Paquier <mailto:mich...@paquier.xyz>> napsal: > > On Mon, Jul 11, 2022 at 10:50:23AM +0200, Aleš Zelený wrote: > > So far, it has happened three times (during a single week) from > the 14.3 -> > > 14.4 upgrade, before 14.4 we haven't suffered from such an issue. > > > > Questions: > > 1) Can we safely downgrade from 14.4 to 14.3 by shutting down the > instance > > and reinstalling 14.3 PG packages (to prove, that the issue > disappear)? > > 2) What is the best way to diagnose what is the root cause? > > Hmm. 14.4 has nothing in its release notes that would point to a > change in the vacuum or autovacuum's code paths: > https://www.postgresql.org/docs/14/release-14-4.html#id-1.11.6.5.4 > <https://www.postgresql.org/docs/14/release-14-4.html#id-1.11.6.5.4> > > There is nothing specific after a look at the changes as of, and I am > not grabbing anything that would imply a change in memory context > handling either: > `git log --stat REL_14_3..REL_14_4` > `git diff REL_14_3..REL_14_4 -- *.c` > > Saying that, you should be able to downgrade safely as there are no > changes in WAL format or such that would break things. Saying that, > the corruption issue caused by CONCURRENTLY is something you'd still > have to face. > > > Thanks, good to know that, we can use it for a test case, since we > already hit the CONCURRENTLY bug on 14.3. > > > 2022-07-02 14:48:07 CEST [3930]: [3-1] user=,db=,host=,app= > ERROR: out of > > memory > > 2022-07-02 14:48:07 CEST [3930]: [4-1] user=,db=,host=,app= > DETAIL: Failed > > on request of size 152094068 in memory context > "TopTransactionContext". > > 2022-07-02 14:48:07 CEST [3930]: [5-1] user=,db=,host=,app= CONTEXT: > > automatic vacuum of table "prematch.replication.tab_queue_tmp" > > This is the interesting part. Do you happen to use logical > replication in a custom C++ plugin? > > > We are using logical replication to other instances (pg_output) and > decoderbufs > https://github.com/debezium/postgres-decoderbufs > <https://github.com/debezium/postgres-decoderbufs> for other applications. > This is probably just a red herring - std:bad_alloc is what the process that runs into the overcommit limit gets. But the real issue (e.g. memory leak) is likely somewhere else - different part of the code, different process ... > ... > > Checking the RssAnon from proc/pid/status I've found some points where > RssAnon memory usage grew very steep for a minute, but no "suspicious" > queries/arguments were found in the instance logfile. > > Any hint, on how to get the root cause would be appreciated since so far > I've failed to isolate the issue reproducible testcase. > At least I hope that looking for the RssAnon process memory is an > appropriate metric, if not, let me know and I'll try to update the > monitoring to get the root cause. > > I can imagine a workaround with client application regular reconnect..., > but u to 14.3 it works, so I'd like to fix the issue either on our > application side or at PG side if it is a PG problem. > I think it's be interesting to get memory context stats from the processes consuming a lot of memory. If you know which processes are suspect (and it seems you know, bacause if a reconnect helps it's the backend handling the connection), you can attach a debugger and do $ gdb -p $PID call MemoryContextStats(TopMemoryContext) which will log info about memory contexts, just like autovacuum. 
Hopefully that tells us memory context is bloated, and that might point us to particular part of the code. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Expr. extended stats are skipped with equality operator
On 8/5/22 11:50, Danny Shemesh wrote: > Hey all ! > > I'm on a quest to help the planner (on pg14) use the best of several > partial, expressional indices we have on some large tables (few TBs in > size, billions of records). > > As we know, stats for expressions in partial indices aren't gathered by > default - so I'm tinkering with expressional extended stats to cover for > those. > > I've tackled two interesting points there: > 1. Seems like expressional stats involving the equality operator are > skipped or mismatched (fiddle > <https://www.db-fiddle.com/f/4jyoMCicNSZpjMt4jFYoz5/5379>) > Let's take the following naive example: > /create table t1 (x integer[]); > insert into t1 select array[1]::integer[] from generate_series(1, > 10, 1); > create statistics s1 on (x[1] = 1) from t1; > analyze t1; > / > /explain analyze select * from t1 where x[1] = 1;/ > /> Seq Scan on t1 (cost=0.00..1986.00 rows=500 width=29) (actual > time=0.009..36.035 rows=10 loops=1)/ > / > / > Now, of course one can just create the stat on x[1] directly in this > case, but I have a more complex use case where an equality operator is > beneficial; > should the above case be supported ? feels like I'm just missing > something fundamental. > Hmmm. The problem here is that the expression may be interpreted either as an operator clause (A=B), or as a boolean expression ((A=B)=TRUE). In principle we might check which option matches extended statistics, but that's not for free :-( So what the current code does is trying to match the more specific cases first, leaving the "bool expression" as a last resort. That seems like a reasonable trade off to me, but I'll think about this a bit more. There are probably other ways to match expressions, and we can't possibly explore all of them. For example you may create statistics on (x=1 AND y=2) and I doubt we'll match that, because we'll try matching individual clauses not some arbitrary combinations of clauses. (Maybe we shouldn't even allow creating such statistics ...) > 2. Less important, just a minor note - feel free to ignore - although > the eq. operator above seems to be skipped when matching the ext. stats, > I can work around this by using a CASE expression (fiddle > <https://www.db-fiddle.com/f/wJZNH1rNwJSo3D5aByQiWX/1>); > Building on the above example, we can: > /create statistics s2 on (case x[1] when 1 then true else false end) > from t1;/ > /explain analyze select * from t1 where (case x[1] when 1 then true else > false end/ > /> Seq Scan on t1 (cost=0.00..1986.00 rows=10 width=25) (actual > time=0.011..33.721 rows=10 loops=1)/ > / Yes, because this end ups not being matches as opclause, and therefore goes all the way to the boolvarsel() in clause_selectivity_ext. > / > What's a bit problematic here, though, is that if we mix other dependent > columns to the extended stat, and specifically if we create an mcv, > queries involving the CASE expression throw with `error: unknown clause > type 130`, where clause type == T_CaseExpr. > > The second point for me would be that I've found it a bit non intuitive > that creating an extended statistic can fail queries at query time; it > makes sense that the mcv wouldn't work for case expressions, but it > might've been a bit clearer to: > > a. Fail this at statistic creation time, potentially, or > b. 
Convert the type numeric in the above error to its text > representation, if we can extract it out at runtime somehow - > I couldn't find a mapping of clause type numerics to their names, and as > the node tags are generated at compile time, it could be build-dependent > and a bit hard to track down if one doesn't control the build flags > Yeah, this seems like a clear bug - we should not fail queries like this. It's a sign statext_is_compatible_clause() and the MCV code disagrees which clauses are compatible. Can you share an example triggering this? regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Expr. extended stats are skipped with equality operator
On 8/7/22 19:28, Tom Lane wrote: > Tomas Vondra writes: >> Yeah, this seems like a clear bug - we should not fail queries like >> this. It's a sign statext_is_compatible_clause() and the MCV code >> disagrees which clauses are compatible. > > Indeed. I attempted to fix that at e33ae53dd. > Thanks! That probably explains why I've been unable to reproduce that, I haven't realized there's a fix already. -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: backend crash on DELETE, reproducible locally
On 11/6/18 10:54 PM, Andres Freund wrote: > On 2018-11-06 16:47:20 -0500, Tom Lane wrote: >> =?UTF-8?Q?Ond=c5=99ej_Bouda?= writes: >>>> Ondřej, as a short-term workaround you could prevent the crash >>>> by setting that index's recheck_on_update property to false. >> >>> Thanks for the tip. I am unsuccessful using it, though: >>> # ALTER INDEX public.schedulecard_overlap_idx SET (recheck_on_update = >>> FALSE); >>> ERROR: unrecognized parameter "recheck_on_update" >> >> Oh, for crying out loud. That's yet a different bug. >> I'm not sure that it's the fault of the recheck_on_update >> feature proper though; it might be a pre-existing bug in >> the reloptions code. Looks like somebody forgot to list >> RELOPT_KIND_GIST in RELOPT_KIND_INDEX, but is that the >> fault of commit c203d6cf8 or was it busted before? > > Looks new: > + RELOPT_KIND_INDEX = > RELOPT_KIND_BTREE|RELOPT_KIND_HASH|RELOPT_KIND_GIN|RELOPT_KIND_SPGIST, > > there aren't any other "for all indexes" type options, so the whole > category didn't exist before. > > It also strikes me as a really bad idea, even if RELOPT_KIND_GIST > wouldn't have been omitted: It breaks index am extensibility. > Does it? The RELOPT_KIND_* stuff is hard-coded in reloptions.h anyway, so I'm not sure how this particular thing makes it less extensible? That being said, we also have RELOPT_KIND_BRIN, and that seems to be missing from RELOPT_KIND_INDEX too (and AFAICS the optimization works for all index types). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: integrate Postgres Users Authentication with our own LDAP Server
On Thu, May 09, 2019 at 07:11:24AM +, M Tarkeshwar Rao wrote: We want to setup ldap authentication in pg_hba.conf, for Postgresql users(other than postgres super user). We are getting issue with special characters by following steps given in postgres documentation. It is not accepting any special characters as special characters are mandatory in our use case. Can you please help us or have you any steps by which we can configure any postgres with LDAP? Please don't cross-post - this is a fairly generic question, it has nothing to do with performance or development, so the right thing is to send it to pgsql-general. Likewise, it makes little sense to send questions to the "owner". I've removed the other lists from CC. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Question regarding specifics of GIN and pg_trgm performance and potential use of show_trgm to improve it
2273.256 ms > (13 rows) > > > Thank you for any sort of insight into this. I don't think indexing the number of trigrams like this can help, and I'm not sure how to improve this (at least for the built-in GIN). It seems similarity searches are bound to be proportional to the most frequent trigram in the query. I wonder if the "newer" GIN variants like RUM [1] could improve this, but I don't think it has trgm opclasses. regards [1] https://github.com/postgrespro/rum -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: pb with join plan
On 6/21/23 00:26, Marc Millas wrote: > > > On Tue, Jun 20, 2023 at 11:19 PM David Rowley <mailto:dgrowle...@gmail.com>> wrote: > > On Wed, 21 Jun 2023 at 08:34, Marc Millas <mailto:marc.mil...@mokadb.com>> wrote: > > > > On Tue, Jun 20, 2023 at 10:14 PM David Rowley > mailto:dgrowle...@gmail.com>> wrote: > >> > >> On Wed, 21 Jun 2023 at 07:42, Marc Millas <mailto:marc.mil...@mokadb.com>> wrote: > >> > But if I do the same with clause one OR clause 2, I have to > kill the request after an hour, seeing the filesystem showing more > than 140 Mb of increased usage. > >> > >> > > link to the anonymized plan of the req with one clause : > https://explain.depesz.com/s/TWp4 <https://explain.depesz.com/s/TWp4> > > link to the plan with the second > clause alone: https://explain.depesz.com/s/byW5 > <https://explain.depesz.com/s/byW5> > link to the plan with both clauses ORed (the one not > finishing) https://explain.depesz.com/s/jHO2 > <https://explain.depesz.com/s/jHO2> > > > > It's quite difficult to know what the problem is you want to fix here. > Your initial post indicated it was the query with the OR condition > that was causing you the problems, but the plan you've posted has no > OR condition?! > > You're more likely to get help here if you take time to properly > explain the situation and post the information that's actually > relevant to the problem you're having, or state the problem more > clearly, as there's a mismatch somewhere. > > It might also be worth having a look at > https://wiki.postgresql.org/wiki/Slow_Query_Questions > <https://wiki.postgresql.org/wiki/Slow_Query_Questions> . EXPLAIN is not > going to tell us what part of the query is slow. I'll let the wiki > page guide you into what to do instead. > > > I know that page. obviously, as I have to kill the request, I cannot > provide a explain analyze... > It's a bit weird the "victor" table is joined seemingly without any join conditions, leading to a cross join (which massively inflates the cost for joins above it). Maybe the anonymized plan mangles it somehow. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: move databases from a MySQL server to Postgresql.
On 6/23/23 13:45, Alfredo Alcala wrote: > Hello > > I need to move some databases from a MySQL server to Postgresql. > > Can someone tell me the migration procedure, tools and recommendations? > I'm not an expert on this, but migrations tend to be somewhat application-specific. I'd suggest you take a look at this wiki page: https://wiki.postgresql.org/wiki/Converting_from_other_Databases_to_PostgreSQL#MySQL and maybe try some of the tools mentioned there (pgloader, mysql2pgsql, and so on). You'll have to give it a try on your databases, and then ask questions about practical issues you run into. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
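As a starting point, a single pgloader invocation often covers the bulk of such a migration; a minimal sketch with placeholder hosts and credentials (anything non-trivial, such as type mappings, is better described in a pgloader load file):

    # migrate schema and data for one database in one go
    pgloader mysql://appuser:secret@mysql-host/appdb \
             postgresql://appuser:secret@pg-host/appdb

Application SQL (queries, stored routines, MySQL-specific syntax) still needs to be ported and tested separately.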
Re: [Beginner Question]What is query jumbling?
On 7/17/23 05:23, Wen Yi wrote: > Hi community, > I am reading the 'Generate code for query jumbling through > gen_node_support.pl' > (https://www.postgresql.org/message-id/E1pMk51-000puf-55%40gemulon.postgresql.org) > to understand the principle of the postgres's node, but I really don't know > the meaning of the word 'query jumbling'. > > Can someone provide me some advice? > Thanks in advance! > Stuff like this is usually explained somewhere in the code base, so "git grep" is your friend. In this case it's probably good to read the comment in src/backend/nodes/queryjumblefuncs.c which says: * Normalization is implemented by fingerprinting queries, selectively * serializing those fields of each query tree's nodes that are judged * to be essential to the query. This is referred to as a query jumble. So "jumbling" is a process to generate a "normalized" query. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
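For example, a sketch of the kind of search meant here, run from the top of a PostgreSQL source tree (the paths follow the file mentioned above):

    # find where "jumble" is defined and explained
    git grep -n -i "jumbl" -- src/backend/nodes src/include/nodes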
Re: Query take a long time and use no index
arallel Seq Scan on data (cost=0.00..188457.44 > rows=7452844 width=8) (actual time=0.137..2121.695 rows=5962263 loops=3) > Planning Time: 0.247 ms > JIT: > Functions: 11 > Options: Inlining false, Optimization false, Expressions true, > Deforming true > Timing: Generation 2.665 ms, Inlining 0.000 ms, Optimization 1.612 > ms, Emission 48.580 ms, Total 52.857 ms > Execution Time: 3525.656 ms > (12 rows) > > Why are the indexes not used ? It's a common misconception that indexes necessarily improve performance and that slow queries are slow because of missing indexes. In practice indexes are most useful for selective queries, i.e. queries that scan only a fraction of the table - say, a couple percent. But this query scans ~17M rows from the table, and my guess is that's a significant part of the table. In which case the index is useless. Also, notice the seqscan took ~4s. Even if that drops to ~1s with an index, the overall query will still take 12s. > Is it a Problem of the DATE_TRUNC ? My guess is it's more about the DISTINCT, which forces a sort before the aggregate function. Which also means the query can't be parallel. You can try simplifying the query step by step - remove the DISTINCT first, then the DATE_TRUNC. > How can I optimize? > It's always going to be slow with the COUNT(DISTINCT), I'm afraid. Not sure how much you can modify the query / database, and how accurate results you need. If you're OK with estimates, you can try postgres-hll extension [1] which estimates count(distinct). For exact results, I wrote count_distinct extension [2] that uses a hashtable. Might be worth a try, I guess. Another option is to precalculate stuff, and do queries on that. That is, you're not going to add data with old timestamps, so you can pre-aggregate that. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
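A sketch of the pre-aggregation idea; the timestamp and id column names are hypothetical, since the original query was not shown in full:

    -- precompute per-day distinct counts once, e.g. refreshed nightly
    CREATE MATERIALIZED VIEW data_daily AS
    SELECT date_trunc('day', ts) AS day,
           count(DISTINCT device_id) AS devices
    FROM data
    GROUP BY 1;

    REFRESH MATERIALIZED VIEW data_daily;   -- rerun after each bulk load

    -- reports then read the small aggregate instead of scanning the whole table
    SELECT * FROM data_daily WHERE day >= now() - interval '30 days';

Note that per-day distinct counts cannot simply be summed across days, so the materialized view has to be defined at the granularity the reports actually need.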
Re: Query take a long time and use no index
On 7/17/23 13:20, Tomas Vondra wrote: > ... > > It's always going to be slow with the COUNT(DISTINCT), I'm afraid. > > Not sure how much you can modify the query / database, and how accurate > results you need. If you're OK with estimates, you can try postgres-hll > extension [2] which estimates count(distinct). For exact reaults, I > wrote count_distinct extension [2] that uses hashtable. Might be worth a > try, I guess. > Damn, I forgot to add the links: [1] https://github.com/citusdata/postgresql-hll [2] https://github.com/tvondra/count_distinct regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: How to grant read only functions execute permission to read only user
On 7/17/23 16:11, Yi Sun wrote: > Hello guys, > > Our read only user is okay to be granted read only permission of tables. > > How to grant read only functions execute permission to read only user, > is there a simple way to do it please? If not, how to get the list of > read only functions please? Then can grant one by one based on the list, > thanks What is a "read-only function"? I don't think Postgres has anything like that. Functions inherit the privileges of the user that executes them by default. So if the user is read-only (i.e. has just SELECT privilege), then the function can't do any writes either. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
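If the goal is merely to control which functions the role may call at all, that is an ordinary EXECUTE grant; a sketch with made-up function and role names:

    -- EXECUTE is granted to PUBLIC by default, so revoke it first ...
    REVOKE EXECUTE ON FUNCTION report_totals(date) FROM PUBLIC;

    -- ... then grant it back only to the read-only role
    GRANT EXECUTE ON FUNCTION report_totals(date) TO readonly_user;

Any write the function attempts will still fail with a permission error if readonly_user lacks the underlying table privileges (unless the function is SECURITY DEFINER, which is a deliberate choice by its owner).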
Re: Intermittent Issue with WAL Segment Removal in Logical Replication
On 12/27/23 16:31, Kaushik Iska wrote: > Hi all, > > I'm including additional details, as I am able to reproduce this issue a > little more reliably. > > Postgres Version: POSTGRES_14_9.R20230830.01_07 > Vendor: Google Cloud SQL > Logical Replication Protocol version 1 > I don't know much about Google Cloud SQL internals. Is it relatively close to Postgres (as e.g. RDS) or are the internals very different / modified for cloud environments? > Here are the logs of attempt succeeding right after it fails: > > 2023-12-27 01:12:40.581 UTC [59790]: [6-1] db=postgres,user=postgres > STATEMENT: START_REPLICATION SLOT peerflow_slot_wal_testing_2 LOGICAL > 6/5AE67D79 (proto_version '1', publication_names > 'peerflow_pub_wal_testing_2') <- FAILS > 2023-12-27 01:12:41.087 UTC [59790]: [7-1] db=postgres,user=postgres > ERROR: requested WAL segment 000100060059 has already been > removed > 2023-12-27 01:12:44.581 UTC [59794]: [3-1] db=postgres,user=postgres > STATEMENT: START_REPLICATION SLOT peerflow_slot_wal_testing_2 LOGICAL > 6/5AE67D79 (proto_version '1', publication_names > 'peerflow_pub_wal_testing_2') <- SUCCEEDS > 2023-12-27 01:12:44.582 UTC [59794]: [4-1] db=postgres,user=postgres > LOG: logical decoding found consistent point at 6/5A31F050 > > Happy to include any additional details of my setup. > I personally don't see how could this fail and then succeed, unless Google does something smart with the WAL segments under the hood. Surely we try to open the same WAL segment (given the LSN is the same), so how could it not exist and then exist? As Ron already suggested, it might be useful to see information for the replication slot peerflow_slot_wal_testing_2 (especially the restart_lsn value). Also, maybe show the contents of pg_wal (especially for the segment referenced in the error message). Can you reproduce this outside Google cloud environment? regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
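The slot information asked for above can be pulled with a query along these lines (the slot name comes from the log excerpt; run it on the primary):

    SELECT slot_name, active, restart_lsn, confirmed_flush_lsn,
           pg_walfile_name(restart_lsn) AS restart_walfile
    FROM pg_replication_slots
    WHERE slot_name = 'peerflow_slot_wal_testing_2';

Comparing restart_walfile with the segment named in the error message, and with what is actually present in pg_wal, should show whether the "removed" segment was ever still needed by the slot.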
Re: Intermittent Issue with WAL Segment Removal in Logical Replication
On 12/29/23 22:28, Kaushik Iska wrote: > I am unfortunately not really familiar with Google Cloud SQL internals > as well. But we have seen this happen on Amazon RDS as well. > Do you have a reproducer for regular Postgres? > Could it be possible that we are requesting a future WAL segment, say > WAL upto X is written and we are asking for X + 1? It could be that the > error message is misleading. > I don't think that should be possible. The LSN in the START_REPLICATION comes from the replica, where it's tracked as the last LSN received from the upstream. So that shouldn't be in the future. And it's doesn't seem to be suspiciously close to segment boundary either. In fact, the LSN in the message is 6/5AE67D79, but the "failed" segment is 000100060059, which is the *preceding* one. So it can't be in the future. > I do not have the information from pg_replication_slots as I have > terminated the test. I am fairly certain that I can reproduce this > again. I will gather both the restart_lsn and contents of pg_wal for the > failed segment. Is there any other information that would help debug > this further? > Hard to say. The best thing would be to have a reproducer script, ofc. If that's not possible, the information already requested seems like a good start. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: How do you optimize the disk IO when you cannot assume a file will start at a boundary then?
On 2/22/24 02:22, Siddharth Jain wrote: > Hi All, > > I understand the storage layer in databases goes to great lengths to ensure: > - a row does not cross a block boundary > - read/writes/allocation happen in units of blocks > etc. The motivation is that at the OS level, it reads and writes pages > (blocks), not individual bytes. I am only concerned about SSDs but I think > the principle applies to HDD as well. > > but how can we do all this when we are not even guaranteed that the > beginning of a file will be aligned with a block boundary? refer this > <https://stackoverflow.com/questions/8018449/is-it-guaranteed-that-the-beginning-of-a-file-is-aligned-with-pagesize-of-file-s> > . > > Further, I don't see any APIs exposing I/O operations in terms of blocks. > All File I/O APIs I see expose a file as a randomly accessible contiguous > byte buffer. Would it not have been easier if there were APIs that exposed > I/O operations in terms of blocks? > > can someone explain this to me? > The short answer is that this is well outside our control. We do the best we can - split our data files to "our" 8kB pages - and hope that the OS / filesystem will do the right thing to map this to blocks at the storage level. The filesystems do the same thing, to some extent - they align stuff with respect to the beginning of the partition, but if the partition itself is not properly aligned, that won't really work. As for the APIs, we work with what we have in POSIX - I don't think there are any APIs working with blocks, and it's not clear to me how would it fundamentally differ from the APIs we have now. Moreover, it's not really clear which of the "block" would matter. The postgres 8kB page? The filesytem page? The storage block/sector size? FWIW I think for SSDs this matters way more than for HDD, because SSDs have to erase the space before a rewrite, which makes it much more expensive. But that's not just about the alignment, but about the page size (with smaller pages being better). regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Performance degradation after upgrading from 9.5 to 14
On 4/17/24 19:13, Johnathan Tiamoh wrote: > Hello, > > > I performed an upgrade from postgresql-9.5 to postgresql-14 and the > performance has degraded drastically. > > Please, is they any advice on getting performance back ? > There's very little practical advice we can provide based on this report, because it's missing any useful details. There's a number of things that might have caused this, but we'd have to speculate. For example: 1) How did you upgrade? pg_dump or pg_upgrade? 2) Did you run ANALYZE to collect statistics after the upgrade? 3) Did you transfer the configuration, or did you just create a new cluster with the default values? 4) What exactly is slower? Queries? Inserts? 5) Can you quantify the impact? Is it 2x slower? 100x slower? regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
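On point 2, the usual step after either a pg_dump/restore or a pg_upgrade is to rebuild planner statistics for the whole cluster, e.g.:

    # regenerate statistics for all databases, coarse first, then refined
    vacuumdb --all --analyze-in-stages

Missing statistics after an upgrade are one of the most common reasons for queries suddenly picking much worse plans.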
Re: Unexpected data when subscribing to logical replication slot
Hi, On 5/8/24 11:17, Daniel McKenzie wrote: > We have a .NET application which subscribes to a logical replication slot > using wal2json. The purpose of the application is to publish events to AWS > SQS. We are required to first "enrich" these events by querying the > database. > > We have found that these queries will often find old data (i.e. the data > that existed prior to the update) which is unexpected. > > For example, when I use a psql terminal to update a user's last name from > "Jones" to "Smith" then I would expect the enrichment query to find "Smith" > but it will sometimes still find "Jones". It finds the old data perhaps 1 > in 50 times. > > To reproduce this I use a psql terminal to execute an update statement > which changes a user's last name with \watch 1.5 and monitor our > application logs for cases where the wal2json output and the enrichment > query output have different last names. > Where/how does the enrichment query run? How does the whole process look like? I guess an application is receiving decoded changes as JSON, and then querying the database? > We have compared transaction ids by adding include-xids to pg_recvlogical > and adding txid_current() to to the enrich query and the txid_current() is > always the xid + 1. > > We have found two things that appear to resolve the problem - > >- Using a more powerful EC2 instance. We can reproduce the issue with a >r7a.medium instance but not with a r7a.large EC2 instance. >- Changing the Postgres synchronous_commit parameter from "on" to "off". >We cannot reproduce the issue with synchronous_commit set to "off". > > We need help to understand this unexpected behaviour. > Would be good to have some sort of reproducer - ideally a script that sets up an instance + replication, and demonstrates the issue. Or at least a sufficiently detailed steps to reproduce it without having to guess what exactly you did. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: page is not marked all-visible but visibility map bit is set in relation "pg_statistic"
On 7/11/24 19:21, Gus Spier wrote: > AWS RDS Postgres Aurora version 14.4 > I believe RDS and Aurora are two separate products, so I'm a bit confused by this ... > Error log shows: page is not marked all-visible but visibility map bit is > set in relation "pg_statistic" > > To me, that sounds ominous. But, there does not appear to be any great > performance hit. The applications are chugging along nicely. The end-users > have not yet gathered at my door, waving torches and pitch-forks. > > What is the correct course of action in this case? > I'd say you need to report this to AWS support - we have no idea what changes they made in Aurora, and AFAIK they made a lot of changes in this area. That is not to say we don't have any bugs in this area - see for example the discussion in [1], but we are not in a position to investigate issues on a proprietary product. regards [1] https://www.postgresql.org/message-id/CAH2-WznuNGSzF8v6OsgjaC5aYsb3cZ6HW6MLm30X0d65cmSH6A%40mail.gmail.com -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: PG Dump on 11 - Restore on 16 - is possible?
On 8/13/24 11:47, Durumdara wrote: > Hello! > > We have to upgrade our server. The PG upgrade is impossible because of > too much data. I'm not sure I understand. What exactly is the problem with the amount of data? How much data are you dealing with, and is the problem in the amount it takes to copy the data, or do you have limited amount of disk space, or what is the issue? I'd say pg_upgrade (in link mode) is probably orders of magnitude faster for upgrading large clusters - probably less than a minute even for huge clusters (multiple TB of data). And it doesn't need more disk space, it does not copy most of the data. Typically, you create a replica on the new system (or clone the fs volume in some way), and then switch over and do the pg_upgrade. pg_dump is pretty expensive - it exports and imports the database, has to rebuild indexes, ... I'd be surprised if this wasn't more laborious and time consuming than the pg_upgrade way. > So we will create a new server with PG 16, and we have to migrate the > databases one by one (from a PG 11 server). > > Is it possible? > > Does PG16 read PG11 dumps and restore them correctly on the PG16 server? > Yes. The docs actually say this: Because pg_dump is used to transfer data to newer versions of PostgreSQL, the output of pg_dump can be expected to load into PostgreSQL server versions newer than pg_dump's version. > Or is it unsupported now? > It's true PG11 is EOL already, but AFAIK the dump/restore should work, we try to support even unsupported versions - I don't recall if the policy is ~10 years, but the docs say: pg_dump can also dump from PostgreSQL servers older than its own version. (Currently, servers back to version 9.2 are supported.) I'd probably use pg_dump from the target version (PG16), but it should not matter. regards -- Tomas Vondra
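A rough sketch of the link-mode pg_upgrade described above (all paths are hypothetical; run with --check first, then again without it):

  $ pg_upgrade \
      --old-bindir=/usr/pgsql-11/bin        --new-bindir=/usr/pgsql-16/bin \
      --old-datadir=/var/lib/pgsql/11/data  --new-datadir=/var/lib/pgsql/16/data \
      --link --check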
Re: About replication minimal disk space usage
On 8/24/24 14:18, Manan Kansara wrote: > Hello All, > I have my self hosted postgres server on aws with 16gb disk space > attached to it for ml stuff and analysis stuff we are using vertex ai so > i have setup live replication of postgres using data stream service to > BigQuery table. We use BigQuery table as data warehouse because we have > so many different data source so our data analysis and ml can > happened at one place. > but problem is there When i am starting replication in there pg_wal take > whole space about 15.8gb in some days of starting replication > > *_Question_ *: how can i setup something like that that optimally use > disk space so old pg_wal data that are not usable can we delete i think > i should create one cron job which taken care whole that things but i > don't know any approach can you please guide > In future if as data grew i will attached more disk space to that > instance but i want to make optimal setup so my whole disk is not in > full usage any time and my server crash again. > Why don't you just give it more disk space? I'm not a fan of blindly throwing hardware at an issue, but 16GB is tiny these days, especially if it's shared by both data and WAL, and the time you spend optimizing this is likely more expensive than any savings. If you really want to keep this on 16GB, I think we'll need more details about what exactly you see on the instance / how it runs out of disk space. AFAIK datastream relies on logical replication, and there's a couple ways how that may consume disk space. For example, the datastream replication may pause for a while, in which case the replication slot will block removal of still-needed WAL, and if the pause is long enough, that may be an issue. Of course, we have no idea how much data you're dealing with (clearly not much, if it fits onto 16GB of disk space with everything else). Another option is that you have a huge transaction (inserting and/or modifying a lot of data at once), and the logical decoding ends up spilling the decoded transaction to disk. If you want a better answer, I think you'll have to provide a lot more details. For example, which PostgreSQL version are you using, and how is it configured? What config parameters have non-default values? regards -- Tomas Vondra
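To check whether a paused or lagging replication slot is what is pinning the WAL, something along these lines helps (on PG 13+ max_slot_wal_keep_size can also cap how much WAL a slot may retain):

  SELECT slot_name, active, restart_lsn,
         pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
    FROM pg_replication_slots;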
Re: ERROR: could not open relation with OID XXXX
On 8/25/24 15:42, Marcelo Zabani wrote: > Hi all, > > I can reproduce the error in the subject from time to time when querying > catalog tables while DDL is happening concurrently. Here's a bash script > that reproduces it (not always, you might have to run it many times > until you see ERROR: could not open relation with OID ): > > #!/usr/bin/env bash > psql -c "create table test(x serial primary key); select oid, relname > from pg_class where relname='test'" > # The next two queries will run concurrently > psql -c "select oid, relname, pg_sleep(3), pg_get_indexdef(oid) from > pg_class join pg_index on indexrelid=pg_class.oid WHERE > relname='test_pkey';" 2>&1 1>/tmp/pgbug.log & > sleep 1 > psql -c "drop table test" > cat /tmp/pgbug.log > wait > > I am confused as to how this is possible. I assume if the row with the > test_pkey index exists in the pg_index catalog table, that the snapshot > of the catalog tables contains the test table itself and is generally > consistent, so querying the catalog should not run into such errors. > I think you're assuming the whole query runs with a single snapshot, and AFAIK that's not quite accurate - we do some special stuff for catalogs, for example. There's also the additional complexity of maintaining a cache on catalogs, invalidating it, etc. I don't have a great simple "this happens because X" explanation, but a lot of this relies on proper locking - in particular, that we lock all the objects before execution, which also invalidates all the caches etc. But that can't happen here, because we only realize we need to access the OID very late in the execution, when we get to pg_get_indexdef. > I've seen this happen in Production without pg_sleep in the mix, too. I > added pg_sleep to the example above only because it makes the error > easier to reproduce. > It's a race condition, essentially. The sleep just makes it easier to hit, but it can happen without it. > Is there something I can do to avoid this? Is my understanding of how > the catalog tables work wrong? > I believe you could actually lock the pg_class rows for update. Just add FOR UPDATE at the end of the query. regards -- Tomas Vondra
Re: ERROR: could not open relation with OID XXXX
On 8/25/24 17:36, Marcelo Zabani wrote: >> we do some special stuff for catalogs > > That is good to know, thanks! > >> I believe you could actually lock the pg_class rows for update. Just > add FOR UPDATE at the end of the query. > > Thanks, but I tried that and got "ERROR: permission denied for table > pg_class", even if I try it only for tables the user owns. > Yeah, I tried that with a superuser, I didn't realize it'd fail for regular users who don't have rights to modify pg_class directly. > At least considering the use-case of avoiding this error due to > temporary tables/indexes (which are a part of normal application > execution), I was thinking of using materialized CTEs that filters those > out, and only after that using other functions that for example take > OIDs and return definitions. Other kinds of DDL that create > non-temporary tables can be "blamed" on developers in my case. > Do you think using those materialized CTEs could help? And do you think > this can be considered a bug that I should report or is it just too > edge-casey to consider? I have no idea what you mean - what would be filtered out, etc. But you can give it a try - you have a reproducer, so testing if it fails should not be that difficult. Other than that, I think you can do two things: 1) Make the application simply retry if it sees this error. 2) Lock the objects using LOCK TABLE before running the query. regards -- Tomas Vondra
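A sketch of option 2, using the table from the reproducer earlier in the thread; holding even a weak lock on the table for the duration of the query blocks a concurrent DROP TABLE until the transaction commits:

  BEGIN;
  LOCK TABLE test IN ACCESS SHARE MODE;
  SELECT oid, relname, pg_get_indexdef(oid)
    FROM pg_class JOIN pg_index ON indexrelid = pg_class.oid
   WHERE relname = 'test_pkey';
  COMMIT;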
Re: BRIN index on timestamptz
On 4/23/21 10:31 AM, Mohan Radhakrishnan wrote: Hi, I am planning to use as I search based on timestamptz fields. There are millions of records.I refer https://www.percona.com/blog/2019/07/16/brin-index-for-postgresql-dont-forget-the-benefits <https://www.percona.com/blog/2019/07/16/brin-index-for-postgresql-dont-forget-the-benefits> I execute this on the AWS RDS instance. Is there something in the plan I should pay attention to ? I notice the Execution Time. Thanks, Mohan INSERT INTO testtab (id, date, level, msg) SELECT g, CURRENT_TIMESTAMP + ( g || 'minute' ) :: interval, random() * 6, md5(g::text) FROM generate_series(1,800) as g; Two things: a) You need to do ANALYZE, otherwise there are no statistics the optimizer could use (which is why the row estimates in the plans are entirely bogus). b) BRIN indexes don't work on random data, because the whole idea is about eliminating large blocks of data (e.g. 1MB). But with random data that's not going to happen, because each such range will match anything. Which is why seqscan is a bit faster than when using BRIN index. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
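A sketch of the two points above, using the test table from the question (the date column is the one being indexed):

  -- collect statistics so the planner has realistic row estimates
  ANALYZE testtab;

  -- BRIN helps when the column value is correlated with the physical row order,
  -- which is typically the case for append-only timestamp columns
  CREATE INDEX testtab_date_brin ON testtab USING brin (date);

  -- correlation close to 1.0 means BRIN can actually skip block ranges
  SELECT attname, correlation FROM pg_stats
   WHERE tablename = 'testtab' AND attname = 'date';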
Re: Extended stats - value not in MCV list
Hi, this topic would probably be a better fit for pgsql-hackers, as it's about the internal implementation, but let's continue the discussion here for now. On 5/2/21 6:52 AM, Pedro Luis Guzmán Hernández wrote: Hi there, I've just started using extended stats cause the planner was giving me terrible estimates for a certain table. MCV extended stats solved my problem when values are in the extended MCV list, but estimates are still terrible when they are not in the MCV list. Limited as my knowledge of database internals is, I dug into the source code and found an important difference on how these not-MCV values are handled in the single-column and multi-column cases. Hmm, so which which PostgreSQL version are you testing this on? I'm not sure which code branch you're looking at. Can you share a reproducer, i.e. a simple example demonstrating the issue? That'd be very helpful. For single columns, the estimate is calculated as follows: selectivity = (1 - sum(MCV_frequencies)) / (distinct_values - count(MCV)) Which seems to assume a uniform distribution of non-MCV values and looks like a sensible guess, at least to my amateur eyes. For multi-column statistics it seems to me that the estimate is calculated instead as: selectivity = 1 - sum(MCV_frequencies) Which instead seems to assume that the value could potentially be present in all the rows not covered by the MCV. This seems like an adequate upper bound, but is also quite conservative compared to the single-column estimate. In my specific case this yields a selectivity even higher than some of the least frequent MCV values, which is a condition that is actually checked for and compensated in the single-column estimate as an additional check. I have MCV and distinct extended stats, so I know the distinct_values stats is available. It shouldn't behave like that, and some quick experiments suggest it does not (at least on master). I can't rule out a bug, of course. A reproducer would be helpful. So I hope my question is clear from the above. How come the estimates are calculated with such different approaches? I insist I have no experience with database/query planner development, so excuse me if I am overlooking some obvious conceptual difference between single-column and multi-column stats. The single-column estimate is actually described in the documentation, but the multi-column estimate is not. If there is indeed a good reason for this difference, I think it should be documented. As for the ndistinct estimates and multi-column MCV lists, it's not all that simple - there may be conditions on only some of the columns, in which case we don't know how many groups we actually matched, etc. TBH I'm not sure how much of those implementation details we want to put into the user docs - it may be a bit too much, and we may need to change some of it. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
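For reference, a minimal example of the kind of statistics object discussed here (table and column names are made up; the mcv kind requires PG 12 or newer):

  CREATE STATISTICS orders_stats (ndistinct, mcv) ON client_id, status FROM orders;
  ANALYZE orders;

  -- inspect the definition of the statistics object
  SELECT stxname, stxkind FROM pg_statistic_ext WHERE stxname = 'orders_stats';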
Re: Strange behavior of function date_trunc
On 5/5/21 3:23 PM, Pavel Luzanov wrote: Hello, It is very likely that the date_trunc function in the following example is executed for each line of the query. Although it marked as a STABLE and could only be called once. It could, but that's just an option - the database may do that, but it's not required to do it. In this case it might be beneficial, but it'd make the planner more complex etc. EXPLAIN (ANALYZE) SELECT * FROM generate_series('2021-01-01', '2021-06-01', '1 s'::interval) AS g(x) WHERE g.x >= date_trunc('day', '2021-05-05'::timestamptz); QUERY PLAN --- Function Scan on generate_series g (cost=0.00..15.00 rows=333 width=8) (actual time=2801.884..3263.328 rows=2332801 loops=1) Filter: (x >= date_trunc('day'::text, '2021-05-05 00:00:00+03'::timestamp with time zone)) Rows Removed by Filter: 10713600 Planning Time: 0.040 ms Execution Time: 3336.657 ms When replacing date_trunc with now, the query is much faster: EXPLAIN (ANALYZE) SELECT * FROM generate_series('2021-01-01', '2021-06-01', '1 s'::interval) AS g(x) WHERE g.x >= now(); QUERY PLAN --- Function Scan on generate_series g (cost=0.00..15.00 rows=333 width=8) (actual time=1648.777..1845.430 rows=2275325 loops=1) Filter: (x >= now()) Rows Removed by Filter: 10771076 Planning Time: 0.039 ms Execution Time: 1918.767 ms The variant with now works almost as fast as with the constant. This suggests me that perhaps date_trunc is being executed for every line of the query: EXPLAIN (ANALYZE) SELECT * FROM generate_series('2021-01-01', '2021-06-01', '1 s'::interval) AS g(x) WHERE g.x >= '2021-05-05'::timestamptz; QUERY PLAN --- Function Scan on generate_series g (cost=0.00..12.50 rows=333 width=8) (actual time=1628.743..1826.841 rows=2332801 loops=1) Filter: (x >= '2021-05-05 00:00:00+03'::timestamp with time zone) Rows Removed by Filter: 10713600 Planning Time: 0.033 ms Execution Time: 1901.680 ms In this regard, I have two questions: 1. How can I find out exactly how many times the date_trunc function has been executed? So far, these are just my assumptions. 2. If date_trunc is indeed called multiple times, why is this happening? Well, it'd not like date_trunc is executed for each row while now() is executed only once. The functions are executed for each row in both cases, but now() is simply much cheaper - it just returns a value that is already calculated, while date_trunc has to parse and truncate the value, etc. You can use CTE to execute it just once, I think: with x as (select date_trunc('day', '2021-04-01'::timestamptz) as x) select * from t where a > (select x from x); regards Tomas
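Applying that workaround to the query from this thread would look roughly like this (the uncorrelated scalar subquery is evaluated only once, as an init plan):

  WITH d AS (SELECT date_trunc('day', '2021-05-05'::timestamptz) AS ts)
  SELECT *
    FROM generate_series('2021-01-01', '2021-06-01', '1 s'::interval) AS g(x)
   WHERE g.x >= (SELECT ts FROM d);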
Re: How to hash a large amount of data within Postgres?
On 6/23/21 7:39 PM, Peter J. Holzer wrote: > On 2021-06-21 15:53:09 +0200, Thorsten Schöning wrote: >> Some years ago I implemented some SQL to read all files, build a table >> of SHA256 hashes and tell me how much data is redundant. The goal was >> to have a look at which files share the same hash with different LOIDs >> and optionally change that, so that all those files are only stored >> once on the end. >> >> While the approach was pretty naive, because it simply read all files >> into memory to calculate the hashes, I'm somewhat sure it worked in >> the past with Postgres 9.6. The executing server had enough free RAM >> available as well to process the at most ~4 GiB large files one after >> another. >> >> I tried that SQL today with Postgres 11 on UB 18.04 and it failed: >> >>> [Code: 0, SQL State: XX000] FEHLER: invalid memory alloc request size >>> 1898107949 >>> Wobei: PL/pgSQL-Funktion loid_hash_calc(oid,text)[...] > [...] >> I searched regaridng that issue and only found two relevant results: >> Corrupted rows for some reason and simply size restrictions when >> allocating memory. The latter is more likely than the former in my >> case, as the restrictions seems to be 1 GiB and I do have larger >> files. > > 1 GB is the maximum size of quite a few data structures in PostgreSQL. I > don't use PL/pgSQL, but I wouldn't be surprised if it was the maximum > size of whatever loread() returns (a bytea?). I would be surprised if > this limit was higher in version 9.6 than it is in version 11, however. > Well, it's actually a bit worse than that - the maximum allocation size is (1GB - 1B), as it's defined like this: #define MaxAllocSize((Size) 0x3fff) /* 1 gigabyte - 1 */ And this includes both the "user data" and a small "header" used for the bytea value. Depending on what format you use to output the values there may be additional limits (e.g. 'hex' requires 2 characters per byte, so doubling the amount of memory needed). For large objects this is not an issue, because we store them in small chunks, not as one large bytea value. > >> I'm doing the following simply currently, because I didn't find any >> interfaces allowing to forward blocks of data, LOIDs, file descriptors >> or anything like that working with smaller buffers or alike. >> >>> fd := lo_open( loid, INV_READ); >>> size:= lo_lseek(fd, 0, SEEK_END); >>> PERFORMlo_lseek(fd, 0, SEEK_SET); >> >>> hashBin := digest(loread(fd, size), algorithm); >>> hashHex := encode(hashBin, 'hex'); >> >> So, is there any way to work around the problem I have currently? > > Normally, hash libararies have a way to feed chunks of data into a hash > computations to avoid having to keep the whole thing in RAM. > The pgcrypto extension seems to be lacking such functionality, however. > > I would build something similar to a Merkle tree: > > Choose a convenient chunk size (a few MB is probably ok), read the large > object in chunks of this size, computing the hash for each. Concatenate > all the hashes and compute the hash of that. Add intermediate levels if > the the concatenated hashes are still too large to fit in memory. > Not sure where you searched, but there definitely are interfaces to read chunks of data from large objects - see this: 1) lo_get (loid, offset, length) https://www.postgresql.org/docs/13/lo-funcs.html 2) lo_seek() + lo_read() https://www.postgresql.org/docs/13/lo-interfaces.html Obviously, you can't do "loread(fd, size)" because that's going to attempt building one large bytea, failing because of the alloc limit. 
You have to stream the data into the hash. Doing that in plpgsql is possible, although possibly somewhat slow. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
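A sketch of the chunked, Merkle-style approach described above, using lo_get() and pgcrypto's digest(); the function name and chunk size are arbitrary, the loop assumes lo_get() returns an empty bytea past the end of the object, and the result is a hash of the chunk hashes, not the same value as hashing the whole file in one pass:

  CREATE OR REPLACE FUNCTION loid_hash_chunked(p_loid oid, p_algo text,
                                               p_chunk int DEFAULT 8 * 1024 * 1024)
  RETURNS text AS $$
  DECLARE
      v_off    bigint := 0;
      v_chunk  bytea;
      v_hashes bytea := ''::bytea;
  BEGIN
      LOOP
          v_chunk := lo_get(p_loid, v_off, p_chunk);        -- read one chunk
          EXIT WHEN v_chunk IS NULL OR length(v_chunk) = 0;
          v_hashes := v_hashes || digest(v_chunk, p_algo);  -- hash of this chunk
          v_off := v_off + p_chunk;
      END LOOP;
      RETURN encode(digest(v_hashes, p_algo), 'hex');       -- hash of the concatenated chunk hashes
  END;
  $$ LANGUAGE plpgsql;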
Re: ERROR: unrecognized node type
On 9/29/21 8:37 PM, Tom Lane wrote: Adrian Klaver writes: Unless folks are using plain text only email readers. That is why screenshots are generally a bad idea for textual information. For the record the Postgres version is 12.8. Hmm, if it's v12 then NodeTag 223 is T_List, which is ... a bit surprising. That's a common enough node type that "oversight" doesn't seem to fly as an explanation. I don't think we'll be able to make progress on this without a lot more detail. A self-contained example that triggers it would be very useful; or maybe you could get a backtrace from the point of the error? https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend A handy trick for similar cases (not mentioned in the wiki) is to run SET log_error_verbosity = verbose; and then trigger the error. This logs the exact location (file:line) where the error is printed, so you can then set a breakpoint at that place. That makes it easier to generate the backtrace. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: ERROR: unrecognized node type
On 9/30/21 8:38 AM, Amine Tengilimoglu wrote: ... Remarkable thing the related error occurs when executing sql statements containing where. The sqls that do not contain a where are not getting an error. Location information as below; ERROR: XX000: unrecognized node type: 223 *LOCATION: exprType, nodeFuncs.c:263* STATEMENT: SELECT n.nspname as "Schema", c.relname as "Name", CASE c.relkind WHEN 'r' THEN 'table' WHEN 'v' THEN 'view' WHEN 'm' THEN 'materialized view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' WHEN 's' THEN 'special' WHEN 'f' THEN 'foreign table' WHEN 'p' THEN 'partitioned table' WHEN 'I' THEN 'partitioned index' END as "Type", pg_catalog.pg_get_userbyid(c.relowner) as "Owner" FROM pg_catalog.pg_class c LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace WHERE c.relkind IN ('r','p','v','m','S','f','') AND n.nspname <> 'pg_catalog' AND n.nspname <> 'information_schema' AND n.nspname !~ '^pg_toast' AND pg_catalog.pg_table_is_visible(c.oid) ORDER BY 1,2; I'm unable to reproduce the issue, so it probably depends on what tables are created etc. But if you say it only happens with WHERE clause, that's interesting. It suggests the failure probably happens somewhere in transformWhereClause, but we can only speculate why and the query conditions look entirely reasonable. I suggest you do this: 1) start a session, identify the PID of the backend select pg_backend_pid(); 2) attach a debugger (e.g. gdb) to the pid gdb -p $PID 3) set breakpoint to the location in the error message (gdb) break nodeFuncs.c:263 (gdb) continue 4) run the query, the breakpoint should be triggered 5) extract full backtrace (gdb) bt full 6) print the expression (gdb) p nodeToString(expr) That should give us some hints about what might be wrong ... regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: create a new GIN index for my own type
On 10/4/21 8:30 AM, huangning...@yahoo.com wrote: Hi: I created a new data type, and then I wanted to create a GIN index for it, but when I created the index, the program would crash 。 The version of postgresql is 9.6。 The following is part of the code, and I also refer to the code of intarray. I doubt anyone is going to investigate this unless you provide a more complete example - something like an extension where people can do "make install" without having to fill in various pieces of code. To investigate the crash, you need to attach a debugger to the backend and run the CREATE INDEX (or whatever triggers the crash). The debugger should catch the segfault and you'll be able to identify where exactly it crashes and why (and investigate). 1) first get PID of the backend SELECT pg_backend_pid(); 2) then attach a debugger to the backend gdb -p $PID (gdb) c 3) run the CREATE INDEX query 4) get backtrace from the debugger (gdb) bt regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Problem in pg_basebackup of 1.7 TB database while replicating in PostgreSQL 9.4.( Master and slave both are in postgreSQL 9.4)
On 10/4/21 9:28 AM, Amal Chakravarty wrote: Hi all. I am working with a database of 1.7 TB size which is in PostgreSQL 9.4. Kindly suggest how to take the pg_basebackup of such a huge data while replicating from master to slave. And what exactly is the issue? Does it fail in some way or are you looking for a faster / more efficient way to transfer the data? regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: create a new GIN index for my own type
On 10/4/21 3:32 PM, huangning...@yahoo.com wrote: I have already debugged the program according to this step, but I found that in the DataCopy function, the variable typlen should be 8, but it is -1, Well, if you have debugged this, it'd be nice if you could share more information (e.g. backtraces, etc.) otherwise others can just guess what you saw. And that makes it much harder to help you. I see you defined the data type as PASSEDBYVALUE, but you haven't specified INTERNALLENGTH, so it's -1 (i.e. variable length). Obviously, that can't be passed by value - not sure if this is intentional or just a case of CREATE TYPE not checking it. BTW it's customary not to top post - inline replies are much easier to follow, as it makes clearer which parts you respond to. And please make sure that you're responding to the mailing list, not just directly to the other person. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
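To illustrate the point about INTERNALLENGTH vs. PASSEDBYVALUE (type and function names here are made up, and the in/out functions must already exist): a pass-by-value type needs a small fixed length, while a variable-length type is always passed by reference:

  -- fixed-length, fits into a Datum, may be passed by value
  CREATE TYPE fixed_type (
      INPUT = fixed_type_in,
      OUTPUT = fixed_type_out,
      INTERNALLENGTH = 8,
      PASSEDBYVALUE
  );

  -- variable-length (varlena): INTERNALLENGTH = VARIABLE, no PASSEDBYVALUE
  CREATE TYPE var_type (
      INPUT = var_type_in,
      OUTPUT = var_type_out,
      INTERNALLENGTH = VARIABLE
  );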
Re: Avg/max size of these JSON docs in Postgres
On 10/12/21 21:21, Michael Lewis wrote: On Tue, Oct 12, 2021 at 12:51 PM Simon Riggs mailto:simon.ri...@enterprisedb.com>> wrote: On Tue, 12 Oct 2021 at 18:53, Michael Lewis mailto:mle...@entrata.com>> wrote: > > On Tue, Oct 12, 2021 at 1:43 AM Simon Riggs mailto:simon.ri...@enterprisedb.com>> wrote: >> >> On Tue, 12 Oct 2021 at 08:14, Ram Pratap Maurya >> mailto:ram.mau...@lavainternational.in>> wrote: >> >> > Confirm what is Avg/max size of these JSON docs in Postgres. >> >> JSON and JSONB datatypes can both be max 1GB in size. > > That is per row. No, that is per column. Yes, sorry. My attempt at clarification only muddled things. Each column within each row can be up to 1GB in size is how I understand the limit. But you're kinda right, actually. It very much is per-tuple, because in various places we form tuples with all the data inline. Consider for example this: create table t (a text, b text); alter table t alter column a set storage extended; alter table t alter column b set storage extended; insert into t select repeat(md5('a'), 512*1024*1024/32), repeat(md5('b'), 512*1024*1024/32); ERROR: invalid memory alloc request size 1073741880 Clearly, both values are only 512MB (no compression). Yet it fails, simply because tts_virtual_copy_heap_tuple calls heap_form_tuple to form a tuple with all values and also the tuple header. But this succeeds, because the values are 64B shorter, leaving enough space for the tuple header etc. insert into t select repeat(md5('a'), 512*1024*1024/32 - 2), repeat(md5('b'), 512*1024*1024/32); And you can even select the data: select * from t; You can even do this: update t set a = repeat(md5('a'), 512*1024*1024/32); which works, so now you have a row with two 512MB values. But then you'll face this: select * from t; ERROR: out of memory DETAIL: Cannot enlarge string buffer containing 536870922 bytes by 536870912 more bytes. because printtup() builds a huge string with all the data (and if the columns had compression, this would be decompressed, because it goes to the client). So yeah, there's an explicit 1GB limit per value, but having rows close to the 1GB limit is going to cause all sorts of unpredictable and rather painful issues :-( regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: GIN index
Hi, Why exactly are you starting a new thread again, instead of continuing in the thread you started a couple days ago? A couple recommendations: 1) I find it unlikely you'll get a lot of help unless you provide a way to reproduce the issue easily. That is, something people can build and test on their own. Posting incomplete chunks of code means we'd have to spend time filling in the gaps, etc. Sorry, not going to happen. Create an extension that we can build/install easily, include a SQL script reproducing the issue. 2) The code is incomplete, e.g. hex2char/char2hex are missing. And the calls seem to be the other way around. I'd expect geomgrid_in to transform a hex string to binary, but it calls Char2Hex. 3) AFAIK there's nothing like CREATE FUNCTION ... NOT FENCED ..., so it seems you're using some sort of fork. In that case you should probably ask them. 4) Don't post backtraces as screenshots. Copy them as text, attach them as a text file. 5) Continue in the thread you already started, otherwise it's impossible to follow the discussion. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Plans with bad estimates/paths
On 11/16/21 9:22 PM, Joe Wildish wrote: > Hello, > > I am struggling with a problem that appears planning related. I'm > hoping folk here may be able to advise on how best to tackle the > issue. > > We have a system that ingests JSON messages containing order data. > The data from these messages is inserted into a normalised table > structure; "order", "order_item", "order_discount", etc. Orders can > have multiple items, items can have multiple discounts and so forth. > Referential constraints exist between the tables in the expected way. > Fields in the tables are all declared NOT NULL. If the schema of the > JSON is such that a field is optional, we reflect that optionality in > another "order_" table, and make "order_" be a subset of > "order". Moreover, we ingest messages for different > clients/customers. Therefore each message-related table carries with > it a client identifier which forms part of the primary key on the > table. For example, "order" has a key of "(client_id, order-id)". > > We have written transformations that calculate various facts about > orders. For example, one of the transforms emits order item data > where we have calculated the overall discount for the item, and have > "filled in" some of the optional fields with defaults, and have > categorised the order item on the basis of some aspect of the order > (e.g. "this is an e-commerce order, this is retail order"). These > transforms are typically per client (e.g. `WHERE client_id = 123`) > although in some cases we transform over multiple clients (e.g. > `WHERE client_id = ANY (SELECT client_id FROM clients WHERE ...)`). > > The issue is that for some clients, or combination of clients, the > planner is choosing a path that takes substantially longer to > evaluate than the plan it typically chooses for other clients. The > number of tables being joined is in the region of 15. There is an > extended statistic object in place to help the one aggregation that > occurs (defined on the basis of the `GROUP BY` columns) to try and > get a better estimate of the likely number of rows emitted. However, > what I am often seeing in the explain plan is that the estimated rows > is small and the actuals are significantly larger e.g. > > Merge Join (cost=1.14..253250.32 rows=1099 width=69) (actual > time=1268.587..2400.353 rows=4282355 loops=1) > It sure smells like a case of correlated data for some clients but not others, but who knows ... Hard to say without seeing the nodes below the join. If the lower nodes are estimated accurately, then it's the join selectivity that is estimated poorly, and there's not much we can do about it :-( Do the "good" plans have the same underestimate? Maybe there's just much less data for those clients, and the "poor" plan ends up being fast anyway? > I am assuming this underestimation is the source of the planner > choosing the "wrong" path; in production, we have had to resort to > setting the join and from collapse limits to 1 to force a naive plan > to be generated. This is giving us execution times in the 10/20 > second range vs. >45m in some cases. > That may be happening, yes. So is it the join order that ends up being wrong, or the join methods? Have you tried increasing the collapse limit instead? Although, if it works for some queries but not others, that's likely not going to help. > (a) Do you have any suggestions on a general approach to tackling the > problem? 
For example, one option might be to pre-compute some of the > subqueries that are occurring in the transforms, write the results > into their own tables, and substitute those tables in place of the > subqueries in the main transform. Is this something people typically > do in this situation? > The first thing I'd do is reduce the query size as much as possible. In this case I'd try removing as many joins as possible until the issue disappears. The simpler the query, the easier it is to investigate. And yes, replacing parts of a query with a temporary table is a common solution, because it's possible to collect statistics on it, build indexes etc. That usually solves estimation issues in multi-tenancy. Sometimes even a CTE with materialization is enough. > (b) Do I need to provide a schema and explain plans to get any > concrete advice on how to proceed? > That would be helpful, particularly after making the query as small as possible. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
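A sketch of the pre-computation approach mentioned above, based on the schema described in the question (column names are hypothetical); materializing the intermediate result lets the planner collect real statistics on it:

  CREATE TEMP TABLE item_discounts AS
  SELECT oi.client_id, oi.order_id, oi.item_id, sum(od.amount) AS total_discount
    FROM order_item oi
    LEFT JOIN order_discount od USING (client_id, order_id, item_id)
   WHERE oi.client_id = 123
   GROUP BY 1, 2, 3;

  ANALYZE item_discounts;
  -- the main transform then joins item_discounts instead of repeating the subquery;
  -- on PG 12+ a WITH ... AS MATERIALIZED (...) CTE is a lighter-weight alternative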
Re: Postgis - geography function
On 1/5/22 09:48, Sameer TWAYANA wrote: Hi Sir, I need some help. I have been using the postgis extension for many years. Now in postgis version 3.2 using geom data in geography with ST_Length and ST_Distance function returns some weird results. Before I used 3.1.4, all was good. But I got different results in version 3.2. So can you suggest why I am getting different results in postgis version 3.2. I really appreciate for your help. Thank you. You haven't shown a lot of detail about the issue, so it's hard to say what's going on, unfortunately :-( You'll need to show us some queries with expected/actual results (the simpler the better, of course). And provide info about the Postgres version, etc. BTW if it's a problem related to PostGIS, maybe using a mailing list for that specific project [1] would yield better results. [1] https://lists.osgeo.org/mailman/listinfo/postgis-users regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Create and access a dictionary type
On 1/5/22 14:17, Shaozhong SHI wrote: Any examples in Postgres to create a dictionary type to store and access key value pairs? I'd say JSONB can be used as an associative array, and e.g. in Python can map to dict data type. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Create and access a dictionary type
On 1/5/22 17:24, Ron wrote: On 1/5/22 7:22 AM, Tomas Vondra wrote: On 1/5/22 14:17, Shaozhong SHI wrote: Any examples in Postgres to create a dictionary type to store and access key value pairs? I'd say JSONB can be used as an associative array, and e.g. in Python can map to dict data type. The question confuses me. Why isn't "create a table with Primary Key" the answer to "Any examples in Postgres to create a dictionary type to store and access key value pairs?" Well, yes - you can always decompose the dict and store it in an EAV, but sometimes it's more convenient to just use "nested" dictionary at the row level. Which is what a JSONB column does. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Create and access a dictionary type
On 1/5/22 14:36, Shaozhong SHI wrote: I just checked operators. I could not find any operator to set a new value given a key. I'd say jsonb_set is what you're looking for. Postgres does not have a thing that'd match the Oracle feature exactly, you'll have to adjust your application to use what's available. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
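A few basics of treating a jsonb column as a key/value dictionary, including the jsonb_set call mentioned above (the table and column in the UPDATE are hypothetical):

  -- read a value by key
  SELECT '{"name": "Alice", "age": 30}'::jsonb ->> 'name';

  -- set (or add) a key; jsonb_set returns the modified document
  SELECT jsonb_set('{"name": "Alice"}'::jsonb, '{age}', to_jsonb(30));

  -- update a dictionary stored in a table column
  UPDATE my_table SET attrs = jsonb_set(attrs, '{age}', to_jsonb(31)) WHERE id = 1;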
Re: increasing effective_cache_size slows down join queries by a factor of 4000x
On 2/2/22 22:10, Artyom Shaposhnikov wrote: the row estimate became ~1000x smaller with the stat in place, so it looks like it grossly miscalculates the query plans without the stats for large tables representing M:M relations. Well, if the estimates are significantly off (and 3 orders of magnitude certainly qualifies), then all bets are off. There's no magical option that'd fix planning in such conditions. Ultimately, fixing the estimates (e.g. by creating extended statistics) is the right "fix" as it gives the optimizer the information needed to pick the right plan. On Wed, Feb 2, 2022 at 11:47 AM Michael Lewis wrote: What does the row estimate look like on the scan of data table with that statistic in place? Anytime the stats give a mis-estimate this far off, I wouldn't expect that plans would be optimal except by luck. Index Scan using data_pkey on data t (cost=0.57..21427806.53 rows=58785023 width=131) (actual time=0.024..0.482 rows=854 loops=1) Index Cond: (id > 205284974) It'd be interesting to see the plans without the LIMIT, as that makes the "actual" values low simply by terminating early. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: increasing effective_cache_size slows down join queries by a factor of 4000x
On 2/4/22 05:21, A Shaposhnikov wrote: Tomas, thank you! The query: select 1 from data as d, data_class as dc where dc.data_id = d.id and d.id > 205284974 and dc.data_id > 205284974 -- new condition order by d.id limit 1000; totally solved it - it is now fast under all conditions! I thought that the optimizer would be able to infer it itself. Unfortunately, the optimizer is not that smart - we can do that for equality conditions, but not for other operators. There was actually a thread [1] exploring a possibility to extend this to inequalities, but it went nowhere so far. It also explains why it's done only for equality operators. In short, it's fairly expensive, makes costing of joins more difficult, and most queries can't benefit from it (because conditions on join keys are not that common). BTW what does the final query plan look like? Is it using the merge sort or the nested loop? I wonder if this might be formulated as a costing issue, pushing the planner to use the nested loop. [1] https://www.postgresql.org/message-id/flat/CAFQUnFhqkWuPCwQ1NmHYrisHJhYx4DoJak-dV%2BFcjyY6scooYA%40mail.gmail.com regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: increasing effective_cache_size slows down join queries by a factor of 4000x
On 2/15/22 01:06, A Shaposhnikov wrote: Interestingly I have a second PG 14.2 database, with identical table definitions, but about 10% smaller row counts, and the exact same query works fast there without the 2nd condition: Are you sure about the 10%? Because in the plans from the first machine I see this: > > -> Index Scan using team_pkey on team t (cost=0.57..11382381.88 > rows=78693167 width=175) (actual time=0.016..0.695 rows=854 loops=1) > while the second machine does this: -> Index Scan using team_pkey on team t (cost=0.57..2366113.83 rows=2807531 width=160) (actual time=0.031..0.801 rows=888 loops=1) That's 2.8M vs. 78M, quite far from "10% difference". Not sure about the team_aliases table, that's impossible to say from the plans. This may matter a lot, because we use effective cache size to calculate cache hit ratio for the query, with relation sizes as an input. So smaller relations (or larger effective_cache_size) mean cheaper random I/O, hence a preference for a nested loop join. The other thing is data distribution - that may matter too. IMO it's pointless to investigate this further - we know what's causing the issue. The optimizer is oblivious that the merge join will have to skip a large part of the second input, due to the implicit condition. Notice that adding the condition changes the cost from: Limit (cost=81.33..331.82 rows=1000 width=183) ... to Limit (cost=81.33..720.48 rows=1000 width=183) ... So it seems *more* expensive than the first plan. Taken to the extreme the planner could theoretically have chosen to use the first plan (and delay the condition until after the join). regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: query causes connection termination
On 11/23/2017 01:25 AM, Neto pr wrote: > Another fact is that when executing the query without the command > EXPLAIN ANALYZE, the result is usually returned after a few minutes. > I do not understand, because when using the EXPLAIN ANALYZE command the > dbms closes the connection. > Anyone have any tips on why this occurs? > Attach gdb to the backend, run the query and when it fails get us the backtrace. So something like 1) select pg_backend_pid() 2) gdb -p $PID 3) run the EXPLAIN ANALYZE again 4) watch the gdb session, when it fails do 'bt' You need to install debuginfo first, so that the backtrace makes sense. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: a question about oddities in the data directory
Hi, I think you need to read this: https://www.postgresql.org/docs/current/static/storage-file-layout.html On 11/27/2017 04:19 PM, Martin Mueller wrote: > Apologies if I asked this question before. > > I’m a newcomer to Postgres, having migrated from Mysql. Information > about tables seems harder to get at in Postgres. That data directory > on my machine is suspiciously large—well over 100 GB. The directory > Postgres/var-9.5/base/ contains a number of subdirectories with > Arabic numerals. Each of those directories stores data for a single database. Each object (including databases) has a 32-bit identifier, mapping it to object on the filesystem. databases: SELECT oid, datname FROM pg_database; tables etc.: SELECT relfilenode, relname FROM pg_class; > Directory 16385 has a subdirectory 17463 with a size of 1.07 GB. That's not a subdirectory, but a datafile segment. > But there are also 17 subdirectories with names from 17463.1 to > 17.463.17. There are also other entries with similar forms of > duplication and suspiciously identical file sizes of 1.07GB. > Again, those are files, not subdirectories. Large datafiles are split into 1GB segments, so for example 10GB table with relfilenode 17463 will be stored in files 17463, 17463.1, 17463.2, ..., 17463.9 > Is this normal behavior? Where in the postgres documentation do I > read up on this? Postgres strikes me as superior to MySQl, especially > with regard to string functions and regular expressions, but it’s > harder to look under the hood. https://www.postgresql.org/docs/current/static/storage-file-layout.html > How, for instance, do I figure out what number corresponds to the > table that I know as ‘earlyprinttuples > SELECT relfilenode FROM pg_class WHERE relname = 'earlyprinttuples'; regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
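There is also a convenience function that does the name-to-file mapping in one step, and (on 9.4+) one for the reverse direction; for example:

  -- path of the table's main data file, relative to the data directory
  SELECT pg_relation_filepath('earlyprinttuples');

  -- which relation does a given file belong to (0 = the default tablespace)
  SELECT pg_filenode_relation(0, 17463);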
Re: large numbers of inserts out of memory strategy
Hi, On 11/28/2017 06:17 PM, Ted Toth wrote: > I'm writing a migration utility to move data from non-rdbms data > source to a postgres db. Currently I'm generating SQL INSERT > statements involving 6 related tables for each 'thing'. With 100k or > more 'things' to migrate I'm generating a lot of statements and when I > try to import using psql postgres fails with 'out of memory' when > running on a Linux VM with 4G of memory. If I break into smaller > chunks say ~50K statements then thde import succeeds. I can change my > migration utility to generate multiple files each with a limited > number of INSERTs to get around this issue but maybe there's > another/better way? > The question is what exactly runs out of memory, and how did you modify the configuration (particularly related to memory). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
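For reference, the chunking workaround described in the question could be scripted roughly like this (file names are hypothetical, and it assumes one statement per line in the dump):

  $ split -l 50000 migration.sql chunk_
  $ for f in chunk_*; do psql -X -v ON_ERROR_STOP=1 -f "$f" targetdb || break; done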
Re: large numbers of inserts out of memory strategy
On 11/28/2017 06:54 PM, Ted Toth wrote: > On Tue, Nov 28, 2017 at 11:22 AM, Tomas Vondra > wrote: >> Hi, >> >> On 11/28/2017 06:17 PM, Ted Toth wrote: >>> I'm writing a migration utility to move data from non-rdbms data >>> source to a postgres db. Currently I'm generating SQL INSERT >>> statements involving 6 related tables for each 'thing'. With 100k or >>> more 'things' to migrate I'm generating a lot of statements and when I >>> try to import using psql postgres fails with 'out of memory' when >>> running on a Linux VM with 4G of memory. If I break into smaller >>> chunks say ~50K statements then thde import succeeds. I can change my >>> migration utility to generate multiple files each with a limited >>> number of INSERTs to get around this issue but maybe there's >>> another/better way? >>> >> >> The question is what exactly runs out of memory, and how did you modify >> the configuration (particularly related to memory). >> >> regards >> >> -- >> Tomas Vondra http://www.2ndQuadrant.com >> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services > > I'm pretty new to postgres so I haven't changed any configuration > setting and the log is a bit hard for me to make sense of :( > The most interesting part of the log is this: SPI Proc: 2464408024 total in 279 blocks; 1672 free (1 chunks); 2464406352 used PL/pgSQL function context: 537911352 total in 74 blocks; 2387536 free (4 chunks); 535523816 used That is, most of the memory is allocated for SPI (2.4GB) and PL/pgSQL procedure (500MB). How do you do the load? What libraries/drivers? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: large numbers of inserts out of memory strategy
On 11/28/2017 07:26 PM, Ted Toth wrote: > On Tue, Nov 28, 2017 at 12:01 PM, Tomas Vondra > wrote: >> >> ... >> >> That is, most of the memory is allocated for SPI (2.4GB) and PL/pgSQL >> procedure (500MB). How do you do the load? What libraries/drivers? >> > > I'm doing the load with 'psql -f'. I using 9.6 el6 rpms on a Centos VM > I downloaded from the postgres repo. > So what does the script actually do? Because psql certainly is not running pl/pgsql procedures on its own. We need to understand why you're getting OOM in the first place - just inserts alone should not cause failures like that. Please show us a more detailed explanation of what the load actually does, so that we can try reproducing it. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Centos 6.9 and centos 7
On 12/04/2017 02:19 PM, Nicola Contu wrote: ...> > centos 7 : > > dbname=# \timing Timing is on. cmdv3=# SELECT id FROM > client_billing_account WHERE name = 'name'; id --- * (1 row) > Time: 3.884 ms > > centos 6.9 > > dbname=# SELECT id FROM client_billing_account WHERE name = 'name'; id > --- * (1 row) Time: 1.620 ms > We need to see EXPLAIN (ANALYZE,BUFFERS) for the queries. Are those VMs or bare metal? What CPUs and RAM are there? Have you checked that power management is disabled / cpufreq uses the same policy? That typically affects short CPU-bound queries. Other than that, I recommend performing basic system benchmarks (CPU, memory, ...) and only if those machines perform equally should you look for issues in PostgreSQL. Chances are the root cause is in hw or OS, in which case you need to address that first. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
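The power-management check suggested above can be done along these lines (profile and device naming vary by distro; cpupower comes from the kernel-tools package on CentOS/RHEL):

  $ tuned-adm active                                           # current tuning profile
  $ cpupower frequency-info --policy                           # cpufreq policy in use
  $ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor  # governor for CPU 0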
Re: Centos 6.9 and centos 7
On 12/04/2017 04:57 PM, Nicola Contu wrote: > No I did not run a vacuum analyze. Do you want me to try with that first? > > @Tomas: > Talking abut power management, I changed the profile for tuned-adm > to latency-performance instead of balanced (that is the default) > > that is increasing performances for now and they are similar to centos 6.9. > > Time: 2.121 ms > Time: 2.026 ms > Time: 1.664 ms > Time: 1.749 ms > Time: 1.656 ms > Time: 1.675 ms > > Do you think this can be easily done in production as well? > How am I supposed to know? Not only does that depend on your internal deployment policies, it's also much more a CentOS/RedHat question than a PostgreSQL one. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Size of pg_multixact/members increases 11355
On 12/13/2017 01:42 PM, Yogesh Sharma wrote: > Dear Thomas , > > Thanks for sharing information. > > Is it possible to remove mentioned folder files in some time intervals > by some DB command? > Currently i can not upgrade to 3.6.20. > So please share if any solution is available. > There are no explicit commands to remove them. When the database decides those files are not needed, it'll remove them automatically. It's strongly discouraged to mess with those files directly, as it may easily lead to data loss or data corruption, particularly when it's unclear why the directory got so large. You could try setting autovacuum_multixact_freeze_max_age=0 and vacuuming the database, but it's hard to say if that will help. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
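If manual action is attempted anyway, a minimal sketch is a check of how old the oldest multixact is (mxid_age() requires 9.5 or newer) plus an aggressive freeze, with the caveat above that it may not help:

  -- per database: how far datminmxid lags behind the current multixact
  SELECT datname, mxid_age(datminmxid) FROM pg_database ORDER BY 2 DESC;

  -- aggressive manual freeze of the current database; advancing the cutoff is what
  -- eventually allows old pg_multixact/members files to be truncated
  VACUUM FREEZE;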
Re: postgresql 9.5 has ocuuered OOM
On 12/20/2017 04:08 PM, mark wrote: > I have set shared_buffers is 1/4 of memory. > work_mem is 2% of memory. > max_connections is 50. That means if you have all 50 connections active, they may easily consume 100% of memory, because 50 * 2 is 100. It's even easier if the connections are executing complex queries, because each query may use multiple work_mem buffers. So 2% seems a bit too high. > momery size is 16GB. > postgresql process used over 70% of memory and occuered OOM. So, did a single process use 70% of memory, or all postgres processes combined? If just a single process, it might be a poor plan choice (e.g. hash aggregate may easily cause that). If all processes combined, then perhaps it's due to work_mem being too high. > what should I do to deal with this problem? > Hard to say, until you provide enough information. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
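To make the arithmetic above concrete (values taken from the report: 16GB RAM, work_mem at 2%, max_connections = 50):

  SHOW work_mem;         -- ~320MB (2% of 16GB) in the configuration described
  SHOW max_connections;  -- 50
  -- worst case: 50 connections * 320MB = 16GB, i.e. all of RAM just for work_mem,
  -- and a single complex query may use several work_mem-sized allocations at once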
Re: Hardware advice
Hi, On 01/22/2018 09:46 PM, Alban Hertroys wrote: > Hi all, > > At work we are in the process of setting up a data-warehouse using PG > 10. I'm looking for a suitable server, but I hardly know anything > about server-grade hardware. > > Hence, we're looking for some advice, preferably with some > opportunity to discuss our situation and possibly things we did not > take into account etc. A face to talk to would be appreciated. Who > provides that in or near the eastern border of the Netherlands? > Coincidentally, there's a big conference (FOSDEM) in that area next week, with a smaller PostgreSQL conference (FOSDEM PgDay) the day before. Might be a good opportunity to talk to PostgreSQL people. If you're looking for an actual consulting, there are multiple companies that might help you with this (including ours). > More details: > > We're planning to deploy on bare-metal hardware, with a fallback > server with similar or lesser specs for emergencies and upgrades and > perhaps some (read-only) load balancing of different kinds of loads. > > The server will be accessed for reporting and ETL (or ELT) mostly. > Both reporting servers (test/devel and production) are configured for > at most 40 agents, so that's max 40 connections each to the warehouse > for now. So far, we haven't reached that number in real loads, but > reports are requested ~40,000 times a month (we measure HTTP requests > minus static content). > > We will also be doing ETL of (very) remote (SAP) tables to the > warehouse server; in what we got so far in our limited test > environment we have tables of over 30GB, most of which is from the > last 4 to 5 years. > That's nice, but it does not really tell us how much work that means for the database :-( Those queries might be touching tiny subset of the data, or it might touch the whole data set. That will have significant impact on the hardware requirements. > The biggy though is that we also plan to store factory process > measurements on this server (temperatures, pressures, etc. at 5s > intervals). So, time series data. I wonder if timescale [1] would be appropriate here (never used it, but seems to be designed for exactly this use case). And built on PostgreSQL. [1] http://www.timescale.com/ > Part of one factory has already been writing that data to > a different server, but that's already 4.3 billion records (140GB) > for about a year of measuring and that's not even half of the > factory. We will be required to retain 10-15 years of data from > several factories (on the short term, at least 2). The expectancy is > that this will grow to ~15TB for our factory alone. > Storing this amounts of data is not that difficult - the DL360 machines can handle 40TB+ for example. The question is how intensive the data access and processing will be, so that you can pick the right storage configuration, size the amount of RAM etc. FWIW it makes no sense to size this for 10-15 years from the get go, because (a) you won't get it right anyway (things change over time), and (b) you're unlikely to keep the same hardware for 10+ years. Get smaller but more powerful hardware, plan to replace it in a couple of years with never machines. > We also want to keep our options for growth of this data warehouse > open. There are some lab databases, for example, that currently exist > as two separate brand database servers (with different major versions > of the lab software, so there are design differences as well), that > aren't exactly small either. 
> > I have been drooling over those shiny new AMD Epyc processors, which > look certainly adequate with a truckload of memory and a good RAID-10 > array and some SSD(s) for the WAL, but it's really hard to figure out > how many cores and memory we need. Sure, 2 7601's at 64 cores and 4TB > of memory (if that's even for sale) would probably do the trick, but > even I think that might be going a little overboard ;) > Well, you can either find out what your actual needs are (by looking at the current system and extrapolating it in some way) and size the hardware accordingly. Or you can squeeze as much money from the management as possible, and buy the shiniest stuff possible. > Oh yeah, apparently we're married to HP or something… At least, IT > management told me to look at their offerings. > I'd say Proliant machines are pretty solid. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
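Given the 5-second measurement streams described above, a sketch of how stock PG 10 can keep such time-series tables manageable is declarative range partitioning (all names and ranges are made up; this is roughly the kind of chunking TimescaleDB automates):

  CREATE TABLE measurement (
      sensor_id   int NOT NULL,
      recorded_at timestamptz NOT NULL,
      value       double precision
  ) PARTITION BY RANGE (recorded_at);

  CREATE TABLE measurement_2018_01 PARTITION OF measurement
      FOR VALUES FROM ('2018-01-01') TO ('2018-02-01');
  CREATE TABLE measurement_2018_02 PARTITION OF measurement
      FOR VALUES FROM ('2018-02-01') TO ('2018-03-01');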