Re: PL/pgSQL question about EXCEPTION clause & corrupt records
> On Feb 14, 2020, at 04:39, Nick Renders wrote:
>
> I get the following message:
>
> ERROR: missing chunk number 0 for toast value 8289525 in pg_toast_5572299

What version of PostgreSQL are you running? I've seen this a number of times over the past couple of years; curious whether the lurking bug is still observed in the latest versions.

-Jeremy

Sent from my TI-83
Re: PL/pgSQL question about EXCEPTION clause & corrupt records
FWIW, Bertrand blogged an even faster way to do this about a month ago - using pageinspect and processing blocks instead of rows

https://bdrouvot.wordpress.com/2020/01/18/retrieve-postgresql-variable-length-storage-information-thanks-to-pageinspect/

-J

Sent from my TI-83

> On Feb 17, 2020, at 03:32, Nick Renders wrote:
>
> The problem is that I don't know which column is corrupt. But I found a
> solution: by simply copying the record into another variable, the values are
> parsed and the TOAST errors are thrown.
>
> In case anyone's interested, here's my code, based on an example from
> http://www.databasesoup.com/2013/10/de-corrupting-toast-tables.html
>
> DO $f$
> DECLARE
>     rContent1 record;
>     rContent2 record;
>     iCounter integer DEFAULT 1;
>     iValue integer;
>     pTableName varchar := 'f_gsxws_transaction';
>     pFieldName varchar := 'gwta_number';
> BEGIN
>     FOR iValue IN EXECUTE 'SELECT ' || pFieldName || ' FROM ' ||
>         pTableName::regclass || ' ORDER BY ' || pFieldName LOOP
>         BEGIN
>             EXECUTE 'SELECT * FROM ' || pTableName::regclass || ' WHERE ' ||
>                 pFieldName || ' = $1'
>             INTO rContent1
>             USING iValue;
>             rContent2 := rContent1;
>         EXCEPTION WHEN OTHERS THEN
>             RAISE NOTICE 'data for %.% % is corrupt', pTableName, pFieldName, iValue;
>         END;
>         IF iCounter % 10 = 0 THEN
>             RAISE NOTICE '% % records checked', iCounter, pTableName;
>         END IF;
>         iCounter := iCounter+1;
>     END LOOP;
> END;
> $f$;
>
> Cheers,
>
> Nick
>
>> On 14 Feb 2020, at 16:14, Tom Lane wrote:
>>
>> "Nick Renders" writes:
>>> I thought I would write a little PL script that would loop through all
>>> the data and report any inconsistencies. However, I can't get it to work
>>> properly.
>>> ...
>>> 1) The function has no problem executing the SELECT statement. It is
>>> only when "rcontents" is returned, that the function fails. This is a
>>> problem, because the ultimate goal is to loop through all records and
>>> only return/alert something in case of an error.
>>> 2) The function never enters the EXCEPTION clause. Instead, when it hits
>>> the RETURN command, it breaks and shows the same message as in pgAdmin:
>>> missing chunk number 0 for toast value 8289525 in pg_toast_5572299.
>>
>> I think what's happening there is that the function doesn't try to
>> dereference the value's TOAST pointer during SELECT INTO. It just stores
>> that pointer into a variable, and only sometime later when the actual
>> content of the value is demanded, do you see the error raised.
>>
>> The solution to that is to do something that uses the contents of the
>> busted column right away while still inside the EXCEPTION block, perhaps
>> along the lines of "select md5(mycolumn) into local_variable from..."
>>
>> A close reading of
>>
>> https://www.postgresql.org/docs/current/storage-toast.html
>>
>> would probably help you understand what's happening here.
>>
>> regards, tom lane
Re: PL/pgSQL question about EXCEPTION clause & corrupt records
On 2/17/20 08:23, Jeremy Schneider wrote: > FWIW, Bertrand blogged an even faster way to do this about a month ago - > using pageinspect and processing blocks instead of rows > > https://bdrouvot.wordpress.com/2020/01/18/retrieve-postgresql-variable-length-storage-information-thanks-to-pageinspect/ > > >> On Feb 17, 2020, at 03:32, Nick Renders wrote: >> >> The problem is that I don't know which column is corrupt. But I found >> a solution: by simply copying the record into another variable, the >> values are parsed and the TOAST errors are thrown. >> >> In case anyone's interested, here's my code, based on an example from >> http://www.databasesoup.com/2013/10/de-corrupting-toast-tables.html Apologies - sent that slightly rushed email while I was taking the bus to the office this morning and linked the wrong blog post :D Here's the right one, showing a much faster way to identify which tuple links to a bad toast row: https://bdrouvot.wordpress.com/2020/01/04/get-toast-chunk_id-from-the-user-table-tuples-or-from-the-toast-index-thanks-to-pageinspect/ -Jeremy -- http://about.me/jeremy_schneider
Re: Real application clustering in postgres.
On 3/6/20 01:25, Ron wrote:
> On 3/5/20 6:07 AM, Laurenz Albe wrote:
>> On Thu, 2020-03-05 at 07:45 +, Daulat Ram wrote:
>>> Is there any possibility/options to setup a real application clustering in
>>> Postgres as in Oracle we have a RAC feature.
>> No, and as far as I know nobody feels interested in providing it.
>>
>> RAC is a complicated architecture that doesn't do much good, so most
>> people feel that it would be a waste of time and effort.
>>
>> RAC is not really a scaling solution: because of the shared storage,
>> you can only scale for more CPUs; I/O remains the bottleneck.
>> RAC is not really a high availability solution: because of the shared
>> storage, it has a single point of failure.
>
> This is utter nonsense. Dual redundant storage controllers connected to
> disks in RAID-10 configurations have been around for *at least* 25 years.
>
> Oracle got its clustering technology from DEC, and I *know* that
> works. Cluster members, storage controllers and disks have all gone
> down, while the database and application keep on humming along.

Just want to point out that while the merits of RAC are treated as universal truth in the Oracle marketing community, they are certainly still debated and nuanced within the Oracle technical community.

Some great reading would be Mogens Norgaard's article "You Probably Don't Need RAC" from 2003 (which is surprisingly hard to find on the internet) and much of the ensuing discussion between then and now, for example this recent message over on the oracle-l list:

https://www.freelists.org/post/oracle-l/Chuckleworthy-issue-of-the-NoCOUG-Journal,1

For my part, I spent many years running RAC clusters and solving the usual handful of problems you bump into, and I definitely have a strong bias now toward architectures that are as simple as possible - regardless of which parties participate in your operations, and regardless of who owns the data center where your stack is running. Note that I apply this to all the new databases as well. I enjoy and appreciate the opportunities I've had to work on some really interesting new technology - but I also still see merit in simple, boring, mature, well-understood architectures if they are doing the job. Complexity will find us all soon enough, driven by true business needs, without any help from the pursuit of shiny things!

It seemed to me there was a major and well-funded effort to market and sell cluster databases for many years, and as a result I suspect that while there are certainly some good use cases, there are probably also some people using RAC today who would do fine (or better) without it. Seems I even touched on this all the way back in 2007 in an article on my own blog...

https://ardentperf.com/2007/05/10/fidelity-case-study/

-Jeremy

--
http://about.me/jeremy_schneider
Re: Efficiently advancing a sequence without risking it going backwards.
> On Jul 6, 2020, at 19:06, Paul McGarry wrote: > > I don't think I can use setval(), because it risks making sequences go > backwards, eg: > > 1) Check values > DB1sequence: 1234 > DB2sequence: 1233 (1 behind) > 2) setval('DB2sequence',1234); > > but if between (1) and (2) there are 2 nextval(DB2sequence) calls on another > process, (2) would take the sequence back from 1235 to 1234 and I would end > up trying to create a duplicate key ID from the sequence. An ability to “lock” the sequence momentarily would give you the tool you need, but I don’t think it’s there. Total hack, but if your application or users can retry when the rare error is encountered then one idea is to rename the sequence momentarily while you do the setval() then rename it back. Do an initial check without renaming, then re-check after renaming and before the setval() call. If you put retry logic into your application then make sure to include back-off logic so you don’t get an outage induced by thundering herd. -Jeremy Sent from my TI-83
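For anyone curious what that hack might look like in practice, here is a rough sketch. The sequence name and target value are made up, and the application-side retry/back-off logic is not shown; concurrent callers that resolve the sequence name at call time will error while it is renamed and must retry.

-- 1) initial check without renaming; proceed only if the sequence is behind
SELECT last_value FROM db2sequence;

-- 2) momentarily "lock" the sequence by renaming it
ALTER SEQUENCE db2sequence RENAME TO db2sequence_maint;

-- 3) re-check after renaming, then advance (1234 is the value from the other DB)
SELECT last_value FROM db2sequence_maint;
SELECT setval('db2sequence_maint', 1234);

-- 4) rename it back
ALTER SEQUENCE db2sequence_maint RENAME TO db2sequence;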
Re: Efficiently advancing a sequence without risking it going backwards.
> On Jul 9, 2020, at 14:08, Christopher Browne wrote:
>
>> On Thu, 9 Jul 2020 at 12:59, Jeremy Schneider wrote:
>>
>>> On Jul 6, 2020, at 19:06, Paul McGarry wrote:
>>>
>>> I don't think I can use setval(), because it risks making sequences go
>>> backwards, eg:
>>>
>>> 1) Check values
>>> DB1sequence: 1234
>>> DB2sequence: 1233 (1 behind)
>>> 2) setval('DB2sequence',1234);
>>>
>>> but if between (1) and (2) there are 2 nextval(DB2sequence) calls on
>>> another process, (2) would take the sequence back from 1235 to 1234 and I
>>> would end up trying to create a duplicate key ID from the sequence.
>>
>> An ability to "lock" the sequence momentarily would give you the tool you
>> need, but I don't think it's there.
>
> The use-case where you need a lock on the value so that there can't possibly
> be a hole in the sequence

OP asked for a way to call setval() with a guarantee the sequence will never go backwards, IIUC. His code can check that the new value he wants to set is higher than the current value, but there's a race condition where a second connection could quickly advance the sequence between the check and the setval() call and then cause duplicates from the next call, which is bad.

The ideal solution is a setval_forward_only() or setval_no_duplicates() function that does it atomically or something. If it were possible to "lock" the entire sequence to prevent any other sessions from using it at all, that would work too. Not locking a value, locking the whole thing.

A very bad hack solution is renaming the sequence and then renaming it back as a blunt form of locking... and to be clear, I don't think it's a good idea - I was just saying that technically it might work. :)

-Jeremy

Sent from my TI-83
Re: Could not open file pg_xact/0E97
> On Jul 18, 2020, at 14:18, Radoslav Nedyalkov wrote: > > > Well. the vacuum full failed with > > vacuumdb: vacuuming of table "olap.transactions_and_fees_2020_01" in database > "db" failed: ERROR: found xmin 3916900817 from before relfrozenxid 80319533 Do you have checksums enabled for this database? -Jeremy Sent from my TI-83
Re: Orphaned relations after crash/sigkill during CREATE TABLE
On 8/20/20 14:46, Tom Lane wrote: > There might be value in a user-invokable tool that runs in an existing > non-crashed database and looks for orphan files, but I'm not aware that > anyone has written one. (Race conditions against concurrent table > creation would be a problem; but probably that can be finessed somehow, > maybe by noting the file's creation time.) Bertrand Drouvot just put out a tool to list potentially orphaned files a couple months ago. https://github.com/bdrouvot/pg_orphaned Implemented as an extension, doesn't remove the files, but useful nonetheless. -Jeremy -- http://about.me/jeremy_schneider
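For reference, the general idea can be sketched in plain SQL as well. This is only a rough approximation of what such a tool checks - it needs superuser (or an appropriate grant on pg_ls_dir), and it is subject to the race condition Tom mentions, so an in-flight CREATE TABLE can show up as a false positive.

-- rough sketch only; pg_orphaned and similar tools are more careful than this
SELECT f.filename
FROM pg_ls_dir('base/' || (SELECT oid FROM pg_database
                           WHERE datname = current_database())) AS f(filename)
WHERE f.filename ~ '^[0-9]+$'           -- skip _fsm/_vm and extra segment files
  AND f.filename::oid NOT IN (
        SELECT pg_relation_filenode(c.oid::regclass)
        FROM pg_class c
        WHERE pg_relation_filenode(c.oid::regclass) IS NOT NULL
      );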
Re: Returning timestamp with timezone at specified timezone irrespective of client timezone
On 9/27/20 16:13, Ron wrote:
> On 9/27/20 4:16 PM, aNullValue (Drew Stemen) wrote:
>> What I need is for the ability to return a timestamp with timezone,
>> using the UTC offset that corresponds to a column-defined timezone,
>> irrespective of the client/session configured timezone.
>>
>> I have three columns in a table:
>> Timezone: 'US/Eastern'
>> Date: 2020-10-31
>> Time: 08:00
>>
>> The output I'm able to find includes these possibilities:
>> '2020-10-31 08:00:00'
>> '2020-10-31 12:00:00+00'
>>
>> Whereas what I actually need is:
>> '2020-10-31 08:00:00-05'
>>
>> Using the postgresql session-level timezone configuration won't work
>> because I need multiple timezones to be handled in a single set.
>
> Are you really asking what the TZ offset was on a specific date (Like
> DST or not)?

IIUC, there is a gap here in PostgreSQL. i think it could most easily/quickly be addressed with an overloaded version of to_char that accepts a "display timezone" for its timestamp-to-character conversion.

FWIW - in Oracle this is handled by having two different data types:
1) TIMESTAMP WITH TIME ZONE
2) TIMESTAMP WITH LOCAL TIME ZONE

ironically, oracle's "local" data type is the equivalent of PostgreSQL's timestamp with time zone, where the timestamp is converted and processed/stored without a time zone. afaik postgresql doesn't have a native data type equivalent to the first variant in oracle, which actually considers the time zone as part of the data. (am i missing something?)

in lieu of having built-in support, a PL/pgSQL function to set the session-level timezone in between processing each record is the best approach i've thought of so far.

-Jeremy

--
http://about.me/jeremy_schneider
Re: Returning timestamp with timezone at specified timezone irrespective of client timezone
On 9/27/20 16:42, Jeremy Schneider wrote:
> On 9/27/20 16:13, Ron wrote:
>> On 9/27/20 4:16 PM, aNullValue (Drew Stemen) wrote:
>>> What I need is for the ability to return a timestamp with timezone,
>>> using the UTC offset that corresponds to a column-defined timezone,
>>> irrespective of the client/session configured timezone.
>>>
>>> I have three columns in a table:
>>> Timezone: 'US/Eastern'
>>> Date: 2020-10-31
>>> Time: 08:00
>>>
>>> The output I'm able to find includes these possibilities:
>>> '2020-10-31 08:00:00'
>>> '2020-10-31 12:00:00+00'
>>>
>>> Whereas what I actually need is:
>>> '2020-10-31 08:00:00-05'
>>>
>>> Using the postgresql session-level timezone configuration won't work
>>> because I need multiple timezones to be handled in a single set.
>>
>> Are you really asking what the TZ offset was on a specific date (Like
>> DST or not)?
>
> in lieu of having built-in support, a PL/pgSQL function to set the
> session-level timezone in between processing each record is the best
> approach i've thought of so far

FYI, here's the hack approach I was thinking of. I intentionally didn't preserve the session's timezone in the transaction, but that could easily be done with a few more lines of PL/pgSQL.

create or replace function to_char(
   v_tstz timestamp with time zone
  ,v_format text
  ,v_tz text
) returns text language plpgsql
  immutable parallel safe
as $$
begin
  perform set_config('timezone',v_tz,true);
  return to_char(v_tstz,v_format);
end;
$$
;

SELECT id
  ,to_char(l.loc_date+l.loc_time::time
           ,'YYYY-MM-DD HH24:MI:SSOF'
           ,timezone) tsw
FROM loc l
ORDER BY timezone, loc_date, loc_time
;

 id |          tsw
----+------------------------
  3 | 2020-10-31 03:00:00-05
  4 | 2020-11-03 08:00:00-06
  1 | 2020-10-31 09:00:00-04
  2 | 2020-11-03 08:00:00-05

https://gist.github.com/aNullValue/ba838d6b40495695df0daa11c2748248

--
http://about.me/jeremy_schneider
Re: Returning timestamp with timezone at specified timezone irrespective of client timezone
On 9/27/20 20:13, Jeremy Schneider wrote:
> create or replace function to_char(
>    v_tstz timestamp with time zone
>   ,v_format text
>   ,v_tz text
> ) returns text language plpgsql
>   immutable parallel safe
> as $$
> begin
>   perform set_config('timezone',v_tz,true);
>   return to_char(v_tstz,v_format);
> end;
> $$
> ;

Just occurred to me, I don't know if this is actually parallel safe. I'm not sure how transaction-level session configuration is handled inside parallel workers. Might be best to leave off the "parallel safe" flag from the function for now.

-J

--
http://about.me/jeremy_schneider
Re: Monitoring multixact members growth
On 8/19/22 12:52 AM, Vido Vlahinic wrote: > My goal here is to predict where multixact members are growing the > fastest so I can perform manual VACUUM FREEZE only on those tables > > (typically with multi-billion row count) when system is relatively > idle as opposed to just sit and wait for wraparound protection to take > over > > when autovacuum_multixact_freeze_max_age threshold is reached (slowing > the whole system down). > I think that you're probably approaching this wrong. Vacuum is something that you generally want to run more aggressively, not less. But to be fair, it's a very common misunderstanding that waiting to do vacuum processing until later can be a good idea... even though in fact it works in the opposite way - on systems with significant load (where it matters) - sometimes a long-running report or query that needs old row versions for its own processing might cause a lot of table and index bloat and negatively impact real-time transactional performance. (For really long-running stuff, it's sometimes better to use a snapshot of the DB or maybe a standby system that's disconnected from the primary for reporting and periodically replays logs to catch up. But obviously you start simple and don't add this complexity to the architecture until it's truly needed.) Funny thing is that I've had to do exactly what you're asking about, as part of troubleshooting problems - but the goal wasn't to run vacuum later but to run a vacuum freeze IMMEDIATELY. 🙂 As one example, pile-ups on LWLock multixact_offset. Here's one pageinspect query that did the trick for me. In the first line (WITH...) you change public.my_test to the table you want to inspect. This only looks at a single table and it was for troubleshooting the aforementioned wait event, so it's actually breaking down mxid's by SLRU page numbers. If you're seeing a large number of SLRU pages (lots of rows coming back) then that means you might want to proactively run a manual vacuum freeze. (And then see if you can update the app code to reduce mxid usage!) I'm not answering your question, but thought it was a nice excuse to share a related query and pontificate a bit... hopefully useful to someone! 
-Jeremy

=====

pg-14.4 rw root@db1=# create extension pageinspect;
CREATE EXTENSION
Time: 7.561 ms

pg-14.4 rw root@db1=# with tab_name as (select 'public.my_test' t)
select min(now()) current_timestamp,
       count(*) number_mxid_on_page,
       min(xmax) min_mxid,
       max(xmax) max_mxid,
       trunc((xmax)/(8192/4)) page_no
from (
  select ((attrs).t_infomask::bit(16) & x'1000'::bit(16))::int::boolean is_multixact,
         (attrs).t_infomask::bit(16) infomask,
         (attrs).t_xmax::text::integer xmax
  from (
    select page, heap_page_item_attrs(get_raw_page((select t from tab_name),page),
                                      (select t from tab_name)::regclass) attrs
    from generate_series(0,(select relpages from pg_class
                            where oid=(select t from tab_name)::regclass)-1) page
  ) subq
  where (attrs).t_infomask is not null
) subq3
where is_multixact
group by trunc((xmax)/(8192/4));
 current_timestamp | number_mxid_on_page | min_mxid | max_mxid | page_no
-------------------+---------------------+----------+----------+---------
(0 rows)

Time: 2223.640 ms (00:02.224)

pg-14.4 rw root@db1=# begin;
BEGIN
Time: 0.466 ms
pg-14.4 rw root@db1=# select * from my_test where i<5 for update;
 i | data
---+------
 1 |
 2 |
 3 |
 4 |
(4 rows)

Time: 50.074 ms
pg-14.4 rw root@db1=# savepoint a;
SAVEPOINT
Time: 0.605 ms
pg-14.4 rw root@db1=# update my_test set i=i-10 where i<5;
UPDATE 4
Time: 49.481 ms
pg-14.4 rw root@db1=# with tab_name as (select 'public.my_test' t)
select min(now()) current_timestamp,
       count(*) number_mxid_on_page,
       min(xmax) min_mxid,
       max(xmax) max_mxid,
       trunc((xmax)/(8192/4)) page_no
from (
  select ((attrs).t_infomask::bit(16) & x'1000'::bit(16))::int::boolean is_multixact,
         (attrs).t_infomask::bit(16) infomask,
         (attrs).t_xmax::text::integer xmax
  from (
    select page, heap_page_item_attrs(get_raw_page((select t from tab_name),page),
                                      (select t from tab_name)::regclass) attrs
    from generate_series(0,(select relpages from pg_class
                            where oid=(select t from tab_name)::regclass)-1) page
  ) subq
  where (attrs).t_infomask is not null
) subq3
where is_multixact
group by trunc((xmax)/(8192/4));
       current_timestamp       | number_mxid_on_page | min_mxid | max_mxid | page_no
-------------------------------+---------------------+----------+----------+---------
 2022-08-19 23:05:43.349723+00 |                   4 |        1 |        1 |       0
(1 row)

Time: 2117.555 ms (00:02.118)
pg-14.4 rw root@db1=#

--
http://about.me/jeremy_schneider
Re: postgis after pg_upgrade
On 11/30/18 05:11, Slavcho Trnkovski wrote: > I have postgresql 9.4 with postgis extension installed (latest version, > 2.4.5). I upgraded postgresql from 9.4 to 9.6. > After upgrading to 9.6 I get the following result > from PostGIS_full_version() > select PostGIS_full_version(); > > ... (procs need upgrade for use with "96") ... > > Is there any way to resolve this besides recreating the extension? If I'm reading the postgis docs correctly, this message specifically means that you must do a full dump/reload of postgis-enabled databases. http://postgis.net/docs/manual-2.4/postgis_installation.html#upgrading > If you can't find the postgis_upgrade*.sql specific for upgrading > your version you are using a version too early for a soft upgrade > and need to do a HARD UPGRADE. > > The PostGIS_Full_Version function should inform you about the need > to run this kind of upgrade using a "procs need upgrade" message. > > By HARD UPGRADE we mean full dump/reload of postgis-enabled > databases. -- http://about.me/jeremy_schneider
Re: Transaction Id Space, Freezing and Wraparound
On 11/20/18 13:00, Tom Lane wrote:
> If the stored XIDs were 64 bits wide, we'd not have to bother with all
> of this mess ... but adding another 64 bits to tuple headers would be
> a painful space cost, not to mention the storage compatibility issues.

People keep saying that. But didn't someone come up with a way to do this by storing the top 32 bits just once somewhere in the block, rather than adding 64 bits to every tuple? I can't remember if there was an email thread. Maybe there are some other drawbacks to that approach, but let's at least point out that adding 64 bits to the tuple header isn't the only way to accomplish this.

And the other thread about memory management... if I'm going to start religious wars, might as well just start them all at once, right? ;)

-J

--
http://about.me/jeremy_schneider
Re: Amazon Linux Support?
> On Apr 30, 2019, at 07:16, Adrian Klaver wrote:
>
>> On 4/30/19 2:02 AM, Lewis Shobbrook wrote:
>> Hi Guys,
>> With the repo changes associated with the April 17 changes,
>> https://pgstef.github.io/2019/04/17/one_rpm_to_rule_them_all.html
>> It is evident that support for amazon linux has been dropped.
>> While you can try to use redhat pgdg packages, they are not
>> installable on Amazon Linux, giving the following error...
>> /etc/redhat-release is needed by pgdg-redhat-repo-42.0-4.noarch
>> I'm confident that the rest of the rpms available in the repo can be used by
>> Amazon linux, but the redhat-release dependency is painful for
>> automation with existing recipes such as postgresql in chef.
>> Before I invest the effort to produce a pull request to address this,
>> I wanted to confirm that amazon linux has been dropped and will no
>> longer, nor likely ever be supported for the rpms repo's, or is it
>> still on the road map?
>
> You will need a community account to see the issue:
>
> https://redmine.postgresql.org/issues/4205
>
> The relevant part:
>
> Updated by Devrim Gündüz 4 days ago
>
> "
> Hi,
>
> We dropped Amazon Linux support years ago, there were lots of compatibility
> issues. The new repo RPMs just reflect that.
>
> Please switch to a supported distro, or rebuild RPMs from SRPMS.
>
> Regards, Devrim

I've been using the rhel6 repo with AL without issue for a long time now. It looks like they added a new dependency that breaks this, but I recently tested this workaround and it seems to work:

`rpm -Uvh --nodeps https://download.postgresql.org/pub/repos/yum/testing/11/redhat/rhel-6-x86_64/pgdg-redhat11-11-2.noarch.rpm`

The other thing I do is a `s/$releasever/6/g` on `/etc/yum.repos.d/pgdg-*.repo` -- I have this all baked into some ansible scripts, and my scripts worked fine just now when I spun up a new ec2/postgres test server.

-Jeremy

Sent from my TI-83
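Spelled out as commands, the workaround above amounts to roughly the following (an approximation of what I described; verify the repo RPM version/URL before using, and note that --nodeps is deliberately skipping a real dependency):

# install the repo RPM while ignoring the /etc/redhat-release dependency
sudo rpm -Uvh --nodeps \
  https://download.postgresql.org/pub/repos/yum/testing/11/redhat/rhel-6-x86_64/pgdg-redhat11-11-2.noarch.rpm

# pin the release version in the repo definitions so yum resolves rhel6 paths
sudo sed -i 's/\$releasever/6/g' /etc/yum.repos.d/pgdg-*.repo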
Re: PostgreSQL on Amazon RDS
On 5/6/19 23:27, Rashmi V Bharadwaj wrote: > Is there a SQL query or a database parameter setting that I can use from > an external application to determine if the PostgreSQL database is on > cloud (like on Amazon RDS or IBM cloud) or on a non-cloud on-prem > environment? Here's my psqlrc file which has pretty solid logic for detecting between Community PostgreSQL, RDS PostgreSQL and Aurora PostgreSQL. Note that it just assumes "community/oss" as a default if it doesn't detect the other two. Should be easy enough to add detection of other hosted environments into the query with the "priority" column (as long as there's a reliable way to detect). https://gist.github.com/ardentperf/52bd418e44b1be26d7b63af21331cece This psqlrc also detects read-write versus read-only (e.g. hot standby), and the usual username/database/superuser info and builds everything into the prompt in a way that suits me. -Jeremy -- http://about.me/jeremy_schneider
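The core of the detection logic boils down to something like the sketch below. This is a simplification of the linked psqlrc, and it assumes the aurora_version() function and the rds.* GUCs are visible on those platforms, which may vary by version.

-- simplified sketch of the detection idea; the linked psqlrc is more complete
SELECT CASE
         WHEN EXISTS (SELECT 1 FROM pg_proc
                      WHERE proname = 'aurora_version') THEN 'aurora postgresql'
         WHEN EXISTS (SELECT 1 FROM pg_settings
                      WHERE name LIKE 'rds.%')          THEN 'rds postgresql'
         ELSE 'community/oss'
       END AS platform,
       pg_is_in_recovery() AS read_only;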
Re: Hot Standby Conflict on pg_attribute
Just a quick footnote: If autovac truncations are frequently causing replica lag, and if this is a problem for you, IIUC one way you can stop autovac from doing the truncations even on older versions is setting old_snapshot_threshold to any value at all besides zero. (On 12+ you can directly control the truncation behavior.) -Jeremy Sent from my TI-83 > On May 10, 2019, at 12:46, Erik Jones wrote: > >> On Fri, May 10, 2019 at 12:41 PM Tom Lane wrote: > >> Andres Freund writes: >> > On 2019-05-09 13:03:50 -0700, Erik Jones wrote: >> >> The question then is: Why would these user queries be waiting on an >> >> AccessShare lock on pg_attribute? >> >> > Queries that access a table for the *first* time after DDL happened >> > (including truncating the relation), need an AccessShareLock on >> > pg_attribute (and pg_class, pg_index, ...) for a short time. >> >> Also, it seems likely that what's really triggering the issue is >> autovacuum on pg_attribute trying to truncate off empty pages >> in pg_attribute (after a bunch of dead rows were generated there >> by DDL activity). That requires exclusive lock on pg_attribute, >> which would propagate down to the standby. >> >> regards, tom lane > > Right, that part I understood after checking out pg_attribute's insert/delete > counts in pg_stat_sys_tables before and after some REFRESH MATERIALIZED VIEW > runs on an otherwise idle server. With them running 2k+ refreshes per day > autovac is regularly working on their catalog tables. > > Thanks! > -- > Erik Jones > mag...@gmail.com
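For the 12+ case, the per-table control looks like this (a sketch - the table name is made up; pick the tables that are both frequently vacuumed and frequently queried on the standby):

-- PostgreSQL 12+: disable the truncation step for one table and its TOAST table
ALTER TABLE some_hot_table SET (vacuum_truncate = off, toast.vacuum_truncate = off);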
Re: logical replication on 9.6: replica configuration
Because you’re on 9.6, you can’t use the native create subscription/publication sql commands. I’d check out pglogical from 2nd quadrant; this should be supported on RDS 9.6 iirc. https://www.2ndquadrant.com/en/resources/pglogical/pglogical-docs/ That said, I’m a bit curious what security concern would disallow a physical hot standby and allow a logical replication setup - especially in a managed environment... are you able to share any more specifics about the reasoning here? Maybe the requirement is for the replica to be in a different account or something? -Jeremy Sent from my TI-83 > On May 12, 2019, at 20:51, Paul Costello wrote: > > I need to create a read-only replica in AWS environment. I can't use AWS/RDS > built-in read replica functionaility due to security concerns, so I'm tied to > using logical replication, on postgres 9.6. How do I configure a replica db > to receive the contents of the replication slot on master? > > I was able to get logical replication going with the following 2 commands: > > SELECT pg_create_logical_replication_slot('test_slot', 'test_decoding'); > pg_recvlogical -d test --slot test_slot --start -f - & > > I'm very unclear on where to go from there to stand up a read-replica db. > > Thanks, > Paul C. >
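If it helps, the rough shape of a pglogical setup is below. This is only a sketch based on the pglogical docs - connection strings and names are placeholders, and RDS needs the pglogical parameters enabled in the parameter group first - so check the docs linked above for the details.

-- on the provider (the 9.6 primary)
CREATE EXTENSION pglogical;
SELECT pglogical.create_node(
         node_name := 'provider1',
         dsn       := 'host=source-host dbname=test user=repuser');
SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public']);

-- on the subscriber (the "replica")
CREATE EXTENSION pglogical;
SELECT pglogical.create_node(
         node_name := 'subscriber1',
         dsn       := 'host=replica-host dbname=test user=repuser');
SELECT pglogical.create_subscription(
         subscription_name := 'sub1',
         provider_dsn      := 'host=source-host dbname=test user=repuser');

Keep in mind that unlike a physical hot standby, the subscriber is an independent read-write database unless you restrict writes yourself.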
Re: Native Logical Replication Initial Import Qs
On 6/7/23 2:12 PM, Don Seiler wrote: > On the logical replication front, the concern is with the initial data > import that happens when the subscription is created (by default). I > know that you can tell the subscription to not copy data and instead use > pg_dump and a replication slot snapshot to achieve this manually. > However I'm unable to explain (to myself) why this is better than just > having the subscription do it upon creation. Given that I can create > pub/sub sets for individual tables for parallel operations, I'm curious > what advantages there are in using pg_dump to do this import. FWIW, I think the place this feature shines the most is when you can safely leverage things like storage-level snapshots. Sometimes that means you can get a copy of a multi-TB database almost instantly if the storage or filesystem does copy-on-write, for example database lab (postgres.ai) which uses ZFS. Another thing I can think of is that while it's true you can create multiple pub/sub sets, I'm not sure you can reduce the number of sets later. So if you were concerned about having too many slots doing decoding on the source, then you might want the flexibility of pg_dump (or perhaps restoring a backup) to get more parallelism while having more control over how many slots will be used later. In your case, the whole setup is hopefully temporary, so maybe these particular concerns aren't as relevant to you. This is just what comes to mind... probably there's a few more things I'm not thinking of and hopefully others will chime in. :) -Jeremy -- http://about.me/jeremy_schneider
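For reference, the manual-import variant being discussed looks roughly like this (a sketch with made-up names; the snapshot for pg_dump has to come from a replication-protocol connection that creates the slot, which isn't shown here):

-- on the publisher
CREATE PUBLICATION pub_app FOR ALL TABLES;
-- create the slot over a replication connection so it exports a snapshot,
-- then run pg_dump --snapshot=<name> and restore it on the target (not shown)

-- on the subscriber, attach to the pre-existing slot without copying data
CREATE SUBSCRIPTION sub_app
  CONNECTION 'host=source-host dbname=app user=repuser'
  PUBLICATION pub_app
  WITH (create_slot = false, slot_name = 'sub_app_slot', copy_data = false);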
typical active table count?
Question for other PostgreSQL users On your moderately busy DB, how many different tables might receive at least one change/DML during a 10 second window? 10? 50? 100? More? Ballpark guess off the top of your head. I'm in a discussion & there's questions about whether it's unusual to have more than 10 or so. The answer isn't clear to me. Probably worthwhile to call out partitioning explicitly (ie. if this factors into an answer then mention that fact) -Jeremy -- http://about.me/jeremy_schneider
Re: typical active table count?
On 6/27/23 9:32 AM, Ben Chobot wrote:
> We certainly have databases where far more than 100 tables are updated
> within a 10 second period. Is there a specific concern you have?

Thanks Ben, not a concern but I'm trying to better understand how common this might be. And I think sharing general statistics about how people use PostgreSQL is a great help to the developers who build and maintain it.

One really nice thing about PostgreSQL is that with two quick copies of pg_stat_all_tables you can easily see this sort of info.

If you have a database where more than 100 tables are updated within a 10 second period - this seems really uncommon to me - I'm very curious about the workload. For example:

1) Is the overall total number of tables for this database in the thousands, 10s of thousands or 100s of thousands?
2) How many CPUs or cores does the server have?
3) Are you using partitions and counting each one? What's the number if you count each partitioned table as a single table?
4) Would you characterize this database as SaaS, i.e. many copies of a similar schema? Or is it one very large schema of many different tables?

-Jeremy

--
http://about.me/jeremy_schneider
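For anyone who wants to take the measurement, here is roughly what I mean by "two quick copies" (just a sketch; adjust the sleep for the window you care about, and note the stats views lag real activity slightly):

-- take a snapshot, wait 10 seconds, then count tables that saw any DML
CREATE TEMP TABLE tab_snap AS
  SELECT relid, n_tup_ins + n_tup_upd + n_tup_del AS dml
  FROM pg_stat_all_tables;

SELECT pg_sleep(10);

SELECT count(*) AS tables_with_dml
FROM pg_stat_all_tables s
JOIN tab_snap t USING (relid)
WHERE s.n_tup_ins + s.n_tup_upd + s.n_tup_del > t.dml;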
Re: typical active table count?
On 6/27/23 12:08 PM, Ron wrote: > On 6/27/23 13:47, Jeremy Schneider wrote: >> On 6/27/23 9:32 AM, Ben Chobot wrote: >>> We certainly have databases where far more than 100 tables are updated >>> within a 10 second period. Is there a specific concern you have? >>> >> Thank Ben, not a concern but I'm trying to better understand how common >> this might be. And I think sharing general statistics about how people >> use PostgreSQL is a great help to the developers who build and maintain it. >> >> One really nice thing about PostgreSQL is that two quick copies of >> pg_stat_all_tables and you can easily see this sort of info. >> >> If you have a database where more than 100 tables are updated within a >> 10 second period - this seems really uncommon to me - I'm very curious >> about the workload. > > 100 tables updates just means /possibly complicated schema/, not > necessarily high volume. > > ... > > And honestly, 100 tables in 10 seconds is 10 tables/second. If each > gets one insert, that's a laughably slow transaction rate. (Unless of > course there's 85 indices per table, and foreign keys don't have > supporting indices.) I don't think the math actually works this way on highly concurrent systems. In fact, this morning I connected with a coworker who works on Amazon fulfillment center DBs and there was almost no difference in the number of tables with insert/update/delete regardless of whether you looked at a 10 second window or a 2 second window. I was also able to chat with another coworker at Amazon who got numbers from a couple of their PG databases, and connected w one person on slack at a different company who passed along numbers, and got a few emails from Oracle folks. The numbers reported back to me ranged from 29 to over a hundred. Obviously there are also lots of small databases behind wordpress websites with much less activity, but I found this to be an interesting measure of some respectably busy systems. The original context was a conversation related to logical replication of DB changes. But then I got interested in the general question and topic - and someone on the Oracle side mentioned system tables which is a really good point that hadn't occurred to me yet. The original conversation was concerned with user tables and not system ones, but there would be a fair amount of ongoing system table activity too. Besides partitioning, another interesting dimension of the conversation has been thinking about different categories of workloads. For example: SaaS or multitenant applications with many copies of a similar schema, ISVs, ERPs, or large enterprise databases with lots of development history. All of these categories can easily ramp up the counts. I'm still interested in more data - if anyone reading this can grab a couple snapshots of pg_stat_all_tables and report back numbers for a 10 second window and a 2 second window, that would be amazing! -Jeremy -- http://about.me/jeremy_schneider
Re: Corruption or wrong results with 14.10?
On Thu, Nov 23, 2023 at 7:34 AM Torsten Förtsch wrote:
> On Thu, Nov 23, 2023 at 2:29 PM Daniel Westermann (DWE) <daniel.westerm...@dbi-services.com> wrote:
>> smrdbprod=# select crart_id, chemin, count(*) from smrr_mgr.formula group by crart_id, chemin having count(*) > 1;
>>  crart_id | chemin | count
>> ----------+--------+-------
>>   3505463 | @      |     2
>>    473731 | @      |     2
>>   1063238 | @      |     2
>>   1468088 | @      |     2
>>    462723 | @      |     2
>>   1099378 | @      |     2
>>
>> I think I know what I have to do.
>
> If you have moved the database from a system with glibc <2.28 to >=2.28, that could be the reason.
>
> https://wiki.postgresql.org/wiki/Collations

Just one note here... the reason can be _any_ operating system move or update. It can happen with ICU and it can happen with any version of glibc (this is easily reproducible, and we've seen it happen on production PG deployments that didn't involve glibc 2.28).

glibc 2.28 has certainly been the most obvious and impactful case, so the focus is understandable, but there's a bit of a myth that the problem is only with glibc 2.28 (and not ICU, or other glibc versions, or data structures other than indexes).

The only truly safe way to update an operating system under PostgreSQL is with logical dump/load or logical replication, or continuing to compile and use the identical older version of ICU from the old OS (if you use ICU). I think the ICU folks are generally careful enough that it'll be unlikely for compiler changes and new compiler optimizations to inadvertently change collation on newer operating systems and build toolchains.

Ironically I just did a detailed talk on this topic at PASS Data Summit last week, but unfortunately the recording probably won't be released for awhile. 🙂

-Jeremy

--
http://about.me/jeremy_schneider
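As an aside, beyond looking for duplicate rows, the amcheck extension can test whether btree indexes still match the sort order the OS now provides. A rough sketch follows - it is I/O heavy, stops at the first corrupted index it hits, and only covers btree indexes, so a clean run doesn't prove the absence of every kind of damage.

CREATE EXTENSION IF NOT EXISTS amcheck;

-- check every permanent btree index in the current database
SELECT c.oid::regclass AS index_name,
       bt_index_check(c.oid)   -- add the heapallindexed argument for a deeper (slower) check
FROM pg_class c
JOIN pg_am am ON am.oid = c.relam
WHERE am.amname = 'btree'
  AND c.relkind = 'i'
  AND c.relpersistence = 'p';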
Re: Sample data generator for performance testing
On 1/2/24 11:23 PM, arun chirappurath wrote: > Do we have any open source tools which can be used to create sample data > at scale from our postgres databases? > Which considers data distribution and randomness I would suggest to use the most common tools whenever possible, because then if you want to discuss results with other people (for example on these mailing lists) then you're working with data sets that are widely and well understood. The most common tool for PostgreSQL is pgbench, which does a TPCB-like schema that you can scale to any size, always the same [small] number of tables/columns and same uniform data distribution, and there are relationships between tables so you can create FKs if needed. My second favorite tool is sysbench. Any number of tables, easily scale to any size, standardized schema with small number of colums and no relationships/FKs. Data distribution is uniformly random however on the query side it supports a bunch of different distribution models, not just uniform random, as well as queries processing ranges of rows. The other tool that I'm intrigued by these days is benchbase from CMU. It can do TPCC and a bunch of other schemas/workloads, you can scale the data sizes. If you're just looking at data generation and you're going to make your own workloads, well benchbase has a lot of different schemas available out of the box. You can always hand-roll your schema and data with scripts & SQL, but the more complex and bespoke your performance test schema is, the more work & explaining it takes to get lots of people to engage in a discussion since they need to take time to understand how the test is engineered. For very narrowly targeted reproductions this is usually the right approach with a very simple schema and workload, but not commonly for general performance testing. -Jeremy -- http://about.me/jeremy_schneider
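As a concrete starting point, typical pgbench usage looks something like this (the scale factor and client counts are just examples):

# initialize the standard TPCB-like schema at scale factor 100 (~1.5 GB of data)
pgbench -i -s 100 mydb

# run 16 client connections across 4 worker threads for 5 minutes
pgbench -c 16 -j 4 -T 300 mydb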
Re: Migrating From CentOS-6 Oracle 11g R2 To CentOS-7 PostgreSQL ??
On 12/5/17 12:07, Kevin Brannen wrote: > 1. What version of PostgreSQL should I use? > > The latest that works for you. Some people don’t like .0 releases > for various reasons and if that applies to you, then use the latest > of 9.6.x (9.6.5 I think); else use 10.0. In addition, 10.1 is available as of Nov 9th > 2. I don't want to use a RPM because I like controlling where > software is installed, so where can I locate the required tar ball? I wonder if the PGDG rpms are relocatable packages? I haven't tried that yet. > You can download the source and build it yourself; there’s an arg > for where you want to install it. The instructions are pretty easy > to follow if you’ve ever built anything. Agree you can run the build script easily enough, although personally I'd prefer a package over doing my own builds. Maybe it's not a priority to you, but if you're building stuff yourself and you really want to do it right then you should be tracking a lot of stuff and making sure you are using all the right configure options and making sure you can always perfectly reproduce builds. Probably should be creating and archiving your own packages... but depends on your requirements. :) > 5. What additional information might be helpful? Just last night we had a great talk at the Seattle PostgreSQL User Group about moving from MSSQL to PG and many considerations apply equally. In particular, you can find lots of code to make PostgreSQL act more like your source DB... but you want to carefully consider whether this is the right approach in your situation. (There be flame-wars here.) And speaking of PUGs, you could check for a local one of those and get connected up to ask future questions face-to-face too. :) -J -- http://about.me/jeremy_schneider
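For completeness, a minimal source build with a custom install location looks roughly like this (the prefix path is just an example; add whichever configure flags you actually need, e.g. --with-openssl):

# download and unpack the source tarball from postgresql.org first
./configure --prefix=/opt/postgresql/10.1
make
make install

# contrib modules (optional but usually wanted)
make -C contrib install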
Re: Is it possible to sort strings in EBCDIC order in PostgreSQL server?
On 12/12/17 10:21, Peter Geoghegan wrote:
> ICU supports creating custom collations that reorder upper and lower
> case, or digits with scripts (e.g. Latin alphabet characters). See the
> documentation -- "23.2.2.3.2. ICU collations". Advanced customization
> is possible.

I just gave this a try and I'm not sure I completely understand what it's doing, but it definitely doesn't look like it's sorting according to EBCDIC byte order. (This was on centos 7 with libicu-50.1.2.) I might have created the collation incorrectly. I'd love to hear if anyone else finds a way to get this working outside of custom comparison operators.

pops-10.1 root@db1=# create collation "ebcdic" (provider=icu, locale='cp037');
CREATE COLLATION
pops-10.1 root@db1=# create table test (data text);
CREATE TABLE
pops-10.1 root@db1=# insert into test values ('a'),('b'),('A'),('B'),('1'),('2'),('!'),('^');
INSERT 0 8
pops-10.1 root@db1=# select * from test order by data collate "ebcdic";
 data
------
 !
 ^
 1
 2
 a
 A
 b
 B
(8 rows)

--
http://about.me/jeremy_schneider
Re: Disabling vacuum truncate for autovacuum
On Mon, 16 Dec 2024 16:25:06 -0800 Will Storey wrote: > I would like to disable vacuum's truncate behaviour for autovacuum. > Previously I had an outage due to its access exclusive lock when it > was replicated to a hot standby. > > When that outage happened it was from a VACUUM call in a cronjob > rather than autovacuum. I now run such VACUUMs with TRUNCATE false > which avoids the issue for these. However I've realized that > autovacuum could cause this as well. This is of interest to me > because I'm looking at tuning autovacuum and getting rid of the > cronjob, but I've realized relying on autovacuum could be dangerous > because of the truncates. Can you tell us a little bit more about the outage? Autovacuum is designed to quickly relinquish this lock if there is any contention, and the dangers of disabling autovacuum are significant, so your statement about autovac being "dangerous" will raise a lot of eyebrows. Did your outage involve hot standbys serving read-only traffic, or did it only involve a read-write database? What was the exact nature of the outage and how did you narrow down the cause to the exclusive lock held specifically during an autovacuum truncation? -Jeremy
Re: Disabling vacuum truncate for autovacuum
On Thu, 26 Dec 2024 13:24:03 -0800 Will Storey wrote: > My incident was actually not caused by autovacuum. A VACUUM was run > against the primary by a cronjob. A web service running read queries > against hot standbys went down for several minutes as its queries > were stuck in a lock queue. > > ... > > As I recall, I confirmed the cause via query logs. I noticed the > table was vacuumed at the time, which lead me to learning about the > page truncation behaviour. It has been a couple years though. Ah - thanks - this is very helpful. I have also seen issues specifically with hot standbys, which continue holding the exclusive lock even when the primary read-write instance releases the lock. A better solution in my opinion would be to enhance the WAL replay process so that it can somehow temporarily relinquish the exclusive lock under contention, similar to what the primary read-write instance is able to do. This is not an easy enhancement to make. Maybe we'd need the primary to put more information into the WAL than it does today. Maybe we'd need to leverage hot_standby_feedback to enable standbys to signal a primary to release the lock. Anyway thanks for the report - we need people reporting these issues on the lists so that there's a little visibility into the impact. Personally I'm still hesitant about the idea of globally disabling vacuum truncation. That was never the goal of the old_snapshot_threshold feature, interesting that you were able to capitalize on the side-effect. Personally I'd still favor disabling it only on the tables that are both frequently vacuumed and also frequently queried on hot standbys. In a pinch, you could disable it for all tables with a bit of dynamic SQL and ensuring that new tables created in the future include the syntax to disable it too. -Jeremy
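If someone does decide to go the blanket route on PostgreSQL 12+, the "bit of dynamic SQL" amounts to something like this (a sketch; run it as a role that owns the tables, and remember that tables created later won't inherit the setting):

-- disable the truncation step on every non-catalog table (PostgreSQL 12+)
DO $$
DECLARE
  r record;
BEGIN
  FOR r IN
    SELECT format('%I.%I', schemaname, tablename) AS tbl
    FROM pg_tables
    WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
  LOOP
    EXECUTE format('ALTER TABLE %s SET (vacuum_truncate = off)', r.tbl);
  END LOOP;
END $$;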
Re: glibc 2.35-2.39 upgrade requirements
On Fri, 17 Jan 2025 10:27:04 -0500 Tom Lane wrote: > Kamen Kalchev writes: > > Hi everyone, we're planning to upgrade the OS running Postgres from > > ubuntu jammy to ubuntu noble. As part of the OS change, the glibc > > version will be changed from glibc 2.35 to glibc 2.39.. > > Can someone confirm if changing the glibc between those versions > > will require a full reindex of the Postgres cluster? > > Maybe, maybe not. According to [1], the last glibc collation change > that the PG community really noticed was in glibc 2.28. So maybe > there weren't any significant changes between 2.35 and 2.39. The > conservative path would certainly be to reindex all textual columns > (though you can skip any that have collation "C"). I haven't run 2.39 through the scan yet [1]; I should do that because someone was asking the same question on postgres slack. But note that every single ubuntu LTS and every single RHEL major release in the last 10 years has had collation changes, except for ubuntu 14.04 ... so it's worth being cautious. Collations are a bit like time zones - small changes are always happening, but you might not always notice. Jeff Davis and I did a talk at the last pgconf about this, the recording is online [2]. Personally I would recommend using the builtin C collation as database default starting in pg17, and using ICU to do linguistic collation at the table or query level when needed. With ICU there's at least the option to rebuild old versions on new operating system majors, if needed. (Though rebuilding objects - not just indexes, but anything depending on the collation - is the best course.) And be careful about hot standbys, FDWs, and other places where you can get little surprises with different OS majors. The YouTube recording has lots of info. -Jeremy 1: https://github.com/ardentperf/glibc-unicode-sorting 2: https://www.youtube.com/watch?v=KTA6oau7tl8
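To make that recommendation concrete, the PG 17+ pattern looks something like the sketch below (database, collation, table and column names are just examples; adjust encoding and locale options for your environment):

-- database default: builtin "C" provider, no OS or ICU dependency for sorting
CREATE DATABASE appdb
  TEMPLATE template0
  LOCALE_PROVIDER builtin
  BUILTIN_LOCALE 'C';

-- linguistic collation only where it's actually needed, via ICU
CREATE COLLATION german_icu (provider = icu, locale = 'de-DE');
SELECT name FROM customers ORDER BY name COLLATE "german_icu";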
Re: verify checksums online
On Fri, 18 Apr 2025 17:32:19 -0400 Ron Johnson wrote: > On Fri, Apr 18, 2025 at 5:18 PM Jeremy Schneider > wrote: > > > i think there had been some mailing list discussions years ago? the > > pg_checksum utility in core still can't do an online check right? > > > > https://www.postgresql.org/docs/17/app-pgchecksums.html > > "The server must be shut down cleanly before running pg_checksums." Yes I read the docs. This utility has evolved over time including a complete rename, and I was just doublechecking here that we didn't miss a doc update in case this sentence applied to enable/disable but not to verify. As I said I'm assuming it's all accurate but just doing the paranoid double-check -Jeremy
verify checksums online
i should know the answer to this... but asking anyway i think there had been some mailing list discussions years ago? the pg_checksum utility in core still can't do an online check right? looking around, i think that michael's pg_checksums_ext still seems to be available in debian repos, and it advertises online checks, which makes me think that the core pg_checksum might still not have merged any code for online checking yet but i wanted to confirm with someone - my easiest options for online verifying checksums would be to trigger pg_basebackup then check pg_stat_database, or to install michael's utility? -Jeremy
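for the record, the check-after-basebackup idea is just this (pg_basebackup verifies checksums by default on 11+ when they're enabled, and failures it finds show up in the stats view on 12+):

-- cumulative checksum failure counts per database (PostgreSQL 12+)
SELECT datname, checksum_failures, checksum_last_failure
FROM pg_stat_database
WHERE checksum_failures IS NOT NULL;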
the postgr.es/m/MESSAGE_ID URL format
Hello, Is there a page somewhere on the postgres wiki with a simple high-level explanation of what the https://postgr.es/m/MESSAGE_ID URL format is and how to construct one of these URLs? For example, can this be used for all postgres mailing lists or only certain lists (like hackers)? It's probably a two sentence thing, but with some basic searching I couldn't find a wiki page with those two sentences It is referenced here for commit messages https://wiki.postgresql.org/wiki/Commit_Message_Guidance I also find it useful for mailing list messages -Jeremy
Re: COLLATION update in 13.1
On Mon, 24 Feb 2025 11:08:43 +0100 Matthias Apitz wrote: > > What is the procedure on 13.1 to bring the external (glibc) version > in sync with. the used version in the PostgreSQL database? If I recall correctly, between versions 10 and 14 you need to use ALTER COLLATION name REFRESH VERSION on every collation where there's a version mismatch. (This can happen with ICU collations since version 10, and with glibc collations since version 13.) I only know of a small number of changes between glibc versions 2.31 and 2.38 ~ I suspect you already know to look out for glibc 2.28 which was the crazy one. (SLE15 Service Pack 3.) Most databases would be corrupted by that update, and running "refresh version" would simply stop the warnings without fixing the corruption. -Jeremy
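A rough way to find the collations that need the refresh is below (a sketch; on 13 this covers libc and ICU collation objects but not the database default collation, and "en_US" in the last statement is just an example):

-- list collations whose recorded version no longer matches the provider
SELECT oid::regcollation AS collation_name,
       collversion       AS recorded_version,
       pg_collation_actual_version(oid) AS current_version
FROM pg_collation
WHERE collversion IS NOT NULL
  AND collversion <> pg_collation_actual_version(oid);

-- then, after reindexing/fixing the affected objects, for each one reported:
ALTER COLLATION "en_US" REFRESH VERSION;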