Re: PL/pgSQL question about EXCEPTION clause & corrupt records

2020-02-14 Thread Jeremy Schneider


> On Feb 14, 2020, at 04:39, Nick Renders  wrote:
> 
> I get the following message:
> 
>   ERROR:  missing chunk number 0 for toast value 8289525 in pg_toast_5572299

What version of PostgreSQL are you running? I’ve seen this a number of times 
over the past couple of years; I'm curious whether the lurking bug is still 
observed in the latest versions.

-Jeremy

Sent from my TI-83




Re: PL/pgSQL question about EXCEPTION clause & corrupt records

2020-02-17 Thread Jeremy Schneider
FWIW, Bertrand blogged an even faster way to do this about a month ago - using 
pageinspect and processing blocks instead of rows

https://bdrouvot.wordpress.com/2020/01/18/retrieve-postgresql-variable-length-storage-information-thanks-to-pageinspect/

-J

Sent from my TI-83

> On Feb 17, 2020, at 03:32, Nick Renders  wrote:
> 
> The problem is that I don't know which column is corrupt. But I found a 
> solution: by simply copying the record into another variable, the values are 
> parsed and the TOAST errors are thrown.
> 
> In case anyone's interested, here's my code, based on an example from 
> http://www.databasesoup.com/2013/10/de-corrupting-toast-tables.html
> 
> 
> DO $f$
> DECLARE
>    rContent1 record;
>    rContent2 record;
>    iCounter integer DEFAULT 1;
>    iValue integer;
>    pTableName varchar := 'f_gsxws_transaction';
>    pFieldName varchar := 'gwta_number';
> BEGIN
>    FOR iValue IN EXECUTE 'SELECT ' || pFieldName || ' FROM ' || pTableName::regclass || ' ORDER BY ' || pFieldName LOOP
>        BEGIN
>            EXECUTE 'SELECT * FROM ' || pTableName::regclass || ' WHERE ' || pFieldName || ' = $1'
>                INTO rContent1
>                USING iValue;
>            rContent2 := rContent1;
>        EXCEPTION WHEN OTHERS THEN
>            RAISE NOTICE 'data for %.% % is corrupt', pTableName, pFieldName, iValue;
>        END;
>        IF iCounter % 10 = 0 THEN
>            RAISE NOTICE '% % records checked', iCounter, pTableName;
>        END IF;
>        iCounter := iCounter + 1;
>    END LOOP;
> END;
> $f$;
> 
> 
> Cheers,
> 
> Nick
> 
> 
>> On 14 Feb 2020, at 16:14, Tom Lane wrote:
>> 
>> "Nick Renders"  writes:
>>> I thought I would write a little PL script that would loop through all
>>> the data and report any inconsistencies. However, I can't get it to work
>>> properly.
>>> ...
>>> 1) The function has no problem executing the SELECT statement. It is
>>> only when "rcontents" is returned, that the function fails. This is a
>>> problem, because the ultimate goal is to loop through all records and
>>> only return/alert something in case of an error.
>>> 2) The function never enters the EXCEPTION clause. Instead, when it hits
>>> the RETURN command, it breaks and shows the same message as in pgAdmin:
>>> missing chunk number 0 for toast value 8289525 in pg_toast_5572299.
>> 
>> I think what's happening there is that the function doesn't try to
>> dereference the value's TOAST pointer during SELECT INTO.  It just stores
>> that pointer into a variable, and only sometime later when the actual
>> content of the value is demanded, do you see the error raised.
>> 
>> The solution to that is to do something that uses the contents of the
>> busted column right away while still inside the EXCEPTION block, perhaps
>> along the lines of "select md5(mycolumn) into local_variable from..."
>> 
>> A close reading of
>> 
>> https://www.postgresql.org/docs/current/storage-toast.html
>> 
>> would probably help you understand what's happening here.
>> 
>>regards, tom lane
> 
> 


Re: PL/pgSQL question about EXCEPTION clause & corrupt records

2020-02-17 Thread Jeremy Schneider
On 2/17/20 08:23, Jeremy Schneider wrote:
> FWIW, Bertrand blogged an even faster way to do this about a month ago -
> using pageinspect and processing blocks instead of rows
> 
> https://bdrouvot.wordpress.com/2020/01/18/retrieve-postgresql-variable-length-storage-information-thanks-to-pageinspect/
> 
> 
>> On Feb 17, 2020, at 03:32, Nick Renders  wrote:
>>
>> The problem is that I don't know which column is corrupt. But I found
>> a solution: by simply copying the record into another variable, the
>> values are parsed and the TOAST errors are thrown.
>>
>> In case anyone's interested, here's my code, based on an example from
>> http://www.databasesoup.com/2013/10/de-corrupting-toast-tables.html

Apologies - sent that slightly rushed email while I was taking the bus
to the office this morning and linked the wrong blog post :D

Here's the right one, showing a much faster way to identify which tuple
links to a bad toast row:

https://bdrouvot.wordpress.com/2020/01/04/get-toast-chunk_id-from-the-user-table-tuples-or-from-the-toast-index-thanks-to-pageinspect/

-Jeremy

-- 
http://about.me/jeremy_schneider




Re: Real application clustering in postgres.

2020-03-06 Thread Jeremy Schneider


On 3/6/20 01:25, Ron wrote:
> On 3/5/20 6:07 AM, Laurenz Albe wrote:
>> On Thu, 2020-03-05 at 07:45 +, Daulat Ram wrote:
>>> Is there any possibility/options to setup a real application clustering in 
>>> Postgres as in Oracle we have a  RAC feature.
>> No, and as far as I know nobody feels interested in providing it.
>>
>> RAC is a complicated architecture that doesn't do much good, so most
>> people feel that it would be a waste of time and effort.
>>
>> RAC is not really a scaling solution: because of the shared storage,
>> you can only scale for more CPUs; I/O remains the bottleneck.
>> RAC is not really a high availability solution: because of the shared
>> storage, it has a single point of failure.
> 
> This is utter nonsense.  Dual redundant storage controllers connected to
> disks in RAID-10 configurations have been around for *at least* 25 years.
> 
> Oracle got its clustering technology from DEC, and I *know* that
> works.  Cluster members, storage controllers and disks have all gone
> down, while the database and application keep on humming along.

Just want to point out that while the merits of RAC are treated as
universal truth in the Oracle marketing community, they are certainly
still debated and nuanced within the Oracle technical community.

Some great reading would be Mogens Norgaard's article "You Probably
Don't Need RAC" from 2003 (which is surprisingly hard to find on the
internet) and much of the ensuing discussion between then and now, for
example this recent message over on the oracle-l list:

https://www.freelists.org/post/oracle-l/Chuckleworthy-issue-of-the-NoCOUG-Journal,1

For my part, I spent many years running RAC clusters and solving the
usual handful of problems you bump into, and I now have a strong bias
toward architectures that are as simple as possible - regardless of
which parties participate in your operations, and regardless of who owns
the data center where your stack is running.  Note that I apply this to
all the new databases as well. I enjoy and appreciate the opportunities
I've had to work on some really interesting new technology - but I also
still see merit in simple, boring, mature, well-understood architectures
if they are doing the job. Complexity will find us all soon enough,
driven by true business needs, without any help from the pursuit of
shiny things!

It seemed to me there was a major and well-funded effort to market and
sell cluster databases for many years, and as a result I suspect that
while there are certainly some good use cases, there are probably also
some people using RAC today who would do fine (or better) without it.

Seems I even touched on this all the way back in 2007 in an article on
my own blog...  https://ardentperf.com/2007/05/10/fidelity-case-study/

-Jeremy

-- 
http://about.me/jeremy_schneider




Re: Efficiently advancing a sequence without risking it going backwards.

2020-07-09 Thread Jeremy Schneider


> On Jul 6, 2020, at 19:06, Paul McGarry  wrote:
> 
> I don't think I can use setval(), because it risks making sequences go 
> backwards, eg:
> 
> 1) Check values
> DB1sequence: 1234
> DB2sequence: 1233 (1 behind)
> 2) setval('DB2sequence',1234);
> 
> but if between (1) and (2) there are 2 nextval(DB2sequence) calls on another 
> process,  (2) would take the sequence back from 1235 to 1234 and I would end 
> up trying to create a duplicate key ID from the sequence.

An ability to “lock” the sequence momentarily would give you the tool you need, 
but I don’t think it’s there.

Total hack, but if your application or users can retry when the rare error is 
encountered, then one idea is to rename the sequence momentarily while you do 
the setval(), then rename it back. Do an initial check without renaming, then 
re-check after renaming and before the setval() call.

If you put retry logic into your application then make sure to include back-off 
logic so you don’t get an outage induced by thundering herd.
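
A minimal sketch of that dance, with hypothetical names (any session calling 
nextval('DB2sequence') by name during the window will get an error, which is 
where the retry logic comes in):

  -- initial check, no rename yet
  SELECT last_value FROM db2sequence;
  -- rename so other sessions can't advance the sequence by name
  ALTER SEQUENCE db2sequence RENAME TO db2sequence_maint;
  -- re-check now that the name is out of the way, then advance it
  SELECT last_value FROM db2sequence_maint;
  SELECT setval('db2sequence_maint', 1234);   -- only if it is still behind
  -- put the name back
  ALTER SEQUENCE db2sequence_maint RENAME TO db2sequence;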

-Jeremy

Sent from my TI-83






Re: Efficiently advancing a sequence without risking it going backwards.

2020-07-09 Thread Jeremy Schneider


>> On Jul 9, 2020, at 14:08, Christopher Browne  wrote:
> 
>> On Thu, 9 Jul 2020 at 12:59, Jeremy Schneider  
>> wrote:
> 
>> 
>> > On Jul 6, 2020, at 19:06, Paul McGarry  wrote:
>> > 
>> > I don't think I can use setval(), because it risks making sequences go 
>> > backwards, eg:
>> > 
>> > 1) Check values
>> > DB1sequence: 1234
>> > DB2sequence: 1233 (1 behind)
>> > 2) setval('DB2sequence',1234);
>> > 
>> > but if between (1) and (2) there are 2 nextval(DB2sequence) calls on 
>> > another process,  (2) would take the sequence back from 1235 to 1234 and I 
>> > would end up trying to create a duplicate key ID from the sequence.
>> 
>> An ability to “lock” the sequence momentarily would give you the tool you 
>> need, but I don’t think it’s there.
> 
> The use-case where you need a lock on the value so that there can't possibly 
> be a hole in the sequence

OP asked for a way to call setval() with a guarantee the sequence will never go 
backwards, IIUC. His code can check that the new value he wants to set is higher 
than the current value, but there’s a race condition where a second connection 
could quickly advance the sequence between the check and the setval() call, and 
then cause duplicates from the next call - which is bad.

The ideal solution would be a setval_forward_only() or setval_no_duplicates() 
function that does it atomically, or something along those lines. If it were 
possible to “lock” the entire sequence to prevent any other sessions from using 
it at all, that would work too - not locking a value, locking the whole thing. The 
very bad hack solution is renaming the sequence and then renaming it back as a 
blunt form of locking... and to be clear, I don’t think it’s a good idea; I was 
just saying that technically it might work.  :)

-Jeremy

Sent from my TI-83



Re: Could not open file pg_xact/0E97

2020-07-20 Thread Jeremy Schneider

> On Jul 18, 2020, at 14:18, Radoslav Nedyalkov  wrote:
> 
> 
> Well. the vacuum full failed with 
> 
> vacuumdb: vacuuming of table "olap.transactions_and_fees_2020_01" in database 
> "db" failed: ERROR:  found xmin 3916900817 from before relfrozenxid 80319533

Do you have checksums enabled for this database?

-Jeremy

Sent from my TI-83



Re: Orphaned relations after crash/sigkill during CREATE TABLE

2020-08-20 Thread Jeremy Schneider
On 8/20/20 14:46, Tom Lane wrote:
> There might be value in a user-invokable tool that runs in an existing
> non-crashed database and looks for orphan files, but I'm not aware that
> anyone has written one.  (Race conditions against concurrent table
> creation would be a problem; but probably that can be finessed somehow,
> maybe by noting the file's creation time.)

Bertrand Drouvot just put out a tool to list potentially orphaned files
a couple months ago.

https://github.com/bdrouvot/pg_orphaned

It's implemented as an extension and doesn't remove the files, but it's
useful nonetheless.

-Jeremy


-- 
http://about.me/jeremy_schneider




Re: Returning timestamp with timezone at specified timezone irrespective of client timezone

2020-09-27 Thread Jeremy Schneider
On 9/27/20 16:13, Ron wrote:
> On 9/27/20 4:16 PM, aNullValue (Drew Stemen) wrote:
>> What I need is for the ability to return a timestamp with timezone,
>> using the UTC offset that corresponds to a column-defined timezone,
>> irrespective of the client/session configured timezone.
>>
>> I have three columns in a table:
>> Timezone: 'US/Eastern'
>> Date: 2020-10-31
>> Time: 08:00
>>
>> The output I'm able to find includes these possibilities:
>> '2020-10-31 08:00:00'
>> '2020-10-31 12:00:00+00'
>>
>> Whereas what I actually need is:
>> '2020-10-31 08:00:00-05'
>>
>> Using the postgresql session-level timezone configuration won't work
>> because I need multiple timezones to be handled in a single set.
> 
> Are you really asking what the TZ offset was on a specific date (Like
> DST or not)?

IIUC, there is a gap here in PostgreSQL. i think it could most
easily/quickly be addressed with an overloaded version of to_char that
accepts a "display timezone" for its timestamp to character conversion.

FWIW - in Oracle this is handled by having two different data types:
1) TIMESTAMP WITH TIME ZONE
2) TIMESTAMP WITH LOCAL TIME ZONE

ironically, oracle's "local" data type is the equivalent of PostgreSQL's
timestamp with time zone, where the timestamp is converted and
processed/stored without a time zone. afaik postgresql doesn't have a
native data type equivalent to the first variant in oracle, which
actually considers the time zone as part of the data. (am i missing
something?)

in lieu of having built-in support, a PL/pgSQL function to set the
session-level timezone in between processing each record is the best
approach i've thought of so far.

-Jeremy

-- 
http://about.me/jeremy_schneider




Re: Returning timestamp with timezone at specified timezone irrespective of client timezone

2020-09-27 Thread Jeremy Schneider
On 9/27/20 16:42, Jeremy Schneider wrote:
> On 9/27/20 16:13, Ron wrote:
>> On 9/27/20 4:16 PM, aNullValue (Drew Stemen) wrote:
>>> What I need is for the ability to return a timestamp with timezone,
>>> using the UTC offset that corresponds to a column-defined timezone,
>>> irrespective of the client/session configured timezone.
>>>
>>> I have three columns in a table:
>>> Timezone: 'US/Eastern'
>>> Date: 2020-10-31
>>> Time: 08:00
>>>
>>> The output I'm able to find includes these possibilities:
>>> '2020-10-31 08:00:00'
>>> '2020-10-31 12:00:00+00'
>>>
>>> Whereas what I actually need is:
>>> '2020-10-31 08:00:00-05'
>>>
>>> Using the postgresql session-level timezone configuration won't work
>>> because I need multiple timezones to be handled in a single set.
>>
>> Are you really asking what the TZ offset was on a specific date (Like
>> DST or not)?
> 
> in lieu of having built-in support, a PL/pgSQL function to set the
> session-level timezone in between processing each record is the best
> approach i've thought of so far
FYI, here's the hack approach I was thinking of.

I intentionally didn't preserve the session's timezone in the
transaction, but that could easily be done with a few more lines of
PL/pgSQL.


create or replace function to_char(
   v_tstz timestamp with time zone
  ,v_format text
  ,v_tz text
) returns text language plpgsql
immutable parallel safe
as $$
begin
  perform set_config('timezone',v_tz,true);
  return to_char(v_tstz,v_format);
end;
$$
;

SELECT
 id
,to_char(l.loc_date+l.loc_time::time
 ,'YYYY-MM-DD HH24:MI:SSOF'
 ,timezone) tsw
FROM loc l
ORDER BY timezone, loc_date, loc_time
;

 id |          tsw
----+------------------------
  3 | 2020-10-31 03:00:00-05
  4 | 2020-11-03 08:00:00-06
  1 | 2020-10-31 09:00:00-04
  2 | 2020-11-03 08:00:00-05


https://gist.github.com/aNullValue/ba838d6b40495695df0daa11c2748248


-- 
http://about.me/jeremy_schneider




Re: Returning timestamp with timezone at specified timezone irrespective of client timezone

2020-09-27 Thread Jeremy Schneider
On 9/27/20 20:13, Jeremy Schneider wrote:
> create or replace function to_char(
>v_tstz timestamp with time zone
>   ,v_format text
>   ,v_tz text
> ) returns text language plpgsql
> immutable parallel safe
> as $$
> begin
>   perform set_config('timezone',v_tz,true);
>   return to_char(v_tstz,v_format);
> end;
> $$
> ;


Just occurred to me, I don't know if this is actually parallel safe. I'm
not sure how transaction-level session configuration is handled inside
parallel workers. Might be best to leave off the "parallel safe" flag
from the function for now.

-J

-- 
http://about.me/jeremy_schneider




Re: Monitoring multixact members growth

2022-08-19 Thread Jeremy Schneider
On 8/19/22 12:52 AM, Vido Vlahinic wrote:
> My goal here is to predict where multixact members are growing the
> fastest so I can perform manual VACUUM FREEZE only on those tables
>
> (typically with multi-billion row count) when system is relatively
> idle as opposed to just sit and wait for wraparound protection to take
> over
>
> when autovacuum_multixact_freeze_max_age threshold is reached (slowing
> the whole system down). 
>

I think that you're probably approaching this wrong. Vacuum is something
that you generally want to run more aggressively, not less. To be fair,
it's a very common misunderstanding that deferring vacuum processing
until later can be a good idea - when in fact it works the opposite way.
On systems with significant load (where it matters), a long-running
report or query that needs old row versions for its own processing can
cause a lot of table and index bloat and negatively impact real-time
transactional performance. (For really long-running stuff, it's sometimes
better to use a snapshot of the DB, or a standby system that's
disconnected from the primary for reporting and periodically replays
logs to catch up. But obviously you start simple and don't add this
complexity to the architecture until it's truly needed.)

Funny thing is that I've had to do exactly what you're asking about, as
part of troubleshooting problems - but the goal wasn't to run vacuum
later but to run a vacuum freeze IMMEDIATELY.  🙂  As one example,
pile-ups on LWLock multixact_offset.

Here's one pageinspect query that did the trick for me. In the first
line (WITH...) you change public.my_test to the table you want to
inspect. This only looks at a single table and it was for
troubleshooting the aforementioned wait event, so it's actually breaking
down mxid's by SLRU page numbers. If you're seeing a large number of
SLRU pages (lots of rows coming back) then that means you might want to
proactively run a manual vacuum freeze. (And then see if you can update
the app code to reduce mxid usage!)
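
As an aside on the original monitoring question: this doesn't show member-space
usage, but per-table multixact age is easy to watch with mxid_age(), roughly
like this:

  SELECT oid::regclass AS table_name, mxid_age(relminmxid) AS minmxid_age
  FROM pg_class
  WHERE relkind IN ('r','m','t') AND relminmxid::text <> '0'
  ORDER BY 2 DESC
  LIMIT 20;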

I'm not answering your question, but thought it was a nice excuse to
share a related query and pontificate a bit... hopefully useful to someone!

-Jeremy

=

pg-14.4 rw root@db1=# create extension pageinspect;
CREATE EXTENSION
Time: 7.561 ms

pg-14.4 rw root@db1=# with tab_name as (select 'public.my_test' t)
select min(now()) current_timestamp, count(*) number_mxid_on_page,
       min(xmax) min_mxid, max(xmax) max_mxid,
       trunc((xmax)/(8192/4)) page_no
from (
  select ((attrs).t_infomask::bit(16) & x'1000'::bit(16))::int::boolean is_multixact,
         (attrs).t_infomask::bit(16) infomask,
         (attrs).t_xmax::text::integer xmax
    from (
         select page, heap_page_item_attrs(get_raw_page((select t from tab_name), page),
                                           (select t from tab_name)::regclass) attrs
           from generate_series(0, (select relpages from pg_class where
                oid = (select t from tab_name)::regclass) - 1) page
    ) subq where (attrs).t_infomask is not null
  ) subq3 where is_multixact
group by trunc((xmax)/(8192/4));
 current_timestamp | number_mxid_on_page | min_mxid | max_mxid | page_no
-------------------+---------------------+----------+----------+---------
(0 rows)

Time: 2223.640 ms (00:02.224)

pg-14.4 rw root@db1=# begin;
BEGIN
Time: 0.466 ms

pg-14.4 rw root@db1=# select * from my_test where i<5 for update;
 i | data
---+--
 1 | 
 2 | 
 3 | 
 4 | 
(4 rows)

Time: 50.074 ms

pg-14.4 rw root@db1=# savepoint a;
SAVEPOINT
Time: 0.605 ms

pg-14.4 rw root@db1=# update my_test set i=i-10 where i<5;
UPDATE 4
Time: 49.481 ms

pg-14.4 rw root@db1=# with tab_name as (select 'public.my_test' t)
select min(now()) current_timestamp, count(*) number_mxid_on_page,
       min(xmax) min_mxid, max(xmax) max_mxid,
       trunc((xmax)/(8192/4)) page_no
from (
  select ((attrs).t_infomask::bit(16) & x'1000'::bit(16))::int::boolean is_multixact,
         (attrs).t_infomask::bit(16) infomask,
         (attrs).t_xmax::text::integer xmax
    from (
         select page, heap_page_item_attrs(get_raw_page((select t from tab_name), page),
                                           (select t from tab_name)::regclass) attrs
           from generate_series(0, (select relpages from pg_class where
                oid = (select t from tab_name)::regclass) - 1) page
    ) subq where (attrs).t_infomask is not null
  ) subq3 where is_multixact
group by trunc((xmax)/(8192/4));
       current_timestamp        | number_mxid_on_page | min_mxid | max_mxid | page_no
--------------------------------+---------------------+----------+----------+---------
 2022-08-19 23:05:43.349723+00  |                   4 |        1 |        1 |       0
(1 row)

Time: 2117.555 ms (00:02.118)

pg-14.4 rw root@db1=#


-- 
http://about.me/jeremy_schneider


Re: postgis after pg_upgrade

2018-12-04 Thread Jeremy Schneider
On 11/30/18 05:11, Slavcho Trnkovski wrote:
> I have postgresql 9.4 with postgis extension installed (latest version,
> 2.4.5). I upgraded postgresql from 9.4 to 9.6. 
> After upgrading to 9.6 I get the following result
> from PostGIS_full_version()
>  select PostGIS_full_version();
> 
>  ... (procs need upgrade for use with "96") ...
> 
> Is there any way to resolve this besides recreating the extension?

If I'm reading the postgis docs correctly, this message specifically
means that you must do a full dump/reload of postgis-enabled databases.

http://postgis.net/docs/manual-2.4/postgis_installation.html#upgrading

> If you can't find the postgis_upgrade*.sql specific for upgrading
> your version you are using a version too early for a soft upgrade
> and need to do a HARD UPGRADE.
>
> The PostGIS_Full_Version function should inform you about the need
> to run this kind of upgrade using a "procs need upgrade" message.
>
> By HARD UPGRADE we mean full dump/reload of postgis-enabled
> databases.

-- 
http://about.me/jeremy_schneider



Re: Transaction Id Space, Freezing and Wraparound

2018-12-08 Thread Jeremy Schneider
On 11/20/18 13:00, Tom Lane wrote:
> If the stored XIDs were 64 bits wide, we'd not have to bother with all
> of this mess ... but adding another 64 bits to tuple headers would be
> a painful space cost, not to mention the storage compatibility issues.

People keep saying that. But didn't someone come up with a way to do
this by storing the top 32 bits just once somewhere in the block, rather
than adding 64 bits to every tuple? I can't remember if there was an
email thread.

Maybe some other drawbacks to that approach, but lets at least point out
adding 64 bits to the tuple header isn't the only way to accomplish this.

And the other thread about memory management... if I'm going to start
religious wars, might as well just start them all at once right?  ;)

-J

-- 
http://about.me/jeremy_schneider



Re: Amazon Linux Support?

2019-05-07 Thread Jeremy Schneider


> On Apr 30, 2019, at 07:16, Adrian Klaver  wrote:
> 
>> On 4/30/19 2:02 AM, Lewis Shobbrook wrote:
>> Hi Guys,
>> With the repo changes associated with the April 17 changes,
>> https://pgstef.github.io/2019/04/17/one_rpm_to_rule_them_all.html
>> It is evident that support for amazon linux has been dropped.
>> While you can try to use redhat pgdp packages, they are not
>> installable on Amazon Linux, giving the following error...
>> /etc/redhat-release is needed by pgdg-redhat-repo-42.0-4.noarch
>> I confident that the rest rpms available in the repo can be used by
>> Amazon linux, but the redhat-release dependency is painful for
>> automation with existing recipes such as postgresql in chef.
>> Before I invest the effort to produce a pull request to address this,
>> I wanted to confirm that amazon linux has been dropped and will no
>> longer, nor likely ever be supported for the rpms repo's, or is it
>> still on the road map?
> 
> You will need a community account to see the issue:
> 
> https://redmine.postgresql.org/issues/4205
> 
> The relevant part:
> 
> Updated by Devrim Gündüz 4 days ago
> 
> "
> Hi,
> 
> We dropped Amazon Linux support years ago, there were lots of compatibility 
> issues. The new repo RPMs just reflect that.
> 
> Please switch to a supported distro, or rebuild RPMs from SRPMS.
> 
> Regards, Devrim
> 

I've been using the rhel6 repo with AL without issue for a long time now. It 
looks like they added a new dependency that breaks this, but I recently tested 
this workaround and it seems to work:
`rpm -Uvh --nodeps 
https://download.postgresql.org/pub/repos/yum/testing/11/redhat/rhel-6-x86_64/pgdg-redhat11-11-2.noarch.rpm`

The other thing that I do is a `s/$releasever/6/g` on 
`/etc/yum.repos.d/pgdg-*.repo` -- I have this all baked into some ansible 
scripts and they worked fine just now when I spun up a new ec2/postgres test 
server.

-Jeremy

Sent from my TI-83





Re: PostgreSQL on Amazon RDS

2019-05-08 Thread Jeremy Schneider
On 5/6/19 23:27, Rashmi V Bharadwaj wrote:
> Is there a SQL query or a database parameter setting that I can use from
> an external application to determine if the PostgreSQL database is on
> cloud (like on Amazon RDS or IBM cloud) or on a non-cloud on-prem
> environment?

Here's my psqlrc file, which has pretty solid logic for distinguishing between
Community PostgreSQL, RDS PostgreSQL and Aurora PostgreSQL.  Note that
it just assumes "community/oss" as a default if it doesn't detect the
other two.  Should be easy enough to add detection of other hosted
environments into the query with the "priority" column (as long as
there's a reliable way to detect).

https://gist.github.com/ardentperf/52bd418e44b1be26d7b63af21331cece

This psqlrc also detects read-write versus read-only (e.g. hot standby),
and the usual username/database/superuser info and builds everything
into the prompt in a way that suits me.
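
The core of the detection is just checking for objects that only exist on each
platform; a simplified sketch of the idea (assuming the rds_superuser role and
the aurora_version() function are present in those environments):

  SELECT CASE
           WHEN EXISTS (SELECT 1 FROM pg_proc  WHERE proname = 'aurora_version') THEN 'aurora'
           WHEN EXISTS (SELECT 1 FROM pg_roles WHERE rolname = 'rds_superuser')  THEN 'rds'
           ELSE 'community/oss'
         END AS environment;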

-Jeremy

-- 
http://about.me/jeremy_schneider




Re: Hot Standby Conflict on pg_attribute

2019-05-11 Thread Jeremy Schneider
Just a quick footnote: If autovac truncations are frequently causing replica 
lag, and if this is a problem for you, IIUC one way you can stop autovac from 
doing the truncations even on older versions is setting old_snapshot_threshold 
to any value at all besides zero.  (On 12+ you can directly control the 
truncation behavior.)
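
For reference, the 12+ per-table knob looks like this (table name hypothetical):

  ALTER TABLE busy_table SET (vacuum_truncate = off);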

-Jeremy

Sent from my TI-83

> On May 10, 2019, at 12:46, Erik Jones  wrote:
> 
>> On Fri, May 10, 2019 at 12:41 PM Tom Lane  wrote:
> 
>> Andres Freund  writes:
>> > On 2019-05-09 13:03:50 -0700, Erik Jones wrote:
>> >> The question then is: Why would these user queries be waiting on an
>> >> AccessShare lock on pg_attribute?
>> 
>> > Queries that access a table for the *first* time after DDL happened
>> > (including truncating the relation), need an AccessShareLock on
>> > pg_attribute (and pg_class, pg_index, ...) for a short time.
>> 
>> Also, it seems likely that what's really triggering the issue is
>> autovacuum on pg_attribute trying to truncate off empty pages
>> in pg_attribute (after a bunch of dead rows were generated there
>> by DDL activity).  That requires exclusive lock on pg_attribute,
>> which would propagate down to the standby.
>> 
>> regards, tom lane
> 
> Right, that part I understood after checking out pg_attribute's insert/delete 
> counts in pg_stat_sys_tables before and after some REFRESH MATERIALIZED VIEW 
> runs on an otherwise idle server.  With them running 2k+ refreshes per day 
> autovac is regularly working on their catalog tables.
> 
> Thanks!
> -- 
> Erik Jones
> mag...@gmail.com


Re: logical replication on 9.6: replica configuration

2019-05-12 Thread Jeremy Schneider
Because you’re on 9.6, you can’t use the native create subscription/publication 
sql commands. I’d check out pglogical from 2nd quadrant; this should be 
supported on RDS 9.6 iirc.

https://www.2ndquadrant.com/en/resources/pglogical/pglogical-docs/
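
Roughly, the pglogical setup looks like this - check the docs above for the 
exact parameters and the RDS-specific settings (names and DSNs here are just 
placeholders):

  -- on the 9.6 provider
  CREATE EXTENSION pglogical;
  SELECT pglogical.create_node(node_name := 'provider1',
                               dsn := 'host=primary dbname=app user=repl');
  SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public']);

  -- on the subscriber
  CREATE EXTENSION pglogical;
  SELECT pglogical.create_node(node_name := 'subscriber1',
                               dsn := 'host=replica dbname=app user=repl');
  SELECT pglogical.create_subscription(subscription_name := 'sub1',
                                       provider_dsn := 'host=primary dbname=app user=repl');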

That said, I’m a bit curious what security concern would disallow a physical 
hot standby and allow a logical replication setup - especially in a managed 
environment... are you able to share any more specifics about the reasoning 
here? Maybe the requirement is for the replica to be in a different account or 
something?

-Jeremy


Sent from my TI-83

> On May 12, 2019, at 20:51, Paul Costello  wrote:
> 
> I need to create a read-only replica in AWS environment.  I can't use AWS/RDS 
> built-in read replica functionaility due to security concerns, so I'm tied to 
> using logical replication, on postgres 9.6.  How do I configure a replica db 
> to receive the contents of the replication slot on master?  
> 
> I was able to get logical replication going with the following 2 commands:
> 
> SELECT pg_create_logical_replication_slot('test_slot', 'test_decoding');
> pg_recvlogical -d test --slot test_slot --start -f - &
> 
> I'm very unclear on where to go from there to stand up a read-replica db.
> 
> Thanks,
> Paul C.
> 


Re: Native Logical Replication Initial Import Qs

2023-06-07 Thread Jeremy Schneider
On 6/7/23 2:12 PM, Don Seiler wrote:
> On the logical replication front, the concern is with the initial data
> import that happens when the subscription is created (by default). I
> know that you can tell the subscription to not copy data and instead use
> pg_dump and a replication slot snapshot to achieve this manually.
> However I'm unable to explain (to myself) why this is better than just
> having the subscription do it upon creation. Given that I can create
> pub/sub sets for individual tables for parallel operations, I'm curious
> what advantages there are in using pg_dump to do this import.

FWIW, I think the place this feature shines the most is when you can
safely leverage things like storage-level snapshots. Sometimes that
means you can get a copy of a multi-TB database almost instantly if the
storage or filesystem does copy-on-write, for example database lab
(postgres.ai) which uses ZFS.

Another thing I can think of is that while it's true you can create
multiple pub/sub sets, I'm not sure you can reduce the number of sets
later. So if you were concerned about having too many slots doing
decoding on the source, then you might want the flexibility of pg_dump
(or perhaps restoring a backup) to get more parallelism while having
more control over how many slots will be used later.
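
For reference, the manual path ends with a subscription that skips the copy and
attaches to a pre-created slot, roughly like this (names hypothetical; the
fiddly part is taking the dump against the slot's exported snapshot so nothing
is missed):

  CREATE SUBSCRIPTION mysub
    CONNECTION 'host=source dbname=app user=repl'
    PUBLICATION mypub
    WITH (copy_data = false, create_slot = false, slot_name = 'mysub_slot');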

In your case, the whole setup is hopefully temporary, so maybe these
particular concerns aren't as relevant to you.

This is just what comes to mind... probably there's a few more things
I'm not thinking of and hopefully others will chime in.  :)

-Jeremy


-- 
http://about.me/jeremy_schneider





typical active table count?

2023-06-27 Thread Jeremy Schneider
Question for other PostgreSQL users

On your moderately busy DB, how many different tables might receive at
least one change/DML during a 10 second window?

10? 50? 100? More? Ballpark guess off the top of your head.

I'm in a discussion & there are questions about whether it's unusual to have
more than 10 or so. The answer isn't clear to me.

Probably worthwhile to call out partitioning explicitly (i.e. if this
factors into your answer then mention that fact).

-Jeremy


-- 
http://about.me/jeremy_schneider




Re: typical active table count?

2023-06-27 Thread Jeremy Schneider
On 6/27/23 9:32 AM, Ben Chobot wrote:
> We certainly have databases where far more than 100 tables are updated
> within a 10 second period. Is there a specific concern you have?
> 

Thanks Ben - not a concern, but I'm trying to better understand how common
this might be. And I think sharing general statistics about how people
use PostgreSQL is a great help to the developers who build and maintain it.

One really nice thing about PostgreSQL is that with two quick copies of
pg_stat_all_tables you can easily see this sort of info.
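
A minimal sketch of the two-snapshot approach for a 10 second window:

  CREATE TEMP TABLE snap AS
    SELECT relid, n_tup_ins + n_tup_upd + n_tup_del AS writes
    FROM pg_stat_all_tables;
  SELECT pg_sleep(10);
  SELECT count(*) AS tables_written
  FROM pg_stat_all_tables t
  JOIN snap s USING (relid)
  WHERE t.n_tup_ins + t.n_tup_upd + t.n_tup_del > s.writes;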

If you have a database where more than 100 tables are updated within a
10 second period - this seems really uncommon to me - I'm very curious
about the workload.

For example:

1) Is the overall total number of tables for this database in the
thousands, 10s of thousands or 100s of thousands?

2) How many CPUs or cores does the server have?

3) Are you using partitions and counting each one? What's the number if
you count each partitioned table as a single table?

4) Would you characterize this database as SaaS, ie. many copies of a
similar schema?  Or is it one very large schema of many different tables?

-Jeremy


-- 
http://about.me/jeremy_schneider





Re: typical active table count?

2023-06-27 Thread Jeremy Schneider
On 6/27/23 12:08 PM, Ron wrote:
> On 6/27/23 13:47, Jeremy Schneider wrote:
>> On 6/27/23 9:32 AM, Ben Chobot wrote:
>>> We certainly have databases where far more than 100 tables are updated
>>> within a 10 second period. Is there a specific concern you have?
>>>
>> Thank Ben, not a concern but I'm trying to better understand how common
>> this might be. And I think sharing general statistics about how people
>> use PostgreSQL is a great help to the developers who build and maintain it.
>>
>> One really nice thing about PostgreSQL is that two quick copies of
>> pg_stat_all_tables and you can easily see this sort of info.
>>
>> If you have a database where more than 100 tables are updated within a
>> 10 second period - this seems really uncommon to me - I'm very curious
>> about the workload.
> 
> 100 tables updates just means /possibly complicated schema/, not
> necessarily high volume.
> 
> ...
> 
> And honestly, 100 tables in 10 seconds is 10 tables/second.  If each
> gets one insert, that's a laughably slow transaction rate.  (Unless of
> course there's 85 indices per table, and foreign keys don't have
> supporting indices.)

I don't think the math actually works this way on highly concurrent
systems. In fact, this morning I connected with a coworker who works on
Amazon fulfillment center DBs and there was almost no difference in the
number of tables with insert/update/delete regardless of whether you
looked at a 10 second window or a 2 second window. I was also able to
chat with another coworker at Amazon who got numbers from a couple of
their PG databases, connected with one person on Slack at a different
company who passed along numbers, and got a few emails from Oracle folks.

The numbers reported back to me ranged from 29 to over a hundred.
Obviously there are also lots of small databases behind wordpress
websites with much less activity, but I found this to be an interesting
measure of some respectably busy systems.

The original context was a conversation related to logical replication
of DB changes.

But then I got interested in the general question and topic - and
someone on the Oracle side mentioned system tables which is a really
good point that hadn't occurred to me yet. The original conversation was
concerned with user tables and not system ones, but there would be a
fair amount of ongoing system table activity too.

Besides partitioning, another interesting dimension of the conversation
has been thinking about different categories of workloads. For example:
SaaS or multitenant applications with many copies of a similar schema,
ISVs, ERPs, or large enterprise databases with lots of development
history. All of these categories can easily ramp up the counts.

I'm still interested in more data - if anyone reading this can grab a
couple snapshots of pg_stat_all_tables and report back numbers for a 10
second window and a 2 second window, that would be amazing!

-Jeremy


-- 
http://about.me/jeremy_schneider





Re: Corruption or wrong results with 14.10?

2023-11-24 Thread Jeremy Schneider
On Thu, Nov 23, 2023 at 7:34 AM Torsten Förtsch 
wrote:

> On Thu, Nov 23, 2023 at 2:29 PM Daniel Westermann (DWE) <
> daniel.westerm...@dbi-services.com> wrote:
>
>> smrdbprod=# select crart_id, chemin, count(*) from smrr_mgr.formula group
>> by  crart_id, chemin having count(*) > 1;
>>   crart_id   | chemin | count
>> ------------+--------+-------
>>  3505463 | @  | 2
>>   473731 | @  | 2
>>  1063238 | @  | 2
>>  1468088 | @  | 2
>>   462723 | @  | 2
>>  1099378 | @  | 2
>> 
>>
>> I think I know what I have to do.
>>
>
> If you have moved the database from a system with glibc <2.28 to >=2.28,
> that could be the reason.
>
> https://wiki.postgresql.org/wiki/Collations
>


Just one note here... the reason can be _any_ operating system move or
update.  It can happen with ICU and it can happen with any version of glibc
(this is easily reproducible and we've seen it happen on production PG
deployments that didn't involve glibc 2.28)

glibc 2.28 has certainly been the most obvious and impactful case, so the
focus is understandable, but there's a bit of a myth that the problem is
only with glibc 2.28 (and not ICU or other glibc versions or data
structures other than indexes)

The only truly safe way to update an operating system under PostgreSQL is
with logical dump/load or logical replication, or continuing to compile and
use the identical older version of ICU from the old OS (if you use ICU).  I
think the ICU folks are generally careful enough that it'll be unlikely for
compiler changes and new compiler optimizations to inadvertently change
collation on newer operating systems and build toolchains.

Ironically I just did a detailed talk on this topic at PASS Data Summit
last week, but unfortunately the recording probably won't be released for
awhile.  🙂

-Jeremy


-- 
http://about.me/jeremy_schneider


Re: Sample data generator for performance testing

2024-01-03 Thread Jeremy Schneider
On 1/2/24 11:23 PM, arun chirappurath wrote:
> Do we have any open source tools which can be used to create sample data
> at scale from our postgres databases?
> Which considers data distribution and randomness

I would suggest using the most common tools whenever possible, because
then if you want to discuss results with other people (for example on
these mailing lists) you're working with data sets that are widely
and well understood.

The most common tool for PostgreSQL is pgbench, which uses a TPC-B-like
schema that you can scale to any size - always the same [small] number of
tables/columns and the same uniform data distribution - and there are
relationships between tables so you can create FKs if needed.

My second favorite tool is sysbench: any number of tables, easily scaled
to any size, a standardized schema with a small number of columns and no
relationships/FKs.  Data distribution is uniformly random; however, on the
query side it supports a bunch of different distribution models, not
just uniform random, as well as queries processing ranges of rows.

The other tool that I'm intrigued by these days is benchbase from CMU.
It can do TPC-C and a bunch of other schemas/workloads, and you can scale
the data sizes. If you're just looking at data generation and you're going
to make your own workloads, well, benchbase has a lot of different
schemas available out of the box.

You can always hand-roll your schema and data with scripts & SQL, but
the more complex and bespoke your performance test schema is, the more
work & explaining it takes to get lots of people to engage in a
discussion since they need to take time to understand how the test is
engineered. For very narrowly targeted reproductions this is usually the
right approach with a very simple schema and workload, but not commonly
for general performance testing.

-Jeremy


-- 
http://about.me/jeremy_schneider





Re: Migrating From CentOS-6 Oracle 11g R2 To CentOS-7 PostgreSQL ??

2017-12-06 Thread Jeremy Schneider
On 12/5/17 12:07, Kevin Brannen wrote:
> 1. What version of PostgreSQL should I use?
> 
> The latest that works for you. Some people don’t like .0 releases
> for various reasons and if that applies to you, then use the latest
> of 9.6.x (9.6.5 I think); else use 10.0.

In addition, 10.1 is available as of Nov 9th


> 2. I don't want to use a RPM because I like controlling where
> software is installed, so where can I locate the required tar ball?

I wonder if the PGDG rpms are relocatable packages? I haven't tried that
yet.


> You can download the source and build it yourself; there’s an arg
> for where you want to install it. The instructions are pretty easy
> to follow if you’ve ever built anything.

Agreed, you can run the build easily enough, although personally
I'd prefer a package over doing my own builds. Maybe it's not a priority
for you, but if you're building stuff yourself and you really want to do
it right, then you should be tracking all the right configure options and
making sure you can always perfectly reproduce builds. You should probably
also be creating and archiving your own packages... but it depends on your
requirements.  :)


> 5. What additional information might be helpful?

Just last night we had a great talk at the Seattle PostgreSQL User Group
about moving from MSSQL to PG and many considerations apply equally. In
particular, you can find lots of code to make PostgreSQL act more like
your source DB... but you want to carefully consider whether this is the
right approach in your situation.  (There be flame-wars here.)

And speaking of PUGs, you could check for a local one of those and get
connected up to ask future questions face-to-face too.  :)

-J

-- 
http://about.me/jeremy_schneider



Re: Is it possible to sort strings in EBCDIC order in PostgreSQL server?

2018-01-23 Thread Jeremy Schneider
On 12/12/17 10:21, Peter Geoghegan wrote:
> ICU supports creating custom collations that reorder upper and lower
> case, or digits with scripts (e.g. Latin alphabet characters). See the
> documentation -- "23.2.2.3.2. ICU collations". Advanced customization
> is possible.

I just gave this a try and I'm not sure I completely understand what
it's doing, but it definitely doesn't look like it's sorting according to
EBCDIC byte order. (This was on centos 7 with libicu-50.1.2.) I might
have created the collation incorrectly. I'd love to hear if anyone else
finds a way to get this working outside of custom comparison operators.

pops-10.1 root@db1=# create collation "ebcdic" (provider=icu,
locale='cp037');
CREATE COLLATION

pops-10.1 root@db1=# create table test (data text);
CREATE TABLE

pops-10.1 root@db1=# insert into test
values('a'),('b'),('A'),('B'),('1'),('2'),('!'),('^');
INSERT 0 8

pops-10.1 root@db1=# select * from test order by data collate "ebcdic";
 data
--
 !
 ^
 1
 2
 a
 A
 b
 B
(8 rows)


-- 
http://about.me/jeremy_schneider



Re: Disabling vacuum truncate for autovacuum

2024-12-26 Thread Jeremy Schneider
On Mon, 16 Dec 2024 16:25:06 -0800
Will Storey  wrote:

> I would like to disable vacuum's truncate behaviour for autovacuum.
> Previously I had an outage due to its access exclusive lock when it
> was replicated to a hot standby.
> 
> When that outage happened it was from a VACUUM call in a cronjob
> rather than autovacuum. I now run such VACUUMs with TRUNCATE false
> which avoids the issue for these. However I've realized that
> autovacuum could cause this as well. This is of interest to me
> because I'm looking at tuning autovacuum and getting rid of the
> cronjob, but I've realized relying on autovacuum could be dangerous
> because of the truncates.

Can you tell us a little bit more about the outage? Autovacuum is
designed to quickly relinquish this lock if there is any contention, and
the dangers of disabling autovacuum are significant, so your statement
about autovac being "dangerous" will raise a lot of eyebrows.

Did your outage involve hot standbys serving read-only traffic, or did
it only involve a read-write database?

What was the exact nature of the outage and how did you narrow down the
cause to the exclusive lock held specifically during an autovacuum
truncation?

-Jeremy




Re: Disabling vacuum truncate for autovacuum

2024-12-26 Thread Jeremy Schneider
On Thu, 26 Dec 2024 13:24:03 -0800
Will Storey  wrote:

> My incident was actually not caused by autovacuum. A VACUUM was run
> against the primary by a cronjob. A web service running read queries
> against hot standbys went down for several minutes as its queries
> were stuck in a lock queue.
> 
> ...
> 
> As I recall, I confirmed the cause via query logs. I noticed the
> table was vacuumed at the time, which lead me to learning about the
> page truncation behaviour. It has been a couple years though.

Ah - thanks - this is very helpful. I have also seen issues
specifically with hot standbys, which continue holding the exclusive
lock even when the primary read-write instance releases the lock.

A better solution in my opinion would be to enhance the WAL replay
process so that it can somehow temporarily relinquish the exclusive lock
under contention, similar to what the primary read-write instance is
able to do.

This is not an easy enhancement to make. Maybe we'd need the primary to
put more information into the WAL than it does today. Maybe we'd need
to leverage hot_standby_feedback to enable standbys to signal a primary
to release the lock.

Anyway thanks for the report - we need people reporting these issues on
the lists so that there's a little visibility into the impact.

Personally I'm still hesitant about the idea of globally disabling
vacuum truncation. That was never the goal of the
old_snapshot_threshold feature, interesting that you were able to
capitalize on the side-effect. Personally I'd still favor disabling it
only on the tables that are both frequently vacuumed and also
frequently queried on hot standbys.

In a pinch, you could disable it for all tables with a bit of dynamic
SQL and ensuring that new tables created in the future include the
syntax to disable it too.
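
A rough psql sketch of that dynamic SQL, hitting user tables only (adjust
the filter to taste):

  SELECT format('ALTER TABLE %I.%I SET (vacuum_truncate = off)',
                schemaname, tablename)
  FROM pg_tables
  WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
  \gexec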

-Jeremy





Re: glibc 2.35-2.39 upgrade requirements

2025-01-18 Thread Jeremy Schneider
On Fri, 17 Jan 2025 10:27:04 -0500
Tom Lane  wrote:

> Kamen Kalchev  writes:
> > Hi everyone, we're planning to upgrade the OS running Postgres from
> > ubuntu jammy to ubuntu noble. As part of the OS change, the glibc
> > version will be changed from glibc 2.35 to glibc 2.39..
> > Can someone confirm if changing the glibc between those versions
> > will require a full reindex of the Postgres cluster?  
> 
> Maybe, maybe not.  According to [1], the last glibc collation change
> that the PG community really noticed was in glibc 2.28.  So maybe
> there weren't any significant changes between 2.35 and 2.39.  The
> conservative path would certainly be to reindex all textual columns
> (though you can skip any that have collation "C").

I haven't run 2.39 through the scan yet [1]; I should do that because
someone was asking the same question on postgres slack. But note that
every single ubuntu LTS and every single RHEL major release in the last
10 years has had collation changes, except for ubuntu 14.04 ... so it's
worth being cautious. Collations are a bit like time zones - small
changes are always happening, but you might not always notice.

Jeff Davis and I did a talk at the last pgconf about this, the
recording is online [2].

Personally I would recommend using the builtin C collation as database
default starting in pg17, and using ICU to do linguistic collation at
the table or query level when needed. With ICU there's at least the
option to rebuild old versions on new operating system majors, if
needed. (Though rebuilding objects - not just indexes, but anything
depending on the collation - is the best course.)
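
From memory (double-check the syntax in the docs), that pg17 setup looks
roughly like this - a builtin "C" default for the database, plus an ICU
collation for the places that need linguistic ordering; the database,
collation, and table names are just examples:

  CREATE DATABASE app TEMPLATE template0
    LOCALE_PROVIDER builtin BUILTIN_LOCALE 'C';
  CREATE COLLATION german (provider = icu, locale = 'de-DE');
  SELECT name FROM customers ORDER BY name COLLATE german;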

And be careful about hot standbys, FDWs, and other places where you can
get little surprises with different OS majors. The YouTube recording
has lots of info.

-Jeremy


1: https://github.com/ardentperf/glibc-unicode-sorting
2: https://www.youtube.com/watch?v=KTA6oau7tl8







Re: verify checksums online

2025-04-18 Thread Jeremy Schneider
On Fri, 18 Apr 2025 17:32:19 -0400
Ron Johnson  wrote:

> On Fri, Apr 18, 2025 at 5:18 PM Jeremy Schneider
>  wrote:
> 
> > i think there had been some mailing list discussions years ago? the
> > pg_checksum utility in core still can't do an online check right?
> >
> 
> https://www.postgresql.org/docs/17/app-pgchecksums.html
> 
> "The server must be shut down cleanly before running pg_checksums."

Yes, I read the docs. This utility has evolved over time, including a
complete rename, and I was just double-checking here that we didn't miss
a doc update in case this sentence applied to enable/disable but not to
verify. As I said, I'm assuming it's all accurate; I'm just doing the
paranoid double-check.

-Jeremy




verify checksums online

2025-04-18 Thread Jeremy Schneider
i should know the answer to this... but asking anyway

i think there had been some mailing list discussions years ago? the
pg_checksum utility in core still can't do an online check right?

looking around, i think that michael's pg_checksums_ext still seems to
be available in debian repos, and it advertises online checks, which
makes me think that the core pg_checksum might still not have merged
any code for online checking yet

but i wanted to confirm with someone - my easiest options for online
verifying checksums would be to trigger pg_basebackup then check
pg_stat_database, or to install michael's utility?
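
If I'm remembering right, checksum failures found during a base backup also
show up in the per-database counters, so the pg_basebackup route would end
with something like:

  SELECT datname, checksum_failures, checksum_last_failure
  FROM pg_stat_database;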

-Jeremy




the postgr.es/m/MESSAGE_ID URL format

2025-02-24 Thread Jeremy Schneider
Hello,

Is there a page somewhere on the postgres wiki with a simple high-level
explanation of what the https://postgr.es/m/MESSAGE_ID URL format is
and how to construct one of these URLs?

For example, can this be used for all postgres mailing lists or only
certain lists (like hackers)?

It's probably a two-sentence thing, but with some basic searching I
couldn't find a wiki page with those two sentences.

It is referenced here for commit messages
https://wiki.postgresql.org/wiki/Commit_Message_Guidance

I also find it useful for mailing list messages

-Jeremy




Re: COLLATION update in 13.1

2025-02-24 Thread Jeremy Schneider
On Mon, 24 Feb 2025 11:08:43 +0100
Matthias Apitz  wrote:

> 
> What is the procedure on 13.1 to bring the external (glibc) version
> in sync with. the used version in the PostgreSQL database?

If I recall correctly, between versions 10 and 14 you need to use ALTER
COLLATION name REFRESH VERSION on every collation where there's a
version mismatch. (This can happen with ICU collations since version 10,
and with glibc collations since version 13.)
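
A rough way to spot the mismatches (the collation name in the ALTER is just
an example):

  SELECT collname, collversion,
         pg_collation_actual_version(oid) AS os_version
  FROM pg_collation
  WHERE collprovider <> 'd'
    AND collversion IS DISTINCT FROM pg_collation_actual_version(oid);

  ALTER COLLATION "de_DE" REFRESH VERSION;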

I only know of a small number of changes between glibc versions 2.31 and
2.38 - I suspect you already know to look out for glibc 2.28, which was
the crazy one (SLE15 Service Pack 3). Most databases would be
corrupted by that update, and running "refresh version" would simply
stop the warnings without fixing the corruption.

-Jeremy