RE: PostgreSQL 10.20 crashes / Antivirus

2022-04-20 Thread Thomas, Richard
Adrian Klaver wrote:
> What are the actual commands you are using to do the above?

The command used in a PowerShell script (run with Windows task scheduler) to 
dump each database should evaluate to:

"C:\Program Files\PostgreSQL\10\bin\pg_dump.exe" -b -v -F c -d $dbName -h 
localhost -p 6488 -U backup_su -f $backupFile 2`>`&1 | Out-File $pgdumpLogFile

where:
- "backup_su" is a superuser role (with password stored in the user's 
pgpass.conf file)
- $backupFile and $pgdumpLogFile are in folders excluded from McAfee scanning
- pg_dump.exe executable is not excluded from McAfee on-access scanning 
(although as recommended postgres.exe is)

The actual script segment building the command is:

$pgdumpCmd = "C:\Program Files\PostgreSQL\10\bin\pg_dump.exe"
$pgdumpArgs = @("-b", "-v",
"-F", "c",
"-d", $dbName,
"-h", "localhost",
"-p", "6488",
"-U", " backup_su",
"-f", $backupFile)
cmd /c $pgdumpCmd $pgdumpArgs 2`>`&1 | Out-File $pgdumpLogFile

Note that the .backup file for the first failing pg_dump (after several 
successful ones) is now being produced but is of length 0, with the associated 
pg_dump log file simply reading:

pg_dump: [archiver (db)] connection to database "dbexample" failed: could not 
connect to server: Connection refused (0x274D/10061)
Is the server running on host "localhost" (127.0.0.1) and accepting
TCP/IP connections on port 6488?
could not connect to server: Connection refused (0x274D/10061)
Is the server running on host "localhost" (::1) and accepting
TCP/IP connections on port 6488?

Richard

-Original Message-
From: Adrian Klaver 
Sent: 19 April 2022 18:11
To: Thomas, Richard ; 
pgsql-general@lists.postgresql.org
Subject: Re: PostgreSQL 10.20 crashes / Antivirus

On 4/19/22 04:59, Thomas, Richard wrote:
> A request for further suggestions for how to fix/diagnose this PG crash 
> (having just got back from holiday I discover that PG is still crashing even 
> though I have changed the pgdump target output location to a folder excluded 
> from the McAfee Antivirus scanning - but thanks for the suggestion Adrian). 
> The last few crashes have occurred only during the backup process, when 
> database pg_dump completes (successfully - including the disconnection 
> message in the log) but before the next database pgdump can begin (that 
> pg_dump command gets the error that it cannot connect to the database as the 
> server is no longer responding). This will happen after dumping several 
> databases - sometimes it manages the complete multiple database dump fine. 
> One new thing I've noticed is that when it fails the file access 0xC022 
> error occurs a second time when PG is trying to restart itself after shutting 
> down all the other processes, suggesting whatever caused the crash is still 
> around (a later manual restart works fine):

 From your original post:

"... although the new pg_dump log file was created (is empty) it does not get 
as far as creating the .backup file ..."

What are the actual commands you are using to do the above?
At Atkins - member of the SNC-Lavalin Group, we work flexible hours around the 
world. Although I have sent this email at a time convenient for me, I don't 
expect you to respond until it works for you.
NOTICE – This email message and any attachments may contain information or 
material that is confidential, privileged, and/or subject to copyright or other 
rights. Any unauthorized viewing, disclosure, retransmission, dissemination, or 
other use of or reliance on this message or anything contained therein is 
strictly prohibited and may be unlawful. If you believe you may have received 
this message in error, kindly inform the sender by return email and delete this 
message from your system. Thank you.


LwLocks contention

2022-04-20 Thread Michael Lewis
We are occasionally seeing heavy CPU contention with hundreds of processes
active but waiting on a lightweight lock - usually lock manager or buffer
mapping, it seems. This is happening on VMs configured with about 64 CPUs and
350GB RAM; while we would typically only have 30-100 concurrent processes,
there will suddenly be ~300, and many show active with LwLock waits and take
much longer than usual. Any suggested options to monitor for such issues, or
logging to set up, so the next issue can be debugged properly?

It has seemed to me that this occurs when there are more than the usual
number of a particular process type and also something that is a bit heavy
in usage of memory/disk. It has happened on various tenant instances and
different application processes as well.

How might the use of huge pages (or transparent huge pages, or turning them
off) play into this scenario?
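Editor's note, not part of the original message: one low-overhead way to see
what those ~300 backends are waiting on is to sample pg_stat_activity, whose
wait_event columns exist in PostgreSQL 9.6 and later. A sketch:

```sql
-- Snapshot of current waits; run repeatedly while the stall is happening
-- to see whether LWLock:lock_manager or LWLock:buffer_mapping dominates.
SELECT wait_event_type, wait_event, count(*) AS backends
FROM pg_stat_activity
WHERE state = 'active'
GROUP BY wait_event_type, wait_event
ORDER BY backends DESC;
```

Logging each sample to a table (or via cron/psql) gives a timeline to
correlate with the sudden jump in concurrency.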


PITR and Temp Tables

2022-04-20 Thread Huan Ruan
Hi All

Let's say at T0 a database has N session based temp tables. They would have
corresponding records in the catalog tables like pg_class and pg_attribute
that are visible to other sessions.

At T1, I do a PITR to T0. That recovered database should not have those
temp tables because the sessions they were created in are not present. My
question is what events trigger the deletion of those temp tables' catalog
records (e.g. pg_class and pg_attribute etc.) in the recovered database?

Thanks
Huan


RE: PITR and Temp Tables

2022-04-20 Thread Patrick FICHE
From: Huan Ruan 
Sent: Wednesday, April 20, 2022 2:18 PM
To: pgsql-general@lists.postgresql.org
Subject: PITR and Temp Tables

Hi All

Let's say at T0 a database has N session based temp tables. They would have 
corresponding records in the catalog tables like pg_class and pg_attribute that 
are visible to other sessions.

At T1, I do a PITR to T0. That recovered database should not have those temp 
tables because the sessions they were created in are not present. My question 
is what events trigger the deletion of those temp tables' catalog records (e.g. 
pg_class and pg_attribute etc.) in the recovered database?

Thanks
Huan

Hi,

My guess is that temp table entries will still be in your catalog until you do 
a VACUUM FULL of the pg_class / pg_attribute tables.
But you should not care about these entries if these tables are vacuumed at 
regular intervals.

Regards,
Patrick


Re: PITR and Temp Tables

2022-04-20 Thread Huan Ruan
Hi Patrick

Thanks for your reply.

>
>
> My guess is that temp table entries will still be in your catalog until
> you do a VACUUM FULL of the pg_class / pg_attribute tables.
>
> But you should not care about these entries if these tables are vacuumed
> at regular intervals.
>
What I observed in one instance seems* to be like this - by the time I
connected to the recovered database (as soon as I could), an autovacuum
had run on those catalog tables and they no longer had those temp table
entries. Normally, autovacuum removes dead tuples, but those temp table
records were live tuples at T0. So if it was autovacuum that removed them in
the recovered database, were they live or dead tuples? If they were dead,
what did the deletion that made them dead?

*I would like to confirm to be 100% sure but was wondering if I can get an
explanation here.

Regards
Huan


Re: PITR and Temp Tables

2022-04-20 Thread Tom Lane
Huan Ruan  writes:
> Let's say at T0 a database has N session based temp tables. They would have
> corresponding records in the catalog tables like pg_class and pg_attribute
> that are visible to other sessions.

> At T1, I do a PITR to T0. That recovered database should not have those
> temp tables because the sessions they were created in are not present. My
> question is what events trigger the deletion of those temp tables' catalog
> records (e.g. pg_class and pg_attribute etc.) in the recovered database?

Those records will still be there in the catalogs, yes.

Cleaning out the contents of a temporary schema is not the responsibility
of the WAL/recovery system.  It's done by live backends at two times:

1. A session that has used a temp schema will normally clean out the
contained objects when it exits.

2. As a backstop in case #1 fails, a session that is about to begin using
a temp schema will clean out any surviving contents.

So if you rewound to a point where some temp objects exist, it'd be the
responsibility of the first session that wants to use a given temp schema
to clean out those objects.
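[Editor's sketch, not from Tom's message: after a PITR you can check for such
surviving entries by looking for relations whose relpersistence marks them
temporary.]

```sql
-- Temp tables left behind by sessions that no longer exist will still
-- show up here until a new session reuses (and cleans out) the owning
-- pg_temp_N schema.
SELECT n.nspname AS temp_schema, c.relname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relpersistence = 't';
```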

regards, tom lane




Re: PITR and Temp Tables

2022-04-20 Thread Greg Stark
There actually is a third backstop if no other session ever connects to
that temp schema and cleans them out.

Eventually autovacuum notices that they would need a vacuum "to prevent
wraparound". It can't actually do the vacuum on temp tables, but if there's
no session attached to the temp schema it drops them.

This threshold normally takes quite a long time to reach, so if you routinely
have sessions using temp schemas it's unlikely to happen. But if you only use
temp schemas manually, then eventually it would.

On Wed., Apr. 20, 2022, 09:37 Tom Lane,  wrote:

> Huan Ruan  writes:
> > Let's say at T0 a database has N session based temp tables. They would
> have
> > corresponding records in the catalog tables like pg_class and
> pg_attribute
> > that are visible to other sessions.
>
> > At T1, I do a PITR to T0. That recovered database should not have those
> > temp tables because the sessions they were created in are not present. My
> > question is what events trigger the deletion of those temp tables'
> catalog
> > records (e.g. pg_class and pg_attribute etc.) in the recovered database?
>
> Those records will still be there in the catalogs, yes.
>
> Cleaning out the contents of a temporary schema is not the responsibility
> of the WAL/recovery system.  It's done by live backends at two times:
>
> 1. A session that has used a temp schema will normally clean out the
> contained objects when it exits.
>
> 2. As a backstop in case #1 fails, a session that is about to begin using
> a temp schema will clean out any surviving contents.
>
> So if you rewound to a point where some temp objects exist, it'd be the
> responsibility of the first session that wants to use a given temp schema
> to clean out those objects.
>
> regards, tom lane
>
>
>


Re: PostgreSQL 10.20 crashes / Antivirus

2022-04-20 Thread Adrian Klaver

On 4/20/22 01:06, Thomas, Richard wrote:

Adrian Klaver wrote:

What are the actual commands you are using to do the above?


The command used in a PowerShell script (run with Windows task scheduler) to 
dump each database should evaluate to:

"C:\Program Files\PostgreSQL\10\bin\pg_dump.exe" -b -v -F c -d $dbName -h localhost -p 
6488 -U backup_su -f $backupFile 2`>`&1 | Out-File $pgdumpLogFile


FYI, -b is the default unless you are restricting the dump to a schema 
or a table .


Do you have large objects(blobs) in the database?



where:
- "backup_su" is a superuser role (with password stored in the user's 
pgpass.conf file)
- $backupFile and $pgdumpLogFile are in folders excluded from McAfee scanning
- pg_dump.exe executable is not excluded from McAfee on-access scanning 
(although as recommended postgres.exe is)


Why not?

I would think the whole C:\Program Files\PostgreSQL\10\bin\ would be 
excluded.




The actual script segment building the command is:

 $pgdumpCmd = "C:\Program Files\PostgreSQL\10\bin\pg_dump.exe"
 $pgdumpArgs = @("-b", "-v",
 "-F", "c",
 "-d", $dbName,
 "-h", "localhost",
 "-p", "6488",
 "-U", " backup_su",
 "-f", $backupFile)
 cmd /c $pgdumpCmd $pgdumpArgs 2`>`&1 | Out-File $pgdumpLogFile

Note that the .backup file for the first failing pg_dump (after several 
successful ones) is now being produced but is of length 0, with the associated 
pg_dump log file simply reading:


What does the Windows event log show?

Same for the A/V software log.



pg_dump: [archiver (db)] connection to database "dbexample" failed: could not 
connect to server: Connection refused (0x274D/10061)
 Is the server running on host "localhost" (127.0.0.1) and accepting
 TCP/IP connections on port 6488?
could not connect to server: Connection refused (0x274D/10061)
 Is the server running on host "localhost" (::1) and accepting
 TCP/IP connections on port 6488?

Richard





--
Adrian Klaver
adrian.kla...@aklaver.com




Can anyone confirm the flaw of postgres and how to deal with it?

2022-04-20 Thread Shaozhong SHI
I loaded several tables onto Postgres.

When you view, you can see all columns.

However, there are 'ghost columns' that I remember I used before, but not
now.

 select column_name::text from information_schema.columns where
table_name=a_table

keeps listing columns that I can not see in the current table.

Why does this happen?

What is the solution?

Regards,

David


Re: Can anyone confirm the flaw of postgres and how to deal with it?

2022-04-20 Thread David G. Johnston
On Wed, Apr 20, 2022 at 8:48 AM Shaozhong SHI 
wrote:

> I loaded several tables onto Postgres.
>
> When you view, you can see all columns.
>
> However, there are 'ghost columns' that I remember I used before, but not
> now.
>
>  select column_name::text from information_schema.columns where
> table_name=a_table
>
> keeps listing columns that I can not see in the current table.
>
> Why does this happen?
>
> What is the solution?
>
>
I'm going to assume some kind of operator error until you can prove the
observation with actual queries and output.  Hopefully putting that
information together will cause you to realize where you are wrong.  If not
we at least get something that is debuggable.

David J.


Re: Can anyone confirm the flaw of postgres and how to deal with it?

2022-04-20 Thread Adrian Klaver

On 4/20/22 08:47, Shaozhong SHI wrote:

I loaded several tables onto Postgres.

When you view, you can see all columns.

However, there are 'ghost columns' that I remember I used before, but 
not now.


  select column_name::text from information_schema.columns where 
table_name=a_table


Best guess is that since you are not filtering on table_schema you are 
seeing columns for tables with table_name=a_table across all schemas.
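[Editor's sketch: a quick way to test that guess is to include table_schema
in the output; note also that the string literal must be quoted.]

```sql
-- If multiple schemas appear, the "ghost" columns belong to a table of
-- the same name in another schema.
SELECT table_schema, column_name
FROM information_schema.columns
WHERE table_name = 'a_table'
ORDER BY table_schema, ordinal_position;
```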




keeps listing columns that I can not see in the current table.

Why does this happen?

What is the solution?

Regards,

David



--
Adrian Klaver
adrian.kla...@aklaver.com




RE: PostgreSQL 10.20 crashes / Antivirus

2022-04-20 Thread Thomas, Richard
Adrian Klaver wrote:
> On 4/20/22 01:06, Thomas, Richard wrote:
> > The command used in a PowerShell script (run with Windows task scheduler)
> to dump each database should evaluate to:
> >
> > "C:\Program Files\PostgreSQL\10\bin\pg_dump.exe" -b -v -F c -d $dbName
> > -h localhost -p 6488 -U backup_su -f $backupFile 2`>`&1 | Out-File
> > $pgdumpLogFile
>
> Do you have large objects(blobs) in the database?

No, but do have PostGIS geometries in almost all tables.

> > - pg_dump.exe executable is not excluded from McAfee on-access
> > scanning (although as recommended postgres.exe is)
>
> Why not?
>
> I would think the whole C:\Program Files\PostgreSQL\10\bin\ would be
> excluded.

I was following the instructions here:
https://wiki.postgresql.org/wiki/Running_%26_Installing_PostgreSQL_On_Native_Windows#Antivirus_software
I am not particularly familiar with how our McAfee is configured or operates 
(all under control of our IT, including it seems access to many of the logs). 
With the executable postgres.exe, they have specified that in McAfee as an 
executable not to include in "on-access scanning" (wherever the executable 
lives). This differs from the file read/write scanning where the folders to 
exclude are specified. I have put in a request earlier today to add exclusions 
for pg_dump.exe; maybe I'll hear back from IT in a day or so ;-(

> What does the Windows event log show?

Only events:
- Info: "The Windows Error Reporting Service service entered the running state" 
(goes into a stopped state 2 minutes later)
- (Error event messages generated by my PowerShell script)

Not sure if WER is actually writing data somewhere (none of the 
"*\AppData\Local\Microsoft\Windows\WER\ReportArchive" folders on the server 
have any data from the last 2 years)

> Same for the A/V software log.

I can't find much separate McAfee log information beyond the last hour in the 
"Endpoint Security" app. Previously (before I got the PostgreSQL backup 
PowerShell script removed from scanning), McAfee used to put messages in the 
Windows Event Viewer reporting that it was violating one of its rules, but it 
would allow the operation to continue (we no longer get that).

Richard


Re: Unexpected result from ALTER FUNCTION— looks like a bug

2022-04-20 Thread Bryn Llewellyn
> t...@sss.pgh.pa.us wrote:
> 
>> david.g.johns...@gmail.com wrote:
>> 
>> Might I suggest the following...
> 
> Actually, the reason proconfig is handled differently is that it's a 
> variable-length field, so it can't be represented in the C struct that we 
> overlay onto the catalog tuple...

Thanks to all who responded. Tom also wrote this, earlier:

> In any case, Bryn's right, the combination of a SET clause and a PARALLEL 
> clause is implemented incorrectly in AlterFunction.

I'm taking what I've read in the responses to mean that the testcase I showed 
is considered to be evidence of a bug (i.e. there are no semantic restrictions) 
and that fix(es) are under consideration.

I agree that, as long as you know about the bug, it's trivial to achieve your 
intended effect using two successive "alter function" statements (underlining 
the fact that there are indeed no semantic restrictions). I hardly have to say 
that the point is the risk that you silently don't get what you ask for—and 
might then need a lot of effort (like I had to spend) to work out why.
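[Editor's note: for the archive, the two-statement workaround looks like the
sketch below; the function signature and settings are illustrative, not taken
from the thread.]

```sql
-- Affected form: SET and PARALLEL combined in one statement
-- (the SET clause could be silently lost before the fix):
-- ALTER FUNCTION f(int) SET timezone = 'UTC' PARALLEL SAFE;

-- Workaround: issue the two changes separately.
ALTER FUNCTION f(int) SET timezone = 'UTC';
ALTER FUNCTION f(int) PARALLEL SAFE;
```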



Re: PostgreSQL 10.20 crashes / Antivirus

2022-04-20 Thread Adrian Klaver

On 4/20/22 10:23 AM, Thomas, Richard wrote:

Adrian Klaver wrote:

On 4/20/22 01:06, Thomas, Richard wrote:



- pg_dump.exe executable is not excluded from McAfee on-access
scanning (although as recommended postgres.exe is)


Why not?

I would think the whole C:\Program Files\PostgreSQL\10\bin\ would be
excluded.


I was following the instructions here:




I am not particularly familiar with how our McAfee is configured or operates (all under 
control of our IT, including it seems access to many of the logs). With the executable 
postgres.exe, they have specified that in McAfee as an executable not to include in 
"on-access scanning" (wherever the executable lives). This differs from the 
file read/write scanning where the folders to exclude are specified. I have put in a 
request earlier today to add exclusions for pg_dump.exe; maybe I'll hear back from IT in 
a day or so ;-(


That link also has:

"Specific issues have also been reported with McAfee and Panda 
anti-virus software and NetLimiter network monitoring software. While 
some people do have PostgreSQL working with these software packages, 
there is no specific or even recommend solutions that have not worked in 
some cases, so the issues would appear to be installation specific, 
sometimes even requiring uninstallation. "


Now, the last edit to that page was in 2013, so I'm not sure how relevant the 
above is anymore. Also, I don't use Windows anymore (other than occasionally 
working on people's machines), so I am not the best person to comment on 
current reality. Still, the Postgres process seems to be killed by an 
external program, and my suspicion is that the AV software is involved. This 
looks like something the sysadmins are going to have to look into via the 
tools and files they have access to.





Richard




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Unexpected result from ALTER FUNCTION— looks like a bug

2022-04-20 Thread David G. Johnston
On Wed, Apr 20, 2022 at 10:45 AM Bryn Llewellyn  wrote:

> > t...@sss.pgh.pa.us wrote:
> >
>
> > In any case, Bryn's right, the combination of a SET clause and a
> PARALLEL clause is implemented incorrectly in AlterFunction.
>
> I'm taking what I've read in the responses to mean that the testcase I
> showed is considered to be evidence of a bug (i.e. there are no semantic
> restrictions) and that fix(es) are under consideration.


The test case was good.  I made an uninformed assumption that proved to be
untrue.

The patch was written and applied yesterday, per Tom's "Yeah, I arrived at
the same fix" email.

https://github.com/postgres/postgres/commit/344a225cb9d42f20df063e4d0e0d4559c5de7910

(I haven't figured out what the official way to reference a commit is, I
use the GitHub clone for research so there ya go).

David J.


Re: PostgreSQL 10.20 crashes / Antivirus

2022-04-20 Thread Adrian Klaver

On 4/20/22 10:23 AM, Thomas, Richard wrote:

Adrian Klaver wrote:

On 4/20/22 01:06, Thomas, Richard wrote:

The command used in a PowerShell script (run with Windows task scheduler)

to dump each database should evaluate to:


"C:\Program Files\PostgreSQL\10\bin\pg_dump.exe" -b -v -F c -d $dbName
-h localhost -p 6488 -U backup_su -f $backupFile 2`>`&1 | Out-File
$pgdumpLogFile




Should have asked earlier:

Have you tried doing the pg_dumps as independent actions manually for 
each database instead of iterating over a list of databases in a script?
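[Editor's sketch in the style of Richard's script: running each dump as an
independent invocation and checking the exit code makes the first failure
visible immediately. Database names and paths below are invented.]

```powershell
# Hypothetical per-database loop; stop at the first failed dump so the
# server-crash timing is obvious in the logs.
$pgdump = "C:\Program Files\PostgreSQL\10\bin\pg_dump.exe"
foreach ($db in @("db1", "db2", "db3")) {
    & $pgdump -v -F c -d $db -h localhost -p 6488 -U backup_su `
        -f "D:\backups\$db.backup" 2>&1 | Out-File "D:\backups\$db.log"
    if ($LASTEXITCODE -ne 0) {
        Write-Error "pg_dump of $db failed with exit code $LASTEXITCODE"
        break
    }
}
```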



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Unexpected result from ALTER FUNCTION— looks like a bug

2022-04-20 Thread David G. Johnston
On Wed, Apr 20, 2022 at 10:54 AM David G. Johnston <
david.g.johns...@gmail.com> wrote:

>
>
> https://github.com/postgres/postgres/commit/344a225cb9d42f20df063e4d0e0d4559c5de7910
>
> (I haven't figured out what the official way to reference a commit is, I
> use the GitHub clone for research so there ya go).
>
>
Nevermind...not a huge fan of gitweb yet but I do have the commit from the
message sent to -committers.

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=9130f8cbb91954f7a40de70c014c01b552df31da

David J.


Re: Error dydl : image not found when trying to install pg on Catalina

2022-04-20 Thread Peter Eisentraut

On 17.04.22 13:28, cecile rougnaux wrote:

dyld: Library not loaded: /usr/local/opt/icu4c/lib/libicui18n.70.dylib


Whenever libicu's major version changes, you need to rebuild the 
postgresql package.





Are stored procedures/triggers common in your industry

2022-04-20 Thread Guyren Howe
I’ve really only ever worked in web development. 90+% of web developers regard 
doing anything at all clever in the database with suspicion.

I’m considering working on a book about implementing business logic in 
Postgres, and I’m curious about how common that actually is.


Re: Are stored procedures/triggers common in your industry

2022-04-20 Thread Rob Sargent

On 4/20/22 13:18, Guyren Howe wrote:
I’ve really only ever worked in web development. 90+% of web 
developers regard doing anything at all clever in the database with 
suspicion.


I’m considering working on a book about implementing business logic in 
Postgres, and I’m curious about how common that actually is.
I have to wonder if any particular subset of the programming industry is 
less qualified to make such a judgement?

RE: Are stored procedures/triggers common in your industry

2022-04-20 Thread Basques, Bob (CI-StPaul)
We’ve used them in the past, but sparingly.  Usually if the data is abstracted 
nicely for loading into the DB, you can get away with most processes only 
needing SQL, at least in our cases.  There are obvious exceptions for things 
like monitoring or logging.

Our use has been for running some setup scripts (with PERL) to generate some 
derivative CAD models from the PG DB on the fly, but that was a real specific 
process need.

Bobb



My machine - - - PW19-S295-C024

From: Guyren Howe 
Sent: Wednesday, April 20, 2022 2:18 PM
To: pgsql-general@lists.postgresql.org
Subject: Are stored procedures/triggers common in your industry


I’ve really only ever worked in web development. 90+% of web developers regard 
doing anything at all clever in the database with suspicion.

I’m considering working on a book about implementing business logic in 
Postgres, and I’m curious about how common that actually is.



Re: Are stored procedures/triggers common in your industry

2022-04-20 Thread Philip Semanchuk



> On Apr 20, 2022, at 3:18 PM, Guyren Howe  wrote:
> 
> I’ve really only ever worked in web development. 90+% of web developers 
> regard doing anything at all clever in the database with suspicion.
> 
> I’m considering working on a book about implementing business logic in 
> Postgres, and I’m curious about how common that actually is.
> 


We have some business logic in Postgres functions, particularly triggers. Our 
apps are written in Python, and we use pytest to exercise our SQL functions to 
ensure they're doing what we think they’re doing. It works well for us.

FWIW, we’re not a Web dev shop.

Cheers
Philip





Re: Are stored procedures/triggers common in your industry

2022-04-20 Thread Adrian Klaver

On 4/20/22 12:18, Guyren Howe wrote:
I’ve really only ever worked in web development. 90+% of web developers 
regard doing anything at all clever in the database with suspicion.


I’m considering working on a book about implementing business logic in 
Postgres, and I’m curious about how common that actually is.


For my purposes keeping this logic in the database makes changing or 
running multiple front ends easier. There is one place to change the 
logic vs keeping the same logic in different front ends in potentially 
different languages in sync. So for me it is common.
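[Editor's sketch of the pattern being described: a rule enforced once, in the
database, applies to every front end. The accounts table and the balance rule
are invented for illustration.]

```sql
-- One copy of the business rule, shared by all clients.
CREATE FUNCTION check_positive_balance() RETURNS trigger AS $$
BEGIN
  IF NEW.balance < 0 THEN
    RAISE EXCEPTION 'balance may not be negative';
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- EXECUTE FUNCTION is PostgreSQL 11+; use EXECUTE PROCEDURE on 10 and
-- earlier.
CREATE TRIGGER accounts_balance_check
BEFORE INSERT OR UPDATE ON accounts
FOR EACH ROW EXECUTE FUNCTION check_positive_balance();
```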



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Are stored procedures/triggers common in your industry

2022-04-20 Thread Tim Clarke

On 20/04/2022 20:26, Philip Semanchuk wrote:
> We have some business logic in Postgres functions, particularly triggers. Our 
> apps are written in Python, and we use pytest to exercise our SQL functions 
> to ensure they're doing what we think they’re doing. It works well for us.
>
> FWIW, we’re not a Web dev shop.
>
> Cheers
> Philip


We have a great amount of our business logic in triggers; it makes for
light, multiple, and consistent front ends. It's worked very well for
many years and continues to grow.

Tim Clarke



Telephone: Witham: +44(0)1376 503500 | London: +44 (0)20 3009 0853 | Frankfurt: 
+49 (0)69 7191 6000 | Hong Kong: +852 5803 1687 | Toronto: +1 647 503 2848
Web: https://www.manifest.co.uk/

Watch our latest Minerva Briefings on 
BrightTALK



Minerva Analytics Ltd - A Solactive Company
9 Freebournes Court | Newland Street | Witham | Essex | CM8 2BL | United Kingdom



Copyright: This e-mail may contain confidential or legally privileged 
information. If you are not the named addressee you must not use or disclose 
such information, instead please report it to 
ad...@minerva.info
Legal: Minerva Analytics is the trading name of: Minerva Analytics Ltd: 
Registered in England Number 11260966 & The Manifest Voting Agency Ltd: 
Registered in England Number 2920820 Registered Office at above address. Please 
Click Here https://www.manifest.co.uk/legal/ for further information.


Re: ERROR: XX000: cache lookup failed for type 75083631

2022-04-20 Thread Jan Beseda
Hi Tom, thanks for the debugging directions, but it seems the devops team
fully restored the system snapshot on the corrupted instance for me. If it
occurs again I'll reopen/write you a reply here, if that's ok.

Cheers, Jan

čt 14. 4. 2022 v 19:24 odesílatel Tom Lane  napsal:

> Jan Beseda  writes:
> > I'm having an issue with dropping a view as shown below:
>
> > DROP VIEW access_group_view;
> > ERROR:  XX000: cache lookup failed for type 75083631
> > LOCATION:  format_type_internal, format_type.c:152
>
> Does the behavior change if you say CASCADE?
>
> The fact that it's failing in format_type() implies that something is
> trying to print the name of a type, which doesn't seem like a main-line
> activity for DROP VIEW.  I am suspicious that pg_depend shows this type
> OID as dependent for some reason on this view, and that the message
> that it was trying to print was complaining about how that dependency
> existed and that you'd need to say CASCADE to make it take.  However,
> since format_type() fails, the type OID must not really exist anymore,
> implying that the pg_depend entry is orphaned.
>
> That raises a different set of questions about how it got to be that way.
> But at any rate, what I'd suggest is
>
> 1. Verify that the type OID is wrong:
> select * from pg_type where oid = 75083631;
> If that finds a row then we've got a whole other set of issues.
> (BTW, if you want to be really sure, forcing a seqscan for this
> query or reindexing pg_type could be advisable.)
>
> 2. Check for bogus entries in pg_depend:
> select * from pg_depend where objid = 75083631;
> select * from pg_depend where refobjid = 75083631;
>
> 3. If there's just one hit in pg_depend then it's probably
> safe to delete that row.
>
> regards, tom lane
>


Re: Are stored procedures/triggers common in your industry

2022-04-20 Thread Ravi Krishna

I've really only ever worked in web development. 90+% of web
developers regard doing anything at all clever in the database with 
suspicion.


One common argument they use is that if you write your business logic in 
stored procedures, you are locked to that database, since stored procedure 
languages are pretty much vendor-locked.

TBH, when one sees tens of thousands of lines of Oracle PL/SQL code, there is 
some truth in this.

Re: Are stored procedures/triggers common in your industry

2022-04-20 Thread Alex Aquino
Agree on the lock-in comment; however, can't we say that of anything one is
dependent on in the tech stack, whether at the language level (Java vs
JavaScript vs Python) or now AWS vs Azure vs GCP?

I have always wondered why the lock-in concern seems to be mentioned only
for databases, and not for any other piece of the tech stack.

On Wed, Apr 20, 2022 at 3:31 PM Ravi Krishna  wrote:

> >I've really only ever worked in web development. 90+% of web
> >developers regard doing anything at all clever in the database with
> suspicion.
>
> One common argument they use is that if you write your business logic in
> stored procedure, you are locked to that database since stored procedure
> languages are pretty much vendor locked.
>
> TBH when one sees tens of thousands of Oracle PL/SQL code, there is some
> truth in this.
>


Re: PITR and Temp Tables

2022-04-20 Thread Huan Ruan
Many thanks for the explanation, Tom and Greg. That all makes sense.

Cheers
Huan


Re: Are stored procedures/triggers common in your industry

2022-04-20 Thread Benedict Holland
It's a very weird concern for me. I have never liked that justification, as
we convert SAS to Python 1:1. If you use Django, converting it to Flask is
really hard. If you use PostgreSQL, converting it to Oracle is really hard.

I love stored procedures and triggers. Many of my colleagues don't
understand why sticking everything in a database is a great idea. These are
the same people who think that unique constraints are too much overhead.
It's a great tool to use. Do you need CRUD stored procedures when
SQLAlchemy exists? Nope. Do you need them when doing an extremely complex
select with multiple joins that you want to run all the time? Maybe. Or for
allowing insert operations on base tables in a view. Or for tracking and
monitoring who does what. Also, you can back up stored procedures to make
updates easy.
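The view-insert case mentioned here can be sketched with an INSTEAD OF trigger that redirects rows landing on a view into its base table (table, view, and function names below are hypothetical, for illustration only):

```sql
-- Hypothetical base table and a view exposing a subset of its columns.
CREATE TABLE accounts (
    id         serial PRIMARY KEY,
    name       text NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now()
);
CREATE VIEW account_names AS SELECT id, name FROM accounts;

-- Trigger function that redirects inserts on the view to the base table.
CREATE FUNCTION account_names_insert() RETURNS trigger AS $$
BEGIN
    INSERT INTO accounts (name) VALUES (NEW.name);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER account_names_ins
    INSTEAD OF INSERT ON account_names
    FOR EACH ROW EXECUTE FUNCTION account_names_insert();
```

(A simple single-table view like this one is auto-updatable anyway in modern PostgreSQL; the trigger becomes necessary once the view involves joins or derived columns.)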

Thanks,
Ben

On Wed, Apr 20, 2022, 4:48 PM Alex Aquino  wrote:

> Agree on the lock in comment, however, can't we say that of anything one
> is dependent on in the tech stack, whether that be at the java vs
> javascript vs python, or now aws vs azure vs gcp?
>
> Have always wondered that lock in concern seems to be only mentioned in
> light of dbs, but not any other piece of the tech stack.
>
> On Wed, Apr 20, 2022 at 3:31 PM Ravi Krishna 
> wrote:
>
>> >I've really only ever worked in web development. 90+% of web
>> >developers regard doing anything at all clever in the database with
>> suspicion.
>>
>> One common argument they use is that if you write your business logic in
>> stored procedure, you are locked to that database since stored procedure
>> languages are pretty much vendor locked.
>>
>> TBH when one sees tens of thousands of Oracle PL/SQL code, there is some
>> truth in this.
>>
>


Large Data insert on Master server fills up /pgsql base dir when Logical replication (pglogical) is active

2022-04-20 Thread Siddhartha Gurijala
Hello Postgres community,
I have a database cluster that crashed in a way I don’t understand.
Some details about the setup:

   - The database that crashed is running postgres 14.1
   - This database has three physical standbys using repmgr
   - The database allows another database in the same location to subscribe
   to logical replication subscriptions
   - The $PGDATA directory has 1.8TB total storage and was at 50% usage
   before this issue occurred
   - The pg_wal directory is symlinked to another partition that has 1.4TB
   of available space

Recently during a deployment, there was a single transaction that converted
many large tables from inheritance-based partitioned tables into
declarative partitioned tables. The estimate of the data moved in this
transaction is about 150GB of data.  It should also be noted that all of
these tables are part of a replication set for logical replication.
The way the SQL in this transaction is written:

   - BEGIN
   - In a DO block, create unlogged temporary tables as select * from
   (inheritance partitioned tables)
   - In a DO block, drop the old inheritance partitioned tables
   - Multiple CREATE TABLE statements to recreate these tables using
   declarative partitioning
   - DO blocks to create the individual partitions and indexes
   - DO block to set proper permissions
   - DO block that inserts the data from the unlogged temporary tables into
   the base tables and drops the temporary tables
   - END
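For concreteness, the transaction shape described above might look something like the following sketch (table names, columns, and partition bounds are hypothetical, not the poster's actual DDL):

```sql
BEGIN;

-- Stage the existing rows in a temporary table.
DO $$
BEGIN
    CREATE TEMPORARY TABLE events_stage AS SELECT * FROM events;
END;
$$;

-- Drop the old inheritance-partitioned table and its children.
DO $$
BEGIN
    DROP TABLE events CASCADE;
END;
$$;

-- Recreate the table using declarative partitioning.
CREATE TABLE events (
    id         bigint      NOT NULL,
    created_at timestamptz NOT NULL
) PARTITION BY RANGE (created_at);

-- Create the individual partitions and their indexes.
DO $$
BEGIN
    CREATE TABLE events_2022_04 PARTITION OF events
        FOR VALUES FROM ('2022-04-01') TO ('2022-05-01');
    CREATE INDEX ON events_2022_04 (id);
END;
$$;

-- Reload the staged data and clean up.
DO $$
BEGIN
    INSERT INTO events SELECT * FROM events_stage;
    DROP TABLE events_stage;
END;
$$;

COMMIT;
```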

All the SQL for this is being run as part of multiple PL/pgSQL DO blocks
within a single transaction. In the original failure, the tables being
replaced with declarative partitions were part of the replication set with
an active subscriber.
I thought that maybe my issue is that I shouldn’t have any active
subscriptions that are replicating tables that will be dropped and
recreated within a transaction. So, I tried something different: with the
same active subscriber, I first removed the tables from the replication set
and then executed the transaction that replaces the inheritance tables with
the declarative versions. Even with the tables already removed from the
replication set, the $PGDATA/base directory keeps growing until the disk is
full.
This increase in disk space in the Postgres data directory stops when
I drop the pglogical replication sets and replication slots (in other words,
just dropping the provider node). I have confirmed it’s not pg_wal
that’s holding up the space.
We did a similar migration last year (using identical SQL with different
table names) when our PG was on 10.12 and didn’t see this issue. Hence, I
am confused about why having an active replication slot with no target
tables in it causes Postgres to fill up the base disk space. I have
also compared the schemas and table sizes of these target tables from before
and after my SQL run and there doesn’t seem to be any difference.
Postgres Version on Master database: 14.1
Postgres version on Subscriber database cluster: 14.1
Pglogical version on master:  pglogical 2.4.0-1.rhel7
pglogical version on subscriber: pglogical 2.4.0-1.rhel7
I also tested this after upgrading pglogical to 2.4.1 and still found it to
have the same issue.
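For narrowing down where the space is going, standard catalog queries along these lines can show how far each slot is holding things back and which relations are actually growing under $PGDATA/base:

```sql
-- How far behind each replication slot is; a slot whose restart_lsn lags
-- far behind the current LSN forces retention of WAL and decoding state.
SELECT slot_name, plugin, active, restart_lsn,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn))
           AS retained
FROM pg_replication_slots;

-- The largest relations in the current database, to see what is growing.
SELECT relname,
       pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relkind IN ('r', 'i', 'm', 't')
ORDER BY pg_total_relation_size(oid) DESC
LIMIT 20;
```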

Thanks & Regards,
Siddhartha Gurijala


autovacuum_freeze_max_age on append-only tables

2022-04-20 Thread senor
Hi All,
I'm attempting to mimic a new feature in version 13 where INSERTS will trigger 
vacuum for an append-only table. 

I'm using v11 and configuring autovacuum_freeze_max_age to a value representing 
some number of minutes worth of inserts on a table containing the current day 
events. I'm looking to understand the details of how the vacuum operates and 
what to expect and plan for. I first ran into an issue when a script attempted 
to alter the table to change the value of autovacuum_freeze_max_age while a 
vacuum was running. I know there is a lock conflict while the vacuum is running 
but I was under the impression that autovacuum_vacuum_cost_limit would limit 
the time blocked. The ALTER hung much longer than I expected. 

I'm apparently needing an education on how this "to avoid wraparound" vacuum 
differs from any other. I've seen it referenced as "more aggressive" but I'd 
like details. An upgrade to 13 is "right around the corner".

Pointers to documentation I might have missed would be appreciated.
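For reference, the per-table override described above, plus a query to watch how close the table is getting to the relevant thresholds, might look like this (the table name is hypothetical):

```sql
-- Lower the per-table threshold so autovacuum freezes this append-only
-- table more often than the instance-wide default would.
ALTER TABLE daily_events SET (autovacuum_freeze_max_age = 50000000);

-- Watch the table's XID age against the settings that matter.
SELECT c.relname,
       age(c.relfrozenxid) AS xid_age,
       current_setting('autovacuum_freeze_max_age') AS freeze_max_age,
       current_setting('vacuum_freeze_table_age')   AS freeze_table_age
FROM pg_class c
WHERE c.relname = 'daily_events';
```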

-Senor



Re: autovacuum_freeze_max_age on append-only tables

2022-04-20 Thread Peter Geoghegan
On Wed, Apr 20, 2022 at 4:06 PM senor  wrote:
> I'm attempting to mimic a new feature in version 13 where INSERTS will 
> trigger vacuum for an append-only table.

The problem with that idea is that you need to express the idea that
the table needs to be vacuumed now in terms of its "age", denominated
in XIDs -- but XIDs consumed by the entire system, not just those XIDs
that happen to modify your append-only table. It will likely be very
hard for you to figure out a way to relate these logical units (XIDs)
to some kind of physical cost that captures how far behind you are on
freezing (like blocks, or even tuples). Maybe you'll find something
that works through trial and error, but I wouldn't count on it.

> I'm apparently needing an education on how this "to avoid wraparound" vacuum 
> differs from any other. I've seen it referenced as "more aggressive" but I'd 
> like details. An upgrade to 13 is "right around the corner".

It's complicated -- more complicated than it really should be.
Technically an anti-wraparound autovacuum and an aggressive vacuum are
two different things. In practice anti-wraparound autovacuums are
virtually guaranteed to be aggressive, though an aggressive autovacuum
may not be an antiwraparound VACUUM (sometimes we do aggressive
vacuuming because autovacuum launched a worker before
age(relfrozenxid) reached autovacuum_freeze_max_age, but after
age(relfrozenxid) reached vacuum_freeze_table_age).

See my recent response to a similar question here:

https://postgr.es/m/CAH2-WzkFQ-okvVXizpy4dCEVq75N-Qykh=crhzao-eajflv...@mail.gmail.com

--
Peter Geoghegan




Re: alter function/procedure depends on extension

2022-04-20 Thread David G. Johnston
On Tue, Apr 12, 2022 at 8:55 AM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Tue, Apr 12, 2022 at 8:49 AM Tom Lane  wrote:
>
>> "David G. Johnston"  writes:
>> > -  A function that's marked as dependent on an extension is
>> automatically
>> > -  dropped when the extension is dropped.
>> > +  A function that's marked as dependent on an extension is skipped
>> during
>> > +  dependency checking in restrict mode <xref linkend="sql-dropextension"/>.
>>
>> That is absolutely not an improvement.  The proposed replacement text
>> is about as clear as mud.
>>
>
> + A function that's marked as dependent on an extension is dropped when
> the extension is dropped, even if cascade is not specified.
>
> I suppose that is a wordier way to say "automatically" but referencing the
> actual command keyword seems beneficial.
>
>
I put this change into v0002 and posted it to -hackers as part of a
multi-patch thread.

David J.
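The behavior being documented can be demonstrated along these lines (the function is hypothetical; pg_trgm stands in for any extension):

```sql
CREATE EXTENSION pg_trgm;

-- A function that happens to rely on the extension internally.
CREATE FUNCTION similar_names(a text, b text) RETURNS real AS $$
    SELECT similarity(a, b);
$$ LANGUAGE sql;

-- Mark the function as dependent on the extension ...
ALTER FUNCTION similar_names(text, text) DEPENDS ON EXTENSION pg_trgm;

-- ... so it is dropped automatically along with the extension,
-- even though CASCADE is not specified.
DROP EXTENSION pg_trgm;
```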


Re: Are stored procedures/triggers common in your industry

2022-04-20 Thread Alex Aquino
Agree with the comment on python, et al. I meant lock in within the context
of Oracle PL/SQL.

Actually, the point about Postgres supporting all the languages you mentioned
is interesting, in that it mitigates the lock-in argument if the DB used
is Postgres. Another reason to use Postgres, among all the other
reasons we know.

Architecturally speaking, using stored procs is an elegant solution that
provides proper abstraction for the Data Layer API, effectively separating
the data access and security layers (stored proc design considerations)
from the data model and storage considerations.  This API type of layer
provided by procs is more closely aligned with how one thinks about
microservices.

Secondly, it will generally be faster at runtime, as it avoids the costly
network round trips that take up those precious milliseconds.


On Wed, Apr 20, 2022 at 3:54 PM Guyren Howe  wrote:

> On Apr 20, 2022, at 13:43 , Alex Aquino  wrote:
>
>
> Agree on the lock in comment, however, can't we say that of anything one
> is dependent on in the tech stack, whether that be at the java vs
> javascript vs python, or now aws vs azure vs gcp?
>
> Have always wondered that lock in concern seems to be only mentioned in
> light of dbs, but not any other piece of the tech stack.
>
> On Wed, Apr 20, 2022 at 3:31 PM Ravi Krishna 
> wrote:
>
>> >I've really only ever worked in web development. 90+% of web
>> >developers regard doing anything at all clever in the database with
>> suspicion.
>>
>> One common argument they use is that if you write your business logic in
>> stored procedure, you are locked to that database since stored procedure
>> languages are pretty much vendor locked.
>>
>> TBH when one sees tens of thousands of Oracle PL/SQL code, there is some
>> truth in this.
>>
> You can write your stored procedures and triggers in:
> - python
> - perl
> - Java
> - R
> - JavaScript
> - Rust
> - C
> - … others (scheme, …)
>
> How is this lock-in, again?
>


Re: Are stored procedures/triggers common in your industry

2022-04-20 Thread Mladen Gogala

On 4/20/22 15:18, Guyren Howe wrote:
I’ve really only ever worked in web development. 90+% of web 
developers regard doing anything at all clever in the database with 
suspicion.


I’m considering working on a book about implementing business logic in 
Postgres, and I’m curious about how common that actually is.


Well, there are 2 schools of thought:

1. Put the business logic into the application
2. Put the business logic into the database

Putting the business logic into the application can give you more 
flexibility around enforcing them. On the other hand, you also increase 
chances of inconsistency. There will likely be more than one application 
using reference tables like ADDRESS, ZIP_CODE, STATE, COUNTRY, QUARTER, 
ACCOUNT, CUSTOMER and similar. If there is a rule that a country must 
exist before you add an address in that country into the table, that can 
be enforced by a foreign key. Enforcing it within the application does 2 
things:


1. Moves the rule code to the application server, which is traditionally
   weaker than a database server. In other words, you are more likely
   to run out of CPU juice and memory on an application server than you
   are to run out of resources on the DB server.
2. Opens the possibility of inconsistency. Different applications can
   use different business rules for the same set of tables. That means
   that data entered by one application may make the table internally
   inconsistent for another application.
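The country-before-address rule above comes down to a single foreign key in the schema, enforced identically no matter which application issues the insert (the DDL is a hypothetical sketch):

```sql
CREATE TABLE country (
    code char(2) PRIMARY KEY,
    name text NOT NULL
);

CREATE TABLE address (
    id      serial PRIMARY KEY,
    street  text NOT NULL,
    -- The rule: an address's country must already exist.
    country char(2) NOT NULL REFERENCES country (code)
);

-- Rejected unless 'AU' is already present in country:
INSERT INTO address (street, country) VALUES ('1 Example St', 'AU');
```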

I am a big proponent of using foreign keys, check constraints and 
triggers to enforce business rules. I am also a big proponent of 
avoiding NULL values wherever possible. Database design is an art. CAD 
software used to be popular once upon a time, in a galaxy far, far 
away.  Properly enforcing the business rules in the database itself 
makes the application more clear and easier to write.


Regards

--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com


Re: Are stored procedures/triggers common in your industry

2022-04-20 Thread raf
On Wed, Apr 20, 2022 at 12:18:23PM -0700, Guyren Howe  wrote:

> I’ve really only ever worked in web development. 90+% of web
> developers regard doing anything at all clever in the database with
> suspicion.
>
> I’m considering working on a book about implementing business logic in
> Postgres, and I’m curious about how common that actually is.

I'm used to putting all business logic in the database
(after choosing a great FLOSS database that you'll
never want to migrate away from - like Postgres). And
I've never regretted it (in decades of doing it).

One of the main reasons is speed. I once had a job
where a program selected data out of a database,
dragged it over a network, effectively grouped it into
summaries, sent the summaries back over the network,
and inserted them back into the database one at a
time(!). Replacing it with a stored procedure changed
it from taking 2-3 hours to 2 minutes. And that was a
place that already made heavy use of stored procedures,
so I don't know what went wrong there. The point is
that whenever a lot of data activity is needed, it's
much faster when it's done where the data lives.
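The round-trip elimination described above amounts to one set-based statement running where the data lives, instead of row-at-a-time traffic over the network (table and procedure names are hypothetical):

```sql
-- A stored procedure that builds the summaries in place: one statement,
-- zero network round trips per row.
CREATE PROCEDURE build_daily_summaries()
LANGUAGE sql AS $$
    INSERT INTO sales_summary (day, total)
    SELECT date_trunc('day', sold_at), sum(amount)
    FROM sales
    GROUP BY 1;
$$;

CALL build_daily_summaries();
```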

The other main reason is security. The database can
provide an effective "firewall" between the data and
the client. I never liked the idea of trusting
arbitrary SQL sent from the client. It means you have
to trust every single client application and every
single user (even the ones with good intentions who
produce bad queries in some reporting software,
throw them at the database, and bring it to its
knees) and every single developer (who might not know
SQL and relies on ORMs that trick them into thinking
they don't need to). But when the clients are only
permitted to execute security-definer stored
procedures that have been loaded by the privileged
database owner, you know exactly what code can run
inside the database. SQL injections become impossible
no matter how many bugs and flaws there are in the
client software or its supply chain.
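The pattern described — clients limited to definer-rights procedures loaded by a trusted owner — looks roughly like this (role, table, and function names are hypothetical):

```sql
-- Clients get no direct table access ...
REVOKE ALL ON accounts FROM app_user;

-- ... only the right to call a function that runs with the privileges
-- of its (trusted) owner. Pinning search_path is the usual precaution
-- for SECURITY DEFINER functions.
CREATE FUNCTION open_account(p_name text) RETURNS bigint
LANGUAGE plpgsql SECURITY DEFINER SET search_path = public AS $$
DECLARE
    new_id bigint;
BEGIN
    INSERT INTO accounts (name) VALUES (p_name)
        RETURNING id INTO new_id;
    RETURN new_id;
END;
$$;

GRANT EXECUTE ON FUNCTION open_account(text) TO app_user;
```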

Another good but less critical reason is that when you
use multiple languages, or you migrate partially or
completely from the old cool language to the new cool
language, you don't have to replicate the business
logic in the new language, and you can eliminate the
risk of introducing bugs into mission critical code.
The existing business logic and its test suite can stay
stable while all the bells and whistles on the outside
change however they like.

There are other nice benefits but that's enough.

I think it's safe to disregard the suspicions of the
90+% of web developers you mentioned. The requirements
that they have for a database might be quite
undemanding. Most individual actions on a website
probably don't result in a lot of data activity (or
rather activity that involves a lot of data). The CRUD
model is probably all they need. So their views are
understandable, but they are based on limited
requirements. However, I still use stored procedures
for everything on websites for security reasons.

Everyone's mileage varies. We're all in different places.

cheers,
raf