RE: PostgreSQL 10.20 crashes / Antivirus

2022-04-21 Thread Thomas, Richard
Adrian (all),

Thanks for your suggestions. Running the equivalent pg_dump commands at the 
command line instead of in a script works fine and generates no 
errors/warnings/McAfee messages in the Windows Event Viewer or on 
stdout/stderr. (Most days the backup script itself completes multiple pg_dump 
runs without problem.) I too suspect McAfee of causing the problem - even if 
it isn't actively blocking anything, it must be intercepting all file I/O in 
order to decide which accesses to virus-scan. Unfortunately, it is a very 
painful process for me to get our IT to modify 
virus scanning or report back on AV messages (even if not errors), so any 
diagnostics I can perform help speed the process. (And unfortunately, my hands 
are currently tied on having to use an existing AWS EC2 Windows Server, much as 
I would prefer to use a Linux system or a cloud-based PaaS.)

Richard

> > Adrian Klaver wrote:
> >> On 4/20/22 01:06, Thomas, Richard wrote:
>
> >>> - pg_dump.exe executable is not excluded from McAfee on-access
> >>> scanning (although as recommended postgres.exe is)
> >>
> >> Why not?
> >>
> >> I would think the whole C:\Program Files\PostgreSQL\10\bin\ would be
> >> excluded.
> >
> > I was following the instructions here:
> >
>
> > I am not particularly familiar with how our McAfee is configured or
> > operates (all under control of our IT, including it seems access to
> > many of the logs). With the executable postgres.exe, they have
> > specified that in McAfee as an executable not to include in "on-access
> > scanning" (wherever the executable lives). This differs from the file
> > read/write scanning where the folders to exclude are specified. I have
> > put in a request earlier today to add exclusions for pg_dump.exe;
> > maybe I'll hear back from IT in a day or so ;-(
>
> That link also has:
>
> "Specific issues have also been reported with McAfee and Panda anti-virus
> software and NetLimiter network monitoring software. While some people do
> have PostgreSQL working with these software packages, there is no specific or
> even recommend solutions that have not worked in some cases, so the issues
> would appear to be installation specific, sometimes even requiring
> uninstallation. "
>
> Now the last edit to the page was 2013, so I'm not sure how relevant the above is
> anymore. Also, I don't use Windows anymore (other than occasionally working
> on people's machines), so I am not the best person to comment on current
> reality. Still, the Postgres process seems to be killed by an external program,
> and my suspicion is that the AV software is involved. Looks like this is
> something the sys admins are going to have to look into via the tools and files
> they have access to.


Re: LwLocks contention

2022-04-21 Thread Chris Bisnett
> We are occasionally seeing heavy CPU contention with hundreds of processes 
> active but waiting on a lightweight lock - usually lock manager or buffer 
> mapping, it seems. This is happening with VMs configured with about 64 CPUs and 
> 350 GB of RAM, and while we would typically only have 30-100 concurrent 
> processes, there will suddenly be ~300, many showing active with an LWLock wait, 
> and they take much longer than usual. Any suggested options to monitor for such 
> issues, or logging to set up so the next issue can be debugged properly?
>
> It has seemed to me that this occurs when there are more than the usual 
> number of a particular process type and also something that is a bit heavy in 
> usage of memory/disk. It has happened on various tenant instances and 
> different application processes as well.
>
> Would/how might the use of huge pages (or transparent huge pages, or OFF) 
> play into this scenario?

I've also been contending with a good bit of lightweight lock
contention that causes performance issues. Most often we see this with
the WAL write lock, but when we get too many parallel queries running
we end up in a "thundering herd" type of issue where the contention for
the lock manager lock consumes significant CPU resources, causing the
number of parallel queries to increase as more clients back up behind
the lock contention, leading to even more lock contention. When this
happens we have to pause our background workers long enough to allow
the lock contention to subside, and then we can resume them. When we
hit the lock contention it's not a gradual degradation; it goes
immediately from nothing to more than 100% CPU usage. The same is true
when the lock contention clears - it goes from 100% back to nothing.
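
For the monitoring question, a periodic snapshot of pg_stat_activity grouped
by wait event is usually enough to catch this (a sketch):

SELECT wait_event_type,
       wait_event,
       count(*) AS backends
FROM pg_stat_activity
WHERE state = 'active'
GROUP BY wait_event_type, wait_event
ORDER BY backends DESC;

During an episode like the one above, the lock manager / buffer mapping
LWLocks show up at the top of that output.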

I've been working under the assumption that this has to do with our
native partitioning scheme and the fact that some queries cannot take
advantage of partition pruning because they don't contain the
partition column. My understanding is that when this happens, ACCESS
SHARE locks have to be taken on all partitions as well as all associated
resources (indexes, sequences, etc.), and the act of taking and
releasing all of those locks increases the lock contention
significantly. We're working to update our application so that we can
take advantage of the pruning. Are you also using native partitioning?
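
As a rough illustration of that effect (table and column names here are made
up), a query that carries the partition key can be pruned to a single
partition, while one that omits it has to open and lock every partition and
its indexes:

CREATE TABLE events (
    tenant_id  int         NOT NULL,
    created_at timestamptz NOT NULL,
    payload    text
) PARTITION BY LIST (tenant_id);

CREATE TABLE events_t1 PARTITION OF events FOR VALUES IN (1);
CREATE TABLE events_t2 PARTITION OF events FOR VALUES IN (2);

-- Carries the partition key: pruned to events_t1 only.
EXPLAIN SELECT * FROM events WHERE tenant_id = 1;

-- Omits the partition key: every partition (and each of its indexes)
-- takes an ACCESS SHARE lock.
EXPLAIN SELECT * FROM events WHERE created_at > now() - interval '1 day';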

- Chris




Re: autovacuum_freeze_max_age on append-only tables

2022-04-21 Thread Laurenz Albe
On Wed, 2022-04-20 at 23:06 +, senor wrote:
> I'm apparently needing an education on how this "to avoid wraparound" vacuum 
> differs from
> any other. I've seen it referenced as "more aggressive" but I'd like details.

The difference is twofold, as far as I know:

- it will not skip any pages just because it happens not to get a lock on them
- it will refuse to die if the lock it holds on the table conflicts with a user 
lock

Unless you are in the habit of taking strong locks on the table, you shouldn't
notice a difference.  Anti-wraparound VACUUM is a routine activity and does not
interfere with DML, just like a normal VACUUM.
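
If you want to see how close tables are to triggering one, a query like this
shows relfrozenxid age next to the trigger threshold (a sketch):

SELECT c.oid::regclass AS table_name,
       age(c.relfrozenxid) AS xid_age,
       current_setting('autovacuum_freeze_max_age')::bigint AS freeze_max_age
FROM pg_class c
WHERE c.relkind IN ('r', 'm', 't')
ORDER BY age(c.relfrozenxid) DESC
LIMIT 20;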

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Are stored procedures/triggers common in your industry

2022-04-21 Thread Alex Aquino
You mentioned testing, and that reminds me of another benefit.  Way faster, more
reliable, and cheaper to test on the DB side.  Testing logic in SPs or SQL is
much easier, especially when testing requires a sequence of calls for a use
case.  It is easier because of the DB's support for transactions.  With
transactions and state management built into the DB, a testing process can
always revert to a reliable starting point and end point, thereby
facilitating more dependable, automated test harnesses.  The alternative
done mostly now is testing via UIs or APIs where there is no inherent
transaction management, so a lot of work goes into preparing the test bed
to be a known state and introspecting the results to verify.  This is
usually done with some mix of manual and automated processes.
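
A minimal sketch of the pattern (the procedure and table here are
hypothetical, and CALL assumes PG 11+):

BEGIN;
-- Exercise the logic under test.
CALL register_payment(42, 99.95);
-- Verify the observable effect.
SELECT count(*) AS paid_rows FROM payments WHERE order_id = 42;
-- Return the database to its known starting state.
ROLLBACK;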

On Thu, Apr 21, 2022 at 12:31 AM raf  wrote:

> On Wed, Apr 20, 2022 at 12:18:23PM -0700, Guyren Howe 
> wrote:
>
> > I’ve really only ever worked in web development. 90+% of web
> > developers regard doing anything at all clever in the database with
> > suspicion.
> >
> > I’m considering working on a book about implementing business logic in
> > Postgres, and I’m curious about how common that actually is.
>
> I'm used to putting all business logic in the database
> (after choosing a great FLOSS database that you'll
> never want to migrate away from - like Postgres). And
> I've never regretted it (in decades of doing it).
>
> One of the main reasons is speed. I once had a job
> where a program selected data out of a database,
> dragged it over a network, effectively grouped it into
> summaries, sent the summaries back over the network,
> and inserted them back into the database one at a
> time(!). Replacing it with a stored procedure changed
> it from taking 2-3 hours to 2 minutes. And that was a
> place that already made heavy use of stored procedures,
> so I don't know what went wrong there. The point is
> that whenever a lot of data activity is needed, it's
> much faster when it's done where the data lives.
>
> The other main reason is security. The database can
> provide an effective "firewall" between the data and
> the client. I never liked the idea of trusting
> arbitrary SQL sent from the client. It means you have
> to trust every single client application and every
> single user (even the ones with good intentions who
> produce bad queries in some reporting software, throw
> them at the database, and bring it to its
> knees) and every single developer (who might not know
> SQL and relies on ORMs that trick them into thinking
> they don't need to). But when the clients are only
> permitted to execute security definer stored
> procedures that have been loaded by the privileged
> database owner, you know exactly what code can run
> inside the database. SQL injections become impossible
> no matter how many bugs and flaws there are in the
> client software or its supply chain.
>
> Another good but less critical reason is that when you
> use multiple languages, or you migrate partially or
> completely from the old cool language to the new cool
> language, you don't have to replicate the business
> logic in the new language, and you can eliminate the
> risk of introducing bugs into mission critical code.
> The existing business logic and its test suite can stay
> stable while all the bells and whistles on the outside
> change however they like.
>
> There are other nice benefits but that's enough.
>
> I think it's safe to disregard the suspicions of the
> 90+% of web developers you mentioned. The requirements
> that they have for a database might be quite
> undemanding. Most individual actions on a website
> probably don't result in a lot of data activity (or
> rather activity that involves a lot of data). The CRUD
> model is probably all they need. So their views are
> understandable, but they are based on limited
> requirements. However, I still use stored procedures
> for everything on websites for security reasons.
>
> Everyone's mileage varies. We're all in different places.
>
> cheers,
> raf
>
>
>
>
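
A minimal sketch of the locked-down pattern raf describes (role, function,
and table names are hypothetical):

-- The application role gets no direct table privileges.
CREATE ROLE app_user LOGIN PASSWORD 'changeme';
REVOKE ALL ON ALL TABLES IN SCHEMA public FROM app_user;

-- Business logic lives in a SECURITY DEFINER function owned by a privileged role.
CREATE OR REPLACE FUNCTION create_customer(p_name text)
RETURNS bigint
LANGUAGE plpgsql
SECURITY DEFINER
SET search_path = public
AS $$
DECLARE
    new_id bigint;
BEGIN
    INSERT INTO customer (name) VALUES (p_name) RETURNING id INTO new_id;
    RETURN new_id;
END;
$$;

REVOKE EXECUTE ON FUNCTION create_customer(text) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION create_customer(text) TO app_user;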


Re: PostgreSQL 10.20 crashes / Antivirus

2022-04-21 Thread Adrian Klaver

On 4/21/22 02:13, Thomas, Richard wrote:

Adrian (all),

Thanks for your suggestions. Running the equivalent pg_dump commands at the command line instead of in a script works fine and generates no errors/warnings/McAfee messages in the Windows Event Viewer or on stdout/stderr. (Most days the backup script itself completes running multiple pg_dump without problem). 


For the time being, a possible solution would be to insert a timeout 
between iterations of:


$pgdumpCmd = "C:\Program Files\PostgreSQL\10\bin\pg_dump.exe"
$pgdumpArgs = @("-b", "-v",
"-F", "c",
"-d", $dbName,
"-h", "localhost",
"-p", "6488",
"-U", " backup_su",
"-f", $backupFile)
cmd /c $pgdumpCmd $pgdumpArgs 2`>`&1 | Out-File $pgdumpLogFile
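
For example, a delay after each dump (just a sketch; the length is arbitrary):

# Hypothetical pause between pg_dump invocations, giving the on-access
# scanner time to finish with the previous dump file.
Start-Sleep -Seconds 60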



Richard





--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Huge archive log generate in Postgresql-13

2022-04-21 Thread Adrian Klaver

On 4/19/22 21:31, Ram Pratap Maurya wrote:

Dear Adrian,

We are using binary replication, and the per-day *pg_log_archive* total size 
has increased after upgrading from PG11 to PG13.


So what is pruning the WAL files in pg_log_archive?

Or to put it another way what is the purpose of pg_log_archive and how 
is it managed?
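
If it helps, a quick way to see what the archiver itself is doing (a sketch):

-- How much is being archived and whether archiving is failing.
SELECT archived_count, last_archived_wal, last_archived_time,
       failed_count, last_failed_wal
FROM pg_stat_archiver;

-- Current WAL insert position; compare two readings taken some time
-- apart to estimate how much WAL is generated per day.
SELECT pg_current_wal_lsn();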




Regards,

Ram Pratap.





--
Adrian Klaver
adrian.kla...@aklaver.com




set column statistics to max does not help

2022-04-21 Thread Radoslav Nedyalkov
Hello all,
We're stuck with a wrong plan that the planner insists on (PG 14.2).
It's an aggregation over a single table.
The planner always goes for an extra scan over a partial index.
We have set the statistics target on the relevant columns to the maximum and
still could not get correct row estimates. None of the cost* settings helped
either.
Disabling bitmapscan (see below) brings up the correct plan. Do we have another
option here?
Maybe it is a more general question of what to do when the statistics
samples cannot capture the right numbers?
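
For reference, the statements in question look like this (a sketch using
names from the details below):

-- Per-column statistics target (10000 is the maximum), then re-analyze.
ALTER TABLE transaction_events_0100_0200 ALTER COLUMN event_type SET STATISTICS 10000;
ALTER TABLE transaction_events_0100_0200 ALTER COLUMN current_status_id SET STATISTICS 10000;
ANALYZE transaction_events_0100_0200;

-- Session-level workaround that currently produces the good plan.
SET enable_bitmapscan = off;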

Thank you in advance for your help,

Rado


Here are the details:
See the partial index that matches one of the query's WHERE clauses:
"transaction_events_0100_0200_merchant_id_id_idx" btree (merchant_id, id)
WHERE (event_type::text = ANY (ARRAY['REFUND'::character varying::text,
'CHARGE_BACK'::character varying::text])) AND (current_status_id <> ALL
(ARRAY[24, 10]))

the distribution of event_type is:
  count   | event_type
----------+-------------
    14908 | CHARGE_BACK
   134007 | REFUND
 99846581 | PAYOUT
(3 rows)

*** The table:
test=# \d+ transaction_events_0100_0200
                                         Table "public.transaction_events_0100_0200"
           Column            |            Type             | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
-----------------------------+-----------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
 id                          | bigint                      |           | not null |         | plain    |             |              |
 transaction_id              | bigint                      |           |          |         | plain    |             | 1            |
 event_type                  | character varying(255)      |           |          |         | extended |             | 1            |
 event_amount                | numeric(12,2)               |           |          |         | main     |             |              |
 current_status_id           | integer                     |           |          |         | plain    |             | 1            |
 payout_due_date             | date                        |           |          |         | plain    |             |              |
 actual_payout_date          | date                        |           |          |         | plain    |             |              |
 current_payout_event_id     | integer                     |           |          |         | plain    |             |              |
 created_at                  | timestamp without time zone |           | not null |         | plain    |             |              |
 updated_at                  | timestamp without time zone |           | not null |         | plain    |             |              |
 installment_number          | integer                     |           |          |         | plain    |             |              |
 installments_count          | integer                     |           |          |         | plain    |             |              |
 fixed_fee                   | numeric(12,2)               |           |          |         | main     |             |              |
 acceleration_fee            | numeric(12,2)               |           |          |         | main     |             |              |
 processing_fee              | numeric(12,2)               |           |          |         | main     |             |              |
 origin_event_id             | bigint                      |           |          |         | plain    |             |              |
 destination_event_id        | bigint                      |           |          |         | plain    |             |              |
 payout_deduct_status        | character varying           |           |          |         | extended |             |              |
 merchant_id                 | integer                     |           | not null |         | plain    |             | 1000         |
 current_merchant_payout_id  | bigint                      |           |          |         | plain    |             |              |
Indexes:
    "transaction_events_0100_0200_pkey" PRIMARY KEY, btree (id)
    "transaction_events_0100_0200_current_status_id_transaction__idx" btree (current_status_id, transaction_id) WHERE current_status_id <> ALL (ARRAY[24, 10])
    "transaction_events_0100_0200_merchant_id_id_idx" btree (merchant_id, id) WHERE (event_type::text = ANY (ARRAY['REFUND'::character varying::text, 'CHARGE_BACK'::character varying::text])) AND (current_status_id <> ALL (ARRAY[24, 10]))
    "transaction_events_0100_0200_merchant_id_transaction_id_idx" btree (merchant_id, transaction_id) WHERE current_status_id <> ALL (ARRAY[24, 10])
    "transaction_events_0100_0200_transaction_id_idx" btree (transaction_id) WITH (fillfactor='100')
    "transaction_events_0100_0200_transaction_id_idx1" btree (transaction_id) WHERE event_type::text = 'CHARGE_BACK'::text AND payout_deduct_status::text = 'PENDING'::text

*** The query:
SEL

Re: Are stored procedures/triggers common in your industry

2022-04-21 Thread raf
On Thu, Apr 21, 2022 at 08:42:10AM -0500, Alex Aquino  
wrote:

> You mentioned testing, and that reminds me of another benefit.  Way faster, more
> reliable, and cheaper to test on the DB side.  Testing logic in SPs or SQL is
> much easier, especially when testing requires a sequence of calls for a use
> case.  It is easier because of the DB's support for transactions.  With
> transactions and state management built into the DB, a testing process can
> always revert to a reliable starting point and end point, thereby
> facilitating more dependable, automated test harnesses.  The alternative
> done mostly now is testing via UIs or APIs where there is no inherent
> transaction management, so a lot of work goes into preparing the test bed
> to be a known state and introspecting the results to verify.  This is
> usually done with some mix of manual and automated processes.

Actually, my full work tests take ages to run (~40m).
I know that mocking the db to make unit tests fast is popular,
but that's not helpful when the most important code being tested
is in the database. :-) It's more important to me that the tests
actually test everything than that they be fast.

But yes, being able to do complex system testing with transaction
rollback is great.

cheers,
raf





Configuration and performance of Postgres/PostGIS

2022-04-21 Thread Shaozhong SHI
Whenever geospatial functions such as ST_Intersects or recursive queries are
used, Postgres/PostGIS appears to spawn far too many child processes and just
obliterates the CPU.  Nothing finishes.

That forced me to try out some of the tasks on the FME server.

I tried to use this
http://blog.cleverelephant.ca/2010/07/network-walking-in-postgis.html
in Postgres/PostGIS.

I tried the LineCombiner in FME: LineCombiner | FME (safe.com).

With a large data set, the running processes were monitored.  It was
estimated the Postgres/PostGIS run would take 16 days to complete.

But it only took a few minutes to do the same thing in FME.

This suggests that something is not right with the Postgres server.

Has anyone got experience with configuring and improving the performance of a
Postgres server?

Regards,

David


Re: autovacuum_freeze_max_age on append-only tables

2022-04-21 Thread Senor

Thank you both Laurenz and Peter.

Laurenz - It was an article you posted a couple years ago introducing 
the V13 feature that got me thinking about the insert-only situation I had.


Peter - I had been improperly holding anti-wraparound and aggressive in 
my mind as related in a way they are not. You cleared that up.


2 last questions (maybe):

Are the autovacuum_vacuum_cost_* settings handled any differently for 
'to avoid wraparound' vacuums? I understand that it won't give up a lock 
but I was expecting it to still back off due to cost and allow the query 
with conflicting lock to proceed.


Is there any benefit to manually running a vacuum every so many inserts 
as opposed to using autovacuum_freeze_max_age? And in this case, should 
it be a VACUUM FREEZE? Rows are never updated or deleted except for the 
occasional rollback due to dropped network connections.


Thanks again

-Senor


On 4/21/2022 6:35, Laurenz Albe wrote:

On Wed, 2022-04-20 at 23:06 +, senor wrote:

I'm apparently needing an education on how this "to avoid wraparound" vacuum 
differs from
any other. I've seen it referenced as "more aggressive" but I'd like details.

The difference is twofold, as far as I know:

- it will not skip any pages just because it happens not to get a lock on them
- it will refuse to die if the lock it holds on the table conflicts with a user 
lock

Unless you are in the habit of taking strong locks on the table, you shouldn't
notice a difference.  Anti-wraparound VACUUM is a routine activity and does not
interfere with DML, just like a normal VACUUM.

Yours,
Laurenz Albe





Re: autovacuum_freeze_max_age on append-only tables

2022-04-21 Thread Peter Geoghegan
On Thu, Apr 21, 2022 at 8:14 PM Senor  wrote:
> Are the autovacuum_vacuum_cost_* settings handled any differently for
> 'to avoid wraparound' vacuums? I understand that it won't give up a lock
> but I was expecting it to still back off due to cost and allow the query
> with conflicting lock to proceed.

In general, no. For the most part an antiwraparound autovacuum does
exactly the same work as any other autovacuum. Or any other aggressive
VACUUM, at least. But even the extra work that it does over what a
non-aggressive VACUUM does is still work that any VACUUM might do, if the
circumstances were right.

We still freeze in regular VACUUMs, provided we scan pages with XIDs
that are sufficiently old. The most important difference between it
and aggressive VACUUM is that the former can skip all-visible pages
that have unfrozen XIDs, putting that work off. This can sometimes
lead to a big balloon payment later on, when you finally have an
aggressive VACUUM. I think that that's a design flaw that ought to be
fixed.

Currently non-aggressive VACUUMs always skip all-visible pages. They
should probably freeze some older all-visible pages eagerly, rather
than skipping them, so that the system never gets too far behind on
freezing.
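
One way to watch that deferred freezing build up is the pg_visibility
extension (a sketch; the table name is made up):

CREATE EXTENSION IF NOT EXISTS pg_visibility;

-- Pages that are all-visible vs. all-frozen; a large gap is work that a
-- future aggressive VACUUM will have to pay for all at once.
SELECT * FROM pg_visibility_map_summary('my_append_only_table');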

> Is there any benefit to manually running a vacuum every so many inserts
> as opposed to using autovacuum_freeze_max_age? And in this case, should
> it be a VACUUM FREEZE?

Given your restrictions, this is probably the best option available.
But maybe you should just set vacuum_freeze_min_age to 0 at the table
level, instead of using vacuum freeze (so you freeze more without
doing aggressive vacuuming all the time, which FREEZE also forces).
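
Something along these lines (a sketch; the table name is made up):

-- Freeze eagerly whenever autovacuum processes this table, without forcing
-- aggressive scans the way VACUUM (FREEZE) does.
ALTER TABLE my_append_only_table SET (autovacuum_freeze_min_age = 0);

-- Or, for a periodic manual vacuum:
SET vacuum_freeze_min_age = 0;
VACUUM my_append_only_table;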

Users understandably think that there are several different flavors of
vacuum, but that's not really true (apart from VACUUM FULL, which
really is quite different). The difference between aggressive and
non-aggressive can be big in practice due to an accumulation of
unfrozen pages over multiple non-aggressive vacuums.

-- 
Peter Geoghegan




Re: autovacuum_freeze_max_age on append-only tables

2022-04-21 Thread David G. Johnston
On Thu, Apr 21, 2022 at 8:15 PM Senor  wrote:

>
> Are the autovacuum_vacuum_cost_* settings handled any differently for
> 'to avoid wraparound' vacuums?



> I understand that it won't give up a lock
> but I was expecting it to still back off due to cost and allow the query
> with conflicting lock to proceed.
>

IIUC "conflicting lock to proceed" is just a different way to say "give up
a lock".

In any case the cost-based stuff throttles I/O only (per the docs at least)
but even while sleeping it still holds its lock.  And it won't be kicked
off of the lock by other processes.  I don't see where it is documented
that the autovacuum cost settings are altered during the anti-wraparound
vacuum so I presume it will still sleep by default.
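
If the throttling itself becomes a problem, the cost settings can be
overridden per table (a sketch; the table name is made up):

-- Disable cost-based sleeping for autovacuum on this one table.
ALTER TABLE my_append_only_table SET (autovacuum_vacuum_cost_delay = 0);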


> Is there any benefit to manually running a vacuum every so many inserts
> as opposed to using autovacuum_freeze_max_age? And in this case, should
> it be a VACUUM FREEZE? Rows are never updated or deleted except for the
> occasional rollback due to dropped network connections.
>
>
You might consider creating a security definer function (that performs
vacuum freeze on the table) owned by the table owner and granting your
inserting process the ability to execute it.

David J.


RE: Huge archive log generate in Postgresql-13

2022-04-21 Thread Ram Pratap Maurya
Dear Adrian,

We are using  "pg_log_archive" for Point-in-Time Recovery and DR replication.


Regards,
Ram Pratap.

-Original Message-
From: Adrian Klaver [mailto:adrian.kla...@aklaver.com] 
Sent: 21 April 2022 21:35
To: Ram Pratap Maurya ; 
pgsql-gene...@postgresql.org
Subject: Re: Huge archive log generate in Postgresql-13

On 4/19/22 21:31, Ram Pratap Maurya wrote:
> Dear Adrian,
> 
> We are using binary replication, and the per-day *pg_log_archive* total 
> size has increased after upgrading from PG11 to PG13.

So what is pruning the WAL files in pg_log_archive?

Or to put it another way what is the purpose of pg_log_archive and how is it 
managed?

> 
> Regards,
> 
> Ram Pratap.
> 



--
Adrian Klaver
adrian.kla...@aklaver.com