Re: libpq: Which functions may hang due to network issues?

2021-12-05 Thread Daniel Frey
> On 4. Dec 2021, at 22:43, Laurenz Albe  wrote:
> 
> On Fri, 2021-12-03 at 21:33 +0100, Daniel Frey wrote:
>> But the real issue, at least for me, is PQfinish(). Considering that my 
>> application is not
>> allowed to hang (or crash, leak, ...), what should I do in case of a timeout?
> 
> I am tempted to say that you shouldn't use TCP with the requirement that it 
> should not hang.

We actually use UDP in a lot of places, specifically Radius. But the DB 
connection is supposed to be TCP, no?

>> I have existing
>> connections and at some point the network connections stop working (e.g. due 
>> to a firewall
>> issue/reboot), etc. If I don't want a resource leak, I *must* call 
>> PQfinish(), correct?
>> But I have no idea whether it might hang. If you don't want to guarantee 
>> that PQfinish()
>> will not hang, then please advise how to use libpq properly in this 
>> situation. If there
>> some asynchronous version of PQfinish()? Or should I handle hanging 
>> connections differently?
> 
> You could start a separate process that has your PostgreSQL connection and 
> kill it if it
> times out.  But then you'd have a similar problem communicating with that 
> process.

Shifting the problem somewhere else (and adding even more complexity to the 
system) doesn't solve it.

> A normal thing to do when your database call times out or misbehaves in other 
> ways is
> to give up, report an error and die (after some retries perhaps).

Our software is expected to run 24/7 without dying just because some other 
system has a  (temporary) outage. And when database connections die, we issue 
an alarm and we regularly check if we can open new ones in a rate limited 
manner, so we don't flood the network and the DB with connection requests. We 
then clear the alarm once DB connectivity comes up again. Our software includes 
fallback logic on how to minimize customer impact while DB connectivity is down 
or when another systems is temporarily unavailable, this is a defined and 
controlled scenario. If we were to simply crash, what would the next system up 
the chain do? See that we are not responsing, so it would also crash? (BTW, I'm 
working for a big telco company in Germany, just to give some idea/perspective 
what kind of systems we are talking about).

With all that said, I think that PostgreSQL/libpq should have a clear, 
documented way to get rid of a connection that is guaranteed to not hang. It 
has something similar for almost all other methods like opening connections, 
sending request, retrieving results. Why stop there?





Re: Max connections reached without max connections reached

2021-12-05 Thread Dilip Kumar
On Sun, Dec 5, 2021 at 10:55 AM Dilip Kumar  wrote:
>
> On Fri, Dec 3, 2021 at 9:02 PM Tom Lane  wrote:
> >
> > Dilip Kumar  writes:
> > > On Thu, Dec 2, 2021 at 9:35 AM Dilip Kumar  wrote:
> > >> I think there is no such view or anything which tells about which
> > >> backend or transaction has more than 64 sub transaction.  But if we
> > >> are ready to modify the code then we can LOG that information in
> > >> GetNewTransactionId(), when first time we are marking it overflown.
> >
> > > I have prepared a small patch to log this information.
> >
> > Putting an elog call into GetNewTransactionId seems like a completely
> > horrid idea from a performance standpoint.  Especially if you put it
> > inside the XidGenLock hold, where it can block the entire system not just
> > the one process.  But even without that, this seems like a performance
> > penalty with basically no real-world benefit.  People who have issues
> > like this are not going to want to trawl the postmaster log for such
> > messages.
>
> Agreed with both points.  What about we add, subxid count and overflow
> status in LocalPgBackendStatus and through that, we can show in
> pg_stat_activity.  That way we don't have to report it ever and
> whenever the user is running pg_stat_activity they can fetch it
> directly from "proc->subxidStatus", along with fetching the proc.xid
> and proc.xmin.  Does this make sense?
>

The attached patch implements the idea I explained.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From 622e1012667c3cfa0c71f27590e2a49833970e22 Mon Sep 17 00:00:00 2001
From: Dilip Kumar 
Date: Sun, 5 Dec 2021 17:56:16 +0530
Subject: [PATCH v1] Add subtransaction count and overflow status in
 pg_stat_activity

If there are some backends having a lot of nested subtransaction
or the subtransaction cache is overflowed there is a no way to
detect that.  So this patch is making that easy by adding those
fields in pg_stat_activity view.
---
 doc/src/sgml/monitoring.sgml| 18 ++
 src/backend/catalog/system_views.sql|  4 +++-
 src/backend/storage/ipc/sinvaladt.c | 13 +
 src/backend/utils/activity/backend_status.c |  4 +++-
 src/backend/utils/adt/pgstatfuncs.c | 13 -
 src/include/catalog/pg_proc.dat |  6 +++---
 src/include/storage/sinvaladt.h |  4 +++-
 src/include/utils/backend_status.h  | 10 ++
 src/test/regress/expected/rules.out | 12 +++-
 9 files changed, 68 insertions(+), 16 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 62f2a33..3eca83a 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -918,6 +918,24 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34   0:00 postgres: ser
The current backend's xmin horizon.
   
  
+
+ 
+  
+   subxact_count xid
+  
+  
+   The current backend's active subtransactions count.
+  
+ 
+
+ 
+  
+   subxact_overflowed xid
+  
+  
+   Set to true if current backend's subtransaction cache is overflowed.
+  
+ 
 
 
  
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 61b515c..3df23df 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -839,7 +839,9 @@ CREATE VIEW pg_stat_activity AS
 s.backend_xmin,
 S.query_id,
 S.query,
-S.backend_type
+S.backend_type,
+S.subxact_count,
+S.subxact_overflowed
 FROM pg_stat_get_activity(NULL) AS S
 LEFT JOIN pg_database AS D ON (S.datid = D.oid)
 LEFT JOIN pg_authid AS U ON (S.usesysid = U.oid);
diff --git a/src/backend/storage/ipc/sinvaladt.c b/src/backend/storage/ipc/sinvaladt.c
index 946bd8e..876d7fe 100644
--- a/src/backend/storage/ipc/sinvaladt.c
+++ b/src/backend/storage/ipc/sinvaladt.c
@@ -395,17 +395,20 @@ BackendIdGetProc(int backendID)
 
 /*
  * BackendIdGetTransactionIds
- *		Get the xid and xmin of the backend. The result may be out of date
- *		arbitrarily quickly, so the caller must be careful about how this
- *		information is used.
+ *		Get the xid and xmin, nsubxid and overflow status of the backend. The
+ *		result may be out of date arbitrarily quickly, so the caller must be
+ *		careful about how this information is used.
  */
 void
-BackendIdGetTransactionIds(int backendID, TransactionId *xid, TransactionId *xmin)
+BackendIdGetTransactionIds(int backendID, TransactionId *xid,
+		   TransactionId *xmin, int *nsubxid, bool *overflowed)
 {
 	SISeg	   *segP = shmInvalBuffer;
 
 	*xid = InvalidTransactionId;
 	*xmin = InvalidTransactionId;
+	*nsubxid = 0;
+	*overflowed = false;
 
 	/* Need to lock out additions/removals of backends */
 	LWLockAcquire(SInvalWriteLock, LW_SHARED);
@@ -419,6 +422,8 @@ BackendIdGetTransactionIds(int backend

Re: libpq: Which functions may hang due to network issues?

2021-12-05 Thread Tom Lane
Daniel Frey  writes:
> With all that said, I think that PostgreSQL/libpq should have a clear, 
> documented way to get rid of a connection that is guaranteed to not hang. It 
> has something similar for almost all other methods like opening connections, 
> sending request, retrieving results. Why stop there?

AFAICS, PQfinish() already acts that way, at least up to the same level of
guarantee as you have for "all other methods".  That is, if you previously
set the connection into nonblock mode, it won't block.

regards, tom lane




Re: libpq: Which functions may hang due to network issues?

2021-12-05 Thread Daniel Frey
> On 5. Dec 2021, at 17:01, Tom Lane  wrote:
> 
> Daniel Frey  writes:
>> With all that said, I think that PostgreSQL/libpq should have a clear, 
>> documented way to get rid of a connection that is guaranteed to not hang. It 
>> has something similar for almost all other methods like opening connections, 
>> sending request, retrieving results. Why stop there?
> 
> AFAICS, PQfinish() already acts that way, at least up to the same level of
> guarantee as you have for "all other methods".  That is, if you previously
> set the connection into nonblock mode, it won't block.

OK, thanks Tom, that is at least something. I would still like this to be kinda 
documented/guaranteed, especially if nonblocking mode is required for this 
behavior (which is given in my case). But I guess that's not up to me, so I'll 
drop the topic and I'll just have to accept the status quo.

Thanks, Daniel





Re: Max connections reached without max connections reached

2021-12-05 Thread James Sewell
>
> Agreed with both points.  What about we add, subxid count and overflow
> status in LocalPgBackendStatus and through that, we can show in
> pg_stat_activity.  That way we don't have to report it ever and
> whenever the user is running pg_stat_activity they can fetch it
> directly from "proc->subxidStatus", along with fetching the proc.xid
> and proc.xmin.  Does this make sense?
>

Not sure about the overhead and performance aspects, but I like this
solution from a functional point of view.

-- 
The contents of this email are confidential and may be subject to legal or 
professional privilege and copyright. No representation is made that this 
email is free of viruses or other defects. If you have received this 
communication in error, you may not copy or distribute any part of it or 
otherwise disclose its contents to anyone. Please advise the sender of your 
incorrect receipt of this correspondence.


Re: Max connections reached without max connections reached

2021-12-05 Thread Amul Sul
On Mon, Dec 6, 2021 at 6:11 AM James Sewell  wrote:
>>
>> Agreed with both points.  What about we add, subxid count and overflow
>> status in LocalPgBackendStatus and through that, we can show in
>> pg_stat_activity.  That way we don't have to report it ever and
>> whenever the user is running pg_stat_activity they can fetch it
>> directly from "proc->subxidStatus", along with fetching the proc.xid
>> and proc.xmin.  Does this make sense?
>
>
> Not sure about the overhead and performance aspects, but I like this solution 
> from a functional point of view.
>

+1, I too like the idea. The patch doesn't seem to be doing any heavy
lifting, I think that much overhead should be acceptable.

Regards,
Amul




Re: ssl_crl_file Certificate Revocation List doesn't work for postgresql 11

2021-12-05 Thread Kyotaro Horiguchi
At Fri, 3 Dec 2021 15:41:51 +0800, Yi Sun  wrote in 
> Hi Kyotaro,
> 
> Thank you for your explanation, after putting the crl file to client, it
> works now, thanks.

Good to hear that. That portion of the documentation has been fixed on
the repository, and it will be released in the next minor releases.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Max connections reached without max connections reached

2021-12-05 Thread James Sewell
+1, I too like the idea. The patch doesn't seem to be doing any heavy
> lifting, I think that much overhead should be acceptable.
>

I'm guessing this won't be back-patched? Is it possible to somehow read
this information from a C function?

- James

-- 
The contents of this email are confidential and may be subject to legal or 
professional privilege and copyright. No representation is made that this 
email is free of viruses or other defects. If you have received this 
communication in error, you may not copy or distribute any part of it or 
otherwise disclose its contents to anyone. Please advise the sender of your 
incorrect receipt of this correspondence.