Re: client waits for end of update operation and server proc is idle

2021-04-25 Thread Matthias Apitz
El día viernes, abril 23, 2021 a las 11:32:02a. m. +0200, Matthias Apitz 
escribió:

> El día viernes, abril 23, 2021 a las 10:48:24a. m. +0200, Laurenz Albe 
> escribió:
> 
> > > The serverlog has around this time (sorry for German):
> > > 
> > > 2021-04-23 05:55:23.591 CEST [2317] LOG:  unvollständige Message vom 
> > > Client
> > > 2021-04-23 05:55:23.593 CEST [2317] FEHLER:  Speicher aufgebraucht
> > > 2021-04-23 05:55:23.593 CEST [2317] DETAIL:  Kann Zeichenkettenpuffer mit 
> > > 0 Bytes nicht um 1380986158 Bytes vergrößern.
> > > 2021-04-23 05:55:23.593 CEST [2317] LOG:  konnte Daten nicht an den 
> > > Client senden: Datenübergabe unterbrochen (broken pipe)
> > > 2021-04-23 05:55:23.593 CEST [2317] FATAL:  Verbindung zum Client wurde 
> > > verloren
> > > 2021-04-23 05:57:20.339 CEST [27190] LOG:  unerwartetes EOF auf 
> > > Client-Verbindung mit einer offenen Transaktion
> > > 2021-04-23 05:59:17.419 CEST [26827] LOG:  unerwartetes EOF auf 
> > > Client-Verbindung mit einer offenen Transaktion
> > > 
> > > What else could be checked for this? Any hints about this issue?
> > 
> > It seems pretty clear that both the client and the server are waiting for 
> > each other.
> 
> Yes, they are both alive as processes. Both are running on the same
> Linux server, i.e. no real network between. And any loss of data would
> have caused perhaps the end of one (and then the other) process.
> I will try to gather a TCPDUMP for this.
> 

I hit the same situation again. The ESQL/C client sent an UPDATE to the
PostgreSQL server:

# gdb /opt/lib/sisis/catserver/bin/INDEX 31864
...
#0  0x7f688bb955c4 in poll () from /lib64/libc.so.6
#1  0x7f688a82622f in pqSocketPoll () from 
/usr/local/sisis-pap/pgsql/lib/libpq.so.5
#2  0x7f688a8260ff in pqSocketCheck () from 
/usr/local/sisis-pap/pgsql/lib/libpq.so.5
#3  0x7f688a825f83 in pqWaitTimed () from 
/usr/local/sisis-pap/pgsql/lib/libpq.so.5
#4  0x7f688a825f55 in pqWait () from 
/usr/local/sisis-pap/pgsql/lib/libpq.so.5
#5  0x7f688a821d3a in PQgetResult () from 
/usr/local/sisis-pap/pgsql/lib/libpq.so.5
#6  0x7f688a822478 in PQexecFinish () from 
/usr/local/sisis-pap/pgsql/lib/libpq.so.5
#7  0x7f688a82212e in PQexecParams () from 
/usr/local/sisis-pap/pgsql/lib/libpq.so.5
#8  0x7f688b792216 in ecpg_execute ()
   from /usr/local/sisis-pap/pgsql/lib/libecpg.so.6
#9  0x7f688b7935de in ecpg_do () from 
/usr/local/sisis-pap/pgsql/lib/libecpg.so.6
#10 0x7f688b7936ef in ECPGdo () from 
/usr/local/sisis-pap/pgsql/lib/libecpg.so.6
#11 0x7f688dd831c6 in update_record () at per_kette.pgc:1968
#12 0x7f688dd7ed16 in per_kette (zugriff=10, scroll=-2, lock=-2, 
key=-2, 
sto=-2, p_daten=0x7ffef0499950, sel_anw=0x0, where_anw=0x0, 
p_btw_daten=0x0, 
order_by=0x0, auf_ab=0x0, group_by=0x0, having=0x0, into_temp=0x0, 
count=0x7ffef0498670) at per_kette.pgc:497
#13 0x7f688dcf770f in DB_rewr (tabmodul=0x7f688dd7e7ca , 
p_daten=0x7ffef0499950) at dbcall.pgc:3274
#14 0x7f688ea58821 in BKDeskInsert (setid=FSTAB_Personen, kette=..., 
katkey=168952979) at ./BKDesk.c:1447


and the Pos server sits idle:

# select * from pg_stat_activity where client_port = 52288;

 datid  | datname |  pid  | leader_pid | usesysid | usename |
application_name | client_addr | client_hostname | client_port |
  backend_start  |   xact_start|   
query_start   |  state_change   | wait_event_type | 
wait_event |state| backend_xid | backend_xmin |query
|  backend_type  
+-+---++--+-+-+-+-+-+-+-+-+-+-++-+-+--+-+
 951941 | sisis   | 31894 ||16384 | sisis   | SunRise DBCALL 
V7.2 (pid=31864) | 127.0.0.1   | |   52288 | 25.04.2021 
10:57:15.922137 CEST | 25.04.2021 10:57:15.968081 CEST | 25.04.2021 
10:57:16.051256 CEST | 25.04.2021 10:57:16.051273 CEST | Client  | 
ClientRead | idle in transaction |18314796 | 18314796 | fetch 
per_kette_seq | client backend


But this time I was collecting all TCP traffic of all clients to and
from the port 5432. This TCPDUMP command writes all traffic as binary
data to files of 100.000.000 bytes in size (-C 100) and appends a number
when a new file is needed:

# /usr/sbin/tcpdump -n -i lo -C 100 -w /var/spool/sisis/tcp5432. port 5432 &

Later one can look into the files for the local port 52288 of the
connection with:

# tcpdump -n -r tcp5432.2 -Xx -s1024 port 52288 > INDEX.52288.txt


10:57:15.921783 IP 127.0.0.1.52288 > 127.0.0.1.5432: Flags [S], seq 3652783347, 
win 65495, options

impact of version upgrade on fts

2021-04-25 Thread Malik Rumi
Greetings.

I am about to do a long overdue upgrade to the latest version, which I
believe is 13.2. However, I have full text search in my current install
(9.4) and I am wondering if there are any special provisions I need to take
to make sure that is not lost in the transition?  If this would be true for
any other packages, please advise and or point me to the place in the docs
where I can read up on this. I looked and did not see anything, which just
proves I didn't know what to look for. Thanks.



*“None of you has faith until he loves for his brother or his neighbor what
he loves for himself.”*


Re: impact of version upgrade on fts

2021-04-25 Thread Adrian Klaver

On 4/25/21 5:28 AM, Malik Rumi wrote:

Greetings.

I am about to do a long overdue upgrade to the latest version, which I 
believe is 13.2. However, I have full text search in my current install 
(9.4) and I am wondering if there are any special provisions I need to 
take to make sure that is not lost in the transition?  If this would be 
true for any other packages, please advise and or point me to the place 
in the docs where I can read up on this. I looked and did not see 
anything, which just proves I didn't know what to look for. Thanks.


FTS used to be a separate extension(tsearch2) before version Postgres 
version 8.3. Since then it has been integrated into the core code, so it 
would not be lost. That being said it would be advisable to read the 
release notes for 9.5 --> 13 to see what changed in the core code.






*/“None of you has faith until he loves for his brother or his neighbor 
what he loves for himself.”/*



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: client waits for end of update operation and server proc is idle

2021-04-25 Thread Matthias Apitz
El día domingo, abril 25, 2021 a las 01:54:49p. m. +0200, Matthias Apitz 
escribió:

>From the collected data, I tend to say: this must be a bug in the Pos
> server...
> 

At the end of the day, it turned out that out client caused the problem.

Because we were hunting some other issue, the client was logging some
message with fprintf(stderr, ...) but the fd=2 was not stderr, fd=2 was the
connection to the socket of the PostgreSQL server. And ofc the Pos
server could not understand our human message.

Problem solved. Sorry for all the noise.

matthias
-- 
Matthias Apitz, ✉ g...@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub




Re: client waits for end of update operation and server proc is idle

2021-04-25 Thread Tom Lane
Matthias Apitz  writes:
> At the end of the day, it turned out that out client caused the problem.
> Because we were hunting some other issue, the client was logging some
> message with fprintf(stderr, ...) but the fd=2 was not stderr, fd=2 was the
> connection to the socket of the PostgreSQL server. And ofc the Pos
> server could not understand our human message.

Hmph.  So why wasn't this visible in the tcpdump trace?

My initial thought was that the server process should log a message
and drop the connection once it received something that didn't look
like a valid Postgres-protocol message, on the assumption that
message-boundary sync could never be recovered reliably.  But on
further thought, it would start by taking the second through fifth
bytes of the input as a message length, and it wouldn't really do
anything interesting until it had collected that much input ---
and, if those bytes were text, that would look like a large number.
So what the server is really doing is waiting for the rest of the
imaginary message, which will never arrive.

Maybe we should have more sanity checking on alleged message lengths.
I recall there is some checking of that sort in libpq, but I'm less
sure about the server.

regards, tom lane




Re: client waits for end of update operation and server proc is idle

2021-04-25 Thread Matthias Apitz
El día domingo, abril 25, 2021 a las 11:51:45a. m. -0400, Tom Lane escribió:

> Matthias Apitz  writes:
> > At the end of the day, it turned out that out client caused the problem.
> > Because we were hunting some other issue, the client was logging some
> > message with fprintf(stderr, ...) but the fd=2 was not stderr, fd=2 was the
> > connection to the socket of the PostgreSQL server. And ofc the Pos
> > server could not understand our human message.
> 
> Hmph.  So why wasn't this visible in the tcpdump trace?

It was visible. That's why I detected it while looking up the TCP
packages before the UPDATE:

10:57:16.051326 IP 127.0.0.1.52288 > 127.0.0.1.5432: Flags [P.], seq 
14280:14407, ack 120304, win 512, options [nop,nop,TS val 3424445708 ecr 
3424445708], length 127
0x:  4500 00b3 b84e 4000 4006 83f4 7f00 0001  EN@.@...
0x0010:  7f00 0001 cc40 1538 d9b9 44bb fcd1 afb7  .@.8..D.
0x0020:  8018 0200 fea7  0101 080a cc1c e50c  
0x0030:  cc1c e50c 5352 502d 3236 3937 363a 2072  SRP-26976:.r
0x0040:  6573 746f 7265 6420 7374 7275 6374 206b  estored.struct.k
0x0050:  6574 7465 7361 747a 2066 6f72 2073 6574  ettesatz.for.set
0x0060:  6964 3a32 2c20 6465 736b 3a33 3831 3832  id:2,.desk:38182
0x0070:  3038 302c 2066 656c 646e 723a 362c 206b  080,.feldnr:6,.k
0x0080:  6174 6b65 793a 3136 3839 3532 3937 392c  atkey:168952979,
0x0090:  206b 6574 7465 7361 747a 2e66 7265 693a  .kettesatz.frei:
0x00a0:  302c 206b 6574 7465 7361 747a 2e61 6e7a  0,.kettesatz.anz
0x00b0:  3235 0a  25.
10:57:16.051388 IP 127.0.0.1.52288 > 127.0.0.1.5432: Flags [P.], seq 
14407:14812, ack 120304, win 512, options [nop,nop,TS val 3424445708 ecr 
3424445708], length 405
0x:  4500 01c9 b84f 4000 4006 82dd 7f00 0001  EO@.@...
0x0010:  7f00 0001 cc40 1538 d9b9 453a fcd1 afb7  .@.8..E:
0x0020:  8018 0200 ffbd  0101 080a cc1c e50c  
0x0030:  cc1c e50c 5000  7e00 7570 6461 7465  P...~.update
0x0040:  2070 6572 5f6b 6574 7465 2073 6574 2064  .per_kette.set.d
0x0050:  6573 6b6e 7220 3d20 2431 2020 2c20 6672  esknr.=.$1..,.fr

> 
> My initial thought was that the server process should log a message
> and drop the connection once it received something that didn't look
> like a valid Postgres-protocol message, on the assumption that
> message-boundary sync could never be recovered reliably.  But on
> further thought, it would start by taking the second through fifth
> bytes of the input as a message length, and it wouldn't really do
> anything interesting until it had collected that much input ---
> and, if those bytes were text, that would look like a large number.
> So what the server is really doing is waiting for the rest of the
> imaginary message, which will never arrive.
> 
> Maybe we should have more sanity checking on alleged message lengths.
> I recall there is some checking of that sort in libpq, but I'm less
> sure about the server.

Yes, this would make sense.

matthias

-- 
Matthias Apitz, ✉ g...@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub




Re: client waits for end of update operation and server proc is idle

2021-04-25 Thread Tom Lane
Matthias Apitz  writes:
> El día domingo, abril 25, 2021 a las 11:51:45a. m. -0400, Tom Lane escribió:
>> Hmph.  So why wasn't this visible in the tcpdump trace?

> It was visible. That's why I detected it while looking up the TCP
> packages before the UPDATE:

> 10:57:16.051326 IP 127.0.0.1.52288 > 127.0.0.1.5432: Flags [P.], seq 
> 14280:14407, ack 120304, win 512, options [nop,nop,TS val 3424445708 ecr 
> 3424445708], length 127
> 0x:  4500 00b3 b84e 4000 4006 83f4 7f00 0001  EN@.@...
> 0x0010:  7f00 0001 cc40 1538 d9b9 44bb fcd1 afb7  .@.8..D.
> 0x0020:  8018 0200 fea7  0101 080a cc1c e50c  
> 0x0030:  cc1c e50c 5352 502d 3236 3937 363a 2072  SRP-26976:.r
> 0x0040:  6573 746f 7265 6420 7374 7275 6374 206b  estored.struct.k
> 0x0050:  6574 7465 7361 747a 2066 6f72 2073 6574  ettesatz.for.set
> 0x0060:  6964 3a32 2c20 6465 736b 3a33 3831 3832  id:2,.desk:38182
> 0x0070:  3038 302c 2066 656c 646e 723a 362c 206b  080,.feldnr:6,.k
> 0x0080:  6174 6b65 793a 3136 3839 3532 3937 392c  atkey:168952979,
> 0x0090:  206b 6574 7465 7361 747a 2e66 7265 693a  .kettesatz.frei:
> 0x00a0:  302c 206b 6574 7465 7361 747a 2e61 6e7a  0,.kettesatz.anz
> 0x00b0:  3235 0a  25.

Ah.  On looking at the server code, I see there *is* some validation
on the message type code --- but here, the first byte is 'S' which
happens to be a legal message type, so you got past that check.
Still, the apparent message length is then "RP-2" (0x52502d32)
which is bigger than 1GB so I don't quite understand why you didn't
get an "out of memory" error.

In any case, we could clearly do with some more sanity checking than
is there now.  I'll post a patch for that in a bit.

regards, tom lane




Re: client waits for end of update operation and server proc is idle

2021-04-25 Thread Karsten Hilbert
Am Sun, Apr 25, 2021 at 01:21:25PM -0400 schrieb Tom Lane:

> > 10:57:16.051326 IP 127.0.0.1.52288 > 127.0.0.1.5432: Flags [P.], seq 
> > 14280:14407, ack 120304, win 512, options [nop,nop,TS val 3424445708 ecr 
> > 3424445708], length 127
> > 0x:  4500 00b3 b84e 4000 4006 83f4 7f00 0001  EN@.@...
> > 0x0010:  7f00 0001 cc40 1538 d9b9 44bb fcd1 afb7  .@.8..D.
> > 0x0020:  8018 0200 fea7  0101 080a cc1c e50c  
> > 0x0030:  cc1c e50c 5352 502d 3236 3937 363a 2072  SRP-26976:.r
> > 0x0040:  6573 746f 7265 6420 7374 7275 6374 206b  estored.struct.k
> > 0x0050:  6574 7465 7361 747a 2066 6f72 2073 6574  ettesatz.for.set
> > 0x0060:  6964 3a32 2c20 6465 736b 3a33 3831 3832  id:2,.desk:38182
> > 0x0070:  3038 302c 2066 656c 646e 723a 362c 206b  080,.feldnr:6,.k
> > 0x0080:  6174 6b65 793a 3136 3839 3532 3937 392c  atkey:168952979,
> > 0x0090:  206b 6574 7465 7361 747a 2e66 7265 693a  .kettesatz.frei:
> > 0x00a0:  302c 206b 6574 7465 7361 747a 2e61 6e7a  0,.kettesatz.anz
> > 0x00b0:  3235 0a  25.
>
> Ah.  On looking at the server code, I see there *is* some validation
> on the message type code --- but here, the first byte is 'S' which
> happens to be a legal message type, so you got past that check.
> Still, the apparent message length is then "RP-2" (0x52502d32)
> which is bigger than 1GB so I don't quite understand why you didn't
> get an "out of memory" error.

But he did, didn't he ?

> > > The serverlog has around this time (sorry for German):
> > >
> > > 2021-04-23 05:55:23.591 CEST [2317] LOG:  unvollständige Message vom 
> > > Client

"incomplete message from client"

> > > 2021-04-23 05:55:23.593 CEST [2317] FEHLER:  Speicher aufgebraucht

"memory exhausted"

> > > 2021-04-23 05:55:23.593 CEST [2317] DETAIL:  Kann Zeichenkettenpuffer mit 
> > > 0 Bytes nicht um 1380986158 Bytes vergrößern.

"cannot extend 0-byte string buffer by 1380986158 bytes"

> > > 2021-04-23 05:55:23.593 CEST [2317] LOG:  konnte Daten nicht an den 
> > > Client senden: Datenübergabe unterbrochen (broken pipe)

Karsten
--
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B




Re: client waits for end of update operation and server proc is idle

2021-04-25 Thread Tom Lane
Karsten Hilbert  writes:
> Am Sun, Apr 25, 2021 at 01:21:25PM -0400 schrieb Tom Lane:
>> Still, the apparent message length is then "RP-2" (0x52502d32)
>> which is bigger than 1GB so I don't quite understand why you didn't
>> get an "out of memory" error.

> But he did, didn't he ?

Those log messages do appear to match the theory, but in the case at
hand, the server seems to have hung up without throwing any error.

However, we'd reproduce that if the apparent "message length" was
less than 1GB (assuming the server process could malloc that much
memory).  That'd occur for any second byte less than 0x40, such
as an ASCII digit.

regards, tom lane




Re: impact of version upgrade on fts

2021-04-25 Thread Robert Treat
On Sun, Apr 25, 2021 at 11:27 AM Adrian Klaver
 wrote:
>
> On 4/25/21 5:28 AM, Malik Rumi wrote:
> > Greetings.
> >
> > I am about to do a long overdue upgrade to the latest version, which I
> > believe is 13.2. However, I have full text search in my current install
> > (9.4) and I am wondering if there are any special provisions I need to
> > take to make sure that is not lost in the transition?  If this would be
> > true for any other packages, please advise and or point me to the place
> > in the docs where I can read up on this. I looked and did not see
> > anything, which just proves I didn't know what to look for. Thanks.
>
> FTS used to be a separate extension(tsearch2) before version Postgres
> version 8.3. Since then it has been integrated into the core code, so it
> would not be lost. That being said it would be advisable to read the
> release notes for 9.5 --> 13 to see what changed in the core code.
>

You should also check the release notes / process of any "non-core"
extensions you might be using, for example PostGIS has had a number of
changes and you'll need to upgrade the extension itself to work in the
new version of Postgres. Specifics around that will also depend on how
you instead to run your upgrade process.


Robert Treat
https://xzilla.net




Re: client waits for end of update operation and server proc is idle

2021-04-25 Thread Matthias Apitz
El día domingo, abril 25, 2021 a las 08:05:21p. m. +0200, Karsten Hilbert 
escribió:

> > > > The serverlog has around this time (sorry for German):
> > > >
> > > > 2021-04-23 05:55:23.591 CEST [2317] LOG:  unvollständige Message vom 
> > > > Client
> 
>   "incomplete message from client"
> 
> > > > 2021-04-23 05:55:23.593 CEST [2317] FEHLER:  Speicher aufgebraucht
> 
>   "memory exhausted"
> 
> > > > 2021-04-23 05:55:23.593 CEST [2317] DETAIL:  Kann Zeichenkettenpuffer 
> > > > mit 0 Bytes nicht um 1380986158 Bytes vergrößern.
> 
>   "cannot extend 0-byte string buffer by 1380986158 bytes"
> 
> > > > 2021-04-23 05:55:23.593 CEST [2317] LOG:  konnte Daten nicht an den 
> > > > Client senden: Datenübergabe unterbrochen (broken pipe)
> 

These messages at 05:55:xx are caused by our automatic deployment of the
software every day which starts at 05:55:00 by cron and short after installing 
all
software all servers (the PostgreSQL clients) get stopped (i.e. killed)
and restarted. The hung appears a few minutes later at 05:57:xx.

matthias
-- 
Matthias Apitz, ✉ g...@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub




Re: client waits for end of update operation and server proc is idle

2021-04-25 Thread Matthias Apitz
El día domingo, abril 25, 2021 a las 01:54:49p. m. +0200, Matthias Apitz 
escribió:

...

> # select * from pg_stat_activity where client_port = 52288;
> 
>  datid  | datname |  pid  | leader_pid | usesysid | usename |
> application_name | client_addr | client_hostname | client_port |  
> backend_start  |   xact_start|   
> query_start   |  state_change   | wait_event_type | 
> wait_event |state| backend_xid | backend_xmin |query  
>   |  backend_type  
> +-+---++--+-+-+-+-+-+-+-+-+-+-++-+-+--+-+
>  951941 | sisis   | 31894 ||16384 | sisis   | SunRise DBCALL 
> V7.2 (pid=31864) | 127.0.0.1   | |   52288 | 25.04.2021 
> 10:57:15.922137 CEST | 25.04.2021 10:57:15.968081 CEST | 25.04.2021 
> 10:57:16.051256 CEST | 25.04.2021 10:57:16.051273 CEST | Client  | 
> ClientRead | idle in transaction |18314796 | 18314796 | fetch 
> per_kette_seq | client backend

Having solved this issue, one last question: Is there any good way for a
C-written or ESQL/C-written client to get the local port number or even
the PID of the PostgreSQL server process on the other end of the
connection for logging/tracing purpose?

Thanks

matthias
-- 
Matthias Apitz, ✉ g...@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub
¡Con Cuba no te metas!  «»  Don't mess with Cuba!  «»  Leg Dich nicht mit Kuba 
an!
http://www.cubadebate.cu/noticias/2020/12/25/en-video-con-cuba-no-te-metas/




Re: client waits for end of update operation and server proc is idle

2021-04-25 Thread Matthias Apitz
El día lunes, abril 26, 2021 a las 07:38:40a. m. +0200, Matthias Apitz escribió:

> Having solved this issue, one last question: Is there any good way for a
> C-written or ESQL/C-written client to get the local port number or even
> the PID of the PostgreSQL server process on the other end of the
> connection for logging/tracing purpose?

Ah, I found it as

int PQbackendPID(const PGconn *conn);

One should read firt the fine documentatio befor asking stupid
questions: https://www.postgresql.org/docs/11/libpq-status.html

matthias

-- 
Matthias Apitz, ✉ g...@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub
¡Con Cuba no te metas!  «»  Don't mess with Cuba!  «»  Leg Dich nicht mit Kuba 
an!
http://www.cubadebate.cu/noticias/2020/12/25/en-video-con-cuba-no-te-metas/