> -----Original Message-----
> From: Justin Pryzby <[email protected]>
> Sent: Tuesday, March 1, 2022 14:27
> To: [email protected]
> Cc: [email protected]
> Subject: Re: An I/O error occurred while sending to the backend (PG 13.4)
>
> On Tue, Mar 01, 2022 at 04:28:31PM +0000, [email protected]
> wrote:
> > Now, there is an additional component I think... Storage is on an array
> > and I am not getting a clear answer as to where it is 😊 Is it possible
> > that something is happening at the storage layer? Could that be reported
> > as a network issue vs a storage issue for Postgres?
>
> No. If there were an error with storage, it'd be reported as a local
> error, and the query would fail, rather than failing with client-server
> communication.
>
> > Also, both machines are actually VMs. I forgot to mention that and not
> > sure if that's relevant.
>
> Are they running on the same hypervisor ? Is that hyperv ?
> Lacking other good hypotheses, that does seem relevant.
>
> --
> Justin
The issue happened again last night. I did implement your recommendations, but
they didn't seem to prevent it:
tcp_keepalives_idle = 9        # TCP_KEEPIDLE, in seconds;
                               # 0 selects the system default
tcp_keepalives_interval = 9    # TCP_KEEPINTVL, in seconds;
                               # 0 selects the system default
tcp_keepalives_count = 0       # TCP_KEEPCNT;
                               # 0 selects the system default
#tcp_user_timeout = 0          # TCP_USER_TIMEOUT, in milliseconds;
                               # 0 selects the system default
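Since those settings only affect the server side, one thing I may also try (a
sketch only, not yet verified against our Pentaho setup) is enabling keepalives
from the JDBC client as well: pgJDBC's tcpKeepAlive connection parameter is off
by default, and socketTimeout can bound how long the driver blocks on a dead
socket. The property values below are illustrative, not tuned:

```java
import java.util.Properties;

public class KeepaliveProps {
    // Build pgJDBC connection properties that enable SO_KEEPALIVE on the
    // driver's socket and bound socket reads. "tcpKeepAlive" and
    // "socketTimeout" are documented pgJDBC parameters; the values here
    // are illustrative assumptions, not recommendations.
    static Properties pgProps(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("tcpKeepAlive", "true"); // off by default in pgJDBC
        props.setProperty("socketTimeout", "300"); // read timeout, in seconds
        return props;
    }

    public static void main(String[] args) {
        Properties p = pgProps("postgres", "secret");
        System.out.println("tcpKeepAlive=" + p.getProperty("tcpKeepAlive"));
    }
}
```

These would be passed to DriverManager.getConnection(url, props), or set as URL
parameters, wherever the Pentaho connection is configured.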
On the client application, the exceptions are:
2022/03/03 01:04:56 - Upsert2.0 - ERROR (version 7.1.0.0-12, build 1 from 2017-05-16 17.18.02 by buildguy) : Unexpected error
2022/03/03 01:04:56 - Upsert2.0 - ERROR (version 7.1.0.0-12, build 1 from 2017-05-16 17.18.02 by buildguy) : org.pentaho.di.core.exception.KettleStepException:
2022/03/03 01:04:56 - Upsert2.0 - Error in step, asking everyone to stop because of:
2022/03/03 01:04:56 - Upsert2.0 -
2022/03/03 01:04:56 - Upsert2.0 - Error inserting/updating row
2022/03/03 01:04:56 - Upsert2.0 - An I/O error occurred while sending to the backend.
2022/03/03 01:04:56 - Upsert2.0 -
2022/03/03 01:04:56 - Upsert2.0 -
2022/03/03 01:04:56 - Upsert2.0 - at org.pentaho.di.trans.steps.insertupdate.InsertUpdate.processRow(InsertUpdate.java:313)
2022/03/03 01:04:56 - Upsert2.0 - at org.pentaho.di.trans.step.RunThread.run(RunThread.java:62)
2022/03/03 01:04:56 - Upsert2.0 - at java.lang.Thread.run(Thread.java:745)
2022/03/03 01:04:56 - Upsert2.0 - Caused by: org.pentaho.di.core.exception.KettleDatabaseException:
2022/03/03 01:04:56 - Upsert2.0 - Error inserting/updating row
2022/03/03 01:04:56 - Upsert2.0 - An I/O error occurred while sending to the backend.
2022/03/03 01:04:56 - Upsert2.0 -
2022/03/03 01:04:56 - Upsert2.0 - at org.pentaho.di.core.database.Database.insertRow(Database.java:1321)
2022/03/03 01:04:56 - Upsert2.0 - at org.pentaho.di.core.database.Database.insertRow(Database.java:1245)
2022/03/03 01:04:56 - Upsert2.0 - at org.pentaho.di.core.database.Database.insertRow(Database.java:1233)
2022/03/03 01:04:56 - Upsert2.0 - at org.pentaho.di.trans.steps.insertupdate.InsertUpdate.lookupValues(InsertUpdate.java:163)
2022/03/03 01:04:56 - Upsert2.0 - at org.pentaho.di.trans.steps.insertupdate.InsertUpdate.processRow(InsertUpdate.java:299)
2022/03/03 01:04:56 - Upsert2.0 - ... 2 more
2022/03/03 01:04:56 - Upsert2.0 - Caused by: org.postgresql.util.PSQLException: An I/O error occurred while sending to the backend.
2022/03/03 01:04:56 - Upsert2.0 - at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:382)
2022/03/03 01:04:56 - Upsert2.0 - at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:490)
2022/03/03 01:04:56 - Upsert2.0 - at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:408)
2022/03/03 01:04:56 - Upsert2.0 - at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:166)
2022/03/03 01:04:56 - Upsert2.0 - at org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:134)
2022/03/03 01:04:56 - Upsert2.0 - at org.pentaho.di.core.database.Database.insertRow(Database.java:1288)
2022/03/03 01:04:56 - Upsert2.0 - ... 6 more
2022/03/03 01:04:56 - Upsert2.0 - Caused by: java.net.SocketException: Connection reset
2022/03/03 01:04:56 - Upsert2.0 - at java.net.SocketInputStream.read(SocketInputStream.java:209)
2022/03/03 01:04:56 - Upsert2.0 - at java.net.SocketInputStream.read(SocketInputStream.java:141)
2022/03/03 01:04:56 - Upsert2.0 - at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:161)
2022/03/03 01:04:56 - Upsert2.0 - at org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:128)
2022/03/03 01:04:56 - Upsert2.0 - at org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:113)
2022/03/03 01:04:56 - Upsert2.0 - at org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:73)
2022/03/03 01:04:56 - Upsert2.0 - at org.postgresql.core.PGStream.receiveChar(PGStream.java:453)
2022/03/03 01:04:56 - Upsert2.0 - at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2119)
2022/03/03 01:04:56 - Upsert2.0 - at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:355)
2022/03/03 01:04:56 - Upsert2.0 - ... 11 more
On the DB:
2022-03-03 01:04:40 EST [21228] LOG: could not receive data from client: An existing connection was forcibly closed by the remote host.
2022-03-03 01:04:40 EST [21228] LOG: unexpected EOF on client connection with an open transaction
2022-03-03 01:04:40 EST [21228] LOG: disconnection: session time: 0:02:07.570 user=postgres database=Pepper host=10.64.17.211 port=63686
2022-03-03 01:04:41 EST [21160] LOG: could not receive data from client: An existing connection was forcibly closed by the remote host.
2022-03-03 01:04:41 EST [21160] LOG: unexpected EOF on client connection with an open transaction
2022-03-03 01:04:41 EST [21160] LOG: disconnection: session time: 0:02:07.730 user=postgres database=Pepper host=10.64.17.211 port=63688
I don't know if this is meaningful, but I see a 15-second delay between the
timestamps on the database and on the application, even though the servers'
clocks are properly synchronized. I have asked the IT team to look at the VMs
and see if anything strange is happening. They are not too happy about
installing Wireshark for deeper analysis, given the "complexity of the tools
and size of the logs" 😊 I keep on pushing.
Thank you,
Laurent.