restore and recovery using WAL: unknown messages in serverlog
Hello, I'm testing WAL-based backup and restore/recovery and encountered some unclear messages in the 'serverlog' on recovery. The procedure used on a test server is:

Kick off the copy of the current WAL to the archive:

$ psql -Usisis -c "select pg_switch_wal();"
$ ls -l /data/postgresql11/wal_archive
insgesamt 49212
-rw--- 1 postgres postgres 16777216  3. Sep 07:53 00010001008B
-rw--- 1 postgres postgres 16777216  3. Sep 08:52 00010001008C
-rw--- 1 postgres postgres 16777216  3. Sep 08:54 00010001008D

Now we back up the cluster:

$ pg_basebackup -Usisis -Ft -D /data/postgresql11/backup-20190903-1
$ ls -l /data/postgresql11/backup-20190903-1
insgesamt 6379972
-rw--- 1 postgres postgres 6509913088  3. Sep 09:02 base.tar
-rw--- 1 postgres postgres   16779264  3. Sep 09:02 pg_wal.tar
$ ls -l /data/postgresql11/wal_archive/
insgesamt 65620
-rw--- 1 postgres postgres 16777216  3. Sep 07:53 00010001008B
-rw--- 1 postgres postgres 16777216  3. Sep 08:52 00010001008C
-rw--- 1 postgres postgres 16777216  3. Sep 08:54 00010001008D
-rw--- 1 postgres postgres 16777216  3. Sep 09:02 00010001008E
-rw--- 1 postgres postgres      342  3. Sep 09:02 00010001008E.0028.backup

We stop and "destroy" the cluster (i.e. we move it away):

# /etc/init.d/postgres stop
# mv /data/postgresql11/data /data/postgresql11/data.away
# mkdir /data/postgresql11/data
# chown postgres:postgres /data/postgresql11/data
# chmod 0700 /data/postgresql11/data

We restore the cluster as user 'postgres' and recover based on the WAL:

$ tar xf /data/postgresql11/backup-20190903-1/base.tar -C /data/postgresql11/data/
$ tar xf /data/postgresql11/backup-20190903-1/pg_wal.tar -C /data/postgresql11/data/pg_wal/

For recovery we create a file recovery.conf:

$ vim /data/postgresql11/data/recovery.conf
restore_command = 'cp /data/postgresql11/wal_archive/%f %p'

Now start PostgreSQL again:

# /etc/init.d/postgres start

In the 'serverlog' there are some errors (marked with ***>); sorry that the messages are in German, it means 'cp: stat failed for ..., file or directory not found':

2019-09-03 09:18:44.355 CEST [25388] LOG: starte Wiederherstellung aus Archiv
2019-09-03 09:18:45.249 CEST [25388] LOG: Logdatei »00010001008E« aus Archiv wiederhergestellt
2019-09-03 09:18:45.306 CEST [25388] LOG: Redo beginnt bei 1/8E28
2019-09-03 09:18:45.308 CEST [25388] LOG: konsistenter Wiederherstellungszustand erreicht bei 1/8E000130
2019-09-03 09:18:45.308 CEST [25387] LOG: Datenbanksystem ist bereit, um lesende Verbindungen anzunehmen
2019-09-03 09:18:46.024 CEST [25388] LOG: Logdatei »00010001008F« aus Archiv wiederhergestellt
***> cp: der Aufruf von stat für '/data/postgresql11/wal_archive/000100010090' ist nicht möglich: Datei oder Verzeichnis nicht gefunden
2019-09-03 09:18:47.838 CEST [25388] LOG: Redo fertig bei 1/8F000140
2019-09-03 09:18:47.869 CEST [25388] LOG: Logdatei »00010001008F« aus Archiv wiederhergestellt
***> cp: der Aufruf von stat für '/data/postgresql11/wal_archive/0002.history' ist nicht möglich: Datei oder Verzeichnis nicht gefunden
2019-09-03 09:18:47.900 CEST [25388] LOG: gewählte neue Zeitleisten-ID: 2
2019-09-03 09:18:47.961 CEST [25388] LOG: Wiederherstellung aus Archiv abgeschlossen
***> cp: der Aufruf von stat für '/data/postgresql11/wal_archive/0001.history' ist nicht möglich: Datei oder Verzeichnis nicht gefunden
2019-09-03 09:18:48.072 CEST [25387] LOG: Datenbanksystem ist bereit, um Verbindungen anzunehmen

What is causing the errors about the missing files in /data/postgresql11/wal_archive?
Btw: Is there a way to run the server with a German UTF-8 locale, but with English messages in the log?

Thanks

matthias

--
Matthias Apitz, ✉ g...@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub
Re: implicit transaction changes trigger behaviour
Thank you so much, the "stable" thing was it. I'm not sure if it is under-documented; I clearly didn't adhere to the rule that a stable function "is guaranteed to return the same results given the same arguments for all rows within a single statement". BTW, in my example I made a mistake too, but that was beside the point really.

Cheers, Willy-Bas

On Thu, Aug 29, 2019 at 3:35 PM Tom Lane wrote:
> Willy-Bas Loos writes:
> > I currently have a fairly complex use case to solve and one thing i tried
> > was a deferred constraint trigger. I'm not sure if this solution is the way
> > to go, but anyway: As i was testing my code, i noticed that the trigger
> > behaves differently depending on whether or not i explicitly use BEGIN and
> > COMMIT, even though there is only 1 query in the transaction.
> > I am wondering if this is a bug in postgresql?
>
> I think the issue is that you marked the trigger as STABLE. That causes
> it to use the calling query's snapshot so it doesn't see the updates,
> if it's fired during the delete query and not during the subsequent
> COMMIT. If I remove the STABLE label then it works as you expect.
>
> This is probably under-documented but I'm not sure that it should be
> considered a bug.
>
> The trigger seems a bit broken besides that, in that the comments claim it
> has something to do with the OLD row's id field(s) but the query is not in
> fact taking that into account.
>
> regards, tom lane

--
Willy-Bas Loos
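For the archives, a rough sketch of the shape Tom describes; all table, column and function names below are invented for illustration and are not taken from the original code. Leaving the trigger function at its default VOLATILE volatility means its queries take a fresh snapshot, so a deferred constraint trigger can see the rows changed by the statement that queued it:

CREATE OR REPLACE FUNCTION check_no_orphans() RETURNS trigger AS $$
BEGIN
    -- Runs with a fresh snapshot because the function is VOLATILE (the default),
    -- so it sees the rows deleted by the statement that fired this trigger.
    IF EXISTS (SELECT 1 FROM child c
               WHERE NOT EXISTS (SELECT 1 FROM parent p WHERE p.id = c.parent_id)) THEN
        RAISE EXCEPTION 'orphaned child rows';
    END IF;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql VOLATILE;  -- not STABLE

CREATE CONSTRAINT TRIGGER parent_delete_check
    AFTER DELETE ON parent
    DEFERRABLE INITIALLY DEFERRED
    FOR EACH ROW EXECUTE PROCEDURE check_no_orphans();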
Re: restore and recovery using WAL: unknown messages in serverlog
On Tue, Sep 3, 2019 at 9:57 AM Matthias Apitz wrote:
> 2019-09-03 09:18:46.024 CEST [25388] LOG: Logdatei »00010001008F« aus Archiv wiederhergestellt
> ***> cp: der Aufruf von stat für '/data/postgresql11/wal_archive/000100010090' ist nicht möglich: Datei oder Verzeichnis nicht gefunden

It's not an error: it is looking for the next WAL segment after 0x8F, which would be 0x90, but that file is not there, so the recovery has finished. The cluster does not know where to stop because your recovery.conf did not specify a target, so it processes as much WAL as it can and stops when it finds no more.

> 2019-09-03 09:18:47.838 CEST [25388] LOG: Redo fertig bei 1/8F000140

Redo is done.

> ***> cp: der Aufruf von stat für '/data/postgresql11/wal_archive/0002.history' ist nicht möglich: Datei oder Verzeichnis nicht gefunden

Having finished the WALs, it is looking for a branch on timeline 2,

> 2019-09-03 09:18:47.900 CEST [25388] LOG: gewählte neue Zeitleisten-ID: 2

and it switches to timeline 2 autonomously. Pretty much that should be what you see in the logs.

> Btw: Is there a way to run the server with a German UTF-8 locale, but with English messages in the log?

Set lc_messages in postgresql.conf to something in English; I think 'en_US.UTF-8' could do the trick, but I haven't tried it.

Luca
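For completeness, an untested sketch of the two settings mentioned above. The paths are taken from the original mail, and the target timestamp is purely an example value; on PG 11 and earlier the recovery settings still live in recovery.conf:

# recovery.conf (PostgreSQL 11 and earlier)
restore_command = 'cp /data/postgresql11/wal_archive/%f %p'
recovery_target_time = '2019-09-03 09:00:00 CEST'   # example value: stop replay here instead of at end of archive
recovery_target_action = 'promote'

# postgresql.conf
lc_messages = 'en_US.UTF-8'   # server messages in English, data locale unchanged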
Re: pgmodeler ?
On Sun, 1 Sep 2019 at 13:45, Thiemo Kellner wrote:
> Quoting Olivier Gautherot:
> > This is the specific error message:
> >> Could not execute the SQL command.
> >> Message returned: ERROR: column pr.proisagg does not exist
> >> LINE 1: ...namespace AS ns ON pr.pronamespace = ns.oid WHERE pr.proisag...
> >> ^
> >> HINT: Perhaps you meant to reference the column
> >>
> >> --
> >> "They that would give up essential liberty for temporary safety deserve
> >> neither liberty nor safety."
> >> -- Benjamin Franklin
> >
> > Can you confirm the versions of pgModeler and PG? Pgmodeler 0.91 does not
> > fully support PG 11.x and there is an update in preparation in 0.92 which
> > should. You may want to give it a try.
> >
> > Actually, this is a known problem:
> > https://github.com/pgmodeler/pgmodeler/issues/1281
> >
> > Maybe you want to give the beta a shot: https://pgmodeler.io/download

I built the 0.92 beta and have been using it quite successfully against pg11 for a while.
Slow statement using parallelism after 9.6>11 upgrade
Hi list

We have upgraded our database from 9.6 to 11 (and updated PostGIS from 2.3 to 2.5 as well). We are using it among other things to store an OpenStreetMap database, populated by the osm2pgsql program and updated on a daily basis.

The process used to take about 1h30 before the upgrade, but after the upgrade it looks like it will never run to completion in a 24h time frame.

The problem is apparently that the planner chooses to run some statements in parallel, but these take a lot longer than when run without it. Here are the two EXPLAIN ANALYZE outputs of the same statement, with parallelism on and off:

Statement:
PREPARE mark_ways_by_node(int8) AS select id from planet_osm_ways WHERE nodes && ARRAY[$1];
EXPLAIN EXECUTE mark_ways_by_node(1);

max_parallel_workers_per_gather = 2 :

Gather  (cost=12545.61..2357352.25 rows=1420982 width=8) (actual time=2.577..64.028 rows=1 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  ->  Parallel Bitmap Heap Scan on planet_osm_ways  (cost=11545.61..2214254.05 rows=592076 width=8) (actual time=0.213..0.213 rows=0 loops=3)
        Recheck Cond: (nodes && '{1}'::bigint[])
        Heap Blocks: exact=1
        ->  Bitmap Index Scan on planet_osm_ways_nodes_idx  (cost=0.00..11190.36 rows=1420982 width=0) (actual time=0.173..0.174 rows=1 loops=1)
              Index Cond: (nodes && '{1}'::bigint[])
Planning Time: 8.596 ms
Execution Time: 64.135 ms

max_parallel_workers_per_gather = 0 :

Bitmap Heap Scan on planet_osm_ways  (cost=11545.61..3462154.46 rows=1420982 width=8) (actual time=0.677..0.679 rows=1 loops=1)
  Recheck Cond: (nodes && '{1}'::bigint[])
  Heap Blocks: exact=1
  ->  Bitmap Index Scan on planet_osm_ways_nodes_idx  (cost=0.00..11190.36 rows=1420982 width=0) (actual time=0.268..0.268 rows=1 loops=1)
        Index Cond: (nodes && '{1}'::bigint[])
Planning Time: 0.193 ms
Execution Time: 0.773 ms

So this Parallel Bitmap Heap Scan seems to be quite problematic here. What runtime setting should I change to fix this, without losing the benefit of parallelism?

Thanks a lot!

Cheers
--
Arnaud
Re: Slow statement using parallelism after 9.6>11 upgrade
"Arnaud L." writes: > We have upgraded our database from 9.6 to 11 (and updated PostGIS from > 2.3 to 2.5 as well). > ... Have you re-ANALYZEd the database? The problem with this query seems to be the spectacularly awful rowcount estimate here: >-> Bitmap Index Scan on planet_osm_ways_nodes_idx > (cost=0.00..11190.36 rows=1420982 width=0) (actual time=0.268..0.268 > rows=1 loops=1) > Index Cond: (nodes && '{1}'::bigint[]) The planner should be able to do better than that, given up-to-date statistics on the "nodes" column. regards, tom lane
Re: Slow statement using parallelism after 9.6>11 upgrade
On 03/09/2019 at 15:43, Tom Lane wrote:
> "Arnaud L." writes:
> > We have upgraded our database from 9.6 to 11 (and updated PostGIS from 2.3 to 2.5 as well).
> > ...
> Have you re-ANALYZEd the database? The problem with this query
> seems to be the spectacularly awful rowcount estimate here:

You mean after the upgrade process? Yes I have. I've just re-run "ANALYZE table" to rule this out; the estimates are the same.

Maybe a statistics target problem? The estimated number of rows is 284,196,352. Also, this is a GIN index on a bigint[] column.

I've set parallel_tuple_cost to 1.0 and parallel_setup_cost to 5000.0 for the time being, which solves this specific problem. These values don't look very sensible though; they are very high compared to the default ones.

Cheers
--
Arnaud
Re: Slow statement using parallelism after 9.6>11 upgrade
On 03/09/2019 at 16:39, Paul Ramsey wrote:
> On Sep 3, 2019, at 7:06 AM, Arnaud L. wrote:
> > I've set parallel_tuple_cost to 1.0 and parallel_setup_cost to 5000.0 for the time being,
> > which solves this specific problem. These values don't look very sensible though,
> > they are very high compared to the default ones.
>
> You can also leave that setting unchanged and just change the behaviour on your one table:
>
> ALTER TABLE nodes SET ( parallel_workers = 0);

Nice, this will fix this precise query and leave parallelism in place for the other ones. Thanks!

--
Arnaud
Re: Slow statement using parallelism after 9.6>11 upgrade
On 03/09/2019 at 17:02, Imre Samu wrote:
> > We have upgraded our database from 9.6 to 11
> This is now the latest PG? PG 11.5?

Yes, PostgreSQL 11.5, compiled by Visual C++ build 1914, 64-bit.

> - osm2pgsql has its own parallelization ( osm2pgsql --number-processes .. ), so be careful about
>   adding more parallelisation on the PG side with the same osm2pgsql parameters! (check the memory usage!)
> - check the benchmarks and the tunings: https://wiki.openstreetmap.org/wiki/Osm2pgsql/benchmarks
> - you can ask for help on the "osm dev mail list" (= "dev OpenStreetMap developer discussion",
>   https://lists.openstreetmap.org/listinfo/dev )

Yes, I know. It has been fine-tuned over some time now to fit our server's capacity. Disabling parallelism has allowed the update to run in ~5000s, which is consistent with what we could observe before the upgrade. I had started writing an issue on the osm2pgsql GitHub before realizing it was probably a PG-only issue.

Cheers
--
Arnaud
Re: Postgres HA issue - Standby server does not start after Master compute host is shut down
On Tue, 3 Sep 2019 20:27:37 +0530 Nagendra Bandi wrote: ... > *Problem Description :* > > PostgreSQL database is set up for High Availability and *asynchronous > streaming replication* with *hot standby* as described in > https://www.postgresql.org/docs/9.4/warm-standby.html#STREAMING-REPLICATION. > There is only 1 master and 1 standby server set up. The deployment is on > Openstack cloud where master database is running on one compute host and > standby server is running on another compute host. When master is brought > down (*compute host is powered off*), stand by is *not* able to take over. > The PostgreSQL database server on the standby fails to start with errors > messages given below. > > *Error logs from Standby Postgres DB:* > > *Logs form postgres log : > *< 2019-08-22 16:00:04.110 UTC >LOG: invalid record length at C/513BF150 > > < 2019-08-22 16:00:04.110 UTC >LOG: redo done at C/513BF118 > > < 2019-08-22 16:00:04.115 UTC >FATAL: WAL ends before end of online backup It looks like the standby never received the end of online backup record from the master. Could you describe how you built your standby? Regards,
Re: Slow statement using parallelism after 9.6>11 upgrade
> We have upgraded our database from 9.6 to 11

This is now the latest PG? PG 11.5?
(for example, fixed in PG 11.5: "Fix failure to resolve deadlocks involving multiple parallel worker processes", https://www.postgresql.org/docs/current/release-11-5.html)

> populated by the osm2pgsql program and updated on a daily basis.
> What runtime setting should I change to fix this, without losing the benefit of parallelism?

- osm2pgsql has its own parallelization ( osm2pgsql --number-processes .. ), so be careful about adding more parallelisation on the PG side with the same osm2pgsql parameters! (check the memory usage!)
- check the benchmarks and the tunings: https://wiki.openstreetmap.org/wiki/Osm2pgsql/benchmarks
- you can ask for help on the "osm dev mail list" (= "dev OpenStreetMap developer discussion", https://lists.openstreetmap.org/listinfo/dev )

Imre

Arnaud L. wrote on Tue, Sep 3, 2019 at 14:11:
> Hi list
>
> We have upgraded our database from 9.6 to 11 (and updated PostGIS from
> 2.3 to 2.5 as well).
> We are using it among other things to store an OpenStreetMap database,
> populated by the osm2pgsql program and updated on a daily basis.
>
> The process used to take about 1h30 before the upgrade, but after the
> upgrade it looks like it will never run to completion in a 24h time frame.
>
> The problem is apparently that the planner chooses to run some statements
> in parallel, but these take a lot longer than when run without it.
> Here are the two EXPLAIN ANALYZE outputs of the same statement, with
> parallelism on and off:
>
> Statement:
> PREPARE mark_ways_by_node(int8) AS select id from planet_osm_ways WHERE
> nodes && ARRAY[$1];
> EXPLAIN EXECUTE mark_ways_by_node(1);
>
> max_parallel_workers_per_gather = 2 :
>
> Gather  (cost=12545.61..2357352.25 rows=1420982 width=8) (actual time=2.577..64.028 rows=1 loops=1)
>   Workers Planned: 2
>   Workers Launched: 2
>   ->  Parallel Bitmap Heap Scan on planet_osm_ways  (cost=11545.61..2214254.05 rows=592076 width=8) (actual time=0.213..0.213 rows=0 loops=3)
>         Recheck Cond: (nodes && '{1}'::bigint[])
>         Heap Blocks: exact=1
>         ->  Bitmap Index Scan on planet_osm_ways_nodes_idx  (cost=0.00..11190.36 rows=1420982 width=0) (actual time=0.173..0.174 rows=1 loops=1)
>               Index Cond: (nodes && '{1}'::bigint[])
> Planning Time: 8.596 ms
> Execution Time: 64.135 ms
>
> max_parallel_workers_per_gather = 0 :
>
> Bitmap Heap Scan on planet_osm_ways  (cost=11545.61..3462154.46 rows=1420982 width=8) (actual time=0.677..0.679 rows=1 loops=1)
>   Recheck Cond: (nodes && '{1}'::bigint[])
>   Heap Blocks: exact=1
>   ->  Bitmap Index Scan on planet_osm_ways_nodes_idx  (cost=0.00..11190.36 rows=1420982 width=0) (actual time=0.268..0.268 rows=1 loops=1)
>         Index Cond: (nodes && '{1}'::bigint[])
> Planning Time: 0.193 ms
> Execution Time: 0.773 ms
>
> So this Parallel Bitmap Heap Scan seems to be quite problematic here.
> What runtime setting should I change to fix this, without losing the
> benefit of parallelism?
>
> Thanks a lot!
>
> Cheers
> --
> Arnaud
Re: Postgres HA issue - Standby server does not start after Master compute host is shut down
Hi Jehan,

Thanks for your quick response. The standby is built from the primary using pg_basebackup, i.e. an initial copy of the primary database is taken with the pg_basebackup command and then the server is restarted.

Regards,
Nagendra

On Tue, Sep 3, 2019 at 8:43 PM Jehan-Guillaume (ioguix) de Rorthais <iog...@free.fr> wrote:
> On Tue, 3 Sep 2019 20:27:37 +0530
> Nagendra Bandi wrote:
> ...
> > *Problem Description :*
> >
> > PostgreSQL database is set up for High Availability and *asynchronous
> > streaming replication* with *hot standby* as described in
> > https://www.postgresql.org/docs/9.4/warm-standby.html#STREAMING-REPLICATION.
> > There is only 1 master and 1 standby server set up. The deployment is on
> > Openstack cloud where master database is running on one compute host and
> > standby server is running on another compute host. When master is brought
> > down (*compute host is powered off*), stand by is *not* able to take over.
> > The PostgreSQL database server on the standby fails to start with errors
> > messages given below.
> >
> > *Error logs from Standby Postgres DB:*
> >
> > *Logs form postgres log :
> > *< 2019-08-22 16:00:04.110 UTC >LOG: invalid record length at C/513BF150
> >
> > < 2019-08-22 16:00:04.110 UTC >LOG: redo done at C/513BF118
> >
> > < 2019-08-22 16:00:04.115 UTC >FATAL: WAL ends before end of online backup
>
> It looks like the standby never received the end of online backup record from
> the master.
>
> Could you describe how you built your standby?
>
> Regards,
Re: Slow statement using parallelism after 9.6>11 upgrade
> On Sep 3, 2019, at 7:06 AM, Arnaud L. wrote: > > Le 03/09/2019 à 15:43, Tom Lane a écrit : >> "Arnaud L." writes: >>> We have upgraded our database from 9.6 to 11 (and updated PostGIS from 2.3 >>> to 2.5 as well). >>> ... >> Have you re-ANALYZEd the database? The problem with this query >> seems to be the spectacularly awful rowcount estimate here: > > You mean after the upgrade process ? Yes I have. > I've juste re-run "ANALYZE table" to rule this out, estimates are the same. > > Maybe some statistic target problem ? Estimated number of rows is 284.196.352 > Also, this is a GIN index on a bigint[] column. > > I've setup parallel_tuple_cost to 1.0 parallel_setup_cost to 5000.0 for the > time being which solves this specific problem. These value don't look very > sensible though, they are very high compared to the default ones. You can also leave that setting unchanged and just change the behaviour on your one table: ALTER TABLE nodes SET ( parallel_workers = 0); P. > > Cheers > -- > Arnaud > >
Re: Bad Estimate for multi tenant database queries
> CREATE STATISTICS MT_ReservationDepartureUtc (ndistinct) on "AccountCode", "DepartureUtc" from "Reservation"."Reservation";
> CREATE STATISTICS MT_ReservationArrivalUtc (ndistinct) on "AccountCode", "ArrivalUtc" from "Reservation"."Reservation";
> CREATE STATISTICS MT_ReservationNoShowFeeId (ndistinct) on "AccountCode", "NoShowFeeId" from "Reservation"."Reservation";
> CREATE STATISTICS MT_ReservationTimeSliceDefinitionId (ndistinct) on "AccountCode", "TimeSliceDefinitionId" from "Reservation"."Reservation";
>
> but that didn't help either

Did you try with 'dependencies' for the statistics_kind or only ndistinct? What was default_statistics_target set to at the time you created the extended statistics? I am not sure if that value is used, but I would assume so.
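In case it is useful, a hedged example of the 'dependencies' variant (the statistics name below is invented for illustration); note that extended statistics are only populated by a subsequent ANALYZE:

CREATE STATISTICS mt_reservation_departure_dep (dependencies)
    ON "AccountCode", "DepartureUtc"
    FROM "Reservation"."Reservation";
ANALYZE "Reservation"."Reservation";

Functional-dependency statistics record how strongly one column determines the other, which can correct the combined selectivity estimate when both columns appear in equality conditions.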
Re: literal vs dynamic partition constraint in plan execution
I think I would expect this behavior given how you have defined the constraints as function results rather than as direct comparisons to the timestamp field.

Instead of this -

Check constraints:
> "y2018_mis_ora_check" CHECK (date_part('year'::text, mis_ora) = 2018::double precision)

I would expect this -

Check constraints:
> "y2018_mis_ora_check" CHECK (mis_ora >= make_timestamp(2018, 1, 1, 0, 0, 0::double precision) AND mis_ora < make_timestamp(2019, 1, 1, 0, 0, 0::double precision))
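To make that concrete, a rough sketch of the suggested change (the table name y2018 is assumed from the constraint name in the thread). A plain range check on the column itself is something the planner can match directly against literal timestamp comparisons, unlike the date_part() form:

ALTER TABLE y2018 DROP CONSTRAINT y2018_mis_ora_check;
ALTER TABLE y2018 ADD CONSTRAINT y2018_mis_ora_check
    CHECK (mis_ora >= make_timestamp(2018, 1, 1, 0, 0, 0)
       AND mis_ora <  make_timestamp(2019, 1, 1, 0, 0, 0));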
Re: Upgrade 96 -> 11
On 9/2/19 5:52 PM, Adrian Klaver wrote:
> > It's still creating the schema elements when it fails, it hasn't started linking yet
>
> Alright, at least you still have a working 9.6 cluster.
>
> Not sure where to go from here. Like you I am not sure how it can CREATE EXTENSION
> and not actually follow through on that. Especially with no errors for that operation.
> I'm going to have to think on this. Hopefully someone else has an idea on this and can chime in.

Aah. I don't have postgis installed, still:

pg_dump --binary-upgrade -s -d production -U postgres > production_binary.sql

-- For binary upgrade, create an empty extension and insert objects into it
DROP EXTENSION IF EXISTS tablefunc;
SELECT pg_catalog.binary_upgrade_create_empty_extension('tablefunc', 'public', true, '1.0', NULL, NULL, ARRAY[]::pg_catalog.text[]);

Try the above on your schema and see what you get.

> > I have set PGBINOLD, PGBINNEW, PGDATAOLD, PGDATANEW correctly.

--
Adrian Klaver
adrian.kla...@aklaver.com
Running a Simple Update Statement Fails, Second Time Succeeds.
Here is a simple query I am running via a function call. This statement runs fine as plain SQL, but when put in a function as PL/pgSQL it fails with the error below; when I run it a second time in the same session it succeeds:

--
product_master_fdw=> \set VERBOSITY verbose
product_master_fdw=> select sddf_update.tps_update_1(p_pres_id_in=>50241::integer, p_last_audit_update_dt_in=>'2019-09-03 12:44:21.356638'::timestamp , p_audit_update_user_name_tx_in=>'a...@xyz.com'::character varying, p_major_class_name_tx_in=>'TEST0826222'::character varying, p_effective_dt_in=>CURRENT_TIMESTAMP::timestamp );
ERROR: XX000: cache lookup failed for type 196609
CONTEXT: SQL statement "UPDATE product_history.external_sys_class_code_pres SET class_code_id = CASE WHEN p_class_code_id_in='0.1' THEN class_code_id ELSE p_class_code_id_in END , major_classification_cd = CASE WHEN p_major_classification_cd_in='.1' THEN major_classification_cd ELSE p_major_classification_cd_in END , major_classification_name_tx = CASE WHEN p_major_class_name_tx_in='0.1' THEN major_classification_name_tx ELSE p_major_class_name_tx_in END , coverage_short_name_tx = CASE WHEN p_coverage_short_name_tx_in='0.1' THEN coverage_short_name_tx ELSE p_coverage_short_name_tx_in END , coverage_name_tx = CASE WHEN p_coverage_name_tx_in='0.1' THEN coverage_name_tx ELSE p_coverage_name_tx_in END , cdt_source_system_cd = CASE WHEN p_cdt_source_system_cd_in='0.1' THEN cdt_source_system_cd ELSE p_cdt_source_system_cd_in END , cdt_submitting_country_cd = CASE WHEN p_cdt_submitting_country_cd_in='0.1' THEN cdt_submitting_country_cd ELSE p_cdt_submitting_country_cd_in END , cdt_status_cd = CASE WHEN p_cdt_status_cd_in='0.1' THEN cdt_status_cd ELSE p_cdt_status_cd_in END , effective_dt = CASE WHEN p_effective_dt_in=TO_DATE('01/01/1600', 'mm/dd/') THEN effective_dt ELSE p_effective_dt_in END , expiration_dt = CASE WHEN p_expiration_dt_in=TO_DATE('01/01/1600', 'mm/dd/') THEN expiration_dt ELSE p_expiration_dt_in END , audit_insert_user_name_tx = CASE WHEN p_audit_insert_user_name_tx_in='0.1' THEN audit_insert_user_name_tx ELSE p_audit_insert_user_name_tx_in END , audit_update_dt = CURRENT_TIMESTAMP, audit_update_user_name_tx = CASE WHEN p_audit_update_user_name_tx_in='0.1' THEN audit_update_user_name_tx ELSE p_audit_update_user_name_tx_in END , latest_version_in = CASE WHEN p_latest_version_in_in='0' THEN latest_version_in ELSE p_latest_version_in_in END , delete_in = CASE WHEN p_delete_in_in='0' THEN delete_in ELSE p_delete_in_in END WHERE pres_id = p_pres_id_in AND audit_update_dt = p_last_audit_update_dt_in"
PL/pgSQL function px_co_pr_pres_pg.spt_update_1(bigint,timestamp without time zone,timestamp without time zone,character varying,character varying,character varying,character varying,character varying,character varying,character varying,character varying,character varying,character varying,character varying,character varying,timestamp without time zone) line 7 at SQL statement
LOCATION: getTypeOutputInfo, lsyscache.c:2681

Regards,
Virendra

This message is intended only for the use of the addressee and may contain information that is PRIVILEGED AND CONFIDENTIAL. If you are not the intended recipient, you are hereby notified that any dissemination of this communication is strictly prohibited. If you have received this communication in error, please erase all copies of the message and its attachments and notify the sender immediately. Thank you.
RE: Running a Simple Update Statement Fails, Second Time Succeeds.
Type ID doesn't seem to be existing: product_master_fdw=> select * from pg_type where oid=196609; typname | typnamespace | typowner | typlen | typbyval | typtype | typcategory | typispreferred | typisdefined | typdelim | typrelid | typelem | typarray | typinput | typoutput | typreceive | t ypsend | typmodin | typmodout | typanalyze | typalign | typstorage | typnotnull | typbasetype | typtypmod | typndims | typcollation | typdefaultbin | typdefault | typacl -+--+--++--+-+-++--+--+--+-+--+--+---++-- ---+--+---++--+++-+---+--+--+---++ (0 rows) Regards, Virendra From: Kumar, Virendra Sent: Tuesday, September 03, 2019 4:56 PM To: pgsql-general@lists.postgresql.org Subject: Running a Simple Update Statement Fails, Second Time Suceeds. Here is simple query, I am running via function call. This statement runs fine as SQL but when put in function as plpgsql it failes with error below, when I ran second times in same session it succeeds: -- product_master_fdw=> \set VERBOSITY verbose product_master_fdw=> select sddf_update.tps_update_1(p_pres_id_in=>50241::integer, p_last_audit_update_dt_in=>'2019-09-03 12:44:21.356638'::timestamp , p_audit_update_user_name_tx_in=>'a...@xyz.com'::character varying, p_major_class_name_tx_in=>'TEST0826222'::character varying, p_effective_dt_in=>CURRENT_TIMESTAMP::timestamp ); ERROR: XX000: cache lookup failed for type 196609 CONTEXT: SQL statement "UPDATE product_history.external_sys_class_code_pres SET class_code_id = CASE WHEN p_class_code_id_in='0.1' THEN class_code_id ELSE p_class_code_id_in END , major_classification_cd = CASE WHEN p_major_classification_cd_in='.1' THEN major_classification_cd ELSE p_major_classification_cd_in END , major_classification_name_tx = CASE WHEN p_major_class_name_tx_in='0.1' THEN major_classification_name_tx ELSE p_major_class_name_tx_in END , coverage_short_name_tx = CASE WHEN p_coverage_short_name_tx_in='0.1' THEN coverage_short_name_tx ELSE p_coverage_short_name_tx_in END , coverage_name_tx = CASE WHEN p_coverage_name_tx_in='0.1' THEN coverage_name_tx ELSE p_coverage_name_tx_in END , cdt_source_system_cd = CASE WHEN p_cdt_source_system_cd_in='0.1' THEN cdt_source_system_cd ELSE p_cdt_source_system_cd_in END , cdt_submitting_country_cd = CASE WHEN p_cdt_submitting_country_cd_in='0.1' THEN cdt_submitting_country_cd ELSE p_cdt_submitting_country_cd_in END , cdt_status_cd = CASE WHEN p_cdt_status_cd_in='0.1' THEN cdt_status_cd ELSE p_cdt_status_cd_in END , effective_dt = CASE WHEN p_effective_dt_in=TO_DATE('01/01/1600', 'mm/dd/') THEN effective_dt ELSE p_effective_dt_in END , expiration_dt = CASE WHEN p_expiration_dt_in=TO_DATE('01/01/1600', 'mm/dd/') THEN expiration_dt ELSE p_expiration_dt_in END , audit_insert_user_name_tx = CASE WHEN p_audit_insert_user_name_tx_in='0.1' THEN audit_insert_user_name_tx ELSE p_audit_insert_user_name_tx_in END , audit_update_dt = CURRENT_TIMESTAMP, audit_update_user_name_tx = CASE WHEN p_audit_update_user_name_tx_in='0.1' THEN audit_update_user_name_tx ELSE p_audit_update_user_name_tx_in END , latest_version_in = CASE WHEN p_latest_version_in_in='0' THEN latest_version_in ELSE p_latest_version_in_in END , delete_in = CASE WHEN p_delete_in_in='0' THEN delete_in ELSE p_delete_in_in END WHERE pres_id = p_pres_id_in AND audit_update_dt = p_last_audit_update_dt_in" PL/pgSQL function px_co_pr_pres_pg.spt_update_1(bigint,timestamp without time zone,timestamp without time zone,character varying,character varying,character varying,character varying,character varying,character varying,character varying,character 
varying,character varying,character varying,character varying,character varying,timestamp without time zone) line 7 at SQL statement LOCATION: getTypeOutputInfo, lsyscache.c:2681 Regards, Virendra
Re: Running a Simple Update Statement Fails, Second Time Succeeds.
On 9/3/19 1:56 PM, Kumar, Virendra wrote: Here is simple query, I am running via function call. This statement runs fine as SQL but when put in function as plpgsql it failes with error below, when I ran second times in same session it succeeds: We will need to see the function definition. -- product_master_fdw=> \set VERBOSITY verbose product_master_fdw=> select sddf_update.tps_update_1(p_pres_id_in=>50241::integer, p_last_audit_update_dt_in=>'2019-09-03 12:44:21.356638'::timestamp , p_audit_update_user_name_tx_in=>'a...@xyz.com'::character varying, p_major_class_name_tx_in=>'TEST0826222'::character varying, p_effective_dt_in=>CURRENT_TIMESTAMP::timestamp ); ERROR: XX000: cache lookup failed for type 196609 CONTEXT: SQL statement "UPDATE product_history.external_sys_class_code_pres SET class_code_id = CASE WHEN p_class_code_id_in='0.1' THEN class_code_id ELSE p_class_code_id_in END , major_classification_cd = CASE WHEN p_major_classification_cd_in='.1' THEN major_classification_cd ELSE p_major_classification_cd_in END , major_classification_name_tx = CASE WHEN p_major_class_name_tx_in='0.1' THEN major_classification_name_tx ELSE p_major_class_name_tx_in END , coverage_short_name_tx = CASE WHEN p_coverage_short_name_tx_in='0.1' THEN coverage_short_name_tx ELSE p_coverage_short_name_tx_in END , coverage_name_tx = CASE WHEN p_coverage_name_tx_in='0.1' THEN coverage_name_tx ELSE p_coverage_name_tx_in END , cdt_source_system_cd = CASE WHEN p_cdt_source_system_cd_in='0.1' THEN cdt_source_system_cd ELSE p_cdt_source_system_cd_in END , cdt_submitting_country_cd = CASE WHEN p_cdt_submitting_country_cd_in='0.1' THEN cdt_submitting_country_cd ELSE p_cdt_submitting_country_cd_in END , cdt_status_cd = CASE WHEN p_cdt_status_cd_in='0.1' THEN cdt_status_cd ELSE p_cdt_status_cd_in END , effective_dt = CASE WHEN p_effective_dt_in=TO_DATE('01/01/1600', 'mm/dd/') THEN effective_dt ELSE p_effective_dt_in END , expiration_dt = CASE WHEN p_expiration_dt_in=TO_DATE('01/01/1600', 'mm/dd/') THEN expiration_dt ELSE p_expiration_dt_in END , audit_insert_user_name_tx = CASE WHEN p_audit_insert_user_name_tx_in='0.1' THEN audit_insert_user_name_tx ELSE p_audit_insert_user_name_tx_in END , audit_update_dt = CURRENT_TIMESTAMP, audit_update_user_name_tx = CASE WHEN p_audit_update_user_name_tx_in='0.1' THEN audit_update_user_name_tx ELSE p_audit_update_user_name_tx_in END , latest_version_in = CASE WHEN p_latest_version_in_in='0' THEN latest_version_in ELSE p_latest_version_in_in END , delete_in = CASE WHEN p_delete_in_in='0' THEN delete_in ELSE p_delete_in_in END WHERE pres_id = p_pres_id_in AND audit_update_dt = p_last_audit_update_dt_in" PL/pgSQL function px_co_pr_pres_pg.spt_update_1(bigint,timestamp without time zone,timestamp without time zone,character varying,character varying,character varying,character varying,character varying,character varying,character varying,character varying,character varying,character varying,character varying,character varying,timestamp without time zone) line 7 at SQL statement LOCATION: getTypeOutputInfo, lsyscache.c:2681 Regards, Virendra This message is intended only for the use of the addressee and may contain information that is PRIVILEGED AND CONFIDENTIAL. If you are not the intended recipient, you are hereby notified that any dissemination of this communication is strictly prohibited. If you have received this communication in error, please erase all copies of the message and its attachments and notify the sender immediately. Thank you. -- Adrian Klaver adrian.kla...@aklaver.com
Re: Running a Simple Update Statement Fails, Second Time Succeeds.
Adrian Klaver writes: > On 9/3/19 1:56 PM, Kumar, Virendra wrote: >> Here is simple query, I am running via function call. This statement >> runs fine as SQL but when put in function as plpgsql it failes with >> error below, when I ran second times in same session it succeeds: > We will need to see the function definition. Also, what PG version is that? This looks a bit like bug #15913, but it might be something else. >> This message is intended only for the use of the addressee and may contain >> information that is PRIVILEGED AND CONFIDENTIAL. >> >> If you are not the intended recipient, you are hereby notified that any >> dissemination of this communication is strictly prohibited. If you have >> received this communication in error, please erase all copies of the message >> and its attachments and notify the sender immediately. Thank you. I rather wonder whether I'm even allowed to read this, let alone answer it. You do realize that this sort of add-on is completely silly on a public mailing list? regards, tom lane
Re: Running a Simple Update Statement Fails, Second Time Succeeds.
Hi Tom,

Sincere apologies for that privacy notice in the email; this is company policy which I cannot skip.

Adrian,
The function is really simple. I'll share the code as soon as I can.

Regards,
Virendra

> On Sep 3, 2019, at 5:41 PM, Tom Lane wrote:
>
> Adrian Klaver writes:
>>> On 9/3/19 1:56 PM, Kumar, Virendra wrote:
>>> Here is simple query, I am running via function call. This statement
>>> runs fine as SQL but when put in function as plpgsql it failes with
>>> error below, when I ran second times in same session it succeeds:
>
>> We will need to see the function definition.
>
> Also, what PG version is that? This looks a bit like bug #15913,
> but it might be something else.
>
>>> This message is intended only for the use of the addressee and may contain
>>> information that is PRIVILEGED AND CONFIDENTIAL.
>>>
>>> If you are not the intended recipient, you are hereby notified that any
>>> dissemination of this communication is strictly prohibited. If you have
>>> received this communication in error, please erase all copies of the message
>>> and its attachments and notify the sender immediately. Thank you.
>
> I rather wonder whether I'm even allowed to read this, let alone
> answer it. You do realize that this sort of add-on is completely
> silly on a public mailing list?
>
> regards, tom lane

This message is intended only for the use of the addressee and may contain information that is PRIVILEGED AND CONFIDENTIAL. If you are not the intended recipient, you are hereby notified that any dissemination of this communication is strictly prohibited. If you have received this communication in error, please erase all copies of the message and its attachments and notify the sender immediately. Thank you.
killing vacuum analyze process
Hello there,

PostgreSQL 9.6.2 on x86_64-pc-linux-gnu

It seems like we have one process that has been running since 8/19. Are there any repercussions if we kill it?

postgres=# select * from pg_stat_activity where pid = '18986';
-[ RECORD 1 ]----+------------------------------
datid            | 2342921
datname          | symphony_middleware
pid              | 18986
usesysid         | 10
usename          | postgres
application_name |
client_addr      |
client_hostname  |
client_port      |
backend_start    | 2019-08-19 02:00:59.683198+00
xact_start       | 2019-08-19 02:03:54.191741+00
query_start      | 2019-08-19 02:03:54.191741+00
state_change     | 2019-08-19 02:03:54.191742+00
wait_event_type  | BufferPin
wait_event       | BufferPin
state            | active
backend_xid      |
backend_xmin     | 1075730757
query            | autovacuum: VACUUM ANALYZE public.resource_build (to prevent wraparound)
Re: killing vacuum analyze process
Julie Nishimura writes: > PostgreSQL 9.6.2 on x86_64-pc-linux-gnu > It seems like we have one process running since 8/19. Is it any repercussion > if we kill it? > postgres=# select * from pg_stat_activity where pid = '18986'; > -[ RECORD 1 > ]+- > datid| 2342921 > datname | symphony_middleware > pid | 18986 > usesysid | 10 > usename | postgres > application_name | > client_addr | > client_hostname | > client_port | > backend_start| 2019-08-19 02:00:59.683198+00 > xact_start | 2019-08-19 02:03:54.191741+00 > query_start | 2019-08-19 02:03:54.191741+00 > state_change | 2019-08-19 02:03:54.191742+00 > wait_event_type | BufferPin > wait_event | BufferPin > state| active > backend_xid | > backend_xmin | 1075730757 > query| autovacuum: VACUUM ANALYZE public.resource_build (to > prevent wraparound) Since this is an anti-wraparound vacuum, autovacuum is just going to launch another one pretty soon if you kill this one. Assuming that the buffer pin blockage is real and not some kind of broken shared memory state, the new one will hang up at the same spot. You'd be better advised to find out what's pinning that buffer and kill that. Admittedly this is easier said than done, since there's not much infrastructure for seeing what's happening at that level. But you could look for transactions that are at least as old as this one and have some kind of lock on that table (according to pg_locks). If there are no such transactions, then the corrupt-shared-memory hypothesis becomes likely, and a postmaster restart is indicated. BTW, you really ought to be running something newer than 9.6.2. regards, tom lane
Re: killing vacuum analyze process
Thank you Tom. I can see bunch of old processes running ROLLBACK... Should I kill them or they only way to clear those is to restart the server? postgres=# select * from pg_stat_activity where datname ='symphony_middleware' and query_start < '08/20/2019' and query like '%ROLLBACK%'; datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query -+-++--+-+--+--+-+-+---++---+---+-++---+-+--+-- 2342921 | symphony_middleware | 72165 |16407 | symphony_middleware | | 10.24.33.226 | | 37252 | 2019-06-04 13:17:56.204483+00 || 2019-06-24 13:51:52.834401+00 | 2019-06-24 13:51:52.834479+00 | || idle | | | ROLLBACK 2342921 | symphony_middleware | 159407 |16407 | symphony_middleware | | 10.24.33.6 | | 45482 | 2019-07-29 16:23:36.271366+00 || 2019-08-16 18:28:27.924116+00 | 2019-08-16 18:28:27.92419+00 | || idle | | | ROLLBACK 2342921 | symphony_middleware | 135443 |16407 | symphony_middleware | | 10.24.33.226 | | 47712 | 2019-05-31 16:17:55.143017+00 || 2019-08-13 15:17:01.685057+00 | 2019-08-13 15:17:01.685103+00 | || idle | | | ROLLBACK 2342921 | symphony_middleware | 135442 |16407 | symphony_middleware | | 10.24.33.226 | | 47710 | 2019-05-31 16:17:55.132574+00 || 2019-08-13 15:17:32.973151+00 | 2019-08-13 15:17:32.97322+00 | || idle | | | ROLLBACK 2342921 | symphony_middleware | 135440 |16407 | symphony_middleware | | 10.24.33.226 | | 47706 | 2019-05-31 16:17:55.082091+00 || 2019-06-24 17:23:15.519127+00 | 2019-06-24 17:23:15.519181+00 | || idle | | | ROLLBACK 2342921 | symphony_middleware | 38211 |16407 | symphony_middleware | | 10.24.33.6 | | 43318 | 2019-06-04 10:33:19.742976+00 || 2019-06-24 17:23:09.212067+00 | 2019-06-24 17:23:09.212117+00 | || idle | | | ROLLBACK 2342921 | symphony_middleware | 136304 |16407 | symphony_middleware | | 10.24.33.6 | | 51012 | 2019-05-31 16:20:23.911493+00 || 2019-08-19 10:17:33.284189+00 | 2019-08-19 10:17:33.284256+00 | || idle | | | ROLLBACK (7 rows) From: Tom Lane Sent: Tuesday, September 3, 2019 4:11 PM To: Julie Nishimura Cc: pgsql-general@lists.postgresql.org Subject: Re: killing vacuum analyze process Julie Nishimura writes: > PostgreSQL 9.6.2 on x86_64-pc-linux-gnu > It seems like we have one process running since 8/19. Is it any repercussion > if we kill it? > postgres=# select * from pg_stat_activity where pid = '18986'; > -[ RECORD 1 > ]+- > datid| 2342921 > datname | symphony_middleware > pid | 18986 > usesysid | 10 > usename | postgres > application_name | > client_addr | > client_hostname | > client_port | > backend_start| 2019-08-19 02:00:59.683198+00 > xact_start | 2019-08-19 02:03:54.191741+00 > query_start | 2019-08-19 02:03:54.191741+00 > state_change | 2019-08-19 02:03:54.191742+00 > wait_event_type | BufferPin > wait_event | BufferPin > state| active > backend_xid | > backend_xmin | 1075730757 > query| autovacuum: VACUUM ANALYZE public.resource_build (to > prevent wraparound) Since this is an anti-wraparound vacuum, autovacuum is just going to launch another one pretty soon if you kill this one. Assuming that the buffer pin blockage is real and not some kind of broken shared memory state, the new one will hang up at the same spot. You'd be better advised to find out what's pinning that buffer and kill that. Admittedly this is easier said than done, since there's not much infrastructure for seeing what's happening at that level. 
But you could look for transactions that are at least as old as this one and have some kind of lock on that table (according to pg_locks).
Re: Unexpected "canceling statement due to user request" error
On Sun 2019-09-01 20:58:30 -0400, Tom Lane wrote: > >> A separate question is how come the particular query you're complaining > >> about has (seemingly) a fairly wide window where it never does any > >> CHECK_FOR_INTERRUPTS call before terminating. Perhaps there's someplace > >> we need to sprinkle a few more of those. > > > Yeah, it is strange. There are many queries in this system with statement > > timeouts and this is the only one where I see this and it happens several > > times a day. > > It'd be interesting to see EXPLAIN ANALYZE output from a typical execution > of that query. Even better would be a test case (w/ tables and dummy > data) so somebody else could reproduce it; but maybe looking at EXPLAIN > ANALYZE would be enough to determine where we're missing checks. I put output from a query with values that caused the error here: https://gist.githubusercontent.com/horgh/653e60b8c2f071d859424517d653eb4e/raw/5d16e11d0ac884ed9dc120c4521e2de8f2e7c3d6/test.txt I sanitised some of the names and values again. Sorry about that. As far as a test case: I'm not sure how to reproduce it myself. It just happens periodically (15 times in the past 24 hours). The tables are somewhat large (60+ GiB per partition). I could try to generate dummy versions if you like, but they would have to be significantly dummied, so I'm not sure it would end up being representative.
Re: Upgrade 96 -> 11
On 9/3/19 3:45 PM, James Sewell wrote:
> > -- For binary upgrade, create an empty extension and insert objects into it
> > DROP EXTENSION IF EXISTS tablefunc;
> > SELECT pg_catalog.binary_upgrade_create_empty_extension('tablefunc', 'public', true, '1.0', NULL, NULL, ARRAY[]::pg_catalog.text[]);
> >
> > Try the above on your schema and see what you get.
>
> Yep - an empty extension. I think this logic is wrong. Creating an empty
> extension is fine and makes sense but extension-owned relations should be
> created as the next step, not just at some time later.

So to be clear, you ran pg_dump with --binary-upgrade and the extension elements were created after the user table that tripped the error?

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: killing vacuum analyze process
Those are all idle. I've got a cron job to kill all idle processes more than 60 (or 30, I don't remember) minutes old. On 9/3/19 6:29 PM, Julie Nishimura wrote: Thank you Tom. I can see bunch of old processes running ROLLBACK... Should I kill them or they only way to clear those is to restart the server? postgres=# select * from pg_stat_activity where datname ='symphony_middleware' and query_start < '08/20/2019' and query like '%ROLLBACK%'; datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query -+-++--+-+--+--+-+-+---++---+---+-++---+-+--+-- 2342921 | symphony_middleware | 72165 | 16407 | symphony_middleware | | 10.24.33.226 | | 37252 | 2019-06-04 13:17:56.204483+00 | | 2019-06-24 13:51:52.834401+00 | 2019-06-24 13:51:52.834479+00 | | | idle | | | ROLLBACK 2342921 | symphony_middleware | 159407 | 16407 | symphony_middleware | | 10.24.33.6 | | 45482 | 2019-07-29 16:23:36.271366+00 | | 2019-08-16 18:28:27.924116+00 | 2019-08-16 18:28:27.92419+00 | | | idle | | | ROLLBACK 2342921 | symphony_middleware | 135443 | 16407 | symphony_middleware | | 10.24.33.226 | | 47712 | 2019-05-31 16:17:55.143017+00 | | 2019-08-13 15:17:01.685057+00 | 2019-08-13 15:17:01.685103+00 | | | idle | | | ROLLBACK 2342921 | symphony_middleware | 135442 | 16407 | symphony_middleware | | 10.24.33.226 | | 47710 | 2019-05-31 16:17:55.132574+00 | | 2019-08-13 15:17:32.973151+00 | 2019-08-13 15:17:32.97322+00 | | | idle | | | ROLLBACK 2342921 | symphony_middleware | 135440 | 16407 | symphony_middleware | | 10.24.33.226 | | 47706 | 2019-05-31 16:17:55.082091+00 | | 2019-06-24 17:23:15.519127+00 | 2019-06-24 17:23:15.519181+00 | | | idle | | | ROLLBACK 2342921 | symphony_middleware | 38211 | 16407 | symphony_middleware | | 10.24.33.6 | | 43318 | 2019-06-04 10:33:19.742976+00 | | 2019-06-24 17:23:09.212067+00 | 2019-06-24 17:23:09.212117+00 | | | idle | | | ROLLBACK 2342921 | symphony_middleware | 136304 | 16407 | symphony_middleware | | 10.24.33.6 | | 51012 | 2019-05-31 16:20:23.911493+00 | | 2019-08-19 10:17:33.284189+00 | 2019-08-19 10:17:33.284256+00 | | | idle | | | ROLLBACK (7 rows) *From:* Tom Lane *Sent:* Tuesday, September 3, 2019 4:11 PM *To:* Julie Nishimura *Cc:* pgsql-general@lists.postgresql.org *Subject:* Re: killing vacuum analyze process Julie Nishimura writes: > PostgreSQL 9.6.2 on x86_64-pc-linux-gnu > It seems like we have one process running since 8/19. Is it any repercussion if we kill it? > postgres=# select * from pg_stat_activity where pid = '18986'; > -[ RECORD 1 ]+- > datid | 2342921 > datname | symphony_middleware > pid | 18986 > usesysid | 10 > usename | postgres > application_name | > client_addr | > client_hostname | > client_port | > backend_start | 2019-08-19 02:00:59.683198+00 > xact_start | 2019-08-19 02:03:54.191741+00 > query_start | 2019-08-19 02:03:54.191741+00 > state_change | 2019-08-19 02:03:54.191742+00 > wait_event_type | BufferPin > wait_event | BufferPin > state | active > backend_xid | > backend_xmin | 1075730757 > query | autovacuum: VACUUM ANALYZE public.resource_build (to prevent wraparound) Since this is an anti-wraparound vacuum, autovacuum is just going to launch another one pretty soon if you kill this one. Assuming that the buffer pin blockage is real and not some kind of broken shared memory state, the new one will hang up at the same spot. 
You'd be better advised to find out what's pinning that buffer and kill that. Admittedly this is easier said than done, since there's not much infrastructure for seeing what's happening at that level.
Re: Upgrade 96 -> 11
On Wed, 4 Sep 2019 at 5:47 am, Adrian Klaver wrote: > On 9/2/19 5:52 PM, Adrian Klaver wrote: > > >> It's still creating the schema elements when it fails, it hasn't > >> started linking yet > > > > Alright at least you still a working 9.6 cluster . > > > > Not sure where to go from here. Like you I am not sure how it can CREATE > > EXTENSION and not actually follow through on that. Especially with no > > errors for that operation. I'm going to have to think on this. Hopefully > > someone else has an idea on this and can chime in. > > Aah. I don't have postgis installed, still: > > pg_dump --binary-upgrade -s -d production -U postgres > > production_binary.sql > > -- For binary upgrade, create an empty extension and insert objects into it > DROP EXTENSION IF EXISTS tablefunc; > SELECT pg_catalog.binary_upgrade_create_empty_extension('tablefunc', > 'public', true, '1.0', NULL, NULL, ARRAY[]::pg_catalog.text[]); > > > Try the above on your schema and see what you get. Yep - an empty extension. I think this logic is wrong. Creating an empty extension is fine and makes sense but extension owned relations should be created as the next step, not just at some time later. > > > > > > >> > >> > > >> > I have set PGBINOLD, PGBINNEW, PGDATAOLD, PGDATANEW correctly. > >> > > -- > Adrian Klaver > adrian.kla...@aklaver.com > -- James Sewell, Chief Architect Suite 112, Jones Bay Wharf, 26-32 Pirrama Road, Pyrmont NSW 2009 P (+61) 2 8099 9000 W www.jirotech.com F (+61) 2 8099 9099 -- The contents of this email are confidential and may be subject to legal or professional privilege and copyright. No representation is made that this email is free of viruses or other defects. If you have received this communication in error, you may not copy or distribute any part of it or otherwise disclose its contents to anyone. Please advise the sender of your incorrect receipt of this correspondence.
Re: killing vacuum analyze process
Julie Nishimura writes: > Thank you Tom. I can see bunch of old processes running ROLLBACK... Should I > kill them or they only way to clear those is to restart the server? As Ron pointed out, those processes aren't actually running ROLLBACK; that was just their last query. They're idle (and not "idle in transaction") so in theory they're not holding any locks. After thinking more about it I realize that the process holding the buffer pin doesn't have to be as old as the VACUUM; so you probably should just look at pg_locks and see what else has got a lock on the table the VACUUM is trying to process. The hypothetical offender is probably somewhat old, or you'd not have noticed the blockage, but we don't have any clear evidence as to how old. regards, tom lane
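If it helps, a rough example of that pg_locks check, assuming public.resource_build is still the table named in the autovacuum entry and 18986 is the vacuum's pid (both taken from the earlier output):

SELECT l.pid, a.backend_start, a.xact_start, a.state, l.mode, l.granted,
       left(a.query, 60) AS last_query
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.relation = 'public.resource_build'::regclass
  AND l.pid <> 18986
ORDER BY a.backend_start;

Any session in that list that is old and sitting "idle in transaction", or holding an open cursor on the table, is a likely candidate for the stuck buffer pin.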
Re: Upgrade 96 -> 11
If anyone hits this it is an issue with using the Geography type with a non 4326 SRID on a table and pg_upgrade. It should be fixed (the check dropped as it's something of a relic) in the next version of PostGIS. In the meantime you would have to patch it out yourself. ( https://github.com/postgis/postgis/blob/svn-trunk/postgis/gserialized_typmod.c#L296 ) On Wed, 4 Sep 2019 at 10:03, Adrian Klaver wrote: > On 9/3/19 3:45 PM, James Sewell wrote: > > > > > > > > > -- For binary upgrade, create an empty extension and insert objects > > into it > > DROP EXTENSION IF EXISTS tablefunc; > > SELECT pg_catalog.binary_upgrade_create_empty_extension('tablefunc', > > 'public', true, '1.0', NULL, NULL, ARRAY[]::pg_catalog.text[]); > > > > > > Try the above on your schema and see what you get. > > > > > > Yep - an empty extension. I think this logic is wrong. Creating an empty > > extension is fine and makes sense but extension owned relations should > > be created as the next step, not just at some time later. > > > > So to be clear you ran pg_dump with --binary-upgrade and the extension > elements where created after the user table that tripped the error? > > > > -- > Adrian Klaver > adrian.kla...@aklaver.com > -- The contents of this email are confidential and may be subject to legal or professional privilege and copyright. No representation is made that this email is free of viruses or other defects. If you have received this communication in error, you may not copy or distribute any part of it or otherwise disclose its contents to anyone. Please advise the sender of your incorrect receipt of this correspondence.