Re: Concurrent CTE
Can you post the full query, how many rows each table has, how often the
tables change, and the full EXPLAIN?

On Thu, Apr 5, 2018 at 8:01 AM, wrote:

> Did you look at this approach using dblink already?
>
>   https://gist.github.com/mjgleaso/8031067
>
> In your situation, you will have to modify the example, but it may give
> an idea where to start.
>
> Klaus
>
> > -----Original Message-----
> > From: Artur Formella
> > Sent: Tuesday, April 3, 2018 22:01
> > To: pgsql-general@lists.postgresql.org
> > Subject: Concurrent CTE
> >
> > Hello!
> > We have a lot of big CTEs (~40 statements, ~1000 lines) for very
> > dynamic OLTP content, with average response times of 50-300 ms. Our
> > setup has 96 threads (Intel Xeon Gold 6128), 256 GB RAM and 12 SSDs
> > (3 tablespaces). DB size < RAM.
> > Simplifying the problem:
> >
> > WITH aa AS (
> >    SELECT * FROM table1
> > ), bb AS (
> >    SELECT * FROM table2
> > ), cc AS (
> >    SELECT * FROM table3
> > ), dd AS (
> >    SELECT * FROM aa, bb
> > ), ee AS (
> >    SELECT * FROM aa, bb, cc
> > ), ff AS (
> >    SELECT * FROM ee, dd
> > ), gg AS (
> >    SELECT * FROM table4
> > ), hh AS (
> >    SELECT * FROM aa
> > )
> > SELECT * FROM gg, hh, ff /* primary statement */
> >
> > Execution now:
> > time-->
> > Thread1: aa | bb | cc | dd | ee | ff | gg | hh | primary
> >
> > And the question: is it possible to achieve a more concurrent
> > execution plan, to reduce the response time? For example:
> > Thread1: aa | dd | ff | primary
> > Thread2: bb | ee | gg
> > Thread3: cc | -- | hh
> >
> > Table1, table2 and table3 are located on separate tablespaces and are
> > independent.
> > Partial results (aa, bb, cc, dd, ee) are quite big and slow (full
> > text search, arrays, custom collations, function scans...).
> >
> > We are considering giving up the CTE and rewriting it in RxJava, but
> > we are afraid of downloading partial results and sending them back
> > with WHERE IN (...).
> >
> > Thanks!
> >
> > Artur Formella
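For reference, the dblink approach suggested above boils down to firing
the independent statements on separate connections and collecting the
results afterwards. A minimal sketch; the connection string, column
lists and temp-table names are illustrative assumptions, not taken from
this thread:

    CREATE EXTENSION IF NOT EXISTS dblink;

    -- open two loopback connections (the connstr is an assumption)
    SELECT dblink_connect('c1', 'dbname=mydb');
    SELECT dblink_connect('c2', 'dbname=mydb');

    -- send both queries without waiting; they now run concurrently
    SELECT dblink_send_query('c1', 'SELECT * FROM table1');
    SELECT dblink_send_query('c2', 'SELECT * FROM table2');

    -- each fetch blocks until its query is done
    -- (the column definition lists are made up for the example)
    CREATE TEMP TABLE aa AS
        SELECT * FROM dblink_get_result('c1') AS t(id int, payload text);
    CREATE TEMP TABLE bb AS
        SELECT * FROM dblink_get_result('c2') AS t(id int, payload text);

    SELECT dblink_disconnect('c1');
    SELECT dblink_disconnect('c2');

The primary statement can then join the temp tables, at the price of
materializing all the partial results.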
Re: ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100
Tom writes:

> a...@novozymes.com (Adam =?utf-8?Q?Sj=C3=B8gren?=) writes:
>> Also, the error we are getting is now: "unexpected chunk number 2
>> (expected 3) for toast value 1498303849 in pg_toast_10919630", where
>> previously we've only seen "unexpected chunk number 0 (expected 1)".
>
>> We are kind of at a loss, so any suggestions on what we could try are
>> welcome.
>
> The basic thrust of these messages is "I'm reading what should be
> sequentially numbered data chunks for this toast OID, and the sequence
> numbers are wrong".  Both of these instances could be explained by
> duplicate toast rows (or duplicate index entries pointing at one row),
> though of course that would just move to the next question of how it
> got that way.  Anyway, you could move the investigation along with
> some manual checking into what's in that toast table.  For instance
>
>   select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data)
>   from pg_toast.pg_toast_10919630
>   where chunk_id = 1498303849
>   order by 1,2;
>
> might be informative.  If you do see what seem to be duplicate
> chunk_seq values, checking whether they're still there in a
> seqscan would be good.

Here's a statement which currently gives an unexpected chunk error:

  efamroot@kat efam=# SELECT * FROM efam.sendreference WHERE id = '189909908';
  ERROR:  unexpected chunk number 0 (expected 1) for toast value 1698936148 in pg_toast_10919630

And when I run the suggested query, I get:

  efamroot@kat efam=# select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698936148 order by 1,2;
    chunk_id  | chunk_seq |     ctid     |    xmin    | xmax | length
  ------------+-----------+--------------+------------+------+--------
   1698936148 |         0 | (52888694,2) | 1511390221 |    0 |   1996
   1698936148 |         1 | (52888694,4) | 1511390221 |    0 |   1148
  (2 rows)

How would I go about checking if they are still there in a seqscan?

(Note: this is on PostgreSQL 9.3.22.)

Best regards,

  Adam

-- 
 "No more than that, but very powerful all the                Adam Sjøgren
  same; simple things are good."                         a...@novozymes.com
Re: ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100
Adam writes:

> Here's a statement which currently gives an unexpected chunk error:
>
>   efamroot@kat efam=# SELECT * FROM efam.sendreference WHERE id = '189909908';
>   ERROR:  unexpected chunk number 0 (expected 1) for toast value 1698936148 in pg_toast_10919630
>
> And when I run the suggested query, I get:
>
>   efamroot@kat efam=# select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698936148 order by 1,2;
>     chunk_id  | chunk_seq |     ctid     |    xmin    | xmax | length
>   ------------+-----------+--------------+------------+------+--------
>    1698936148 |         0 | (52888694,2) | 1511390221 |    0 |   1996
>    1698936148 |         1 | (52888694,4) | 1511390221 |    0 |   1148
>   (2 rows)

More examples:

  efamroot@kat efam=# SELECT * FROM efam.sendreference WHERE id = '237764759';
  ERROR:  unexpected chunk number 0 (expected 1) for toast value 1698958350 in pg_toast_10919630
  efamroot@kat efam=# select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698958350 order by 1,2;
    chunk_id  | chunk_seq |     ctid     |    xmin    | xmax | length
  ------------+-----------+--------------+------------+------+--------
   1698958350 |         0 | (54859821,2) | 1511487270 |    0 |   1448
  (1 row)

And:

  efamroot@kat efam=# SELECT * FROM efam.sendreference WHERE id = '366275833';
  ERROR:  unexpected chunk number 0 (expected 1) for toast value 1698945095 in pg_toast_10919630
  efamroot@kat efam=# select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698945095 order by 1,2;
    chunk_id  | chunk_seq |     ctid     |    xmin    | xmax | length
  ------------+-----------+--------------+------------+------+--------
   1698945095 |         0 | (53706565,3) | 1511426847 |    0 |   1996
   1698945095 |         1 | (53706565,6) | 1511426847 |    0 |    108
  (2 rows)

One more:

  efamroot@kat efam=# SELECT * FROM efam.sendreference WHERE id = '189909908';
  ERROR:  unexpected chunk number 0 (expected 1) for toast value 1698936148 in pg_toast_10919630
  efamroot@kat efam=# select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698936148 order by 1,2;
    chunk_id  | chunk_seq |     ctid     |    xmin    | xmax | length
  ------------+-----------+--------------+------------+------+--------
   1698936148 |         0 | (52888694,2) | 1511390221 |    0 |   1996
   1698936148 |         1 | (52888694,4) | 1511390221 |    0 |   1148
  (2 rows)

And here is one from another table:

  efamroot@kat efam=# SELECT * FROM efam.sequence WHERE id = '235887163';
  ERROR:  unexpected chunk number 0 (expected 1) for toast value 1698750544 in pg_toast_10919630
  efamroot@kat efam=# select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698750544 order by 1,2;
    chunk_id  | chunk_seq |     ctid     |    xmin    | xmax | length
  ------------+-----------+--------------+------------+------+--------
   1698750544 |         0 | (39575142,3) | 1510704835 |    0 |   1996
   1698750544 |         1 | (39575142,4) | 1510704835 |    0 |    716
  (2 rows)

Let me know what other relevant info I can provide.

Best regards,

  Adam

-- 
 "No more than that, but very powerful all the                Adam Sjøgren
  same; simple things are good."                         a...@novozymes.com
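A systematic way to look for more of these (a sketch, not from the
thread) is to scan the toast table once for duplicated sequence
numbers; note that this reads the entire table, so it will be slow:

    SELECT chunk_id, chunk_seq, count(*)
    FROM pg_toast.pg_toast_10919630
    GROUP BY chunk_id, chunk_seq
    HAVING count(*) > 1
    ORDER BY chunk_id, chunk_seq;

Any row returned would identify a toast value whose chunks appear more
than once.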
Re: ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100
a...@novozymes.com (Adam =?utf-8?Q?Sj=C3=B8gren?=) writes:
> Here's a statement which currently gives an unexpected chunk error:
>   efamroot@kat efam=# SELECT * FROM efam.sendreference WHERE id = '189909908';
>   ERROR:  unexpected chunk number 0 (expected 1) for toast value 1698936148 in pg_toast_10919630

OK ...

> And when I run the suggested query, I get:
>   efamroot@kat efam=# select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698936148 order by 1,2;
>     chunk_id  | chunk_seq |     ctid     |    xmin    | xmax | length
>   ------------+-----------+--------------+------------+------+--------
>    1698936148 |         0 | (52888694,2) | 1511390221 |    0 |   1996
>    1698936148 |         1 | (52888694,4) | 1511390221 |    0 |   1148
>   (2 rows)

Hmph.  So if you EXPLAIN that query, does it show it's doing it as an
indexscan?  I'd expect so, but it's always good to make sure.

Assuming it does say that, then the other test I had in mind would
involve "set enable_indexscan = 0", then repeat the EXPLAIN to make
sure that you now get a seqscan plan (you might need to turn off
enable_bitmapscan too), then do the query again and see whether the
results are the same.

			regards, tom lane
pg_basebackup or dump for starting replication process
For replication purposes only, is there any difference between using
pg_basebackup or a dump to copy data from the master to the slave?

The docs say that pg_basebackup can be used both for point-in-time
recovery and as the starting point for a log-shipping or streaming
replication standby server.

We are using just a dump; are there any caveats to using it for logical
replication?
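For the physical-replication case, a typical pg_basebackup invocation
looks something like the sketch below; the host, user and data
directory are placeholders:

    pg_basebackup -h primary.example.com -U replicator \
        -D /var/lib/postgresql/10/main \
        -X stream -R -P
    # -X stream : stream the WAL needed for consistency during the backup
    # -R        : write a recovery.conf so the copy starts as a standby
    # -P        : show progress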
Re: ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100
Tom writes:

>> And when I run the suggested query, I get:
>
>>   efamroot@kat efam=# select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698936148 order by 1,2;
>>     chunk_id  | chunk_seq |     ctid     |    xmin    | xmax | length
>>   ------------+-----------+--------------+------------+------+--------
>>    1698936148 |         0 | (52888694,2) | 1511390221 |    0 |   1996
>>    1698936148 |         1 | (52888694,4) | 1511390221 |    0 |   1148
>>   (2 rows)

> Hmph.  So if you EXPLAIN that query, does it show it's doing it as an
> indexscan?  I'd expect so, but it's always good to make sure.

It does:

  efamroot@kat efam=# SELECT * FROM efam.sendreference WHERE id = '189909908';
  ERROR:  unexpected chunk number 0 (expected 1) for toast value 1698936148 in pg_toast_10919630
  efamroot@kat efam=# select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698936148 order by 1,2;
    chunk_id  | chunk_seq |     ctid     |    xmin    | xmax | length
  ------------+-----------+--------------+------------+------+--------
   1698936148 |         0 | (52888694,2) | 1511390221 |    0 |   1996
   1698936148 |         1 | (52888694,4) | 1511390221 |    0 |   1148
  (2 rows)

  efamroot@kat efam=# explain select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698936148 order by 1,2;
                                                   QUERY PLAN
  --------------------------------------------------------------------------------------------------------------
   Index Scan using pg_toast_10919630_index on pg_toast_10919630  (cost=0.57..2627179.25 rows=2135674 width=54)
     Index Cond: (chunk_id = 1698936148::oid)
  (2 rows)

> Assuming it does say that, then the other test I had in mind would
> involve "set enable_indexscan = 0", then repeat the EXPLAIN to make
> sure that you now get a seqscan plan (you might need to turn off
> enable_bitmapscan too), then do the query again and see whether the
> results are the same.

Ok (if I don't disable bitmapscan, I get Bitmap Heap Scans in the
EXPLAIN, so):

  efamroot@kat efam=# set enable_indexscan = 0;
  SET
  efamroot@kat efam=# set enable_bitmapscan = 0;
  SET
  efamroot@kat efam=# explain select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698936148 order by 1,2;
                                        QUERY PLAN
  ---------------------------------------------------------------------------------------
   Sort  (cost=96465280.57..96470619.75 rows=2135674 width=54)
     Sort Key: chunk_seq
     ->  Seq Scan on pg_toast_10919630  (cost=0.00..96240754.39 rows=2135674 width=54)
           Filter: (chunk_id = 1698936148::oid)
  (4 rows)

  efamroot@kat efam=# select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698936148 order by 1,2;
  [... still waiting for the result, I will return with what it said
       when the server does ...]

Best regards,

  Adam

-- 
 "No more than that, but very powerful all the                Adam Sjøgren
  same; simple things are good."                         a...@novozymes.com
Docker + PostgreSQL: OOM killing in a large GROUP BY operation
Hi Guys:

I have a problem with a query that grabs a bunch of rows and then does
an aggregate operation; at that moment it gets killed by the OOM
killer.  I don't know why: the engine starts using temp files as
expected, and then tries to work in memory and gets killed.

I've tested it in a small environment for more dramatic/quick results.

Env: running in a Docker container with 1 GB of memory.

PostgreSQL 9.4.8 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu
4.8.2-19ubuntu1) 4.8.2, 64-bit

Database analyzed, and the stats are up to date.

postgresql.conf:
  shared_buffers = 512MB
  work_mem = 16MB
  effective_cache_size = 256MB

Bad query:

  SELECT count(*)
  FROM "changelog_change_transaction"
  INNER JOIN "changelog_change_stats" ON (
      changelog_change_stats.changelog_change_transactionid =
      changelog_change_transaction.changelog_change_transactionid )
  LEFT OUTER JOIN "changelog_change_group" ON (
      changelog_change_transaction.changelog_change_groupid =
      changelog_change_group.changelog_change_groupid )
  WHERE ( changelog_change_group.companyid = 40
          OR changelog_change_group.companyid = 1
          OR changelog_change_group.companyid = 53
          OR changelog_change_group.companyid IS NULL )
    AND changelog_change_transaction.started_at > '2017-04-21'
  GROUP BY "changelog_change_transaction"."changelog_change_transactionid",
           "changelog_change_transaction"."epoch",
           "changelog_change_transaction"."changelog_change_groupid",
           "changelog_change_transaction"."started_at",
           "changelog_change_transaction"."duration_microseconds",
           "changelog_change_transaction"."changed_items",
           "changelog_change_transaction"."xmin";

Explain:

  HashAggregate  (cost=7845766.73..8117654.17 rows=27188744 width=152)
    Group Key: changelog_change_transaction.changelog_change_transactionid, changelog_change_transaction.epoch, changelog_change_transaction.changelog_change_groupid, changelog_change_transaction.started_at, changelog_change_transaction.duration_microseconds, changelog_change_transaction.changed_items, changelog_change_transaction.xmin
    ->  Hash Left Join  (cost=2498235.67..7301991.85 rows=27188744 width=152)
          Hash Cond: (changelog_change_transaction.changelog_change_groupid = changelog_change_group.changelog_change_groupid)
          Filter: ((changelog_change_group.companyid = 40) OR (changelog_change_group.companyid = 1) OR (changelog_change_group.companyid = 53) OR (changelog_change_group.companyid IS NULL))
          ->  Hash Join  (cost=2142692.83..5176273.34 rows=27726867 width=152)
                Hash Cond: (changelog_change_stats.changelog_change_transactionid = changelog_change_transaction.changelog_change_transactionid)
                ->  Seq Scan on changelog_change_stats  (cost=0.00..689345.48 rows=33612148 width=6)
                ->  Hash  (cost=1215858.45..1215858.45 rows=27272350 width=152)
                      ->  Seq Scan on changelog_change_transaction  (cost=0.00..1215858.45 rows=27272350 width=152)
                            Filter: (started_at > '2017-04-21 00:00:00-07'::timestamp with time zone)
          ->  Hash  (cost=200291.82..200291.82 rows=8931282 width=10)
                ->  Seq Scan on changelog_change_group  (cost=0.00..200291.82 rows=8931282 width=10)

Log of the execution:

  2018-04-05 09:07:12.444 PDT rhost=[local] app=psql:user=postgres:db=telecom:LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp6455.19", size 9437168
  ..
  2018-04-05 09:07:25.605 PDT rhost= app=:user=:db=:LOG:  server process (PID 6455) was terminated by signal 9: Killed
  2018-04-05 09:07:25.605 PDT rhost= app=:user=:db=:DETAIL:  Failed process was running: SELECT count(*) FROM "changelog_change_transaction" [... the same query as above ...]
  2018-04-05 09:07:25.605 PDT rhost= app=:user=:db=:LOG:  terminating any other active server processes
  2018-04-05 09:07:25.605 PDT rhost= app=:user=:db=:WARNING:  terminating connection because of crash of another server process

Monitoring /proc/<pid>/status of the running client process, I noticed
this:

  VmData:  7944 kB
  VmData:  7944 kB
  VmData:  7944 kB
  VmData:
Re: Docker + PostgreSQL: OOM killing in a large GROUP BY operation
Jorge Daniel writes:

> I have a problem with a query that grabs a bunch of rows and then does
> an aggregate operation; at that moment it gets killed by the OOM
> killer.  I don't know why: the engine starts using temp files as
> expected, and then tries to work in memory and gets killed.

> SELECT count(*)
> FROM "changelog_change_transaction"
> INNER JOIN "changelog_change_stats" ON (
>     changelog_change_stats.changelog_change_transactionid =
>     changelog_change_transaction.changelog_change_transactionid )
> LEFT OUTER JOIN "changelog_change_group" ON (
>     changelog_change_transaction.changelog_change_groupid =
>     changelog_change_group.changelog_change_groupid )
> WHERE ( changelog_change_group.companyid = 40
>         OR changelog_change_group.companyid = 1
>         OR changelog_change_group.companyid = 53
>         OR changelog_change_group.companyid IS NULL )
>   AND changelog_change_transaction.started_at > '2017-04-21'
> GROUP BY "changelog_change_transaction"."changelog_change_transactionid",
>          "changelog_change_transaction"."epoch",
>          "changelog_change_transaction"."changelog_change_groupid",
>          "changelog_change_transaction"."started_at",
>          "changelog_change_transaction"."duration_microseconds",
>          "changelog_change_transaction"."changed_items",
>          "changelog_change_transaction"."xmin";

Why are you grouping on xmin?

> For sure it's the GROUP BY that causes this OOM (when I removed it,
> the query finished ok), so I changed the query plan to avoid the
> HashAggregate:
> But the explain still shows:

That's because type XID doesn't have sort support, only hash support,
so hash aggregation is the only way to do the query at all.

			regards, tom lane
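A possible workaround, if the xmin values are only needed to
disambiguate rows: casting xmin to text gives the planner a sortable
grouping key, so a sort-based GroupAggregate (whose sort step can spill
to disk) becomes available.  A sketch against the query from the
report, not verified on that schema:

    SET enable_hashagg = off;  -- with the cast below, a sorted plan exists

    SELECT count(*)
    FROM changelog_change_transaction t
    INNER JOIN changelog_change_stats s
        ON s.changelog_change_transactionid = t.changelog_change_transactionid
    LEFT OUTER JOIN changelog_change_group g
        ON t.changelog_change_groupid = g.changelog_change_groupid
    WHERE (g.companyid IN (1, 40, 53) OR g.companyid IS NULL)
      AND t.started_at > '2017-04-21'
    GROUP BY t.changelog_change_transactionid, t.epoch,
             t.changelog_change_groupid, t.started_at,
             t.duration_microseconds, t.changed_items,
             t.xmin::text;    -- text, unlike xid, has sort support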
Re: PgUpgrade bumped my XIDs by ~50M?
Jerry Sievers writes:

> Bruce Momjian writes:
>
>> On Wed, Apr 4, 2018 at 07:13:36PM -0500, Jerry Sievers wrote:
>>
>>> Bruce Momjian writes:
>>> > Is it possible that pg_upgrade used 50M xids while upgrading?
>>>
>>> Hi Bruce.
>>>
>>> Don't think so, as I did just snap the safety snap and ran another
>>> upgrade on that.
>>>
>>> And I also compared txid_current for the upgraded snap and our
>>> running production instance.
>>>
>>> The freshly upgraded snap is ~50M txids behind the prod instance.
>>
>> Are the objects 50M behind or is txid_current 50M different?  Higher
>> or lower?
>
> txid_current is another 12M higher than a few hours ago.  Still waiting
> to hear from my reporting team if they changed anything.

Reporting team claims nothing changed.

I still have 150 tables ready for autovac just based on freeze age
value.

Autovac is running at our as-configured max worker count of 20, with
all threads busy as expected.

If I can assume stats such as pg_stat_database start out cleared after
an upgrade... please see pg_stat_database below, showing about the
number of transactions that I'd expect for this system over a ~1.5 day
period.

How have some objects apparently aged by ~100M transactions (by now, at
last check) since the upgrade??

Thanks

  postgres=# select sum(xact_rollback + xact_commit) from pg_stat_database;
     sum
  ---------
   5292417
  (1 row)

  postgres=# select now() - pg_postmaster_start_time();
         ?column?
  -----------------------
   1 day 13:18:48.721896
  (1 row)

  postgres=# select version();
                                                            version
  ------------------------------------------------------------------------------------------------------------------------------
   PostgreSQL 9.6.8 on x86_64-pc-linux-gnu (Ubuntu 9.6.8-1.pgdg16.04+1), compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609, 64-bit
  (1 row)

> This thing is running PgLogical and has a few of our event triggers as
> well.  But nothing in this regard changed with the upgrade.
>
> What if some very frequent but trivial statements that did not get
> assigned a real TXID in 9.5 on this configuration are now being
> treated differently?
>
> What's puzzling too is that when I run my TPS monitor script, it's
> clicking along at what looks typical; presently that would only amount
> to 700k transactions/day, but we're off-peak.
>
> Thx
>>
>>> If this is a not too uncommon case of users running amok, then this
>>> time in particular they really went off the charts :-)
>>
>> I have never heard of this problem.

-- 
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consult...@comcast.net
p: 312.241.7800
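For reference, the freeze-age check described above can be done with a
catalog query along these lines:

    SELECT c.oid::regclass AS table_name,
           age(c.relfrozenxid) AS xid_age
    FROM pg_class c
    WHERE c.relkind = 'r'
    ORDER BY age(c.relfrozenxid) DESC
    LIMIT 20;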
Re: ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100
Adam writes:

> efamroot@kat efam=# explain select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698936148 order by 1,2;
>                                       QUERY PLAN
> ---------------------------------------------------------------------------------------
>  Sort  (cost=96465280.57..96470619.75 rows=2135674 width=54)
>    Sort Key: chunk_seq
>    ->  Seq Scan on pg_toast_10919630  (cost=0.00..96240754.39 rows=2135674 width=54)
>          Filter: (chunk_id = 1698936148::oid)
> (4 rows)
>
> efamroot@kat efam=# select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698936148 order by 1,2;
> [... still waiting for the result, I will return with what it said
>      when the server does ...]

It did eventually finish, with the same result:

  efamroot@kat efam=# select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698936148 order by 1,2;
    chunk_id  | chunk_seq |     ctid     |    xmin    | xmax | length
  ------------+-----------+--------------+------------+------+--------
   1698936148 |         0 | (52888694,2) | 1511390221 |    0 |   1996
   1698936148 |         1 | (52888694,4) | 1511390221 |    0 |   1148
  (2 rows)

Best regards,

  Adam

-- 
 "No more than that, but very powerful all the                Adam Sjøgren
  same; simple things are good."                         a...@novozymes.com
Re: PgUpgrade bumped my XIDs by ~50M?
On Wed, Apr 4, 2018 at 08:29:06PM -0400, Bruce Momjian wrote:
> On Wed, Apr 4, 2018 at 07:13:36PM -0500, Jerry Sievers wrote:
> > Bruce Momjian writes:
> > > Is it possible that pg_upgrade used 50M xids while upgrading?
> >
> > Hi Bruce.
> >
> > Don't think so, as I did just snap the safety snap and ran another
> > upgrade on that.
> >
> > And I also compared txid_current for the upgraded snap and our
> > running production instance.
> >
> > The freshly upgraded snap is ~50M txids behind the prod instance.
>
> Are the objects 50M behind or is txid_current 50M different?  Higher or
> lower?

Uh, here is a report of a similar problem from March 6, 2018:

  https://www.postgresql.org/message-id/flat/C44C73BC-6B3A-42E0-9E44-6CE4E5B5D601%40ebureau.com#c44c73bc-6b3a-42e0-9e44-6ce4e5b5d...@ebureau.com

  I upgraded a very large database from 9.6 to 10.1 via pg_upgrade
  recently, and ever since, the auto vacuum has been busy on a large
  legacy table that has experienced no changes since the upgrade. If
  the whole table had been frozen prior to the upgrade, would you
  expect it to stay frozen?

It sure smells like we have a bug here.  Could this be statistics
collection instead?

-- 
  Bruce Momjian                            http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +
Re: PgUpgrade bumped my XIDs by ~50M?
Bruce Momjian writes:

> On Wed, Apr 4, 2018 at 08:29:06PM -0400, Bruce Momjian wrote:
>
>> On Wed, Apr 4, 2018 at 07:13:36PM -0500, Jerry Sievers wrote:
>> > Bruce Momjian writes:
>> > > Is it possible that pg_upgrade used 50M xids while upgrading?
>> >
>> > Hi Bruce.
>> >
>> > Don't think so, as I did just snap the safety snap and ran another
>> > upgrade on that.
>> >
>> > And I also compared txid_current for the upgraded snap and our
>> > running production instance.
>> >
>> > The freshly upgraded snap is ~50M txids behind the prod instance.
>>
>> Are the objects 50M behind or is txid_current 50M different?  Higher
>> or lower?
>
> Uh, here is a report of a similar problem from March 6, 2018:
>
>   https://www.postgresql.org/message-id/flat/C44C73BC-6B3A-42E0-9E44-6CE4E5B5D601%40ebureau.com#c44c73bc-6b3a-42e0-9e44-6ce4e5b5d...@ebureau.com
>
>   I upgraded a very large database from 9.6 to 10.1 via pg_upgrade
>   recently, and ever since, the auto vacuum has been busy on a large
>   legacy table that has experienced no changes since the upgrade. If
>   the whole table had been frozen prior to the upgrade, would you
>   expect it to stay frozen?
>
> It sure smells like we have a bug here.  Could this be statistics
> collection instead?

No clue, but we still have the 9.5 safety snap that I can make repeated
sub-snaps of for more testing.

I did one such test yesterday and things looked $normal after the
upgrade.  A noteworthy omission: I did *not* run the post-analysis.

I could repeat the test more thoroughly.

We run a parallel analyzer framework to get huge DBs processed quickly,
and we generally do it in 2 passes: the first with
default_statistics_target scaled down, to finish quicker and hopefully
be good enough to run production.  At this point we open the DB cluster
for business.  Immediately following, we again run the analyzer at full
stats target.

Any other suggestions on what else to look at, and I'll be glad to try
them and report back.

Big Thanks!

-- 
Jerry Sievers
e: jerry.siev...@comcast.net
p: 312.241.7800
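Per session, the two-pass scheme described above amounts to something
like this (the targets are illustrative; the real framework runs many
tables in parallel):

    -- pass 1: coarse statistics, fast, good enough to open for business
    SET default_statistics_target = 10;
    ANALYZE;

    -- pass 2: full-resolution statistics once the cluster is open
    SET default_statistics_target = 100;
    ANALYZE;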
Re: ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100
a...@novozymes.com (Adam =?utf-8?Q?Sj=C3=B8gren?=) writes:
>> [... still waiting for the result, I will return with what it said
>>      when the server does ...]

> It did eventually finish, with the same result:

Huh.  So what we have here, apparently, is that regular MVCC snapshots
think there is exactly one copy of the 1698936148/0 row, but TOAST
fetches think there is more than one.  This is darn odd, not least
because we never do UPDATEs in toast tables, only inserts and deletes,
so there certainly shouldn't be update chains there.

It seems like you've got some corner case wherein SnapshotToast sees a
row that isn't visible according to MVCC --- probably a row left over
from some previous cycle of life.  That is, I'm imagining the OID
counter wrapped around and we've reused a toast OID, but for some
reason there's still a row in the table with that OID.  I'm not sure
offhand how we could get into such a state.  Alvaro, does this ring any
bells (remembering that this is 9.3)?

			regards, tom lane
decompose big queries
Hi,

I want to know the best practices to use in order to decompose a big
query which contains many joins.  Is it recommended to use stored
procedures?  Or is there another solution?
Re: decompose big queries
On Thu, Apr 5, 2018 at 3:39 PM, hmidi slim wrote:

> I want to know the best practices to use in order to decompose a big
> query which contains many joins.  Is it recommended to use stored
> procedures?  Or is there another solution?

Views are another solution:

  https://www.postgresql.org/docs/10/static/tutorial-views.html

While "building views upon views" is not uncommon, and it does increase
the readability of each individual view, it adds update fragility to
the system.

Encapsulating high-level API concepts in functions, and then writing
the queries within those functions with the benefit of views, is one
possibility.

Your client access needs and general data model are necessary inputs to
choosing a preferred design.

David J.
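A sketch of that layering, with a schema invented purely for
illustration: one view per reusable piece of the big query, and a
function as the client-facing API.

    -- each view captures one reusable piece of the big query
    CREATE VIEW active_orders AS
        SELECT o.id, o.customer_id, o.total
        FROM orders o
        WHERE o.status = 'active';

    CREATE VIEW customer_order_totals AS
        SELECT c.id AS customer_id, c.name, sum(a.total) AS order_total
        FROM customers c
        JOIN active_orders a ON a.customer_id = c.id
        GROUP BY c.id, c.name;

    -- clients call the function; the views behind it can be refactored
    -- later without touching client code
    CREATE FUNCTION top_customers(n integer)
    RETURNS TABLE (customer_id integer, name text, order_total numeric)
    LANGUAGE sql STABLE
    AS $$
        SELECT customer_id, name, order_total
        FROM customer_order_totals
        ORDER BY order_total DESC
        LIMIT n;
    $$;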
ERROR: found multixact from before relminmxid
Hi,

Some time ago I frequently had these errors in the logs after
autovacuum of certain tables (PG 9.6).  VACUUM FULL or CLUSTER on those
tables showed the same error and did not complete (as shown by a table
bloat query).  I then did a full dump/restore into a new version
(10.2), and everything was ok for a couple of months.  Now I have
errors like this again:

  db1=# cluster pc44t;
  ERROR:  found multixact 134100944 from before relminmxid 192042633

As before, the only way to make the errors disappear is to dump/reload
the whole table.

Thanks for any help.

Best regards,

Alexandre
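For what it's worth, how far a table's relminmxid lags can be checked
with mxid_age(), available in 9.5 and later; something like:

    SELECT relname, relminmxid, mxid_age(relminmxid) AS multixact_age
    FROM pg_class
    WHERE relname = 'pc44t';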
Re: ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100
On Fri, Apr 6, 2018 at 2:34 AM, Tom Lane wrote:

> a...@novozymes.com (Adam =?utf-8?Q?Sj=C3=B8gren?=) writes:
> >> [... still waiting for the result, I will return with what it said
> >>      when the server does ...]
>
> > It did eventually finish, with the same result:
>
> Huh.  So what we have here, apparently, is that regular MVCC snapshots
> think there is exactly one copy of the 1698936148/0 row, but TOAST
> fetches think there is more than one.  This is darn odd, not least
> because we never do UPDATEs in toast tables, only inserts and deletes,
> so there certainly shouldn't be update chains there.
>
> It seems like you've got some corner case wherein SnapshotToast sees a
> row that isn't visible according to MVCC --- probably a row left over
> from some previous cycle of life.  That is, I'm imagining the OID
> counter wrapped around and we've reused a toast OID, but for some
> reason there's still a row in the table with that OID.  I'm not sure
> offhand how we could get into such a state.  Alvaro, does this ring
> any bells (remembering that this is 9.3)?

FWIW one of our support customers reported a very similar TOAST table
corruption issue last week which nearly caused an outage.  After a lot
of analysis, I think I've now fully understood the reasons behind the
corruption, the underlying bug(s) and a possible remedy.  I am
currently working on writing a reproducible test case to demonstrate
the problem, and on writing the fix.  More details on that soon.

Thanks,
Pavan

-- 
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: pg_basebackup or dump for starting replication process
PegoraroF10 wrote:
> For replication purposes only, is there any difference between using
> pg_basebackup or a dump to copy data from the master to the slave?
> The docs say that pg_basebackup can be used both for point-in-time
> recovery and as the starting point for a log-shipping or streaming
> replication standby server.
>
> We are using just a dump; are there any caveats to using it for
> logical replication?

For *logical* replication you can start with a "pg_dump --schema-only",
but for *physical* replication only a pg_basebackup will do.

Make sure that the tables are empty before you start logical
replication.

Yours,
Laurenz Albe

-- 
Cybertec | https://www.cybertec-postgresql.com
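On PostgreSQL 10, that sequence might look like the sketch below; the
database, user and host names are placeholders:

    -- 1. copy just the schema to the subscriber, e.g.:
    --      pg_dump --schema-only -d mydb | psql -h subscriber -d mydb

    -- 2. on the publisher:
    CREATE PUBLICATION mypub FOR ALL TABLES;

    -- 3. on the subscriber (the initial table sync then copies the
    --    data into the empty tables):
    CREATE SUBSCRIPTION mysub
        CONNECTION 'host=publisher dbname=mydb user=repuser'
        PUBLICATION mypub;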