Re: Concurrent CTE

2018-04-05 Thread Dorian Hoxha
Can you share the full query, how many rows each table has, how often the
tables change, and the full EXPLAIN?

On Thu, Apr 5, 2018 at 8:01 AM,  wrote:

> Did you look at this approach using dblink already?
>
> https://gist.github.com/mjgleaso/8031067
>
> In your situation, you will have to modify the example, but it may give
> you an idea of where to start.
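>
> Roughly, the idea is to fire the independent CTE parts asynchronously over
> separate dblink connections and collect the results afterwards. A minimal,
> untested sketch (connection strings, queries and result column definitions
> are placeholders):
>
>   CREATE EXTENSION IF NOT EXISTS dblink;
>
>   SELECT dblink_connect('c1', 'dbname=mydb');
>   SELECT dblink_connect('c2', 'dbname=mydb');
>
>   -- send the independent parts; they run concurrently in their own backends
>   SELECT dblink_send_query('c1', 'SELECT * FROM table1');
>   SELECT dblink_send_query('c2', 'SELECT * FROM table2');
>
>   -- collect the results (each call blocks until that query is done)
>   SELECT * FROM dblink_get_result('c1') AS aa(id int, payload text);
>   SELECT * FROM dblink_get_result('c2') AS bb(id int, payload text);
>
>   SELECT dblink_disconnect('c1');
>   SELECT dblink_disconnect('c2');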
>
> Klaus
>
> > -Ursprüngliche Nachricht-
> > Von: Artur Formella 
> > Gesendet: Dienstag, 3. April 2018 22:01
> > An: pgsql-general@lists.postgresql.org
> > Betreff: Concurrent CTE
> >
> > Hello!
> > We have a lot of big CTEs (~40 statements, ~1000 lines) for very dynamic
> > OLTP content, with an average response time of 50-300 ms. Our setup has
> > 96 threads (Intel Xeon Gold 6128), 256 GB RAM and 12 SSDs (3 tablespaces).
> > DB size < RAM.
> > Simplifying the problem:
> >
> > WITH aa AS (
> >    SELECT * FROM table1
> > ), bb AS (
> >    SELECT * FROM table2
> > ), cc AS (
> >    SELECT * FROM table3
> > ), dd AS (
> >    SELECT * FROM aa, bb
> > ), ee AS (
> >    SELECT * FROM aa, bb, cc
> > ), ff AS (
> >    SELECT * FROM ee, dd
> > ), gg AS (
> >    SELECT * FROM table4
> > ), hh AS (
> >    SELECT * FROM aa
> > )
> > SELECT * FROM gg, hh, ff /* primary statement */
> >
> > Execution now:
> > time-->
> > Thread1: aa | bb | cc | dd | ee | ff | gg | hh | primary
> >
> > And the question: is it possible to achieve a more concurrent execution
> > plan, to reduce the response time? For example:
> > Thread1: aa | dd | ff | primary
> > Thread2: bb | ee | gg
> > Thread3: cc | -- | hh
> >
> > Table1, table2 and table3 are located on separate tablespaces and are
> > independent.
> > Partial results (aa, bb, cc, dd, ee) are quite big and slow (full text
> > search, arrays, custom collations, function scans...).
> >
> > We are considering moving away from the CTE and rewriting it in RxJava,
> > but we are afraid of having to download the partial results and send
> > them back with WHERE IN (...).
> >
> > Thanks!
> >
> > Artur Formella
> >
> >
>
>
>
>


Re: ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2018-04-05 Thread Adam Sjøgren
Tom writes:

> a...@novozymes.com (Adam Sjøgren) writes:
>> Also, the error we are getting is now: "unexpected chunk number 2
>> (expected 3) for toast value 1498303849 in pg_toast_10919630", where
>> previously we've only seen "unexpected chunk number 0 (expected 1)".
>
>> We are kind of at a loss, so any suggestions on what we could try are
>> welcome.
>
> The basic thrust of these messages is "I'm reading what should be
> sequentially numbered data chunks for this toast OID, and the sequence
> numbers are wrong".  Both of these instances could be explained by
> duplicate toast rows (or duplicate index entries pointing at one row),
> though of course that would just move to the next question of how it
> got that way.  Anyway, you could move the investigation along with
> some manual checking into what's in that toast table.  For instance
>
> select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data)
>   from pg_toast.pg_toast_10919630
>   where chunk_id = 1498303849
>   order by 1,2;
>
> might be informative.  If you do see what seem to be duplicate
> chunk_seq values, checking whether they're still there in a
> seqscan would be good.

Here's a statement which currently gives an unexpected chunk error:

  efamroot@kat efam=# SELECT * FROM efam.sendreference WHERE id = '189909908';
  ERROR:  unexpected chunk number 0 (expected 1) for toast value 1698936148 in pg_toast_10919630

And when I run the suggested query, I get:

  efamroot@kat efam=# select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698936148 order by 1,2;
    chunk_id  | chunk_seq |     ctid     |    xmin    | xmax | length
  ------------+-----------+--------------+------------+------+--------
   1698936148 |         0 | (52888694,2) | 1511390221 |    0 |   1996
   1698936148 |         1 | (52888694,4) | 1511390221 |    0 |   1148
  (2 rows)

How would I go about checking whether they are still there in a seqscan?

(Note: this is on PostgreSQL 9.3.22.)


  Best regards,

Adam

-- 
 "No more than that, but very powerful all theAdam Sjøgren
  same; simple things are good."a...@novozymes.com




Re: ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2018-04-05 Thread Adam Sjøgren
Adam writes:

> Here's a statement which currently gives an unexpected chunk error:
>
>   efamroot@kat efam=# SELECT * FROM efam.sendreference WHERE id = '189909908';
>   ERROR:  unexpected chunk number 0 (expected 1) for toast value 1698936148 in pg_toast_10919630
>
> And when I run the suggested query, I get:
>
>   efamroot@kat efam=# select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698936148 order by 1,2;
>     chunk_id  | chunk_seq |     ctid     |    xmin    | xmax | length
>   ------------+-----------+--------------+------------+------+--------
>    1698936148 |         0 | (52888694,2) | 1511390221 |    0 |   1996
>    1698936148 |         1 | (52888694,4) | 1511390221 |    0 |   1148
>   (2 rows)

More examples:

  efamroot@kat efam=# SELECT * FROM efam.sendreference WHERE id = '237764759';
  ERROR:  unexpected chunk number 0 (expected 1) for toast value 1698958350 in pg_toast_10919630
  efamroot@kat efam=# select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698958350 order by 1,2;
    chunk_id  | chunk_seq |     ctid     |    xmin    | xmax | length
  ------------+-----------+--------------+------------+------+--------
   1698958350 |         0 | (54859821,2) | 1511487270 |    0 |   1448
  (1 row)

And:

  efamroot@kat efam=# SELECT * FROM efam.sendreference WHERE id = '366275833';
  ERROR:  unexpected chunk number 0 (expected 1) for toast value 1698945095 in pg_toast_10919630
  efamroot@kat efam=# select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698945095 order by 1,2;
    chunk_id  | chunk_seq |     ctid     |    xmin    | xmax | length
  ------------+-----------+--------------+------------+------+--------
   1698945095 |         0 | (53706565,3) | 1511426847 |    0 |   1996
   1698945095 |         1 | (53706565,6) | 1511426847 |    0 |    108
  (2 rows)

One more:

  efamroot@kat efam=# SELECT * FROM efam.sendreference WHERE id = '189909908';
  ERROR:  unexpected chunk number 0 (expected 1) for toast value 1698936148 in pg_toast_10919630
  efamroot@kat efam=# select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698936148 order by 1,2;
    chunk_id  | chunk_seq |     ctid     |    xmin    | xmax | length
  ------------+-----------+--------------+------------+------+--------
   1698936148 |         0 | (52888694,2) | 1511390221 |    0 |   1996
   1698936148 |         1 | (52888694,4) | 1511390221 |    0 |   1148
  (2 rows)

And here is one from another table:

  efamroot@kat efam=# SELECT * FROM efam.sequence WHERE id = '235887163';
  ERROR:  unexpected chunk number 0 (expected 1) for toast value 1698750544 in pg_toast_10919630
  efamroot@kat efam=# select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698750544 order by 1,2;
    chunk_id  | chunk_seq |     ctid     |    xmin    | xmax | length
  ------------+-----------+--------------+------------+------+--------
   1698750544 |         0 | (39575142,3) | 1510704835 |    0 |   1996
   1698750544 |         1 | (39575142,4) | 1510704835 |    0 |    716
  (2 rows)

Let me know what other relevant info I can provide.


  Best regards,

Adam

-- 
 "No more than that, but very powerful all theAdam Sjøgren
  same; simple things are good."a...@novozymes.com




Re: ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2018-04-05 Thread Tom Lane
a...@novozymes.com (Adam Sjøgren) writes:
> Here's a statement which currently gives an unexpected chunk error:

>   efamroot@kat efam=# SELECT * FROM efam.sendreference WHERE id = '189909908';
>   ERROR:  unexpected chunk number 0 (expected 1) for toast value 1698936148 in pg_toast_10919630

OK ...

> And when I run the suggested query, I get:

>   efamroot@kat efam=# select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698936148 order by 1,2;
>     chunk_id  | chunk_seq |     ctid     |    xmin    | xmax | length
>   ------------+-----------+--------------+------------+------+--------
>    1698936148 |         0 | (52888694,2) | 1511390221 |    0 |   1996
>    1698936148 |         1 | (52888694,4) | 1511390221 |    0 |   1148
>   (2 rows)

Hmph.  So if you EXPLAIN that query, does it show it's doing it as an
indexscan?  I'd expect so, but it's always good to make sure.

Assuming it does say that, then the other test I had in mind would
involve "set enable_indexscan = 0", then repeat the EXPLAIN to make
sure that you now get a seqscan plan (you might need to turn off
enable_bitmapscan too), then do the query again and see whether the
results are the same.

regards, tom lane



pg_basebackup or dump for starting replication process

2018-04-05 Thread PegoraroF10
For replication purposes only, is there any difference between pg_basebackup
and dump to copy data from the master to the slave?
The docs say that pg_basebackup can be used both for point-in-time
recovery and as the starting point for a log-shipping or streaming-replication
standby server.

We are using just a dump; are there any caveats to using it for logical
replication?






Re: ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2018-04-05 Thread Adam Sjøgren
Tom writes:

>> And when I run the suggested query, I get:
>
>>   efamroot@kat efam=# select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698936148 order by 1,2;
>>     chunk_id  | chunk_seq |     ctid     |    xmin    | xmax | length
>>   ------------+-----------+--------------+------------+------+--------
>>    1698936148 |         0 | (52888694,2) | 1511390221 |    0 |   1996
>>    1698936148 |         1 | (52888694,4) | 1511390221 |    0 |   1148
>>   (2 rows)

> Hmph.  So if you EXPLAIN that query, does it show it's doing it as an
> indexscan?  I'd expect so, but it's always good to make sure.

It does:

  efamroot@kat efam=# SELECT * FROM efam.sendreference WHERE id = '189909908';
  ERROR:  unexpected chunk number 0 (expected 1) for toast value 1698936148 in pg_toast_10919630
  efamroot@kat efam=# select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698936148 order by 1,2;
    chunk_id  | chunk_seq |     ctid     |    xmin    | xmax | length
  ------------+-----------+--------------+------------+------+--------
   1698936148 |         0 | (52888694,2) | 1511390221 |    0 |   1996
   1698936148 |         1 | (52888694,4) | 1511390221 |    0 |   1148
  (2 rows)

  efamroot@kat efam=# explain select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698936148 order by 1,2;
                                                   QUERY PLAN
  ---------------------------------------------------------------------------------------------------------------
   Index Scan using pg_toast_10919630_index on pg_toast_10919630  (cost=0.57..2627179.25 rows=2135674 width=54)
     Index Cond: (chunk_id = 1698936148::oid)
  (2 rows)

> Assuming it does say that, then the other test I had in mind would
> involve "set enable_indexscan = 0", then repeat the EXPLAIN to make
> sure that you now get a seqscan plan (you might need to turn off
> enable_bitmapscan too), then do the query again and see whether the
> results are the same.

Ok (if I don't disable bitmapscan, I get Bitmap Heap Scans in the EXPLAIN, so):

  efamroot@kat efam=# set enable_indexscan = 0;
  SET
  efamroot@kat efam=# set enable_bitmapscan = 0;
  SET
  efamroot@kat efam=# explain select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698936148 order by 1,2;
                                       QUERY PLAN
  ----------------------------------------------------------------------------------------
   Sort  (cost=96465280.57..96470619.75 rows=2135674 width=54)
     Sort Key: chunk_seq
     ->  Seq Scan on pg_toast_10919630  (cost=0.00..96240754.39 rows=2135674 width=54)
           Filter: (chunk_id = 1698936148::oid)
  (4 rows)

  efamroot@kat efam=# select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698936148 order by 1,2;
  [... still waiting for the result, I will return with what it said
   when the server does ...]


  Best regards,

Adam

-- 
 "No more than that, but very powerful all theAdam Sjøgren
  same; simple things are good."a...@novozymes.com




Docker + postgreSQL : OOM killing in a large Group by operation

2018-04-05 Thread Jorge Daniel
Hi guys,
I have a problem with a query that grabs a bunch of rows and then does an
aggregate operation; at that moment it gets killed by the OOM killer, and I
don't know why. The engine starts using temp files as expected, and then
tries to work in memory and gets killed.
I've tested it in a small environment for more dramatic/quicker results.

Env:
Running on a 1GB memory Docker-container .

PostgreSQL 9.4.8 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 
4.8.2-19ubuntu1) 4.8.2, 64-bit
Database Analyzed and the stats are up-to-date

postgresql.conf:
shared_buffers = 512MB
work_mem = 16MB
effective_cache_size = 256MB


Bad Query:

SELECT count(*)
  FROM "changelog_change_transaction"
  INNER JOIN "changelog_change_stats"
          ON ( changelog_change_stats.changelog_change_transactionid =
               changelog_change_transaction.changelog_change_transactionid )
  LEFT OUTER JOIN "changelog_change_group"
          ON ( changelog_change_transaction.changelog_change_groupid =
               changelog_change_group.changelog_change_groupid )
 WHERE ( changelog_change_group.companyid = 40
      OR changelog_change_group.companyid = 1
      OR changelog_change_group.companyid = 53
      OR changelog_change_group.companyid IS NULL )
   AND changelog_change_transaction.started_at > '2017-04-21'
 GROUP BY "changelog_change_transaction"."changelog_change_transactionid",
          "changelog_change_transaction"."epoch",
          "changelog_change_transaction"."changelog_change_groupid",
          "changelog_change_transaction"."started_at",
          "changelog_change_transaction"."duration_microseconds",
          "changelog_change_transaction"."changed_items",
          "changelog_change_transaction"."xmin"
;

Explain:

 HashAggregate  (cost=7845766.73..8117654.17 rows=27188744 width=152)
   Group Key: changelog_change_transaction.changelog_change_transactionid, changelog_change_transaction.epoch, changelog_change_transaction.changelog_change_groupid, changelog_change_transaction.started_at, changelog_change_transaction.duration_microseconds, changelog_change_transaction.changed_items, changelog_change_transaction.xmin
   ->  Hash Left Join  (cost=2498235.67..7301991.85 rows=27188744 width=152)
         Hash Cond: (changelog_change_transaction.changelog_change_groupid = changelog_change_group.changelog_change_groupid)
         Filter: ((changelog_change_group.companyid = 40) OR (changelog_change_group.companyid = 1) OR (changelog_change_group.companyid = 53) OR (changelog_change_group.companyid IS NULL))
         ->  Hash Join  (cost=2142692.83..5176273.34 rows=27726867 width=152)
               Hash Cond: (changelog_change_stats.changelog_change_transactionid = changelog_change_transaction.changelog_change_transactionid)
               ->  Seq Scan on changelog_change_stats  (cost=0.00..689345.48 rows=33612148 width=6)
               ->  Hash  (cost=1215858.45..1215858.45 rows=27272350 width=152)
                     ->  Seq Scan on changelog_change_transaction  (cost=0.00..1215858.45 rows=27272350 width=152)
                           Filter: (started_at > '2017-04-21 00:00:00-07'::timestamp with time zone)
         ->  Hash  (cost=200291.82..200291.82 rows=8931282 width=10)
               ->  Seq Scan on changelog_change_group  (cost=0.00..200291.82 rows=8931282 width=10)

Log execution:

2018-04-05 09:07:12.444 PDT rhost=[local] 
app=psql:user=postgres:db=telecom:LOG:  temporary file: path 
"base/pgsql_tmp/pgsql_tmp6455.19", size 9437168
..
2018-04-05 09:07:25.605 PDT rhost= app=:user=:db=:LOG:  server process (PID 
6455) was terminated by signal 9: Killed
2018-04-05 09:07:25.605 PDT rhost= app=:user=:db=:DETAIL:  Failed process was 
running: SELECT count(*)
 FROM "changelog_change_transaction"
 INNER JOIN "changelog_change_stats" ON ( 
changelog_change_stats.changelog_change_transactionid = 
changelog_change_transaction.changelog_change_transactionid )
 LEFT OUTER JOIN "changelog_change_group" ON ( 
changelog_change_transaction.changelog_change_groupid = 
changelog_change_group.changelog_change_groupid )

WHERE ( changelog_change_group.companyid = 40 OR 
changelog_change_group.companyid = 1 OR changelog_change_group.companyid = 53 
OR changelog_change_group.companyid IS NULL )
AND changelog_change_transaction.started_at > '2017-04-21'
GROUP BY  "changelog_change_transaction"."changelog_change_transactionid", 
"changelog_change_transaction"."epoch", "changelog_change_transaction"
."changelog_change_groupid", "changelog_change_transaction"."started_at", 
"changelog_change_transaction"."duration_microseconds", 
"changelog_change_transaction"."changed_items", 
"changelog_change_transaction"."xmin"
;
2018-04-05 09:07:25.605 PDT rhost= app=:user=:db=:LOG:  terminating any other 
active server processes
2018-04-05 09:07:25.605 PDT rhost= app=:user=:db=:WARNING:  terminating 
connection because of crash of another server process


Monitoring /proc/*/status of the running client process, I've noticed this:

VmData: 7944 kB
VmData: 7944 kB
VmData: 7944 kB
VmData: 

Re: Docker + postgreSQL : OOM killing in a large Group by operation

2018-04-05 Thread Tom Lane
Jorge Daniel  writes:
> I have a problem with a query that grabs a bunch of rows and then does an
> aggregate operation; at that moment it gets killed by the OOM killer, and I
> don't know why. The engine starts using temp files as expected, and then
> tries to work in memory and gets killed.

> SELECT count(*)
>  FROM "changelog_change_transaction"
>  INNER JOIN "changelog_change_stats" ON ( 
> changelog_change_stats.changelog_change_transactionid = 
> changelog_change_transaction.changelog_change_transactionid )
>  LEFT OUTER JOIN "changelog_change_group" ON ( 
> changelog_change_transaction.changelog_change_groupid = 
> changelog_change_group.changelog_change_groupid )

> WHERE ( changelog_change_group.companyid = 40 OR 
> changelog_change_group.companyid = 1 OR changelog_change_group.companyid = 53 
> OR changelog_change_group.companyid IS NULL )
> AND changelog_change_transaction.started_at > '2017-04-21'
> GROUP BY  "changelog_change_transaction"."changelog_change_transactionid", 
> "changelog_change_transaction"."epoch", "changelog_change_transaction"
> ."changelog_change_groupid", "changelog_change_transaction"."started_at", 
> "changelog_change_transaction"."duration_microseconds", 
> "changelog_change_transaction"."changed_items", 
> "changelog_change_transaction"."xmin"
> ;

Why are you grouping on xmin?

> For sure the GROUP BY is what causes this OOM (when I removed it, the
> query finished ok), so I've changed the query plan to avoid the HashAggregate.
> But the explain still shows:

That's because type XID doesn't have sort support, only hash support,
so hash aggregation is the only way to do the query at all.
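
As a hedged illustration of one possible workaround (not the original
schema; "some_table" and "id" are made-up stand-ins): if the xmin check is
still wanted, casting it to text gives the planner a sortable grouping key,
so a sort-based GroupAggregate, which can spill to disk, becomes possible;
dropping xmin from the GROUP BY entirely has the same effect:

  -- hedged sketch only: text sorts, xid does not
  SELECT id, xmin::text AS row_xmin, count(*)
    FROM some_table
   GROUP BY id, xmin::text;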

regards, tom lane



Re: PgUpgrade bumped my XIDs by ~50M?

2018-04-05 Thread Jerry Sievers
Jerry Sievers  writes:

> Bruce Momjian  writes:
>
>> On Wed, Apr  4, 2018 at 07:13:36PM -0500, Jerry Sievers wrote:
>>
>>> Bruce Momjian  writes:
>>> > Is it possible that pg_upgrade used 50M xids while upgrading?
>>> 
>>> Hi Bruce.
>>> 
>>> Don't think so, as I did just snap the safety snap and ran another
>>> upgrade on that.
>>> 
>>> And I also compared txid_current for the upgraded snap and our running
>>> production instance.
>>> 
>>> The freshly upgraded snap is ~50M txids behind the prod instance.
>>
>> Are the objects 50M behind or is txid_current 50M different?  Higher or
>> lower?
>
> txid_current is another 12M higher than a few hours ago.  Still waiting
> to hear from my reporting team on whether they changed anything.

Reporting team claims nothing changed.  

I still have 150 tables ready for autovacuum just based on freeze age.
Autovacuum is running at our configured max worker count of 20, with all
workers busy, as expected.
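
(For reference, a hedged sketch of the kind of freeze-age check that list
is based on; the LIMIT is arbitrary:)

  SELECT relname, age(relfrozenxid) AS xid_age
    FROM pg_class
   WHERE relkind = 'r'
   ORDER BY age(relfrozenxid) DESC
   LIMIT 20;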

If I can assume that stats such as pg_stat_database start out cleared
after an upgrade...

...then pg_stat_database below shows about the number of transactions
I'd expect for this system over a ~1.5 day period.

How can some objects apparently have aged by ~100M transactions (by now,
at last check) since the upgrade??

Thanks



postgres=# select sum(xact_rollback + xact_commit) from pg_stat_database;
   sum
---------
 5292417
(1 row)

postgres=# select now() - pg_postmaster_start_time();
   ?column?
---
 1 day 13:18:48.721896
(1 row)

postgres=# select version();
                                                               version
--------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.6.8 on x86_64-pc-linux-gnu (Ubuntu 9.6.8-1.pgdg16.04+1), compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609, 64-bit
(1 row)


>
> This thing is running PgLogical and has a few of our event triggers as
> well.  But nothing in this regard changed with the upgrade.
>
> What if some very frequent but trivial statements that did not get
> assigned a real TXID in 9.5 on this configuration now are being treated
> differently?
>
> What's puzzling too is that when I run my TPS monitor script, it's
> clicking along at what looks typical; presently that would only amount
> to 700k transactions/day, but we're off-peak.
>
> Thx
>>
>>
>>> 
>>> If this is a not too uncommon case of users running amok, then this time
>>> in particular they really went off the charts :-)
>>
>> I have never heard of this problem.

-- 
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consult...@comcast.net
p: 312.241.7800



Re: ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2018-04-05 Thread Adam Sjøgren
Adam writes:

>   efamroot@kat efam=# explain select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698936148 order by 1,2;
>                                        QUERY PLAN
>   ----------------------------------------------------------------------------------------
>    Sort  (cost=96465280.57..96470619.75 rows=2135674 width=54)
>      Sort Key: chunk_seq
>      ->  Seq Scan on pg_toast_10919630  (cost=0.00..96240754.39 rows=2135674 width=54)
>            Filter: (chunk_id = 1698936148::oid)
>   (4 rows)
>
>   efamroot@kat efam=# select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698936148 order by 1,2;
>   [... still waiting for the result, I will return with what it said
>when the server does ...]

It did eventually finish, with the same result:

  efamroot@kat efam=# select chunk_id, chunk_seq, ctid, xmin, xmax, length(chunk_data) from pg_toast.pg_toast_10919630 where chunk_id = 1698936148 order by 1,2;
    chunk_id  | chunk_seq |     ctid     |    xmin    | xmax | length
  ------------+-----------+--------------+------------+------+--------
   1698936148 |         0 | (52888694,2) | 1511390221 |    0 |   1996
   1698936148 |         1 | (52888694,4) | 1511390221 |    0 |   1148
  (2 rows)


  Best regards,

Adam

-- 
 "No more than that, but very powerful all theAdam Sjøgren
  same; simple things are good."a...@novozymes.com




Re: PgUpgrade bumped my XIDs by ~50M?

2018-04-05 Thread Bruce Momjian
On Wed, Apr  4, 2018 at 08:29:06PM -0400, Bruce Momjian wrote:
> On Wed, Apr  4, 2018 at 07:13:36PM -0500, Jerry Sievers wrote:
> > Bruce Momjian  writes:
> > > Is it possible that pg_upgrade used 50M xids while upgrading?
> > 
> > Hi Bruce.
> > 
> > Don't think so, as I did just snap the safety snap and ran another
> > upgrade on that.
> > 
> > And I also compared txid_current for the upgraded snap and our running
> > production instance.
> > 
> > The freshly upgraded snap is ~50M txids behind the prod instance.
> 
> Are the objects 50M behind or is txid_current 50M different?  Higher or
> lower?

Uh, here is a report of a similar problem from March 6, 2018:


https://www.postgresql.org/message-id/flat/C44C73BC-6B3A-42E0-9E44-6CE4E5B5D601%40ebureau.com#c44c73bc-6b3a-42e0-9e44-6ce4e5b5d...@ebureau.com

I upgraded a very large database from 9.6 to 10.1 via pg_upgrade
recently, and ever since, the auto vacuum has been busy on a large
legacy table that has experienced no changes since the upgrade. If the
whole table had been frozen prior to the upgrade, would you expect it to
stay frozen? 

It sure smells like we have a bug here.  Could this be statistics
collection instead?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: PgUpgrade bumped my XIDs by ~50M?

2018-04-05 Thread Jerry Sievers
Bruce Momjian  writes:

> On Wed, Apr  4, 2018 at 08:29:06PM -0400, Bruce Momjian wrote:
>
>> On Wed, Apr  4, 2018 at 07:13:36PM -0500, Jerry Sievers wrote:
>> > Bruce Momjian  writes:
>> > > Is it possible that pg_upgrade used 50M xids while upgrading?
>> > 
>> > Hi Bruce.
>> > 
>> > Don't think so, as I did just snap the safety snap and ran another
>> > upgrade on that.
>> > 
>> > And I also compared txid_current for the upgraded snap and our running
>> > production instance.
>> > 
>> > The freshly upgraded snap is ~50M txids behind the prod instance.
>> 
>> Are the objects 50M behind or is txid_current 50M different?  Higher or
>> lower?
>
> Uh, here is a report of a similar problem from March 6, 2018:
>
>   
> https://www.postgresql.org/message-id/flat/C44C73BC-6B3A-42E0-9E44-6CE4E5B5D601%40ebureau.com#c44c73bc-6b3a-42e0-9e44-6ce4e5b5d...@ebureau.com
>
>   I upgraded a very large database from 9.6 to 10.1 via pg_upgrade
>   recently, and ever since, the auto vacuum has been busy on a large
>   legacy table that has experienced no changes since the upgrade. If the
>   whole table had been frozen prior to the upgrade, would you expect it to
>   stay frozen? 
>
> It sure smells like we have a bug here.  Could this be statistics
> collection instead?

No clue, but we still have the 9.5 safety snap that I can make repeated
sub-snaps of for more testing.

I did one such test yesterday and things looked $normal after the
upgrade.

Noteworthy omission was that I did *not* run the post-analysis.

I could repeat the test more thoroughly.

We run a parallel analyzer framework to get huge DBs processed quickly.

And we generally do it in 2 passes: the first with default_statistics_target
scaled down, to finish quicker and hopefully be good enough to run
production.  At this point we open the DB cluster for business.

Immediately afterwards we run the analyzer again at the full statistics target.
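
(As a rough, hedged sketch of that two-pass idea; the target values and job
count are only illustrative, not our framework's actual settings:)

  -- pass 1: coarse stats, fast; run before opening for business
  ALTER SYSTEM SET default_statistics_target = 10;
  SELECT pg_reload_conf();
  -- then, from the shell:  vacuumdb --analyze-only --jobs=8 mydb

  -- pass 2: full stats, immediately afterwards
  ALTER SYSTEM SET default_statistics_target = 100;
  SELECT pg_reload_conf();
  -- then again:  vacuumdb --analyze-only --jobs=8 mydb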

Any other suggestions on what else to look at, and I'll be glad to follow
up and report back.

Big Thanks!

-- 
Jerry Sievers
e: jerry.siev...@comcast.net
p: 312.241.7800



Re: ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2018-04-05 Thread Tom Lane
a...@novozymes.com (Adam Sjøgren) writes:
>> [... still waiting for the result, I will return with what it said
>> when the server does ...]

> It did eventually finish, with the same result:

Huh.  So what we have here, apparently, is that regular MVCC snapshots
think there is exactly one copy of the 1698936148/0 row, but TOAST fetches
think there is more than one.  This is darn odd, not least because we
never do UPDATEs in toast tables, only inserts and deletes, so there
certainly shouldn't be update chains there.

It seems like you've got some corner case wherein SnapshotToast sees a row
that isn't visible according to MVCC --- probably a row left over from
some previous cycle of life.  That is, I'm imagining the OID counter
wrapped around and we've reused a toast OID, but for some reason there's
still a row in the table with that OID.  I'm not sure offhand how we could
get into such a state.  Alvaro, does this ring any bells (remembering that
this is 9.3)?

regards, tom lane



decompose big queries

2018-04-05 Thread hmidi slim
Hi,
I want to know the best practices for decomposing a big query which
contains many joins. Is it recommended to use stored procedures? Or is
there any other solution?


Re: decompose big queries

2018-04-05 Thread David G. Johnston
On Thu, Apr 5, 2018 at 3:39 PM, hmidi slim  wrote:

> I want to know the best practices for decomposing a big query which
> contains many joins. Is it recommended to use stored procedures? Or is
> there any other solution?
>

Views are another solution.

https://www.postgresql.org/docs/10/static/tutorial-views.html

Though while "building views upon views" is not uncommon, and it does
increase the readability of each individual view, it adds update fragility
to the system.

Encapsulating high-level API concepts in functions and then writing queries
within those functions with the benefit of views is one possibility.  Your
client access needs and general data model are necessary to choose a
preferred design.
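
As a tiny, hedged sketch of that pattern (all table, view, and function
names here are invented for illustration):

  CREATE VIEW active_orders AS
    SELECT o.order_id, o.customer_id, o.total
      FROM orders o
     WHERE o.status = 'active';

  CREATE FUNCTION orders_for_customer(p_customer_id integer)
  RETURNS SETOF active_orders
  LANGUAGE sql STABLE AS $$
    SELECT * FROM active_orders WHERE customer_id = p_customer_id;
  $$;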

David J.


ERROR: found multixact from before relminmxid

2018-04-05 Thread Alexandre Arruda
Hi,

Some time ago, I had these errors frequently showing up in the logs after
autovacuum ran on some tables (pg 9.6). VACUUM FULL or CLUSTER on these
tables showed the same error and did not complete (as shown by a
table-bloat query).
Then I did a full dump/restore into a new version (10.2) and everything was
ok for a couple of months. Now I have errors like this again:

db1=# cluster pc44t;

ERROR:  found multixact 134100944 from before relminmxid 192042633


Like before, the only way to make the errors disappear is to dump/reload
the whole table.
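
(For reference, a hedged sketch of how the per-table multixact horizon can
be inspected; mxid_age() exists from 9.5 on, so it should work on 10.2:)

  SELECT relname, relminmxid, mxid_age(relminmxid) AS minmxid_age
    FROM pg_class
   WHERE relkind = 'r'
   ORDER BY mxid_age(relminmxid) DESC
   LIMIT 20;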


Thanks for any help.


Best regards,


Alexandre


Re: ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100

2018-04-05 Thread Pavan Deolasee
On Fri, Apr 6, 2018 at 2:34 AM, Tom Lane  wrote:

> a...@novozymes.com (Adam Sjøgren) writes:
> >> [... still waiting for the result, I will return with what it said
> >> when the server does ...]
>
> > It did eventually finish, with the same result:
>
> Huh.  So what we have here, apparently, is that regular MVCC snapshots
> think there is exactly one copy of the 1698936148/0 row, but TOAST fetches
> think there is more than one.  This is darn odd, not least because we
> never do UPDATEs in toast tables, only inserts and deletes, so there
> certainly shouldn't be update chains there.
>
> It seems like you've got some corner case wherein SnapshotToast sees a row
> that isn't visible according to MVCC --- probably a row left over from
> some previous cycle of life.  That is, I'm imagining the OID counter
> wrapped around and we've reused a toast OID, but for some reason there's
> still a row in the table with that OID.  I'm not sure offhand how we could
> get into such a state.  Alvaro, does this ring any bells (remembering that
> this is 9.3)?
>

FWIW one of our support customers reported a very similar TOAST table
corruption issue last week which nearly caused an outage. After a lot of
analysis, I think I've now fully understood the reasons behind the
corruption, the underlying bug(s) and possible remedy. I am currently
working on writing a reproducible test case to demonstrate the problem and
writing the fix. More details on that soon.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: pg_basebackup or dump for starting replication process

2018-04-05 Thread Laurenz Albe
PegoraroF10 wrote:
> For replication purposes only, is there any difference between pg_basebackup
> and dump to copy data from the master to the slave?
> The docs say that pg_basebackup can be used both for point-in-time
> recovery and as the starting point for a log-shipping or streaming-replication
> standby server.
> 
> We are using just a dump; are there any caveats to using it for logical
> replication?

For *logical* replication you can start with a "pg_dump --schema-only",
but for *physical* replication only a pg_basebackup will do.

Make sure that the tables are empty before you start logical replication.
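
A hedged sketch of the two bootstrap paths (host, database, publication and
subscription names are placeholders; the logical part assumes the built-in
logical replication of PostgreSQL 10):

  -- physical standby (shell, not SQL):
  --   pg_basebackup -h master.example.com -U replicator -D /path/to/standby -X stream -R

  -- logical replication: copy the schema first (pg_dump --schema-only | psql),
  -- then on the master:
  CREATE PUBLICATION mypub FOR ALL TABLES;
  -- and on the (empty) subscriber:
  CREATE SUBSCRIPTION mysub
    CONNECTION 'host=master.example.com dbname=mydb user=replicator'
    PUBLICATION mypub;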

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com