Re: [PERFORM] POWA doesn't show queries executed

2017-11-21 Thread phb07

Hi,
You should probably report your issue at 
https://github.com/dalibo/powa/issues

KR

On 18/11/2017 at 02:52, Neto pr wrote:

Dear all

I have successfully installed POWA (http://dalibo.github.io/powa), including 
all required extensions; see the screenshots of it running at the end of this 
email.


But when I execute queries at the psql command line, those queries are not 
monitored by POWA. I have checked, and only PostgreSQL internal catalog 
queries are shown.
I need the Optimize Query functionality and, above all, the index suggestions.
But that does not work: clicking the optimize query option returns zero 
suggestions.


Below I created a test scenario with a table containing a large amount of 
data, to check whether the tool would suggest an index; even for a more 
complex query, no index is suggested.


Does anyone who uses POWA know whether something has to be configured so that 
queries are monitored and suggestions are shown?


-- Screenshots of my environment, partially working: --


https://sites.google.com/site/eletrolareshop/repositorio/powa1.jpeg
https://sites.google.com/site/eletrolareshop/repositorio/powa2.jpeg
https://sites.google.com/site/eletrolareshop/repositorio/powa3.jpeg

--- scenario to verify the index suggestions ---


postgres=# create table city_habitant (number_habitant text);
CREATE TABLE
postgres=# insert into city_habitant (number_habitant) select 'São Paulo' from (select generate_series(1, 400)) a;
INSERT 0 400
postgres=# insert into city_habitant (number_habitant) select 'Rio de Janeiro' from (select generate_series(1, 800)) a;
INSERT 0 800
postgres=# insert into city_habitant (number_habitant) select 'Recife' from (select generate_series(1, 600)) a;
INSERT 0 600
postgres=# insert into city_habitant (number_habitant) select 'Santos' from (select generate_series(1, 200)) a;
INSERT 0 200
postgres=# insert into city_habitant (number_habitant) select 'Chui' from (select generate_series(1, 6)) a;
INSERT 0 6
postgres=# SELECT number_habitant, count(number_habitant) FROM city_habitant GROUP BY number_habitant;

 number_habitant | count
-----------------+-------
 Rio de Janeiro  |   800
 Recife          |   600
 Santos          |   200
 São Paulo       |   400
 Chui            |     6
(5 rows)

 







Re: insert and query performance on big string table with pg_trgm

2017-11-21 Thread Matthew Hall
Hi Jeff,

Thanks so much for writing. You've got some great points.

> On Nov 20, 2017, at 5:42 PM, Jeff Janes  wrote:
> While I have not done exhaustive testing, from the tests I have done I've 
> never found gist to be better than gin with trgm indexes.

Thanks, this helps considerably, as the documentation was kind of confusing and 
I didn't want to get it wrong if I could avoid it.

> Do you really need the artificial primary key, when you already have another 
> column that would be used as the primary key?  If you need to use this as a 
> foreign key in another table, then you very well might.  But maintaining two 
> unique indexes doesn't come free.

OK, fair enough, I'll test with it removed and see what happens.

> Are all indexes present at the time you insert?  It will probably be much 
> faster to insert without the gin index (at least) and build it after the load.

There is some flexibility on the initial load, but the updates in the future 
will require the de-duplication capability. I'm willing to accept a somewhat 
slower load process in exchange for accurate updates, provided we can try to 
meet the read-side goal I wrote about, or at least figure out why it's 
impossible, so I can understand what I need to fix to make it possible.

> Without knowing this key fact, it is hard to interpret the rest of your data.

I'm assuming you're referring to the part about the need for the primary key, 
and the indexes during loading? I did try to describe that in the earlier mail, 
but obviously I'm new at writing these, so sorry if I didn't make it clearer. I 
can get rid of the bigserial PK, and the indexes could be created separately, 
but I would still need a way to de-duplicate on future reloads... that's why I 
had the ON CONFLICT DO NOTHING clause on the INSERT. So we'd still want to 
learn why the INSERT is slow, to fix up the update processes that will happen 
in the future.
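
To make that concrete, the load pattern I have in mind is roughly the sketch 
below. It reuses the huge_table / value names from the plan further down; the 
staging table and the unlogged setting are illustrative assumptions, not my 
exact schema:

    -- de-duplication relies on a unique constraint on value
    CREATE UNLOGGED TABLE huge_table (
        value varchar UNIQUE NOT NULL
    );

    -- future reloads: rows whose value already exists are silently skipped
    INSERT INTO huge_table (value)
    SELECT raw_value
    FROM staging_table
    ON CONFLICT (value) DO NOTHING;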

> * maintenance_work_mem 512 MB
> 
> Building a gin index in bulk could benefit from more memory here. 

Fixed it; I will re-test w/ 1 GB. Have you got any recommended values so I 
don't screw it up?
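
For reference, what I plan to run for the post-load index build is roughly 
this (table and index names match the EXPLAIN output below; the 1 GB figure is 
just the value we discussed, so treat it as a sketch):

    -- more memory for the index build, in this session only
    SET maintenance_work_mem = '1GB';

    -- trigram GIN index built after the bulk load
    CREATE EXTENSION IF NOT EXISTS pg_trgm;
    CREATE INDEX huge_table_value_trgm ON huge_table
        USING gin (value gin_trgm_ops);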

> * synchronous_commit off
> 
> If you already are using unlogged tables, this might not be so helpful, but 
> does increase the risk of the rest of your system.

Fixed it; the unlogged mode change came later than this did.

>   PID USER  PR  NIVIRTRESSHR S  %CPU %MEM TIME+ COMMAND
> 29578 postgres  20   0 6575672 6.149g 6.139g R  86.0 39.7  45:24.97 postgres
> 
> You should expand the command line (by hitting 'c', at least in my version of 
> top) so we can see which postgres process this is.

Good point, I'll write back once I retry w/ your other advice.

> explain (analyze, buffers), please.  And hopefully with track_io_timing=on.

track_io_timing was missing because sadly I had only found it in one document 
at the very end of the investigation, after doing the big job which generated 
all of the material posted. It's there now, so here is some better output on 
the query:

explain (analyze, buffers) select * from huge_table where value ilike 
'%canada%';

 Bitmap Heap Scan on huge_table  (cost=273.44..61690.09 rows=16702 width=33) 
(actual time=5701.511..76469.688 rows=110166 loops=1)
   Recheck Cond: ((value)::text ~~* '%canada%'::text)
   Rows Removed by Index Recheck: 198
   Heap Blocks: exact=66657
   Buffers: shared hit=12372 read=56201 dirtied=36906
   I/O Timings: read=74195.734
   ->  Bitmap Index Scan on huge_table_value_trgm  (cost=0.00..269.26 
rows=16702 width=0) (actual time=5683.032..5683.032 rows=110468 loops=1)
 Index Cond: ((value)::text ~~* '%canada%'::text)
 Buffers: shared hit=888 read=1028
 I/O Timings: read=5470.839
 Planning time: 0.271 ms
 Execution time: 76506.949 ms

I will work some more on the insert piece.

> If you repeat the same query, is it then faster, or is it still slow?

If you keep the expression exactly the same, it still takes a few seconds as 
could be expected for such a torture test query, but it's still WAY faster than 
the first such query. If you change it out to a different expression, it's 
longer again of course. There does seem to be a low-to-medium correlation 
between the number of rows found and the query completion time.

> Cheers,
> Jeff

Thanks,
Matthew.


Using GROUPING SETS with more than one set disables predicate pushdown?

2017-11-21 Thread Zakhar Shapurau
Hi!
First of all, thanks for the great work! PostgreSQL is amazing, and
community is super helpful.

I found some unexpected behaviour in PostgreSQL, and on IRC I was advised to
post it to the performance mailing list.

Using GROUPING SETS with more than one set disables predicate pushdown?

Version:
PostgreSQL 9.6.6 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609, 64-bit

It seems that when GROUPING SETS with at least two sets is used in a subquery,
the planner cannot push WHERE clauses down into it.

Here are two queries that (I think) are equivalent, but produce very different
execution plans, leading to bad performance on real data and, in effect,
making it impossible to abstract away non-trivial grouping logic into a view.

It may well be that the queries are not really equivalent, but I don't
see how.

The same problem happens even if the grouping sets are identical, like
`GROUPING SETS ((), ())`.

CREATE TEMPORARY TABLE test_gs (
x INT,
y INT,
z INT,
PRIMARY KEY (x, y, z)
);

EXPLAIN
SELECT
x,
y,
avg(z) AS mean
FROM test_gs
WHERE x = 1
GROUP BY x, GROUPING SETS ((y), ());

                              QUERY PLAN
------------------------------------------------------------------------
 GroupAggregate  (cost=0.15..8.65 rows=20 width=40)
   Group Key: x, y
   Group Key: x
   ->  Index Only Scan using test_gs_pkey on test_gs  (cost=0.15..8.33 rows=10 width=12)
         Index Cond: (x = 1)
(5 rows)



EXPLAIN
SELECT x, y, mean
FROM (
 SELECT
 x,
 y,
 avg(z) AS mean
 FROM test_gs
 GROUP BY x, GROUPING SETS ((y), ())
 ) AS g
WHERE x = 1;

                              QUERY PLAN
------------------------------------------------------------------------
 GroupAggregate  (cost=0.15..62.10 rows=404 width=40)
   Group Key: test_gs.x, test_gs.y
   Group Key: test_gs.x
   Filter: (test_gs.x = 1)
   ->  Index Only Scan using test_gs_pkey on test_gs  (cost=0.15..41.75 rows=2040 width=12)
(5 rows)


The issue here is that the second query does not use the index to filter on
x = 1; instead it reads all the tuples from the index and applies the filter
afterwards.
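
For now, the only workaround I see is to apply the predicate inside the
grouping query itself, for example by wrapping it in a function instead of a
view. This is just an untested sketch (function and output column names are
made up):

    CREATE FUNCTION test_gs_means(p_x int)
    RETURNS TABLE (out_x int, out_y int, out_mean numeric)
    LANGUAGE sql STABLE AS $$
        SELECT x, y, avg(z)
        FROM test_gs
        WHERE x = p_x
        GROUP BY x, GROUPING SETS ((y), ())
    $$;

    SELECT * FROM test_gs_means(1);

That keeps the filter next to the scan, but of course it is not as composable
as a view.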

Here is also a description in gist:
https://gist.github.com/zeveshe/cf92c9d2a6b14518af3180113e767ae7

Thanks a lot!

-- 
  Zakhar Shapurau
  [email protected]
+47 407 54 397



pg_dump 3 times as slow after 8.4 -> 9.5 upgrade

2017-11-21 Thread Henrik Cednert (Filmlance)
Hello

We use a system in film production called DaVinci Resolve. It uses a pgsql 
database when you work in a collaborative workflow and multiple people share 
projects. Previously it used pgsql 8.4, but for a new major upgrade they 
recommend upgrading to 9.5, probably also due to some macOS limitation/support 
issue, since 9.x is required for macOS >10.11.

They (BlackMagic Design) provide three tools for the migration.
1. One for dumping everything from the old 8.4 database
2. One for upgrading from 8.4 to 9.5
3. One for restoring the backup from step 1 into 9.5

All that went smoothly, and working in the system is also smooth and as good 
as before, maybe even a bit better/faster.

What's not working smoothly is my daily pg_dump, though. I don't have a 
reference for what counts as a big or a small database, since I'm no db guy 
and don't really maintain or work with it on a daily basis; this is pretty 
much the only system we use that has a database like this. Below is a list of 
what we dump.

930M Nov 18 13:31 filmserver03_2017-11-18_132043_dailies_2017_01.backup
2.2K Nov 18 13:20 filmserver03_2017-11-18_132043_postgres.backup
522K Nov 18 13:20 filmserver03_2017-11-18_132043_resolve.backup
23G Nov 18 19:37 filmserver03_2017-11-18_132043_resolve_2017_01.backup
5.1G Nov 18 20:54 filmserver03_2017-11-18_132043_resolve_2017_02.backup
10G Nov 18 23:34 filmserver03_2017-11-18_132043_resolve_filmserver02.backup
516K Nov 18 23:35 filmserver03_2017-11-18_132043_temp_backup_test.backup
1.9G Nov 19 00:05 filmserver03_2017-11-18_132043_temp_dev_resolve14.backup


The last pg_dump with 8.4 took 212 minutes and 49 seconds. Now, with 9.5, the 
very same pg_dump takes 644 minutes and 40 seconds. So it takes about three 
times as long now and I have no idea why. Nothing in the system or hardware 
other than the pgsql upgrade has changed.

I dump the DBs with a custom script, and this is the line I use to get the 
list of DBs:
DATABASES=$(${BINARY_PATH}/psql --user=postgres -w --no-align --tuples-only 
--command="SELECT datname from pg_database WHERE NOT datistemplate")

After that I iterate over them with a for loop and dump with:
${BINARY_PATH}/pg_dump --host=localhost --user=postgres --no-password --blobs 
--format=custom --verbose --file=${pg_dump_filename}_${database}.backup 
${database} | tee -a ${log_pg_dump}_${database}.log

When observing the system during the dump it LOOKS like it did in 8.4: pg_dump 
is using 100% of one core, and from what I can see it does this throughout the 
operation. But it's still so much slower. I read about the parallel option in 
pg_dump for 9.5, but sadly I cannot dump like that because the application in 
question can (probably) not import that format on its own and I would have to 
use pg_restore or something. Which in theory is fine, but sometimes one of the 
artists has to import the db backup, so I need to keep it simple.

The system is:
MacPro 5,1
2x2.66 GHz Quad Core Xeon
64 GB RAM
macOS 10.11.6
PostgreSQL 9.5.4
DB on a 6 disk SSD RAID


I hope I got all the info needed. Really hope someone with more expertise and 
skills than me can point me in the right direction.

Cheers and thanks


--
Henrik Cednert
cto | compositor




Re: Using GROUPING SETS with more than one set disables predicate pushdown?

2017-11-21 Thread Tom Lane
Zakhar Shapurau  writes:
> Using GROUPING SETS with more than one set disables predicate pushdown?

It looks like this is a case that no one's gotten round to yet.
The comment in the relevant code is

 * In some cases we may want to transfer a HAVING clause into WHERE. We
 * cannot do so if the HAVING clause contains aggregates (obviously) or
 * volatile functions (since a HAVING clause is supposed to be executed
 * only once per group).  We also can't do this if there are any nonempty
 * grouping sets; moving such a clause into WHERE would potentially change
 * the results, if any referenced column isn't present in all the grouping
 * sets.  (If there are only empty grouping sets, then the HAVING clause
 * must be degenerate as discussed below.)

Presumably, we could examine the grouping sets to identify column(s)
present in all sets, and then allow the optimization for clauses that
reference only such columns.  Or maybe I'm misreading the comment
(but then it needs clarification).

regards, tom lane



Re: [PERFORM] POWA doesn't show queries executed

2017-11-21 Thread Marco Nietz

Hi,

POWA relies on extensions (pg_stat_statements, pg_qualstats) that need to be 
installed in every database you want to monitor. Maybe you just installed 
them only into the postgres database?!
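
As a rough sketch (the exact extension list depends on your POWA version, so 
check the docs), connect to each database you want monitored and run something 
like:

    -- pg_stat_statements (and pg_stat_kcache, if you use it) must also be
    -- listed in shared_preload_libraries in postgresql.conf (restart needed)
    CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
    CREATE EXTENSION IF NOT EXISTS btree_gist;
    CREATE EXTENSION IF NOT EXISTS pg_qualstats;
    CREATE EXTENSION IF NOT EXISTS pg_stat_kcache;
    CREATE EXTENSION IF NOT EXISTS powa;

pg_qualstats is the one that feeds the index suggestions, so it is the 
important one for your case.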


Best regards
Marco






RE: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade

2017-11-21 Thread Igor Neyman

From: Henrik Cednert (Filmlance) [mailto:[email protected]]
Sent: Tuesday, November 21, 2017 9:29 AM
To: [email protected]
Subject: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade

According to the pg_dump command in your script, you are dumping your 
databases in custom format:

--format=custom

These backups can only be restored using pg_restore (or something that wraps 
pg_restore).
So you can safely add the parallel option; it should not affect your restore 
procedure.
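
For reference, a rough sketch of how that could look (job counts and paths 
here are made up; note that a parallel dump itself requires the directory 
format, while parallel restore works with both directory- and custom-format 
archives):

    # parallel dump, directory format
    pg_dump --host=localhost --username=postgres --no-password \
        --format=directory --jobs=4 \
        --file=/backups/${database}.dump.d ${database}

    # parallel restore of that dump
    pg_restore --host=localhost --username=postgres --no-password \
        --jobs=4 --dbname=${database} /backups/${database}.dump.d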

Regards,
Igor Neyman



Re: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade

2017-11-21 Thread Henrik Cednert (Filmlance)
Ahh! Nice catch Igor. Thanks. =)

Will try and see if resolve can read that back in.

Still very curious about the 3x slowdown in 9.5 pg_dump though.


--
Henrik Cednert
cto | compositor

Filmlance International



RE: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade

2017-11-21 Thread Igor Neyman

From: Henrik Cednert (Filmlance) [mailto:[email protected]]
Sent: Tuesday, November 21, 2017 11:27 AM
To: [email protected]
Subject: Re: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade

Ahh! Nice catch Igor. Thanks. =)

Will try and see if resolve can read that back in.

Still very curious about the 3x slowdown in 9.5 pg_dump though.


--
Henrik Cednert
cto | compositor

Filmlance International

Basically, you are dumping 40 GB of data.
I'd say even 212 minutes under version 8.4 was too slow.
What kind of RAID is it? RAID1/RAID10/RAID5?

Regards,
Igor Neyman


Re: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade

2017-11-21 Thread Henrik Cednert (Filmlance)
RAID6. Doing a disk test I get 1000 MB/sec write and 1200 MB/sec read.

--
Henrik Cednert
cto | compositor

Filmlance International
mobile [ + 46 (0)704 71 89 54 ]
skype  [ cednert ]




RE: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade

2017-11-21 Thread Igor Neyman

From: Henrik Cednert (Filmlance) [mailto:[email protected]]
Sent: Tuesday, November 21, 2017 11:37 AM
To: [email protected]
Subject: Re: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade



RAID6. Doing disk test I have 1000MB/sec write and 1200MB/sec read.

--
Henrik Cednert
cto | compositor

Filmlance International
mobile [ + 46 (0)704 71 89 54 ]
skype  [ cednert ]

_

Okay, I was kind of wrong about 40 GB.  That’s the size of your compressed 
backup files, not the size of your databases.
Maybe your dbs are “bloated”?
You could try VACUUM FULL on your databases, when there is no other activity.

Igor Neyman


Re: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade

2017-11-21 Thread Shaul Dar
Guys,

Sorry to bother you but can anyone help me unsubscribe from this list?
I followed the instructions in the original email and got an error
message...
Thanks,

-- Shaul



Re: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade

2017-11-21 Thread Henrik Cednert (Filmlance)
I VACUUM every Sunday, so that is done already. =/

I'm not sure I have the proper params, though, since I'm not used to DBs and 
have followed others' how-tos, but these are the lines in my script for that:

${BINARY_PATH}/vacuumdb --analyze --host=localhost --username=postgres --echo 
--verbose --no-password ${database} | tee -a ${log_pg_optimize}_${database}.log
${BINARY_PATH}/reindexdb --host=localhost --username=postgres --no-password 
--echo ${database} | tee -a ${log_pg_optimize}_${database}.log


--
Henrik Cednert
cto | compositor

Filmlance International
mobile [ + 46 (0)704 71 89 54 ]
skype  [ cednert ]




RE: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade

2017-11-21 Thread Igor Neyman

From: Henrik Cednert (Filmlance) [mailto:[email protected]]
Sent: Tuesday, November 21, 2017 11:48 AM
To: [email protected]
Subject: Re: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade

I VACUUM every Sunday, so that is done already. =/

I'm not sure I have the proper params, though, since I'm not used to DBs and 
have followed others' how-tos, but these are the lines in my script for that:

${BINARY_PATH}/vacuumdb --analyze --host=localhost --username=postgres --echo 
--verbose --no-password ${database} | tee -a ${log_pg_optimize}_${database}.log
${BINARY_PATH}/reindexdb --host=localhost --username=postgres --no-password 
--echo ${database} | tee -a ${log_pg_optimize}_${database}.log


--
Henrik Cednert
cto | compositor

Filmlance International
mobile [ + 46 (0)704 71 89 54 ]
skype  [ cednert ]

___

To do a vacuum full you need to add the --full option to your vacuumdb command:

${BINARY_PATH}/vacuumdb --full --analyze --host=localhost --username=postgres 
--echo --verbose --no-password ${database} | tee -a 
${log_pg_optimize}_${database}.log

Just be aware that “vacuum full” locks tables, unlike plain “analyze”.  So, like I 
said, no other activity during this process.

Regards,
Igor



Re: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade

2017-11-21 Thread Tom Lane
"Henrik Cednert (Filmlance)"  writes:
> The last pg_dump with 8.4 took 212 minutes and 49 seconds. Now, with 9.5, the 
> very same pg_dump takes 644 minutes and 40 seconds. So it takes about three 
> times as long now and I have no idea why. Nothing in the system or hardware 
> other than the pgsql upgrade has changed.

Can you get a profile of where the machine is spending its time during the
dump run?  On Linux I'd recommend "perf", but on macOS, hmm ...
You could use Activity Monitor, but as far as I can see that just captures
short-duration snapshots, which might not be representative of a 10-hour
run.  XCode's Instruments feature would probably be better about giving
a full picture, but it has a steep learning curve.

regards, tom lane



Re: Using GROUPING SETS with more than one set disables predicate pushdown?

2017-11-21 Thread Andres Freund


On November 21, 2017 6:49:26 AM PST, Tom Lane  wrote:
>Zakhar Shapurau  writes:
>
>Presumably, we could examine the grouping sets to identify column(s)
>present in all sets, and then allow the optimization for clauses that
>reference only such columns.  Or maybe I'm misreading the comment
>(but then it needs clarification).

From memory that sounds about right. IIRC we had some slightly more elaborate 
logic when grouping sets were introduced, but had to take it out as buggy, and 
it was too late in the development cycle to come up with something better.

Andres

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade

2017-11-21 Thread Robert Haas
On Tue, Nov 21, 2017 at 12:01 PM, Tom Lane  wrote:
> "Henrik Cednert (Filmlance)"  writes:
>> The last pg_dump with 8.4 took 212 minutes and 49 seconds. Now, with 9.5, 
>> the very same pg_dump takes 644 minutes and 40 seconds. So it takes about 
>> three times as long now and I have no idea why. Nothing in the system or 
>> hardware other than the pgsql upgrade has changed.
>
> Can you get a profile of where the machine is spending its time during the
> dump run?  On Linux I'd recommend "perf", but on macOS, hmm ...
> You could use Activity Monitor, but as far as I can see that just captures
> short-duration snapshots, which might not be representative of a 10-hour
> run.  XCode's Instruments feature would probably be better about giving
> a full picture, but it has a steep learning curve.

macOS's "sample" is pretty easy to use and produces text format output
that is easy to email.
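
For example, something along these lines (duration and output paths are 
arbitrary):

    # sample the pg_dump process, by name, for 60 seconds
    sample pg_dump 60 -file /tmp/pg_dump.sample.txt

    # or sample the backend serving it, by PID (12345 is a placeholder)
    sample 12345 60 -file /tmp/backend.sample.txt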

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade

2017-11-21 Thread Tom Lane
Robert Haas  writes:
> On Tue, Nov 21, 2017 at 12:01 PM, Tom Lane  wrote:
>> Can you get a profile of where the machine is spending its time during the
>> dump run?  On Linux I'd recommend "perf", but on macOS, hmm ...
>> You could use Activity Monitor, but as far as I can see that just captures
>> short-duration snapshots, which might not be representative of a 10-hour
>> run.  XCode's Instruments feature would probably be better about giving
>> a full picture, but it has a steep learning curve.

> macOS's "sample" is pretty easy to use and produces text format output
> that is easy to email.

Ah, good idea.  But note that it only traces one process, so you'd need to
first determine whether it's pg_dump or the backend that's eating most
of the CPU.  Or sample both of them.

regards, tom lane



Re: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade

2017-11-21 Thread Henrik Cednert (Filmlance)
Hello

Running it with format "directory" produced something I cannot import from the 
host application, so I aborted that.

Running it now and recording with Instruments. Guess I'll have to leave it 
cooking for the full procedure but I've added an initial one to pastebin.
https://pastebin.com/QHRYUQhb

Sent this with screenshot attached first but don't think the list supports 
that... So here's a screenshot from instruments after running for a few mins.
https://www.dropbox.com/s/3vr5yzt4zs5svck/pg_dump_profile.png?dl=0

Cheers

--
Henrik Cednert
cto | compositor

Filmlance International





Re: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade

2017-11-21 Thread Tom Lane
"Henrik Cednert (Filmlance)"  writes:
> I'm not sure if I can attach screenshots here. Trying, screenshot from 
> instruments after running for a few mins.

It looks like practically all of pg_dump's time is going into deflate(),
ie zlib.  I don't find that terribly surprising in itself, but it offers
no explanation for why you'd see a slowdown --- zlib isn't even our
code, nor has it been under active development for a long time, so
presumably 8.4 and 9.5 would have used the same version.  Perhaps you
were doing the 8.4 dump without compression enabled?

regards, tom lane



Re: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade

2017-11-21 Thread Henrik Cednert (Filmlance)
Hi Tom

I'm honestly not sure about anything. =) I use the exact same flags as with 8.4 
for the dump:

${BINARY_PATH}/pg_dump --host=localhost --user=postgres --no-password --blobs 
--format=custom --verbose --file=${pg_dump_filename}_${database}.backup 
${database}

So unless the default behaviour has changed in 9.x, I'd say I don't use 
compression. I will try forcing it to no compression and see if it's different.

Sadly the instruments session stopped recording when I logged out of the system 
yesterday. Doh. =/

Cheers

--
Henrik Cednert
cto | compositor

Filmlance International
mobile [ + 46 (0)704 71 89 54 ]
skype  [ cednert ]




Re: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade

2017-11-21 Thread Henrik Cednert (Filmlance)
Ha! So forcing compression to 0, I went from 644 minutes to 87 minutes. And 
this time I backed it up to an AFP share, and from the looks of it I hit the 
ceiling on that Ethernet interface. The size of the backup went from 50 GB to 
260 GB though, hehehe.

So something seems to have changed regarding default compression level between 
8.x and 9.6 when doing a custom format dump. I will time all the different 
levels and see if I can find out more.

What's the normal way to deal with compression? Dump uncompressed and use 
something that threads better to compress the dump?
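
Something like the sketch below is what I had in mind, using pigz as an 
example of a threaded compressor (not tested yet):

    # uncompressed custom-format dump piped through a multi-threaded compressor
    ${BINARY_PATH}/pg_dump --host=localhost --username=postgres --no-password \
        --blobs --format=custom --compress=0 ${database} \
        | pigz -p 8 > ${pg_dump_filename}_${database}.backup.gz

    # restore side: decompress on the fly and feed pg_restore
    unpigz -c ${pg_dump_filename}_${database}.backup.gz \
        | ${BINARY_PATH}/pg_restore --dbname=${database}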

Cheers

--
Henrik Cednert
cto | compositor

Filmlance International
mobile [ + 46 (0)704 71 89 54 ]
skype  [ cednert ]
