Re: How can sort performance be so different

2019-02-06 Thread Laurenz Albe
Merlin Moncure wrote:
> yeah, probably.  Having said that, I'm really struggling with how it can
> take several minutes to sort such a small number of rows even
> with collation issues.  I can sort rocks faster than that :-).
> 
> Switching between various European collations, I'm seeing subsecond
> sort responses for 44k records on my test box.  I don't have the Lao
> collation installed, unfortunately.  Are you seeing this kind of penalty
> with other collations?

I find that it makes a lot of difference what you sort:

CREATE TABLE sort(t text);

INSERT INTO sort SELECT 'ຕົວອັກສອນລາວ... ງ່າຍຂື້ນ' || i
FROM generate_series(1, 1000000) AS i;

SET work_mem = '1GB';

EXPLAIN (ANALYZE, BUFFERS) SELECT t FROM sort ORDER BY t COLLATE "C";
[...]
 Execution Time: 288.752 ms

EXPLAIN (ANALYZE, BUFFERS) SELECT t FROM sort ORDER BY t COLLATE "lo_LA.utf8";
[...]
 Execution Time: 47006.683 ms

EXPLAIN (ANALYZE, BUFFERS) SELECT t FROM sort ORDER BY t COLLATE "en_US.utf8";
[...]
 Execution Time: 73962.934 ms


CREATE TABLE sort2(t text);

INSERT INTO sort2 SELECT 'this is plain old English' || i
FROM generate_series(1, 1000000) AS i;

SET work_mem = '1GB';

EXPLAIN (ANALYZE, BUFFERS) SELECT t FROM sort2 ORDER BY t COLLATE "C";
[...]
 Execution Time: 237.615 ms

EXPLAIN (ANALYZE, BUFFERS) SELECT t FROM sort2 ORDER BY t COLLATE "lo_LA.utf8";
[...]
 Execution Time: 2467.848 ms

EXPLAIN (ANALYZE, BUFFERS) SELECT t FROM sort2 ORDER BY t COLLATE "en_US.utf8";
[...]
 Execution Time: 2927.667 ms
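
If the linguistic ordering is not actually needed, the collation penalty
can be avoided entirely.  A minimal sketch (not part of the test above;
the index name is made up) that pins the fast byte-wise "C" collation:

-- memcmp()-based ordering, no locale library involved:
CREATE INDEX sort2_c_idx ON sort2 (t COLLATE "C");

-- the planner can use this index for ORDER BY t COLLATE "C",
-- avoiding the sort step:
EXPLAIN SELECT t FROM sort2 ORDER BY t COLLATE "C";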


This is on my x86_64 Fedora 29 system, kernel 4.20.6, glibc 2.28.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




autovacuum big table taking hours and sometimes seconds

2019-02-06 Thread Mariel Cherkassky
Hi,
I have a table with a bytea column whose values are huge, and that's why
Postgres created a TOAST table for that column. The original table
contains about 1K-10K rows, but the TOAST table can contain up to 20M rows.
I set the following two options on the TOAST table:

 alter table orig_table set (toast.autovacuum_vacuum_scale_factor = 0);
 alter table orig_table set (toast.autovacuum_vacuum_threshold = 1);
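
(To verify that these options really ended up on the TOAST table and not
the base table, something like this sketch should work; the toast.* options
are stored in the TOAST relation's own reloptions:)

SELECT t.relname, t.reloptions
FROM pg_class c
JOIN pg_class t ON t.oid = c.reltoastrelid
WHERE c.relname = 'orig_table';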


Therefore I hoped that after the deletion of 10K rows from the TOAST table,
autovacuum would launch a vacuum on it.

From the logs I see that sometimes autovacuum runs only once every few
hours (3-4 hours), and sometimes it runs every few minutes.

Now I wanted to check whether this depends only on the thresholds and on the
frequency of deletes/updates on the table. In some cases the autovacuum
takes a few hours (4+), finishes, and then immediately starts vacuuming the
table again:

2019-01-29 *07:10:58* EST  14083  LOG:  automatic vacuum of table
"db.pg_toast.pg_toast_14430": index scans: 3

pages: 1672 removed, 7085056 remain

tuples: 6706885 removed, 2023847 remain

buffer usage: 4808221 hits, 6404148 misses, 6152603 dirtied

avg read rate: 2.617 MiB/s, avg write rate: 2.514 MiB/s

system usage: CPU 148.65s/70.06u sec elapsed 19119.55 sec

This run took 19119 sec ~ 5.3 hours


2019-01-29 *10:05:45* EST  11985  LOG:  automatic vacuum of table
"db.pg_toast.pg_toast_14430": index scans: 2

pages: 2752 removed, 7082304 remain

tuples: 3621620 removed, 1339540 remain

buffer usage: 2655076 hits, 3506964 misses, 423 dirtied

avg read rate: 2.638 MiB/s, avg write rate: 2.508 MiB/s

system usage: CPU 71.22s/37.65u sec elapsed 10384.93 sec


This run took 10384 sec ~ 2.88 hours.


The difference between the two log timestamps is about 3 hours, and the
second run took 2.88 hours, which means that autovacuum launched a new
vacuum on the table just a few minutes after the first one had finished.


In addition, as I said, sometimes it runs very often:

2019-02-04 09:26:23 EST  14735  LOG:  automatic vacuum of table
"db.pg_toast.pg_toast_14430": index scans: 1

pages: 1760 removed, 11149568 remain

tuples: 47870 removed, 4929452 remain

buffer usage: 200575 hits, 197394 misses, 24264 dirtied

avg read rate: 5.798 MiB/s, avg write rate: 0.713 MiB/s

system usage: CPU 1.55s/1.38u sec elapsed 265.96 sec


2019-02-04 09:32:57 EST  26171  LOG:  automatic vacuum of table
"db.pg_toast.pg_toast_14430": index scans: 1

pages: 2144 removed, 11147424 remain

tuples: 55484 removed, 4921526 remain

buffer usage: 196811 hits, 209267 misses, 34471 dirtied

avg read rate: 5.459 MiB/s, avg write rate: 0.899 MiB/s

system usage: CPU 1.73s/1.54u sec elapsed 299.50 sec


Now the question is how to handle or tune this. Is there any chance that I
need to increase the cost_limit / cost_delay?


Re: autovacuum big table taking hours and sometimes seconds

2019-02-06 Thread Laurenz Albe
On Wed, 2019-02-06 at 12:29 +0200, Mariel Cherkassky wrote:
> Hi,
> I have a table with a bytea column whose values are huge, and that's why
> Postgres created a TOAST table for that column.
> The original table contains about 1K-10K rows, but the TOAST table can
> contain up to 20M rows.
> I set the following two options on the TOAST table:
>  alter table orig_table set (toast.autovacuum_vacuum_scale_factor = 0);
>  alter table orig_table set (toast.autovacuum_vacuum_threshold = 1);
> 
> Therefore I hoped that after the deletion of 10K rows from the TOAST table,
> autovacuum would launch a vacuum on it.
> From the logs I see that sometimes autovacuum runs only once every few
> hours (3-4 hours), and sometimes it runs every few minutes.
> Now I wanted to check whether this depends only on the thresholds and on
> the frequency of deletes/updates on the table.
> In some cases the autovacuum takes a few hours (4+), finishes, and then
> immediately starts vacuuming the table again:
> 
> Now the question is how to handle or tune this. Is there any chance that I
> need to increase the cost_limit / cost_delay?

Maybe configuring autovacuum to run faster will help:

alter table orig_table set (toast.autovacuum_vacuum_cost_limit = 2000);

Or, more extreme:

alter table orig_table set (toast.autovacuum_vacuum_cost_delay = 0);

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: autovacuum big table taking hours and sometimes seconds

2019-02-06 Thread David Rowley
On Thu, 7 Feb 2019 at 00:17, Laurenz Albe wrote:
>
> On Wed, 2019-02-06 at 12:29 +0200, Mariel Cherkassky wrote:
> > Now the question is how to handle or tune this. Is there any chance that I
> > need to increase the cost_limit / cost_delay?
>
> Maybe configuring autovacuum to run faster will help:
>
> alter table orig_table set (toast.autovacuum_vacuum_cost_limit = 2000);
>
> Or, more extreme:
>
> alter table orig_table set (toast.autovacuum_vacuum_cost_delay = 0);

Going by the block hits/misses/dirtied and the mentioned vacuum times,
it looks like auto-vacuum is set to the standard settings and if so it
spent about 100% of its time sleeping on the job.

It might be a better idea to consider changing the vacuum settings
globally rather than just for one table.

Running a vacuum_cost_limit of 200 is likely something you'd not want
to ever do with modern hardware... well, maybe unless you just bought
the latest Raspberry Pi, or something.  You should be tuning that
value to something that runs your vacuums to a speed you're happy with
but leaves enough IO and CPU for queries running on the database.

If you see that all auto-vacuum workers are busy more often than not,
then they're likely running too slowly and should be set to run more
quickly.
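
A global change along those lines might look like this (a sketch only;
ALTER SYSTEM needs PostgreSQL 9.4+ and superuser, and 2000 is an
illustrative value, not a recommendation):

-- with autovacuum_vacuum_cost_limit at its default of -1, autovacuum
-- inherits this value, giving it a 10x larger budget between sleeps:
ALTER SYSTEM SET vacuum_cost_limit = 2000;
-- apply the change without a restart:
SELECT pg_reload_conf();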

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: autovacuum big table taking hours and sometimes seconds

2019-02-06 Thread Mariel Cherkassky
Hey,
As I said, I set the following options on the TOAST table:

 alter table orig_table set (toast.autovacuum_vacuum_scale_factor = 0);
 alter table orig_table set (toast.autovacuum_vacuum_threshold = 1);

Can you explain a bit more why you concluded that the autovacuum spent
its time sleeping?

I see the autovacuum statistics in the logs; how can I check whether the
workers are busy very often?

My vacuum cost limit is 200 right now. Basically, if vacuum runs on my
TOAST table and reaches 200 but hasn't finished cleaning all the dead
tuples, should it continue cleaning after the nap, or wait until the
vacuum threshold is hit again?

On Wed, Feb 6, 2019 at 15:05, David Rowley <[email protected]> wrote:

> On Thu, 7 Feb 2019 at 00:17, Laurenz Albe wrote:
> >
> > On Wed, 2019-02-06 at 12:29 +0200, Mariel Cherkassky wrote:
> > > Now the question is how to handle or tune this. Is there any chance
> > > that I need to increase the cost_limit / cost_delay?
> >
> > Maybe configuring autovacuum to run faster will help:
> >
> > alter table orig_table set (toast.autovacuum_vacuum_cost_limit = 2000);
> >
> > Or, more extreme:
> >
> > alter table orig_table set (toast.autovacuum_vacuum_cost_delay = 0);
>
> Going by the block hits/misses/dirtied and the mentioned vacuum times,
> it looks like auto-vacuum is set to the standard settings and if so it
> spent about 100% of its time sleeping on the job.
>
> It might be a better idea to consider changing the vacuum settings
> globally rather than just for one table.
>
> Running a vacuum_cost_limit of 200 is likely something you'd not want
> to ever do with modern hardware... well, maybe unless you just bought
> the latest Raspberry Pi, or something.  You should be tuning that
> value to something that runs your vacuums to a speed you're happy with
> but leaves enough IO and CPU for queries running on the database.
>
> If you see that all auto-vacuum workers are busy more often than not,
> then they're likely running too slowly and should be set to run more
> quickly.
>
> --
>  David Rowley   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>


Re: autovacuum big table taking hours and sometimes seconds

2019-02-06 Thread dangal
Would it be a good idea to start changing those values, which are set so
low in the default postgresql.conf?






Re: autovacuum big table taking hours and sometimes seconds

2019-02-06 Thread Mariel Cherkassky
Which one do you mean? I changed the threshold and the scale factor for
that specific table...

On Wed, Feb 6, 2019 at 15:36, dangal <[email protected]> wrote:

> Would it be a good idea to start changing those values, which are set so
> low in the default postgresql.conf?
>
>
>
>
>


Re: autovacuum big table taking hours and sometimes seconds

2019-02-06 Thread MichaelDBA

Hi all,

Of the myriad articles written about autovacuum tuning, I really like
this one by Tomas Vondra of 2ndQuadrant:

https://blog.2ndquadrant.com/autovacuum-tuning-basics/

It is a concise article that touches on all the major aspects of
autovacuum tuning: thresholds, scale factors, throttling, etc.


Regards and happy vacuuming to yas!
Michael Vitale


Mariel Cherkassky 
Wednesday, February 6, 2019 8:41 AM
Which one do you mean? I changed the threshold and the scale factor for
that specific table...


dangal 
Wednesday, February 6, 2019 8:36 AM
Would it be a good idea to start changing those values, which are set so
low in the default postgresql.conf?





David Rowley 
Wednesday, February 6, 2019 8:05 AM

Going by the block hits/misses/dirtied and the mentioned vacuum times,
it looks like auto-vacuum is set to the standard settings and if so it
spent about 100% of its time sleeping on the job.

It might be a better idea to consider changing the vacuum settings
globally rather than just for one table.

Running a vacuum_cost_limit of 200 is likely something you'd not want
to ever do with modern hardware... well, maybe unless you just bought
the latest Raspberry Pi, or something. You should be tuning that
value to something that runs your vacuums to a speed you're happy with
but leaves enough IO and CPU for queries running on the database.

If you see that all auto-vacuum workers are busy more often than not,
then they're likely running too slowly and should be set to run more
quickly.





Re: autovacuum big table taking hours and sometimes seconds

2019-02-06 Thread Jeff Janes
On Wed, Feb 6, 2019 at 5:29 AM Mariel Cherkassky <
[email protected]> wrote:


> Now the question is how to handle or tune this. Is there any chance that I
> need to increase the cost_limit / cost_delay?
>

Sometimes vacuum has more work to do, so it takes more time to do it.

There is no indication of a problem. Or at least, you haven't described
one. So, there is nothing to handle or to tune.

If there is a problem, those log entries might help identify it.  But in
the absence of a problem, they are just log spam.

Cheers,

Jeff


Re: autovacuum big table taking hours and sometimes seconds

2019-02-06 Thread Mariel Cherkassky
Well, basically I'm trying to tune it because the table still keeps growing.
I thought that setting the scale factor and the threshold would be enough,
but it seems that it wasn't. I attached some of the log output to hear
what you guys think about it.

On Wed, Feb 6, 2019 at 16:12, Jeff Janes <[email protected]> wrote:

> On Wed, Feb 6, 2019 at 5:29 AM Mariel Cherkassky <
> [email protected]> wrote:
>
>
>> Now the question is how to handle or tune this. Is there any chance that I
>> need to increase the cost_limit / cost_delay?
>>
>
> Sometimes vacuum has more work to do, so it takes more time to do it.
>
> There is no indication of a problem. Or at least, you haven't described
> one. So, there is nothing to handle or to tune.
>
> If there is a problem, those log entries might help identify it.  But in
> the absence of a problem, they are just log spam.
>
> Cheers,
>
> Jeff
>


Re: autovacuum big table taking hours and sometimes seconds

2019-02-06 Thread Jeff Janes
On Wed, Feb 6, 2019 at 9:42 AM Mariel Cherkassky <
[email protected]> wrote:

> Well, basically I'm trying to tune it because the table still keeps
> growing. I thought that setting the scale factor and the threshold would
> be enough, but it seems that it wasn't. I attached some of the log output
> to hear what you guys think about it.
>

Are all four log entries from well after you made the change?  My first
inclination is to think that the first 2 are from either before the change,
or just after the change when it is still settling into the new regime.
Also, is the table still continuing to grow, or is it at a new steady state
of bloat which isn't growing but also isn't shrinking back to where you want
it to be?  More aggressive vacuuming alone should stop the bloat, but is
not likely to reverse it.
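
(Reclaiming the bloat that already exists would take a table rewrite, e.g.
the sketch below; beware that it holds an exclusive lock on the table for
the duration:)

-- rewrites orig_table and its TOAST table into fresh files,
-- returning the freed space to the operating system:
VACUUM FULL orig_table;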

I habitually set vacuum_cost_page_hit and vacuum_cost_page_miss to zero.
Page reads are self-limiting (vacuum is single threaded, so you can't have
more than one read (times autovacuum_max_workers) going on at a time) so I
don't see a need to throttle them intentionally as well--unless your entire
db is sitting on one spindle.  Based on the high ratio of read rates to
write rates in the last two log entries, this change alone should be enough
to greatly speed up the run time of the vacuum.
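
A sketch of that change (the zeros are exactly the values described above;
ALTER SYSTEM requires superuser):

-- stop charging vacuum for page reads, cached or not:
ALTER SYSTEM SET vacuum_cost_page_hit = 0;
ALTER SYSTEM SET vacuum_cost_page_miss = 0;
SELECT pg_reload_conf();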

If you need to speed it up beyond that, I don't think it matters much
whether you decrease cost_delay or increase cost_limit, it is the ratio
that mostly matters.
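
In other words, the cost budget burned per second is roughly
cost_limit / cost_delay, so these changes are interchangeable
(illustrative arithmetic only):

SELECT 200 / 0.020 AS default_budget,   -- 10000 points/s
       400 / 0.020 AS doubled_limit,    -- 20000 points/s
       200 / 0.010 AS halved_delay;     -- 20000 points/s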

And if these latter measures do work, you should consider undoing the
changes to autovacuum_vacuum_scale_factor.  Reading the entire index just
to remove 10,000 rows from the table is a lot of extra work that might be
unnecessary, although that extra work might not be on anyone's critical
path.



Re: autovacuum big table taking hours and sometimes seconds

2019-02-06 Thread David Rowley
On Thu, 7 Feb 2019 at 02:34, Mariel Cherkassky wrote:
> As I said, I set the following options on the TOAST table:
>
>  alter table orig_table set (toast.autovacuum_vacuum_scale_factor = 0);
>  alter table orig_table set (toast.autovacuum_vacuum_threshold = 1);

These settings don't control how fast auto-vacuum runs, just when it should run.

> Can you explain a bit more why you concluded that the autovacuum spent
> its time sleeping?

Yeah, if you look at the following settings.

 vacuum_cost_limit   | 200
 vacuum_cost_page_dirty  | 20
 vacuum_cost_page_hit| 1
 vacuum_cost_page_miss   | 10
 autovacuum_vacuum_cost_delay| 20ms

I've tagged on the default setting for each of these.  Both vacuum and
auto-vacuum keep score of how many points they've accumulated while
running: 20 points for dirtying a page, 10 for a read that's not found
in shared_buffers, 1 for reading a buffer that is in shared buffers.
When vacuum_cost_limit points are reached (or
autovacuum_vacuum_cost_limit, if not -1), auto-vacuum sleeps for
autovacuum_vacuum_cost_delay; normal manual vacuums sleep for
vacuum_cost_delay.

In one of the log entries you saw:

> buffer usage: 4808221 hits, 6404148 misses, 6152603 dirtied
> avg read rate: 2.617 MiB/s, avg write rate: 2.514 MiB/s
> system usage: CPU 148.65s/70.06u sec elapsed 19119.55 sec

Doing a bit of maths to see how much time that vacuum should have slept for:

postgres=# select (4808221 * 1 + 6404148 * 10 + 6152603 * 20) / 200.0
* 20 / 1000;
  ?column?
------------
 19190.1761

That's remarkably close to the actual time of 19119.55 sec. If you do
the same for the other 3 vacuums then you'll see the same close match.

> I see the autovacuum statistics in the logs; how can I check whether the
> workers are busy very often?

It would be nice if there was something better, but periodically doing:

SELECT count(*) FROM pg_stat_activity where query like 'autovacuum%';

will work.
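
(On PostgreSQL 10 or later, a variant sketch that filters on the backend
type instead of the query text:)

SELECT count(*)
FROM pg_stat_activity
WHERE backend_type = 'autovacuum worker';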

> My vacuum cost limit is 200 right now. Basically, if vacuum runs on my
> TOAST table and reaches 200 but hasn't finished cleaning all the dead
> tuples, should it continue cleaning after the nap, or wait until the
> vacuum threshold is hit again?

You're confusing nap time with something else; maybe you're confusing
it with the speed of vacuum?  Napping is just the time auto-vacuum will
wait between checking for new tables to work on.  Having the
auto-vacuum run so slowly is a probable cause of still having dead
tuples after the vacuum... likely because they became dead after the
vacuum started.

I'd recommend reading the manual or Tomas Vondra's blog about vacuum
costs. It's not overly complex, once you understand what each of the
vacuum settings does.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services