RE: scans on table fail to be excluded by partition bounds

2019-06-25 Thread Steven Winfield
> ts=# explain SELECT * FROM eric_enodeb_cell_metrics WHERE start_time
> BETWEEN '2019-01-01 04:00' AND '2019-01-01 05:00' OR start_time BETWEEN
> '2019-01-02 04:00' AND '2019-01-02 05:00' 

Maybe it's because of the implicit use of the local timezone when the strings
you give for start_time here are cast to (timestamp with time zone)?
What happens if you specify it using "TIMESTAMP WITH TIME ZONE '2019-01-01 
04:00-05'", etc.?
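
As a sketch (assuming eric_enodeb_cell_metrics is range-partitioned on
start_time, as the plan suggests), explicit timestamptz literals remove any
dependence on the session's TimeZone setting:

```sql
-- Hypothetical check: with bare strings, the cast to timestamptz uses the
-- session's TimeZone; spelling out the offset makes the bounds unambiguous.
EXPLAIN SELECT * FROM eric_enodeb_cell_metrics
WHERE start_time BETWEEN TIMESTAMP WITH TIME ZONE '2019-01-01 04:00-05'
                     AND TIMESTAMP WITH TIME ZONE '2019-01-01 05:00-05'
   OR start_time BETWEEN TIMESTAMP WITH TIME ZONE '2019-01-02 04:00-05'
                     AND TIMESTAMP WITH TIME ZONE '2019-01-02 05:00-05';
```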

Steve.


Re: materialized view refresh of a foreign table

2019-06-25 Thread Rick Otten
On Sun, Jun 23, 2019 at 10:21 AM Rick Otten 
wrote:

> I'm not sure where else to look, so I'm asking here for tips.
>
> I have a table in a remote (Heroku-managed) postgresql database (PG 10.7).
>
> On the other end, (PG 11.3) I have a foreign table configured with a
> materialized view in front of it.
>
> Up until Thursday evening, it was taking about 12-15 seconds to refresh;
> it is only 15,000 rows with 41 columns.  Since Thursday evening it has
> been taking 15 _minutes_ or more to refresh.  Nothing changed on my end
> that I'm aware of.  It completes, it just takes forever.
>
>
I believe I've solved this mystery.  Thanks for hearing me out.  Just the
opportunity to summarize everything I'd tried helped me discover the root
cause.

In the middle of the table there is a 'text' column.  Since last Thursday,
a number of rows have been populated with very long strings (lots of text in
that column).  This appears to have completely bogged down the materialized
view refresh.  Since we aren't currently using that column in our analytics
database, I simply removed it from the materialized view.  If we do end up
needing it, I'll give it its own materialized view and/or look at chopping up
the text into just the bits we need.
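
For anyone hitting the same thing, the fix amounted to recreating the
materialized view without the wide column (all names here are hypothetical):

```sql
-- Sketch: rebuild the matview selecting every column except the large text
-- one, so the refresh no longer drags it across the foreign data wrapper.
DROP MATERIALIZED VIEW IF EXISTS remote_data_mv;
CREATE MATERIALIZED VIEW remote_data_mv AS
SELECT id, created_at, status       -- ...and the other 40-odd columns,
                                    -- with big_text_col deliberately omitted
FROM remote_data_ft;                -- the foreign table
REFRESH MATERIALIZED VIEW remote_data_mv;
```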


Perplexing, regular decline in performance

2019-06-25 Thread Hugh Ranalli
I'm hoping people can help me figure out where to look to solve an odd
PostgreSQL performance problem.

A bit of background: We have a client with a database of approximately 450
GB that has a couple of tables storing large amounts of text, including
full HTML pages from the Internet. Last fall, they began experiencing
dramatic and exponentially decreasing performance. We track certain query
times, so we know how much time is being spent in calls to the database for
these functions. When this began, average times went from approximately
200 ms to 400 ms, climbing rapidly each day until reaching 900 ms, a figure
we had never seen before, within 4 days, with no appreciable change in
usage. It was at this point that we restarted the database server
and times returned to the 400 ms range, but never back to their
long-running original levels. From this point onward, we had to restart the
database (originally the server, but eventually just the database process)
every 3-4 days, otherwise the application became unusable.

As they were still on PostgreSQL 8.2, we persuaded them to finally
undertake our long-standing recommendation to upgrade, as there was no
possibility of support on that platform. That upgrade to 11.2 was completed
successfully in mid-May, and although times have not returned to their
original levels (they now average approximately 250 ms), the application
overall seems much more responsive and faster (application servers were not
changed, other than minor changes --full text search, explicit casts,
etc.-- to conform to PostgreSQL 11's requirements).

What we continued to notice was a milder but still definite trend of
increased query times, during the course of each week, from the mid to high
200 ms, to the high 300 ms to low 400 ms. Some years ago, someone had
noticed that as the number of "raw_page" columns in a particular table
grew, performance would decline. They wrote a script that once a week locks
the table, deletes the processed large columns (they are not needed after
processing), copies the remaining data to a backup table, truncates the
original table, then copies it back. When this script runs we see an
immediate change in performance, from 380 ms in the hour before the drop,
to 250 ms in the hour of the drop. As rows with these populated columns are
added during the course of a week, the performance drops, steadily, until
the next week's cleaning operation. Each week the performance increase is
clear and significant.
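
A minimal sketch of that weekly cleaning operation, with hypothetical table
and column names, roughly as the script was described to us:

```sql
-- Sketch: lock, clear the processed large column, copy survivors out,
-- truncate (which releases heap and TOAST space), then copy back.
BEGIN;
LOCK TABLE pages IN ACCESS EXCLUSIVE MODE;
UPDATE pages SET raw_page = NULL WHERE processed;  -- drop the big values
CREATE TEMP TABLE pages_backup AS SELECT * FROM pages;
TRUNCATE pages;
INSERT INTO pages SELECT * FROM pages_backup;
DROP TABLE pages_backup;
COMMIT;
```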

What is perplexing (and I have triple-checked) is that this table is *not*
referenced in any way in the queries that we time (it is referenced by
ongoing administrative and processing queries). The operation that cleans
it frees up approximately 15-20 GB of space each week. Our system
monitoring shows this change in free disk space, but this is 20 GB out of
approximately 300 GB of free space (free space is just under 40% of volume
size), so disk space does not seem to be an issue. The table in question is
about 21 GB in size, with about 20 GB in toast data, at its largest.
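
For reference, the heap vs. TOAST breakdown for a table can be checked with
something like the following (table name hypothetical):

```sql
-- Sketch: total size, main heap size, and TOAST size for one table.
SELECT pg_size_pretty(pg_total_relation_size(oid))           AS total,
       pg_size_pretty(pg_relation_size(oid))                 AS heap,
       pg_size_pretty(pg_total_relation_size(reltoastrelid)) AS toast
FROM pg_class
WHERE relname = 'pages';
```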

Even odder, the queries we time *do* reference a much larger table, which
contains very similar data, and multiple columns of it. It is 355 GB in
size, with 318 GB in toast data. It grows continually, with no cleaning.

If anyone has any suggestions as to what sort of statistics to look at, or
why this would be happening, they would be greatly appreciated.

Thanks in advance,
Hugh

--
Hugh Ranalli
Principal Consultant
White Horse Technology Consulting
e: [email protected]
c: +01-416-994-7957


Re: Perplexing, regular decline in performance

2019-06-25 Thread Benjamin Scherrey
Have you done a VACUUM ANALYZE FULL on your database? This needs to be done
periodically to inform the server of the statistics of how the data and
relations are distributed across the database. Without this, bad assumptions
by the planner can cause degraded performance. Also, if you are using the
default settings in postgresql.conf, understand that those are established to
use the absolute minimum amount of resources possible, which means not taking
advantage of the memory or CPUs that may be available in your environment and
would make the database server more performant.

Please investigate these and then report back any details of what you've
done to try to improve performance.

  best regards,

  -- Ben Scherrey

https://www.postgresql.org/docs/11/sql-vacuum.html

On Tue, Jun 25, 2019 at 10:49 PM Hugh Ranalli  wrote:

> I'm hoping people can help me figure out where to look to solve an odd
> PostgreSQL performance problem.
>
> A bit of background: We have a client with a database of approximately 450
> GB, that has a couple of tables storing large amounts of text, including
> full HTML pages from the Internet. Last fall, they began experiencing
> dramatic and exponentially decreasing performance. We track certain query
> times, so we know how much time is being spent in calls to the database for
> these functions. When this began, average times went from approximately
> 200 ms to 400 ms, climbing rapidly each day until reaching 900 ms, a figure
> we had never seen before, within 4 days, with no appreciable change in
> usage. It was at this point that we restarted the database server
> and times returned to the 400 ms range, but never back to their
> long-running original levels. From this point onward, we had to restart the
> database (originally the server, but eventually just the database process)
> every 3-4 days, otherwise the application became unusable.
>
> As they were still on PostgreSQL 8.2, we persuaded them to finally
> undertake our long-standing recommendation to upgrade, as there was no
> possibility of support on that platform. That upgrade to 11.2 was completed
> successfully in mid-May, and although times have not returned to their
> original levels (they now average approximately 250 ms), the application
> overall seems much more responsive and faster (application servers were not
> changed, other than minor changes --full text search, explicit casts,
> etc.-- to conform to PostgreSQL 11's requirements).
>
> What we continued to notice was a milder but still definite trend of
> increased query times, during the course of each week, from the mid to high
> 200 ms, to the high 300 ms to low 400 ms. Some years ago, someone had
> noticed that as the number of "raw_page" columns in a particular table
> grew, performance would decline. They wrote a script that once a week locks
> the table, deletes the processed large columns (they are not needed after
> processing), copies the remaining data to a backup table, truncates the
> original table, then copies it back. When this script runs we see an
> immediate change in performance, from 380 ms in the hour before the drop,
> to 250 ms in the hour of the drop. As rows with these populated columns are
> added during the course of a week, the performance drops, steadily, until
> the next week's cleaning operation. Each week the performance increase is
> clear and significant.
>
> What is perplexing (and I have triple-checked) is that this table is
> *not* referenced in any way in the queries that we time (it is referenced
> by ongoing administrative and processing queries). The operation that
> cleans it frees up approximately 15-20 GB of space each week. Our system
> monitoring shows this change in free disk space, but this is 20 GB out of
> approximately 300 GB of free space (free space is just under 40% of volume
> size), so disk space does not seem to be an issue. The table in question is
> about 21 GB in size, with about 20 GB in toast data, at its largest.
>
> Even odder, the queries we time *do* reference a much larger table, which
> contains very similar data, and multiple columns of it. It is 355 GB in
> size, with 318 GB in toast data. It grows continually, with no cleaning.
>
> If anyone has any suggestions as to what sort of statistics to look at, or
> why this would be happening, they would be greatly appreciated.
>
> Thanks in advance,
> Hugh
>
> --
> Hugh Ranalli
> Principal Consultant
> White Horse Technology Consulting
> e: [email protected]
> c: +01-416-994-7957
>


Re: Perplexing, regular decline in performance

2019-06-25 Thread Justin Pryzby
On Tue, Jun 25, 2019 at 10:55:22PM +0700, Benjamin Scherrey wrote:
> Have you done a VACUUM ANALYZE FULL on your database? This needs to be done
> periodically to inform the server of the statistics of how the data and
> relations are distributed across the database.

I think this is wrong.

VACUUM and ANALYZE are good, but normally happen automatically by autovacuum.

VACUUM FULL takes an exclusive lock on the table, and rewrites its data and
indices from scratch.  It's not normally necessary at all.  It's probably most
useful to recover from a badly-bloated table if autovacuum didn't run often
enough, in which case the permanent solution is to change autovacuum settings
to be more aggressive.

Don't confuse it with VACUUM FREEZE, which doesn't require an exclusive lock
and is normally not necessary if autovacuum is working properly.
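
For what it's worth, autovacuum can be made more aggressive per table via
storage parameters, e.g. (table name hypothetical):

```sql
-- Sketch: vacuum this table once ~1% of its rows are dead, instead of the
-- default 20%, and analyze it more often as well.
ALTER TABLE pages SET (
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_analyze_scale_factor = 0.005
);
VACUUM (VERBOSE, ANALYZE) pages;  -- one-time catch-up; no exclusive lock
```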

Justin


Re: Perplexing, regular decline in performance

2019-06-25 Thread Benjamin Scherrey
I didn't say to do it all the time; I said that if he hasn't done it already,
he should try it as a way of ensuring that the database server's understanding
of the data as it stands is correct. Otherwise there isn't enough information
to suggest other solutions, as there is no description of the operating system
or of the resources available to the database server itself. Regardless, if
you have better suggestions, please serve them up. :-)

  -- Ben Scherrey

On Tue, Jun 25, 2019 at 11:01 PM Justin Pryzby  wrote:

> On Tue, Jun 25, 2019 at 10:55:22PM +0700, Benjamin Scherrey wrote:
> > Have you done a VACUUM ANALYZE FULL on your database? This needs to be
> done
> > periodically to inform the server of the statistics of how the data and
> > relations are distributed across the database.
>
> I think this is wrong.
>
> VACUUM and ANALYZE are good, but normally happen automatically by
> autovacuum.
>
> VACUUM FULL takes an exclusive lock on the table, and rewrites its data and
> indices from scratch.  It's not normally necessary at all.  It's probably
> most
> useful to recover from a badly-bloated table if autovacuum didn't run often
> enough, in which case the permanent solution is to change autovacuum
> settings
> to be more aggressive.
>
> Don't confuse it with VACUUM FREEZE, which doesn't require an exclusive lock
> and is normally not necessary if autovacuum is working properly.
>
> Justin
>


Re: Perplexing, regular decline in performance

2019-06-25 Thread Justin Pryzby
On Tue, Jun 25, 2019 at 11:49:03AM -0400, Hugh Ranalli wrote:
> I'm hoping people can help me figure out where to look to solve an odd
> PostgreSQL performance problem.

What kernel?  Version?  OS?

If Linux, I wonder if transparent hugepages or KSM are enabled?  It seems
possible that truncating the table is clearing enough RAM to mitigate the
issue, similar to restarting the DB.

tail /sys/kernel/mm/ksm/run \
     /sys/kernel/mm/transparent_hugepage/khugepaged/defrag \
     /sys/kernel/mm/transparent_hugepage/enabled \
     /sys/kernel/mm/transparent_hugepage/defrag
https://www.postgresql.org/message-id/20170718180152.GE17566%40telsasoft.com

11.2 has parallel query, enabled by default.  Are there other
settings you've changed (or not changed)?
https://wiki.postgresql.org/wiki/Server_Configuration

It's possible that the "administrative" queries are using up lots of your
shared_buffers, which are (also/more) needed by the customer-facing queries.  I
would install pg_buffercache to investigate.  Or, just pause the admin queries
and see if the issue goes away during that interval?

SELECT 1.0*COUNT(1)/sum(count(1)) OVER (),
       COUNT(1),
       COUNT(nullif(isdirty,'f')),
       datname,
       COALESCE(c.relname, b.relfilenode::text),
       d.relname AS toast,
       1.0*COUNT(nullif(isdirty,'f'))/count(1) AS dirtyfrac,
       avg(usagecount)
FROM pg_buffercache b
JOIN pg_database db ON b.reldatabase = db.oid
LEFT JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
LEFT JOIN pg_class d ON c.oid = d.reltoastrelid
GROUP BY 4, 5, 6
ORDER BY 1 DESC
LIMIT 9;

Could you send query plan for the slow (customer-facing) queries?
https://wiki.postgresql.org/wiki/Slow_Query_Questions#EXPLAIN_.28ANALYZE.2C_BUFFERS.29.2C_not_just_EXPLAIN

> A bit of background: We have a client with a database of approximately 450
> GB, that has a couple of tables storing large amounts of text, including
> full HTML pages from the Internet. Last fall, they began experiencing
> dramatic and exponentially decreasing performance. We track certain query
> times, so we know how much time is being spent in calls to the database for
> these functions. When this began, average times went from approximately
> 200 ms to 400 ms, climbing rapidly each day until reaching 900 ms, a figure
> we had never seen before, within 4 days, with no appreciable change in
> usage. It was at this point that we restarted the database server
> and times returned to the 400 ms range, but never back to their
> long-running original levels. From this point onward, we had to restart the
> database (originally the server, but eventually just the database process)
> every 3-4 days, otherwise the application became unusable.
> 
> As they were still on PostgreSQL 8.2, we persuaded them to finally
> undertake our long-standing recommendation to upgrade, as there was no
> possibility of support on that platform. That upgrade to 11.2 was completed
> successfully in mid-May, and although times have not returned to their
> original levels (they now average approximately 250 ms), the application
> overall seems much more responsive and faster (application servers were not
> changed, other than minor changes --full text search, explicit casts,
> etc.-- to conform to PostgreSQL 11's requirements).
> 
> What we continued to notice was a milder but still definite trend of
> increased query times, during the course of each week, from the mid to high
> 200 ms, to the high 300 ms to low 400 ms. Some years ago, someone had
> noticed that as the number of "raw_page" columns in a particular table
> grew, performance would decline. They wrote a script that once a week locks
> the table, deletes the processed large columns (they are not needed after
> processing), copies the remaining data to a backup table, truncates the
> original table, then copies it back. When this script runs we see an
> immediate change in performance, from 380 ms in the hour before the drop,
> to 250 ms in the hour of the drop. As rows with these populated columns are
> added during the course of a week, the performance drops, steadily, until
> the next week's cleaning operation. Each week the performance increase is
> clear and significant.
> 
> What is perplexing (and I have triple-checked) is that this table is *not*
> referenced in any way in the queries that we time (it is referenced by
> ongoing administrative and processing queries). The operation that cleans
> it frees up approximately 15-20 GB of space each week. Our system
> monitoring shows this change in free disk space, but this is 20 GB out of
> approximately 300 GB of free space (free space is just under 40% of volume
> size), so disk space does not seem to be an issue. The table in question is
> about 21 GB in size, with about 20 GB in toast data, at its largest.
> 
> Even odder, the queries we time *do* reference a much larger table, which
> contains very similar data, and multiple columns of it. It is 355 GB in
> size, with 318 GB in toast data. It grows continually, with no c