Details after Load Peak was: OT: Performance of VM

2018-02-06 Thread Thomas Güttler



Am 05.02.2018 um 14:26 schrieb Andreas Kretschmer:



Am 05.02.2018 um 14:14 schrieb Thomas Güttler:
What do you suggest to get some reliable figures? 


sar is often recommended, see 
https://blog.2ndquadrant.com/in-the-defense-of-sar/.

Can you exclude other reasons like vacuum / vacuum freeze?


In the current case it was a problem in the hypervisor.

But I want to be prepared for the next time.

The tool sar looks good. This way I can generate a chart where I can see peaks. 
Nice.

 But one thing is still unclear. Imagine I see a peak in the chart. The peak
was some hours ago. AFAIK sar has only the aggregated numbers.

But I need to know details if I want to answer the question "Why?". The peak
has gone and ps/top/iotop don't help me anymore.

Any idea?

Regards,
  Thomas Güttler





--
Thomas Guettler http://www.thomas-guettler.de/
I am looking for feedback: https://github.com/guettli/programming-guidelines



Re: Details after Load Peak was: OT: Performance of VM

2018-02-06 Thread Alan Hodgson
On Tue, 2018-02-06 at 15:31 +0100, Thomas Güttler wrote:
> 
 But one thing is still unclear. Imagine I see a peak in the chart. The peak
> was some hours ago. AFAIK sar has only the aggregated numbers.
> 
> But I need to know details if I want to answer the question "Why?". The peak
> has gone and ps/top/iotop don't help me anymore.
> 

The typical solution is to store stats on everything you can think of
with munin, cacti, ganglia, or similar systems.

I know with ganglia at least, in addition to all the many details it
already tracks on a system and the many plugins already available for
it, you can write your own plugins or simple agents, so you can keep
stats on anything you can code around.

Munin's probably the easiest to try out, though.

Re: failing to use index on UNION of matviews (Re: postgresql 10.1 wrong plan in when using partitions bug)

2018-02-06 Thread Justin Pryzby
On Sun, Feb 04, 2018 at 11:04:56AM -0500, Rick Otten wrote:
> On Sun, Feb 4, 2018 at 10:35 AM, Tom Lane  wrote:
> 
> > Rick Otten  writes:
> > > I'm wrestling with a very similar problem too - except instead of
> > official
> > > partitions I have a views on top of a bunch (50+) of unioned materialized
> > > views, each "partition" with 10M - 100M rows.  On 9.6.6 the queries would
> > > use the indexes on each materialized view.  On 10.1, every materialized
> > > view is sequence scanned.

I think it'd be useful to see the plan from explain analyze, on both the
"parent" view and a child, with and without SET enable_seqscan=off,

Justin



Re: failing to use index on UNION of matviews (Re: postgresql 10.1 wrong plan in when using partitions bug)

2018-02-06 Thread Rick Otten
Ooo.  I wasn't aware of that option.  (Learn something new every day!)

Setting enable_seqscan=off takes one of the shorter queries I was working
with from about 3 minutes to 300ms.   This is a comparable performance
improvement to where I put a materialized view (with indexes) on top of the
materialized views instead of using a simple view on top of the
materialized views.  I'll have to try it with the query that takes 12 hours.

I built a test case, but can't get it to reproduce what I'm seeing on my
production database (it keeps choosing the indexes).  I'm still fiddling
with that test case so I can easily share it.  I'm also back to trying to
figure out what is different between my laptop database and the test case I
built and the real world query with the real data, and pondering the worst
query itself to see if some sort of re-write will help.



On Tue, Feb 6, 2018 at 1:18 PM, Justin Pryzby  wrote:

> On Sun, Feb 04, 2018 at 11:04:56AM -0500, Rick Otten wrote:
> > On Sun, Feb 4, 2018 at 10:35 AM, Tom Lane  wrote:
> >
> > > Rick Otten  writes:
> > > > I'm wrestling with a very similar problem too - except instead of
> > > official
> > > > partitions I have a views on top of a bunch (50+) of unioned
> materialized
> > > > views, each "partition" with 10M - 100M rows.  On 9.6.6 the queries
> would
> > > > use the indexes on each materialized view.  On 10.1, every
> materialized
> > > > view is sequence scanned.
>
> I think it'd be useful to see the plan from explain analyze, on both the
> "parent" view and a child, with and without SET enable_seqscan=off,
>
> Justin
>


Re: failing to use index on UNION of matviews (Re: postgresql 10.1 wrong plan in when using partitions bug)

2018-02-06 Thread Rick Otten
On Tue, Feb 6, 2018 at 3:02 PM, Rick Otten  wrote:

> Ooo.  I wasn't aware of that option.  (Learn something new every day!)
>
> Setting enable_seqscan=off takes one of the shorter queries I was working
> with from about 3 minutes to 300ms.   This is a comparable performance
> improvement to where I put a materialized view (with indexes) on top of the
> materialized views instead of using a simple view on top of the
> materialized views.  I'll have to try it with the query that takes 12 hours.
>
> I built a test case, but can't get it to reproduce what I'm seeing on my
> production database (it keeps choosing the indexes).  I'm still fiddling
> with that test case so I can easily share it.  I'm also back to trying to
> figure out what is different between my laptop database and the test case I
> built and the real world query with the real data, and pondering the worst
> query itself to see if some sort of re-write will help.
>
>
>
> On Tue, Feb 6, 2018 at 1:18 PM, Justin Pryzby 
> wrote:
>
>> On Sun, Feb 04, 2018 at 11:04:56AM -0500, Rick Otten wrote:
>> > On Sun, Feb 4, 2018 at 10:35 AM, Tom Lane  wrote:
>> >
>> > > Rick Otten  writes:
>> > > > I'm wrestling with a very similar problem too - except instead of
>> > > official
>> > > > partitions I have a views on top of a bunch (50+) of unioned
>> materialized
>> > > > views, each "partition" with 10M - 100M rows.  On 9.6.6 the queries
>> would
>> > > > use the indexes on each materialized view.  On 10.1, every
>> materialized
>> > > > view is sequence scanned.
>>
>> I think it'd be useful to see the plan from explain analyze, on both the
>> "parent" view and a child, with and without SET enable_seqscan=off,
>>
>> Justin
>>
>
>
Sorry, I didn't mean to "top reply".  My bad.


Re: effective_io_concurrency on EBS/gp2

2018-02-06 Thread Justin Pryzby
On Wed, Jan 31, 2018 at 04:34:18PM -0300, Claudio Freire wrote:
> In my experience playing with prefetch, e_i_c>0 interferes with kernel
> read-ahead. What you've got there would make sense if what postgres
> thinks will be random I/O ends up being sequential. With e_i_c=0, the
> kernel will optimize the hell out of it, because it's a predictable
> pattern. But with e_i_c=1, the kernel's optimization gets disabled but
> postgres isn't reading much ahead, so you get the worst possible case.

On Thu, Feb 01, 2018 at 03:39:07PM -0300, Claudio Freire wrote:
> Problem is, if you're scanning a highly correlated index, the
> mechanism is counterproductive. I had worked on some POC patches for
> correcting that, I guess I could work something out, but it's
> low-priority for me. Especially since it's actually a kernel "bug" (or
> shortcoming), that could be fixed in the kernel rather than worked
> around by postgres.

On Sun, Feb 04, 2018 at 11:27:25PM -0300, Claudio Freire wrote:
> ... Dense scans have large portions of contiguous fetches, a pattern that is
> quite adversely affected by the current prefetch mechanism in linux.
> 
> ... There's a rather simple workaround for this, pg should just avoid issuing
> prefetch orders for sequential block patterns, since those are already much
> better handled by the kernel itself.

Thinking out loud.. if prefetch were a separate process, I imagine this
wouldn't be an issue ; is it possible the parallel worker code could take on
responsibility of prefetching (?)

Justin