Re: Many DataFileRead - IO waits
On Fri, Feb 28, 2020 at 2:00 PM legrand legrand
wrote:
> Hello,
> I'm not able to use your perf diagrams,
> but it seems to me that not using the 3rd column of that index (int_otherid2)
> generates an I/O problem.
>
> Could you give us the result of
>
> explain (analyze,buffers) SELECT
>
> tabledata.uuid_id,tabledata.int_id,tabledata.timestamp_date,tabledata.int_otherid,tabledata.float_value,tabledata.int_otherid2,tabledata.int_otherid3,tabledata.int_rowver
> FROM tabledata
> WHERE timestamp_date <= '2020-02-24 03:05:00.013'::timestamp without time
> zone
> AND int_otherid3 = '3ad2b707-a068-42e8-b0f2-6c8570953760'
> AND tabledata.int_id=8149
> ORDER BY timestamp_date DESC
> LIMIT 1
>
> and run this for each value of int_otherid3?
> And tell us whether you are able to change the SQL?
>
> Thanks
> Regards
> PAscal
>
>
>
Thanks for the suggestion. Yes, I can change the SQL, and when using only
one filter value for int_otherid2 it does use all 3 columns of the index key.
explain (analyze,buffers) SELECT
uuid_id,int_id,timestamp_date,int_otherid,float_value,int_otherid2,int_otherid3,int_rowver
FROM tabledata WHERE dtdatetime <= '2020-01-20 03:05:00.013' AND
gDiagnosticId IN ('3c99d61b-21a1-42ea-92a8-3cc88d79f3f1') AND
ivehicleid=8149 ORDER BY dtdatetime DESC LIMIT 1
QUERY PLAN
--
Limit (cost=0.71..85.13 rows=1 width=84) (actual time=300.820..300.821
rows=1 loops=1)
Buffers: shared hit=17665 read=1
-> Index Scan Backward using
ix_tabledata_intid_timestampdate_intotherid3_intotherid2 on tabledata
(cost=0.71..41960.39 rows=497 width=84) (actual time=300.808..300.809
rows=1 loops=1)
Index Cond: ((int_id = 8149) AND (timestamp_date <= '2020-01-20
03:05:00.013'::timestamp without time zone) AND (int_otherid2 =
'3c99d61b-21a1-42ea-92a8-3cc88d79f3f1'::uuid))
Buffers: shared hit=17665 read=1
Planning time: 58.769 ms
Execution time: 300.895 ms
(7 rows)
I still haven't been able to explain why this changed all of a sudden (I am
working on reproducing the issue in a test environment), but this could be
a good workaround. I might be able to just make 6 calls, or maybe rewrite
the original query some other way to get it to use all 3 keys of
the index. I'll have to do some more testing.
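One hedged way to keep all 3 index keys in play from a single statement (a sketch; the VALUES list is a placeholder, since the six real int_otherid3 values are not shown in the thread) is to run one LIMIT 1 probe per value via LATERAL and then keep the newest row overall:

```sql
-- Sketch: one index-friendly LIMIT 1 lookup per int_otherid3 value,
-- then keep the most recent row across all probes. The UUID below is a
-- placeholder; add one VALUES row per remaining value.
SELECT t.*
FROM (VALUES
        ('3ad2b707-a068-42e8-b0f2-6c8570953760'::uuid)
        -- , ('...'::uuid)  -- one row per remaining int_otherid3 value
     ) AS v(otherid)
CROSS JOIN LATERAL (
    SELECT uuid_id, int_id, timestamp_date, int_otherid, float_value,
           int_otherid2, int_otherid3, int_rowver
    FROM tabledata
    WHERE int_id = 8149
      AND timestamp_date <= '2020-02-24 03:05:00.013'::timestamp
      AND int_otherid3 = v.otherid
    ORDER BY timestamp_date DESC
    LIMIT 1
) AS t
ORDER BY t.timestamp_date DESC
LIMIT 1;
```

Each LATERAL branch can use all three leading index keys, so this should behave like the 6 separate calls but in one round trip.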
Thanks again.
Re: Many DataFileRead - IO waits
On Sat, Feb 29, 2020 at 11:22 AM Jeff Janes wrote:
> On Thu, Feb 27, 2020 at 11:33 AM Ben Snaidero wrote:
>
>> I have the following query that was on average running in ~2ms suddenly
>> jump up to on average ~25ms.
>
> What are you averaging over? The plan you show us is slow enough that if
> you were averaging over the last 1000 executions, that one execution could
> skew the entire average just by itself. When individual execution times
> can vary over 4 powers of 10, I don't think averages are a very good way
> of analyzing things.
>
>> This query is called millions of times per day and there were cases of
>> the query taking 20-30 seconds. Below is the explain analyze of one such
>> example.
>> When seeing this issue, the server was under some CPU pressure but even
>> with that, I would not think it should get as slow as shown below as we
>> are using SSDs and none of the windows disk counters (IOPS, queue length)
>> show any value that would be of concern.
>
> What is the average and expected random read latency on your SSDs? Have
> you benchmarked them (outside of the database system) to see if they are
> performing as expected?
>
>> Rows Removed by Filter: 91686
>> Buffers: shared hit=12102 read=13259 written=111
>
> Do the faster executions have fewer rows removed by the filter (and fewer
> buffers read), or are they just faster despite having about the same
> values?
>
>> We took a perfview during the issue and below is the call stack from a
>> process running this query, two call paths are shown.
>
> I've never used perfview. But if I try to naively interpret it similar to
> gdb backtrace, it doesn't make much sense to me. InitBufTable is only
> called by "postmaster" while starting the database, how could it be part
> of call paths during regular operations? Are these views of the
> slow-running backend itself, or of some other postgresql process which
> was idle at the time the snapshot was taken?
> Cheers,
>
> Jeff

Query statistics were averaged over ~3 million calls, so I don't think a
single outlier would skew the results too much. The perfview call stack is
similar to a gdb backtrace. I am 99% sure that this call path is from the
backend running this query, as we queried pg_stat_activity at the time of
the perfview and cross-referenced the PIDs. That said, I am going to try
building on Windows with debug symbols enabled and see if I can use gdb to
debug and confirm.
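For reference, the pg_stat_activity cross-check was along these lines (a sketch; the wait_event columns exist in PostgreSQL 9.6 and later, and the LIKE pattern is a placeholder for matching the query text):

```sql
-- Sketch: list active backends with their PIDs and current waits, so the
-- PIDs can be matched against the ones captured in the perfview trace.
SELECT pid, wait_event_type, wait_event, state, query_start, query
FROM pg_stat_activity
WHERE state = 'active'
  AND query LIKE '%FROM tabledata%';
```

A backend stuck in DataFileRead waits would show wait_event_type = 'IO' and wait_event = 'DataFileRead' here.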
Re: Many DataFileRead - IO waits
> Thanks for the suggestion. Yes I could change the sql and when using only
> one filter for int_otherid2 it does use all 3 columns as the index key.
> explain (analyze,buffers) SELECT
> uuid_id,int_id,timestamp_date,int_otherid,float_value,int_otherid2,int_otherid3,int_rowver
> FROM tabledata WHERE dtdatetime <= '2020-01-20 03:05:00.013' AND
> gDiagnosticId IN ('3c99d61b-21a1-42ea-92a8-3cc88d79f3f1') AND
> ivehicleid=8149 ORDER BY dtdatetime DESC LIMIT 1
>
> QUERY PLAN
> --
> Limit (cost=0.71..85.13 rows=1 width=84) (actual time=300.820..300.821
> rows=1 loops=1)
>   Buffers: shared hit=17665 read=1
>   -> Index Scan Backward using
> ix_tabledata_intid_timestampdate_intotherid3_intotherid2 on tabledata
> (cost=0.71..41960.39 rows=497 width=84) (actual time=300.808..300.809
> rows=1 loops=1)
>         Index Cond: ((int_id = 8149) AND (timestamp_date <= '2020-01-20
> 03:05:00.013'::timestamp without time zone) AND (int_otherid2 =
> '3c99d61b-21a1-42ea-92a8-3cc88d79f3f1'::uuid))
>         Buffers: shared hit=17665 read=1
> Planning time: 58.769 ms
> Execution time: 300.895 ms
> (7 rows)
> I still haven't been able to explain why this changed all of a sudden (I
> am
> working on reproducing this error in a test environment) but this could be
> a good workaround. I might be able to just make 6 calls or maybe rewrite
> the original query some other way in order to get it to use all 3 keys of
> the index. I'll have to do some more testing
A planning time of 58 ms, plus 300 ms to read 17665 memory blocks, is very,
very bad ...
Are those shared buffers actually in memory, or swapped out?
Is the server CPU-bound or CPU-limited?
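One way to check whether those pages are really resident in shared_buffers is the pg_buffercache contrib extension (a sketch; installing and querying it requires superuser or, on newer versions, the pg_monitor role):

```sql
-- Sketch: count how many shared_buffers pages each relation in the
-- current database occupies, largest consumers first.
CREATE EXTENSION IF NOT EXISTS pg_buffercache;

SELECT c.relname, count(*) AS buffers
FROM pg_buffercache b
JOIN pg_class c
  ON b.relfilenode = pg_relation_filenode(c.oid)
WHERE b.reldatabase = (SELECT oid FROM pg_database
                       WHERE datname = current_database())
GROUP BY c.relname
ORDER BY buffers DESC
LIMIT 10;
```

If tabledata and its index dominate this list, the 17665 "shared hit" blocks really are in PostgreSQL's cache, and the slowness is elsewhere (e.g. the OS swapping the shared memory segment, or CPU contention).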
Maybe you should dump some data for a test case on another platform
(any desktop) to get a comparison point.
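A minimal way to carry such a test slice to another machine (a sketch, using psql's \copy; the file name and predicate are placeholders):

```sql
-- Sketch: export the rows for one int_id to CSV; load them elsewhere
-- into a matching table with:
--   \copy tabledata FROM 'tabledata_8149.csv' CSV HEADER
\copy (SELECT * FROM tabledata WHERE int_id = 8149) TO 'tabledata_8149.csv' CSV HEADER
```

Recreate the index on the target, run the same EXPLAIN (ANALYZE, BUFFERS), and compare timings for the same buffer counts.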
Regards
PAscal
--
Sent from:
https://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html
