slow query with inline function on AWS RDS with RDS 24x large
The query below is slow even with no data:

EXPLAIN ANALYZE
WITH business AS (SELECT * FROM get_businessday_utc_f() start_date)
SELECT ro.order_id,
ro.date_time,
round(ro.order_amount, 2) AS order_amount,
b.branch_id,
b.branch_name,
st_x(b.location) AS from_x,
st_y(b.location) AS from_y,
b.user_id AS branch_user_id,
b.contact_info,
r.restaurant_id,
c.city_id,
c.city_name,
c.city_name_ar,
st_linefromtext(((((((('LINESTRING('::text || st_x(b.location)) ||
' '::text) || st_y(b.location)) || ','::text) ||
st_x(ro.location_geometry)) || ' '::text) ||
st_y(ro.location_geometry)) || ')'::text, 28355) AS from_to,
to_char(ro.date_time, 'HH24:MI'::text) AS order_time,
ro.customer_comment,
'N'::text AS is_new_customer,
ro.picked_up_time,
ro.driver_assigned_date_time,
oom.offer_amount,
oom.offer_type_code AS offer_type,
ro.uk_vat
FROM business, restaurant_order ro
JOIN branch b ON b.branch_id = ro.branch_id
JOIN restaurant r ON r.restaurant_id = b.restaurant_id
JOIN city c ON c.city_id = b.city_id
LEFT JOIN order_offer_map oom using (order_id)
WHERE ro.date_time >= business.start_date
  AND ro.date_time <= f_now_immutable_with_tz();
Hash Left Join (cost=55497.32..5417639.59 rows=5397276 width=291)
(actual time=1056.926..1056.934 rows=0 loops=1)
Hash Cond: (ro.order_id = oom.order_id)
-> Hash Join (cost=6584.61..3674143.44 rows=5397276 width=209)
(actual time=1056.926..1056.932 rows=0 loops=1)
Hash Cond: (ro.branch_id = b.branch_id)
-> Nested Loop (cost=5427.94..3546726.47 rows=19275986
width=108) (actual time=1036.809..1036.810 rows=0 loops=1)
-> Function Scan on start_date (cost=0.00..0.01 rows=1
width=8) (actual time=0.006..0.008 rows=1 loops=1)
-> Bitmap Heap Scan on restaurant_order ro
(cost=5427.94..3353966.60 rows=19275986 width=108) (actual
time=1036.793..1036.793 rows=0 loops=1)
Recheck Cond: ((date_time >=
start_date.start_date) AND (date_time <= '2021-06-04
08:05:32.784199+00'::timestamp with time zone))
Rows Removed by Index Recheck: 5039976
Heap Blocks: lossy=275230
-> Bitmap Index Scan on rest_ord_date_brin
(cost=0.00..608.94 rows=19359111 width=0) (actual time=14.037..14.038
rows=2917120 loops=1)
Index Cond: ((date_time >=
start_date.start_date) AND (date_time <= '2021-06-04
08:05:32.784199+00'::timestamp with time zone))
-> Hash (cost=1083.35..1083.35 rows=5866 width=109) (actual
time=20.106..20.109 rows=20949 loops=1)
Buckets: 32768 (originally 8192) Batches: 1 (originally
1) Memory Usage: 3112kB
-> Hash Join (cost=343.29..1083.35 rows=5866
width=109) (actual time=1.620..14.539 rows=20949 loops=1)
Hash Cond: (b.restaurant_id = r.restaurant_id)
-> Hash Join (cost=2.26..726.91 rows=5866
width=109) (actual time=0.029..8.597 rows=20949 loops=1)
Hash Cond: (b.city_id = c.city_id)
-> Seq Scan on branch b (cost=0.00..668.49
rows=20949 width=88) (actual time=0.004..1.609 rows=20949 loops=1)
-> Hash (cost=1.56..1.56 rows=56 width=29)
(actual time=0.020..0.021 rows=56 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Seq Scan on city c
(cost=0.00..1.56 rows=56 width=29) (actual time=0.004..0.010 rows=56
loops=1)
-> Hash (cost=233.42..233.42 rows=8609 width=8)
(actual time=1.575..1.575 rows=8609 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 465kB
-> Index Only Scan using
"restaurant_idx$$_274b003d" on restaurant r (cost=0.29..233.42
rows=8609 width=8) (actual time=0.006..0.684 rows=8609 loops=1)
Heap Fetches: 0
-> Hash (cost=33000.09..33000.09 rows=1273009 width=13) (never executed)
-> Seq Scan on order_offer_map oom (cost=0.00..33000.09
rows=1273009 width=13) (never executed)
Planning Time: 1.180 ms
Execution Time: 1057.535 ms
Could someone explain why it is slow? If I insert 50k records, the
execution time reaches 20 seconds.
Re: slow query with inline function on AWS RDS with RDS 24x large
On Fri, Jun 4, 2021 at 10:07, Ayub Khan wrote:
> [quoted query and plan trimmed]
-> Bitmap Heap Scan on restaurant_order ro
(cost=5427.94..3353966.60 rows=19275986 width=108) (actual
time=1036.793..1036.793 rows=0 loops=1)
Recheck Cond: ((date_time >= start_date.start_date) AND
(date_time <= '2021-06-04 08:05:32.784199+00'::timestamp with time zone))
Rows Removed by Index Recheck: 5039976
Heap Blocks: lossy=275230
Re: slow query with inline function on AWS RDS with RDS 24x large
The BRIN index is only on the date_time column. I even tried a btree
index, with no performance gains.
On Fri, Jun 4, 2021 at 11:23 AM Pavel Stehule wrote:
> [quoted thread trimmed]
Re: slow query with inline function on AWS RDS with RDS 24x large
Hi
On Fri, Jun 4, 2021 at 10:32, Ayub Khan wrote:
> BRIN index is only on the date_time column, I even tried with btree index
> with no performance gains.
>
-> Bitmap Heap Scan on restaurant_order ro
(cost=5427.94..3353966.60 rows=19275986 width=108) (actual
time=1036.793..1036.793 rows=0 loops=1)
Recheck Cond: ((date_time >=
start_date.start_date) AND (date_time <= '2021-06-04
08:05:32.784199+00'::timestamp with time zone))
Rows Removed by Index Recheck: 5039976
Heap Blocks: lossy=275230
When most of the rows are removed during the recheck, the index is not
effective.
Pavel
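The lossy-recheck symptom Pavel points at can be attacked from two directions. As a sketch only (the index, table, and column names are taken from the plan above; pages_per_range = 16 is an illustrative value, not a recommendation):

```sql
-- Option 1: finer BRIN ranges, so fewer irrelevant heap pages match
-- (the default pages_per_range is 128; 16 here is only illustrative).
DROP INDEX IF EXISTS rest_ord_date_brin;
CREATE INDEX rest_ord_date_brin
    ON restaurant_order USING brin (date_time)
    WITH (pages_per_range = 16);

-- Option 2: a plain btree, which yields exact row pointers and avoids
-- the lossy heap recheck entirely.
CREATE INDEX rest_ord_date_btree
    ON restaurant_order USING btree (date_time);
```

Either variant should shrink the "Rows Removed by Index Recheck" count; which one wins depends on how well date_time correlates with the table's physical order.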
> [quoted thread trimmed]
Re: slow query with inline function on AWS RDS with RDS 24x large
You are right, I dropped BRIN and created btree and the performance on 0
rows matching criteria table is good, below is the plan with BTREE. I will
test by inserting lot of data.
Hash Join (cost=50186.91..3765911.10 rows=5397411 width=291) (actual
time=1.501..1.504 rows=0 loops=1)
Hash Cond: (b.restaurant_id = r.restaurant_id)
-> Hash Left Join (cost=49845.88..2078197.48 rows=5397411 width=216)
(actual time=0.079..0.081 rows=0 loops=1)
Hash Cond: (ro.order_id = oom.order_id)
-> Hash Join (cost=933.18..2007856.35 rows=5397411 width=209)
(actual time=0.078..0.080 rows=0 loops=1)
Hash Cond: (b.city_id = c.city_id)
-> Hash Join (cost=930.92..1956181.11 rows=19276467
width=188) (actual time=0.048..0.050 rows=0 loops=1)
Hash Cond: (ro.branch_id = b.branch_id)
-> Nested Loop (cost=0.56..1904639.80 rows=19276467
width=108) (actual time=0.048..0.048 rows=0 loops=1)
-> Function Scan on start_date (cost=0.00..0.01
rows=1 width=8) (actual time=0.002..0.003 rows=1 loops=1)
-> Index Scan using rest_ord_date_brin on
restaurant_order ro (cost=0.56..1711875.12 rows=19276467 width=108)
(actual time=0.042..0.042 rows=0 loops=1)
Index Cond: ((date_time >=
start_date.start_date) AND (date_time <= '2021-06-04
08:48:45.377833+00'::timestamp with time zone))
-> Hash (cost=668.49..668.49 rows=20949 width=88)
(never executed)
-> Seq Scan on branch b (cost=0.00..668.49
rows=20949 width=88) (never executed)
-> Hash (cost=1.56..1.56 rows=56 width=29) (actual
time=0.026..0.027 rows=56 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Seq Scan on city c (cost=0.00..1.56 rows=56
width=29) (actual time=0.009..0.016 rows=56 loops=1)
-> Hash (cost=33000.09..33000.09 rows=1273009 width=13) (never
executed)
-> Seq Scan on order_offer_map oom (cost=0.00..33000.09
rows=1273009 width=13) (never executed)
-> Hash (cost=233.42..233.42 rows=8609 width=8) (actual
time=1.403..1.403 rows=8609 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 465kB
-> Index Only Scan using "restaurant_idx$$_274b003d" on restaurant
r (cost=0.29..233.42 rows=8609 width=8) (actual time=0.007..0.634
rows=8609 loops=1)
Heap Fetches: 0
Planning Time: 1.352 ms
Execution Time: 1.571 ms
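The change described above presumably amounts to something like the following (a sketch; note that the plan still reports the name rest_ord_date_brin, so the replacement btree apparently reused the old index name):

```sql
-- Sketch of the BRIN-to-btree swap described above; names follow the plan.
DROP INDEX rest_ord_date_brin;
CREATE INDEX rest_ord_date_brin
    ON restaurant_order USING btree (date_time);
```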
On Fri, Jun 4, 2021 at 11:41 AM Pavel Stehule wrote:
> [quoted thread trimmed]
PgSQL 12 on WinSrv ~3x faster than on Linux
Dear All,

I have a demo stand on Hyper-V (2x Intel E5-2650v2, 256GB RAM, Intel SSDs in RAID):

* One VM with a 1C application server
* Two database VMs with the same database imported into each PostgreSQL instance (~56GB, "1C accounting 3.0" config):
  1. CentOS 7 with postgresql_12.6_6.1C_x86_64 (a distribution with special patches for running 1C); the VM is updated with LIS for Hyper-V
  2. Windows Server 2019 with the same postgresql_12.6_6.1C_x86_64

My real-life test is to "register" 10 _same_ documents (провести документы, i.e. post the documents) in each of the 1C/PostgreSQL DBs. Both PostgreSQL DBs are identical and are imported into PostgreSQL via the application server (DT import) just before the test.

On Windows Server the test procedure takes 20-30 seconds; on Linux it takes 1m-1m10s. The PostgreSQL VMs run on the same hypervisor with the same resources assigned to each of them. Tuning the PostgreSQL config and/or CentOS doesn't make any difference. On the Windows VM, by contrast, we get almost 3x better performance with the stock PostgreSQL config.

Any ideas what's wrong? Such a big difference on identical databases/queries looks strange to me.

--
Taras Savchuk
Re: PgSQL 12 on WinSrv ~3x faster than on Linux
On Fri, 4 Jun 2021 at 23:53, Taras Savchuk wrote:
> My real life test is to "register" 10 _same_ documents (провести документы)
> in each of 1C/PostgreSQL DBs. Both PostgreSQL DBs are identical and just
> before test imported to PostgreSQL via application server (DT import).
> On Windows Server test procedure takes 20-30 seconds, on Linux it takes
> 1m-1m10seconds. PostgreSQL VMs are running on same Hypervisor with same
> resources assigned to each of them.
> Tuning PostgreSQL config and/or CentOS don't make any difference. Contrary
> on Windows VM we have almost 3x better performance with stock PostgreSQL
> config.
>
> Any ideas what's wrong? For me such a big difference on identical
> databases/queries looks strange.

It's pretty difficult to say. You've not provided any useful details about the workload you're running.

If this "register 10 _same_ documents" thing requires running some query, then you might want to look at EXPLAIN (ANALYZE, BUFFERS) for that query. You might want to consider doing SET track_io_timing = on; perhaps Linux is having to read more buffers from disk than Windows.

David.
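Spelled out, the diagnostics David suggests look roughly like this (a sketch; the SELECT is a placeholder for whatever statement the 1C server actually issues):

```sql
-- Enable per-node I/O timing for this session, then capture a plan with
-- buffer counts; run the same query on both VMs and compare.
SET track_io_timing = on;
EXPLAIN (ANALYZE, BUFFERS)
SELECT ...;  -- placeholder for the actual 1C query
```

With track_io_timing on, the EXPLAIN output includes "I/O Timings" lines, which would show directly whether the Linux VM is spending its extra time reading buffers from disk.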
Re: PgSQL 12 on WinSrv ~3x faster than on Linux
Also, if you can, set up an external timer (\timing) along with EXPLAIN ANALYZE to get the total time; it would help if everything else is the same. I have seen some threads that mention an added startup cost for parallel workers on Windows but not on Linux. I do not want to mix those threads in here, but just FYI.

On Fri, Jun 4, 2021, 6:12 PM David Rowley wrote:
> [quoted message trimmed]
Re: PgSQL 12 on WinSrv ~3x faster than on Linux
David Rowley writes:
> If this "register 10 _same_ documents" thing requires running some
> query, then you might want to look at EXPLAIN (ANALYZE, BUFFERS) for
> that query. You might want to consider doing SET track_io_timing =
> on; Perhaps Linux is having to read more buffers from disk than
> Windows.

The first thing that comes to mind for me is fsync working correctly (i.e. actually waiting for the disk write) on Linux but not on Windows. On a weird VM stack like you've got, it's not hard for that sort of thing to go wrong. Needless to say, if that's the issue then the apparent performance win is coming at the cost of crash safety.

pg_test_fsync might help detect such a problem.

regards, tom lane
RE: PgSQL 12 on WinSrv ~3x faster than on Linux
> The first thing that comes to mind for me is fsync working correctly (i.e.
> actually waiting for the disk write) in Linux but not in Windows.
> On a weird VM stack like you've got, it's not hard for that sort of thing
> to go wrong. Needless to say, if that's the issue then the apparent
> performance win is coming at the cost of crash safety.
>
> pg_test_fsync might help detect such a problem.
>
> regards, tom lane

fsync performance on Windows is much better (results are below). Also, network performance between VMs on the same HV is 40% better for win-win than for win-linux (5.97 Gbps vs 3.6 Gbps). Regarding the weird VM stack: we're running both Windows and Linux VMs, and Hyper-V works reasonably well, except for this issue :)

Win:

C:\Program Files\PostgreSQL\12.6-6.1C\bin>pg_test_fsync.exe
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                  7423,645 ops/sec     135 usecs/op
        fdatasync                                   n/a
        fsync                          1910,611 ops/sec     523 usecs/op
        fsync_writethrough             1987,900 ops/sec     503 usecs/op
        open_sync                                   n/a

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                  3827,254 ops/sec     261 usecs/op
        fdatasync                                   n/a
        fsync                          1920,720 ops/sec     521 usecs/op
        fsync_writethrough             1863,852 ops/sec     537 usecs/op
        open_sync                                   n/a

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB in different write
open_sync sizes.)
         1 * 16kB open_sync write                   n/a
         2 *  8kB open_sync writes                  n/a
         4 *  4kB open_sync writes                  n/a
         8 *  2kB open_sync writes                  n/a
        16 *  1kB open_sync writes                  n/a

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different
descriptor.)
        write, fsync, close             144,065 ops/sec    6941 usecs/op
        write, close, fsync             148,751 ops/sec    6723 usecs/op

Non-sync'ed 8kB writes:
        write                           165,484 ops/sec    6043 usecs/op

Linux:

[root@pgsql12 ~]# /usr/pgsql-12/bin/pg_test_fsync
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                  2947.296 ops/sec     339 usecs/op
        fdatasync                      2824.271 ops/sec     354 usecs/op
        fsync                          1885.924 ops/sec     530 usecs/op
        fsync_writethrough                          n/a
        open_sync                      1816.312 ops/sec     551 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                  1458.849 ops/sec     685 usecs/op
        fdatasync                      2712.756 ops/sec     369 usecs/op
        fsync                          1769.353 ops/sec     565 usecs/op
        fsync_writethrough                          n/a
        open_sync                       902.626 ops/sec    1108 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB in different write
open_sync sizes.)
         1 * 16kB open_sync write      1798.811 ops/sec     556 usecs/op
         2 *  8kB open_sync writes      887.727 ops/sec    1126 usecs/op
         4 *  4kB open_sync writes      494.843 ops/sec    2021 usecs/op
         8 *  2kB open_sync writes      233.659 ops/sec    4280 usecs/op
        16 *  1kB open_sync writes      117.417 ops/sec    8517 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different
descriptor.)
        write, fsync, close            1673.781 ops/sec     597 usecs/op
        write, close, fsync            1727.787 ops/sec     579 usecs/op

Non-sync'ed 8kB writes:
        write                        200638.271 ops/sec       5 usecs/op

--
Taras
