LEFT JOIN LATERAL optimisation at plan time
Hi, For a traditional LEFT JOIN, in case the SELECT does not mention a field from a joined table being unique , the planner removes the join. Eg: SELECT a, b --,c FROM table1 LEFT JOIN (select a, c from table2 group by a) joined USING (a) However this behavior is not the same for LATERAL JOINS SELECT a, b --,c FROM table1 LEFT JOIN LATERAL (select a, c from table2 where table1.a = table2.a group by a) joined ON TRUE In this case, the planner still consider the joined table. My guess is it could remove it . Any thought ? -- nicolas
pg_pub_decrypt: 10x performance hit with gpg v2
Hi, pg_pub_decrypt() is ~10x slower when the priv/pub keys have been generated with gnupg version 2.x instead of version 1.x. What I do is: - Create keys with gpg - Export priv/pub keys - Store keys in binary form in a bytea - Create 32 byte random data and encrypt it with pg_pub_encrypt() - \timing on - Decrypt with pg_pub_decrypt(). I see ~8ms with v1 keys vs. ~100ms with v2 keys. I am using defaults everywhere, when generating keys as well as encrypting with pg_pub_encrypt(). Outside postgresql, I've tested random file encryption/decryption with gpg 2.x and with both the v1 keys against the v2 keys (both in the gpg keyring) and cannot detect significant differences. What can I do to track that issue further down. Thanks
Re: LEFT JOIN LATERAL optimisation at plan time
Nicolas Paris writes: > For a traditional LEFT JOIN, in case the SELECT does not mention a field > from a joined table being unique , the planner removes the join. Eg: > SELECT a, b --,c > FROM table1 > LEFT JOIN (select a, c from table2 group by a) joined USING (a) > However this behavior is not the same for LATERAL JOINS > SELECT a, b --,c > FROM table1 > LEFT JOIN LATERAL (select a, c from table2 where table1.a = table2.a group by > a) joined ON TRUE The way you've set that up, the constraint required to deduce uniqueness (i.e. the table1.a = table2.a clause) is hidden inside a non-trivial subquery; and, where it's placed, it isn't actually guaranteeing anything so far as the inner query is concerned, ie the select from table2 could easily return multiple rows. I'm not too surprised that the outer planner level doesn't make this deduction. > In this case, the planner still consider the joined table. My guess is > it could remove it . It looks to me like it would require a substantial amount of additional code and plan-time effort to find cases like this. I'm not convinced that the cost-benefit ratio is attractive. Maybe in some hypothetical future where we're able to flatten sub-selects even though they contain GROUP BY, it would get easier/cheaper to detect this case. But that's just pie in the sky at the moment. regards, tom lane
Why the sql is not executed in parallel mode
Why the sql is not executed in parallel mode, does the sql has some problem?
with sql1 as
(select a.*
from snaps a
where a.f_date between to_date('2018-03-05', '-MM-dd') and
to_date('2018-03-11', '-MM-dd')
),
sql2 as
(select '1' as pId, PM_TO as pValue, type_code as typeCode, version_no as
versionNo,
bs as bs, l.order_rule as orderRule
from sql1, qfpl l
where PM_TO is not null
and l.pid = 1
union all
select '2' as pId,
PRTO as pValue,
type_code as typeCode, version_no as versionNo,
bs as bs, l.order_rule as orderRule
from sql1, qfpl l
where PRTO is not null
and l.pid = 2
union all
select '3' as pId,
PRATO as pValue,
type_code as typeCode, version_no as versionNo,
bs as bs, l.order_rule as orderRule
from sql1, qfpl l
where PRATO is not null
and l.pid = 3
),
sql4 as (
select typeCode, pId, orderRule, versionNo,
row_number() over(partition by pId, typeCode order by pValue) as rnn
from sql2
),
sql5 as (
select sql4.typeCode as typeCode,
sql4.pId as pId,
sql4.orderRule as orderRule,
t.pValue as pValue,
sql4.versionNo as versionNo
from sql4,
(select sql2.typeCode,sql2.pId,sql2.orderRule,
(case when sql2.orderRule = 1 then
PERCENTILE_DISC(0.05) WITHIN GROUP(ORDER BY sql2.pValue)
else
PERCENTILE_DISC(0.95) WITHIN GROUP(ORDER BY sql2.pValue)
end) as pValue,
(case when sql2.orderRule = 1 then
(case when round(count(1) * 0.05) - 1 < 0 then 1
else round(count(1) * 0.05)
end)
else
(case when round(count(1) * 0.95) - 1 < 0 then 1
else round(count(1) * 0.95)
end)
end) as rnn
from sql2
group by sql2.typeCode, sql2.pId, sql2.orderRule) t
where sql4.typeCode = t.typeCode
and sql4.pId = t.pId
and sql4.orderRule = t.orderRule
and sql4.rnn = t.rnn
),
sql6 as (
select sql2.pId, sql2.typeCode as typeCode, count(1) as fCount
from sql2, sql5
where sql2.pId = sql5.pId
and sql2.typeCode = sql5.typeCode
and ((sql2.orderRule = 2 and sql2.pValue >= sql5.pValue) or
(sql2.orderRule = 1 and sql2.pValue <= sql5.pValue))
and sql2.pId != '22'
group by sql2.pId, sql2.typeCode
union
select sql5.pId, sql5.typeCode, 0 as fCount
from sql5
where sql5.pId = '22'
group by sql5.pId, sql5.typeCode
)
select sql5.pId,
sql5.typeCode,
(case when sql5.pId = '22' then
(select p.d_chn
from qlp p
where p.version_no = sql5.versionNo
and p.cno = sql5.pValue
and (p.typeCode = sql5.typeCode or p.typeCode is null))
else
sql5.pValue || ''
end) pValue,
sql6.fCount,
(case when d.delta = 'Y' then d.dy_val
else d.y_val
end) yVal,
(case when d.is_delta = 'Y' then d.dr_val
else d.r_val
end) rVal,
f.p_no pNo,
f.p_name ||(case when f.unit = '' then ''
else '('|| f.unit ||')'
end) pName,
f.pe_name || (case when f.unit = '' then ''
else '(' || f.unit || ')'
end) peName,
c.fp_name fpName,
f.order_rule as orderRule,
f.pflag pFlag,
f.pdesc as pDesc
from sql5, sql6, qfpl f, qpa d,qfp c
where sql5.pId = sql6.pId
and sql5.typeCode = sql6.typeCode
and sql5.pId = f.pid||''
and f.deleted = 0
and f.pid = d.pid
and sql5.typeCode = d.typeCode
and f.fp_id = c.fp_id
order by f.t_sort, c.fp_id,f.p_no
Re: Why the sql is not executed in parallel mode
On Wed, Sep 19, 2018 at 1:53 PM jimmy wrote: > > Why the sql is not executed in parallel mode, does the sql has some problem? > with sql1 as Hello Jimmy, WITH is the problem. From the manual[1]: "The following operations are always parallel restricted. Scans of common table expressions (CTEs). ...". That means that these CTEs can only be scanned in the leader process. If you rewrite the query using sub selects it might do better. FWIW there is a project to make WITH work like subselects automatically in a future release of PostgreSQL: https://www.postgresql.org/message-id/flat/[email protected] [1] https://www.postgresql.org/docs/10/static/parallel-safety.html -- Thomas Munro http://www.enterprisedb.com
