RE: Query is slow when run for first time; subsequent execution is fast

2018-09-04 Thread jimmy
On windows, how to put an entry in my db startup script to run this query 
(pg_prewarm) immediately after startng the server, and let the query warm the 
cache itself.
After starting the server, I want to know what is  the server, and it is the 
database I restarted or windows system?
Thank you.  


>Hi,
>On 17 Jan 2018 12:55, "POUSSEL, Guillaume" 
>
>wrote:
>Are you on Windows or Linux? I’m on Windows and wondering if the issue is
>the same on Linux?
>I have experienced this on Mac and Linux machines.
>You can try pg_prewarm, on pg_statistic table and its index.  But I'd
>probably just put an entry in my db startup script to run this query
>immediately after startng the server, and let the query warm the cache
>itself.




>I will try this suggestion and get back on the thread. Is pg_statistic the
>only table to be pre cached? Pls let me know if any other table/index needs
>to be pre warmed.
>
>
>Btw, I don't running a "select * from pg_statistic" will fill the shared
>buffer. Only 256 kb of data will be cached during sequential scans. I will
>try pg_prewarm
>
>
>Why do you restart your database often
>
>
>Postgres is bundled with our application and deployed by our client.
>Starting / stopping the server is not under my control.
>
>
>Regards,
>Nanda

Why the sql is not executed in parallel mode

2018-09-18 Thread jimmy
Why the sql is not executed in parallel mode, does the sql has some problem?
with sql1 as
(select a.*
   from snaps a
  where a.f_date between to_date('2018-03-05', '-MM-dd') and
to_date('2018-03-11', '-MM-dd')
 ),
sql2 as
(select '1' as pId, PM_TO as pValue, type_code as typeCode, version_no as 
versionNo,
bs as bs, l.order_rule as orderRule
   from sql1, qfpl l
  where PM_TO is not null
 and l.pid = 1
 union all
 select '2' as pId,
PRTO as pValue,
type_code as typeCode, version_no as versionNo,
bs as bs, l.order_rule as orderRule
   from sql1, qfpl l
  where PRTO is not null
 and l.pid = 2
 union all
 select '3' as pId,
PRATO as pValue,
type_code as typeCode, version_no as versionNo,
bs as bs, l.order_rule as orderRule
   from sql1, qfpl l
  where PRATO is not null
 and l.pid = 3
 ),
sql4 as (
select typeCode, pId, orderRule, versionNo,
row_number() over(partition by pId, typeCode order by pValue) as rnn
 from sql2
),
sql5 as (
select sql4.typeCode as typeCode,
 sql4.pId as pId,
 sql4.orderRule as orderRule,
 t.pValue as pValue,
 sql4.versionNo as versionNo
from sql4,
(select sql2.typeCode,sql2.pId,sql2.orderRule,
 (case when sql2.orderRule = 1 then
PERCENTILE_DISC(0.05) WITHIN GROUP(ORDER BY sql2.pValue)
  else
PERCENTILE_DISC(0.95) WITHIN GROUP(ORDER BY sql2.pValue)
end) as pValue,
 (case when sql2.orderRule = 1 then
 (case when round(count(1) * 0.05) - 1 < 0 then 1
 else round(count(1) * 0.05)
 end)
  else
 (case when round(count(1) * 0.95) - 1 < 0 then 1
 else round(count(1) * 0.95)
 end)
  end) as rnn
 from sql2
 group by sql2.typeCode, sql2.pId, sql2.orderRule)  t
where sql4.typeCode = t.typeCode
and sql4.pId = t.pId
 and sql4.orderRule = t.orderRule
 and sql4.rnn = t.rnn
),
sql6 as (
select sql2.pId, sql2.typeCode as typeCode, count(1) as fCount
from sql2, sql5
   where sql2.pId = sql5.pId
 and sql2.typeCode = sql5.typeCode
 and ((sql2.orderRule = 2 and sql2.pValue >= sql5.pValue) or
 (sql2.orderRule = 1 and sql2.pValue <= sql5.pValue))
 and sql2.pId != '22'
   group by sql2.pId, sql2.typeCode
   union 
   select sql5.pId, sql5.typeCode, 0 as fCount
 from sql5
where sql5.pId = '22'
group by sql5.pId, sql5.typeCode
)
select sql5.pId,
sql5.typeCode,
(case when sql5.pId = '22' then
   (select p.d_chn
  from qlp p
 where p.version_no = sql5.versionNo
   and p.cno = sql5.pValue
   and (p.typeCode = sql5.typeCode or p.typeCode is null))
  else 
sql5.pValue || ''
  end) pValue,
sql6.fCount,
(case when d.delta = 'Y' then d.dy_val
else d.y_val
end) yVal,
(case when d.is_delta = 'Y' then d.dr_val
else d.r_val
end) rVal,
f.p_no pNo,
f.p_name ||(case when f.unit = '' then ''
else '('|| f.unit ||')'
 end) pName,
f.pe_name || (case when f.unit = '' then ''
   else '(' || f.unit || ')'
 end) peName,
c.fp_name fpName,
f.order_rule as orderRule,
f.pflag pFlag,
f.pdesc as pDesc
   from sql5, sql6, qfpl f, qpa d,qfp c
  where sql5.pId = sql6.pId
and sql5.typeCode = sql6.typeCode
and sql5.pId = f.pid||''
and f.deleted = 0
and f.pid = d.pid
and sql5.typeCode = d.typeCode
and f.fp_id = c.fp_id
   order by f.t_sort, c.fp_id,f.p_no

EXISTS by itself vs SELECT EXISTS much slower in query.

2021-11-08 Thread Jimmy A
A description of what you are trying to achieve and what results you
expect.:
I have two equivalent queries, one with an EXISTS clause by itself and one
wrapped in a (SELECT EXISTS) and the "naked" exists is much slower.
I would expect both to be the same speed / have same execution plan.

-- slow
explain (analyze, buffers)
SELECT
parent.*,
EXISTS (SELECT * FROM child WHERE child.parent_id=parent.parent_id) AS
child_exists
FROM parent
ORDER BY parent_id LIMIT 10;

-- fast
explain (analyze, buffers)
SELECT
parent.*,
(SELECT EXISTS (SELECT * FROM child WHERE
child.parent_id=parent.parent_id)) AS child_exists
FROM parent
ORDER BY parent_id LIMIT 10;

-- slow
https://explain.depesz.com/s/DzcK

-- fast
https://explain.depesz.com/s/EftS

Setup:
CREATE TABLE parent(parent_id BIGSERIAL PRIMARY KEY, name text);
CREATE TABLE child(child_id BIGSERIAL PRIMARY KEY, parent_id bigint
references parent(parent_id), name text);

-- random name and sequential primary key for 100 thousand parents.
INSERT INTO parent
SELECT
nextval('parent_parent_id_seq'),
md5(random()::text)
FROM generate_series(1, 10);

-- 1 million children.
-- set every odd id parent to have children. even id parent gets none.
INSERT INTO child
SELECT
   nextval('child_child_id_seq'),
   ((generate_series/2*2) % 10)::bigint + 1,
   md5(random()::text)
FROM generate_series(1, 100);

CREATE INDEX ON child(parent_id);
VACUUM ANALYZE parent, child;

Both queries return the same results - I have taken a md5 of both queries
without the LIMIT clause to confirm.
Tables have been vacuumed and analyzed.
No other queries are being executed.
Reproducible with LIMIT 1 or LIMIT 100 or LIMIT 500.
Changing work_mem makes no difference.

-[ RECORD 1 ]--+-
relname| parent
relpages   | 935
reltuples  | 10
relallvisible  | 935
relkind| r
relnatts   | 2
relhassubclass | f
reloptions |
pg_table_size  | 7700480
-[ RECORD 2 ]--+-
relname| child
relpages   | 10310
reltuples  | 1e+06
relallvisible  | 10310
relkind| r
relnatts   | 3
relhassubclass | f
reloptions |
pg_table_size  | 84516864

PostgreSQL version number you are running:
PostgreSQL 13.4 on arm-apple-darwin20.5.0, compiled by Apple clang version
12.0.5 (clang-1205.0.22.9), 64-bit

How you installed PostgreSQL:
Using homebrew for mac.
brew install postgres

Changes made to the settings in the postgresql.conf file:  see Server
Configuration for a quick way to list them all.
checkpoint_completion_target | 0.9  | configuration file
checkpoint_timeout   | 30min| configuration file
client_encoding  | UTF8 | client
cpu_tuple_cost   | 0.03 | configuration file
effective_cache_size | 4GB  | configuration file
log_directory| log  | configuration file
log_min_duration_statement   | 25ms | configuration file
log_statement| none | configuration file
log_temp_files   | 0| configuration file
log_timezone | America/Anchorage| configuration file
maintenance_work_mem | 512MB| configuration file
max_parallel_maintenance_workers | 2| configuration file
max_parallel_workers | 4| configuration file
max_parallel_workers_per_gather  | 4| configuration file
max_stack_depth  | 2MB  | environment
variable
max_wal_size | 10GB | configuration file
max_worker_processes | 4| configuration file
min_wal_size | 80MB | configuration file
random_page_cost | 1.1  | configuration file
shared_buffers   | 512MB| configuration file
shared_preload_libraries | auto_explain | configuration file
track_io_timing  | on   | configuration file
vacuum_cost_limit| 1000 | configuration file
wal_buffers  | 64MB | configuration file
wal_compression  | on   | configuration file
work_mem | 128MB| configuration file

Operating system and version:
macOS Big Sur 11.2.3
I have confirmed this to happen on ubuntu linux however.

What program you're using to connect to PostgreSQL:
psql

Is there anything relevant or unusual in the PostgreSQL server logs?:
no

Hardware specs:
MacBook Air10,1 M1
8GB RAM
APPLE SSD AP0512Q 500.28GB


setup.sql
Description: Binary data


Re: EXISTS by itself vs SELECT EXISTS much slower in query.

2021-11-09 Thread Jimmy A
I see, I never knew that.

Indeed there is a hashed subplan for the EXISTS by itself. So that explains
it.

Thanks Tom.


On Mon, Nov 8, 2021 at 12:35 PM Tom Lane  wrote:

> Jimmy A  writes:
> > I have two equivalent queries, one with an EXISTS clause by itself and
> one
> > wrapped in a (SELECT EXISTS) and the "naked" exists is much slower.
> > I would expect both to be the same speed / have same execution plan.
>
> That is a dangerous assumption.  In general, wrapping (SELECT ...) around
> something has a significant performance impact, because it pushes Postgres
> to try to decouple the sub-select's execution from the outer query.
> As an example,
>
> postgres=# select x, random() from generate_series(1,3) x;
>  x |   random
> ---+-
>  1 | 0.08595356832524814
>  2 |  0.6444265043474005
>  3 |  0.6878852071694332
> (3 rows)
>
> postgres=# select x, (select random()) from generate_series(1,3) x;
>  x |   random
> ---+
>  1 | 0.7028987801136708
>  2 | 0.7028987801136708
>  3 | 0.7028987801136708
> (3 rows)
>
> That's not a bug: it's expected that the second query will evaluate
> random() only once.
>
> In the case at hand, I suspect you're getting a "hashed subplan"
> in one query and not the other.  The depesz.com display doesn't
> really show that, but EXPLAIN VERBOSE would.
>
> regards, tom lane
>