RE: Postgres not using index on views

2020-04-07 Thread Rick Vincent
Hi Justin,

You said, "Is there a reason why you don't store the extracted value in its 
own column?"

RV>> It simply is the way the application stores the data.  For Oracle we are 
storing in XML and JSON format; for Postgres, due to limitations of the XML 
API, we are storing in VARCHAR.  We can't break it out into columns very 
easily because of the legacy application.

You said, "It still did a seq scan on the table, so I'm not sure what this has 
to do with index scans ?"

RV>> On Oracle it will use the primary key index because it detects that all of 
the columns in the select clause are indexable.  With Postgres, it might be 
doing a seq scan, but on 180 rows a select on the underlying table is many 
times faster than the same select on the view.  It seems all of the view 
columns are being evaluated, which makes it incredibly slow.

Thanks,
Rick



-Original Message-
From: Justin Pryzby 
Sent: Tuesday, April 7, 2020 6:59 AM
To: Rick Vincent 
Cc: [email protected]; Manoj Kumar ; 
Herve Aubert 
Subject: Re: Postgres not using index on views

On Mon, Apr 06, 2020 at 02:19:59PM +, Rick Vincent wrote:
> I am seeing a performance problem with postgresql v 11.7 on views, and I am 
> wondering if anyone can tell me why or has any suggestion.
>
> A table is created as:
>
> CREATE TABLE "FBNK_CUSTOMER" (RECID VARCHAR(255) NOT NULL PRIMARY KEY,
> XMLRECORD VARCHAR)
>
> And contains only 180 rows.
>
> Doing an explain plan on the view created over this gives:
>
> EXPLAIN ANALYZE
> select RECID from "V_FBNK_CUSTOMER"
>
>
> Subquery Scan on "V_FBNK_CUSTOMER"  (cost=0.00..19014.60 rows=180 width=7) 
> (actual time=459.601..78642.189 rows=180 loops=1)
>   ->  Seq Scan on "FBNK_CUSTOMER" a  (cost=0.00..19012.80 rows=180
> width=14575) (actual time=459.600..78641.950 rows=180 loops=1)
>
> Yet an Explain plan on the underlying table (on select RECID from 
> "FBNK_CUSTOMER") gives:
>
> Seq Scan on "FBNK_CUSTOMER"  (cost=0.00..22.80 rows=180 width=7)
> (actual time=0.004..0.272 rows=180 loops=1)

It still did a seq scan on the table, so I'm not sure what this has to do with 
index scans ?

> The following query takes an extremely long time for only 180 rows, and what 
> this means is that we would have to index anything appearing in the where 
> clause for every table in order to use views because the views seem not to 
> consider the select clause.  Why is that and does anyone know a way around 
> this?

Is there a reason why you don't store the extracted value in its own column ?
And maybe keep it up to date using an insert/update trigger on the xmlrecord 
column.
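
A sketch of that approach (the column and trigger names here are hypothetical, 
and it assumes the thread's extractValueJS function with the parameters Rick 
uses later):

```sql
-- Sketch only: extracted_tester and the trigger names are invented.
ALTER TABLE "FBNK_CUSTOMER" ADD COLUMN extracted_tester varchar;

CREATE OR REPLACE FUNCTION sync_extracted_tester() RETURNS trigger AS $$
BEGIN
    -- Recompute the extracted value whenever the source column changes.
    NEW.extracted_tester := extractValueJS(NEW.xmlrecord, 179, 9);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_sync_extracted_tester
    BEFORE INSERT OR UPDATE OF xmlrecord ON "FBNK_CUSTOMER"
    FOR EACH ROW EXECUTE FUNCTION sync_extracted_tester();

-- The extracted column is then a plain column and can be indexed:
CREATE INDEX ON "FBNK_CUSTOMER" (extracted_tester);
```

With that in place the expensive extraction runs once per write instead of 
once per row per select.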

--
Justin

The information in this e-mail and any attachments is confidential and may be 
legally privileged. It is intended solely for the addressee or addressees. Any 
use or disclosure of the contents of this e-mail/attachments by a not intended 
recipient is unauthorized and may be unlawful. If you have received this e-mail 
in error please notify the sender. Please note that any views or opinions 
presented in this e-mail are solely those of the author and do not necessarily 
represent those of TEMENOS. We recommend that you check this e-mail and any 
attachments against viruses. TEMENOS accepts no liability for any damage caused 
by any malicious code or virus transmitted by this e-mail.




RE: Postgres not using index on views

2020-04-07 Thread Rick Vincent
Hi Tom,

The function is defined as below, so no use of VOLATILE.  Let me know if you 
need any other information.  I am hoping the below will further clarify the 
issue.

CREATE OR REPLACE FUNCTION extractValueJS (sVar text, nfm INTEGER, nvm INTEGER)
RETURNS VARCHAR as $$
declare
sRet text := '';
nSize int := 0;
retVal int := 0;
cVar text[] := regexp_split_to_array(sVar,'');
idx int := 1;
nStart int := 0;
nEnd int := 0;
begin
etc...
return sRet;
end;
$$ LANGUAGE plpgsql;

After reading your link:

Here is a better explain plan:

Explain on the table:

EXPLAIN (analyze,BUFFERS)
 select RECID from "FBNK_CUSTOMER"
Seq Scan on "FBNK_CUSTOMER"  (cost=0.00..22.80 rows=180 width=7) (actual 
time=0.011..0.073 rows=180 loops=1)
  Buffers: shared hit=21
Planning Time: 0.056 ms
Execution Time: 0.091 ms

Explain on the view:

EXPLAIN (analyze,BUFFERS)
 select RECID from "V_FBNK_CUSTOMER"

Subquery Scan on "V_FBNK_CUSTOMER"  (cost=0.00..19014.60 rows=180 width=7) 
(actual time=455.727..76837.097 rows=180 loops=1)
  Buffers: shared hit=204
  ->  Seq Scan on "FBNK_CUSTOMER" a  (cost=0.00..19012.80 rows=180 width=14575) 
(actual time=455.726..76836.791 rows=180 loops=1)
Buffers: shared hit=204
Planning Time: 1.109 ms
Execution Time: 76838.505 ms

Explain on view with a column:

EXPLAIN (analyze,BUFFERS)
 SELECT RECID FROM "V_FBNK_CUSTOMER" WHERE "TESTER" = '5.00' ORDER BY RECID
Sort  (cost=19015.06..19015.06 rows=1 width=7) (actual 
time=76033.475..76033.475 rows=1 loops=1)
  Sort Key: "V_FBNK_CUSTOMER".recid
  Sort Method: quicksort  Memory: 25kB
  Buffers: shared hit=21
  ->  Subquery Scan on "V_FBNK_CUSTOMER"  (cost=0.00..19015.05 rows=1 width=7) 
(actual time=66521.952..76033.434 rows=1 loops=1)
Filter: (("V_FBNK_CUSTOMER"."TESTER")::text = '5.00'::text)
Rows Removed by Filter: 179
Buffers: shared hit=21
->  Seq Scan on "FBNK_CUSTOMER" a  (cost=0.00..19012.80 rows=180 
width=14575) (actual time=462.949..76033.096 rows=180 loops=1)
  Buffers: shared hit=21
Planning Time: 0.819 ms
Execution Time: 76033.731 ms

But the same query on the underlying table rather than the view, using the 
expression behind the one view column called TESTER:

EXPLAIN (analyze,BUFFERS)
 SELECT RECID FROM "FBNK_CUSTOMER" WHERE extractValueJS(XMLRECORD, 179, 9) = 
'5.00' ORDER BY RECID
Sort  (cost=68.26..68.27 rows=1 width=7) (actual time=220.403..220.404 rows=1 
loops=1)
  Sort Key: recid
  Sort Method: quicksort  Memory: 25kB
  Buffers: shared hit=21
  ->  Seq Scan on "FBNK_CUSTOMER"  (cost=0.00..68.25 rows=1 width=7) (actual 
time=193.000..220.397 rows=1 loops=1)
Filter: ((extractvaluejs((xmlrecord)::text, 179, 9))::text = 
'5.00'::text)
Rows Removed by Filter: 179
Buffers: shared hit=21
Planning Time: 0.045 ms
Execution Time: 220.418 ms

Other info:

SELECT relname, relpages, reltuples, relallvisible, relkind, relnatts, 
relhassubclass, reloptions, pg_table_size(oid) FROM pg_class WHERE 
relname='FBNK_CURRENCY';

relname        relpages  reltuples  relallvisible  relkind  relnatts  relhassubclass  reloptions  pg_table_size
FBNK_CURRENCY  6         93         0              r        2         false           NULL        81920

Version is:
PostgreSQL 11.7 (Debian 11.7-2.pgdg90+1) on x86_64-pc-linux-gnu, compiled by 
gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit

It is a postgres docker image.

Thanks,
Rick

-Original Message-
From: Tom Lane 
Sent: Tuesday, April 7, 2020 7:09 AM
To: Justin Pryzby 
Cc: Rick Vincent ; [email protected]; 
Manoj Kumar ; Herve Aubert 
Subject: Re: Postgres not using index on views

Justin Pryzby writes:
> On Mon, Apr 06, 2020 at 02:19:59PM +, Rick Vincent wrote:
>> The following query takes an extremely long time for only 180 rows, and what 
>> this means is that we would have to index anything appearing in the where 
>> clause for every table in order to use views because the views seem not to 
>> consider the select clause.  Why is that and does anyone know a way around 
>> this?

> Is there a reason why you don't store the extracted value in its own column ?

The planner seems to be quite well aware that the slower query is going to be 
slower, since the estimated costs are much higher.  Since it's not choosing to 
optimize into a faster form, I wonder whether it's constrained by semantic 
requirements.  In particular, I'm suspicious that some of those functions you 
have in the view are marked "volatile", preventing them from being optimized 
away.

Beyond that guess, though, there's really not enough info here to say.
The info we usually ask for to debug slow-query problems is explained at

https://wiki.postgresql.org/wiki/Slow_Query_Questions

regards, tom lane



Re: Postgres not using index on views

2020-04-07 Thread Thomas Kellerer
Rick Vincent schrieb am 07.04.2020 um 11:08:
> The function is defined as below, so no use of VOLATILE.

If you don't specify anything, the default is VOLATILE.

So your function *is* volatile.
 
> CREATE OR REPLACE FUNCTION extractValueJS (sVar text, nfm INTEGER, nvm 
> INTEGER)
> RETURNS VARCHAR as $$
> declare
> sRet text := '';
> nSize int := 0;
> retVal int := 0;
> cVar text[] := regexp_split_to_array(sVar,'');
> idx int := 1;
> nStart int := 0;
> nEnd int := 0;
> begin
> etc...
>     return sRet;
> end;
> $$ LANGUAGE plpgsql;

You haven't shown us your actual code, but if you can turn that into a 
"language sql" function (defined as immutable, or at least stable), I would 
expect it to be way more efficient.
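
As an illustration only (the real extraction logic is elided as "etc..." in 
the thread, so this body is a placeholder that merely returns the nfm-th 
character of the split input; nvm is unused here):

```sql
-- Placeholder body; the point is the LANGUAGE and volatility declaration.
CREATE OR REPLACE FUNCTION extractValueJS (sVar text, nfm integer, nvm integer)
RETURNS varchar AS $$
    SELECT (regexp_split_to_array(sVar, ''))[nfm];
$$ LANGUAGE sql IMMUTABLE;
```

IMMUTABLE (or at least STABLE) is what lets the planner optimize calls away, 
and it is only safe to declare if the result depends solely on the arguments.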

Thomas




Re: Postgres not using index on views

2020-04-07 Thread Thomas Kellerer
> RV>> It simply is the way the application stores the data. For Oracle
> we are storing in XML and JSON format, for postgres, due do
> limitations of XML api, we are storing in VARCHAR.

Why not use JSON in Postgres then?
Postgres' JSON functions are at least as powerful as Oracle's  (if not better 
in a lot of areas).
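
For example (a sketch with an invented table and key name, assuming the record 
were stored as jsonb):

```sql
-- Hypothetical jsonb variant; table, column, and key names are invented.
CREATE TABLE customer_js (
    recid      varchar(255) PRIMARY KEY,
    jsonrecord jsonb
);

-- Extract a field directly, with no plpgsql string parsing:
SELECT recid FROM customer_js WHERE jsonrecord ->> 'tester' = '5.00';

-- Such expressions can also be indexed:
CREATE INDEX ON customer_js ((jsonrecord ->> 'tester'));
```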

Would be interesting to see what XML function/feature from Oracle you can't 
replicate/migrate to Postgres.

Another option might be to upgrade to Postgres 12 and define those columns as 
generated columns as part of the table, rather than a view.
Then you only pay the performance penalty of the extractValueJS() function when 
you update the table, not for every select.
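
In sketch form (generated columns require PostgreSQL 12+ and an IMMUTABLE 
extraction function, so extractValueJS would first have to be declared 
IMMUTABLE; the parameters are the ones from Rick's query):

```sql
-- Requires PostgreSQL 12+ and an IMMUTABLE extractValueJS.
ALTER TABLE "FBNK_CUSTOMER"
    ADD COLUMN "TESTER" varchar
    GENERATED ALWAYS AS (extractValueJS(xmlrecord, 179, 9)) STORED;

-- The generated column is stored like any other column, so it is indexable:
CREATE INDEX ON "FBNK_CUSTOMER" ("TESTER");
```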

Thomas




Re: slow query

2020-04-07 Thread Michael Christofides
That plan looks like it might have been cropped in places, and the
formatting is making it tricky to help.

Could you try again, pasting the plan into https://explain.depesz.com/ to
make it easier to review?

On Fri, Apr 3, 2020 at 5:18 PM dangal  wrote:

> Justin, thank you very much for your answer; as you can also see, the number
> of rows differs a lot.
> I paste the complete explain inline rather than attaching it, because it is
> large.
>
> "HashAggregate  (cost=12640757.46..12713163.46 rows=385 width=720) (actual
> time=1971962.023..1971962.155 rows=306 loops=1)"
> "  Output: bi.bus_ent_inst_name_num, bi.att_value_num_7, bi.att_value_10,
> ((SubPlan 1)), ((SubPlan 2)), a2.ent_inst_att_str_value, ba.att_value_1,
> depto2.att_value_1, loc2.att_value_1, att_call.ent_inst_att_str_value,
> att_nro.ent_inst_att_str_value, att_bis (...)"
> "  Group Key: bi.bus_ent_inst_name_num, bi.att_value_num_7,
> bi.att_value_10,
> (SubPlan 1), (SubPlan 2), a2.ent_inst_att_str_value, ba.att_value_1,
> depto2.att_value_1, loc2.att_value_1, att_call.ent_inst_att_str_value,
> att_nro.ent_inst_att_str_value, att_bis. (...)"
> "  Buffers: shared hit=5817744 read=1034292 dirtied=790, local hit=2"
> "  ->  Nested Loop  (cost=4347.52..12640740.13 rows=385 width=720)
> (actual time=1906401.083..1971959.176 rows=306 loops=1)"
> "Output: bi.bus_ent_inst_name_num, bi.att_value_num_7,
> bi.att_value_10, (SubPlan 1), (SubPlan 2), a2.ent_inst_att_str_value,
> ba.att_value_1, depto2.att_value_1, loc2.att_value_1,
> att_call.ent_inst_att_str_value, att_nro.ent_inst_att_str_value, att_b
> (...)"
> "Buffers: shared hit=5817744 read=1034292 dirtied=790, local hit=2"
> "->  Hash Join  (cost=4346.94..12228344.41 rows=1427 width=704)
> (actual time=1906372.468..1964409.907 rows=306 loops=1)"
> "  Output: bi.bus_ent_inst_name_num, bi.att_value_num_7,
> bi.att_value_10, ba.bus_ent_inst_id_auto, ba.att_value_1,
> att_call.ent_inst_att_str_value, att_nro.ent_inst_att_str_value,
> att_bis.ent_inst_att_str_value, att_pad.ent_inst_att_str_value, a (...)"
> "  Hash Cond: (ba.att_value_num_1 =
> (bi.bus_ent_inst_name_num)::numeric)"
> "  Buffers: shared hit=5814458 read=1033324 dirtied=790, local
> hit=2"
> "  ->  Hash Right Join  (cost=4339.65..12172907.42
> rows=886647 width=158) (actual time=1906344.617..1963668.889 rows=3362294
> loops=1)"
> "Output: ba.bus_ent_inst_id_auto, ba.att_value_1,
> ba.att_value_num_1, att_call.ent_inst_att_str_value,
> att_nro.ent_inst_att_str_value, att_bis.ent_inst_att_str_value,
> att_pad.ent_inst_att_str_value, att_manz.ent_inst_att_str_value, att_a
> (...)"
> "Hash Cond: ((att_barr.env_id = ba.env_id) AND
> (att_barr.bus_ent_inst_id = ba.bus_ent_inst_id_auto))"
> "Buffers: shared hit=5814458 read=1033324 dirtied=790"
> "->  Index Only Scan using ix_bus_ent_inst_attr_03 on
> public.bus_ent_inst_attribute att_barr  (cost=0.83..1024093.06 rows=4508264
> width=24) (actual time=10.435..52888.091 rows=4244011 loops=1)"
> "  Output: att_barr.att_id,
> att_barr.ent_inst_att_str_value, att_barr.env_id, att_barr.bus_ent_inst_id,
> att_barr.reg_status"
> "  Index Cond: (att_barr.att_id = 1115)"
> "  Heap Fetches: 120577"
> "  Buffers: shared hit=503194 read=31197 dirtied=5"
> "->  Hash  (cost=11101039.12..11101039.12 rows=886647
> width=146) (actual time=1906329.888..1906329.888 rows=3362294 loops=1)"
> "  Output: ba.bus_ent_inst_id_auto, ba.att_value_1,
> ba.env_id, ba.att_value_num_1, att_call.ent_inst_att_str_value,
> att_nro.ent_inst_att_str_value, att_bis.ent_inst_att_str_value,
> att_pad.ent_inst_att_str_value, att_manz.ent_inst_att (...)"
> "  Buckets: 4194304 (originally 1048576)  Batches:
> 1
> (originally 1)  Memory Usage: 396824kB"
> "  Buffers: shared hit=5311264 read=1002127
> dirtied=785"
> "  ->  Hash Right Join
> (cost=10328938.09..11101039.12 rows=886647 width=146) (actual
> time=1867557.718..1904218.946 rows=3362294 loops=1)"
> "Output: ba.bus_ent_inst_id_auto,
> ba.att_value_1, ba.env_id, ba.att_value_num_1,
> att_call.ent_inst_att_str_value, att_nro.ent_inst_att_str_value,
> att_bis.ent_inst_att_str_value, att_pad.ent_inst_att_str_value,
> att_manz.ent_in (...)"
> "Hash Cond: ((att_apt.env_id = ba.env_id)
> AND (att_apt.bus_ent_inst_id = ba.bus_ent_inst_id_auto))"
> "Buffers: shared hit=5311264 read=1002127
> dirtied=785"
> "->  Index Only Scan using
> ix_bus_ent_inst_attr_03 on public.bus_ent_inst_attribute att_apt
> (cost=0.83..746958.06 rows=3287982 width=24) (actual time=0.091..32788.731
> rows=3491599 loops=1)"
> " 

PostgreSQL DBA consulting

2020-04-07 Thread daya airody
hi folks,

we are looking for a PostgreSQL DBA to help us in tuning our database.

Could you please recommend somebody in your network?
thanks,
--daya--


Re: PostgreSQL DBA consulting

2020-04-07 Thread Justin Pryzby
On Tue, Apr 07, 2020 at 05:20:47PM +0530, daya airody wrote:
> we are looking for a PostgreSQL DBA to help us in tuning our database.

You can start here:
https://www.postgresql.org/support/professional_support/

-- 
Justin