Any way to speed up INSERT INTO

2022-03-04 Thread aditya desai
Hi,
One of the service-layer apps is inserting millions of records into a table,
but one row at a time. Although COPY is the fastest way to import a file into
a table, the application has a requirement of processing each row and then
inserting it into the table. Is there any way this INSERT can be tuned by
increasing parameters? It is taking almost 10 hours for just 2.2 million
rows. The table does not have any indexes or triggers.

Regards,
Aditya.


Re: Any way to speed up INSERT INTO

2022-03-04 Thread aditya desai
Hi Bruce,
Correct, the rows are wide. One of the columns is text and one is bytea.

Regards,
Aditya.

On Sat, Mar 5, 2022 at 12:08 AM Bruce Momjian  wrote:

> On Sat, Mar  5, 2022 at 12:01:52AM +0530, aditya desai wrote:
> > Hi,
> > One of the service layer app is inserting Millions of records in a table
> but
> > one row at a time. Although COPY is the fastest way to import a file in a
> > table. Application has a requirement of processing a row and inserting
> it into
> > a table. Is there any way this INSERT can be tuned by increasing
> parameters? It
> > is taking almost 10 hours for just 2.2 million rows in a table. Table
> does not
> > have any indexes or triggers.
>
> Well, sections 14.4 and 14.5 might help:
>
> https://www.postgresql.org/docs/14/performance-tips.html
>
> Your time seems very slow --- are the rows very wide?
>
> --
>   Bruce Momjian  https://momjian.us
>   EDB  https://enterprisedb.com
>
>   If only the physical world exists, free will is an illusion.
>
>


Re: Any way to speed up INSERT INTO

2022-03-04 Thread aditya desai
Thanks all for your inputs. We will try to implement the inserts in a single
transaction. I feel that is the best approach.
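For what it's worth, a minimal sketch of what a batched script could look
like (the table and column names here are hypothetical placeholders; the
batch size of a few hundred to a few thousand rows per transaction and the
multi-row VALUES form are taken from the suggestions quoted below):

BEGIN;
INSERT INTO target_table (id, payload) VALUES
    (1, 'row 1'),
    (2, 'row 2'),
    (3, 'row 3');        -- multi-row VALUES
-- ... more INSERTs for the rest of this batch ...
COMMIT;
-- repeat BEGIN ... COMMIT for the next batch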

Thanks,
AD.

On Saturday, March 5, 2022, Bruce Momjian  wrote:

> On Fri, Mar  4, 2022 at 01:42:39PM -0500, Tom Lane wrote:
> > aditya desai  writes:
> > > One of the service layer app is inserting Millions of records in a
> table
> > > but one row at a time. Although COPY is the fastest way to import a
> file in
> > > a table. Application has a requirement of processing a row and
> inserting it
> > > into a table. Is there any way this INSERT can be tuned by increasing
> > > parameters? It is taking almost 10 hours for just 2.2 million rows in a
> > > table. Table does not have any indexes or triggers.
> >
> > Using a prepared statement for the INSERT would help a little bit.
>
> Yeah, I thought about that but it seems it would only minimally help.
>
> > What would help more, if you don't expect any insertion failures,
> > is to group multiple inserts per transaction (ie put BEGIN ... COMMIT
> > around each batch of 100 or 1000 or so insertions).  There's not
> > going to be any magic bullet that lets you get away without changing
> > the app, though.
>
> Yeah, he/she could insert via multiple rows too:
>
> CREATE TABLE test (x int);
> INSERT INTO test VALUES (1), (2), (3);
>
> > It's quite possible that network round trip costs are a big chunk of your
> > problem, in which case physically grouping multiple rows into each INSERT
> > command (... or COPY ...) is the only way to fix it.  But I'd start with
> > trying to reduce the transaction commit overhead.
>
> Agreed, turning off synchronous_commit for that those queries would be
> my first approach.
>
> --
>   Bruce Momjian  https://momjian.us
>   EDB  https://enterprisedb.com
>
>   If only the physical world exists, free will is an illusion.
>
>


Re: Any way to speed up INSERT INTO

2022-03-08 Thread aditya desai
Hi Tom,
I added BEGIN and COMMIT around the inserts as shown below and executed it
from pgAdmin for 100,000 rows. It ran in just 1 min.

BEGIN;
INSERT INTO TABLE VALUES();
INSERT INTO TABLE VALUES();
.
,
COMMIT;

However, when I pass the same statements to psql as a file (as shown below),
it still takes a lot of time. Am I doing anything wrong? How can I get it to
run within a minute from psql, as it does from pgAdmin?

psql -h host -U user -p Port -d database < INSERT_FILE.sql

psql is still printing output as below.
INSERT 0 1
INSERT 0 1
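One variation that may be worth comparing (a sketch; these are standard psql
options, and whether they change anything here is only a guess) is to run
the file quietly, stop on the first error, and let psql wrap the whole file
in a single transaction instead of the explicit BEGIN/COMMIT:

psql -q -1 -v ON_ERROR_STOP=1 -h host -U user -p Port -d database -f INSERT_FILE.sql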


Regards,
Aditya.


On Sat, Mar 5, 2022 at 12:12 AM Tom Lane  wrote:

> aditya desai  writes:
> > One of the service layer app is inserting Millions of records in a table
> > but one row at a time. Although COPY is the fastest way to import a file
> in
> > a table. Application has a requirement of processing a row and inserting
> it
> > into a table. Is there any way this INSERT can be tuned by increasing
> > parameters? It is taking almost 10 hours for just 2.2 million rows in a
> > table. Table does not have any indexes or triggers.
>
> Using a prepared statement for the INSERT would help a little bit.
> What would help more, if you don't expect any insertion failures,
> is to group multiple inserts per transaction (ie put BEGIN ... COMMIT
> around each batch of 100 or 1000 or so insertions).  There's not
> going to be any magic bullet that lets you get away without changing
> the app, though.
>
> It's quite possible that network round trip costs are a big chunk of your
> problem, in which case physically grouping multiple rows into each INSERT
> command (... or COPY ...) is the only way to fix it.  But I'd start with
> trying to reduce the transaction commit overhead.
>
> regards, tom lane
>


Re: Any way to speed up INSERT INTO

2022-03-08 Thread aditya desai
OK, will check. But from pgAdmin it takes 1 min, while from psql it takes 20
mins for 100,000 rows with BEGIN; COMMIT;

Thanks,
Aditya.

On Tue, Mar 8, 2022 at 8:23 PM Bruce Momjian  wrote:

> On Tue, Mar  8, 2022 at 06:36:17PM +0530, aditya desai wrote:
> > Hi Tom,
> > I added BEGIN and COMMIT as shown below around insert and executed it
> from
> > pgadmin for 100,000 rows. It ran in just 1 min.
> >
> > BEGIN;
> > INSERT INTO TABLE VALUES();
> > INSERT INTO TABLE VALUES();
> > .
> > ,
> > COMMIT;
> >
> > However when I run above from psql by passing it to psql(As shown below)
> as a
> > file. It still takes a lot of time. Am I doing anything wrong? How can I
> run
> > this from pgadmin within a minute?
> >
> > psql -h host -U user -p Port -d database < INSERT_FILE.sql
> >
> > PSQL is still printing as below.
> > INSERT 0 1
> > INSERT 0 1
>
> Uh, they should be the same.  You can turn on log_statement=all on the
> server and look at what queries are being issued in each case.
>
> --
>   Bruce Momjian  https://momjian.us
>   EDB  https://enterprisedb.com
>
>   If only the physical world exists, free will is an illusion.
>
>


Re: View taking time to show records

2022-03-25 Thread aditya desai
Hi,
1. Have you tried creating indexes on the columns for which it is showing
sequential scans?
2. In my experience, if the view refers to another view inside it, it is
advisable to query the tables directly instead of the child view.
3. The table 'so_vendor_address_base' definitely needs indexing to remove
the sequential scans.

Regards,
AD.


On Fri, Mar 25, 2022 at 3:35 PM Kumar, Mukesh 
wrote:

> Hi Team and All ,
>
>
>
> Greeting for the day.
>
>
>
> We have recently migrated from Oracle to PostgreSQL on version 11.4 on
> azure postgres PaaS instance.
>
>
>
> There is 1 query which is taking approx. 10 secs in Oracle and when we ran
> the same query it is taking approx. 1 min
>
>
>
> Can anyone suggest to improve the query as from application end 1 min time
> is not accepted by client.
>
>
>
> Please find the query and explain analyze report from below link
>
>
>
> https://explain.depesz.com/s/RLJn#stats
>
>
>
>
>
> Thanks and Regards,
>
> Mukesh Kumar
>
>
>


FATAL: canceling authentication due to timeout

2022-04-29 Thread aditya desai
Hi,
We are trying to COPY a few tables from Oracle to Postgres and getting the
following error. The data gets partially copied. The table does not have any
huge data; there are 4 numeric columns and 1 varchar column. Could you
please help?

FATAL:canceling authentication due to timeout
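For what it's worth, this particular message is the one the server logs when
authentication_timeout expires before authentication completes; a guess in
this setup is that the LDAP round trip is occasionally slow. Checking the
current value is a cheap first step:

SHOW authentication_timeout;  -- default 1min; bounds how long the server
                              -- waits for authentication (including the
                              -- LDAP exchange) to complete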


Regards,
Aditya.


Selecting RAM and CPU based on max_connections

2022-05-20 Thread aditya desai
Hi,
One of our applications needs 3000 max_connections to the database.
A connection pooler like pgbouncer or pgpool is not certified within the
organization yet, so they are looking at setting up high-configuration
hardware (CPU and memory). Can someone advise how much memory and CPU they
will need if they want a max_connections value of 3000?

Regards,
Aditya.


Re: Selecting RAM and CPU based on max_connections

2022-05-20 Thread aditya desai
Thanks! I will run these suggestions by the app team.
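As a rough worked example of the sizing formula quoted below (all numbers
are illustrative assumptions, not measurements or recommendations):

  shared_buffers                               8 GB
  max_connections * work_mem  (3000 * 4 MB)  ~12 GB  (worst case: one
                                                      work_mem allocation
                                                      per connection)
  maintenance_work_mem                         1 GB
  operating system, page cache, etc.          >8 GB
  ---------------------------------------------------
  total                                      ~29 GB or more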

On Fri, May 20, 2022 at 4:01 PM Laurenz Albe 
wrote:

> On Fri, 2022-05-20 at 12:15 +0200, Andreas Kretschmer wrote:
> > On 20 May 2022 10:27:50 CEST, aditya desai  wrote:
> > > One of our applications needs 3000 max_connections to the database.
> > > Connection pooler like pgbouncer or pgpool is not certified within the
> > > organization yet. So they are looking for setting up high configuration
> > > Hardware with CPU and Memory. Can someone advise how much memory and
> CPU
> > > they will need if they want max_conenction value=3000.
> >
> > Pgbouncer would be the best solution. CPU: number of concurrent
> connections.
> > RAM: shared_buffer + max_connections * work_mem + maintenance_mem +
> operating system + ...
>
> Right.  And then hope and pray that a) the database doesn't get overloaded
> and b) you don't hit any of the database-internal bottlenecks caused by
> many
> connections.
>
> I also got the feeling that the Linux kernel's memory accounting somehow
> lags.
> I have seen cases where every snapshot of "pg_stat_activity" I took showed
> only a few active connections (but each time different ones), but the
> amount of allocated memory exceeded what the currently active sessions
> could
> consume.  I may have made a mistake, and I have no reproducer, but I would
> be curious to know if there is an explanation for that.
> (I am aware that "top" shows shared buffers multiple times).
>
> Yours,
> Laurenz Albe
>


Oracle_FDW table performance issue

2022-07-11 Thread aditya desai
Hi,
I have one oracle_fdw foreign table which gives a performance issue when
joined with a local temp table.

select * from oracle_fdw_table where transaction_id in ( select
transaction_id from temp_table)   -- 54 seconds

I am seeing a HASH SEMI JOIN in the EXPLAIN plan. temp_table has only 74
records.


select * from oracle_fdw_table where transaction_id in (1, 2, 3, ..., 75)   -- 23 ms


Could you please help me understand this drastic behaviour change?

Regards,
Aditya.


Re: Oracle_FDW table performance issue

2022-07-11 Thread aditya desai
Hi Justin,
Sorry, I am unable to send a query plan from a closed network. The stats on
the Oracle table are updated.

It seems that when joining with the local table, it is not filtering the
data on the Oracle side but bringing it over to Postgres. It does filter
when we actually pass the values.


Regards,
Aditya.

On Mon, Jul 11, 2022 at 5:43 PM Justin Pryzby  wrote:

> On Mon, Jul 11, 2022 at 05:38:34PM +0530, aditya desai wrote:
> > Hi,
> > I have one Oracle fdw table which is giving performance issue when joined
> > local temp table gives performance issue.
> >
> > select * from oracle_fdw_table where transaction_id in ( select
> > transaction_id from temp_table)  54 seconds. Seeing HASH SEMI JOIN
> in
> > EXPLAIN PLAN. temp_table has only 74 records.
>
> You'd have to share the plan
>
> https://wiki.postgresql.org/wiki/Slow_Query_Questions
>
> Do the tables have updated stats ?
>


Re: Oracle_FDW table performance issue

2022-07-11 Thread aditya desai
Understood, thanks!! Will try to build a dynamic query to send the ids
across instead of a join.
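A minimal sketch of that idea (the table and column names are the ones from
the earlier mails; the temporary result table and the format()/EXECUTE
approach are assumptions made here):

DO language plpgsql $$
DECLARE
  ids text;
BEGIN
  -- collapse the ~74 local ids into a literal IN list that can be shipped
  SELECT string_agg(transaction_id::text, ',') INTO ids FROM temp_table;
  EXECUTE format(
    'CREATE TEMP TABLE fdw_result AS
       SELECT * FROM oracle_fdw_table WHERE transaction_id IN (%s)', ids);
END;
$$;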

On Mon, Jul 11, 2022 at 8:56 PM Laurenz Albe 
wrote:

> On Mon, 2022-07-11 at 17:38 +0530, aditya desai wrote:
> > I have one Oracle fdw table which is giving performance issue when joined
> > local temp table gives performance issue.
> >
> > select * from oracle_fdw_table where transaction_id in ( select
> transaction_id from temp_table)
> >   54 seconds. Seeing HASH SEMI JOIN  in EXPLAIN PLAN. temp_table has
> only 74 records.
> >
> > select * from from oracle_fdw_table where transaction_id in (
> 1,2,3,.,75)--- 23ms.
> >
> > Could you please help me understand this drastic behaviour change?
>
> The first query joins a local table with a remote Oracle table.  The only
> way for
> such a join to avoid fetching the whole Oracle table would be to have the
> foreign scan
> on the inner side of a nested loop join.  But that would incur many round
> trips to Oracle
> and is therefore perhaps not a great plan either.
>
> In the second case, the whole IN list is shipped to the remote side.
>
> In short, the queries are quite different, and I don't think it is
> possible to get
> the first query to perform as well as the second.
>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
>


pg_wal filling up while running huge updates

2022-08-05 Thread aditya desai
Hi,
We are doing an Oracle to Postgres migration (5 TB+ of data). We are
encoding and decoding BLOB data after migration, and for that we are running
updates on tables having BLOB/CLOB data. When we execute this, pg_wal fills
up.

Do you have any general guidelines for migrating a 5 TB+ database from
Oracle to Postgres? Any specific guidelines around archiving logs?
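On the pg_wal question, a few generic diagnostics may help as a starting
point (these are standard views and parameters; whether any of them is the
culprit here is only a guess, though stalled archiving and abandoned
replication slots are the usual suspects when pg_wal keeps growing):

SELECT slot_name, active, restart_lsn FROM pg_replication_slots;  -- inactive slots pin WAL
SELECT archived_count, failed_count, last_failed_wal FROM pg_stat_archiver;  -- a failing archive_command pins WAL
SHOW max_wal_size;
SHOW wal_keep_segments;  -- wal_keep_size on v13 and later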

Regards,
Aditya.


ALTER STATEMENT getting blocked

2023-01-19 Thread aditya desai
Hi,
We have a Postgres 11.16 DB which is continuously connected to Informatica,
and data gets read from it continuously.

When we have to ALTER TABLE .. ADD COLUMN .., it gets blocked by the SELECTs
issued on the table by the process mentioned above.

Is there any way to ALTER the table concurrently, without getting blocked?
Any parameter or option? Can someone give a specific command?

Regards,
Aditya.


Re: ALTER STATEMENT getting blocked

2023-01-22 Thread aditya desai
Thanks All. Let me check this and get back to you.
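For reference, a more complete, runnable version of the retry loop quoted
below might look like the sketch that follows; the lock_timeout setting, the
exception handler, the pg_sleep pause, and the column definition are
additions made here, and mytable/new_col are placeholders:

SET lock_timeout = '100ms';  -- give up on the lock quickly so the loop can retry

DO language plpgsql $$
BEGIN
  FOR i IN 1 .. 100 LOOP
    BEGIN
      ALTER TABLE mytable ADD COLUMN new_col text;  -- placeholder DDL
      EXIT;                                         -- success: leave the loop
    EXCEPTION WHEN lock_not_available THEN
      PERFORM pg_sleep(0.5);                        -- lock was busy; pause and retry
    END;
  END LOOP;
END;
$$;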

On Fri, Jan 20, 2023 at 2:36 AM MichaelDBA  wrote:

> Do something like this to get it without being behind other
> transactions...You either get in and get your work done or try again
>
> DO language plpgsql $$
> BEGIN
> FOR get_lock IN 1 .. 100 LOOP
>   BEGIN
> ALTER TABLE mytable ;
> EXIT;
>   END;
> END LOOP;
> END;
> $$;
>
>
>
> Tom Lane wrote on 1/19/2023 12:45 PM:
>
> aditya desai   writes:
>
> We have a Postgres 11.16 DB which is continuously connected to informatica
> and data gets read from it continuously.
>
> When we have to ALTER TABLE.. ADD COLUMN.. it gets blocked by the SELECTs
> on the table mentioned by process above.
>
> Is there any way to ALTER the table concurrently without  getting blocked?
> Any parameter or option? Can someone give a specific command?
>
> ALTER TABLE requires exclusive lock to do that, so it will queue up
> behind any existing table locks --- but then new lock requests will
> queue up behind its request.  So this'd only happen if your existing
> reading transactions don't terminate.  Very long-running transactions
> are unfriendly to other transactions for lots of reasons including
> this one; see if you can fix your application to avoid that.  Or
> manually cancel the blocking transaction(s) after the ALTER begins
> waiting.
>
>   regards, tom lane
>
>
>
>
>
> Regards,
>
> Michael Vitale
>
> [email protected] 
>
> 703-600-9343
>
>
>
>


LIKE CLAUSE on VIEWS

2023-01-22 Thread aditya desai
Hi,
Is there any way to improve the performance of a LIKE clause on a VIEW?

select * From request_vw where upper(status) like '%CAPTURED%'  -- 28 seconds

select * from request_vw where status='CAPTURED'

The application team is reluctant to change the queries on the application
side to use = instead of LIKE.

Also, as this is a VIEW, neither trigram nor normal indexes get used.
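For what it's worth, the usual approach for this kind of predicate is a
trigram expression index on the underlying base-table column; the view is
expanded at planning time, so indexes on the base table can still be used
through it. A sketch, assuming the view's status column maps directly to a
column on a base table (request_base is a hypothetical name):

CREATE EXTENSION IF NOT EXISTS pg_trgm;
-- matches predicates of the form upper(status) LIKE '%CAPTURED%'
CREATE INDEX request_base_status_upper_trgm
    ON request_base USING gin (upper(status) gin_trgm_ops);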


Regards,
Aditya.


Connection forcibly closed remote server error.

2023-02-15 Thread aditya desai
Hi,
We are getting this error when transferring data using the COPY command or
running workflows over large amounts of data. We are using password
authentication (LDAP).

"Connection forcibly closed remote server"

Can someone suggest how we can avoid this?


Regards,
Aditya.


Re: Connection forcibly closed remote server error.

2023-02-15 Thread aditya desai
Forgot to mention: the error comes intermittently, it is not consistent. The
observation is that it comes when a connection has to process a larger data
set.

On Wed, Feb 15, 2023 at 5:43 PM aditya desai  wrote:

> Hi,
> We are getting this error when transferring data using COPY command or
> running workflow for huge data. We are using Password Authentication(LDAP)
>
> "Connection forcibly closed remote server"
>
> Can someone help how we can avoid this?
>
>
> Regards,
> Aditya.
>


Re: Connection forcibly closed remote server error.

2023-02-15 Thread aditya desai
Hi Jeff,
Apologies. Here is how the message actually looks:

could not receive data from client: An existing connection was forcibly
closed by the remote host.

All the links from Google point towards connection pooling; however, that
has already been implemented on the application side.
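One thing that sometimes helps with this symptom, when long-running
transfers sit behind a firewall or load balancer that drops idle
connections, is more aggressive TCP keepalives. A sketch only; these are
standard server GUCs, the values are illustrative, and whether they apply to
this environment is a guess:

ALTER SYSTEM SET tcp_keepalives_idle = 60;      -- seconds of idleness before the first probe
ALTER SYSTEM SET tcp_keepalives_interval = 10;  -- seconds between probes
ALTER SYSTEM SET tcp_keepalives_count = 6;      -- unanswered probes before the connection is dropped
SELECT pg_reload_conf();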

Regards,
Aditya.

On Thu, Feb 16, 2023 at 12:07 AM Jeff Janes  wrote:

> On Wed, Feb 15, 2023 at 7:13 AM aditya desai  wrote:
>
>> Hi,
>> We are getting this error when transferring data using COPY command or
>> running workflow for huge data. We are using Password Authentication(LDAP)
>>
>> "Connection forcibly closed remote server"
>>
>
> Are you sure that that is the exact wording? It doesn't sound like grammar
> that would be used for an error message.  Or did you perhaps translate it
> to English from a localized error message?
>
> Is that error reported by the client, or in the server log?  Whichever end
> that is, what does the other end say?
>
> Cheers,
>
> Jeff
>
>


INSERT statement going in IPC Wait_event

2023-03-01 Thread aditya desai
Hi All,
Unfortunately I am unable to share a query plan or the query.

I have a SQL statement that is called from a web service. At the point where
it inserts data into the table, the process goes into a hung state.
pg_stat_activity shows wait_event_type='IPC', wait_event='MessageQueueSend'.
In the web service log we see an "I/O error occurred" message.

Surprisingly, when I run it from psql or pgAdmin it runs fine.

Has anyone come across this issue? Could you please help?

Regards,
Aditya.


Connection drops on postgres 11.16

2023-05-11 Thread aditya desai
Hi,
We are getting intermittent connection errors on Postgres 11.16, in
Informatica as well as in Python jobs that run queries on Postgres.

In the Informatica logs we see the error below.

ODBC PostgreSQL Wire Protocol driver]SSL I/O Error.[DataDirect][ODBC lib]
Connection in use

While in the Postgres logs we see the error below.

An existing connection was forcibly closed by the remote host.



Regards,
Aditya.


AWS RDS PostgreSQL CPU Spiking to 100%

2020-09-08 Thread aditya desai
Hi,
We have an application where one of the APIs, calling the attached queries,
is spiking the CPU to 100% during load testing.
However, the queries are making use of indexes (Bitmap Index and Bitmap Heap
Scan, though). When run separately on the DB, the queries take hardly 200
ms. Is the CPU spiking due to the Bitmap Heap Scan?
These queries are being called thousands of times. The application team says
they have handled connection pooling on the application side, so there is no
connection pooling on the DB side. The current DB instance size is
"db.m4.4xlarge" (64 GB RAM, 16 vCPU).
The application dev team has primary keys and foreign keys on the tables, so
they are unable to partition them due to the limitations of Postgres
partitioning. Also, the columns in the WHERE clauses are not the same in all
queries, which makes it hard to decide on partition keys.

1. Does the DB need more CPU considering this kind of load?
2. Can the query be tuned further? It is already using indexes (Bitmap,
though).
3. Will connection pooling resolve the CPU spike issues?

Also pasting Query and plans below.

--exampleCount 1. Without
internalexamplecode---

lmp_examples=> explain analyze with exampleCount as ( select
examplestatuscode from example j where 1=1 and j.countrycode = 'AD'   and
j.facilitycode in ('ABCD') and j.internalexamplecode in
('005','006','007','005') and ((j.examplestartdatetime  between '2020-05-18
00:00:00' and '2020-08-19 00:00:00' ) or j.examplestartdatetime IS NULL )
group by j.examplestatuscode)
lmp_examples-> select js.examplestatuscode,COALESCE(count(*),0) stat_count
from exampleCount jc right outer join examplestatus js on
jc.examplestatuscode=js.examplestatuscode group by js.examplestatuscode ;


 QUERY PLAN



 HashAggregate  (cost=79353.80..79353.89 rows=9 width=12) (actual
time=88.847..88.850 rows=9 loops=1)
   Group Key: js.examplestatuscode
   CTE examplecount
 ->  HashAggregate  (cost=79352.42..79352.46 rows=4 width=4) (actual
time=88.803..88.805 rows=5 loops=1)
   Group Key: j.examplestatuscode
   ->  Bitmap Heap Scan on example j  (cost=1547.81..79251.08
rows=40538 width=4) (actual time=18.424..69.658 rows=62851 loops=1)
 Recheck Cond: countrycode)::text = 'AD'::text) AND
((facilitycode)::text = 'ABCD'::text) AND ((internalexamplecode)::text =
ANY ('{005,006,007,005}'::text[])) AND (examplestartdatetime >= '2020-05-18
00:00:00'::timestamp without time zone) AND (examplestartdatetime <=
'2020-08-19 00:00:00'::timestamp without time zone)) OR
(examplestartdatetime IS NULL))
 Filter: (((countrycode)::text = 'AD'::text) AND
((facilitycode)::text = 'ABCD'::text) AND ((internalexamplecode)::text =
ANY ('{005,006,007,005}'::text[])))
 Rows Removed by Filter: 3
 Heap Blocks: exact=18307
 ->  BitmapOr  (cost=1547.81..1547.81 rows=40538 width=0)
(actual time=15.707..15.707 rows=0 loops=1)
   ->  Bitmap Index Scan on example_list9_idx
(cost=0.00..1523.10 rows=40538 width=0) (actual time=15.702..15.702
rows=62851 loops=1)
 Index Cond: (((countrycode)::text =
'AD'::text) AND ((facilitycode)::text = 'ABCD'::text) AND
((internalexamplecode)::text = ANY ('{005,006,007,005}'::text[])) AND
(examplestartdatetime >= '2020-05-18 00:00:00'::timestamp without time
zone) AND (examplestartdatetime <= '2020-08-19 00:00:00'::timestamp without
time zone))
   ->  Bitmap Index Scan on example_list10_idx
(cost=0.00..4.44 rows=1 width=0) (actual time=0.004..0.004 rows=3 loops=1)
 Index Cond: (examplestartdatetime IS NULL)
   ->  Hash Left Join  (cost=0.13..1.29 rows=9 width=4) (actual
time=88.831..88.840 rows=9 loops=1)
 Hash Cond: ((js.examplestatuscode)::text =
(jc.examplestatuscode)::text)
 ->  Seq Scan on examplestatus js  (cost=0.00..1.09 rows=9 width=4)
(actual time=0.004..0.007 rows=9 loops=1)
 ->  Hash  (cost=0.08..0.08 rows=4 width=16) (actual
time=88.817..88.817 rows=5 loops=1)
   Buckets: 1024  Batches: 1  Memory Usage: 9kB
   ->  CTE Scan on examplecount jc  (cost=0.00..0.08 rows=4
width=16) (actual time=88.807..88.812 rows=5 loops=1)
 Planning Time: 0.979 ms
 Execution Time: 89.036 ms
(23 rows)


exampleCount 2. With
internalexamplecode-


lmp_examples=> explain analyze with exampleCount as ( select
examplestatuscode,count(1) stat_count from example j where 1=1 and
j.countrycode = 'AD'   and j.facilitycode in ('ABCD') and
j.internalexamplecode in ('005','0

How to encrypt database password in pgpass or unix file to run batch jobs through shell script

2020-09-25 Thread aditya desai
Hi,
We have Amazon RDS Postgres. Currently we are using a .pgpass file and
running psql from different EC2 instances to connect to the DB, but the
password in this file is not encrypted. What are our options to encrypt the
password, or to set up a passwordless connection from EC2 to the database?
Lambda functions have the limitation of running only for 15 minutes.

How can we set up different authentication methods for AWS RDS Postgres,
given that we don't have access to pg_hba.conf?

Regards,
Aditya.


Re: AWS RDS PostgreSQL CPU Spiking to 100%

2020-09-28 Thread aditya desai
>
>
> Hi,
> We have an application where one of the APIs calling queries(attached) is
> spiking the CPU to 100% during load testing.
> However, queries are making use of indexes(Bitmap Index and Bitmap Heap
> scan though). When run separately on DB queries hardly take less than 200
> ms. Is CPU spiking due to Bitmap Heap Scan?
> These queries are being called thousands of times. Application team says
> they have handled connection pooling from the Application side. So there is
> no connection pooling here from DB side. Current db instance size is 
> "db.m4.4xlarge"
> 64 GB RAM 16 vCPU".
> The Application dev team has primary keys and foreign keys on tables so
> they are unable to partition the tables as well due to limitations of
> postgres partitioning. Columns in WHERE clauses are not constant in all
> queries to decide partition keys.
>
> 1. Does DB need more CPU considering this kind of load?
> 2. Can the query be tuned further? It is already using indexes(Bitmap
> though).
> 3. Will connection pooling resolve the CPU Spike issues?
>
> Also pasting Query and plans below.
>
> --exampleCount 1. Without
> internalexamplecode---
>
> lmp_examples=> explain analyze with exampleCount as ( select
> examplestatuscode from example j where 1=1 and j.countrycode = 'AD'   and
> j.facilitycode in ('ABCD') and j.internalexamplecode in
> ('005','006','007','005') and ((j.examplestartdatetime  between '2020-05-18
> 00:00:00' and '2020-08-19 00:00:00' ) or j.examplestartdatetime IS NULL )
> group by j.examplestatuscode)
> lmp_examples-> select js.examplestatuscode,COALESCE(count(*),0) stat_count
> from exampleCount jc right outer join examplestatus js on
> jc.examplestatuscode=js.examplestatuscode group by js.examplestatuscode ;
>
>
>QUERY PLAN
>
>
>
> 
>  HashAggregate  (cost=79353.80..79353.89 rows=9 width=12) (actual
> time=88.847..88.850 rows=9 loops=1)
>Group Key: js.examplestatuscode
>CTE examplecount
>  ->  HashAggregate  (cost=79352.42..79352.46 rows=4 width=4) (actual
> time=88.803..88.805 rows=5 loops=1)
>Group Key: j.examplestatuscode
>->  Bitmap Heap Scan on example j  (cost=1547.81..79251.08
> rows=40538 width=4) (actual time=18.424..69.658 rows=62851 loops=1)
>  Recheck Cond: countrycode)::text = 'AD'::text) AND
> ((facilitycode)::text = 'ABCD'::text) AND ((internalexamplecode)::text =
> ANY ('{005,006,007,005}'::text[])) AND (examplestartdatetime >= '2020-05-18
> 00:00:00'::timestamp without time zone) AND (examplestartdatetime <=
> '2020-08-19 00:00:00'::timestamp without time zone)) OR
> (examplestartdatetime IS NULL))
>  Filter: (((countrycode)::text = 'AD'::text) AND
> ((facilitycode)::text = 'ABCD'::text) AND ((internalexamplecode)::text =
> ANY ('{005,006,007,005}'::text[])))
>  Rows Removed by Filter: 3
>  Heap Blocks: exact=18307
>  ->  BitmapOr  (cost=1547.81..1547.81 rows=40538 width=0)
> (actual time=15.707..15.707 rows=0 loops=1)
>->  Bitmap Index Scan on example_list9_idx
> (cost=0.00..1523.10 rows=40538 width=0) (actual time=15.702..15.702
> rows=62851 loops=1)
>  Index Cond: (((countrycode)::text =
> 'AD'::text) AND ((facilitycode)::text = 'ABCD'::text) AND
> ((internalexamplecode)::text = ANY ('{005,006,007,005}'::text[])) AND
> (examplestartdatetime >= '2020-05-18 00:00:00'::timestamp without time
> zone) AND (examplestartdatetime <= '2020-08-19 00:00:00'::timestamp without
> time zone))
>->  Bitmap Index Scan on example_list10_idx
> (cost=0.00..4.44 rows=1 width=0) (actual time=0.004..0.004 rows=3 loops=1)
>  Index Cond: (examplestartdatetime IS NULL)
>->  Hash Left Join  (cost=0.13..1.29 rows=9 width=4) (actual
> time=88.831..88.840 rows=9 loops=1)
>  Hash Cond: ((js.examplestatuscode)::text =
> (jc.examplestatuscode)::text)
>  ->  Seq Scan on examplestatus js  (cost=0.00..1.09 rows=9
> width=4) (actual time=0.004..0.007 rows=9 loops=1)
>  ->  Hash  (cost=0.08..0.08 rows=4 width=16) (actual
> time=88.817..88.817 rows=5 loops=1)
>Buckets: 1024  Batches: 1  Memory Usage: 9kB
>->  CTE Scan on examplecount jc  (cost=0.00..0.08 rows=4
> width=16) (actual time=88.807..88.812 rows=5 loops=1)
>  Planning Time: 0.979 ms
>  Execution Time: 89.036 ms
> (23 rows)
>
>
> exampleCount 2. With
> internalexamplecode-
>
>
> lmp_examples=> explain analyze 

Re: AWS RDS PostgreSQL CPU Spiking to 100%

2020-09-30 Thread aditya desai
Thanks, I'll check it out.

On Mon, Sep 28, 2020 at 9:40 PM Prince Pathria 
wrote:

> We faced a similar issue, adding RDS proxy in front of RDS Postgres can
> help.
> In our situation, there were a lot of connects/disconnects from Lambda
> functions although concurrency of Lambda was 100 only.
> And adding connection pooler(RDS proxy) helped us to reduce the CPU load
> from 100% to 30%
>
> Happy to help :)
> Prince Pathria Systems Engineer | Certified Kubernetes Administrator |
> AWS Certified Solutions Architect Evive +91 9478670472 goevive.com
>
>
> On Mon, Sep 28, 2020 at 9:21 PM aditya desai  wrote:
>
>>
>>> Hi,
>>> We have an application where one of the APIs calling queries(attached)
>>> is spiking the CPU to 100% during load testing.
>>> However, queries are making use of indexes(Bitmap Index and Bitmap Heap
>>> scan though). When run separately on DB queries hardly take less than 200
>>> ms. Is CPU spiking due to Bitmap Heap Scan?
>>> These queries are being called thousands of times. Application team says
>>> they have handled connection pooling from the Application side. So there is
>>> no connection pooling here from DB side. Current db instance size is 
>>> "db.m4.4xlarge"
>>> 64 GB RAM 16 vCPU".
>>> The Application dev team has primary keys and foreign keys on tables so
>>> they are unable to partition the tables as well due to limitations of
>>> postgres partitioning. Columns in WHERE clauses are not constant in all
>>> queries to decide partition keys.
>>>
>>> 1. Does DB need more CPU considering this kind of load?
>>> 2. Can the query be tuned further? It is already using indexes(Bitmap
>>> though).
>>> 3. Will connection pooling resolve the CPU Spike issues?
>>>
>>> Also pasting Query and plans below.
>>>
>>> --exampleCount 1. Without
>>> internalexamplecode---
>>>
>>> lmp_examples=> explain analyze with exampleCount as ( select
>>> examplestatuscode from example j where 1=1 and j.countrycode = 'AD'   and
>>> j.facilitycode in ('ABCD') and j.internalexamplecode in
>>> ('005','006','007','005') and ((j.examplestartdatetime  between '2020-05-18
>>> 00:00:00' and '2020-08-19 00:00:00' ) or j.examplestartdatetime IS NULL )
>>> group by j.examplestatuscode)
>>> lmp_examples-> select js.examplestatuscode,COALESCE(count(*),0)
>>> stat_count from exampleCount jc right outer join examplestatus js on
>>> jc.examplestatuscode=js.examplestatuscode group by js.examplestatuscode ;
>>>
>>>
>>>  QUERY PLAN
>>>
>>>
>>>
>>> 
>>>  HashAggregate  (cost=79353.80..79353.89 rows=9 width=12) (actual
>>> time=88.847..88.850 rows=9 loops=1)
>>>Group Key: js.examplestatuscode
>>>CTE examplecount
>>>  ->  HashAggregate  (cost=79352.42..79352.46 rows=4 width=4) (actual
>>> time=88.803..88.805 rows=5 loops=1)
>>>Group Key: j.examplestatuscode
>>>->  Bitmap Heap Scan on example j  (cost=1547.81..79251.08
>>> rows=40538 width=4) (actual time=18.424..69.658 rows=62851 loops=1)
>>>  Recheck Cond: countrycode)::text = 'AD'::text) AND
>>> ((facilitycode)::text = 'ABCD'::text) AND ((internalexamplecode)::text =
>>> ANY ('{005,006,007,005}'::text[])) AND (examplestartdatetime >= '2020-05-18
>>> 00:00:00'::timestamp without time zone) AND (examplestartdatetime <=
>>> '2020-08-19 00:00:00'::timestamp without time zone)) OR
>>> (examplestartdatetime IS NULL))
>>>  Filter: (((countrycode)::text = 'AD'::text) AND
>>> ((facilitycode)::text = 'ABCD'::text) AND ((internalexamplecode)::text =
>>> ANY ('{005,006,007,005}'::text[])))
>>>  Rows Removed by Filter: 3
>>>  Heap Blocks: exact=18307
>>>  ->  BitmapOr  (cost=1547.81..1547.81 rows=40538
>>> width=0) (actual time=15.707..15.

SSL connection getting rejected on AWS RDS

2020-09-30 Thread aditya desai
Hi,
We have AWS RDS and we are trying to connect to the DB remotely from an EC2
instance, as a client connection using psql. We are trying to set up IAM
roles. We did all the necessary settings but got the error below. Could you
please advise?

Password for user lmp_cloud_dev:

psql: FATAL:  PAM authentication failed for user "testuser"

FATAL:  pg_hba.conf rejects connection for host "192.168.1.xxx", user
"testuser", database "testdb", SSL off


Regards,

Aditya.


Re: SSL connection getting rejected on AWS RDS

2020-09-30 Thread aditya desai
Hi Hannah,
Thank you very much!! This is really helpful. Do we need to pass
'sslrootcert' as mentioned in the doc below? I see that you have not used it
in your command.

https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.IAMDBAuth.Connecting.AWSCLI.PostgreSQL.html

Also do we have to grant the role below to the user?

grant rds_iam to app_user;


If you have any document/steps to set this up from scratch, could you please
forward them? That would be really helpful.
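In case it helps in the meantime, a sketch of the connection string with
sslrootcert added to the example below (the certificate path is the CA
bundle described in the linked AWS doc and is an assumption here, as is the
GRANT, which that doc lists as a prerequisite for IAM authentication):

GRANT rds_iam TO testuser;

psql "host=$RDSHOST port=5432 dbname=testdb user=testuser \
      sslmode=verify-full sslrootcert=/home/ec2-user/rds-combined-ca-bundle.pem"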

Regards,
Aditya.


On Wed, Sep 30, 2020 at 4:47 PM Hannah Huang 
wrote:

>
>
> On 30 Sep 2020, at 5:19 pm, aditya desai  wrote:
>
> Hi,
> We have AWS RDS and we are trying to connect to DB remotely from EC2
> instance.as client connection using psql. We are trying to set up IAM
> roles. We did all the necessary settings but got below error. Could you
> please advise?
>
> Password for user lmp_cloud_dev:
>
> psql: FATAL:  PAM authentication failed for user "testuser"
>
> FATAL:  pg_hba.conf rejects connection for host "192.168.1.xxx", user
> "testuser", database "testdb", SSL off
>
>
> Regards,
>
> Aditya.
>
>
> Hi Aditya,
>
> See the below example of me connecting to RDS from an EC2 instance:
>
> You need to change the $RDSHOST value
> you need to replace my “app_user” to your “testuser” and database
> “postgres” to your “testdb”
>
> [ec2-user@ip-172-31-13-121 ~]$ export RDSHOST="mypg.cfvvs1nh3f7i.ap-
> southeast-2.rds.amazonaws.com"
>
> [ec2-user@ip-172-31-13-121 ~]$ export PGPASSWORD="$(aws rds
> generate-db-auth-token \
> --hostname $RDSHOST \
> --port 5432 \
> --username app_user)”
>
> [ec2-user@ip-172-31-13-121 ~]$ psql "host=$RDSHOST port=5432
> sslmode=require dbname=postgres user= app_user"
>
> psql (11.5, server 12.3)
> WARNING: psql major version 11, server major version 12.
> Some psql features might not work.
> SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384,
> bits: 256, compression: off)
> Type "help" for help.
> postgres=>
>
> Thanks,
> Hannah
>


CPU Consuming query. Sequential scan despite indexing.

2020-10-15 Thread aditya desai
Hi,
The query below always shows up on top in the CPU matrix. Also, despite
having indexes, it does sequential scans (probably because the WHERE
condition matches almost all of the data in the table). This query runs on
the default landing page of the application and needs to fetch records in
less than 100 ms without consuming too much CPU.

Any opinions? The table is very large, and due to referential integrity and
business requirements we could not implement partitioning either.

There is an index on (countrycode, facilitycode, jobstartdatetime).

explain (analyze,buffers) with JobCount as ( select jobstatuscode,count(1)
stat_count from job j where 1=1 and j.countrycode = 'TH'   and
j.facilitycode in
('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
and ((j.jobstartdatetime  between '2020-08-01 00:00:00' and '2020-09-30
00:00:00' ) or j.jobstartdatetime IS NULL )  group by j.jobstatuscode)
 select js.jobstatuscode,COALESCE(stat_count,0) stat_count from JobCount jc
right outer join jobstatus js on jc.jobstatuscode=js.jobstatuscode;

  QUERY PLAN



---
 Hash Right Join  (cost=98845.93..98846.10 rows=10 width=12) (actual
time=1314.809..1314.849 rows=10 loops=1)
   Hash Cond: ((jc.jobstatuscode)::text = (js.jobstatuscode)::text)
   Buffers: shared hit=21314 read=3231
   I/O Timings: read=19.867
   CTE jobcount
 ->  Finalize GroupAggregate  (cost=98842.93..98844.71 rows=7 width=12)
(actual time=1314.780..1314.802 rows=6 loops=1)
   Group Key: j.jobstatuscode
   Buffers: shared hit=21313 read=3231
   I/O Timings: read=19.867
   ->  Gather Merge  (cost=98842.93..98844.57 rows=14 width=12)
(actual time=1314.766..1314.857 rows=18 loops=1)
 Workers Planned: 2
 Workers Launched: 2
 Buffers: shared hit=60102 read=11834
 I/O Timings: read=59.194
 ->  Sort  (cost=97842.91..97842.93 rows=7 width=12)
(actual time=1305.044..1305.047 rows=6 loops=3)
   Sort Key: j.jobstatuscode
   Sort Method: quicksort  Memory: 25kB
   Worker 0:  Sort Method: quicksort  Memory: 25kB
   Worker 1:  Sort Method: quicksort  Memory: 25kB
   Buffers: shared hit=60102 read=11834
   I/O Timings: read=59.194
   ->  Partial HashAggregate  (cost=97842.74..97842.81
rows=7 width=12) (actual time=1305.010..1305.013 rows=6 loops=3)
 Group Key: j.jobstatuscode
 Buffers: shared hit=60086 read=11834
 I/O Timings: read=59.194
 ->  Parallel Seq Scan on job j
(cost=0.00..96837.93 rows=200963 width=4) (actual time=13.010..1144.434
rows=163200 loops=3)
   Filter: (((countrycode)::text =
'TH'::text) AND (((jobstartdatetime >= '2020-08-01 00:00:00'::timestamp
without time zone) AND (jobst
artdatetime <= '2020-09-30 00:00:00'::timestamp without time zone)) OR
(jobstartdatetime IS NULL)) AND ((facilitycode)::text = ANY
('{THNPM1,THPRK1,THCNT1,THSPN1,THKRI1
,THPKN1,THSBI1,THUTG1,THLRI1,THSRI1,THSUR1,THSKM1}'::text[])))
   Rows Removed by Filter: 449035
   Buffers: shared hit=60086 read=11834
   I/O Timings: read=59.194
   ->  CTE Scan on jobcount jc  (cost=0.00..0.14 rows=7 width=24) (actual
time=1314.784..1314.811 rows=6 loops=1)
 Buffers: shared hit=21313 read=3231
 I/O Timings: read=19.867
   ->  Hash  (cost=1.10..1.10 rows=10 width=4) (actual time=0.014..0.015
rows=10 loops=1)
 Buckets: 1024  Batches: 1  Memory Usage: 9kB
 Buffers: shared hit=1
 ->  Seq Scan on jobstatus js  (cost=0.00..1.10 rows=10 width=4)
(actual time=0.005..0.008 rows=10 loops=1)
   Buffers: shared hit=1
 Planning Time: 0.949 ms
 Execution Time: 1314.993 ms
(40 rows)

Regards,
Aditya.


Re: CPU Consuming query. Sequential scan despite indexing.

2020-10-20 Thread aditya desai
Hi Michael,
Will follow standard practice going forward. We are in the process of
rebuilding the PST environment, equivalent to Prod, where these load tests
were done. I will implement all these suggestions in that environment and
reply back. Sincere apologies for the delay.

Regards,
Aditya.

On Mon, Oct 19, 2020 at 9:50 PM Michael Lewis  wrote:

> Reply to the group, not just me please. Btw, when you do reply to the
> group, it is best practice on these lists to reply in-line and not just
> reply on top with all prior messages quoted.
>
> On Sun, Oct 18, 2020 at 3:23 AM aditya desai  wrote:
>
>> I tried vacuum full and execution time came down to half.
>>
> Great to hear.
>
>
>> However, it still consumes CPU. Setting parallel workers per gather to 0
>> did not help much.
>>
> You didn't answer all of my questions, particularly about disabling
> sequential scan. If you still have the default random_page_cost of 4, it
> might be that 1.5 allows better estimates for cost on index (random) vs
> sequential scan of a table.
>
> Laurenz is a brilliant guy. I would implement the indexes he suggests if
> you don't have them already and report back. If the indexes don't get used,
> try set enable_seqscan = false; before the query and if it is way faster,
> then reduce random_page_cost to maybe 1-2 depending how your overall cache
> hit ratio is across the system.
>
>
>> Auto vacuuming is catching up just fine. No issues in that area.
>>
> If the time came down by half after 'vacuum full', I would question that
> statement.
>
>
>> Temp table size is less that original tables without indexes.
>>
> Significantly less would indicate the regular table still being bloated I
> think. Maybe someone else will suggest otherwise.
>
>
>> Does this mean we need to upgrade the hardware? Also by caching data , do
>> you mean caching at application side(microservices side) ? Or on postgres
>> side? I tried pg_prewarm, it did not help much.
>>
> I can't say about hardware. Until you have exhausted options like configs
> and indexing, spending more money forever onwards seems premature. I meant
> pre-aggregated data, wherever it makes sense to do that. I wouldn't expect
> pg_prewarm to do a ton since you already show high cache hits.
>
>
>> It is actually the CPU consumption which is the issue. Query is fast
>> otherwise.
>>
> Sure, but that is a symptom of reading and processing a lot of data.
>
>>


Re: CPU Consuming query. Sequential scan despite indexing.

2020-10-20 Thread aditya desai
Hi Laurenz,
I created

On Fri, Oct 16, 2020 at 2:06 PM Laurenz Albe 
wrote:

> On Thu, 2020-10-15 at 20:34 +0530, aditya desai wrote:
> > Below query always shows up on top in the CPU matrix. Also despite
> having indexes it does sequential scans
> > (probably because WHERE condition satisfies almost all of the data from
> table). This query
> > runs on the default landing page in application and needs to fetch
> records in less that 100 ms
> >  without consuming too much CPU.
> >
> >  Any opinions? Table is very huge and due to referential identity and
> business requirements we could not
> >  implement partitioning as well.
> >
> > There is index on (countrycode,facilitycode,jobstartdatetime)
> >
> > explain (analyze,buffers) with JobCount as ( select
> jobstatuscode,count(1) stat_count from job j
> >  where 1=1 and j.countrycode = 'TH'
> > and j.facilitycode in
> ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
> >  and ((j.jobstartdatetime  between '2020-08-01 00:00:00' and '2020-09-30
> 00:00:00' ) or j.jobstartdatetime IS NULL )  group by j.jobstatuscode)
> >  select js.jobstatuscode,COALESCE(stat_count,0) stat_count from JobCount
> jc right outer join jobstatus js on jc.jobstatuscode=js.jobstatuscode;
> >
> >   QUERY PLAN
> >
> >  Hash Right Join  (cost=98845.93..98846.10 rows=10 width=12) (actual
> time=1314.809..1314.849 rows=10 loops=1)
> >  ->  Parallel Seq Scan on job j
> (cost=0.00..96837.93 rows=200963 width=4) (actual time=13.010..1144.434
> rows=163200 loops=3)
> >Filter: (((countrycode)::text =
> 'TH'::text) AND (((jobstartdatetime >= '2020-08-01 00:00:00'::timestamp
> without time zone) AND (jobst
> > artdatetime <= '2020-09-30 00:00:00'::timestamp without time zone)) OR
> (jobstartdatetime IS NULL)) AND ((facilitycode)::text = ANY
> ('{THNPM1,THPRK1,THCNT1,THSPN1,THKRI1
> > ,THPKN1,THSBI1,THUTG1,THLRI1,THSRI1,THSUR1,THSKM1}'::text[])))
> >Rows Removed by Filter: 449035
> >Buffers: shared hit=60086 read=11834
> >I/O Timings: read=59.194
> >
>
> You should rewrite the subquery as a UNION to avoid the OR:
>
>   ... WHERE j.countrycode = 'TH'
> and j.facilitycode in
> ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
> and j.jobstartdatetime between '2020-08-01 00:00:00' and
> '2020-09-30 00:00:00'
>
> and
>
>   ... WHERE j.countrycode = 'TH'
> and j.facilitycode in
> ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
> and j.jobstartdatetime IS NULL
>
> These indexes could speed up the resulting query:
>
>   CREATE INDEX ON job (countrycode, facilitycode);
>   CREATE INDEX ON job (countrycode, jobstartdatetime);
>   CREATE INDEX ON job (countrycode, facilitycode) WHERE jobstartdaytime IS
> NULL;
>

I created the indexes you suggested and changed the query to use the UNION
operator. Please see the explain plan below. Performance of the query
(execution time) has improved, mostly because I ran VACUUM FULL. The cost of
the query is still high. This is a Dev environment with 2 vCPU and 8 GB RAM.

explain (analyze,buffers) with JobCount as ( (select jobstatuscode,count(1)
stat_count from job j where 1=1 and j.countrycode = 'TH'   and
j.facilitycode in
('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
and j.jobstartdatetime  between '2020-08-01 00:00:00' and '2020-09-30
00:00:00'group by j.jobstatuscode) UNION (select jobstatuscode,count(1)
stat_count from job j where 1=1 and j.countrycode = 'TH'   and
j.facilitycode in
('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1'

Re: CPU Consuming query. Sequential scan despite indexing.

2020-10-20 Thread aditya desai
On Mon, Oct 19, 2020 at 9:50 PM Michael Lewis  wrote:

> Reply to the group, not just me please. Btw, when you do reply to the
> group, it is best practice on these lists to reply in-line and not just
> reply on top with all prior messages quoted.
>

Hi Michael,
Please see my inline responses below. I tried all this on the Dev env (2
vCPU and 8 GB RAM). Still waiting for the PST environment :( with a better
configuration.

>
> On Sun, Oct 18, 2020 at 3:23 AM aditya desai  wrote:
>
>> I tried vacuum full and execution time came down to half.
>>
> Great to hear.
>
>
>> However, it still consumes CPU. Setting parallel workers per gather to 0
>> did not help much.
>>
> You didn't answer all of my questions, particularly about disabling
> sequential scan. If you still have the default random_page_cost of 4, it
> might be that 1.5 allows better estimates for cost on index (random) vs
> sequential scan of a table.
>

Please see the next inline answer.

>
> Laurenz is a brilliant guy. I would implement the indexes he suggests if
> you don't have them already and report back. If the indexes don't get used,
> try set enable_seqscan = false; before the query and if it is way faster,
> then reduce random_page_cost to maybe 1-2 depending how your overall cache
> hit ratio is across the system.
>

Query plan with enable_seqscan=off and random_page_cost=1: with this, the
execution time and cost of the query have come down to roughly half of what
they were with the original settings. I also used the suggestions given by
Laurenz, i.e. made use of the UNION operator and created the indexes.
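(For clarity, the two session settings mentioned above amount to the
following commands; both are standard GUCs and were applied only for this
test session, as suggested, not as permanent configuration:)

SET enable_seqscan = off;
SET random_page_cost = 1;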

lmp_delivery_jobs=> explain (analyze,buffers) with JobCount as ( (select
jobstatuscode,count(1) stat_count from job j where 1=1 and j.countrycode =
'TH'   and j.facilitycode in
('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
and j.jobstartdatetime  between '2020-08-01 00:00:00' and '2020-09-30
00:00:00'group by j.jobstatuscode) UNION (select jobstatuscode,count(1)
stat_count from job j where 1=1 and j.countrycode = 'TH'   and
j.facilitycode in
('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
and j.jobstartdatetime is null  group by j.jobstatuscode))
 select js.jobstatuscode,COALESCE(stat_count,0) stat_count from JobCount jc
right outer join jobstatus js on jc.jobstatuscode=js.jobstatuscode;

QUERY PLAN


---
 Hash Right Join  (cost=68652.52..68652.76 rows=10 width=12) (actual
time=676.477..676.495 rows=10 loops=1)
   Hash Cond: ((jc.jobstatuscode)::text = (js.jobstatuscode)::text)
   Buffers: shared hit=11897
   CTE jobcount
 ->  HashAggregate  (cost=68650.01..68650.11 rows=10 width=24) (actual
time=676.451..676.454 rows=8 loops=1)
   Group Key: j.jobstatuscode, (count(1))
   Buffers: shared hit=11895
   ->  Append  (cost=68645.89..68649.96 rows=10 width=24) (actual
time=676.346..676.441 rows=8 loops=1)
 Buffers: shared hit=11895
 ->  Finalize GroupAggregate  (cost=68645.89..68648.17
rows=9 width=12) (actual time=676.345..676.379 rows=8 loops=1)
   Group Key: j.jobstatuscode
   Buffers: shared hit=11889
   ->  Gather Merge  (cost=68645.89..68647.99 rows=18
width=12) (actual time=676.330..676.403 rows=24 loops=1)
 Workers Planned: 2
 Workers Launched: 2
 Buffers: shared hit=29067 read=1
 I/O Timings: read=0.038
 ->  Sort  (cost=67645.87..67645.89 rows=9
width=12) (actual time=669.544..669.548 rows=8 loops=3)
   Sort Key: j.jobstatuscode
   Sort Method: quicksort  Memory: 25kB
   Worker 0:  Sort Method: quicksort
Memory: 25kB
   Worker 1:  Sort Method: quicksort
Memory: 25kB
   Buffers: shared hit=29067 read=1
   I/O Timings: read=0.038
   ->  Partial HashAggregate
(cost=67645.63..67645.72 rows=9 width=12) (actual time=669.506..669.511
rows=8 loops=3)
 Group Key: j.jobstatuscode
 

Re: CPU Consuming query. Sequential scan despite indexing.

2020-10-21 Thread aditya desai
Hi,
Kindly requesting an update on this. Thanks.

-Aditya.

On Tue, Oct 20, 2020 at 6:26 PM aditya desai  wrote:

>
>
> On Mon, Oct 19, 2020 at 9:50 PM Michael Lewis  wrote:
>
>> Reply to the group, not just me please. Btw, when you do reply to the
>> group, it is best practice on these lists to reply in-line and not just
>> reply on top with all prior messages quoted.
>>
>
> Hi  Michael,
> Please see below inline response. I tried all this on Dev env 2 vCPU and 8
> GB RAM. Still waiting for the PST environment :( with better configuration.
>
>>
>> On Sun, Oct 18, 2020 at 3:23 AM aditya desai  wrote:
>>
>>> I tried vacuum full and execution time came down to half.
>>>
>> Great to hear.
>>
>>
>>> However, it still consumes CPU. Setting parallel workers per gather to 0
>>> did not help much.
>>>
>> You didn't answer all of my questions, particularly about disabling
>> sequential scan. If you still have the default random_page_cost of 4, it
>> might be that 1.5 allows better estimates for cost on index (random) vs
>> sequential scan of a table.
>>
>
> Please see the next inline answer.
>
>>
>> Laurenz is a brilliant guy. I would implement the indexes he suggests if
>> you don't have them already and report back. If the indexes don't get used,
>> try set enable_seqscan = false; before the query and if it is way
>> faster, then reduce random_page_cost to maybe 1-2 depending how your
>> overall cache hit ratio is across the system.
>>
>
> Query plan with enable_seqscan=off , Random page cost=1. With this
> execution time and cost of query is almost less than half compared to
> original settings. Also used the suggestions given by Laurenze. 1. Made use
> of UINON operator and created indexes.
>
> lmp_delivery_jobs=> explain (analyze,buffers) with JobCount as ( (select
> jobstatuscode,count(1) stat_count from job j where 1=1 and j.countrycode =
> 'TH'   and j.facilitycode in
> ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
> and j.jobstartdatetime  between '2020-08-01 00:00:00' and '2020-09-30
> 00:00:00'group by j.jobstatuscode) UNION (select jobstatuscode,count(1)
> stat_count from job j where 1=1 and j.countrycode = 'TH'   and
> j.facilitycode in
> ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
> and j.jobstartdatetime is null  group by j.jobstatuscode))
>  select js.jobstatuscode,COALESCE(stat_count,0) stat_count from JobCount
> jc right outer join jobstatus js on jc.jobstatuscode=js.jobstatuscode;
>
>   QUERY PLAN
>
>
> 
> ---
>  Hash Right Join  (cost=68652.52..68652.76 rows=10 width=12) (actual
> time=676.477..676.495 rows=10 loops=1)
>Hash Cond: ((jc.jobstatuscode)::text = (js.jobstatuscode)::text)
>Buffers: shared hit=11897
>CTE jobcount
>  ->  HashAggregate  (cost=68650.01..68650.11 rows=10 width=24) (actual
> time=676.451..676.454 rows=8 loops=1)
>Group Key: j.jobstatuscode, (count(1))
>Buffers: shared hit=11895
>->  Append  (cost=68645.89..68649.96 rows=10 width=24) (actual
> time=676.346..676.441 rows=8 loops=1)
>  Buffers: shared hit=11895
>  ->  Finalize GroupAggregate  (cost=68645.89..68648.17
> rows=9 width=12) (actual time=676.345..676.379 rows=8 loops=1)
>Group Key: j.jobstatuscode
>Buffers: shared hit=11889
>->  Gather Merge  (cost=68645.89..68647.99 rows=18
> width=12) (actual time=676.330..676.403 rows=24 loops=1)
>  Workers Planned: 2
>  Workers Launched: 2
>  Buffers: shared hit=29067 read=1
>  I/O Timings: read=0.038
>  ->  Sort  (cost=67645.87..67645.89 rows=9
> width=12) (actual time=669.544..669.548 rows=8 loops=3)
>Sort Key: j.jobstatuscode
>Sort Method: quicksort  Memory:

Re: CPU Consuming query. Sequential scan despite indexing.

2020-10-21 Thread aditya desai
Hi,
Kindly requesting help on this. Thanks.

-Aditya.



On Tue, Oct 20, 2020 at 6:00 PM aditya desai  wrote:

> Hi Laurenz,
> I created
>
> On Fri, Oct 16, 2020 at 2:06 PM Laurenz Albe 
> wrote:
>
>> On Thu, 2020-10-15 at 20:34 +0530, aditya desai wrote:
>> > Below query always shows up on top in the CPU matrix. Also despite
>> having indexes it does sequential scans
>> > (probably because WHERE condition satisfies almost all of the data from
>> table). This query
>> > runs on the default landing page in application and needs to fetch
>> records in less that 100 ms
>> >  without consuming too much CPU.
>> >
>> >  Any opinions? Table is very huge and due to referential identity and
>> business requirements we could not
>> >  implement partitioning as well.
>> >
>> > There is index on (countrycode,facilitycode,jobstartdatetime)
>> >
>> > explain (analyze,buffers) with JobCount as ( select
>> jobstatuscode,count(1) stat_count from job j
>> >  where 1=1 and j.countrycode = 'TH'
>> > and j.facilitycode in
>> ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
>> >  and ((j.jobstartdatetime  between '2020-08-01 00:00:00' and
>> '2020-09-30 00:00:00' ) or j.jobstartdatetime IS NULL )  group by
>> j.jobstatuscode)
>> >  select js.jobstatuscode,COALESCE(stat_count,0) stat_count from
>> JobCount jc right outer join jobstatus js on
>> jc.jobstatuscode=js.jobstatuscode;
>> >
>> >   QUERY PLAN
>> >
>> >  Hash Right Join  (cost=98845.93..98846.10 rows=10 width=12) (actual
>> time=1314.809..1314.849 rows=10 loops=1)
>> >  ->  Parallel Seq Scan on job j
>> (cost=0.00..96837.93 rows=200963 width=4) (actual time=13.010..1144.434
>> rows=163200 loops=3)
>> >Filter: (((countrycode)::text =
>> 'TH'::text) AND (((jobstartdatetime >= '2020-08-01 00:00:00'::timestamp
>> without time zone) AND (jobst
>> > artdatetime <= '2020-09-30 00:00:00'::timestamp without time zone)) OR
>> (jobstartdatetime IS NULL)) AND ((facilitycode)::text = ANY
>> ('{THNPM1,THPRK1,THCNT1,THSPN1,THKRI1
>> > ,THPKN1,THSBI1,THUTG1,THLRI1,THSRI1,THSUR1,THSKM1}'::text[])))
>> >Rows Removed by Filter: 449035
>> >Buffers: shared hit=60086 read=11834
>> >I/O Timings: read=59.194
>> >
>>
>> You should rewrite the subquery as a UNION to avoid the OR:
>>
>>   ... WHERE j.countrycode = 'TH'
>> and j.facilitycode in
>> ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
>> and j.jobstartdatetime between '2020-08-01 00:00:00' and
>> '2020-09-30 00:00:00'
>>
>> and
>>
>>   ... WHERE j.countrycode = 'TH'
>> and j.facilitycode in
>> ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
>> and j.jobstartdatetime IS NULL
>>
>> These indexes could speed up the resulting query:
>>
>>   CREATE INDEX ON job (countrycode, facilitycode);
>>   CREATE INDEX ON job (countrycode, jobstartdatetime);
>>   CREATE INDEX ON job (countrycode, facilitycode) WHERE jobstartdaytime
>> IS NULL;
>>
>
> I created the indexes you suggested and changed the query with the UNION
> operator. Please see explain plan below. Performance of the query(execution
> time has improved mostly because I ran vacuum full). Cost of the query is
> still high.This is Dev envrionment and has 2 vCPU and 8 GB RAM.
>
> explain (analyze,buffers) with JobCount as ( (select
> jobstatuscode,count(1) stat_count from job j where 1=1 and j.countrycode =
> 'TH'   and j.facilitycode in
> ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
> and j.j

Re: CPU Consuming query. Sequential scan despite indexing.

2020-10-21 Thread aditya desai
Hi David,
Thanks for the suggestion. Let me try to implement this as well. Will get
back to you soon.
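A minimal sketch of the pre-aggregation idea, for reference (the column
names follow the earlier mails; the daily_job_counts table and the nightly
refresh are assumptions made here, and the filters are simplified):

CREATE TABLE daily_job_counts (
    stat_date     date,
    countrycode   text,
    facilitycode  text,
    jobstatuscode text,
    stat_count    bigint,
    PRIMARY KEY (stat_date, countrycode, facilitycode, jobstatuscode)
);

-- run once a day, for the previous day's detail
INSERT INTO daily_job_counts
SELECT jobstartdatetime::date, countrycode, facilitycode, jobstatuscode, count(*)
FROM job
WHERE jobstartdatetime >= date_trunc('day', now() - interval '1 day')
  AND jobstartdatetime <  date_trunc('day', now())
GROUP BY 1, 2, 3, 4;

-- landing page: saved aggregates for previous days plus today's detail
SELECT jobstatuscode, sum(stat_count) AS stat_count
FROM (
    SELECT jobstatuscode, stat_count
    FROM daily_job_counts
    WHERE countrycode = 'TH' AND facilitycode = 'THNPM1'
      AND stat_date >= current_date - 60
    UNION ALL
    SELECT jobstatuscode, count(*)
    FROM job
    WHERE countrycode = 'TH' AND facilitycode = 'THNPM1'
      AND jobstartdatetime >= date_trunc('day', now())
    GROUP BY 1
) s
GROUP BY jobstatuscode;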

Regards,
Aditya.

On Thu, Oct 22, 2020 at 11:03 AM David G. Johnston <
[email protected]> wrote:

> On Wed, Oct 21, 2020 at 10:22 PM aditya desai  wrote:
>
>> As per application team, it is business requirement to show last 60 days
>>> worth data.
>>>
>>
> I didn't look deeply but it sounds like you are looking backwards into 60
> days worth of detail every single time you perform the query and computing
> an aggregate directly from the detail.  Stop doing that.  By way of
> example, at the end of every day compute the aggregates on the relevant
> dimensions and save them.  Then query the saved aggregates from previous
> days and add them to the computed aggregate from the current day's detail.
>
> David J.
>
>


Re: Pg_locks and pg_stat_activity

2020-12-04 Thread aditya desai
Hi Kyotaro,
Many thanks for the response. Will try and debug further. I have responded
to Justin in another chain. Could you please check and advise if anything
comes to mind?
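As a generic starting point for the locking question (standard catalog views
and functions only; whether lock waits are actually the cause here is the
open question):

SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       wait_event_type,
       wait_event,
       state,
       query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;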

Regards,
Aditya.

On Fri, Dec 4, 2020 at 1:23 PM Kyotaro Horiguchi 
wrote:

> At Fri, 4 Dec 2020 12:11:59 +0530, Ravikumar Reddy <
> [email protected]> wrote in
> > Hi,
> > pg_stat_activity -- Provides the active and idle connections for our
> > database
> > Pg_locks   --  Provides the queries/procedure/function details if
> > any object is currently locked in our database.
>
> Yeah..
>
> That result is quite hard to see, but..
>
> > On Fri, Dec 4, 2020 at 11:43 AM aditya desai  wrote:
> >
> > > Hi Postgres Experts,
> > > Requesting for advice on below. I am new to postgres :(
> > >
> > > Regards,
> > > Aditya.
> > >
> > > On Tue, Dec 1, 2020 at 9:41 PM aditya desai 
> wrote:
> > >
> > >> Hi,
> > >> One of the APIs is calling the three queries below. The output (jobids) of
> > >> the 1st query gets fed to the 2nd query in the API.
> > >>
> > >> Query 1:
> > >>
> > >> select j.id from job j where $19=$20 and j.internaljobcode in
> > >> ($21,$22,$23,$24,$25,$26,$27,$28,$29,$30,$31)  and j.countrycode =
> $1  and
> > >> j.facilitycode in ($2) and j.jobstatuscode in ($3, $4, $5, $6) and
> > >> j.internaljobcode in ($7, $8, $9, $10, $11, $12, $13, $14, $15)  and
> > >> ((j.jobstartdatetime between $16 and $17) or j.jobstartdatetime IS
> NULL)
> > >> ORDER BY createddate DESC limit $18"
> > >>
> > >> Query 2
> > >>
> > >> with JobData AS ( select * from job where id in ($1, $2, $3, $4, $5,
> $6,
> > >> $7, $8, $9, $10) )
> > >> select j.id
> > >>
> ,j.jobcategory,j.internaljobcode,j.jobname,j.jobstatuscode,j.jobreferencenumber,
> > >>
> > >>
> vws.vascodes,j.createddate,j.facilitycode,j.countrycode,j.sladate,j.codamount,j.jobstartdatetime,j.jobenddatetime,j.attemptcount,
> > >> j.primeindicator,j.rescheduleddatetime,j.jobproductcode,
> > >> j.tour_id, j.pickupaccount,
> > >>
> j.connoterequired,j.expectedbags,j.expectedparcels,j.isservicepointpickup,
> > >>
> > >>
> j.estimateddeliverydatetime,j.currency,j.paymentmethod,j.paymentamount,j.missorted,j.pickupcustomername,j.mps,j.parcelcount,j.jobcontactpersonname,t.courier_id,t.tourid,
> > >> js.jobstatusname, jt.externaljobcode, ja.city, ja.postalcode,
> > >> ja.addressline1, ja.addressline2,
> > >> ja.addressline3,ja.addresstype, ja.state
> > >> from JobData j join jobaddress ja on ja.job_id=j.id join
> > >> jobstatus js on js.jobstatuscode=j.jobstatuscode
> > >> join jobtype jt on j.internaljobcode=jt.internaljobcode
> left
> > >> join
> > >> (select v.job_id, string_agg(distinct(v.code),'PPOD') as
> > >> vascodes from JobData j  join valueaddedservices v on j.id=v.job_id
> > >> group by v.job_id) AS vws on vws.job_id=j.id left join tour t on t.id
> =j.tour_id
> > >> and ((j.internaljobcode in ('003','012') and ja.addresstype='RETURN')
> or
> > >> j.internaljobcode not in ('003','012')) ORDER BY id DESC ;
> > >>
> > >> Query3:
> > >>
> > >> "with JobCount as ( select jobstatuscode,count($14) stat_count from
> job j
> > >> where $15=$16  and j.countrycode = $1  and j.facilitycode in ($2) and
> > >> j.internaljobcode in ($3, $4, $5, $6, $7, $8, $9, $10, $11)  and
> > >> ((j.jobstartdatetime between $12 and $13) or j.jobstartdatetime IS
> NULL)
> > >> group by j.jobstatuscode)
> > >> select js.jobstatuscode,COALESCE(stat_count,$17) stat_count from
> JobCount
> > >> jc right outer join jobstatus js on jc.jobstatuscode=js.jobstatuscode"
> > >>
> > >>
> > >> When I run explain analyze for the 1st two queries, Execution Time is below 1
> > >> millisecond. Basically the queries run fast and with low cost when run from
> > >> psql or pgAdmin. However, when called from the API, the Average Time in
> > >> pg_stat_statements shows more than 1 second. When a load test runs, these
> > >> queries get called concurrently and response time becomes poor with more
> > >> load. Could this be due to locking?
> > >>
> > >>
> > >> I checked pg_locks and I see the below records. The query that I used is also
> > >> given below. I could see a few ExclusiveLocks for "virtualxid" records and
> > >> for queries with CTEs (WITH clause). Please advise.
> > >>
> > >>
> > >>
> > >> SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa
> > >> ON pl.pid = psa.pid;
>
> 
>
> You would find that the "granted" column in all the rows from pg_locks
> is "true", that is, no one is waiting on a lock. That slowdown doesn't
> at least seem coming from lock conflict.
>
> regards.
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
>


Re: Pg_locks and pg_stat_activity

2020-12-04 Thread aditya desai
Hi Justin,
Thanks for your response.
I suspect JDBC is using parameterized queries (postgres $1, $2, ..) and psql
isn't (unless you type "prepare p AS SELECT ..." and execute p(.., .., ..)"

Yes JDBC is using parameterized queries which get constructed dynamically
depending upon user privileges in the application. Does this cause any
issues?

I don't know what server version you have, so I don't know whether to
suggest
testing with plan_cache_mode=force_custom_plan

It's Postgres 11.5.

I will look into links you shared.

Regards,
AD.

On Fri, Dec 4, 2020 at 1:47 PM Justin Pryzby  wrote:

> On Fri, Dec 04, 2020 at 01:31:14PM +0530, aditya desai wrote:
> > Hi Justin,
> > Many thanks for your response. Please see my response below.
> >
> > What do you mean by API ?  If it's a different client, how does it
> connect ?
> > Queries are getting called from Web UI built in Microservices spring
> boot.
> > It connected to Database with JDBC driver. Developers have handled
> > connection pooling at the Application side.
> >
> > What db driver ?
> >
> > Driver is JDBC
>
> I suspect JDBC is using parameterized queries (postgres $1, $2, ..) and
> psql
> isn't (unless you type "prepare p AS SELECT ..." and execute p(.., .., ..)"
>
> You can search and find other people who reported similar issues.
>
> https://www.postgresql.org/message-id/flat/A737B7A37273E048B164557ADEF4A58B50FB8D5E%40ntex2010i.host.magwien.gv.at
>
> https://www.postgresql.org/message-id/flat/20200504191201.GU28974%40telsasoft.com
>
> I don't know what server version you have, so I don't know whether to
> suggest
> testing with plan_cache_mode=force_custom_plan
>
> --
> Justin
>
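The same plan behaviour can be reproduced from psql by preparing the statement with placeholders; on releases before plan_cache_mode existed (it was added in PostgreSQL 12), the server may switch to a generic plan after the first five executions, which is the usual reason a query is fast in psql but slow from a driver that uses parameters. A minimal sketch against a simplified form of Query 1 above (the parameter values and the LIMIT are made up):

PREPARE p (text, text, timestamp, timestamp) AS
  SELECT j.id
  FROM job j
  WHERE j.countrycode = $1
    AND j.facilitycode = $2
    AND ((j.jobstartdatetime BETWEEN $3 AND $4) OR j.jobstartdatetime IS NULL)
  ORDER BY createddate DESC
  LIMIT 50;

EXPLAIN (ANALYZE, BUFFERS)
EXECUTE p('TH', 'THNPM1', '2020-08-01', '2020-09-30');
-- repeat the EXECUTE several times; if it suddenly slows down around the
-- sixth run, the cached generic plan is the problem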


SELECT Query taking 200 ms on PostgreSQL compared to 4 ms on Oracle after migration.

2021-04-03 Thread aditya desai
Hi,
We migrated our Oracle databases to PostgreSQL. One of the simple select
queries that takes 4 ms on Oracle is taking around 200 ms on PostgreSQL.
Could you please advise? Please find the query and query plans below. The Gather
cost seems high. Will increasing max_parallel_workers_per_gather help?

explain analyse SELECT bom.address_key dom2137,bom.address_type_key
dom1727,bom.start_date dom1077,bom.end_date dom828,bom.address_status_key
dom1955,bom.address_role_key dom1711,bom.delivery_point_created
dom2362,bom.postcode dom873,bom.postcode_name dom1390,bom.street_name
dom1186,bom.premises_number_1 dom1777,bom.premises_number_2
dom1778,bom.premises_letter_1 dom1784,bom.premises_letter_2
dom1785,bom.premises_separator dom1962,bom.stairway dom892,bom.po_box
dom653,bom.apartment_number dom1732,bom.apartment_letter
dom1739,bom.street_key dom1097,bom.address_use_key dom1609,bom.language_key
dom1272,bom.address_family_id dom1796,bom.cur_address_key
dom2566,bom.created_by dom1052,bom.modified_by dom1158,bom.creation_time
dom1392,bom.modification_time dom1813 FROM DEPT.address dom WHERE
address_key = 6113763

[
{
"Plan": {
"Node Type": "Gather",
"Parallel Aware": false,
"Actual Rows": 1,
"Actual Loops": 1,
"Workers Planned": 1,
"Workers Launched": 1,
"Single Copy": true,
"Plans": [
{
"Node Type": "Index Scan",
"Parent Relationship": "Outer",
"Parallel Aware": false,
"Scan Direction": "Forward",
"Index Name": "address1_i7",
"Relation Name": "address",
"Alias": "dom",
"Actual Rows": 1,
"Actual Loops": 1,
"Index Cond": "(address_key = 6113763)",
"Rows Removed by Index Recheck": 0
}
]
},
"Triggers": []
}
]

"Gather (cost=1000.43..1002.75 rows=1 width=127) (actual
time=174.318..198.539 rows=1 loops=1)"
" Workers Planned: 1"
" Workers Launched: 1"
" Single Copy: true"
" -> Index Scan using address1_i7 on address1 dom (cost=0.43..2.65 rows=1
width=127) (actual time=0.125..0.125 rows=1 loops=1)"
" Index Cond: (address_key = 6113763)"
"Planning Time: 0.221 ms"
"Execution Time: 198.601 ms"



Regards,
Aditya.


Re: SELECT Query taking 200 ms on PostgreSQL compared to 4 ms on Oracle after migration.

2021-04-03 Thread aditya desai
Hi Michael,
Thanks for your response.
Is this table partitioned? - No
How long ago was migration done? - 27th March 2021
Has vacuum freeze and analyze of tables been done? - We ran vacuum analyze.
 Was index created after populating data or reindexed after perhaps? -
Index was created after data load and reindex was executed on all tables
yesterday.
 Version is PostgreSQL-11

Regards,
Aditya.


On Sat, Apr 3, 2021 at 7:40 PM Michael Lewis  wrote:

> It seems like something is missing. Is this table partitioned? How long
> ago was migration done? Has vacuum freeze and analyze of tables been done?
> Was index created after populating data or reindexed after perhaps? What
> version are you using?
>


Re: SELECT Query taking 200 ms on PostgreSQL compared to 4 ms on Oracle after migration.

2021-04-03 Thread aditya desai
Hi Justin,
Yes, force_parallel_mode is on. Should we set it off?

Regards,
Aditya.

On Sat, Apr 3, 2021 at 7:46 PM Justin Pryzby  wrote:

> On Sat, Apr 03, 2021 at 04:08:01PM +0200, Pavel Stehule wrote:
> > so 3. 4. 2021 v 15:38 odesílatel aditya desai 
> napsal:
> > > "Gather (cost=1000.43..1002.75 rows=1 width=127) (actual
> > > time=174.318..198.539 rows=1 loops=1)"
> > > " Workers Planned: 1"
> > > " Workers Launched: 1"
> > > " Single Copy: true"
> > > " -> Index Scan using address1_i7 on address1 dom (cost=0.43..2.65
> rows=1
> > > width=127) (actual time=0.125..0.125 rows=1 loops=1)"
> > > " Index Cond: (address_key = 6113763)"
> > > "Planning Time: 0.221 ms"
> > > "Execution Time: 198.601 ms"
> >
> > You should have broken configuration - there is not any reason to start
> > parallelism -  probably some option in postgresql.conf has very bad
> value.
> > Second - it's crazy to see 200 ms just on interprocess communication -
> > maybe your CPU is overutilized.
>
> It seems like force_parallel_mode is set, which is for debugging and not
> for
> "forcing things to go faster".  Maybe we should rename the parameter, like
> parallel_mode_testing=on.
>
> http://rhaas.blogspot.com/2018/06/using-forceparallelmode-correctly.html
>
> --
> Justin
>


Re: SELECT Query taking 200 ms on PostgreSQL compared to 4 ms on Oracle after migration.

2021-04-03 Thread aditya desai
Thanks Bruce!! Will set it off and retry.

On Sat, Apr 3, 2021 at 8:42 PM Bruce Momjian  wrote:

> On Sat, Apr  3, 2021 at 08:38:18PM +0530, aditya desai wrote:
> > Hi Justin,
> > Yes, force_parallel_mode is on. Should we set it off?
>
> Yes.  I bet someone set it without reading our docs:
>
>
> https://www.postgresql.org/docs/13/runtime-config-query.html#RUNTIME-CONFIG-QUERY-OTHER
>
> --> Allows the use of parallel queries for testing purposes even in
> cases
> --> where no performance benefit is expected.
>
> We might need to clarify this sentence to be clearer it is _only_ for
> testing.  Also, I suggest you review _all_ changes that have been made
> to the server since I am worried other unwise changes might also have
> been made.
>
> ---
>
> >
> > Regards,
> > Aditya.
> >
> > On Sat, Apr 3, 2021 at 7:46 PM Justin Pryzby 
> wrote:
> >
> > On Sat, Apr 03, 2021 at 04:08:01PM +0200, Pavel Stehule wrote:
> > > so 3. 4. 2021 v 15:38 odesílatel aditya desai 
> > napsal:
> > > > "Gather (cost=1000.43..1002.75 rows=1 width=127) (actual
> > > > time=174.318..198.539 rows=1 loops=1)"
> > > > " Workers Planned: 1"
> > > > " Workers Launched: 1"
> > > > " Single Copy: true"
> > > > " -> Index Scan using address1_i7 on address1 dom
> (cost=0.43..2.65 rows
> > =1
> > > > width=127) (actual time=0.125..0.125 rows=1 loops=1)"
> > > > " Index Cond: (address_key = 6113763)"
> > > > "Planning Time: 0.221 ms"
> > > > "Execution Time: 198.601 ms"
> > >
> > > You should have broken configuration - there is not any reason to
> start
> > > parallelism -  probably some option in postgresql.conf has very bad
> > value.
> > > Second - it's crazy to see 200 ms just on interprocess
> communication -
> > > maybe your CPU is overutilized.
> >
> > It seems like force_parallel_mode is set, which is for debugging and
> not
> > for
> > "forcing things to go faster".  Maybe we should rename the
> parameter, like
> > parallel_mode_testing=on.
> >
> >
> http://rhaas.blogspot.com/2018/06/using-forceparallelmode-correctly.html
> >
> > --
> > Justin
> >
>
> --
>   Bruce Momjian  https://momjian.us
>   EDB  https://enterprisedb.com
>
>   If only the physical world exists, free will is an illusion.
>
>


Re: SELECT Query taking 200 ms on PostgreSQL compared to 4 ms on Oracle after migration.

2021-04-03 Thread aditya desai
   | environment variable
 max_wal_senders                       | 10        | configuration file
 max_wal_size                          | 26931MB   | configuration file
 min_wal_size                          | 4GB       | configuration file
 pg_qs.query_capture_mode              | top       | configuration file
 pgms_wait_sampling.query_capture_mode | all       | configuration file
 pgstat_udp_port                       | 20224     | command line
 port                                  | 20224     | command line
 random_page_cost                      | 1.1       | configuration file
 shared_buffers                        | 64GB      | configuration file
 ssl                                   | on        | configuration file
 ssl_ca_file                           | root.crt  | configuration file
 superuser_reserved_connections        | 5         | configuration file
 TimeZone                              | EET       | configuration file
 track_io_timing                       | on        | configuration file
 wal_buffers                           | 128MB     | configuration file
 wal_keep_segments                     | 25        | configuration file
 wal_level                             | replica   | configuration file
 work_mem                              | 16MB      | configuration file


On Sat, Apr 3, 2021 at 8:59 PM aditya desai  wrote:

> Hi Bruce,
> Please find the below output. force_parallel_mode is off now.
>
>  aad_log_min_messages                 | warning                                                      | configuration file
>  application_name                     | psql                                                         | client
>  archive_command                      | c:\postgres\bin\xlogcopy\xlogcopy.exe archive blob "%f" "%p" | configuration file
>  archive_mode                         | on                                                           | configuration file
>  archive_timeout                      | 15min                                                        | configuration file
>  authentication_timeout               | 30s                                                          | configuration file
>  autovacuum_analyze_scale_factor      | 0.05                                                         | configuration file
>  autovacuum_naptime                   | 15s                                                          | configuration file
>  autovacuum_vacuum_scale_factor       | 0.05                                                         | configuration file
>  bgwriter_delay                       | 20ms                                                         | configuration file
>  bgwriter_flush_after                 | 512kB                                                        | configuration file
>  bgwriter_lru_maxpages                | 100                                                          | configuration file
>  checkpoint_completion_target         | 0.9                                                          | configuration file
>  checkpoint_flush_after               | 256kB                                                        | configuration file
>  checkpoint_timeout                   | 5min                                                         | configuration file
>  client_encoding                      | UTF8                                                         | client
>  connection_ID                        | 5b59f092-444c-49df-b5d6-a7a0028a7855                         | client
>  connection_PeerIP                    | fd40:4d4a:11:5067:6d11:500:a07:5144                          | client
>  connection_Vnet                      | on                                                           | client
>  constraint_exclusion                 | partition                                                    | configuration file
>  data_sync_retry                      | on                                                           | configuration file
>  DateStyle                            | ISO, MDY                                                     | configuration file
>  default_text_search_config           | pg_catalog.english                                           | configuration file
>  dynamic_shared_memory_type           | windows                                                      | configuration file
>  effective_cache_size                 | 160GB                                                        | configuration file
>  enable_seqscan                       | off                                                          | configuration file
>  force_parallel_mode                  | off                                                          | configuration file
>  from_collapse_limit                  | 15                                                           | configuration file
>  full_page_writes                     | off                                                          | configuration file
>  hot_standby                          | on                                                           |

Re: SELECT Query taking 200 ms on PostgreSQL compared to 4 ms on Oracle after migration.

2021-04-03 Thread aditya desai
I will gather all information and get back to you

On Sat, Apr 3, 2021 at 9:00 PM Pavel Stehule 
wrote:

>
>
> so 3. 4. 2021 v 17:15 odesílatel aditya desai  napsal:
>
>> Hi Pavel,
>> Thanks for response. Please see below.
>> work_mem=16MB
>> maintenance_work_mem=1GB
>> effective_cache_size=160GB
>> shared_buffers=64GB
>> force_parallel_mode=ON
>>
>
> force_parallel_mode is very bad idea. efective_cache_size=160GB can be too
> much too. work_mem 16 MB is maybe too low. The configuration looks a little
> bit chaotic :)
>
> How much has RAM your server? How much CPU cores are there? What is
> max_connections?
>
> Regards
>
> Pavel
>
>
>
>> Regards,
>> Aditya.
>>
>>
>> On Sat, Apr 3, 2021 at 7:38 PM Pavel Stehule 
>> wrote:
>>
>>>
>>>
>>> so 3. 4. 2021 v 15:38 odesílatel aditya desai 
>>> napsal:
>>>
>>>> Hi,
>>>> We migrated our Oracle Databases to PostgreSQL. One of the simple
>>>> select query that takes 4 ms on Oracle is taking around 200 ms on
>>>> PostgreSQL. Could you please advise. Please find query and query plans
>>>> below. Gather cost seems high. Will increasing
>>>> max_parallel_worker_per_gather help?
>>>>
>>>> explain analyse SELECT bom.address_key dom2137,bom.address_type_key
>>>> dom1727,bom.start_date dom1077,bom.end_date dom828,bom.address_status_key
>>>> dom1955,bom.address_role_key dom1711,bom.delivery_point_created
>>>> dom2362,bom.postcode dom873,bom.postcode_name dom1390,bom.street_name
>>>> dom1186,bom.premises_number_1 dom1777,bom.premises_number_2
>>>> dom1778,bom.premises_letter_1 dom1784,bom.premises_letter_2
>>>> dom1785,bom.premises_separator dom1962,bom.stairway dom892,bom.po_box
>>>> dom653,bom.apartment_number dom1732,bom.apartment_letter
>>>> dom1739,bom.street_key dom1097,bom.address_use_key dom1609,bom.language_key
>>>> dom1272,bom.address_family_id dom1796,bom.cur_address_key
>>>> dom2566,bom.created_by dom1052,bom.modified_by dom1158,bom.creation_time
>>>> dom1392,bom.modification_time dom1813 FROM DEPT.address dom WHERE
>>>> address_key = 6113763
>>>>
>>>> [
>>>> {
>>>> "Plan": {
>>>> "Node Type": "Gather",
>>>> "Parallel Aware": false,
>>>> "Actual Rows": 1,
>>>> "Actual Loops": 1,
>>>> "Workers Planned": 1,
>>>> "Workers Launched": 1,
>>>> "Single Copy": true,
>>>> "Plans": [
>>>> {
>>>> "Node Type": "Index Scan",
>>>> "Parent Relationship": "Outer",
>>>> "Parallel Aware": false,
>>>> "Scan Direction": "Forward",
>>>> "Index Name": "address1_i7",
>>>> "Relation Name": "address",
>>>> "Alias": "dom",
>>>> "Actual Rows": 1,
>>>> "Actual Loops": 1,
>>>> "Index Cond": "(address_key = 6113763)",
>>>> "Rows Removed by Index Recheck": 0
>>>> }
>>>> ]
>>>> },
>>>> "Triggers": []
>>>> }
>>>> ]
>>>>
>>>> "Gather (cost=1000.43..1002.75 rows=1 width=127) (actual
>>>> time=174.318..198.539 rows=1 loops=1)"
>>>> " Workers Planned: 1"
>>>> " Workers Launched: 1"
>>>> " Single Copy: true"
>>>> " -> Index Scan using address1_i7 on address1 dom (cost=0.43..2.65
>>>> rows=1 width=127) (actual time=0.125..0.125 rows=1 loops=1)"
>>>> " Index Cond: (address_key = 6113763)"
>>>> "Planning Time: 0.221 ms"
>>>> "Execution Time: 198.601 ms"
>>>>
>>>
>>> You should have broken configuration - there is not any reason to start
>>> parallelism -  probably some option in postgresql.conf has very bad value.
>>> Second - it's crazy to see 200 ms just on interprocess communication -
>>> maybe your CPU is overutilized.
>>>
>>> Regards
>>>
>>> Pavel
>>>
>>>
>>>
>>>
>>>>
>>>>
>>>> Regards,
>>>> Aditya.
>>>>
>>>


Re: SELECT Query taking 200 ms on PostgreSQL compared to 4 ms on Oracle after migration.

2021-04-03 Thread aditya desai
Thanks Justin. Will review all parameters and get back to you.

On Sat, Apr 3, 2021 at 9:11 PM Justin Pryzby  wrote:

> On Sat, Apr 03, 2021 at 11:39:19AM -0400, Tom Lane wrote:
> > Bruce Momjian  writes:
> > > On Sat, Apr  3, 2021 at 08:38:18PM +0530, aditya desai wrote:
> > >> Yes, force_parallel_mode is on. Should we set it off?
> >
> > > Yes.  I bet someone set it without reading our docs:
> >
> > >
> https://www.postgresql.org/docs/13/runtime-config-query.html#RUNTIME-CONFIG-QUERY-OTHER
> >
> > > --> Allows the use of parallel queries for testing purposes even in
> cases
> > > --> where no performance benefit is expected.
> >
> > > We might need to clarify this sentence to be clearer it is _only_ for
> > > testing.
> >
> > I wonder why it is listed under planner options at all, and not under
> > developer options.
>
> Because it's there to help DBAs catch errors in functions incorrectly
> marked as
> parallel safe.
>
> --
> Justin
>


Re: SELECT Query taking 200 ms on PostgreSQL compared to 4 ms on Oracle after migration.

2021-04-03 Thread aditya desai
Hi Justin/Bruce/Pavel,
Thanks for your inputs. After setting force_parallel_mode=off Execution
time of same query was reduced to 1ms from 200 ms. Worked like a charm. We
also increased work_mem to 80MB. Thanks again.
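
For anyone hitting the same thing, a minimal way to check and turn the setting off at the server level (this assumes direct superuser access; on a managed service the parameter is normally changed through the provider's console instead):

SHOW force_parallel_mode;
ALTER SYSTEM SET force_parallel_mode = off;   -- writes postgresql.auto.conf
SELECT pg_reload_conf();
SHOW force_parallel_mode;                     -- new sessions should report "off"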

Regards,
Aditya.

On Sat, Apr 3, 2021 at 9:14 PM aditya desai  wrote:

> Thanks Justin. Will review all parameters and get back to you.
>
> On Sat, Apr 3, 2021 at 9:11 PM Justin Pryzby  wrote:
>
>> On Sat, Apr 03, 2021 at 11:39:19AM -0400, Tom Lane wrote:
>> > Bruce Momjian  writes:
>> > > On Sat, Apr  3, 2021 at 08:38:18PM +0530, aditya desai wrote:
>> > >> Yes, force_parallel_mode is on. Should we set it off?
>> >
>> > > Yes.  I bet someone set it without reading our docs:
>> >
>> > >
>> https://www.postgresql.org/docs/13/runtime-config-query.html#RUNTIME-CONFIG-QUERY-OTHER
>> >
>> > > --> Allows the use of parallel queries for testing purposes even in
>> cases
>> > > --> where no performance benefit is expected.
>> >
>> > > We might need to clarify this sentence to be clearer it is _only_ for
>> > > testing.
>> >
>> > I wonder why it is listed under planner options at all, and not under
>> > developer options.
>>
>> Because it's there to help DBAs catch errors in functions incorrectly
>> marked as
>> parallel safe.
>>
>> --
>> Justin
>>
>


Re: SELECT Query taking 200 ms on PostgreSQL compared to 4 ms on Oracle after migration.

2021-04-03 Thread aditya desai
Yes. I have made suggestions on connection pooling as well. Currently it is
being done from Application side.

On Sat, Apr 3, 2021 at 11:12 PM Pavel Stehule 
wrote:

>
>
> so 3. 4. 2021 v 19:37 odesílatel aditya desai  napsal:
>
>> Hi Justin/Bruce/Pavel,
>> Thanks for your inputs. After setting force_parallel_mode=off Execution
>> time of same query was reduced to 1ms from 200 ms. Worked like a charm. We
>> also increased work_mem to 80=MB. Thanks
>>
>
> super.
>
> The too big max_connection can cause a lot of problems. You should install
> and use pgbouncer or pgpool II.
>
>
> https://scalegrid.io/blog/postgresql-connection-pooling-part-4-pgbouncer-vs-pgpool/
>
> Regards
>
> Pavel
>
>
>
>
>> again.
>>
>> Regards,
>> Aditya.
>>
>> On Sat, Apr 3, 2021 at 9:14 PM aditya desai  wrote:
>>
>>> Thanks Justin. Will review all parameters and get back to you.
>>>
>>> On Sat, Apr 3, 2021 at 9:11 PM Justin Pryzby 
>>> wrote:
>>>
>>>> On Sat, Apr 03, 2021 at 11:39:19AM -0400, Tom Lane wrote:
>>>> > Bruce Momjian  writes:
>>>> > > On Sat, Apr  3, 2021 at 08:38:18PM +0530, aditya desai wrote:
>>>> > >> Yes, force_parallel_mode is on. Should we set it off?
>>>> >
>>>> > > Yes.  I bet someone set it without reading our docs:
>>>> >
>>>> > >
>>>> https://www.postgresql.org/docs/13/runtime-config-query.html#RUNTIME-CONFIG-QUERY-OTHER
>>>> >
>>>> > > --> Allows the use of parallel queries for testing purposes even in
>>>> cases
>>>> > > --> where no performance benefit is expected.
>>>> >
>>>> > > We might need to clarify this sentence to be clearer it is _only_
>>>> for
>>>> > > testing.
>>>> >
>>>> > I wonder why it is listed under planner options at all, and not under
>>>> > developer options.
>>>>
>>>> Because it's there to help DBAs catch errors in functions incorrectly
>>>> marked as
>>>> parallel safe.
>>>>
>>>> --
>>>> Justin
>>>>
>>>


SHARED LOCKS , EXCLUSIVE LOCKS, ACCESS EXCLUSIVE LOCKS

2021-04-04 Thread aditya desai
Hi,
We have a few select queries during which we see SHARED LOCKS and EXCLUSIVE
LOCKS on tables. Can these locks cause slowness? Is there any way to reduce
the locks?

What must be causing ACCESS EXCLUSIVE LOCKS when the application is running
select queries? Is it AUTOVACUUM?

Regards,
Aditya.
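
Heavyweight locks only hurt when something is actually waiting on them. One way to check, using the standard catalogs and pg_blocking_pids() (no extra setup assumed), is to look for ungranted locks while the selects are running:

SELECT a.pid,
       a.wait_event_type,
       l.locktype,
       l.relation::regclass AS relation,
       l.mode,
       pg_blocking_pids(a.pid) AS blocked_by,
       left(a.query, 60)       AS query
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE NOT l.granted;

If this returns no rows, the locks are not what is slowing the selects down. Plain autovacuum takes SHARE UPDATE EXCLUSIVE rather than ACCESS EXCLUSIVE; ACCESS EXCLUSIVE usually points at DDL, TRUNCATE or REFRESH MATERIALIZED VIEW.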


Re: SELECT Query taking 200 ms on PostgreSQL compared to 4 ms on Oracle after migration.

2021-04-04 Thread aditya desai
Noted thanks!!

On Sun, Apr 4, 2021 at 4:19 PM Pavel Stehule 
wrote:

>
>
> ne 4. 4. 2021 v 12:39 odesílatel aditya desai  napsal:
>
>> Hi Pavel,
>> Notes thanks. We have 64 core cpu and 320 GB RAM.
>>
>
> ok - this is probably good for max thousand connections, maybe less (about
> 6 hundred). Postgres doesn't perform well, when there are too many active
> queries. Other databases have limits for active queries, and then use an
> internal queue. But Postgres has nothing similar.
>
>
>
>
>
>
>
>> Regards,
>> Aditya.
>>
>> On Sat, Apr 3, 2021 at 11:21 PM Pavel Stehule 
>> wrote:
>>
>>>
>>>
>>> so 3. 4. 2021 v 19:45 odesílatel aditya desai 
>>> napsal:
>>>
>>>> Yes. I have made suggestions on connection pooling as well. Currently
>>>> it is being done from Application side.
>>>>
>>>
>>> It is usual - but the application side pooling doesn't solve well
>>> overloading. The behaviour of the database is not linear. Usually opened
>>> connections are not active. But any non active connection can be changed to
>>> an active connection (there is not any limit for active connections), and
>>> then the performance can be very very slow. Good pooling and good setting
>>> of max_connections is protection against overloading. max_connection should
>>> be 10-20 x CPU cores  (for OLTP)
>>>
>>> Regards
>>>
>>> Pavel
>>>
>>>
>>>
>>>
>>>> On Sat, Apr 3, 2021 at 11:12 PM Pavel Stehule 
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> so 3. 4. 2021 v 19:37 odesílatel aditya desai 
>>>>> napsal:
>>>>>
>>>>>> Hi Justin/Bruce/Pavel,
>>>>>> Thanks for your inputs. After setting force_parallel_mode=off
>>>>>> Execution time of same query was reduced to 1ms from 200 ms. Worked like 
>>>>>> a
>>>>>> charm. We also increased work_mem to 80=MB. Thanks
>>>>>>
>>>>>
>>>>> super.
>>>>>
>>>>> The too big max_connection can cause a lot of problems. You should
>>>>> install and use pgbouncer or pgpool II.
>>>>>
>>>>>
>>>>> https://scalegrid.io/blog/postgresql-connection-pooling-part-4-pgbouncer-vs-pgpool/
>>>>>
>>>>> Regards
>>>>>
>>>>> Pavel
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> again.
>>>>>>
>>>>>> Regards,
>>>>>> Aditya.
>>>>>>
>>>>>> On Sat, Apr 3, 2021 at 9:14 PM aditya desai 
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks Justin. Will review all parameters and get back to you.
>>>>>>>
>>>>>>> On Sat, Apr 3, 2021 at 9:11 PM Justin Pryzby 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> On Sat, Apr 03, 2021 at 11:39:19AM -0400, Tom Lane wrote:
>>>>>>>> > Bruce Momjian  writes:
>>>>>>>> > > On Sat, Apr  3, 2021 at 08:38:18PM +0530, aditya desai wrote:
>>>>>>>> > >> Yes, force_parallel_mode is on. Should we set it off?
>>>>>>>> >
>>>>>>>> > > Yes.  I bet someone set it without reading our docs:
>>>>>>>> >
>>>>>>>> > >
>>>>>>>> https://www.postgresql.org/docs/13/runtime-config-query.html#RUNTIME-CONFIG-QUERY-OTHER
>>>>>>>> >
>>>>>>>> > > --> Allows the use of parallel queries for testing purposes
>>>>>>>> even in cases
>>>>>>>> > > --> where no performance benefit is expected.
>>>>>>>> >
>>>>>>>> > > We might need to clarify this sentence to be clearer it is
>>>>>>>> _only_ for
>>>>>>>> > > testing.
>>>>>>>> >
>>>>>>>> > I wonder why it is listed under planner options at all, and not
>>>>>>>> under
>>>>>>>> > developer options.
>>>>>>>>
>>>>>>>> Because it's there to help DBAs catch errors in functions
>>>>>>>> incorrectly marked as
>>>>>>>> parallel safe.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Justin
>>>>>>>>
>>>>>>>


Re: SHARED LOCKS , EXCLUSIVE LOCKS, ACCESS EXCLUSIVE LOCKS

2021-04-05 Thread aditya desai
Thanks Amine and Justin. I will check and try this.

Regards,
Aditya.

On Sun, Apr 4, 2021 at 10:49 PM Justin Pryzby  wrote:

> On Sun, Apr 04, 2021 at 04:12:14PM +0530, aditya desai wrote:
> > Hi,
> > We have few select queries during which we see SHARED LOCKS and EXCLUSIVE
> > LOCKS on tables. Can these locks cause slowness? Is there any way to
> reduce
> > the locks?
> >
> > What must be causing ACCESS EXCLUSIVE LOCKS when the application is
> running
> > select queries? Is it AUTOVACUUM?
>
> I suggest to review all the logging settings, and consider setting:
> log_destination = 'stderr,csvlog'
>
>
> log_checkpoints = on
>
>
>
> log_lock_waits  = on
>
>
>
> log_min_messages= info
>
>
> log_min_error_statement = notice
>
>
>
> log_temp_files  = 0
>
>
> log_min_duration_statement  = '9sec'
>
>
>
> log_autovacuum_min_duration = '99sec'
>
>
>
> You should probably set up some way to monitor logs.
> We set log_destination=csvlog and import them into the DB.
> Then I have nagios checks for slow queries, errors, many tempfiles, etc.
>
> https://www.postgresql.org/docs/current/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-CSVLOG
> https://www.postgresql.org/message-id/[email protected]
>
> --
> Justin
>


Substitute for synonym in Oracle after migration to postgres

2021-04-06 Thread aditya desai
Hi,
We have to access data from one schema to another. We have created a view
for this but performance is not good. We tried materialized views as well,
but REFRESH MATERIALIZED VIEW is creating problems as it takes an ACCESS EXCLUSIVE lock.

Is there any other way to achieve this?


Regards,
Aditya.


select count(*) is slow

2021-04-06 Thread aditya desai
Hi,
Below query takes 12 seconds. We have an index on  postcode.

select count(*) from table where postcode >= '00420' AND postcode <= '00500'

index:

CREATE INDEX Table_i1
ON table  USING btree
((postcode::numeric));

Table has 180,000 rows and the count is 150,000. The expectation is to run
this query in 2-3 seconds (it takes 2 seconds in Oracle).

Here is a query plan:

"Aggregate  (cost=622347.34..622347.35 rows=1 width=8) (actual
time=12850.580..12850.580 rows=1 loops=1)"
"  ->  Bitmap Heap Scan on table  (cost=413379.89..621681.38 rows=266383
width=0) (actual time=12645.656..12835.185 rows=209749 loops=1)"
"Recheck Cond: (((postcode)::text >= '00420'::text) AND
((postcode)::text <= '00500'::text))"
"Heap Blocks: exact=118286"
"->  Bitmap Index Scan on table_i4  (cost=0.00..413313.29
rows=266383 width=0) (actual time=12615.321..12615.321 rows=209982 loops=1)"
"  Index Cond: (((postcode)::text >= '00420'::text) AND
((postcode)::text <= '00500'::text))"
"Planning Time: 0.191 ms"
"Execution Time: 12852.823 ms"



Regards,
Aditya.


Re: Substitute for synonym in Oracle after migration to postgres

2021-04-06 Thread aditya desai
Thanks will check.

On Tue, Apr 6, 2021 at 4:11 PM hubert depesz lubaczewski 
wrote:

> On Tue, Apr 06, 2021 at 01:22:31PM +0530, aditya desai wrote:
> > Hi,
> > We have to access data from one schema to another. We have created a
> view for this but performance is not good. We tried
> > materialized views as well but Refresh MV is creating problem as it puts
> and access exclusive locks.
> > Is there any other way to achieve this?
>
> Yes, just use the other table right in your query. There is no need to
> add wrappers.
>
> select * from schema1.table join schema2.table on ...
>
> depesz
>
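If the view or materialized view was only there to bridge the schemas, the direct join works as long as the querying role has privileges on both sides; a minimal sketch (schema, table and role names are hypothetical):

GRANT USAGE ON SCHEMA schema2 TO app_user;
GRANT SELECT ON ALL TABLES IN SCHEMA schema2 TO app_user;
-- optional: make tables created later readable as well
ALTER DEFAULT PRIVILEGES IN SCHEMA schema2 GRANT SELECT ON TABLES TO app_user;

-- then reference the other schema directly, no wrapper object needed
SELECT t1.*, t2.some_column
FROM schema1.table1 t1
JOIN schema2.table2 t2 ON t2.id = t1.id;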


Re: select count(*) is slow

2021-04-06 Thread aditya desai
Thanks Tom. Will try with numeric. Please ignore table and index naming.

On Tue, Apr 6, 2021 at 6:55 PM Tom Lane  wrote:

> aditya desai  writes:
> > Below query takes 12 seconds. We have an index on  postcode.
>
> > select count(*) from table where postcode >= '00420' AND postcode <=
> '00500'
>
> That query does not match this index:
>
> > CREATE INDEX Table_i1
> > ON table  USING btree
> > ((postcode::numeric));
>
> You could either change postcode to numeric, change all your queries
> of this sort to include the cast explicitly, or make an index that
> doesn't have a cast.
>
> regards, tom lane
>


Re: select count(*) is slow

2021-04-07 Thread aditya desai
Thanks to all of you. Removed casting to numeric from Index. Performance
improved from 12 sec to 500 ms. Rocket!!!
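
For reference, the change amounts to indexing the column as it is stored, so the text-range predicate can use the index directly (names below are the placeholders used earlier in the thread):

DROP INDEX IF EXISTS table_i1;
CREATE INDEX table_i1 ON "table" USING btree (postcode);

SELECT count(*) FROM "table"
WHERE postcode >= '00420' AND postcode <= '00500';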

On Tue, Apr 6, 2021 at 9:14 PM Andrew Dunstan  wrote:

>
> On 4/6/21 9:30 AM, aditya desai wrote:
> > Thanks Tom. Will try with numeric. Please ignore table and index naming.
> >
> > On Tue, Apr 6, 2021 at 6:55 PM Tom Lane  > <mailto:[email protected]>> wrote:
> >
> > aditya desai mailto:[email protected]>>
> writes:
> > > Below query takes 12 seconds. We have an index on  postcode.
> >
> > > select count(*) from table where postcode >= '00420' AND
> > postcode <= '00500'
> >
> > That query does not match this index:
> >
> > > CREATE INDEX Table_i1
> > > ON table  USING btree
> > > ((postcode::numeric));
> >
> > You could either change postcode to numeric, change all your queries
> > of this sort to include the cast explicitly, or make an index that
> > doesn't have a cast.
> >
> >
> >
>
>
> IMNSHO postcodes, zip codes, telephone numbers and the like should never
> be numeric under any circumstances. This isn't numeric data (what is the
> average postcode?), it's textual data consisting of digits, so they
> should always be text/varchar. The index here should just be on the
> plain text column, not cast to numeric.
>
>
> cheers
>
>
> andrew
>
>
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com
>
>


str_aggr function not working

2021-04-08 Thread aditya desai
Hi,
I need to combine results of multiple rows in one row. I get below error.
Could you please help.

Query:

select string_agg((select '**' || P.PhaseName || ' - ' ||
R.Recommendation AS "ABC" from tblAssessmentRecommendation
R,tblAssessmentPhases P
where  R.PhaseID = P.PhaseID  Order BY P.sortOrder DESC),' ')

Error:

ERROR: more than one row returned by a subquery used as an expression SQL
state: 21000

Regards,
Aditya.


Re: str_aggr function not working

2021-04-08 Thread aditya desai
Thanks Patrick. I used a WITH query and fed that output to string_agg,
which worked. However, it is giving performance issues. Will check on that.
Thanks.
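
A sketch of that WITH + string_agg combination, using the table and column names from the original query:

WITH phase_recs AS (
    SELECT '**' || p.PhaseName || ' - ' || r.Recommendation AS line,
           p.sortOrder
    FROM tblAssessmentRecommendation r
    JOIN tblAssessmentPhases p ON r.PhaseID = p.PhaseID
)
SELECT string_agg(line, ' ' ORDER BY sortOrder DESC) AS "ABC"
FROM phase_recs;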

On Thu, Apr 8, 2021 at 5:11 PM Patrick FICHE 
wrote:

> *From:* aditya desai 
> *Sent:* Thursday, April 8, 2021 1:32 PM
> *To:* Pgsql Performance 
> *Subject:* str_aggr function not working
>
>
>
> Hi,
>
> I need to combine results of multiple rows in one row. I get below error.
> Could you please help.
>
>
>
> Query:
>
>
>
> select string_agg((select '**' || P.PhaseName || ' - ' ||
> R.Recommendation AS "ABC" from tblAssessmentRecommendation
> R,tblAssessmentPhases P
>
> where  R.PhaseID = P.PhaseID  Order BY P.sortOrder DESC),' ')
>
>
>
> Error:
>
>
>
> ERROR: more than one row returned by a subquery used as an expression SQL
> state: 21000
>
>
>
> Regards,
>
> Aditya.
>
>
>
>
>
> Hi,
>
>
>
> I would suggest you to try something like this instead
>
>
>
> select string_agg( '**' || P.PhaseName || ' - ' || R.Recommendation, ' '
>  ORDER BY P.sortOrder DESC ) AS "ABC"
>
> from tblAssessmentRecommendation R,tblAssessmentPhases P
>
> where  R.PhaseID = P.PhaseID
>
>
>
> Regards,
>
>
>
> Patrick
>
>
>


Re: str_aggr function not working

2021-04-08 Thread aditya desai
Sure!! Thanks for the response. Apologies for multiple questions. Faced
this during high priority MSSQL to PostgreSQL migration. Did not see any
equivalent of XML PATH which would give desired results. Finally was able
to resolve the issue by rewriting the proc using WITH and string_agg in
combination. However, still facing performance issues with it. Will
investigate it.

On Thu, Apr 8, 2021 at 5:08 PM Mike Sofen  wrote:

> You realize that there are a million answers to your questions online?
> Are you doing any google searches before bothering this list with basic
> questions?  I personally never email this list until I’ve exhausted all
> searches and extensive trial and error, as do most practitioners.  This
> list is incredibly patient and polite, and...there are limits.  Please
> consider doing more research before asking a question.  In your example
> below, you’re getting a basic subquery error – research how to fix that.
> Mike
>
>
>
> *From:* aditya desai 
> *Sent:* Thursday, April 08, 2021 4:32 AM
> *To:* Pgsql Performance 
> *Subject:* str_aggr function not working
>
>
>
> Hi,
>
> I need to combine results of multiple rows in one row. I get below error.
> Could you please help.
>
>
>
> Query:
>
>
>
> select string_agg((select '**' || P.PhaseName || ' - ' ||
> R.Recommendation AS "ABC" from tblAssessmentRecommendation
> R,tblAssessmentPhases P
>
> where  R.PhaseID = P.PhaseID  Order BY P.sortOrder DESC),' ')
>
>
>
> Error:
>
>
>
> ERROR: more than one row returned by a subquery used as an expression SQL
> state: 21000
>
>
>
> Regards,
>
> Aditya.
>


Re: OLEDB for PostgreSQL

2021-04-19 Thread aditya desai
Hi Mustafa,
You can look into the SQLine tool. We recently used it to migrate MSSQL to
PostgreSQL. For procedures/functions etc. you need a good amount of
understanding of T-SQL and PL/pgSQL. SQLine will convert 60-80% of your T-SQL
code. Some manual effort is required at the end.

Regards,
Aditya.

On Fri, Apr 16, 2021 at 4:03 PM  wrote:

> Hello,
>
>
>
> We want to start migration to POSTGRESQL13 from  MSSQL SERVER  but we
> couldnt find any solution for oledb drivers. Please help us to find a
> solution or any workaround if possible.
>
>
>
>
>
>
>
> Thanks.
>
> Mustafa
>


Error while calling proc with table type from Application

2021-04-29 Thread aditya desai
Hi,
One of the procs, which accepts a table type as a parameter, gives the below
error when called from the application. Could not find a concrete solution for
this. Can someone help?

call PROCEDURE ABC (p_optiontable optiontype)



Below is the error while executing proc -

“the clr type system.data.datatable isn't natively supported by npgsql or
your postgresql. to use it with a postgresql composite you need to specify
datatypename or to map it, please refer to the documentation.”


Regards,

Aditya.


Re: Error while calling proc with table type from Application (npgsql)

2021-04-30 Thread aditya desai
Hi Justin,
Thanks for  your response. We have a user defined type created as below and
we need to pass this user defined parameter to a procedure from .net code.
Basically the procedure needs to accept  multiple rows as parameters(user
defined table type). This happened seamlessly in SQL Server but while doing
it in Postgres after migration we get the error mentioned in the above
chain. Is theere any way we can achieve this?

CREATE TYPE public.optiontype AS (
projectid integer,
optionid integer,
phaseid integer,
remarks text
);

Regards,
Aditya.



On Thu, Apr 29, 2021 at 6:32 PM Justin Pryzby  wrote:

> On Thu, Apr 29, 2021 at 02:52:23PM +0530, aditya desai wrote:
> > Hi,
> > One of the procs which accept tabletype as parameter gives below error
> > while being called from Application. Could not find a concrete solution
> for
> > this. Can someone help?
> >
> > call PROCEDURE ABC (p_optiontable optiontype)
>
> What is PROCEDURE ABC ?  If you created it, send its definition with your
> problem report.
>
> > Below is the error while executing proc -
>
> How are you executing it?  This seems like an error from npgsl, not
> postgres.
> It may be a client-side error, and it may be that the query isn't even
> being
> sent to the server at that point.
>
> > “the clr type system.data.datatable isn't natively supported by npgsql or
> > your postgresql. to use it with a postgresql composite you need to
> specify
> > datatypename or to map it, please refer to the documentation.”
>
> Did you do this ?
> https://www.npgsql.org/doc/types/enums_and_composites.html
>
> --
> Justin
>


Re: Error while calling proc with table type from Application (npgsql)

2021-04-30 Thread aditya desai
Hi Justin,
Thanks for  your response. We have a user defined type created as below and
we need to pass this user defined parameter to a procedure from .net code.
Basically the procedure needs to accept multiple rows as parameters (a user
defined table type). This happened seamlessly in SQL Server, but while doing
it in Postgres after migration we get the error mentioned in the above
chain. Is there any way we can achieve this?

CREATE TYPE public.optiontype AS (
projectid integer,
optionid integer,
phaseid integer,
remarks text
);

Also here is a sample procedure.

 CREATE OR REPLACE procedure SaveAssessmentInfo
 (
  p_Optiontable OptionType
 )
LANGUAGE 'plpgsql'
AS $BODY$
BEGIN

  insert into tempOptions
  select * from p_Optiontable;

END;
 $BODY$;

Regards,
Aditya.
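
One way to get multiple rows into the procedure without a table-valued parameter is to accept an array of the composite type and unnest it; a minimal sketch, assuming tempOptions has columns matching optiontype (on the client side the composite still has to be mapped in npgsql, per the link in Justin's reply quoted below):

CREATE OR REPLACE PROCEDURE saveassessmentinfo(p_options optiontype[])
LANGUAGE plpgsql
AS $BODY$
BEGIN
    INSERT INTO tempOptions (projectid, optionid, phaseid, remarks)
    SELECT o.projectid, o.optionid, o.phaseid, o.remarks
    FROM unnest(p_options) AS o;
END;
$BODY$;

-- calling it from SQL with an array of composite literals
CALL saveassessmentinfo(ARRAY[
    (1, 10, 2, 'first remark')::optiontype,
    (1, 11, 2, 'second remark')::optiontype
]);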

On Fri, Apr 30, 2021 at 6:32 PM aditya desai  wrote:

> Hi Justin,
> Thanks for  your response. We have a user defined type created as below
> and we need to pass this user defined parameter to a procedure from .net
> code. Basically the procedure needs to accept  multiple rows as
> parameters(user defined table type). This happened seamlessly in SQL Server
> but while doing it in Postgres after migration we get the error mentioned
> in the above chain. Is theere any way we can achieve this?
>
> CREATE TYPE public.optiontype AS (
> projectid integer,
> optionid integer,
> phaseid integer,
> remarks text
> );
>
> Regards,
> Aditya.
>
>
>
> On Thu, Apr 29, 2021 at 6:32 PM Justin Pryzby 
> wrote:
>
>> On Thu, Apr 29, 2021 at 02:52:23PM +0530, aditya desai wrote:
>> > Hi,
>> > One of the procs which accept tabletype as parameter gives below error
>> > while being called from Application. Could not find a concrete solution
>> for
>> > this. Can someone help?
>> >
>> > call PROCEDURE ABC (p_optiontable optiontype)
>>
>> What is PROCEDURE ABC ?  If you created it, send its definition with your
>> problem report.
>>
>> > Below is the error while executing proc -
>>
>> How are you executing it?  This seems like an error from npgsl, not
>> postgres.
>> It may be a client-side error, and it may be that the query isn't even
>> being
>> sent to the server at that point.
>>
>> > “the clr type system.data.datatable isn't natively supported by npgsql
>> or
>> > your postgresql. to use it with a postgresql composite you need to
>> specify
>> > datatypename or to map it, please refer to the documentation.”
>>
>> Did you do this ?
>> https://www.npgsql.org/doc/types/enums_and_composites.html
>>
>> --
>> Justin
>>
>


difference between pg_triggers and information_schema.triggers

2021-08-11 Thread aditya desai
Hi All,
What is the difference between pg_triggers and information_schema.triggers?
I want to list all triggers in the database.

The count differs in both.

select count(1) from information_schema.triggers  -55
select count(1) from pg_trigger - 48

What is the best way to list all objects in PostgreSQL?(similar to
all_objects in Oracle).


Regards,
Aditya.


Re: difference between pg_triggers and information_schema.triggers

2021-08-11 Thread aditya desai
Seems like multiple entries in information_schema.triggers for
INSERT/UPDATE/DELETE. Understood thanks.

postgres=# select tgname,tgtype  from pg_trigger;
       tgname       | tgtype
--------------------+--------
 insert_empployee   | 31
 insert_empployee_1 | 31
(2 rows)


postgres=# select tgname  from pg_trigger;
       tgname
--------------------
 insert_empployee
 insert_empployee_1
(2 rows)


postgres=# select trigger_name,event_manipulation from
information_schema.triggers;
    trigger_name    | event_manipulation
--------------------+--------------------
 insert_empployee   | INSERT
 insert_empployee   | DELETE
 insert_empployee   | UPDATE
 insert_empployee_1 | INSERT
 insert_empployee_1 | DELETE
 insert_empployee_1 | UPDATE
(6 rows)

Regards,
Aditya.
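
There is no single built-in equivalent of Oracle's ALL_OBJECTS, but a rough one can be assembled from pg_catalog; the sketch below covers relations, functions and triggers, and can be extended with further catalogs (pg_type, pg_constraint, ...) as needed:

SELECT n.nspname AS schema_name,
       c.relname AS object_name,
       CASE c.relkind
            WHEN 'r' THEN 'TABLE'
            WHEN 'p' THEN 'PARTITIONED TABLE'
            WHEN 'v' THEN 'VIEW'
            WHEN 'm' THEN 'MATERIALIZED VIEW'
            WHEN 'i' THEN 'INDEX'
            WHEN 'S' THEN 'SEQUENCE'
            ELSE c.relkind::text
       END AS object_type
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname NOT IN ('pg_catalog', 'information_schema')
UNION ALL
SELECT n.nspname, p.proname, 'FUNCTION'
FROM pg_proc p
JOIN pg_namespace n ON n.oid = p.pronamespace
WHERE n.nspname NOT IN ('pg_catalog', 'information_schema')
UNION ALL
SELECT n.nspname, t.tgname, 'TRIGGER'
FROM pg_trigger t
JOIN pg_class c ON c.oid = t.tgrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE NOT t.tgisinternal;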

On Thu, Aug 12, 2021 at 12:07 AM David G. Johnston <
[email protected]> wrote:

> On Wednesday, August 11, 2021, aditya desai  wrote:
>
>> Hi All,
>> What is the difference between pg_triggers and
>> information_schema.triggers? I want to list all triggers in the database.
>>
>
> Read the docs for information_schema.triggers.
>
>
>> What is the best way to list all objects in PostgreSQL?(similar to
>> all_objects in Oracle).
>>
>>
> With pg_catalog tables.  But I’m not aware of anything that combines all
> object types into a single result.  Seems like an easy enough query to put
> together though.
>
> David J.
>
>


PostgreSQL equivalent of UTL_HTTP

2021-08-11 Thread aditya desai
Hi,
We are migrating Oracle to PostgreSQL. We need the equivalent of UTL_HTTP.
How to invoke Web service from PostgreSQL.

Also please let me know the PostgreSQL equivalents of below
Oracle utilities..

utl.logger,UTL_FILE,UTL_SMTP

Regards,
Aditya.


Query going to all partitions

2021-10-01 Thread aditya desai
Hi,
A query on one of our partitioned tables, which is range partitioned on the
"run_date" column, is going to all partitions despite having run_date in the
WHERE clause. "enable_partition_pruning" is also on. I am unable to generate
a query plan, as the query never runs fully even after waiting for, say, half an
hour.

We have composite indexes on run_date,status. Do I need to create an index
on run_date only?

Any other solutions?

Regards,
Aditya.
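
Plain EXPLAIN (without ANALYZE) only plans the statement, so it shows whether partitions are pruned even when execution never finishes. A minimal check along these lines (the table name and filter values are hypothetical stand-ins for the real query):

SET enable_partition_pruning = on;
EXPLAIN (COSTS OFF)
SELECT count(*)
FROM the_partitioned_table
WHERE run_date >= DATE '2021-09-01'
  AND run_date <  DATE '2021-10-01'
  AND status = 'FAILED';
-- pruning worked if only the matching partitions appear in the plan

Plan-time pruning generally requires the partition key column itself to be compared to constants or parameters; wrapping run_date in a function or casting it to another type in the WHERE clause prevents it.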


Re: Query going to all partitions

2021-10-01 Thread aditya desai
Will try to get a query in text format. It looks difficult though.

Regards,
Aditya.


On Fri, Oct 1, 2021 at 4:03 PM hubert depesz lubaczewski 
wrote:

> On Fri, Oct 01, 2021 at 02:24:11PM +0530, aditya desai wrote:
> > Hi Laurenz,
> > Please find attached explain query plan and query.
>
> Can you show us \d of the table, and exact query you ran?
>
> Also, please, don't send images. This is text, so you can copy-paste it
> directly into mail.
>
> Or, put it on some paste site - for explains, I suggest
> https://explain.depesz.com/
>
> It's impossible to select text from image. It's much harder to read (it
> doesn't help that it's not even screenshot, but, what looks like,
> a photo of screen ?!
>


Fwd: Query out of memory

2021-10-18 Thread aditya desai
Sending to a performance group instead of PLPGSQL.

.
.
Hi,
I am running the below query. The table has 21 million records. I get an out
of memory error after a while (from both pgAdmin and psql). Can someone review
the DB parameters given below?

select t.*,g.column,a.column from
gk_staging g, transaction t,account a
where
g.accountcode is not null AND
g.accountcode::text <> '' AND
length(g.accountcode)=13 AND
g.closeid::text=t.transactionid::text AND
substring(g.accountcode::text,8)=a.mask_code::text

Below are system parameters.
shared_buffers=3GB
work_mem=2GB
effective_cache_size=10GB
maintenance_work_mem=1GB
max_connections=250

I am unable to paste explain plan here due to security concerns.

Regards,
Aditya.


Re: Query out of memory

2021-10-19 Thread aditya desai
Hi Justin,
Out of memory on pgadmin and psql. I executed it with explain analyze.
Still going out of memory.

 Also currently 250 user connections are not being made. There are hardly
10 connections to database. When I run thi query it is going out of memory.

Also this query is part of a view that gets referred to in a procedure. The
transaction table is a partitioned table, but due to business requirements the
partition key is not part of the where clause.

Regards,
Aditya.

On Tuesday, October 19, 2021, Justin Pryzby  wrote:

> On Tue, Oct 19, 2021 at 11:28:46AM +0530, aditya desai wrote:
> > I am running the below query. Table has 21 million records. I get an Out
> Of
> > Memory error after a while.(from both pgadmin and psql). Can someone
> review
>
> Is the out of memory error on the client side ?
> Then you've simply returned more rows than the client can support.
>
> In that case, you can run it with "explain analyze" to prove that the
> server
> side can run the query.  That returns no data rows to the client, but
> shows the
> number of rows which would normally be returned.
>
> --
> Justin
>


Re: Query out of memory

2021-10-19 Thread aditya desai
Thanks Michael. I will check this further.

On Tue, Oct 19, 2021 at 7:09 PM Michael Lewis  wrote:

> Check explain plan, change work mem to 100MBs and then check explain plan
> again. If it changed, then try explain analyze.
>
> Work mem is a limit used per node in the plan, so especially with
> partitioned tables, that limit is way too high.
>
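
A minimal version of that experiment from a psql session (the value is illustrative, and the two selected columns are placeholders, as in the original query):

SET work_mem = '100MB';   -- session-local, does not change postgresql.conf
EXPLAIN
SELECT t.*, g.somecol, a.somecol
FROM gk_staging g
JOIN transaction t ON g.closeid::text = t.transactionid::text
JOIN account a     ON substring(g.accountcode::text, 8) = a.mask_code::text
WHERE g.accountcode IS NOT NULL
  AND g.accountcode::text <> ''
  AND length(g.accountcode) = 13;
-- if the plan changes (hash vs. merge/nested-loop nodes), re-run with EXPLAIN ANALYZE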


Out of memory error

2021-11-23 Thread aditya desai
Hi,
In a trigger function I am creating a temp table. When an update on a
table is executed for, say, 10k rows, I get the below error.

ERROR: out of shared memory
HINT: You might need to increase max_locks_per_transaction
CONTEXT: SQL Statement "created temp table changedinfo(colName
varchar(100), oldValue varchar(4000), newValue varchar(4000)

Current value of  max_locks_per_transaction is 64. Do I have to increase
this?

Regards,
Aditya.


Re: Out of memory error

2021-11-23 Thread aditya desai
Thanks Tom. However I could not find any solution to achieve the given
requirement. I have to take all the values in the temp table and assign them to
an array variable to pass to the audit procedure, as shown below. Can you
please advise?

CREATE OR REPLACE FUNCTION call_insert_info(

) RETURNS void AS $$
DECLARE
v_message r_log_message[];
OLDVALUE1 varchar(4000);
BEGIN
    drop table if exists changedinfo;
    create temp table changedinfo(colName varchar(100), oldValue varchar(4000), newValue varchar(4000));
    insert into changedinfo select 'empName', OLD.empName, NEW.empName from employee;
    insert into changedinfo select 'location', OLD.location, NEW.location from employee;

    v_message := array(select '(' || colName || ',' || oldValue || ',' || newValue || ')' from changedinfo);
perform insert_info(v_message);
raise notice '%',v_message;
END;
$$ LANGUAGE plpgsql;

Regards,
AD.


On Wed, Nov 24, 2021 at 11:22 AM Tom Lane  wrote:

> aditya desai  writes:
> > In a trigger function I am creating a temp table . When an update on a
> > table is executed for say 10k rows. I get the below error.
>
> > ERROR: out of shared memory
> > HINT:You might need to increase max_locks_per_transaction
> > CONTEXT: SQL Statement "created temp table changedinfo(colName
> > varchar(100), oldValue varchar(4000), newValue varchar(4000)
>
> [ raised eyebrow ... ]  If you are concerned about performance,
> I'd start by not creating a temp table per row of the outer update.
> That's costing probably 100x to 1000x as much as the row update itself.
>
> regards, tom lane
>


Re: Out of memory error

2021-11-23 Thread aditya desai
Ok. Let me try this. Thanks!!

On Wed, Nov 24, 2021 at 12:01 PM Thomas Kellerer  wrote:

> aditya desai schrieb am 24.11.2021 um 07:25:
> > Thanks Tom. However I could not find any solution to achieve the given
> requirement. I have to take all values in the temp table and assign it to
> an array variable to pass it to the audit procedure as shown below. Can you
> please advise ?
> >
> > CREATE OR REPLACE FUNCTION call_insert_info(
> >
> > ) RETURNS void AS $$
> > DECLARE
> > v_message r_log_message[];
> > OLDVALUE1 varchar(4000);
> > BEGIN
> > drop table if exists changedinfo
> > create temp table changedinfo(colName varchar(100), oldValue
> varchar(4000), newValue varchar(4000));
> > insert into changed infot select 'empName', OLD.empName,
> NEW.empName from employee;
> > insert into changed infot select 'location', OLD.location,
> NEW.location from employee;
> >
> >
> > v_message:=   array(select '(' || columname || ',' || oldvalue || ',' ||
> newvalue ||')' from changedinfo);
> > perform insert_info(v_message);
> > raise notice '%',v_message;
> > END;
> > $$ LANGUAGE plpgsql;
>
>
> You don't need a temp table for that. You can create the array directly
> from the new and old records:
>
> v_message := array[concat_ws(',', 'empName', old.empname,
> new.empname), concat_ws(',', 'location', old.location, new.location)];
>
> Although nowadays I would probably pass such an "structure" as JSON
> though, not as a comma separated list.
>
>
>
>


Re: Out of memory error

2021-11-23 Thread aditya desai
Hi Michael,
Please see the insert_info function below. Also, r_log_message is a composite
data type and its definition is given below.

CREATE OR REPLACE FUNCTION insert_info(
info_array  r_log_message[]
) RETURNS varchar AS $$
DECLARE
info_element  r_log_message;
BEGIN
FOREACH info_element IN ARRAY info_array
LOOP
INSERT INTO testaditya(
columname,
oldvalue,
newvalue
) VALUES(
info_element.column_name,
info_element.oldvalue,
info_element.newvalue
);
END LOOP;
RETURN 'OK';
END;
$$ LANGUAGE plpgsql;


postgres=# \d r_log_message;
 Composite type "public.r_log_message"
   Column|  Type   | Collation | Nullable | Default
-------------+-------------------------+-----------+----------+---------
 column_name | character varying(30)   |   |  |
 oldvalue| character varying(4000) |   |  |
 newvalue| character varying(4000) |   |  |

Regards,
Aditya.



On Wed, Nov 24, 2021 at 12:16 PM Michael Lewis  wrote:

> It seems like that function has some syntax errors, and also doesn't do
> what you want since I presume the "from employee" bit would mean you get
> many rows inserted into that temp table for all the existing data and not
> the one row you are operating on at the moment the trigger fires.
>
> It is worth noting also that if bulk operations are at all common for this
> table then writing this as an after statement trigger will likely be
> helpful for performance.
>
> For full context, we'd need to see how the function insert_info is defined.
>


Re: Out of memory error

2021-11-23 Thread aditya desai
Hi Thomas,
v_message is of composite data type r_log_message and its definition is as
shown below.

postgres=# \d r_log_message;
 Composite type "public.r_log_message"
   Column|  Type   | Collation | Nullable | Default
-------------+-------------------------+-----------+----------+---------
 column_name | character varying(30)   |   |  |
 oldvalue| character varying(4000) |   |  |
 newvalue| character varying(4000) |   |  |

Regards,
Aditya.

On Wed, Nov 24, 2021 at 12:01 PM Thomas Kellerer  wrote:

> aditya desai schrieb am 24.11.2021 um 07:25:
> > Thanks Tom. However I could not find any solution to achieve the given
> requirement. I have to take all values in the temp table and assign it to
> an array variable to pass it to the audit procedure as shown below. Can you
> please advise ?
> >
> > CREATE OR REPLACE FUNCTION call_insert_info(
> >
> > ) RETURNS void AS $$
> > DECLARE
> > v_message r_log_message[];
> > OLDVALUE1 varchar(4000);
> > BEGIN
> > drop table if exists changedinfo
> > create temp table changedinfo(colName varchar(100), oldValue
> varchar(4000), newValue varchar(4000));
> > insert into changed infot select 'empName', OLD.empName,
> NEW.empName from employee;
> > insert into changed infot select 'location', OLD.location,
> NEW.location from employee;
> >
> >
> > v_message:=   array(select '(' || columname || ',' || oldvalue || ',' ||
> newvalue ||')' from changedinfo);
> > perform insert_info(v_message);
> > raise notice '%',v_message;
> > END;
> > $$ LANGUAGE plpgsql;
>
>
> You don't need a temp table for that. You can create the array directly
> from the new and old records:
>
> v_message := array[concat_ws(',', 'empName', old.empname,
> new.empname), concat_ws(',', 'location', old.location, new.location)];
>
> Although nowadays I would probably pass such an "structure" as JSON
> though, not as a comma separated list.
>
>
>
>


Re: Out of memory error

2021-11-24 Thread aditya desai
Thanks Thomas! Sorry to say this, but this was migrated from Oracle to PG
:) and the app team just wants to keep the data type as it is :(
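
Keeping the r_log_message[] data type is fine; only the per-row temp table has to go. A sketch of the trigger wired up that way (the trigger and function names are made up, columns as in the thread):

CREATE OR REPLACE FUNCTION employee_audit_trg() RETURNS trigger AS $$
DECLARE
    v_message r_log_message[];
BEGIN
    -- build the array straight from OLD/NEW, no temp table needed
    v_message := ARRAY[
        ('empName',  OLD.empName,  NEW.empName)::r_log_message,
        ('location', OLD.location, NEW.location)::r_log_message
    ];
    PERFORM insert_info(v_message);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER employee_audit
AFTER UPDATE ON employee
FOR EACH ROW EXECUTE PROCEDURE employee_audit_trg();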

On Wed, Nov 24, 2021 at 5:40 PM Thomas Kellerer  wrote:

> aditya desai schrieb am 24.11.2021 um 08:35:
> > Hi Thomas,
> > v_message is of composite data type r_log_message and it's definition is
> as shown below.
> >
> > postgres=# \d r_log_message;
> >  Composite type "public.r_log_message"
> >Column|  Type   | Collation | Nullable | Default
> > -+-+---+--+-
> >  column_name | character varying(30)   |   |  |
> >  oldvalue| character varying(4000) |   |  |
> >  newvalue| character varying(4000) |   |  |
> >
> > Regards,
> > Aditya.
>
> Sorry, didn't see that.
>
> Then you need to create records of that type in the array:
>
>v_message := array[('empName', old.empname,
> new.empname)::r_log_message, ('location', old.location,
> new.location)::r_log_message];
>
> or an array of that type:
>
>v_message := array[('empName', old.empname, new.empname), ('location',
> old.location, new.location)]::r_log_message[];
>
>
> Btw: why don't you use `text` instead of varchar(4000).
>
>
>