Re: dexter on AWS RDS auto tune queries

2021-06-07 Thread Ayub Khan
Thank you @Julien

@Christophe: yes, I am using RDS Performance Insights; however, it would
be more helpful if it could give more info about why the queries are slow
and what improvements could be made to the queries themselves.

I am using pgMustard to analyze a slow query, and its suggestion is to
create an index on app2.user_id; however, app2.user_id is already the
primary key.

Below is the query and its EXPLAIN ANALYZE output:

select * from (
  SELECT
    act.*,
    app1.user_name AS created_by_username,
    app2.user_name AS last_updated_by_username
  FROM
    account_transactions AS act
    LEFT OUTER JOIN app_user AS app1 ON app1.user_id = act.created_by
    LEFT OUTER JOIN app_user AS app2 ON app2.user_id = act.last_updated_by
  WHERE act.is_deleted = 'false'
    AND act.CREATION_DATE BETWEEN TO_DATE('06/06/2021', 'DD-MM-YYYY')
                              AND TO_DATE('07-06-2021', 'DD-MM-YYYY')
  ORDER BY act.ID DESC
) as items order by id desc


Sort  (cost=488871.14..489914.69 rows=417420 width=270) (actual time=2965.815..2979.921 rows=118040 loops=1)
  Sort Key: act.id DESC
  Sort Method: quicksort  Memory: 57607kB
  ->  Merge Left Join  (cost=422961.21..449902.61 rows=417420 width=270) (actual time=2120.021..2884.484 rows=118040 loops=1)
        Merge Cond: (act.last_updated_by = ((app2.user_id)::numeric))
        ->  Sort  (cost=7293.98..7301.62 rows=3054 width=257) (actual time=464.243..481.292 rows=118040 loops=1)
              Sort Key: act.last_updated_by
              Sort Method: quicksort  Memory: 50899kB
              ->  Nested Loop Left Join  (cost=0.87..7117.21 rows=3054 width=257) (actual time=0.307..316.148 rows=118040 loops=1)
                    ->  Index Scan using creation_date on account_transactions act  (cost=0.44..192.55 rows=3054 width=244) (actual time=0.295..67.330 rows=118040 loops=1)
                          Index Cond: ((creation_date >= to_date('06/06/2021'::text, 'DD-MM-YYYY'::text)) AND (creation_date <= to_date('07-06-2021'::text, 'DD-MM-YYYY'::text)))
                          Filter: ((is_deleted)::text = 'false'::text)
                    ->  Index Scan using app_user_pk on app_user app1  (cost=0.43..2.27 rows=1 width=21) (actual time=0.002..0.002 rows=1 loops=118040)
                          Index Cond: (user_id = act.created_by)
        ->  Sort  (cost=415667.22..423248.65 rows=3032573 width=21) (actual time=1655.748..1876.596 rows=3079326 loops=1)
              Sort Key: ((app2.user_id)::numeric)
              Sort Method: quicksort  Memory: 335248kB
              ->  Seq Scan on app_user app2  (cost=0.00..89178.73 rows=3032573 width=21) (actual time=0.013..575.630 rows=3032702 loops=1)
Planning Time: 2.222 ms
Execution Time: 3009.387 ms
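
An aside on the query shape itself, separate from the join problem diagnosed
later in the thread: the outer "select * from (...) order by id desc" wrapper
only repeats the inner ORDER BY and can be dropped, and if creation_date is a
timestamp, BETWEEN two TO_DATE() results stops at exactly midnight on June 7.
A sketch of the usual half-open form, assuming the intent is to cover all of
both days:

SELECT act.*,
       app1.user_name AS created_by_username,
       app2.user_name AS last_updated_by_username
  FROM account_transactions AS act
  LEFT OUTER JOIN app_user AS app1 ON app1.user_id = act.created_by
  LEFT OUTER JOIN app_user AS app2 ON app2.user_id = act.last_updated_by
 WHERE act.is_deleted = 'false'
   AND act.creation_date >= DATE '2021-06-06'   -- inclusive lower bound
   AND act.creation_date <  DATE '2021-06-08'   -- exclusive upper bound
 ORDER BY act.id DESC;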


On Mon, Jun 7, 2021 at 8:00 AM Christophe Pettus  wrote:

>
>
> > On Jun 6, 2021, at 21:51, Ayub Khan  wrote:
> > Other than Dexter, is there an auto-tune or query performance indicator
> > for postgres?
>
> Generally, auto-creating indexes isn't a great idea.  I respect the work
> that went into Dexter, but it's much better to find the queries and study
> them, then decide if index creation is the right thing.
>
> RDS has Performance Insights, which is a very useful tool for finding
> where the load on your server is actually coming from.
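
For the "which tools" part of the question, one building block that does work
on RDS is pg_stat_statements. A minimal sketch for finding the heaviest
statements, assuming the extension is already enabled (the column names below
are the PostgreSQL 13+ ones; older releases call them total_time and
mean_time):

-- Top 10 statements by cumulative execution time
SELECT queryid,
       calls,
       round(total_exec_time::numeric, 1) AS total_ms,
       round(mean_exec_time::numeric, 2)  AS mean_ms,
       left(query, 60)                    AS query_start
  FROM pg_stat_statements
 ORDER BY total_exec_time DESC
 LIMIT 10;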



-- 

Sun Certified Enterprise Architect 1.5
Sun Certified Java Programmer 1.4
Microsoft Certified Systems Engineer 2000
http://in.linkedin.com/pub/ayub-khan/a/811/b81
mobile:+966-502674604
--
It is proved that Hard Work and knowledge will get you close but attitude
will get you there. However, it's the Love
of God that will put you over the top!!


Re: dexter on AWS RDS auto tune queries

2021-06-07 Thread Julien Rouhaud
Please don't top post here.

On Mon, Jun 7, 2021 at 3:50 PM Ayub Khan  wrote:
>
> @Christophe: yes, I am using RDS Performance Insights; however, it would be
> more helpful if it could give more info about why the queries are slow and
> what improvements could be made to the queries themselves.
> [...]
>     Merge Cond: (act.last_updated_by = ((app2.user_id)::numeric))
> [...]
>     ->  Sort  (cost=415667.22..423248.65 rows=3032573 width=21) (actual time=1655.748..1876.596 rows=3079326 loops=1)
>           Sort Key: ((app2.user_id)::numeric)
>           Sort Method: quicksort  Memory: 335248kB
>           ->  Seq Scan on app_user app2  (cost=0.00..89178.73 rows=3032573 width=21) (actual time=0.013..575.630 rows=3032702 loops=1)
> Planning Time: 2.222 ms
> Execution Time: 3009.387 ms

I'd say that your problem is that account_transactions.last_updated_by is
numeric (which seems like a terrible idea) while app_user.user_id is not,
so the index can't be used.  Some extensions could detect that, but you
won't be able to install them on RDS.
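
A minimal sketch of the corresponding check and fix, using the column names
from the plan above; the target type bigint is an assumption (it must match
whatever app_user.user_id actually is), and the ALTER rewrites the table, so
it needs a maintenance window:

-- Confirm the type mismatch
SELECT table_name, column_name, data_type
  FROM information_schema.columns
 WHERE (table_name, column_name) IN
       (('account_transactions', 'last_updated_by'), ('app_user', 'user_id'));

-- Align the column type so the join no longer has to cast user_id to numeric
ALTER TABLE account_transactions
  ALTER COLUMN last_updated_by TYPE bigint
  USING last_updated_by::bigint;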




Re: dexter on AWS RDS auto tune queries

2021-06-07 Thread Ayub Khan
Julien,

Thank you for the pointer. I will change the data type and verify the query
again.

-Ayub

On Mon, Jun 7, 2021 at 7:51 AM Ayub Khan  wrote:

>
> Other than Dexter, is there an auto-tune or query performance indicator
> for postgres?
> Also, which are the most commonly used monitoring tools (slow query, CPU,
> index creation for missing indexes) for postgres?
>
> --Ayub
>


-- 

Sun Certified Enterprise Architect 1.5
Sun Certified Java Programmer 1.4
Microsoft Certified Systems Engineer 2000
http://in.linkedin.com/pub/ayub-khan/a/811/b81
mobile:+966-502674604
--
It is proved that Hard Work and knowledge will get you close but attitude
will get you there. However, it's the Love
of God that will put you over the top!!


Re: AWS forcing PG upgrade from v9.6 a disaster

2021-06-07 Thread Andrew Dunstan


On 6/6/21 7:49 PM, Dean Gibson (DB Administrator) wrote:
> On 2021-05-29 13:35, Andrew Dunstan wrote:
>> On 5/29/21 3:59 PM, Dean Gibson (DB Administrator) wrote:
>>> Meanwhile, I've been doing some checking.  If I remove "CAST(
>>> license_status AS CHAR ) = 'A'", the problem disappears.  Changing the
>>> JOIN to a RIGHT JOIN, & replacing WHERE with ON, also "solves" the
>>> problem, but there is an extra row where license_status is NULL, due
>>> to the RIGHT JOIN.  Currently trying to figure that out (why did the
>>> CAST ... match 'A', if it is null?)...
>> Why are you using this expression? It's something you almost never want
>> to do in my experience. Why not use the substr() function to get the
>> first character?
>>
>
> Although it doesn't matter in this case, I do it because in general,
> it changes the type of the value from CHAR to bptext or whatever it
> is, & that has caused comparison issues in the past.  It's just a
> matter of habit for me when working with CHAR() types.
>
> But in this case, where it doesn't matter, I'd use LEFT().



That raises the issue of why you're using CHAR(n) fields. Just about
every consultant I know advises simply avoiding them. :-)
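
A minimal illustration of the bpchar quirks in question, with made-up values
rather than the FCC schema (the char(1) result of CAST(x AS CHAR) is also why
the license_status comparison upthread works at all):

SELECT 'A'::char(3) = 'A  ';     -- true: trailing blanks ignored in comparison
SELECT length('A  '::char(3));   -- 1: the padding is invisible here
SELECT 'A'::char(3) || '|';      -- 'A|': padding dropped on the cast to text

-- CAST(x AS CHAR) means char(1), i.e. keep only the first character;
-- substr()/left() say the same thing without changing the type:
SELECT CAST('Active' AS CHAR), substr('Active', 1, 1), left('Active', 1);

-- and a NULL status never matches, which bears on the RIGHT JOIN question:
SELECT CAST(NULL AS CHAR) = 'A';  -- NULL, not true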


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: AWS forcing PG upgrade from v9.6 a disaster

2021-06-07 Thread Dean Gibson (DB Administrator)

On 2021-06-07 04:52, Andrew Dunstan wrote:

> [...]
>
> That raises the issue of why you're using CHAR(n) fields. Just about
> every consultant I know advises simply avoiding them. :-)


As I mentioned earlier, both the data & the table definitions come from
the FCC, the latter in the form of text files containing their formal
SQL definitions.  These often change (like two weeks ago).  There are 18
tables currently of interest to me, with between 30 & 60 fields in each
table.  Further, the entire data set is replaced every Sunday, with
daily updates during the week.  About 1/6th of the text fields are
defined as VARCHAR; the rest are CHAR.  All of the text fields that are
used as indexes are CHAR.


Since trailing blanks are significant in CHAR fields, I find it easier
to keep the original FCC table definitions & remap them to VIEWs
containing the fields I am interested in.  I've been doing this with the
FCC data for over 15 years, starting with PostgreSQL 7.3.
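
A sketch of that pattern with a made-up table (the real FCC definitions and
field names differ): keep the imported CHAR columns exactly as the FCC
defines them, and do the normalization once, in the view:

CREATE TABLE fcc_en_raw (
    call_sign       char(10),
    license_status  char(1),
    entity_name     varchar(200)
);

CREATE VIEW fcc_en AS
SELECT call_sign::text      AS call_sign,       -- cast to text drops padding
       license_status::text AS license_status,
       entity_name
  FROM fcc_en_raw;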


As far as needing a consultant in DB design goes, the FCC is planning a
new DB architecture "soon", & they sorely need one.  When they export
the data to the public (delimited by "|"), they don't escape some
characters like "|", "\", & .  That makes it fun ...


-- Dean