Postgres using the wrong index index

2021-08-11 Thread Matt Dupree
I've created a partial index that I expect the query planner to use in
executing a query, but it's using another index instead. Using this other
partial index results in a slower query. I'd really appreciate some help
understanding why this is occurring. Thanks in advance!

*Postgres Version*

PostgreSQL 12.7 (Ubuntu 12.7-1.pgdg20.04+1) on x86_64-pc-linux-gnu,
compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit

*Problem Description*

Here's the index I expect the planner to use:

CREATE INDEX other_events_1004175222_pim_evdef_67951aef14bc_idx ON
public.other_events_1004175222 USING btree ("time", user_id) WHERE (
(user_id <= '(1080212440,9007199254740991)'::app_user_id) AND
(user_id >= '(1080212440,0)'::app_user_id) AND
(
(
(type = 'click'::text) AND (library = 'web'::text) AND
(strpos(hierarchy, '#close_onborading;'::text)
<> 0) AND (object IS NULL)
) OR
(
(type = 'click'::text) AND (library = 'web'::text) AND
(strpos(hierarchy,
'#proceedOnboarding;'::text) <> 0) AND (object IS NULL)
)
)
);


Here's the query:

EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT user_id,
   "time",
   0 AS event,
   session_id
FROM test_yasp_events_exp_1004175222
WHERE ((test_yasp_events_exp_1004175222.user_id >=
'(1080212440,0)'::app_user_id) AND
   (test_yasp_events_exp_1004175222.user_id <=
'(1080212440,9007199254740991)'::app_user_id) AND
   ("time" >=
'162477720'::bigint) AND
   ("time" <=
'162736920'::bigint) AND (
   (
   (type = 'click'::text) AND
   (library = 'web'::text) AND
   (strpos(hierarchy, '#close_onborading;'::text) <>
0) AND
   (object IS NULL)) OR
   (
   (type = 'click'::text) AND
   (library = 'web'::text) AND
   (strpos(hierarchy,
   '#proceedOnboarding;'::text) <>
0) AND (object IS NULL


Here's the plan: https://explain.depesz.com/s/uNGg

Note that the index being used is
other_events_1004175222_pim_core_custom_2_8e65d072fbdd_idx, which is
defined this way:

CREATE INDEX other_events_1004175222_pim_core_custom_2_8e65d072fbdd_idx
ON public.other_events_1004175222 USING btree (type, "time", user_id)
WHERE (
(type IS NOT NULL) AND (object IS NULL) AND
((user_id >= '(1080212440,0)'::app_user_id) AND (user_id <=
'(1080212440,9007199254740991)'::app_user_id)))

You can view the definition of test_yasp_events_exp_1004175222 here
. Note the child tables,
other_events_1004175222, pageviews_1004175222, and sessions_1004175222
which have the following constraints:

other_events_1004175222: CHECK (object IS NULL)
pageviews_1004175222: CHECK (object IS NOT NULL AND object = 'pageview'::text)
sessions_1004175222: CHECK (object IS NOT NULL AND object = 'session'::text)

Also note that these child tables have 100s of partial indexes. You
can find history on why we have things set up this way here
.

Here's the table metadata for other_events_1004175222:

SELECT relname,
   relpages,
   reltuples,
   relallvisible,
   relkind,
   relnatts,
   relhassubclass,
   reloptions,
   pg_table_size(oid)
FROM pg_class
WHERE relname = 'other_events_1004175222';

Results:

[image: image.png]

-- 

K. Matt Dupree

Data Science Engineer
321.754.0526  |  [email protected]


Re: Postgres using the wrong index index

2021-08-11 Thread Justin Pryzby
On Tue, Aug 10, 2021 at 12:47:20PM -0400, Matt Dupree wrote:
> Here's the plan: https://explain.depesz.com/s/uNGg
> 
> Note that the index being used is

Could you show the plan if you force use of the intended index ?
For example by doing begin; DROP INDEX indexbeingused; explain thequery; 
rollback;
Or: begin; UPDATE pg_index SET indisvalid=false WHERE 
indexrelid='indexbeingused'::regclass explain thequery; rollback;

Could you show the table statistics for the time, user_id, and type columns on
all 4 tables ?
| SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename, 
attname, inherited, null_frac, n_distinct, array_length(most_common_vals,1) 
n_mcv, array_length(histogram_bounds,1) n_hist, correlation FROM pg_stats WHERE 
attname='...' AND tablename='...' ORDER BY 1 DESC; 

It might be interesting to see both query plans when index scans are disabled
and bitmap scan are used instead (this might be as simple as begin; SET LOCAL
enable_indexscan=off ...; rollback;);

> Also note that these child tables have 100s of partial indexes. You
> can find history on why we have things set up this way here
> .

I have read it before :)

> SELECT relname, relpages, reltuples, relallvisible, pg_table_size(oid)
> FROM pg_class WHERE relname = 'other_events_1004175222';

Could you also show the table stats for the two indexes ?

One problem is that the rowcount estimate is badly off:
| Index Scan using other_events_1004175222_pim_core_custom_2_8e65d072fbdd_idx 
on public.other_events_1004175222 (cost=0.57..1,213,327.64 rows=1,854,125 
width=32) (actual time=450.588..29,057.269 rows=23 loops=1) 

To my eyes, this looks like a typo ; it's used in the index predicate as well
as the query, but maybe it's still relevant ?
| #close_onborading

-- 
Justin




difference between pg_triggers and information_schema.triggers

2021-08-11 Thread aditya desai
Hi All,
What is the difference between pg_triggers and information_schema.triggers?
I want to list all triggers in the database.

The count differs in both.

select count(1) from information_schema.triggers  -55
select count(1) from pg_trigger - 48

What is the best way to list all objects in PostgreSQL?(similar to
all_objects in Oracle).


Regards,
Aditya.


Re: difference between pg_triggers and information_schema.triggers

2021-08-11 Thread David G. Johnston
On Wednesday, August 11, 2021, aditya desai  wrote:

> Hi All,
> What is the difference between pg_triggers and
> information_schema.triggers? I want to list all triggers in the database.
>

Read the docs for information_schema.triggers.


> What is the best way to list all objects in PostgreSQL?(similar to
> all_objects in Oracle).
>
>
With pg_catalog tables.  But I’m not aware of anything that combines all
object types into a single result.  Seems like an easy enough query to put
together though.

David J.


Re: difference between pg_triggers and information_schema.triggers

2021-08-11 Thread aditya desai
Seems like multiple entries in information_schema.triggers for
INSERT/UPDATE/DELETE. Understood thanks.

postgres=# select tgname,tgtype  from pg_trigger;
   tgname   | tgtype
+
 insert_empployee   | 31
 insert_empployee_1 | 31
(2 rows)


postgres=# select tgname  from pg_trigger;
   tgname

 insert_empployee
 insert_empployee_1
(2 rows)


postgres=# select trigger_name,event_manipulation from
information_schema.triggers;
trigger_name| event_manipulation
+
 insert_empployee   | INSERT
 insert_empployee   | DELETE
 insert_empployee   | UPDATE
 insert_empployee_1 | INSERT
 insert_empployee_1 | DELETE
 insert_empployee_1 | UPDATE
(6 rows)

Regards,
Aditya.

On Thu, Aug 12, 2021 at 12:07 AM David G. Johnston <
[email protected]> wrote:

> On Wednesday, August 11, 2021, aditya desai  wrote:
>
>> Hi All,
>> What is the difference between pg_triggers and
>> information_schema.triggers? I want to list all triggers in the database.
>>
>
> Read the docs for information_schema.triggers.
>
>
>> What is the best way to list all objects in PostgreSQL?(similar to
>> all_objects in Oracle).
>>
>>
> With pg_catalog tables.  But I’m not aware of anything that combines all
> object types into a single result.  Seems like an easy enough query to put
> together though.
>
> David J.
>
>


PostgreSQL equivalent of UTL_HTTP

2021-08-11 Thread aditya desai
Hi,
We are migrating Oracle to PostgreSQL. We need the equivalent of UTL_HTTP.
How to invoke Web service from PostgreSQL.

Also please let me know the PostgreSQL equivalents of below
Oracle utilities..

utl.logger,UTL_FILE,UTL_SMTP

Regards,
Aditya.


Re: PostgreSQL equivalent of UTL_HTTP

2021-08-11 Thread Pavel Stehule
Hi

st 11. 8. 2021 v 20:57 odesílatel aditya desai  napsal:

> Hi,
> We are migrating Oracle to PostgreSQL. We need the equivalent of UTL_HTTP.
> How to invoke Web service from PostgreSQL.
>
> Also please let me know the PostgreSQL equivalents of below
> Oracle utilities..
>
> utl.logger,UTL_FILE,UTL_SMTP
>

you can use extensions https://github.com/pramsey/pgsql-http or
https://github.com/RekGRpth/pg_curl

You can use an routines in untrusted PLPerl or untrusted PLPython, but
these routines can be really unsafe (due possibility to break signal
handling).

Personally, I think using http access in stored procedures is a very bad
idea - access from transactional to non-transactional (and possibly pretty
slow) environments creates a lot of ugly problems. Stored procedures are
great technology with a pretty bad reputation, and one reason why is usage
of this technology for bad cases.

I think this mailing list is wrong for this kind of question. There is no
relation to performance.

Regards

Pavel





> Regards,
> Aditya.
>
>
>


Re: Postgres using the wrong index index

2021-08-11 Thread Mladen Gogala
You know that you can use pg_hint_plan extension? That way you don't 
have to disable indexes or set session parameters.


Regards

On 8/11/21 3:56 PM, Matt Dupree wrote:

Thanks for your response, Justin!

Here's  the plan if we disable the 
custom_2 index. It uses the index I expect and it's much faster.


Here's  a plan if we disable index 
scans. It uses both indexes and is much faster.



--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com



Re: Postgres using the wrong index index

2021-08-11 Thread Justin Pryzby
The rowcount estimate for the time column is bad for all these plans - do you
know why ?  You're using inheritence - have you analyzed the parent tables
recently ?

| Index Scan using other_events_1004175222_pim_evdef_67951aef14bc_idx on 
public.other_events_1004175222 (cost=0.28..1,648,877.92 rows=1,858,891 
width=32) (actual time=1.008..15.245 rows=23 loops=1)
|Index Cond: ((other_events_1004175222."time" >= '162477720'::bigint) 
AND (other_events_1004175222."time" <= '162736920'::bigint))

-- 
Justin