RE: select query does not pick up the right index
-Original Message- From: David Rowley Sent: 03 January 2019 22:42 To: Abadie Lana Cc: [email protected] Subject: Re: select query does not pick up the right index On Fri, 4 Jan 2019 at 02:20, Abadie Lana wrote: > > From: David Rowley > > Sent: 03 January 2019 14:01 > Right, so you need to check your indexes on sample_ctrl_year and > sample_buil_year. You need an index on (channel_id, smpl_time) on those. > These indexes exist already That's interesting. The \d output indicates that the indexes are not INVALID, so it's not all that obvious why the planner would choose a lesser index to provide the required rows. One thought is that the more suitable index is very bloated. This would increase the estimated cost of scanning the index and reduce the chances of the index being selected by the query planner. If you execute: select indrelid::regclass as table_name, indexrelid::Regclass as index_name,pg_size_pretty(pg_relation_size(indrelid)) table_size,pg_size_pretty(pg_relation_size(indexrelid)) index_size from pg_index where indrelid in('sample_ctrl_year'::regclass, 'sample_buil_year'::regclass) order by indrelid::regclass::name, indexrelid::regclass::name; This should show you the size of the tables and indexes in question. If the sample_time_cy_idx and sample_time_by_idx indexes are very large when compared with the size of their table, then it is likely worth building a new index for these then dropping the old index then retrying the re-written version of the query. If this is a live system then you can build the new indexes by using the CREATE INDEX CONCURRENTLY command. This will allow other DML operations to work without being blocked. The old indexes can then be dropped with DROP INDEX CONCURRENTLY. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services Here the result...For me it does not sound that it is bloated...Also still a mystery why wrong indexes are picked up for buil and ctrl and not for util... select indrelid::regclass as table_name, indexrelid::Regclass as index_name,pg_size_pretty(pg_relation_size(indrelid)) table_size,pg_size_pretty(pg_relation_size(indexrelid)) index_size from pg_index where indrelid in('sample_ctrl_year'::regclass, 'sample_buil_year'::regclass,'sample_util_year'::regclass) order by indrelid::regclass::name, indexrelid::regclass::name; table_name| index_name | table_size | index_size --+-++ sample_buil_year | sample_time_by_idx | 4492 MB| 1522 MB sample_buil_year | sample_time_yb1_idx | 4492 MB| 1522 MB sample_buil_year | smpl_time_bx2_idx | 4492 MB| 1084 MB sample_ctrl_year | sample_time_cy_idx | 7065 MB| 2394 MB sample_ctrl_year | sample_time_yc1_idx | 7065 MB| 2394 MB sample_ctrl_year | smpl_time_cmx2_idx | 7065 MB| 1705 MB sample_util_year | sample_time_uy_idx | 7140 MB| 2426 MB sample_util_year | sample_time_yu1_idx | 7140 MB| 2426 MB sample_util_year | smpl_time_ux2_idx | 7140 MB| 1727 MB (9 rows) I have recreated the indexes for sample_ctrl_year and sample_buil_year and same index size. I rerun the query... and still the same plan execution as previously sent Thanks for your support...One thing I spot is the I/O on this machine is rather slow... the very first time I run this query it will take Execution time: 247503.006 ms ( I can see that postgres process is in state D and low CPU...,using iotop I can see I/O read speed cannot go beyond 20MB/sec. The second time I run the query, the CPU goes up to 100%, no D state).
RE: select query does not pick up the right index
-Original Message- From: Justin Pryzby Sent: 04 January 2019 00:48 To: Abadie Lana Cc: David Rowley ; [email protected] Subject: Re: select query does not pick up the right index On Thu, Jan 03, 2019 at 12:57:27PM +, Abadie Lana wrote: > Main parameters : effective_cache_size : 4GB, shared_buffers 4GB, > work_mem 4MB I doubt it will help much, but you should consider increasing work_mem, unless you have many expensive queries running at once. Could you also send the rest of the pg_statistic for that table ? https://wiki.postgresql.org/wiki/Slow_Query_Questions#Statistics:_n_distinct.2C_MCV.2C_histogram SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename, attname, null_frac, n_distinct, array_length(most_common_vals,1) n_mcv, array_length(histogram_bounds,1) n_hist FROM pg_stats WHERE attname='...' AND tablename='...' ORDER BY 1 DESC; Hmm. Is it normal that the couple (tablename,attname ) is not unique? I'm surprised to see sample_{ctrl,util,buil} quoted twice css_archive_3_0_0=# SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename, attname, null_frac, n_distinct, array_length(most_common_vals,1) n_mcv, array_length(histogram_bounds,1) n_hist FROM pg_stats WHERE attname='channel_id' AND tablename like 'sample%' ORDER BY 1 DESC; frac_mcv | tablename | attname | null_frac | n_distinct | n_mcv | n_hist --+---++---++---+ 1 | sample_buil_year | channel_id | 0 | 16 |16 | 0.98249 | sample_ctrl | channel_id | 0 | 26 |17 | 9 0.982333 | sample_ctrl_month | channel_id | 0 | 34 |17 | 17 0.981533 | sample_ctrl | channel_id | 0 | 28 |18 | 10 0.9371 | sample_ctrl_year | channel_id | 0 | 38 |16 | 22 0.928767 | sample_buil_month | channel_id | 0 |940 |54 | 101 0.92535 | sample| channel_id | 0 | 2144 | 167 | 1001 0.907501 | sample_buil | channel_id | 0 |565 |43 | 101 0.8876 | sample_util_year | channel_id | 0 |501 |45 | 101 0.815 | sample_util | channel_id | 0 |557 |82 | 101 0.807667 | sample_buil | channel_id | 0 |164 |31 | 101 0.806267 | sample_util | channel_id | 0 |732 | 100 | 101 0.803766 | sample_util_month | channel_id | 0 |731 | 100 | 101 (13 rows) Ah...sample_ctrl_year and sample_buil_year have n_distinct -1? Unlike sample_util_year. Could that explain the wrong choice? SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename, attname, null_frac, n_distinct, array_length(most_common_vals,1) n_mcv, array_length(histogram_bounds,1) n_hist FROM pg_stats WHERE attname='smpl_time' AND tablename like 'sample%' ORDER BY 1 DESC; frac_mcv | tablename | attname | null_frac | n_distinct | n_mcv | n_hist +---+---+---+-+---+ | sample_ctrl_month | smpl_time | 0 | -1 | | 101 | sample_ctrl_year | smpl_time | 0 | -1 | | 101 | sample_ctrl | smpl_time | 0 | -1 | | 101 | sample_ctrl | smpl_time | 0 | -1 | | 101 | sample_buil_year | smpl_time | 0 | -1 | | 101 0.0154667 | sample_buil_month | smpl_time | 0 | 1.03857e+06 | 100 | 101 0.0154523 | sample_buil | smpl_time | 0 | 854250 | 100 | 101 0.0115 | sample_util | smpl_time | 0 | 405269 | 100 | 101 0.0112333 | sample_util | smpl_time | 0 | 537030 | 100 | 101 0.0106667 | sample_util_month | smpl_time | 0 | 539001 | 100 | 101 0.00946667 | sample_buil | smpl_time | 0 | -0.328554 | 100 | 101 0.00852342 | sample| smpl_time | 0 | 1.5125e+07 | 1000 | 1001 0.00780001 | sample_util_year | smpl_time | 0 | 1.73199e+06 | 100 | 101 (13 rows)
RE: select query does not pick up the right index
-Original Message- From: [email protected] On Behalf Of Abadie Lana Sent: 04 January 2019 09:18 To: Justin Pryzby Cc: David Rowley ; [email protected] Subject: [Possible Spoof] RE: select query does not pick up the right index Warning: This message was sent by [email protected] supposedly on behalf of Abadie Lana . Please contact -Original Message- From: Justin Pryzby Sent: 04 January 2019 00:48 To: Abadie Lana Cc: David Rowley ; [email protected] Subject: Re: select query does not pick up the right index On Thu, Jan 03, 2019 at 12:57:27PM +, Abadie Lana wrote: > Main parameters : effective_cache_size : 4GB, shared_buffers 4GB, > work_mem 4MB I doubt it will help much, but you should consider increasing work_mem, unless you have many expensive queries running at once. Could you also send the rest of the pg_statistic for that table ? https://wiki.postgresql.org/wiki/Slow_Query_Questions#Statistics:_n_distinct.2C_MCV.2C_histogram SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename, attname, null_frac, n_distinct, array_length(most_common_vals,1) n_mcv, array_length(histogram_bounds,1) n_hist FROM pg_stats WHERE attname='...' AND tablename='...' ORDER BY 1 DESC; Hmm. Is it normal that the couple (tablename,attname ) is not unique? I'm surprised to see sample_{ctrl,util,buil} quoted twice css_archive_3_0_0=# SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename, attname, null_frac, n_distinct, array_length(most_common_vals,1) n_mcv, array_length(histogram_bounds,1) n_hist FROM pg_stats WHERE attname='channel_id' AND tablename like 'sample%' ORDER BY 1 DESC; frac_mcv | tablename | attname | null_frac | n_distinct | n_mcv | n_hist --+---++---++---+ 1 | sample_buil_year | channel_id | 0 | 16 |16 | 0.98249 | sample_ctrl | channel_id | 0 | 26 |17 | 9 0.982333 | sample_ctrl_month | channel_id | 0 | 34 |17 | 17 0.981533 | sample_ctrl | channel_id | 0 | 28 |18 | 10 0.9371 | sample_ctrl_year | channel_id | 0 | 38 |16 | 22 0.928767 | sample_buil_month | channel_id | 0 |940 |54 | 101 0.92535 | sample| channel_id | 0 | 2144 | 167 | 1001 0.907501 | sample_buil | channel_id | 0 |565 |43 | 101 0.8876 | sample_util_year | channel_id | 0 |501 |45 | 101 0.815 | sample_util | channel_id | 0 |557 |82 | 101 0.807667 | sample_buil | channel_id | 0 |164 |31 | 101 0.806267 | sample_util | channel_id | 0 |732 | 100 | 101 0.803766 | sample_util_month | channel_id | 0 |731 | 100 | 101 (13 rows) Ah...sample_ctrl_year and sample_buil_year have n_distinct -1? Unlike sample_util_year. Could that explain the wrong choice? SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename, attname, null_frac, n_distinct, array_length(most_common_vals,1) n_mcv, array_length(histogram_bounds,1) n_hist FROM pg_stats WHERE attname='smpl_time' AND tablename like 'sample%' ORDER BY 1 DESC; frac_mcv | tablename | attname | null_frac | n_distinct | n_mcv | n_hist +---+---+---+-+---+ | sample_ctrl_month | smpl_time | 0 | -1 | | 101 | sample_ctrl_year | smpl_time | 0 | -1 | | 101 | sample_ctrl | smpl_time | 0 | -1 | | 101 | sample_ctrl | smpl_time | 0 | -1 | | 101 | sample_buil_year | smpl_time | 0 | -1 | | 101 0.0154667 | sample_buil_month | smpl_time | 0 | 1.03857e+06 | 100 | 101 0.0154523 | sample_buil | smpl_time | 0 | 854250 | 100 | 101 0.0115 | sample_util | smpl_time | 0 | 405269 | 100 | 101 0.0112333 | sample_util | smpl_time | 0 | 537030 | 100 | 101 0.0106667 | sample_util_month | smpl_time | 0 | 539001 | 100 | 101 0.00946667 | sample_buil | smpl_time | 0 | -0.328554 | 100 | 101 0.00852342 | sample| smpl_time | 0 | 1.5125e+07 | 1000 | 1001 0.00780001 | sample_util_year | smpl_time | 0 | 1.73199e+06 | 100 | 101 (13 rows) Based on your feedback...i rerun analyse directly on the two table sample_ctrl_year and sample_buil_year The new values are SELECT (SELECT sum(x) FROM unnest(most_commo
Re: select query does not pick up the right index
On Fri, Jan 04, 2019 at 08:58:57AM +, Abadie Lana wrote:
> SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename,
> attname, null_frac, n_distinct, array_length(most_common_vals,1) n_mcv,
> array_length(histogram_bounds,1) n_hist FROM pg_stats WHERE attname='...' AND
> tablename='...' ORDER BY 1 DESC;
>
> Hmm. Is it normal that the couple (tablename,attname ) is not unique? I'm
> surprised to see sample_{ctrl,util,buil} quoted twice
One of the rows is for "inherited stats" (including child tables) stats and one
is "noninherited stats".
The unique index on the table behind that view is:
"pg_statistic_relid_att_inh_index" UNIQUE, btree (starelid, staattnum,
stainherit)
On the wiki, I added inherited and correlation columns. Would you rerun that
query ?
https://wiki.postgresql.org/wiki/Slow_Query_Questions#Statistics:_n_distinct.2C_MCV.2C_histogram
I'm also interested to see \d and channel_id statistics for the channel table.
> explain (analyze, buffers) select
> 'BUIL-B36-VA-RT-RT1:CL0001-2-ABW',c.smpl_time,c.nanosecs,c.float_val,c.num_val,c.str_val,c.datatype,c.array_val
> from sample c WHERE c.channel_id = (SELECT channel_id FROM channel WHERE
> name='BUIL-B36-VA-RT-RT1:CL0001-2-ABW') order by c.smpl_time desc limit 5;
You originally wrote this as a implicit comma join. Does the original query
still have an issue ? The =(subselect query) doesn't allow the planner to
optimize for the given channel, which seems to be a fundamental problem.
On Fri, Jan 04, 2019 at 08:58:57AM +, Abadie Lana wrote:
> Based on your feedback...i rerun analyse directly on the two table
> sample_ctrl_year and sample_buil_year
> [...] Now when running the query again, only for sample_buil_year table the
> wrong index is picked up...
It looks like statistics on your tables were completely wrong; not just
sample_ctrl_year and sample_buil_year. Right ?
Autoanalyze would normally handle this on nonempty tables (children or
otherwise) and you should manually run ANALZYE on the parents (both levels of
them) whenever statistics change, like after running a big DELETE or DROP or
after a significant interval of time has passed relative to the range of time
in the table's timestamp columns.
Do you know why autoanalze didn't handle the nonempty tables on its own ?
> Now, the channel name I gave has no entries in sample_buil_year...(and when I
> run the query directly against sample_buil_year the right index is picked
> up) So maybe something related with the partitioning?
>-> Index Scan Backward using smpl_time_bx2_idx on
> sample_buil_year c_5 (cost=0.56..2023054.76 rows=665761 width=75) (actual
> time=13216.589..13216.589 rows=0 loops=1)
> Filter: (channel_id = $0)
> Rows Removed by Filter: 50597834
> Buffers: shared hit=26626368
So it scanned the entire index expecting to find 5 matching channel IDs "pretty
soon", based on the generic distribution of channel IDs, without the benefit of
knowing that this channel ID doesn't exist at all (due to =(subquery)).
26e6 buffers is 200GB, apparently accessing some pages many
times (even if cached).
table_name| index_name | table_size | index_size
sample_buil_year | smpl_time_bx2_idx | 4492 MB| 1084 MB
General comments:
On Wed, Jan 02, 2019 at 04:28:41PM +, Abadie Lana wrote:
>"sample_time_bm_idx" btree (channel_id, smpl_time)
>"sample_time_mb1_idx" btree (smpl_time, channel_id)
>"smpl_time_bx1_idx" btree (smpl_time)
The smpl_time index is loosely redundant with index on (smpl_time,channel_id).
You might consider dropping it, or otherwise dropping the smpl_time,channel_id
index and making two separate indices on smpl_time and channel. That would
allow bitmap ANDing them together.
Or possibly (depending on detail of your data loading) leaving the composite
index and changing smpl_time to a BRIN index - it's nice to be able to CLUSTER
on the btree index to maximize the efficiency of the brin index.
>Check constraints:
>"sample_buil_month_smpl_time_check" CHECK (smpl_time >= (now() - '32
> days'::interval)::timestamp without time zone AND smpl_time <= now())
I'm surprised that works, and not really sure what it's doing..but in any case
it's maybe not doing what you wanted(??). I'm guessing you never get
constraint exclusion (which is irrelevant for this query but still).
Justin
