On Tue, Dec 5, 2017 at 11:06 PM, Vitaliy Garnashevich <
[email protected]> wrote:
This is very cool, thanks.
> I've tried to create a better test case:
> - Increase shared_buffers and effective_cache_size to fit the whole database,
> including indexes.
> - Use random(), to avoid correlation between the filtered values.
> - Make both columns of integer type, to avoid special cases with boolean
> (like the one happened with CREATE STATISTICS).
> - Flush OS disk cache and then try running the query several times, to get
> both cold-cache results and all-in-ram results.
> - There are several tests, with different frequency of the selected values
> in the two columns: [1/10, 1/10], [1/50, 1/10], [1/100, 1/10].
> - There is a split of cost by contribution of each of its components:
> seq_page_cost, random_page_cost, cpu_tuple_cost, cpu_index_tuple_cost,
> cpu_operator_cost. The EXPLAIN is run for each component, every time with
> only one of the components set to non-zero.
>
Where you have question marks, that means you could not force it into the
plan you wanted with all-but-one settings being zero?
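(By "all-but-one settings being zero" I mean something like the following, to
isolate just the cpu_operator_cost contribution -- this is only my guess at
how you ran it:

set seq_page_cost = 0; set random_page_cost = 0; set cpu_tuple_cost = 0;
set cpu_index_tuple_cost = 0; set cpu_operator_cost = 0.0025;
explain select * from aaa where num = 1 and flag = 1;
)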
> - The test was run on a Digitalocean VM: Ubuntu 16.04.3 LTS (GNU/Linux
> 4.4.0-101-generic x86_64), 2 GB RAM, 2 core CPU, SSD; PostgreSQL 9.5.10.
>
>
> shared_buffers = 1024MB
> effective_cache_size = 1024MB
>
I would set this even higher.
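For example, on a 2GB machine with shared_buffers already at 1GB, something
like the following (just a ballpark value; effective_cache_size is only a
planner hint, it doesn't allocate anything):

set effective_cache_size = '1536MB';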
>
> work_mem = 100MB
>
> create table aaa as select floor(random()*10)::int num, (random()*10 <
> 1)::int flag from generate_series(1, 10000000) id;
> create table aaa as select floor(random()*50)::int num, (random()*10 <
> 1)::int flag from generate_series(1, 10000000) id;
> create table aaa as select floor(random()*100)::int num, (random()*10 <
> 1)::int flag from generate_series(1, 10000000) id;
>
> create index i1 on aaa (num);
> create index i2 on aaa (flag);
>
> set enable_bitmapscan = on; set enable_indexscan = off; set
> enable_seqscan = off;
> set enable_bitmapscan = off; set enable_indexscan = on; set
> enable_seqscan = off;
> set enable_bitmapscan = off; set enable_indexscan = off; set
> enable_seqscan = on;
>
> set seq_page_cost = 1.0; set random_page_cost = 1.0; set cpu_tuple_cost =
> 0.01; set cpu_index_tuple_cost = 0.005; set cpu_operator_cost = 0.0025;
>
> explain (analyze,verbose,costs,buffers) select * from aaa where num = 1
> and flag = 1;
>
One thing to try is to use explain (analyze, timing off), and then get the
total execution time from the summary line at the end of the explain,
rather than from "actual time" fields. Collecting the times of each
individual step in the execution can impose a lot of overhead, and some
plans incur more of this artificial overhead than others. It might change
the results, or it might not. I know that sorts and hash joins are very
sensitive to this, but I don't know about bitmap scans.
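Something like this, reusing your query (BUFFERS is still collected with
TIMING OFF):

explain (analyze, timing off, buffers) select * from aaa where num = 1
and flag = 1;

and then compare the "Execution time" line at the bottom across plans.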
> What seems odd to me is that in different kinds of tests (with different
> frequency of column values):
>
> i1 Rows Removed by Filter = 900156, 179792, 89762 (decreased a lot)
> i1 buffers = 46983, 44373, 39928 (decreased, but not a lot)
>
To filter out 89762 tuples, you first have to look them up in the table,
and since they are randomly scattered that means you hit nearly every page
in the table at least once. In fact, I don't understand how the empirical
number of buffer hits can be only 46983 in the first case, when it has to
visit 1,000,000 rows (and reject 90% of them). I'm guessing that it is
because your newly created index is sorted by ctid order within a given
index value, and that the scan holds a pin on the table page between
visits, and so doesn't count as a hit if it already holds the pin.
You could try to create an empty table, create the indexes, then populate
the table with your random select query, to see if that changes the buffer
hit count. (Note that this wouldn't change the cost estimates much even if it
does change the measured number of buffer hits, because of
effective_cache_size. It knows you will be hitting ~47,000 pages ~25 times
each, and so only charges you for the first time each one is hit.)
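Something along these lines, using your 1/10 case (just a sketch of the
reordering, same generator as before):

create table aaa (num int, flag int);
create index i1 on aaa (num);
create index i2 on aaa (flag);
insert into aaa select floor(random()*10)::int, (random()*10 < 1)::int
  from generate_series(1, 10000000) id;

That way the index entries for a given value should no longer be in strict
ctid order, which is what I suspect is hiding some of the buffer hits.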
> i1 best case time = 756.045, 127.814, 79.492 (decreased a lot, as well as
> probably average case too)
> i1 cost estimates = 67358.15, 48809.34, 46544.80 (did not decrease a lot)
>
Right. Your best case times are when the data is completely in cache. But
your cost estimates are dominated by *_page_cost, which are irrelevant when
the data is entirely in cache. You are telling it to estimate the
worst-case costs, and it is doing that pretty well (within this one plan).
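If you want the estimates to reflect the fully-cached case instead, you have
to tell it that, e.g. something like (only an illustration, not a
recommendation of those exact values):

set seq_page_cost = 0.1; set random_page_cost = 0.1;

at which point the cpu_* components would dominate the estimates.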
>
> i2 Rows Removed by Filter = 900835, 980350, 991389
> i2 buffers = 46985, 46985, 46987
> i2 best case time = 377.554, 346.481, 387.874
> i2 cost estimates = 39166.34, 39247.83, 40167.34
>
> It's odd that the increase in actual execution time for "i1" was not reflected
> enough in cost estimates.
>
No, that's entirely expected given your settings. As long as you are
charging disk-read costs for reading data from RAM, you