Re: Bitmap scan is undercosted? - overestimated correlation and cost_index

2017-12-12 Thread Jeff Janes
On Wed, Dec 6, 2017 at 1:46 PM, Justin Pryzby  wrote:

> On Tue, Dec 05, 2017 at 01:50:11PM -0500, Tom Lane wrote:
> > Jeff Janes  writes:
> > > On Dec 3, 2017 15:31, "Tom Lane"  wrote:
> > >> Jeff Janes  writes:
> > >>> But I do see that ties within the logical order of the column values
> > >>> are broken to agree with the physical order.  That is wrong, right?
> > >>> Is there any argument that this is desirable?
> >
> > >> Uh ... what do you propose doing instead?  We'd have to do something
> > >> with ties, and it's not so obvious this way is wrong.
> >
> > > Let them be tied.
> ...
> > I thought some more about this.  What we really want the correlation stat
> > to do is help us estimate how randomly an index-ordered scan will access
> > the heap.  If the values we've sampled are all unequal then there's no
> > particular issue.  However, if we have some group of equal values, we
> > do not really know what order an indexscan will visit them in.  The
> > existing correlation calculation is making the *most optimistic possible*
> > assumption, that such a group will be visited exactly in heap order ---
> > and that assumption isn't too defensible.
>
> I'm interested in discussion regarding bitmap cost, since it would have
> helped
> our case discussed here ~18 months ago:
> https://www.postgresql.org/message-id/flat/20160524173914.GA11880%40telsasoft.com#[email protected]
>
> ...but remember: in Vitaliy's case (as opposed to mine), the index scan is
> *faster* but being estimated at higher cost than bitmap (I have to keep
> reminding myself).  So the rest of this discussion is about the
> overly-optimistic cost estimate of index scans, which moves in the opposite
> direction for this reported problem.  For the test cases I looked at, index
> scans were used when RPC=1 and redundant conditions were avoided, so I'm not
> sure if there's any remaining issue (but I haven't looked at the latest cases
> Vitaliy sent).
>
> > In any case, given that we do this calculation without regard
> > to any specific index,
>
> One solution is to compute stats (at least correlation) for all indices, not
> just expression indexes.  I did that earlier this year while throwing
> around/out ideas.
> https://www.postgresql.org/message-id/20170707234119.GN17566%40telsasoft.com


When is the correlation of a column which is not the leading column of a
btree index or in a brin index ever used?  If we did compute index-specific
correlations, we could maybe just drop pure-column correlations.


>
> > We do have an idea, from the data we have, whether the duplicates are
> > close together in the heap or spread all over.
>
> I think you just mean pg_stats.correlation for all values, not just
> duplicates (with the understanding that duplicates might be a large fraction
> of the tuples, and carry a high weight in the correlation).
>
> Another issue I noted in an earlier thread is that as table sizes increase,
> the existing correlation computation approaches 1 for correlated insertions
> (like "append-only" timestamps clustered around now()), due to ANALYZE
> sampling a fraction of the table, and thereby representing only large-scale
> correlation,
>

That isn't due to sampling.  That is due to the definition of linear
correlation.  Large scale is what it is about.


> Generated data demonstrating this (I reused this query so it's more
> complicated than it needs to be):
>
> [pryzbyj@database ~]$ time for sz in {,9{,9{,9{,9 ; do psql
> postgres -tc "DROP TABLE IF EXISTS t; CREATE TABLE t(i float, j int);
> CREATE INDEX ON t(i);INSERT INTO t SELECT i/9.0+pow(2,(-random())) FROM
> generate_series(1,$sz) i ORDER BY i; ANALYZE t; SELECT $sz, correlation,
> most_common_freqs[1] FROM pg_stats WHERE attname='i' AND tablename='t'";
> done
>
>   |0.187146 |
> 9 |0.900629 |
>99 |0.998772 |
>   999 |0.87 |
>

Because the amount of jitter introduced is constant WRT $sz, but the range
of i/9.0 increases with $sz, the correlation actually does increase; it
is not a sampling effect.
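
(Not from the thread, but to illustrate that point: the same effect falls
straight out of the definition, with no table and no ANALYZE sample involved.
Using the generator from the quoted example, the full-population Pearson
correlation between the value and its position can be computed directly with
corr(); make the series longer and it climbs toward 1.  ANALYZE's estimate is
based on value order rather than the raw values, but it tracks the same
effect.)

-- correlation between generated value and insertion position over the
-- whole population, no sampling involved
select corr(i/9.0 + pow(2, -random()), i)
from generate_series(1, 1000000) i;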

> Trying to keep it all in my own head: for a sufficiently large number of
> pages, a bitmap scan should be preferred to an index scan due to reduced
> random-page-cost outweighing its overhead in CPU cost.


But CPU cost is probably not why it is losing anyway.

An index scan gets a double bonus from high correlation.  It assumes that
only a small fraction of the table will be visited.  And then it assumes that
the visits it does make will be largely sequential.  I think that you are
saying that for a large enough table, that last assumption is wrong, that
the residual amount of non-correlation is enough to make the table reads
more random than sequential.  Maybe.  Do you have a test case that
demonstrates this?  If so, how big do we need to go, and can you see the
problem on SSD as well as HDD?
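
(A sketch of how such a test could be timed, not from the thread; the table t
and the range predicate are only stand-ins:)

-- force each plan shape in turn on the same query and compare the planner's
-- estimated cost with the actual run time, on both HDD and SSD
set enable_bitmapscan = off;    -- leaves the index scan
explain (analyze, buffers) select * from t where i between 100 and 200;
reset enable_bitmapscan;

set enable_indexscan = off;     -- leaves the bitmap scan
explain (analyze, buffers) select * from t where i between 100 and 200;
reset enable_indexscan;
-- add "set enable_seqscan = off;" if a seq scan wins instead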

The thing is, the bitmap scan gets cheated out of one of these bonuses.  It
gets no credit f

PostgreSQL database size is not reasonable

2017-12-12 Thread Mariel Cherkassky
In my PostgreSQL 9.6 instance I have 1 production database. When I query
the size of all databases:

combit=> Select
pg_database.datname,pg_size_pretty(pg_database_size(pg_database.datname))
as size from pg_database;
  datname  |  size
-----------+---------
 template0 | 7265 kB
 combit    | 285 GB
 postgres  | 7959 kB
 template1 | 7983 kB
 repmgr    | 8135 kB
(5 rows)

When I check what the big tables in my database are (including indexes):

combit=> SELECT nspname || '.' || relname AS "relation",
combit-> pg_size_pretty(pg_total_relation_size(C.oid)) AS "total_size"
combit->   FROM pg_class C
combit->   LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
combit->   WHERE nspname NOT IN ('pg_catalog', 'information_schema')
combit-> AND C.relkind <> 'i'
combit-> AND nspname !~ '^pg_toast'
combit->   ORDER BY pg_total_relation_size(C.oid) DESC
combit->   LIMIT 20;
         relation          | total_size
---------------------------+------------
 rep.ps_rf_inst_prod       | 48 GB
 rep.nap_inter_x5          | 46 GB
 rep.man_x5                | 16 GB
 rep.tc_fint_x5            | 9695 MB
 rep.nap_ip_debit_x5       | 7645 MB
 rep.ip__billing           | 5458 MB
 rep.ps_rd                 | 3417 MB
 rep.nap_ip_discount       | 3147 MB
 rep.custo_x5              | 2154 MB
 rep.ip_service_discou_x5  | 1836 MB
 rep.tc_sub_rate__x5       | 294 MB

The total sum is not more than 120G.

When I check the fs directly:

[/data/base] : du -sk * | sort -n
7284133227868133237892
18156166694298713364   16400

[/data/base] :

16400 is the oid of the combit database. As you can see the size of combit
on the fs is about 298G.
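
(Aside, not from the thread: the directory names under base/ are database
OIDs, so the mapping can be read straight from the catalog:)

-- directory names under $PGDATA/base correspond to these OIDs
select oid, datname from pg_database order by oid;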

I checked for dead tuples in the biggest tables:

combit=> select relname, n_dead_tup, last_autoanalyze, last_analyze,
last_autovacuum, last_vacuum
from pg_stat_user_tables order by n_live_tup desc limit 4;

  -[ RECORD 1 ]+--
  relname  | ps_rf_inst_prod
  n_dead_tup   | 0
  last_autoanalyze | 2017-12-04 09:00:16.585295+02
  last_analyze | 2017-12-05 16:08:31.218621+02
  last_autovacuum  |
  last_vacuum  |
  -[ RECORD 2 ]+--
  relname  | man_x5
  n_dead_tup   | 0
  last_autoanalyze | 2017-12-05 06:02:07.189184+02
  last_analyze | 2017-12-05 16:12:58.130519+02
  last_autovacuum  |
  last_vacuum  |
  -[ RECORD 3 ]+--
  relname  | tc_fint_x5
  n_dead_tup   | 0
  last_autoanalyze | 2017-12-05 06:04:06.698422+02
  last_analyze |
  last_autovacuum  |
  last_vacuum  |
  -[ RECORD 4 ]+--
  relname  | nap_inter_x5
  n_dead_tup   | 0
  last_autoanalyze | 2017-12-04 08:54:16.764392+02
  last_analyze | 2017-12-05 16:10:23.411266+02
  last_autovacuum  |
  last_vacuum  |

I ran vacuum full on all 5 top tables 2 hours ago and it didn't free a lot
of space...

On this database the only operations that happen are truncate, insert and
select. So how can it be that I had dead tuples on some of my tables? If I
only run truncate, select, and insert queries, dead tuples shouldn't be
created.

And the bigger question: where are the missing 180G?
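
(One way to probe that, not shown in the thread: compare the database size
with the sum of pg_total_relation_size() over every relation, system catalogs
included.  pg_total_relation_size() already counts each table's indexes and
TOAST data, so whatever gap remains lies outside ordinary relations:)

select pg_size_pretty(pg_database_size(current_database()))     as db_size,
       pg_size_pretty(sum(pg_total_relation_size(oid))::bigint) as relations_total
from pg_class
where relkind in ('r', 'm');   -- tables and matviews; indexes and TOAST are included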


Re: PostgreSQL database size is not reasonable

2017-12-12 Thread David G. Johnston
On Tue, Dec 12, 2017 at 8:15 AM, Mariel Cherkassky <
[email protected]> wrote:

> And the bigger question: where are the missing 180G?

In the toaster probably...

https://www.postgresql.org/docs/current/static/storage-toast.html

Basically, large data values are stored in a separate table, distinct from
both the main table and its indexes.
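
(A hedged example, not from the thread, of checking how much of that space is
TOAST data per table:)

select c.relname,
       pg_size_pretty(pg_relation_size(c.oid))           as heap_size,
       pg_size_pretty(pg_relation_size(c.reltoastrelid)) as toast_size
from pg_class c
where c.reltoastrelid <> 0          -- only tables that have a TOAST table
order by pg_relation_size(c.reltoastrelid) desc
limit 10;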

David J.


RE: PostgreSQL database size is not reasonable

2017-12-12 Thread Craig McIlwee

On Tue, Dec 12, 2017 at 8:15 AM, Mariel Cherkassky
<[email protected]> wrote:

> And the bigger question: where are the missing 180G?

In the toaster probably...

https://www.postgresql.org/docs/current/static/storage-toast.html

Basically, large data values are stored in a separate table, distinct from
both the main table and its indexes.

David J.


The query also says C.relkind <> 'i' which means it’s excluding indexes.  Also 
note that pg_catalog is excluded but LOB data would be stored in 
pg_catalog.pg_largeobject.  That could account for some overlooked space as 
well.
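
(A quick check for that, not from the thread:)

select pg_size_pretty(pg_total_relation_size('pg_catalog.pg_largeobject'));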

Craig


Re: PostgreSQL database size is not reasonable

2017-12-12 Thread Tom Lane
"David G. Johnston"  writes:
> On Tue, Dec 12, 2017 at 8:15 AM, Mariel Cherkassky <
> [email protected]> wrote:
>> And the bigger question: where are the missing 180G?

> In the toaster probably...

pg_total_relation_size should have counted the toast tables,
as well as the indexes, if memory serves.

What I'm wondering about is the system catalogs, which Mariel's
query explicitly excluded.  180G would be awful darn large for
those, but maybe there's a bloat problem somewhere.

Otherwise, try to identify the largest individual files in the
database directory ...
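
(A sketch of that step, not from the thread: files under base/<database oid>/
are named after relation filenodes, and a filenode can be mapped back to a
relation name with pg_filenode_relation(); the 16587 below is only a
placeholder filenode:)

-- 0 means the database's default tablespace; returns NULL if no relation
-- currently has that filenode
select pg_filenode_relation(0, 16587);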

regards, tom lane



Re: PostgreSQL database size is not reasonable

2017-12-12 Thread Mariel Cherkassky
The system catalogs are located in the global directory, but the global
directory isn't so big (500K). As I mentioned, the base directory is huge and
the directory 16400 is the biggest inside it. I checked some big files inside
the directory 16400 (which represents the combit database) and for some
there *isn't an object that matches* and for some there are. So, how can I
continue?


2017-12-12 17:49 GMT+02:00 Tom Lane :

> "David G. Johnston"  writes:
> > On Tue, Dec 12, 2017 at 8:15 AM, Mariel Cherkassky <
> > [email protected]> wrote:
> >> And the bigger question: where are the missing 180G?
>
> > In the toaster probably...
>
> pg_total_relation_size should have counted the toast tables,
> as well as the indexes, if memory serves.
>
> What I'm wondering about is the system catalogs, which Mariel's
> query explicitly excluded.  180G would be awful darn large for
> those, but maybe there's a bloat problem somewhere.
>
> Otherwise, try to identify the largest individual files in the
> database directory ...
>
> regards, tom lane
>


Re: PostgreSQL database size is not reasonable

2017-12-12 Thread Tom Lane
Mariel Cherkassky  writes:
> The system catalogs are located in the global directory, but the global
> directory isn't so big (500K).

You're operating under false assumptions.  Only catalogs marked
relisshared are in that directory, other ones are in the per-database
directories.

Somebody mentioned pg_largeobject upthread --- that would definitely
be a candidate to be big, if you're using large objects at all.

regards, tom lane