Java client fails to connect to database with replication slots registered if targetServerType=master

2018-03-07 Thread Tom Dearman
Hi,

We are trying to use logical decoding on our db (9.4.17). I have written a 
decoder based on the example one; it runs fine on my local db, but we found that 
in another environment we were unable to get a db connection.  It turns out to be 
related to having targetServerType=master in our JDBC connection string; if we 
set this to 'any' everything works fine.  The error we get when it is set to 
'master' is below.  Any ideas what is going on, or whether this is a bug?
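For reference, the only difference between the failing and the working configuration is that one parameter in the connection URL (host, port and database below are placeholders, not our real values):

jdbc:postgresql://dbhost:5432/ourdb?targetServerType=master   (fails with the error below)
jdbc:postgresql://dbhost:5432/ourdb?targetServerType=any      (works)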

org.postgresql.util.PSQLException: ERROR: syntax error
    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2422)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2167)
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:306)
    at org.postgresql.core.SetupQueryRunner.run(SetupQueryRunner.java:53)
    at org.postgresql.core.v3.ConnectionFactoryImpl.isMaster(ConnectionFactoryImpl.java:701)
    at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:218)
    at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:49)
    at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:195)
    at org.postgresql.Driver.makeConnection(Driver.java:452)
    at org.postgresql.Driver.connect(Driver.java:254)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:247)
    at org.postgresql.ds.common.BaseDataSource.getConnection(BaseDataSource.java:94)
    at org.postgresql.ds.common.BaseDataSource.getConnection(BaseDataSource.java:79)
    at net.sf.log4jdbc.sql.jdbcapi.DataSourceSpy.getConnection(DataSourceSpy.java:149)
    at sun.reflect.GeneratedMethodAccessor503.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.sixty6.as66.platform.service.support.StaticServiceProxyMethodInterceptor.invoke(StaticServiceProxyMethodInterceptor.java:52)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
    at org.springframework.aop.support.DelegatingIntroductionInterceptor.doProceed(DelegatingIntroductionInterceptor.java:133)
    at org.springframework.aop.support.DelegatingIntroductionInterceptor.invoke(DelegatingIntroductionInterceptor.java:121)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
    at org.springframework.aop.support.DelegatingIntroductionInterceptor.doProceed(DelegatingIntroductionInterceptor.java:133)
    at org.springframework.aop.support.DelegatingIntroductionInterceptor.invoke(DelegatingIntroductionInterceptor.java:121)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:213)
    at com.sun.proxy.$Proxy93.getConnection(Unknown Source)
    at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:111)
    at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:77)
    at org.sixty6.as66.utils.postgresql.logicaldecoding.LogicalDecodingConsumer$WorkerThread.doRun(LogicalDecodingConsumer.java:86)
    at org.sixty6.as66.utils.concurrent.PollingWorkerThread.run(PollingWorkerThread.java:140)



Logical decoder LSN order between transactions

2018-03-27 Thread Tom Dearman
Hi,

We have written a decoder plugin to use streaming replication in an aggregation 
framework, and we write the changes in chunks using OutputPluginWriter from 
pg_decode_change. In order to ensure we don't process a message twice, we add 
the lsn of each message to our aggregated value and check, as new messages come 
in, that they are not less than the last lsn saved. The problem we have found 
is that interleaved inserts in 2 separate sessions have interleaved lsn values; 
an example of this can be reproduced using test_decoding:

# SELECT * FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL);
    lsn     |  xid   |                           data                            
------------+--------+-----------------------------------------------------------
 0/3AEB0DC0 | 182312 | BEGIN 182312
 0/3AEB0DC0 | 182312 | table public.data: INSERT: id[integer]:11 data[text]:'1'
 0/3AEB0F20 | 182312 | COMMIT 182312
 0/3AEA4770 | 182265 | BEGIN 182265
 0/3AEA47D8 | 182265 | table public.data: INSERT: id[integer]:10 data[text]:'1'
 0/3AEB1790 | 182265 | COMMIT 182265
(6 rows)


The lsn of the commit is correctly ordered, but the inserts/updates/BEGIN seem 
to be in the order in which they happened. This means that if we save value 
'0/3AEB0DC0', then when we reach '0/3AEA47D8' we think we have already processed 
it, as it has a value less than the last one processed.  I have found that if I 
change to a batched approach, writing to OutputPluginWriter in 
pg_decode_commit_txn, then this works, as the Java client used for the 
aggregation returns the commit lsn as getLastReceiveLSN; alternatively I 
can just put the commit_lsn value into our streamed results.
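For illustration at the SQL level (only a sketch, not our consumer code; the LSN literal is the last commit lsn from the example output above, and the exact boundary semantics of the upto_lsn parameter would need checking): the idea is to peek first, pick the commit lsn of the last transaction we have fully processed, and only consume up to that point:

SELECT * FROM pg_logical_slot_peek_changes('regression_slot', NULL, NULL);
SELECT * FROM pg_logical_slot_get_changes('regression_slot', '0/3AEB1790', NULL);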

If we would like to use a non-batched approach, what are the suggestions for 
how to ensure we only process a given record once, whilst only saving one value 
against our aggregation value, i.e. something like an lsn value?

Thanks.



Frequently updated partial index leads to bloat on index for PostgreSQL 11

2021-03-18 Thread Tom Dearman
Hi,
We have a partial index on a column, set up as follows:

CREATE TABLE table_p2021q1 (
  pk_id BIGINT,
  col1 BIGINT NOT NULL,
  status character varying(255) NOT NULL,
  ...other columns
  PRIMARY KEY (pk_id)
);

CREATE INDEX table_p2021q1_ix04 ON table_p2021q1 (col1) WHERE status =
'IN_PROGRESS';

(The table is also partitioned on a date field.)
Every one of our entries in the table will go through a state where
status = 'IN_PROGRESS', but there are only a few hundred of them at any
one time.  The index grows quite quickly to about 300MB.  I have run
some bloat queries: the index should only ever need about 8KB, and the
measured bloat is always high (99.99%).  We have found that the only way to
reduce the bloat is reindexing, but after a couple of days the
size goes back to 300MB or so.  Running a vacuum does not reduce the
bloat.
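For reference, a minimal way to watch the index size and force it back down (the size check is just one way to look at it - the figures above come from bloat queries - and plain REINDEX takes a lock; REINDEX CONCURRENTLY needs PostgreSQL 12 or later):

SELECT pg_size_pretty(pg_relation_size('table_p2021q1_ix04'));
REINDEX INDEX table_p2021q1_ix04;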

Is this a known issue, are there any ways around it, and if it is an
issue, is there a plan to fix it if a fix is possible?

Thanks.




Re: Frequently updated partial index leads to bloat on index for PostgreSQL 11

2021-07-16 Thread Tom Dearman
Hi,

We upgraded to 13 a couple of months ago on production but are still having an 
issue with bloated partial indexes, which has an impact on our queries against 
the partial index, especially towards the end of a quarter when our 
quarterly-partitioned tables are getting big.  I have built 14 (beta 2, on macOS 
Catalina) and run a test, but the index grows fairly large (even though vacuums 
are running, as the table is still relatively small - I put in 2 million inserts, 
each having one update of the column that makes up the partial index). The table is:

 Table "public.buyer"
Column |  Type  | Collation | Nullable | Default | 
Storage  | Compression | Stats target | Description 
---++---+--+-+--+-+--+-
 buyer_id  | integer|   | not null | | 
plain| |  | 
 first_name| character varying(35)  |   | not null | | 
extended | |  | 
 last_name | character varying(35)  |   | not null | | 
extended | |  | 
 email_address | character varying(50)  |   |  | | 
extended | |  | 
 status| character varying(256) |   | not null | | 
extended | |  | 
Indexes:
"buyer_pkey" PRIMARY KEY, btree (buyer_id)
"idex_buyer_inactive" btree (first_name) WHERE status::text = 
'IN_PROGRESS'::text
Access method: heap


I run a loop to insert, commit, update, commit, one row at a time, as this is an 
emulation of what a similar table would experience in production.  The index 
never has many rows with status = 'IN_PROGRESS' as each row is set to CANCEL in 
the update.  If the index is reindexed it takes 1 page as expected, but without 
reindexing it keeps growing, currently reaching 3MB - this is with 2 
million inserts and updates, but our production will have about 300 million 
inserts and > 300 million updates on the partial index in the quarter.  Should 
we have seen more of an improvement in 14?  Is it valid to look at the size of 
the index (\di+) as a measure of whether the latest change, bottom-up index 
deletion, has helped?
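Roughly, the loop is equivalent to something like this (only a sketch, not the actual test harness; the literal values are made up, and a procedure with COMMIT inside the loop needs PostgreSQL 11 or later):

CREATE PROCEDURE emulate_buyer_churn(n int) LANGUAGE plpgsql AS $$
BEGIN
    FOR i IN 1..n LOOP
        -- new row enters the partial index via status = 'IN_PROGRESS'
        INSERT INTO buyer (buyer_id, first_name, last_name, status)
        VALUES (i, 'first_' || i, 'last_' || i, 'IN_PROGRESS');
        COMMIT;
        -- the update takes it out of the partial index again
        UPDATE buyer SET status = 'CANCEL' WHERE buyer_id = i;
        COMMIT;
    END LOOP;
END $$;

CALL emulate_buyer_churn(2000000);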

Thanks,

Tom

> On 18 Mar 2021, at 16:30, Peter Geoghegan  wrote:
> 
> On Thu, Mar 18, 2021 at 6:51 AM Tom Dearman  wrote:
>> Is this a known issue, are there any ways around it, and if it is an
>> issue is there a plan to fix it if a fix is possible?
> 
> It's not exactly a known issue per se, but I think the problem here is
> related to the fact that you have lots of duplicates, which did
> perform rather badly prior to Postgres 12. I bet that you'd benefit
> from upgrading to Postgres 12, or especially to Postgres 13. The
> B-Tree space management is a lot better now. (Actually, it'll be
> better again in Postgres 14.)
> 
> -- 
> Peter Geoghegan



Re: Frequently updated partial index leads to bloat on index for PostgreSQL 11

2021-07-16 Thread Tom Dearman
We have changed autovacuum so that it runs more frequently 
(autovacuum_vacuum_scale_factor = 0.035).  The reason we have a partial index on 
the status is that in a table of 300 million entries only about 100 or so would 
have status = 'IN_PROGRESS', so we think this should be a nice small index, and 
many of our queries want to look up with a where clause on status = 'IN_PROGRESS'.  
In theory it works well, but we get a lot of index bloat as there is a lot of 
churn on the status value, i.e. each row starts as IN_PROGRESS and then goes to 
one of 4 possible completed statuses. 
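If we go the per-table route suggested below, it would look something like this (only a sketch; the table name is from the earlier mails and the values are illustrative, not tuned):

ALTER TABLE table_p2021q1 SET (autovacuum_vacuum_scale_factor = 0.0, autovacuum_vacuum_threshold = 1000);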

> On 16 Jul 2021, at 15:49, Michael Lewis  wrote:
> 
> Have you tried setting autovacuum to run quite aggressively, perhaps just on 
> this table? Have you tried an index on the status column only, rather than 
> partial?



Re: Frequently updated partial index leads to bloat on index for PostgreSQL 11

2021-07-16 Thread Tom Dearman
Other indexes do bloat, but the percentage bloat is a lot less, presumably 
because this is a partial index where the partial column has a high degree of 
churn, i.e. maybe 100 genuinely 'live' rows in a table of 300 million where 
every row has gone through a state in which it would have been in the index.  In 
some of our partitions we might have 2000 old rows that do hang around for a 
long time and another 100 or so 'real' partial index entries, so 2200 in total, 
but the number of rows in the table is 300 million, so it is a lot less than 1%.
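One way to put a number on that low density (only a sketch, not the bloat query we actually ran; it needs the pgstattuple extension, and the index name is the one from the original mail):

CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT avg_leaf_density, leaf_pages, deleted_pages FROM pgstatindex('table_p2021q1_ix04');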


> On 16 Jul 2021, at 16:43, Tom Lane  wrote:
> 
> Tom Dearman  writes:
>> We have changed autovacuum so that it runs more frequently 
>> autovacuum_vacuum_scale_factor=0.035, the reason we have a partial index on 
>> the status is that in a table of 300 million entries, only about 100 or so 
>> would have status=‘IN_PROGRESS’ so we think this should be a nice small 
>> index and many of our queries want to look up with a where clause 
>> status=‘IN_PROGRESS’.  In theory it works well, but we get a lot of index 
>> bloat as there is a lot of churn on the status value, ie each row starts as 
>> IN_PROGRESS and then goes to one of 4 possible completed statuses. 
> 
> Is it really the case that only this index is bloating?  In principle, an
> update on a row of the table should result in new entries in every index
> of the table.  A partial index, due to the filter applied to possibly not
> store any index entry, should in theory have less bloat than other
> indexes.
> 
> If that's not what you're seeing, there must be something about the data
> being stored in that index (not the partial-index filter condition) that
> results in a lot of low-occupancy index pages over time.  You didn't say
> anything about what the data payload is.  But we've seen bloat problems in
> indexes where, say, every tenth or hundredth value in the index ordering
> would persist for a long time while the ones in between get deleted
> quickly.  That leads to low-density indexes that VACUUM can't do anything
> about.
> 
>   regards, tom lane





tstzrange on large table gives poor estimate of expected rows

2022-01-24 Thread Tom Dearman
Hi,

We have a fairly big table (22 million rows) which has a start and an end
timestamp with time zone, plus other columns.  Some of the columns plus the
start timestamp make up the primary key.  The end timestamp is exactly one
day ahead of the start timestamp for each row and there are
approximately 1 rows per day, so each day there will be about
1 inserts (and many updates) where the start is yyyy-mm-dd
00:00:00.0 and the end column is one day ahead of that.  We have
created a tstzrange index on the start and end columns and then run a query
which looks for exactly a one-day range:

explain (analyze, buffers) select tstzrange_test.interval_start_date
as interval_start_date
from tstzrange_test tstzrange_test
where tstzrange('2021-07-20 00:00:00.0', '2021-07-21 00:00:00.0',
'[]') @> tstzrange(tstzrange_test.interval_start_date,
tstzrange_test.interval_end_date, '[]');

In the real query this result is then grouped by other columns and then
joined to another table which has similar results, but by the hour
instead of by the day.

The query always underestimates the number of rows:

 Index Scan using tstzrange_test_ix01 on tstzrange_test  (cost=0.41..8.43 rows=1 width=8) (actual time=0.347..8.889 rows=1 loops=1)
   Index Cond: (tstzrange(interval_start_date, interval_end_date, '[]'::text) <@ '["2021-07-20 00:00:00+00","2021-07-21 00:00:00+00"]'::tstzrange)
   Buffers: shared hit=815
 Planning Time: 0.120 ms
 Execution Time: 9.591 ms
(5 rows)

and when this is aggregated and then joined to another table it leads
to a slow query.  If I change the query so the start date is 12 hours
earlier (i.e. so no extra results would be found, as the previous day's
start date would be 24 hours earlier) it gives a better estimate and
uses a different plan:

explain (analyze, buffers) select tstzrange_test.interval_start_date
as interval_start_date
from tstzrange_test tstzrange_test
where tstzrange('2021-07-19 12:00:00.0', '2021-07-21 00:00:00.0',
'[]') @> tstzrange(tstzrange_test.interval_start_date,
tstzrange_test.interval_end_date, '[]');

 Bitmap Heap Scan on tstzrange_test  (cost=199.89..11672.03 rows=4577 width=8) (actual time=6.880..7.844 rows=1 loops=1)
   Recheck Cond: ('["2021-07-19 12:00:00+00","2021-07-21 00:00:00+00"]'::tstzrange @> tstzrange(interval_start_date, interval_end_date, '[]'::text))
   Heap Blocks: exact=65
   Buffers: shared hit=330
   ->  Bitmap Index Scan on tstzrange_test_ix01  (cost=0.00..198.74 rows=4577 width=0) (actual time=6.866..6.866 rows=1 loops=1)
         Index Cond: (tstzrange(interval_start_date, interval_end_date, '[]'::text) <@ '["2021-07-19 12:00:00+00","2021-07-21 00:00:00+00"]'::tstzrange)
         Buffers: shared hit=265
 Planning Time: 0.157 ms
 Execution Time: 8.186 ms
(9 rows)

I have tried increasing the statistics target to 1 on each of the timestamp
columns and also setting up a statistics object to say that the start and
end columns are related (dependencies).
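Concretely, those attempts looked something like this (the target value and the statistics object name here are only illustrative):

ALTER TABLE tstzrange_test ALTER COLUMN interval_start_date SET STATISTICS 10000;
ALTER TABLE tstzrange_test ALTER COLUMN interval_end_date SET STATISTICS 10000;
CREATE STATISTICS tstzrange_test_dep (dependencies) ON interval_start_date, interval_end_date FROM tstzrange_test;
ANALYZE tstzrange_test;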

This is how I created and  populated the table:

CREATE TABLE tstzrange_test (
    interval_start_date  timestamp with time zone,
    interval_end_date    timestamp with time zone,
    user_registration_id bigint
);

insert into tstzrange_test (interval_start_date, interval_end_date, user_registration_id)
select '2021-01-01 00:00:00.0'::timestamp with time zone + ((psid)/1 || ' day')::interval,
       '2021-01-02 00:00:00.0'::timestamp with time zone + ((psid)/1 || ' day')::interval,
       floor(random() * (500) + 1)::int
from generate_series(1,360) as s(psid);

CREATE INDEX tstzrange_test_ix01 ON tstzrange_test USING gist
(tstzrange(interval_start_date, interval_end_date, '[]'::text));


Any help on how I can make the planner estimate better would be much
appreciated.

Thanks.




Re: tstzrange on large table gives poor estimate of expected rows

2022-01-25 Thread Tom Dearman
Thanks for your help.  It is true we could get rid of it, but we still
want to use the functional index on the date range as we understand it
is supposed to give a better lookup - we also have other date range
lookups on tables that seem to be degrading.  I have found a solution
to the problem.  The postgres default_statistics_target is 100, and
when we upped it to 1 the estimate was good.  We could not have
set the default to 1 on production, but there appeared to be no way
to change the value for the functional index, as the statistics target
is set per column.  However, in a post answered by Tom Lane in 2012 he
gives a way to set the statistics target on the functional index
(https://www.postgresql.org/message-id/6668.1351105908%40sss.pgh.pa.us).
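For anyone finding this later, on versions that support ALTER INDEX ... SET STATISTICS (PostgreSQL 11 or later, if I remember right) the statement ends up being of this shape, with the expression column addressed by its position (the target value here is only illustrative):

ALTER INDEX tstzrange_test_ix01 ALTER COLUMN 1 SET STATISTICS 10000;
ANALYZE tstzrange_test;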

Thanks.

On Mon, 24 Jan 2022 at 17:43, Michael Lewis  wrote:
>
> If interval_end_date is always 1 day ahead, why store it at all?
>
> Dependencies on a custom stats object wouldn't do anything I don't think 
> because they are offset. They are 100% correlated, but not in a way that any 
> of the existing stat types capture as far as I can figure.