how to get top plan of GatherMerge in OSS10

2020-09-04 Thread Yang, Rong
Hello~

When I look at the plan for a query in OSS11 and OSS10, the plan tree structure 
is different.
OSS10.13:
postgres=# explain (costs off) select C, ((select c_int from table1 where C<=2) except all (select c_int from table1 where C=2)) from table1 where C < 100 order by C;
                              QUERY PLAN
----------------------------------------------------------------------
 Sort
   Sort Key: table1.c
   InitPlan 1 (returns $2)
     ->  HashSetOp Except All
           ->  Append
                 ->  Subquery Scan on "*SELECT* 1"
                       ->  Gather
                             Workers Planned: 2
                             ->  Parallel Seq Scan on table1 table1_1
                                   Filter: (c <= 2)
                 ->  Subquery Scan on "*SELECT* 2"
                       ->  Gather
                             Workers Planned: 2
                             ->  Parallel Seq Scan on table1 table1_2
                                   Filter: (c = 2)
   ->  Gather
         Workers Planned: 2
         ->  Parallel Seq Scan on table1
               Filter: (c < 100)
(19 rows)

OSS11.8:
postgres=# explain (costs off) select C, ((select c_int from table1 where C<=2) except all (select c_int from table1 where C=2)) from table1 where C < 100 order by C;
                              QUERY PLAN
----------------------------------------------------------------------
 Gather Merge
   Workers Planned: 2
   Params Evaluated: $2
   InitPlan 1 (returns $2)
     ->  HashSetOp Except All
           ->  Append
                 ->  Subquery Scan on "*SELECT* 1"
                       ->  Gather
                             Workers Planned: 2
                             ->  Parallel Seq Scan on table1 table1_1
                                   Filter: (c <= 2)
                 ->  Subquery Scan on "*SELECT* 2"
                       ->  Gather
                             Workers Planned: 2
                             ->  Parallel Seq Scan on table1 table1_2
                                   Filter: (c = 2)
   ->  Sort
         Sort Key: table1.c
         ->  Parallel Seq Scan on table1
               Filter: (c < 100)
(20 rows)

The table data and database configuration are the same in both versions, as follows:
create table table1(C int, c_int int, c_varchar varchar(1024),c_bigint bigint, 
c_varchar2 varchar(1024),c_time timestamp);
insert into table1 
values(1,10,'1ttt',123456789,'012-3456-7890','2018-05-06 10:00:00');
insert into table1 
values(2,20,'2tt2t222t',223456789,'023-3486-2342','2019-05-06 10:00:00');
insert into table1 
values(3,30,'3tt3t333tt',323456789,'021-9823-8821','2020-05-06 10:00:00');
insert into table1 
values(4,20,'2tt2t222t',223456789,'023-3486-2342','2019-05-06 10:00:00');
set enable_sort=off;
set force_parallel_mode to on;
set parallel_setup_cost to 0.1;
set min_parallel_table_scan_size to 0;
set min_parallel_index_scan_size to 0;
insert into table1 select generate_series(10,100),generate_series(100,10),'aaa',7,'012-3456-7890','2018-05-06 10:00:00';

Question:
In OSS 10, how can I make the top node of the plan tree a Gather Merge when the 
SQL uses 'except all'?





Re: How to enumerate partitions from a window function?

2020-09-04 Thread Alban Hertroys
On Thu, 3 Sep 2020 at 20:59, Michael Lewis  wrote:

> It seems like you are maybe wanting this- If the previous row is the same,
> then get the previous row's run_nr. If it is different, then increment.
>
> case when lag( property_A ) over() = property_A and lag( property_B )
> over() = property_B then coalesce( lag( run_nr ) over(), 1 ) else lag(
> run_nr ) over() + 1 end
>
> Perhaps there is a much simpler implementation though.
>

That would work were it not that the very column we're defining is the one
to be aliased run_nr. The data does not contain that information; it's what
I'm trying to enrich it with, and that's what I'm having trouble wrapping my
head around.
Your query (adapted a tiny bit) unfortunately results in:

select datetime, property_A, property_B
, first_value(datetime::time) over run_win as swap_time
, case
when lag(property_A) over time_win = property_A
and lag(property_B) over time_win = property_B
then coalesce(lag(run_nr) over time_win, 1)
else lag(run_nr) over time_win +1
  end
, value
from process_data
window
time_win as (order by datetime)
, run_win as (partition by property_A, property_B order by datetime)
order by datetime
;

ERROR:  column "run_nr" does not exist
LINE 6:  then coalesce(lag(run_nr) over time_win, 1)
   ^
SQL state: 42703
Character: 221

I turned my example into a proper test-case (better late than never):

CREATE TABLE process_data (
datetime timestamp without time zone NOT NULL,
property_a text NOT NULL,
property_b text NOT NULL,
value numeric(12,3)
);

COPY process_data (datetime, property_a, property_b, value) FROM stdin;
2020-09-03 15:06:00 tea earl grey 0.230
2020-09-03 15:07:00 tea earl grey 0.220
2020-09-03 15:08:00 tea ceylon 0.340
2020-09-03 15:09:00 coffee cappucino 0.450
2020-09-03 15:10:00 coffee cappucino 0.430
2020-09-03 15:11:00 tea earl grey 0.230
\.

With the desired result (note that swap_time and run_nr are calculated
columns):
      datetime       | property_a | property_b | swap_time | run_nr | value
---------------------+------------+------------+-----------+--------+-------
 2020-09-03 15:06:00 | tea        | earl grey  | 15:06:00  |      1 | 0.230
 2020-09-03 15:07:00 | tea        | earl grey  | 15:06:00  |      1 | 0.220
 2020-09-03 15:08:00 | tea        | ceylon     | 15:08:00  |      2 | 0.340
 2020-09-03 15:09:00 | coffee     | cappucino  | 15:09:00  |      3 | 0.450
 2020-09-03 15:10:00 | coffee     | cappucino  | 15:09:00  |      3 | 0.430
 2020-09-03 15:11:00 | tea        | earl grey  | 15:06:00  |      4 | 0.230
(6 rows)

I've been looking around on the Internet in the meantime, and it seems
people either solve this with a recursive CTE (referencing the previous row
by row_number() over (...)) or by writing a set-returning function that
walks over the data in datetime order using a cursor.

Since the actual query is growing more and more state-tracking flags, using
a function has the added benefit that referencing state columns from the
previous row gets a lot easier (lots of repeated window functions
otherwise). It would become a procedural solution instead of a set-based
one, but considering that this data is order-sensitive (on datetime),
that's probably what a set-based solution would also end up doing anyway.
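
For reference, a minimal set-based sketch of the usual "change flag plus
running sum" approach (not from the original message; it assumes the
process_data test table above): flag every row whose (property_a, property_b)
pair differs from the previous row, then number the runs with a running sum
of those flags.

select datetime, property_a, property_b
, first_value(datetime::time) over run_win as swap_time
, sum(is_new_run) over (order by datetime) as run_nr
, value
from (
  select *
  , case when lag(property_a) over time_win is distinct from property_a
           or lag(property_b) over time_win is distinct from property_b
         then 1 else 0 end as is_new_run
  from process_data
  window time_win as (order by datetime)
) t
window run_win as (partition by property_a, property_b order by datetime)
order by datetime;

Against the test data this yields run_nr values 1, 1, 2, 3, 3, 4, matching the
desired result above.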

Regards,
Alban.

-- 
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.


Implement a new data type

2020-09-04 Thread mohand oubelkacem makhoukhene
Hello,

If I want to create a new type Person (firstname varchar, lastname varchar, 
address varchar ...), what is the best way to proceed in PostgreSQL?

Create type or create extension? What are the advantages/disadvantages of both 
solutions?
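
For concreteness, the plain CREATE TYPE route might look roughly like this (a
minimal sketch, not from the original message; the field types are assumptions):

create type person as (
    firstname varchar,
    lastname  varchar,
    address   varchar
);

-- the composite type can then be used as a column type or in function signatures
create table employee (
    id  serial primary key,
    who person
);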
Thank you and best regards
Mohand


Aw: Implement a new data type

2020-09-04 Thread Karsten Hilbert
> If I want to create a new type Person (firstname varchar, lastname varchar, 
> address varchar ...)
> what is the best way to proceed in PostgreSQL
 
The best way is to re-evaluate the "I want".

Karsten




Re: how to get top plan of GatherMerge in OSS10

2020-09-04 Thread David Rowley
On Fri, 4 Sep 2020 at 20:21, Yang, Rong  wrote:
> In OSS 10, how can I make the top node of the plan tree a Gather Merge when 
> the SQL uses 'except all'?

(I guess you're talking about PostgreSQL 10. I'm not sure what OSS 10 is.)

The ability for Gather Merge to work with Init Plan values was only
added in [1], which was new to PostgreSQL 11. So you're not going to
make that work in PostgreSQL 10.

You could try crafting the query in such a way that an Init Plan is
not used, e.g. a CROSS JOIN, but I'm not sure if that'll improve your
performance any. Or you could try reducing parallel_tuple_cost a
bit, which might give you a Gather, so at least the Seq Scan will be
done in parallel. The sort will still be serial though.

A btree index on table1 (c) looks like it might be worth considering.
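
A minimal sketch of those last two suggestions (the cost value and the index
name are illustrative assumptions, not from the original message):

set parallel_tuple_cost = 0.01;  -- cheaper tuple transfer makes a Gather more attractive
create index table1_c_idx on table1 (c);  -- pre-sorted output on c, so the planner may skip the Sort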

David

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=e89a71fb44




Bitmap scan seem like such a strange choice when "limit 1"

2020-09-04 Thread Klaudie Willis
Postgres 13 beta3

set enable_bitmapscan=1; -- default
explain (analyze,buffers)
select *
from bigtable
where cars_ref = 1769854207 and t > '2020-01-01'::timestamp and t < '2021-01-01'::timestamp
limit 1

Short story: big table with > 100M rows and a b-tree index on cars_ref; the t 
constraints limit the query to one partition, but I don't think that is very 
relevant. In any case, running this query takes 1.5s:

Limit  (cost=23728.33..23729.24 rows=1 width=635) (actual time=1516.865..1516.867 rows=1 loops=1)
  Buffers: shared hit=2376
  ->  Bitmap Heap Scan on bigtable_y2020 bigtable  (cost=23728.33..2530109.01 rows=2730872 width=635) (actual time=1516.863..1516.864 rows=1 loops=1)
        Recheck Cond: (cars_ref = 1769854207)
        Filter: ((t > '2020-01-01 00:00:00'::timestamp without time zone) AND (t < '2021-01-01 00:00:00'::timestamp without time zone))
        Heap Blocks: exact=1
        Buffers: shared hit=2376
        ->  Bitmap Index Scan on bigtable_y2020_cars_ref_idx  (cost=0.00..23045.61 rows=2731965 width=0) (actual time=751.640..751.640 rows=2817675 loops=1)
              Index Cond: (cars_ref = 1769854207)
              Buffers: shared hit=2375
Planning Time: 0.365 ms
Execution Time: 1540.207 ms

1.5 seconds seems a lot for a single indexed row.  I would think it should be 
instant, and so it is when I disable bitmap scan with: set enable_bitmapscan=0

Limit  (cost=0.57..1.60 rows=1 width=636) (actual time=0.027..0.028 rows=1 loops=1)
  Buffers: shared hit=5
  ->  Index Scan using bigtable_y2020_cars_ref_idx on bigtable_y2020 bigtable  (cost=0.57..2966738.51 rows=2873818 width=636) (actual time=0.026..0.026 rows=1 loops=1)
        Index Cond: (cars_ref = 1769854207)
        Filter: ((t > '2020-01-01 00:00:00'::timestamp without time zone) AND (t < '2021-01-01 00:00:00'::timestamp without time zone))
        Buffers: shared hit=5
Planning Time: 0.291 ms
Execution Time: 0.049 ms

But I am not supposed to disable bitmap scan! So why on earth does Postgres 13 
beta3 think that returning 1 row should be done with a bitmap scan?
I noticed that different values for "cars_ref" result in different plans in the 
query above. I believe that it has to do with whether or not the cars_ref value 
is in the "most common values" list. But in any case, I can't see why a bitmap 
scan is wise when you expect one row.

best regards
Klaudie

Re: Bitmap scan seem like such a strange choice when "limit 1"

2020-09-04 Thread Laurenz Albe
On Fri, 2020-09-04 at 11:42 +, Klaudie Willis wrote:
> Postgres 13 beta3
> 
> set enable_bitmapscan=1; -- default
> explain (analyze,buffers) 
> select *
> from bigtable 
> where cars_ref = 1769854207 and t > '2020-01-01'::timestamp  and  t < 
> '2021-01-01'::timestamp 
> limit 1
> 
> Short story.  Big table > 100M rows. b-tree index on cars_ref, the t 
> constraints limits it to one partition, but I don't think that is very 
> relevant.  In any case, running this query takes 1.5s:
> 
> Limit  (cost=23728.33..23729.24 rows=1 width=635) (actual 
> time=1516.865..1516.867 rows=1 loops=1)
>   Buffers: shared hit=2376
>   ->  Bitmap Heap Scan on bigtable_y2020 bigtable (cost=23728.33..2530109.01 
> rows=2730872 width=635) (actual time=1516.863..1516.864 rows=1 loops=1)
> Recheck Cond: (cars_ref = 1769854207)
> Filter: ((t > '2020-01-01 00:00:00'::timestamp without time zone) AND 
> (t < '2021-01-01 00:00:00'::timestamp without time zone))
> Heap Blocks: exact=1
> Buffers: shared hit=2376
> ->  Bitmap Index Scan on bigtable_y2020_cars_ref_idx  
> (cost=0.00..23045.61 rows=2731965 width=0) (actual time=751.640..751.640 
> rows=2817675 loops=1)
>   Index Cond: (cars_ref = 1769854207)
>   Buffers: shared hit=2375
> Planning Time: 0.365 ms
> Execution Time: 1540.207 ms
> 
> 1.5 seconds seems a lot for a single indexed row.  I would think it should be 
> instant, and so it is when I disable bitmap scan with: set enable_bitmapscan=0
> 
> Limit  (cost=0.57..1.60 rows=1 width=636) (actual time=0.027..0.028 rows=1 
> loops=1)
>   Buffers: shared hit=5
>   ->  Index Scan using bigtable_y2020_cars_ref_idx on bigtable_y2020 bigtable 
> (cost=0.57..2966738.51 rows=2873818 width=636) (actual time=0.026..0.026 
> rows=1 loops=1)
> Index Cond: (cars_ref = 1769854207)
> Filter: ((t > '2020-01-01 00:00:00'::timestamp without time zone) AND 
> (t < '2021-01-01 00:00:00'::timestamp without time zone))
> Buffers: shared hit=5
> Planning Time: 0.291 ms
> Execution Time: 0.049 ms
> 
> But I am not supposed to disable bitmap scan!  So why on earth does Postgres 13 
> beta3 think that returning 1 row should be done with a bitmap scan?
> I noticed that different values for "cars_ref" result in different plans in 
> the query above.  I believe that it has to do with whether or not the cars_ref 
> value is in the "most common values" list.  But in
> any case, I can't see why a bitmap scan is wise when you expect one row.

PostgreSQL estimates that 2817675 rows satisfy the index condition and expects
that it will have to scan many of them before it finds one that satisfies the
filter condition.  That turns out to be a wrong guess.

You could create an index on (cars_ref, t), then PostgreSQL will certainly
pick an index scan.
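
A minimal sketch of that suggestion (the index name is an assumption; the index
could equally be created on the bigtable_y2020 partition directly):

create index bigtable_cars_ref_t_idx on bigtable (cars_ref, t);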

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Bitmap scan seem like such a strange choice when "limit 1"

2020-09-04 Thread Klaudie Willis
> PostgreSQL estimates that 2817675 rows satisfy the index condition and expects
> that it will have to scan many of them before it finds one that satisfies the
> filter condition. That turns out to be a wrong guess.
>
> You could create an index on (cars_ref, t), then PostgreSQL will certainly
> pick an index scan.


Thanks!  But, the t (time constraint) already isolates a particular partition.  
The bigtable is partitioned on exactly t, by year.  This is why you do not see 
any other indexes/partitions being queried in the EXPLAIN.

...
PARTITION BY RANGE (t)
...
CREATE TABLE public.bigtable_y2020 PARTITION OF public.bigtable
FOR VALUES FROM ('2020-01-01 00:00:00') TO ('2021-01-01 00:00:00');

To me, it seems like the filter on date is unnecessary when you already ARE on 
such a partition!

I'd like to add that when I run the same query DIRECTLY on bigtable_y2020 
(instead of on the partition parent), it does change to "index scan" again.

best regards
K





Check for duplicates before inserting new rows

2020-09-04 Thread Rich Shepard

This is a new issue for me: I've received data from a different source and
need to add non-duplicates to two tables in the database. Each row in the
new data has a station ID and associated measurements.

The existing database includes a table for station information and another
for measurements made there.

I want to learn which stations and which measurements in the new data set
are not included in the existing tables. And, I don't want to try inserting
the new data and have postgres tell me when it's found duplicates,
especially since there are two tables involved.

My research into how to do this has not found a solution so I ask for
pointers to resources that will teach me how to add these new data to both
tables.

Regards,

Rich





Re: Check for duplicates before inserting new rows

2020-09-04 Thread George Woodring
I would suggest creating a temp table based on the original table and
loading the data into it first.  You can then purge the duplicates.
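
A minimal sketch of that idea (table, column, and file names are illustrative,
not from the original message):

create temp table station_new (like station including all);
\copy station_new from 'new_stations.csv' csv header

-- purge rows that already exist in the real table, then insert the remainder
delete from station_new n
using station s
where s.station_id = n.station_id;

insert into station select * from station_new;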

George Woodring
iGLASS Networks
www.iglass.net


On Fri, Sep 4, 2020 at 9:21 AM Rich Shepard 
wrote:

> This is a new issue for me: I've received data from a different source and
> need to add non-duplicates to two tables in the database. Each row in the
> new data has a station ID and associated measurements.
>
> The existing database includes a table for station information and another
> for measurements made there.
>
> I want to learn which stations and which measurements in the new data set
> are not included in the existing tables. And, I don't want to try inserting
> the new data and have postgres tell me when it's found duplicates,
> especially since there are two tables involved.
>
> My research into how to do this has not found a solution so I ask for
> pointers to resources that will teach me how to add these new data to both
> tables.
>
> Regards,
>
> Rich
>
>
>
>


Re: Check for duplicates before inserting new rows

2020-09-04 Thread Rich Shepard

On Fri, 4 Sep 2020, George Woodring wrote:


I would suggest creating a temp table based on the original table and
loading the data into it first. You can then purge the duplicates.


George,

I hadn't thought of this. Using a duplicate table without a PK would work
well if there's only one attribute that needs checking.

In my case there are two tables involved: locations and measurements. So, I
can create a temporary table from the new data which holds only the
locations. I can then isolate the new location rows that do not already
exist in that table, then insert the new data to eliminate the duplicated
measurements.

Thanks,

Rich





Re: Bitmap scan seem like such a strange choice when "limit 1"

2020-09-04 Thread Tom Lane
Klaudie Willis  writes:
> I'd like to add, that when I do the same query DIRECTLY on the bigtable_y2020 
> (instead of the partition parent) it does change to "index scan" again.

Yeah.  I think the issue here is that add_paths_to_append_rel only
considers cheapest-total paths for the member relations.  Seeing
that it's already considering a slightly ridiculous number of
parallelization options, I'm hesitant to throw in cheapest-startup
considerations as well, for fear of blowing out planning time.

Maybe the right way to improve this is to bypass add_paths_to_append_rel
entirely when there's exactly one surviving child rel, and make it
just use all the surviving paths for that child.

regards, tom lane




Re: Check for duplicates before inserting new rows

2020-09-04 Thread Rich Shepard

On Fri, 4 Sep 2020, Olivier Gautherot wrote:


First of all, what version of PostgreSQL are you using?


Olivier,

12.2.


One way would be to add a UNIQUE constraint and perform for each row of the
source table a INSERT ... ON CONFLICT DO NOTHING (see
https://www.postgresql.org/docs/current/sql-insert.html#SQL-ON-CONFLICT)

If it is a 2-way merge, I would encapsulate the process in a function
(which will create a transaction to protect your process) and add a column
to trace the rows that have been merged. For this purpose, you can use the
ON CONFLICT DO UPDATE ...


Lots for me to ponder.

The new data has attributes stored in two database tables: location and
measurements. I'll need to think about how both could be handled in a single
function. For example, the new data contains a row with a location not
already in the existing location table. That location needs to be added to
the location table and its associated measurement attributes then added to
the existing measurements table.
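
A minimal sketch of how the quoted ON CONFLICT approach could cover both tables
(all table and column names are illustrative, not from the original messages;
it assumes a UNIQUE constraint on location(location_name)):

insert into location (location_name)
select distinct location_name from new_data
on conflict (location_name) do nothing;

insert into measurement (location_id, measured_at, reading)
select l.location_id, n.measured_at, n.reading
from new_data n
join location l using (location_name)
on conflict do nothing;  -- only skips duplicates if measurement has a suitable unique constraint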

Thanks for the suggestion,

Rich




Re: Dependency problem using community repo on Redhat 7

2020-09-04 Thread Adrian Klaver

On 9/3/20 11:31 AM, Lawrence Layhee wrote:

Hi,

We are having a dependency problem when using the community repo on redhat 7

When we install postgresql12-devel-12.4-1PGDG.rhel7.x86_64 we get the 
issue below.


Redhat doesn’t support the dependencies. Any ideas? We are trying to get 
the ruby gem installed. It works with the Redhat dist of PostgreSQL but 
we prefer community.


Any help would be appreciated

Finished Dependency Resolution

Error: Package: postgresql12-devel-12.4-1PGDG.rhel7.x86_64 
(/postgresql12-devel-12.4-1PGDG.rhel7.x86_64)


    Requires: llvm-toolset-7-clang >= 4.0.1

Error: Package: llvm5.0-devel-5.0.1-7.el7.x86_64 (epel)

    Requires: libedit-devel

Error: Package: postgresql12-devel-12.4-1PGDG.rhel7.x86_64 
(/postgresql12-devel-12.4-1PGDG.rhel7.x86_64)


    Requires: libicu-devel



Have you seen this?:

https://yum.postgresql.org/news/devel-rpms-require-a-new-repository/



Thank you, Larry

Lawrence Layhee

Database Administration

Midrange Computing Division

Los Angeles County Internal Services Department

(562) 658-1701 – Office

Cell (562)-302-3080

llay...@isd.lacounty.gov

Please consider the environment before printing this e-mail and/or attachments





--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Check for duplicates before inserting new rows

2020-09-04 Thread Chris Sterritt


On 04/09/2020 15:46, Rich Shepard wrote:

On Fri, 4 Sep 2020, Olivier Gautherot wrote:


First of all, what version of PostgreSQL are you using?


Olivier,

12.2.

One way would be to add a UNIQUE constraint and perform for each row of the
source table a INSERT ... ON CONFLICT DO NOTHING (see
https://www.postgresql.org/docs/current/sql-insert.html#SQL-ON-CONFLICT)

If it is a 2-way merge, I would encapsulate the process in a function
(which will create a transaction to protect your process) and add a column
to trace the rows that have been merged. For this purpose, you can use the
ON CONFLICT DO UPDATE ...


Lots for me to ponder.

The new data has attributes stored in two database tables: location and
measurements. I'll need to think about how both could be handled in a single
function. For example, the new data contains a row with a location not
already in the existing location table. That location needs to be added to
the location table and its associated measurement attributes then added to
the existing measurements table.

Thanks for the suggestion,

Rich




Assuming some simple table structures (I've not included PK or FK 
definitions for simplicity):


create table station (station_id integer, station_data text);

create table measurement (measurement_id bigserial, station_id integer, ameasurement text);

-- Add some test stations
insert into station(station_id, station_data)
values (1,'station1'),
       (2,'station2'),
       (3,'station3');

* Query to identify new measurements:
select station_id, ameasurement
from (VALUES (1,'1meas1'),   -- this represents your new test data set
             (1,'1meas2'),
             (2,'2meas1'),
             (3,'3meas1')) as m(station_id, ameasurement)
except
select station_id, ameasurement from measurement;

The query above will give you a list of the new values which are not currently 
stored in table measurement.
Run it and we get all 4 rows returned.

Add a measurement row:
insert into measurement(station_id,ameasurement)
values (2,'2meas1');

Now if you repeat the check for new measurements with the same query as at *, 
you only get 3 rows.

Cheers, Chris Sterritt



Re: Check for duplicates before inserting new rows

2020-09-04 Thread Chris Sterritt


On 04/09/2020 14:21, Rich Shepard wrote:
This is a new issue for me: I've received data from a different source and
need to add non-duplicates to two tables in the database. Each row in the
new data has a station ID and associated measurements.

The existing database includes a table for station information and another
for measurements made there.

I want to learn which stations and which measurements in the new data set
are not included in the existing tables. And, I don't want to try inserting
the new data and have postgres tell me when it's found duplicates,
especially since there are two tables involved.

My research into how to do this has not found a solution so I ask for
pointers to resources that will teach me how to add these new data to both
tables.

Regards,

Rich




To insert data into both tables:


drop table if exists station;
drop table if exists measurement;

create table station (station_id serial, station_data text);

create table measurement (measurement_id bigserial, station_id integer, ameasurement text);

insert into station(station_data)
values ('station1'),
       ('station2'),
       ('station3');

with src_data as (
         select station_data, ameasurement
         from (VALUES ('station1','meas1'),
                      ('station2','meas2'),
                      ('station3','meas3'),
                      ('station4','meas4')) as m(station_data, ameasurement)),
     ins_station as (
         insert into station (station_data)
         select station_data from src_data
         except
         select station_data from station
         returning station_id, station_data)
insert into measurement (station_id, ameasurement)
select s.station_id, sd.ameasurement
from src_data sd
join (SELECT station_id, station_data FROM ins_station
      UNION
      SELECT station_id, station_data FROM station) s using (station_data)
except
select station_id, ameasurement from measurement;

select * from station;
select * from measurement;


Regards,
Chris Sterritt



Re: Dependency problem using community repo on Redhat 7

2020-09-04 Thread Devrim Gündüz

Hi,

On Thu, 2020-09-03 at 18:31 +, Lawrence Layhee wrote:
> We are having a dependency problem when using the community repo on
> redhat 7
> When we install postgresql12-devel-12.4-1PGDG.rhel7.x86_64 we get the
> issue below.



https://yum.postgresql.org/news/devel-rpms-require-a-new-repository/

Regards,
-- 
Devrim Gündüz
Open Source Solution Architect, Red Hat Certified Engineer
Twitter: @DevrimGunduz , @DevrimGunduzTR




Can I get some advice regarding logical replication server?

2020-09-04 Thread Sang Gyu Kim
Hello Ranjan,

This is Sang Kim. I recently set up a logical replication server on
PostgreSQL 11 but got an error. While googling for the issue, I found some
pages where this email address was the responder.

Please give me some idea how to fix the issue on PostgreSQL 11.

I set up a logical replication server and it worked for several hours but
finally got stuck.
I googled it and someone recommended setting wal_sender_timeout and
wal_receiver_timeout longer, so I tested with zero, 1 min, 5 mins, 10 mins,
30 mins. None of them worked once I got the error "ERROR:  replication slot
"profprod01_sub2" is active for PID 8328".
Even though I bounced the logical replication server, streaming replication
from the master server stays alive and brings up the same error.
Please give me some idea or advice for this issue.
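
(A minimal diagnostic sketch, not from the original message: check which
backend currently holds the slot and, if it is genuinely stuck, terminate it
so the subscriber can reconnect.)

select slot_name, active, active_pid
from pg_replication_slots
where slot_name = 'profprod01_sub2';

select pg_terminate_backend(active_pid)
from pg_replication_slots
where slot_name = 'profprod01_sub2' and active;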

Thanks,

Sang

-- 
s...@webmd.net
646-674-5346 (desk)
732-829-2872 (cell)


Re: Check for duplicates before inserting new rows

2020-09-04 Thread Rich Shepard

On Fri, 4 Sep 2020, Chris Sterritt wrote:

Assuming some simple table structures (I've not included PK or FK definitions 
for simplicity):


Chris,

Thanks very much.

Stay well,

Rich




Re: 12.3 replicas falling over during WAL redo

2020-09-04 Thread Ben Chobot

Alvaro Herrera wrote on 8/3/20 4:54 PM:

On 2020-Aug-03, Ben Chobot wrote:


Alvaro Herrera wrote on 8/3/20 2:34 PM:

On 2020-Aug-03, Ben Chobot wrote:
dd if=16605/16613/60529051 bs=8192 count=1 seek=6501 of=/tmp/page.6501

If I use skip instead of seek

Argh, yes, I did correct that in my test and forgot to copy and paste.


      lsn      | checksum | flags | lower | upper | special | pagesize | version | prune_xid
---------------+----------+-------+-------+-------+---------+----------+---------+-----------
 A0A/99BA11F8  |     -215 |     0 |   180 |  7240 |    8176 |     8192 |       4 |         0

As I understand what we're looking at, this means the WAL stream was
assuming this page was last touched by A0A/AB2C43D0, but the page itself
thinks it was last touched by A0A/99BA11F8, which means at least one write
to the page is missing?

Yeah, that's exactly what we're seeing.  Somehow an older page version
was resurrected.  Of course, this should never happen.

So my theory has been proved.  What now?


Just to close the loop on this, we haven't seen the issue since we've 
stopped expanding our filesystems by moving LVM extents between devices, 
so while I don't know exactly where the bug lies, I feel it's quite 
likely not in Postgres.


Re: Bitmap scan seem like such a strange choice when "limit 1"

2020-09-04 Thread Michael Lewis
Index Scan using bigtable_y2020_cars_ref_idx on bigtable_y2020 bigtable
(cost=0.57..2966738.51 rows=2873818 width=636) (actual time=0.026..0.026
rows=1 loops=1)

Given the system expects to get almost 3 million rows when it should be
just 1, it seems like a stats problem to me. How is ndistinct on that
column compared to reality?
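
A minimal sketch for making that comparison (table and column names as used
earlier in the thread):

select n_distinct from pg_stats
where tablename = 'bigtable_y2020' and attname = 'cars_ref';

select count(distinct cars_ref) from bigtable_y2020;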