PITR based recovery in a primary/standby cluster setup

2019-05-21 Thread Abhijit Gharami
Hi,

We have a primary/standby PostgreSQL cluster setup, and we also have PITR
enabled on it. To improve the recovery time we are thinking of recovering
the database on both the primary and the standby at the same time.

These are the steps we are following:
  1. Restore the base backup to both the primary and standby servers
  2. Replay the WAL files on both primary and standby, and once the recovery
target is reached, stop the servers
  3. Start one of the servers as primary and the other one as standby


We have followed the above steps, but when we try to start the
servers as primary and standby we run into replication issues.

Could you please suggest what should be done here so that we can recover
the database on both the primary and the standby server?

We are using PostgreSQL version 9.6.12, and for PITR we are using WAL-E.

Regards,
Abhijit


no matching entries in passwd file

2019-05-21 Thread Daulat Ram
Hello team,

I have a database and users created inside a Docker container, but we are
getting a connection issue while trying to connect to the database using a
user created in Postgres.


docker exec -it -u test b8e7ejb1e31d bash

unable to find user test: no matching entries in passwd file

Regards,

Daulat



Re: no matching entries in passwd file

2019-05-21 Thread Fabio Pardi
Hi Daulat,

I believe that the error message is referring to the system user, not
the database one.


docker exec --help | grep -- -u
  -u, --user string  Username or UID (format:
<name|uid>[:<group|gid>])
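
In other words, -u must name an OS account that exists inside the
container's /etc/passwd; the database role is chosen separately with
psql -U. Something along these lines (the container id and role name are
from the thread; the "postgres" OS account is an assumption about the
image being the official one):

```
# List the OS users that actually exist inside the container:
docker exec b8e7ejb1e31d cat /etc/passwd

# Enter the container as an existing OS user, then connect as the
# *database* user "test" via psql -U:
docker exec -it -u postgres b8e7ejb1e31d psql -U test -d postgres
```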



regards,

fabio pardi


On 5/21/19 2:42 PM, Daulat Ram wrote:
> Hello team,
> 
>  
> 
> I have  database & users created inside the docker but we are getting
> connection issue while trying to connect to database using user created
> in postgres.
> 
>  
> 
> docker exec -it -u test b8e7ejb1e31d bash
> 
> unable to find user test: no matching entries in passwd file
> 
> Regards,
> 
> Daulat
> 
>  
> 




Re: PITR based recovery in a primary/standby cluster setup

2019-05-21 Thread Frank Alberto Rodriguez
Hello. I'm not sure which replication issues you have, and I have never
used WAL-E before, but I hit some issues with PostgreSQL 10 and Barman.
Try starting the primary server first; when it finishes recovery it
should start as primary. If not, go to the PostgreSQL data
directory, rename recovery.conf to recovery.done, and start the
server as primary.
Then start the standby server. When the recovery target is reached,
the standby server should not leave recovery: it should keep receiving
WAL through the restore_command, and its recovery.conf file should not
be renamed.
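
For reference, a sketch of the recovery.conf each node could carry after
restoring the base backup (the wal-e command line follows WAL-E's
documented restore_command; the target time and hostnames are
placeholders, not taken from the thread):

```
# Node that will become the primary: recover to the target; PostgreSQL
# renames this file to recovery.done once recovery completes.
restore_command = 'wal-e wal-fetch "%f" "%p"'
recovery_target_time = '2019-05-20 23:59:00'

# Node that will remain a standby: stay in recovery, keep fetching WAL,
# and follow the promoted primary's new timeline.
standby_mode = 'on'
restore_command = 'wal-e wal-fetch "%f" "%p"'
recovery_target_timeline = 'latest'
primary_conninfo = 'host=primary.example.com user=replicator'
```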
Regards
On Tue, 2019-05-21 at 14:27 +0530, Abhijit Gharami wrote:
> Hi,
> 
> We have primary and standby PostgreSQL cluster setup and also we have
> PITR enabled on it. To improve the recovery time we are thinking of
> recovering the database to both primary and standby at the same time.
> 
> These are the steps we are following:
>   1. Restore the base backup to the both primary and standby server 
>   2. Replay the WAL files on both primary and standby and once the
> recovery target is reached stop the servers 
>   3. Start one of the server as Primary and other one as standby
> 
> 
> We have followed the above steps but when we are trying to start the
> servers as primary and standby we are having replication issues.
> 
> Could you please suggest what should be done here so that we can
> recover the database in both primary as well as in  standby server?
> 
> We are using PostgreSQL version: 9.6.12 and for PITR we are using
> WAL-E.
> 
> Regards,
> Abhijit


Bulk inserts into two (related) tables

2019-05-21 Thread Rich Shepard

I'm cleaning and formatting an 800-line data file to be inserted into a
database. Some of the input file fields will be inserted into an
'organizations' table letting postgres assign sequential org_id numbers.
Other fields will be inserted into a separate 'people' table associated with
each organization. The people table insert allows postgres to assign the
person_id number and this table's foreign key, people.org_id ==
organizations.org_id.

I could insert all new rows into the organizations table, then produce a
list of the org_id and org_name to manually insert the foreign key in the
related people table. That is time- and effort-intensive.

Is there a way to insert the two tables sequentially without manually adding
the organizations.org_id to the appropriate foreign key column (people.org_id)
of the people table?

Rich




Re: Bulk inserts into two (related) tables

2019-05-21 Thread Adrian Klaver

On 5/21/19 9:56 AM, Rich Shepard wrote:

I'm cleaning and formatting a 800-line data file to be inserted into a
database. Some of the input file fields will be inserted into an
'organizations' table letting postgres assign sequential org_id numbers.
Other fields will be inserted into a separate 'people' table associated 
with

each organization. The people table insert allows postgres to assign the
person_id number and this table's foreign key, people.org_id ==
organizations.org_id.

I could insert all new rows into the organizations table, then produce a
list of the org_id and org_name to manually insert the foreign key in the
related people table. Time and effort intense.

Is there a way to insert the two tables sequentially without manually 
adding
the organizations.org_id to the appropriate foreign key column 
(people.org_id)

of the people table?


Well you are not going to know the org_id until the organization table 
is loaded, which means something like:


1) On the first run through the file, load the organizations table.

2) Build a mapping of org_id to organization.

3) Run through the data file again and load the people data, using the
mapping in 2) to provide the people.org_id.
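
Sketched in SQL, with assumed staging and column names (not from Rich's
actual schema), the three steps could look like:

```sql
-- 1) Load organizations; the serial column assigns org_id.
COPY organizations (org_name) FROM '/tmp/orgs.csv' WITH (FORMAT csv);

-- 2) The mapping org_name -> org_id now lives in the organizations table.

-- 3) Load people into a staging table, then resolve org_id by joining.
CREATE TEMP TABLE people_stage (person_name text, org_name text);
COPY people_stage FROM '/tmp/people.csv' WITH (FORMAT csv);

INSERT INTO people (person_name, org_id)
SELECT s.person_name, o.org_id
FROM people_stage s
JOIN organizations o USING (org_name);
```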




Rich





--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Re: Refresh Publication takes hours and doesn´t finish

2019-05-21 Thread PegoraroF10
I cannot, because we created the publication FOR ALL TABLES.



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html




Re: Refresh Publication takes hours and doesn´t finish

2019-05-21 Thread PegoraroF10
Restart Postgres means exactly what? We tried just restarting the service,
but when we tried to refresh the publication the old view was still used,
because it took 2 hours and gave us a timeout.

I found some people saying that I need to run initdb, but does initdb mean
recreating my database entirely, or just reinstalling my Postgres server?







Re: Bulk inserts into two (related) tables

2019-05-21 Thread Francisco Olarte
Rich:

On Tue, May 21, 2019 at 6:56 PM Rich Shepard  wrote:
> I'm cleaning and formatting a 800-line data file to be inserted into a
> database. Some of the input file fields will be inserted into an
> 'organizations' table letting postgres assign sequential org_id numbers.
> Other fields will be inserted into a separate 'people' table associated with
> each organization. The people table insert allows postgres to assign the
> person_id number and this table's foreign key, people.org_id ==
> organizations.org_id.
> I could insert all new rows into the organizations table, then produce a
> list of the org_id and org_name to manually insert the foreign key in the
> related people table. Time and effort intense.
> Is there a way to insert the two tables sequentially without manually adding
> the organizations.org_id to the appropriate foreign key column (people.org_id)
> of the people table?

From how you say it, I assume you have some data in your original
dumps which can relate both; let's assume it's org_name, but it may be an
org code. If you do not have it, it means you cannot match people to
orgs in your data, and all is lost.

You can do it in a couple of steps: first do a copy (org_id, org_name,
org.others) into the organizations table,

then create a temporary table and copy (people.id, people.org_name,
people.*) into it,

and then insert into people: select people.id, orgs.org_id as
people_org_id, people.* from tmp_people join orgs on
tmp_people.org_name = orgs.org_name,

and drop the temp table.

If the matching data is some kind of code you do not want in the final
organizations table, create a temporary organizations table, copy into
it, join it with the temporary people table, insert into the final
organizations table slicing with a select, and drop it.

If you are using a sequence or similar thing for generating
organization ids and use the second approach, remember to use the same
sequence to generate the default values into the temporary table, or
chaos will ensue ( not really, but you'll need to advance it manually
).

This is a classic problem, and the classic way to solve it is exactly that:
prepare a script which loads some temporary tables and then inserts joins
into the final ones. On a small data set like yours you can just put
everything into a single SQL script.

Francisco Olarte.




Re: Bulk inserts into two (related) tables

2019-05-21 Thread Rich Shepard

On Tue, 21 May 2019, Adrian Klaver wrote:

Well you are not going to know the org_id until the organization table is 
loaded, which means something like:


1) First run through file load the organizations table.

2) Build a mapping of org_id to organization.

3) Run through data file again and load people data using the mapping in 2) 
to provide the people.org_id.


Adrian,

That's what I planned to do. I wondered if there was a more direct way known
to those with more experience than I have. The short answer is 'no.'

Thanks,

Rich




Re: Bulk inserts into two (related) tables

2019-05-21 Thread Michael Lewis
For each row-
Insert into organizations table if the record does not exist, returning ID.
Insert into people using that ID.

Else, load all the data with an empty ID column on the person table, then
just update the person table afterward and drop the org name column.

Perhaps I am missing something.
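
Michael's row-at-a-time variant can be expressed with a data-modifying
CTE, assuming a unique constraint on org_name (the DO UPDATE no-op is
there so RETURNING yields the id even when the organization already
exists; names are illustrative):

```sql
WITH org AS (
    INSERT INTO organizations (org_name)
    VALUES ('Acme Water District')
    ON CONFLICT (org_name) DO UPDATE SET org_name = EXCLUDED.org_name
    RETURNING org_id
)
INSERT INTO people (person_name, org_id)
SELECT 'Bob', org_id FROM org;
```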


Re: Bulk inserts into two (related) tables

2019-05-21 Thread Rich Shepard

On Tue, 21 May 2019, Francisco Olarte wrote:


From how you say it, I assume you have some data in your original
dumps which can relate boths, lets assume it's org_name, but may be an
org-code. If you do not have it it means you cannot match people to
orgs in your data, all is lost.


Francisco,

Not yet with these new data.

I'll manually insert the org_id numbers from the organizations table into
the people table.

Thanks,

Rich




Re: Bulk inserts into two (related) tables

2019-05-21 Thread Adrian Klaver

On 5/21/19 10:22 AM, Rich Shepard wrote:

On Tue, 21 May 2019, Adrian Klaver wrote:

Well you are not going to know the org_id until the organization table 
is loaded, which means something like:


1) First run through file load the organizations table.

2) Build a mapping of org_id to organization.

3) Run through data file again and load people data using the mapping 
in 2) to provide the people.org_id.


Adrian,

That's what I planned to do. I wondered if there was a more direct way 
known

to those with more experience than I have. The short answer is 'no.'


The other way is to create the org_id for each organization ahead of 
time and put it into the data file. Either way you have to create the 
org_id for the FK relationship; it is just a matter of where and when.




Thanks,

Rich





--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Bulk inserts into two (related) tables

2019-05-21 Thread Rich Shepard

On Tue, 21 May 2019, Michael Lewis wrote:


For each row-
Insert into organizations table if the record does not exist, returning ID.
Insert into people using that ID.


Michael,

The org_id will not exist until I run the insert script.


Else, load all the data with empty ID column on person table,then just
update the person table afterward and drop the org name column.


Then I'll copy the org_id numbers to the appropriate row(s) in the people
table.

Thanks,

Rich




Re: Refresh Publication takes hours and doesn´t finish

2019-05-21 Thread Fabrízio de Royes Mello
On Tue, May 21, 2019 at 2:17 PM, PegoraroF10 wrote:
>
> Restart Postgres means exactly what ? We tried just restart the service but
> we tried to refresh publication the old view was used because it took 2 hours
> and gave us a timeout.
>

As I said before, to change the system catalog you should set
"allow_system_table_mods=on" and restart the PostgreSQL service.

After that you'll be able to recreate the "pg_catalog.pg_publication_tables"
system view. (You can use Tom's suggestion of using LATERAL.)

Regards,

--
   Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento


Re: Bulk inserts into two (related) tables

2019-05-21 Thread Rich Shepard

On Tue, 21 May 2019, Adrian Klaver wrote:


The other way is to create the org_id for each organization ahead of time
and put it into the data file. Either way you have to create the org_id for
the FK relationship, it is just a matter of where and when.


Adrian,

I had thought of that but overlooked it. The max(org_id) number is 338 so I
can assign sequential numbers above that as I format the data for the insert
scripts.

Thanks for the reminder,

Rich




Re: Re: Refresh Publication takes hours and doesn´t finish

2019-05-21 Thread Tom Lane
Fabrízio de Royes Mello  writes:
> As I said before to change system catalog you should set
> "allow_system_table_mods=on" and restart PostgreSQL service.
> After that you'll able to recreate the "pg_catalog.pg_publication_tables"
> system view. (You can use the Tom's suggestion using LATERAL)

It's a view, not a table, so I don't think you need
allow_system_table_mods.  A quick test here says that being
superuser is enough to do a CREATE OR REPLACE VIEW on it.

regards, tom lane




Re: Re: Refresh Publication takes hours and doesn´t finish

2019-05-21 Thread Fabrízio de Royes Mello
On Tue, May 21, 2019 at 2:41 PM, Tom Lane  wrote:
>
> Fabrízio de Royes Mello  writes:
> > As I said before to change system catalog you should set
> > "allow_system_table_mods=on" and restart PostgreSQL service.
> > After that you'll able to recreate the
"pg_catalog.pg_publication_tables"
> > system view. (You can use the Tom's suggestion using LATERAL)
>
> It's a view, not a table, so I don't think you need
> allow_system_table_mods.  A quick test here says that being
> superuser is enough to do a CREATE OR REPLACE VIEW on it.
>

Interesting, I tried the following commands and got an error:

postgres=# SELECT version();
                                         version
------------------------------------------------------------------------------
 PostgreSQL 11.3 (Debian 11.3-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled
by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit
(1 row)

postgres=# SELECT session_user;
 session_user
--------------
 postgres
(1 row)

postgres=# SHOW allow_system_table_mods ;
 allow_system_table_mods
-------------------------
 off
(1 row)

postgres=# CREATE OR REPLACE VIEW pg_catalog.pg_publication_tables AS
postgres-# SELECT
postgres-# P.pubname AS pubname,
postgres-# N.nspname AS schemaname,
postgres-# C.relname AS tablename
postgres-# FROM pg_publication P, pg_class C
postgres-#  JOIN pg_namespace N ON (N.oid = C.relnamespace),
postgres-#  LATERAL pg_get_publication_tables(P.pubname)
postgres-# WHERE C.oid = pg_get_publication_tables.relid;
ERROR:  permission denied: "pg_publication_tables" is a system catalog

But changing "allow_system_table_mods=on" works as expected:

postgres=# SHOW allow_system_table_mods ;
 allow_system_table_mods
-------------------------
 on
(1 row)

postgres=# CREATE OR REPLACE VIEW pg_catalog.pg_publication_tables AS
SELECT
P.pubname AS pubname,
N.nspname AS schemaname,
C.relname AS tablename
FROM pg_publication P, pg_class C
 JOIN pg_namespace N ON (N.oid = C.relnamespace),
 LATERAL pg_get_publication_tables(P.pubname)
WHERE C.oid = pg_get_publication_tables.relid;
CREATE VIEW

Regards,

--
   Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento


Re: Bulk inserts into two (related) tables

2019-05-21 Thread Ron

On 5/21/19 12:27 PM, Rich Shepard wrote:

On Tue, 21 May 2019, Michael Lewis wrote:


For each row-
Insert into organizations table if the record does not exist, returning ID.
Insert into people using that ID.


Michael,

The org_id will not exist until I run the insert script.


Else, load all the data with empty ID column on person table,then just
update the person table afterward and drop the org name column.


Then I'll copy the org_id numbers to the appropriate row(s) in the people
table.


A Perl/Python/etc script would solve your problem.
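
Such a script could look something like this minimal sketch (shown
against SQLite so it runs standalone; with psycopg2 against Postgres,
INSERT ... RETURNING org_id would replace lastrowid — the table and
column names are invented for illustration):

```python
import sqlite3

# Invented sample data: (org_name, person_name) pairs from the flat file.
rows = [
    ("Acme Corp", "Bob"),
    ("Acme Corp", "Janice"),
    ("Widget LLC", "Chris"),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE organizations (
        org_id   INTEGER PRIMARY KEY,
        org_name TEXT UNIQUE NOT NULL);
    CREATE TABLE people (
        person_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        org_id    INTEGER NOT NULL REFERENCES organizations);
""")

org_ids = {}  # org_name -> org_id, filled in as each org is first seen
for org_name, person in rows:
    if org_name not in org_ids:
        cur = conn.execute(
            "INSERT INTO organizations (org_name) VALUES (?)", (org_name,))
        org_ids[org_name] = cur.lastrowid  # id assigned by the database
    conn.execute("INSERT INTO people (name, org_id) VALUES (?, ?)",
                 (person, org_ids[org_name]))
conn.commit()
```

The dictionary plays the role of Adrian's step-2 mapping, built in a
single pass instead of two.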

--
Angular momentum makes the world go 'round.




Re: Bulk inserts into two (related) tables

2019-05-21 Thread Will Hartung

> On May 21, 2019, at 9:56 AM, Rich Shepard  wrote:
> 
> I could insert all new rows into the organizations table, then produce a
> list of the org_id and org_name to manually insert the foreign key in the
> related people table. Time and effort intense.

You can either use a script for the entire process, or, convert the people 
table to INSERT statements that have a SELECT for the foreign key as part of 
the insert.

INSERT INTO PEOPLE (id, name, org_id) VALUES (1, 'Bob', (SELECT org_id FROM org 
WHERE org_name='Main Office'))




Re: Re: Re: Refresh Publication takes hours and doesn´t finish

2019-05-21 Thread Tom Lane
Fabrízio de Royes Mello  writes:
> On Tue, May 21, 2019 at 2:41 PM, Tom Lane  wrote:
>> It's a view, not a table, so I don't think you need
>> allow_system_table_mods.  A quick test here says that being
>> superuser is enough to do a CREATE OR REPLACE VIEW on it.

> Interesting, I tried the following commands and got error:

Oh, huh, this is something that changed recently in HEAD ---
since commit 2d7d946cd, stuff created by system_views.sql
is not protected as though it were a system catalog.

So in released versions, yes you need allow_system_table_mods=on.
Sorry for the misinformation.

regards, tom lane




Re: Bulk inserts into two (related) tables

2019-05-21 Thread Rich Shepard

On Tue, 21 May 2019, Will Hartung wrote:


You can either use a script for the entire process, or, convert the people
table to INSERT statements that have a SELECT for the foreign key as part
of the insert.

INSERT INTO PEOPLE (id, name, org_id) VALUES (1, 'Bob', (SELECT org_id
FROM org WHERE org_name='Main Office'))


Thanks, Will. That's a good alternative.

Regards,

Rich




Re: Bulk inserts into two (related) tables

2019-05-21 Thread Jeremy Finzel
On Tue, May 21, 2019 at 12:24 PM Rich Shepard 
wrote:

> On Tue, 21 May 2019, Francisco Olarte wrote:
>
> > From how you say it, I assume you have some data in your original
> > dumps which can relate boths, lets assume it's org_name, but may be an
> > org-code. If you do not have it it means you cannot match people to
> > orgs in your data, all is lost.
>
> Francisco,
>
> Not yet with these new data.
>
> I'll manually insert the org_id numbers from the organizations table into
> the people table.
>

To me, this is the key to your problem, and what will either make this a
time saver or a time waster. Somehow you are accounting for what uniquely
identifies organizations, right?

Say there are 5 fields that correspond to an organization.  I assume then
you are creating only one new org_id for each unique combination of these
fields?

Then take Francisco's suggestion, and use an md5 of the organization
fields to create yourself a unique identifier. Then you can use ctid
(a unique internal identifier for each row) to join back. You would use
SQL like this:

SELECT md5(row(org_name, org_stuff_1, org_stuff_2)::text) AS hash_identifier
FROM table;

Assume in example below that your unique "org" rows are the first 3 fields:

1. Load your data file into a loading table like so:
CREATE TABLE loader (org_name text, org_stuff_1 text, org_stuff_2 text,
person_name text);
\copy loader from 'my_data.csv' with csv header
ALTER TABLE loader ADD COLUMN org_id INT;

Example data:
INSERT INTO loader VALUES ('a', ' ', ' ', 'Jerry');
INSERT INTO loader VALUES ('a', ' ', 'b', 'Bob');
INSERT INTO loader VALUES ('a', ' ', 'b', 'Janice');
INSERT INTO loader VALUES ('a', ' ', 'c', 'Chris');
INSERT INTO loader VALUES ('b', ' ', 'c', 'Jason');
INSERT INTO loader VALUES ('a', ' ', ' ', 'Alice');

2. Load org table:
test=# CREATE TABLE organizations (org_id serial primary key, org_name
text, org_stuff_1 text, org_stuff_2 text);
CREATE TABLE
test=# INSERT INTO organizations (org_name, org_stuff_1, org_stuff_2)
test-# SELECT DISTINCT org_name, org_stuff_1, org_stuff_2
test-# FROM loader;
INSERT 0 4

3. Build mapping directly and update:
-- build hash of org fields in loader table, take ctid in order to map back
later
WITH map_source AS (
SELECT ctid, md5(row(org_name, org_stuff_1, org_stuff_2)::text) AS
hash_identifier
FROM loader)

-- build hash of org fields in organizations table to join back to loader
and bring in org_id of course
, map_org AS (
SELECT org_id, md5(row(org_name, org_stuff_1, org_stuff_2)::text) AS
hash_identifier
FROM organizations)

-- map by joining together on hash_identifier
, final_map AS (
SELECT org_id, ctid
FROM map_source l
INNER JOIN map_org o USING (hash_identifier)
)

-- Perform update
UPDATE loader l
SET org_id = fm.org_id
FROM final_map fm
WHERE fm.ctid = l.ctid;

Final data ready for the person table to be populated:
test=# table organizations;
 org_id | org_name | org_stuff_1 | org_stuff_2
--------+----------+-------------+-------------
      1 | a        |             | b
      2 | a        |             |
      3 | a        |             | c
      4 | b        |             | c
(4 rows)

test=# table loader;
 org_name | org_stuff_1 | org_stuff_2 | person_name | org_id
----------+-------------+-------------+-------------+--------
 a        |             |             | Jerry       |      2
 a        |             | b           | Bob         |      1
 a        |             | b           | Janice      |      1
 a        |             | c           | Chris       |      3
 b        |             | c           | Jason       |      4
 a        |             |             | Alice       |      2
(6 rows)


Hope this helps!
Thanks,
Jeremy


Re: Loading table with indexed jsonb field is stalling

2019-05-21 Thread Will Hartung


> On May 20, 2019, at 5:31 PM, Tom Lane  wrote:
> 
> Well, you're the only one who's seen this problem, and none of the
> rest of us have any idea how to reproduce it.  So if you want something
> to get done in a timely fashion, it's up to you to show us a test case.

So, we had success.

We looked in to maintenance_work_mem.

By default, this value is 64MB.

Our test on AWS was against a small machine, 2 CPU, 16G.

My VM is however many cores it gets (not that it matters) and 4G of RAM.

My VM used the 64MB value for maintenance_work_mem. The AWS small VM used 247M.

We killed the rebuild on the small machine, it was pushing 67 hours.

I was running a load on my machine, and it was still making progress 
overnight, at around 17 hours. It was moving, but had only loaded 2.2M 
rows in that time.

We grabbed one of the huge AWS instances. 64 CPU, 488G of ram. Just, silly.

But, that’s fine — I’m good with silly.

Its maintenance_work_mem was ~8G.

And we loaded all of my files on that instance in about 5.5 hours, about 9M 
rows per hour.

So, obviously, maintenance_work_mem was the smoking gun. Since I don’t know the 
details of GIN indexing, it’s not clear to me how maintenance_work_mem is 
utilized with GIN index builds, but obviously 64M is “not enough”, nor is 
247M. And 8G is certainly enough.

We’re cautious just setting these values “to 11” because of the fact that 
multiple connections can utilize them, so it seems to me that it’s important 
that they be “generous enough”, but not overly generous.
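
One way to get a large value for the one-off rebuild without being
generous to every connection is to raise the setting only in the session
doing the build (the index and table names here are stand-ins):

```sql
-- Applies to this session only; other backends keep the server default.
SET maintenance_work_mem = '4GB';
CREATE INDEX CONCURRENTLY idx_docs_payload ON docs USING gin (payload);
RESET maintenance_work_mem;
```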

So this is good news: no bug, and, perhaps, left to its own devices, the DB 
would have eventually built this index. Whether it would have done so before 
universal heat death is a different question.

Can anyone discuss how the maintenance_work_mem is utilized during GIN index 
creation? On our production systems, this value is set to 1G. And we don’t seem 
to have any problems for day to day work. This is a very busy table, and we 
have thousands of inserts/updates daily which seem to proceed well enough. I 
have not tried to rebuild this index on this system, so I can’t say if 1G is 
enough to rebuild this index efficiently or not. But its efficient enough for 
our transaction load.

Now, that said, should I ever be in that situation of having to recover this 
table like this, I’d have no problem cranking that value up high since it would 
be the only real connection on the system anyway.

But I’m still curious how the memory is utilized during index builds just to 
have a better understanding of the nuances of the system.

Thanks all for your help.






distinguish update from insert (on conflict)

2019-05-21 Thread Justin Pryzby
Is it still impossible to distinguish whether a row was inserted vs updated ?

The latest I can see is here:
https://wiki.postgresql.org/wiki/UPSERT#RETURNING_behavior

..but I'm hopeful that the 4 year old wiki page is out of date.

Justin




Re: distinguish update from insert (on conflict)

2019-05-21 Thread Adrian Klaver

On 5/21/19 6:34 PM, Justin Pryzby wrote:

Is it still impossible to distinguish whether a row was inserted vs updated ?


You will need to be more specific.

On a hunch, see transition relation info here:
https://www.postgresql.org/docs/10/sql-createtrigger.html



The latest I can see is here:
https://wiki.postgresql.org/wiki/UPSERT#RETURNING_behavior

..but I'm hopeful that the 4 year old wiki page is out of date.

Justin






--
Adrian Klaver
adrian.kla...@aklaver.com




Re: distinguish update from insert (on conflict)

2019-05-21 Thread Justin Pryzby
On Tue, May 21, 2019 at 06:57:36PM -0700, Adrian Klaver wrote:
> On 5/21/19 6:34 PM, Justin Pryzby wrote:
> >Is it still impossible to distinguish whether a row was inserted vs updated ?
> 
> You will need to be more specific.

Sorry, I mean with UPSERT / "INSERT .. ON CONFLICT DO UPDATE", is it possible
to tell whether a row was inserted vs. updated ?

Thanks,
Justin
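

One widely circulated workaround (undocumented, and it relies on the
xmax system column, so treat it as an implementation detail rather than
guaranteed behavior; table and column names are illustrative) answers
exactly this question in the RETURNING clause:

```sql
INSERT INTO t (id, val)
VALUES (1, 'x')
ON CONFLICT (id) DO UPDATE SET val = EXCLUDED.val
RETURNING id, (xmax = 0) AS inserted;  -- true if inserted, false if updated
```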