Re: Problem with stored procedure and nested transactions

2018-11-03 Thread Peter Eisentraut
On 30/10/2018 15:03, p.piero...@mmbb.it wrote:
> I thought that the “BEGIN/END” block was used to create new transactions
> and that each of them could be managed individually.

In PL/pgSQL, BEGIN/END just creates syntactic blocks; it doesn't manage
transactions.

COMMIT and ROLLBACK manage top-level transactions, but those cannot be
nested (since they are top-level).

In order to create a nested transaction structure, you need to use
subtransactions.  In PL/pgSQL, you can use BEGIN/END blocks with an
exception clause to create subtransactions.
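
For example, a minimal sketch (table and procedure names are made up):

CREATE TABLE t (x int);

CREATE PROCEDURE demo()
LANGUAGE plpgsql
AS $$
BEGIN
    INSERT INTO t VALUES (1);
    BEGIN
        -- the EXCEPTION clause below turns this block into a
        -- subtransaction
        INSERT INTO t VALUES (2);
        RAISE EXCEPTION 'undo only this block';
    EXCEPTION WHEN others THEN
        NULL;  -- the INSERT of 2 is rolled back here
    END;
    -- the INSERT of 1 is unaffected
END;
$$;

CALL demo();  -- afterwards, t contains only the row with x = 1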

I'm not sure what your code is actually trying to do, but you might need
to reorganize it a bit.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: backend crash on DELETE, reproducible locally

2018-11-03 Thread Karsten Hilbert
On Fri, Nov 02, 2018 at 11:56:58PM -0400, Tom Lane wrote:

> > On Thu, Nov 01, 2018 at 11:43:56AM -0400, Tom Lane wrote:
> >> Yeah, apparently we've passed a null OLD tuple to an RI_FKey_cascade_del
> >> trigger, which surely shouldn't happen.  It'd be interesting to look at
> >> the set of triggers on this table.  I don't entirely trust psql's \d
> >> to show us reality if there's something screwed up about the triggers,
> >> so in addition to \d output, could we see
> >> select * from pg_trigger where tgrelid  = 'ref.auto_hint'::regclass;
> 
> > [ pretty normal-looking trigger entries ]
> 
> I was feeling baffled about this, but it suddenly occurs to me that maybe
> the bug fixed in 040a1df61/372102b81 explains this.  If the particular
> tuple you're trying to delete predates the last ALTER TABLE ADD COLUMN
> operation on the table, then this bug would result in t_self getting
> set to "invalid", which'd cause AfterTriggerSaveEvent to save "ate_ctid1"
> as "invalid", which'd lead to a null tuple getting passed when the trigger
> eventually gets invoked.

Now, that is a theory I can follow up on -- all the
bootstrapping SQL scripts are under version control so I can
pin down the exact sequence of events.

... goes off to do some checks ...

And sure enough there's an ALTER TABLE ADD COLUMN related to
that table:

alter table ref.auto_hint
add column recommendation_query text;

-- (audit log table needs to get the new column, too)
alter table audit.log_auto_hint
add column recommendation_query text;

before the DELETE of the pre-existing tuple.

When running a rigged upgrade that stops right before those
ALTER TABLEs and then doing the following:

begin;
-- instrument:
insert into ref.auto_hint (title, query, hint, source, lang) values (
'DELETE test',
'select 1;',
'insertion before ADD COLUMN',
'testing',
'en'
);
-- works:
DELETE from ref.auto_hint where title = 'DELETE test';
-- instrument, again:
insert into ref.auto_hint (title, query, hint, source, lang) values (
'DELETE test',
'select 1;',
'insertion before ADD COLUMN',
'testing',
'en'
);
alter table ref.auto_hint
add column recommendation_query text;
-- audit log table needs to get the message
alter table audit.log_auto_hint
add column recommendation_query text;
-- fails:
DELETE from ref.auto_hint where title = 'DELETE test';
rollback;

the expected segfault does indeed occur.

Conversely, moving the offending

DELETE FROM ref.auto_hint WHERE title = 'Kontraindikation: ACE/Sartan <-> Schwangerschaft';

to right before the ALTER TABLEs makes the full upgrade run
through without further problems.

Looking at 040a1df61/372102b81, that fix does seem to fit the bill.

So, I guess I can work around the issue by the above
manoeuvre and report back once 040a1df61/372102b81 is
released.

Anything else you'd want me to look into?

Many thanks,
Karsten
-- 
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B



Re: backend crash on DELETE, reproducible locally

2018-11-03 Thread Tom Lane
Karsten Hilbert  writes:
> On Fri, Nov 02, 2018 at 11:56:58PM -0400, Tom Lane wrote:
>> I was feeling baffled about this, but it suddenly occurs to me that maybe
>> the bug fixed in 040a1df61/372102b81 explains this.

> So, I guess I can work around the issue by the above
> manoeuvre and report back once 040a1df61/372102b81 is
> released.
> Anything else you'd want me to look into ?

Don't suppose you'd want to build a server with 372102b81 applied
and see if it works?
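
Roughly, that would be (a sketch only; branch and prefix are
illustrative, and REL_11_STABLE already contains the fix):

git clone https://git.postgresql.org/git/postgresql.git
cd postgresql
git checkout REL_11_STABLE
./configure --prefix=$HOME/pg-patched
make && make install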

regards, tom lane



Re: Logical replication hangs up.

2018-11-03 Thread Jerry Sievers
Aleš Zelený  writes:

> Hello,
>
> we are using logical replication on 10.4 and it now hangs. After
> some timeout it is restarted again, replays 18GB of data, and then
> hangs (while 7GB of WAL remains to be processed).

Timeout...

Have a look at the two settings wal_sender_timeout and
wal_receiver_timeout; you probably need to raise the sender timeout
value.
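
For example (a sketch; the value is just a placeholder), on the
publisher:

ALTER SYSTEM SET wal_sender_timeout = '5min';
SELECT pg_reload_conf();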

HTH

>
> The backlog of 18GB comes from a failed migration that added a new table
> to replication while the replication user had not been granted SELECT on
> the table. This was clear from the log files, and once resolved by adding
> the SELECT privilege, I thought that all would work as usual (the same
> happened in the test env. many times, and adding the missing grant for
> SELECT was sufficient to get it working... those were issues on tests).
>
> RDBMS Version:
> PostgreSQL 10.4 (Ubuntu 10.4-2.pgdg16.04+1) on x86_64-pc-linux-gnu,
> compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609,
> 64-bit
>
> Publication (master) instance error message from log:
>
> 2018-11-02 18:34:14 UTC 7974 5bdc8d27.1f26 7 192.168.23.11(58884)
> master_prod repusr 0 5/0 sub_eur idle [0]:LOG:  terminating
> walsender process due to replication timeout
>
> Subscription instance log:
>
> 2018-11-02 18:34:14 UTC 8657 5bdc8d26.21d1 2    0 3/0  
> [XX000]:ERROR:  could not receive data from WAL stream: SSL
> connection has been closed unexpectedly
> 2018-11-02 18:34:14 UTC 24699 5b923b1c.607b 1209397    0   
> [0]:LOG:  worker process: logical replication worker for
> subscription 37932 (PID 8657) exited with exit code 1
>
> While it seems to be a network issue, it may well not be - we have
> checked the network and even the monitoring; packets were being exchanged
> the whole time.
>
> We do have 3 subscriptions (thus 3 replication slots) for one
> publication; one subscriber instance is within the same datacenter as the
> master, the remaining subscribers are remote.
>
>
>>> select * from pg_replication_slots
>  slot_name | plugin   | slot_type | datoid | database    | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn  | confirmed_flush_lsn
> -----------+----------+-----------+--------+-------------+-----------+--------+------------+------+--------------+--------------+---------------------
>  sub_usd   | pgoutput | logical   | 16421  | master_prod | False     | True   | 16604      |      | 5536488      | 426/AAE55A68 | 426/AAE55A68
>  sub_cad   | pgoutput | logical   | 16421  | master_prod | False     | True   | 22875      |      | 5536488      | 426/AAE55A68 | 426/AAE55A68
>  sub_eur   | pgoutput | logical   | 16421  | master_prod | False     | True   | 16605      |      | 5536488      | 426/AAE55A68 | 426/AAE55A68
>
>
> Once the error occurs, the WAL senders are restarted and read WAL files
> until they reach the restart_lsn WAL file:
>
>>> select pg_walfile_name('426/AAE55A68')
> +--------------------------+
> |     pg_walfile_name      |
> +--------------------------+
> | 0000000100000426000000AA |
> +--------------------------+
>
> # the open file names keep changing until they reach this one:
> root@master-db:/pgsql/pgcluster/10/master_prod# lsof -p 1560 -p 5758 -p 5790 | grep pg_wal
> postgres 1560 postgres   10r  REG  259,3  16777216 115766007 /pgsql/pgcluster/10/master_prod/pg_wal/000000010000042900000069
> postgres 5758 postgres   10r  REG  259,3  16777216 115766007 /pgsql/pgcluster/10/master_prod/pg_wal/000000010000042900000069
> postgres 5790 postgres   10r  REG  259,3  16777216 115766007 /pgsql/pgcluster/10/master_prod/pg_wal/000000010000042900000069
>
>
> At this moment sent_lsn stops growing and nothing is happening for a
> while.
>
> select * from pg_stat_replication;
>
>   pid  | usesysid | usename | application_name | client_addr   | client_hostname | client_port |         backend_start         | backend_xmin |  state  |   sent_lsn   |  write_lsn   |  flush_lsn   |  replay_lsn  | write_lag | flush_lag | replay_lag | sync_priority | sync_state
> -------+----------+---------+------------------+---------------+-----------------+-------------+-------------------------------+--------------+---------+--------------+--------------+--------------+--------------+-----------+-----------+------------+---------------+------------
>  16604 |    37868 | repusr  | sub_usd          | 192.168.21.11 |                 |       35010 | 2018-11-02 23:52:22.059157+00 |              | catchup | 429/69E9CC60 | 426/AAE55A68 | 426/AAE55A68 | 426/AAE55A68 |           |           |            |             0 | async
>  16605 |    37868 | repusr  | sub_eur          | 192.168.23.11 |                 |       36388 | 2018-11-0

Re: backend crash on DELETE, reproducible locally

2018-11-03 Thread Karsten Hilbert
On Sat, Nov 03, 2018 at 11:39:49AM -0400, Tom Lane wrote:

> Karsten Hilbert  writes:
> > On Fri, Nov 02, 2018 at 11:56:58PM -0400, Tom Lane wrote:
> >> I was feeling baffled about this, but it suddenly occurs to me that maybe
> >> the bug fixed in 040a1df61/372102b81 explains this.
> 
> > So, I guess I can work around the issue by the above
> > manoeuvre and report back once 040a1df61/372102b81 is
> > released.
> > Anything else you'd want me to look into ?
> 
> Don't suppose you'd want to build a server with 372102b81 applied
> and see if it works?

Yeah, I feared that answer ;-)

However, I lack the skills and the machine(s) to do so ...

I will, at any rate, report back when the existing fix is released.

Karsten
-- 
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B



Re: backend crash on DELETE, reproducible locally

2018-11-03 Thread obouda
Hello,



we hit exactly the same problem after upgrading to PostgreSQL 11 - the
server crashed on a DELETE statement with a trigger. We also observed an
AFTER DELETE trigger receiving NULL values in OLD. Now I see the problem
seems to be solved (theoretically). Unfortunately, we are not able to
build the server with the patch, so we cannot confirm that. However, when
we just copied the database (within the same server), the same DELETE
executed on the copy with no problems.

I would like to ask, however: could the same problem arise from an UPDATE
statement (also on a table with an AFTER trigger), or would that be
another bug (technical details below)?

As the bug causes our production servers to segfault several times a day,
we'd like to solve the problem as soon as possible. Do I understand it
correctly that if we dump and restore the database, the bug should not
occur (until the next ALTER TABLE ADD COLUMN - which we'll avoid until
the patch is released)?

With the UPDATE, we caught this (we also have the core dump and could
provide it privately if useful):

#0  __memcmp_sse4_1 () at ../sysdeps/x86_64/multiarch/memcmp-sse4.S:793
No locals.
#1  0x7f6a6b993296 in datumIsEqual (value1=7, value2=7, typByVal=<optimized out>, typLen=16) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/utils/adt/datum.c:249
        size1 = 16
        size2 = <optimized out>
        s1 = 0x7
        s2 = 0x7
        res = <optimized out>
#2  0x7f6a6b6b89de in ProjIndexIsUnchanged (newtup=0x7f6a6d504e80, oldtup=0x7ffc9c1c1270, relation=0x7f6a6b4c1738) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/access/heap/heapam.c:4539
        att = <optimized out>
        indexOid = <optimized out>
        indexDesc = 0x7f6a6b4c5008
        indexInfo = 0x7f6a6d505ca8
        i = 0
        indexoidlist = <optimized out>
        econtext = 0x7f6a6d4c5080
        new_isnull = {false, false, false, 156, 252, 127, false, false, 44, 29, 128, 107, 106, 127, false, false, 116, 54, 76, 107, 106, 127, false, false, 123, false, false, false, false, false, false, false}
        slot = 0x7f6a6d507b78
        equals = true
        old_isnull = {false, false, false, 109, 106, 127, false, false, 165, 199, 80, 109, 106, 127, false, false, 160, 199, 80, 109, 106, 127, false, false, 168, 199, 80, 109, 106, 127, false, false}
        new_values = {7, 4, 140095077240522, 16, 140722927572064, 140095077271456, 140095077272720, 140722927572079, 0, 0, 140722927572048, 140095054460096, 140095077060032, 140095077242520, 5, 4, 140722927572096, 140095046814123, 140722927572096, 140095046825162, 140095043410520, 140095077060880, 140722927572224, 140095046865098, 60129542543, 23274985272, 140095043410520, 140095077060032, 140095043409720, 140095077060304, 140095077237568, 140095043414504}
        indexno = 1
        l = 0x7f6a6d50d180
        estate = 0x7f6a6d4c4e70
        old_values = {7, 5, 140078657614882, 1, 140095077060032, 1024, 140095077271456, 0, 140722927571760, 140095049517261, 140722927571808, 140722927571776, 140722927571792, 140095046814329, 18, 140095077271464, 140095076986224, 140095077237568, 22, 100, 140722927571856, 140722927571920, 140095046831813, 140722927571856, 140095046842624, 0, 140095077265656, 3, 140095077271456, 140095077271792, 140095077271456, 140095077242520}
#3  heap_update (relation=relation@entry=0x7f6a6b4c1738, otid=otid@entry=0x7ffc9c1c1690, newtup=newtup@entry=0x7f6a6d504e80, cid=0, crosscheck=<optimized out>, wait=wait@entry=true, hufd=hufd@entry=0x7ffc9c1c15b0, lockmode=lockmode@entry=0x7ffc9c1c15a4) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/access/heap/heapam.c:4230
        result = <optimized out>
        xid = 245028971
        hot_attrs = 0x7f6a6d50d0a0
        proj_idx_attrs = 0x7f6a6d50d0b8
        key_attrs = 0x7f6a6d50d0d0
        id_attrs = 0x7f6a6d50d0e8
        interesting_attrs = 0x7f6a6d50d100
        lp = <optimized out>
        oldtup = {t_len = 81, t_self = {ip_blkid = {bi_hi = 0, bi_lo = 9}, ip_posid = 71}, t_tableOid = 65471913, t_data = 0x7f669aa0b3f0}
        heaptup = 0x7f6a6d504e80
        old_key_tuple = 0x0
        old_key_copied = false
        page = 0x7f669aa0b180 "=="
        block = <optimized out>
        mxact_status = <optimized out>
        buffer = <optimized out>
        newbuf = 17858
        vmbuffer = 0
        vmbuffer_new = 0
        need_toast = <optimized out>
        newtupsize = <optimized out>
        pagefree = <optimized out>
        have_tuple_lock = false
        iscombo = false
        use_hot_update = false
        hot_attrs_checked = <optimized out>
        key_intact = <optimized out>
        all_visible_cleared = false
        all_visible_cleared_new = false
        checked_lockers = <optimized out>
        locker_remains = <optimized out>
        xmax_new_tuple = <optimized out>
        xmax_old_tuple = 245028971
        infomask_old_tuple = 0
        infomask2_old_tuple = 8192
        infomask_new_tuple = 144
        infomask2_new_tuple = 0
        __func__ = "heap_update"
#4  0x7f6a6b82f2c9 in ExecUpdate (mtstate=mtstate@en

Re: backend crash on DELETE, reproducible locally

2018-11-03 Thread Tom Lane
obo...@email.cz writes:
> we hit exactly the same problem after upgrading to PostgreSQL 11 - the
> server crashed on a DELETE statement with a trigger. We also observed an
> AFTER DELETE trigger receiving NULL values in OLD. Now I see the problem
> seems to be solved (theoretically). Unfortunately, we are not able to
> build the server with the patch, so we cannot confirm that. However, when
> we just copied the database (within the same server), the same DELETE
> executed on the copy with no problems.

Yeah, dump/reload would make the problem go away.

> I would like to ask, however: could the same problem arise from an UPDATE
> statement (also on a table with an AFTER trigger), or would that be another
> bug (technical details below)?

Same bug.

regards, tom lane



Trouble Upgrading Postgres

2018-11-03 Thread Charles Martin
I'd be grateful for some help. I am trying to move a large database from
PostgreSQL 9.6 on Centos 6 to a different server using PostgreSQL 11 on
Centos 7. I can't do a pg_dump because it always fails on the largest
table. So I tried to do pg_basebackup and copy that to the new PG 11
server. Except that pg_upgrade expects the new and old versions of PG to
be side-by-side. So I installed 9.6 on the new server, ran initdb,
verified that it started, then stopped it and edited the postgresql.conf
data path to the location of the pg_basebackup files. Then 9.6 would no
longer start. So how can I get my PG 9.6 data into a new PG 11 database?
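
For reference, the side-by-side invocation I understand pg_upgrade to
want looks something like this (a sketch; paths assume the stock PGDG
packages):

/usr/pgsql-11/bin/pg_upgrade \
  --old-bindir=/usr/pgsql-9.6/bin \
  --new-bindir=/usr/pgsql-11/bin \
  --old-datadir=/var/lib/pgsql/9.6/data \
  --new-datadir=/var/lib/pgsql/11/data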

Probably related to my troubles are my attempts to get replication set up.
But before I dive back into that, I thought I'd better try getting my 9.6
data into the new 9.6 server, then run PG 11's pg_upgrade and mount the
data in PG 11. Then maybe I can get replication started.

I've read that logical replication can be used to migrate from 9.6 to 11,
but haven't found any documentation on doing that.

Chuck Martin


Re: Trouble Upgrading Postgres

2018-11-03 Thread Ron

On 11/03/2018 02:57 PM, Charles Martin wrote:
I'd be grateful for some help. I am trying to move a large database from 
PostgreSQL 9.6 on Centos 6 to a different server using PostgreSQL 11 on 
Centos 7. I can't do a pg_dump because it always fails on the largest table.


What error message?


--
Angular momentum makes the world go 'round.


Re: backend crash on DELETE, reproducible locally

2018-11-03 Thread Ron

On 11/03/2018 02:19 PM, obo...@email.cz wrote:

Hello,

we hit exactly the same problem after upgrading to PostgreSQL 11 - the
server crashed on a DELETE statement with a trigger. We also observed an
AFTER DELETE trigger receiving NULL values in OLD. Now I see the problem
seems to be solved (theoretically). Unfortunately, we are not able to
build the server with the patch, so we cannot confirm that. However, when
we just copied the database (within the same server), the same DELETE
executed on the copy with no problems.

I would like to ask, however: could the same problem arise from an UPDATE
statement (also on a table with an AFTER trigger), or would that be
another bug (technical details below)?

As the bug causes our production servers to segfault several times a day,
we'd like to solve the problem as soon as possible. Do I understand it
correctly that if we dump and restore the database, the bug should not
occur (until the next ALTER TABLE ADD COLUMN - which we'll avoid until
the patch is released)?


You can dump that single table, truncate the table (presuming no FKs) and 
then restore it.
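
Something like this, say (a sketch; table and database names are
placeholders, and the TRUNCATE assumes no FKs pointing at the table):

pg_dump --data-only -t public.mytable -f mytable_data.sql mydb
psql -d mydb -c 'TRUNCATE public.mytable'
psql -d mydb -f mytable_data.sql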


--
Angular momentum makes the world go 'round.


Re: Trouble Upgrading Postgres

2018-11-03 Thread Adrian Klaver

On 11/3/18 12:57 PM, Charles Martin wrote:
I'd be grateful for some help. I am trying to move a large database from 
PostgreSQL 9.6 on Centos 6 to a different server using PostgreSQL 11 on 
Centos 7. I can't do a pg_dump because it always fails on the largest 
table. 


I would answer Ron's question on this first as solving it would be the 
easiest fix.


So I tried to do pg_basebackup and copy that to the new PG 11
server. Except that pg_upgrade expects the new and old versions of PG to
be side-by-side. So I installed 9.6 on the new server, ran initdb,


This is probably the issue: you now have two 9.6 data directory
instances, the one you created with initdb and the one that came over
with pg_basebackup. I am guessing the editing below has left the server
in a confused state about which directory to use. The error messages you
got when trying to restart the server would be helpful.


verified that it started, then stopped it and edited postgresql.conf 
data path to the location of the pg_basebackup files. Then 9.6 would no 
longer start. So how can I get my PG 9.6 data into a new PG 11 database?


Probably related to my troubles are my attempts to get replication set 
up. But before I dive back into that, I thought I'd better try getting 
my 9.6 data into the new 9.6 server, then run PG 11's pg_upgrade and 
mount the data in PG 11. Then maybe I can get replication started.


I've read that logical replication can be used to migrate from 9.6 to 
11, but haven't found any documentation on doing that.


Chuck Martin



--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Trouble Upgrading Postgres

2018-11-03 Thread Charles Martin
When I do a pg_dump using PG 9.6, I get this:

pg_dump: Dumping the contents of table "docfile" failed: PQgetCopyData()
failed.

pg_dump: Error message from server: server closed the connection
unexpectedly

This probably means the server terminated abnormally

before or while processing the request.

pg_dump: The command was: COPY public.docfile (docfile_pkey,
docfileoriginalname, ordernumber, versionnum, docfilecontents, docfilepath,
docfileextension, enddatetime, endby, editnum, insby, insdatetime, modby,
moddatetime, active, doc_fkey) TO stdout;

I've looked and been unable to find where Centos 7, or Postgres 9.6, stores
the path to the config/data directory outside the data/postgresql.conf
file. But I agree there must be something somewhere.
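
One place I have not yet ruled out is the systemd unit itself - if I'm
reading the packaging right (an assumption on my part), it sets a PGDATA
environment variable that could be inspected and overridden like so:

systemctl cat postgresql-9.6      # shows Environment=PGDATA=...
systemctl edit postgresql-9.6     # opens an override; e.g. add:
# [Service]
# Environment=PGDATA=/path/to/data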

Chuck

On Sat, Nov 3, 2018 at 6:06 PM Adrian Klaver wrote:

> On 11/3/18 2:56 PM, Charles Martin wrote:
>
> Please reply to list also.
> Ccing list.
>
> > Yes, if I could get pg_dump to work, I think I'd be fine. Maybe. But it
> > doesn't.
>
> Post the error you got to the list and we may be able to help.
> Also, which version of Postgres were you using to take the dump?
>
> >
> > I agree that I've confused Postgres, but I don't know how to resolve the
> > confusion. It is complicated by the fact that my original Centos 7
> > install included Postgres 9.2, so those files are hanging around, along
> > with 9.6 and 11.
> >
> > I posted the error messages I got when postgresql.conf had the data
> > directory set to my basebackup data:
> >
> > postgresql-9.6.service: main process exited, code=exited,
> > status=1/FAILURE
> >
> > Not very helpful.
> >
> >
> > systemctl status postgresql-9.6 provided a bit more info:
> >
> > ● postgresql-9.6.service - PostgreSQL 9.6 database server
> >
> >     Loaded: loaded (/usr/lib/systemd/system/postgresql-9.6.service;
> > disabled; vendor preset: disabled)
> >
> >     Active: failed (Result: exit-code) since Sat 2018-11-03 15:05:30
> > EDT; 15s ago
> >
> >       Docs: https://www.postgresql.org/docs/9.6/static/
> >
> >    Process: 32570 ExecStart=/usr/pgsql-9.6/bin/postmaster -D ${PGDATA}
> > (code=exited, status=1/FAILURE)
> >
> >    Process: 32563
> > ExecStartPre=/usr/pgsql-9.6/bin/postgresql96-check-db-dir ${PGDATA}
> > (code=exited, status=0/SUCCESS)
> >
> >   Main PID: 32570 (code=exited, status=1/FAILURE)
> >
> >
> > Yet this went away, and PG 9.6 started, when I changed postgresql.conf
> > to point to the new (empty) data directory, which is confusing.
>
> No, not confusing. I am not as familiar with RPM packaging as I am with
> the Debian/Ubuntu packaging. Still, if I remember correctly, it also
> allows multiple instances of Postgres to run. To do that it has its own
> system of tracking the data directories. Where you created the new data
> directory is obviously where the package scripts expect to find it. The
> pg_basebackup directory is not.
>
> >
> > Chuck
> >
> >
> > On Sat, Nov 3, 2018 at 5:17 PM Adrian Klaver wrote:
> >
> >     On 11/3/18 12:57 PM, Charles Martin wrote:
> >      > I'd be grateful for some help. I am trying to move a large
> >      > database from PostgreSQL 9.6 on Centos 6 to a different server
> >      > using PostgreSQL 11 on Centos 7. I can't do a pg_dump because it
> >      > always fails on the largest table.
> >
> >     I would answer Ron's question on this first as solving it would be
> >     the easiest fix.
> >
> >      > So I tried to do pg_basebackup and copy that to the new PG 11
> >      > server. Except that pg_upgrade expects the new and old versions
> >      > of PG to be side-by-side. So I installed 9.6 on the new server,
> >      > ran initdb, verified that it started, then stopped it and edited
> >      > the postgresql.conf data path to the location of the
> >      > pg_basebackup files. Then 9.6 would no longer start. So how can
> >      > I get my PG 9.6 data into a new PG 11 database?
> >      >
> >      > Probably related to my troubles are my attempts to get
> >      > replication set up. But before I dive back into that, I thought
> >      > I'd better try getting my 9.6 data into the new 9.6 server, then
> >      > run PG 11's pg_upgrade and mount the data in PG 11. Then maybe I
> >      > can get replication started.
> >      >
> >      > I've read that logical replication can be used to migrate from
> >      > 9.6 to 11, but haven't found any documentation on doing that.
> >      >
> >      > Chuck Martin
> >
> >
> > --
> > Adrian Klaver
> > adrian.kla...@aklaver.com

Re: Trouble Upgrading Postgres

2018-11-03 Thread Adrian Klaver

On 11/3/18 3:47 PM, Charles Martin wrote:

When I do a pg_dump using PG 9.6, I got this:

pg_dump: Dumping the contents of table "docfile" failed: 
PQgetCopyData() failed.


pg_dump: Error message from server: server closed the connection 
unexpectedly


Is this error what the client is reporting?

Is this the same as what is showing up in the server log?



This probably means the server terminated abnormally


So where is the server located relative to the pg_dump client?

On the same machine?

If so, is it a virtual machine, e.g. AWS?

Across a local or remote network?



before or while processing the request.

pg_dump: The command was: COPY public.docfile (docfile_pkey, 
docfileoriginalname, ordernumber, versionnum, docfilecontents, 
docfilepath, docfileextension, enddatetime, endby, editnum, insby, 
insdatetime, modby, moddatetime, active, doc_fkey) TO stdout;


I've looked and been unable to find where Centos 7, or Postgres 9.6, 
stores the path to the config/data directory outside the 
data/postgresql.conf file. But I agree there must be something somewhere.


Chuck

On Sat, Nov 3, 2018 at 6:06 PM Adrian Klaver wrote:


On 11/3/18 2:56 PM, Charles Martin wrote:

Please reply to list also.
Ccing list.

 > Yes, if I could get pg_dump to work, I think I'd be fine. Maybe. But
 > it doesn't.

Post the error you got to the list and we may be able to help.
Also, which version of Postgres were you using to take the dump?

 >
 > I agree that I've confused Postgres, but I don't know how to resolve
 > the confusion. It is complicated by the fact that my original Centos 7
 > install included Postgres 9.2, so those files are hanging around, along
 > with 9.6 and 11.
 >
 > I posted the error messages I got when postgresql.conf had the data
 > directory set to my basebackup data:
 >
 > postgresql-9.6.service: main process exited, code=exited,
 > status=1/FAILURE
 >
 > Not very helpful.
 >
 >
 > systemctl status postgresql-9.6 provided a bit more info:
 >
 > ● postgresql-9.6.service - PostgreSQL 9.6 database server
 >
 >     Loaded: loaded (/usr/lib/systemd/system/postgresql-9.6.service;
 > disabled; vendor preset: disabled)
 >
 >     Active: failed (Result: exit-code) since Sat 2018-11-03 15:05:30
 > EDT; 15s ago
 >
 >       Docs: https://www.postgresql.org/docs/9.6/static/
 >
 >    Process: 32570 ExecStart=/usr/pgsql-9.6/bin/postmaster -D ${PGDATA}
 > (code=exited, status=1/FAILURE)
 >
 >    Process: 32563
 > ExecStartPre=/usr/pgsql-9.6/bin/postgresql96-check-db-dir ${PGDATA}
 > (code=exited, status=0/SUCCESS)
 >
 >   Main PID: 32570 (code=exited, status=1/FAILURE)
 >
 >
 > Yet this went away, and PG 9.6 started, when I changed postgresql.conf
 > to point to the new (empty) data directory, which is confusing.

No, not confusing. I am not as familiar with RPM packaging as I am with
the Debian/Ubuntu packaging. Still, if I remember correctly, it also
allows multiple instances of Postgres to run. To do that it has its own
system of tracking the data directories. Where you created the new data
directory is obviously where the package scripts expect to find it. The
pg_basebackup directory is not.

 >
 > Chuck
 >
 >
 > On Sat, Nov 3, 2018 at 5:17 PM Adrian Klaver
 > <adrian.kla...@aklaver.com> wrote:
 >
 >     On 11/3/18 12:57 PM, Charles Martin wrote:
 >      > I'd be grateful for some help. I am trying to move a large
 >      > database from PostgreSQL 9.6 on Centos 6 to a different server
 >      > using PostgreSQL 11 on Centos 7. I can't do a pg_dump because it
 >      > always fails on the largest table.
 >
 >     I would answer Ron's question on this first as solving it would be
 >     the easiest fix.
 >
 >      > So I tried to do pg_basebackup and copy that to the new PG 11
 >      > server. Except that pg_upgrade expects the new and old versions
 >      > of PG to be side-by-side. So I installed 9.6 on the new server,
 >      > ran initdb, verified that it started, then stopped it and edited
 >      > the postgresql.conf data path to the location of the
 >      > pg_basebackup files. Then 9.6