PostgreSQL C Language Extension with C++ Code

2018-08-12 Thread TalGloz
Hi,

I've searched information about my problem in the archives and on the
internet, but it didn't help. I have this small myfunc.cpp

1:   #include 
2:   #include 
3:   #include 
4:   #include 
5:   #include 
6:   #include 
7:   #include 
8:   #include 
9:   #include 
10: #include 
11: #include 
12: #include 
13: #include 
14: #include  // external compiled c++ library linked on
running 'make'
15:
16: #ifdef PG_MODULE_MAGIC
17: PG_MODULE_MAGIC;
18: #endif
19: 
20: Datum sum_of_numbers(PG_FUNCTION_ARGS);
21: PG_FUNCTION_INFO_V1(sum_of_numbers);
22: 
23: extern "C" {
24: int64_t sum_of_numbers(){
25: std::vector numbers {23, 445, 64};
26: auto sum = 0;
27: for (auto &item : numbers){
28: sum += item;
29: }
30: return sum;
31: }
32: }

And I compile it without any problems using this Makefile:

1:   MODULES = myfunc
2: 
3:   PG_CONFIG = /usr/pgsql-10/bin/pg_config
4:   PGXS = $(shell $(PG_CONFIG) --pgxs)
5:   INCLUDEDIR = $(shell $(PG_CONFIG) --includedir-server)
6:   INCLUDE_SEAL = /usr/local/include/seal
7:   INCLUDE_SEAL_LIB = /usr/local/lib/libseal.a
8:   INCLUDE_CPPCODEC = /usr/local/include/cppcodec
9:   #CFLAGS = -std=c11
10: #CFLAGS = --std=c++14 -fPIC -Wall -Werror -g3 -O0
11: include $(PGXS)
12: myfunc.so: myfunc.o
13: g++ -shared -o myfunc.so myfunc.o
14: 
16: myfunc.o: myfunc.cpp
17: g++ --std=c++17 -fPIC -Wall -Werror -g3 -O0 -o myfunc.o -c
myfunc.cpp -I$(INCLUDEDIR) -L$(INCLUDE_SEAL_LIB) -I$(INCLUDE_SEAL)
-I$(INCLUDE_CPPCODEC)
18: 

After copying the myfunc.so file to the PostgreSQL lib folder and executing:

1: CREATE OR REPLACE FUNCTION 
2:sum_of_numbers() RETURNS integer AS 'myfunc' 
3: LANGUAGE C STRICT;

I get this error:
*ERROR:  incompatible library "/usr/pgsql-10/lib/myfunc.so": missing magic
block
HINT:  Extension libraries are required to use the PG_MODULE_MAGIC macro.
SQL state: XX000*

I've added the files and line numbers for an easier discussion.

Thanks a lot,
Tal







Re: PostgreSQL C Language Extension with C++ Code

2018-08-12 Thread Tom Lane
TalGloz  writes:
> I've searched information about my problem in the archives and on the
> internet, but it didn't help. I have this small myfunc.cpp
> [ that doesn't work ]

> 16: #ifdef PG_MODULE_MAGIC
> 17: PG_MODULE_MAGIC;
> 18: #endif

Hmm ... don't use an #ifdef there.  If you don't have the macro defined,
you want to fail, not silently build an extension without it.

It's possible you need extern "C" { ... } around the macro, too.

regards, tom lane
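
For reference, a minimal sketch of what that looks like (the header names are assumed here, since the includes were stripped in the archived code):

extern "C" {
#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;    /* unconditional: if the macro isn't available, the build should fail loudly */
}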



Re: PostgreSQL C Language Extension with C++ Code

2018-08-12 Thread Igor Korot
Hi,

On Sun, Aug 12, 2018 at 12:05 PM, TalGloz  wrote:
> Hi,
>
> I've searched information about my problem in the archives and on the
> internet, but it didn't help. I have this small myfunc.cpp
>
> 1:   #include 
> 2:   #include 
> 3:   #include 
> 4:   #include 
> 5:   #include 
> 6:   #include 
> 7:   #include 
> 8:   #include 
> 9:   #include 
> 10: #include 
> 11: #include 
> 12: #include 
> 13: #include 
> 14: #include  // external compiled c++ library linked on
> running 'make'
> 15:
> 16: #ifdef PG_MODULE_MAGIC

I think you're missing an "n" here.
Should be:

#ifndef PG_MODULE_MAGIC

.

Thank you.

> 17: PG_MODULE_MAGIC;
> 18: #endif
> 19:
> 20: Datum sum_of_numbers(PG_FUNCTION_ARGS);
> 21: PG_FUNCTION_INFO_V1(sum_of_numbers);
> 22:
> 23: extern "C" {
> 24: int64_t sum_of_numbers(){
> 25: std::vector numbers {23, 445, 64};
> 26: auto sum = 0;
> 27: for (auto &item : numbers){
> 28: sum += item;
> 29: }
> 30: return sum;
> 31: }
> 32: }
>
> And I compile it without any problems using this Makefile:
>
> 1:   MODULES = myfunc
> 2:
> 3:   PG_CONFIG = /usr/pgsql-10/bin/pg_config
> 4:   PGXS = $(shell $(PG_CONFIG) --pgxs)
> 5:   INCLUDEDIR = $(shell $(PG_CONFIG) --includedir-server)
> 6:   INCLUDE_SEAL = /usr/local/include/seal
> 7:   INCLUDE_SEAL_LIB = /usr/local/lib/libseal.a
> 8:   INCLUDE_CPPCODEC = /usr/local/include/cppcodec
> 9:   #CFLAGS = -std=c11
> 10: #CFLAGS = --std=c++14 -fPIC -Wall -Werror -g3 -O0
> 11: include $(PGXS)
> 12: myfunc.so: myfunc.o
> 13: g++ -shared -o myfunc.so myfunc.o
> 14:
> 16: myfunc.o: myfunc.cpp
> 17:g++ --std=c++17 -fPIC -Wall -Werror -g3 -O0 -o myfunc.o -c
> myfunc.cpp -I$(INCLUDEDIR) -L$(INCLUDE_SEAL_LIB) -I$(INCLUDE_SEAL)
> -I$(INCLUDE_CPPCODEC)
> 18:
>
> After copying the myfunc.so file to the PostgreSQL lib folder and executing:
>
> 1: CREATE OR REPLACE FUNCTION
> 2:sum_of_numbers() RETURNS integer AS 'myfunc'
> 3: LANGUAGE C STRICT;
>
> I get this error:
> *ERROR:  incompatible library "/usr/pgsql-10/lib/myfunc.so": missing magic
> block
> HINT:  Extension libraries are required to use the PG_MODULE_MAGIC macro.
> SQL state: XX000*
>
> I've added the files and line numbers for an easier discussion.
>
> Thanks a lot,
> Tal
>
> 
>
>
>
>



Re: PostgreSQL C Language Extension with C++ Code

2018-08-12 Thread TalGloz
OK now I have this code:

1:   extern "C" {  // The C header should go here
2:   #include 
3:   #include 
4:   #include 
5:   #include 
6:   #include 
7:   #include 
8:   #include 
9:   #include 
10:  
11: PG_MODULE_MAGIC;
12: }
13:
14: // CPP header without extern "C"
15: #include 
16: #include 
17: #include 
18: #include 
19: #include  // external compiled c++ library linked on
running 'make'
20:
21: Datum sum_of_numbers(PG_FUNCTION_ARGS);
22: PG_FUNCTION_INFO_V1(sum_of_numbers);
33: 
34: extern "C" { // CPP function
35: int64_t sum_of_numbers(){
36: std::vector numbers {23, 445, 64};
37: auto sum = 0;
38: for (auto &item : numbers){
39: sum += item;
40: }
41: return sum;
42: }
43: }

The error this time for PostgreSQL is:
*ERROR:  could not find function information for function "sum_of_numbers"
HINT:  SQL-callable functions need an accompanying
PG_FUNCTION_INFO_V1(funcname).
SQL state: 42883*

Thanks,
Tal







Re: PostgreSQL C Language Extension with C++ Code

2018-08-12 Thread Tom Lane
TalGloz  writes:
> The error this time for PostgreSQL is:
> *ERROR:  could not find function information for function "sum_of_numbers"
> HINT:  SQL-callable functions need an accompanying
> PG_FUNCTION_INFO_V1(funcname).
> SQL state: 42883*

Probably need extern "C" around the PG_FUNCTION_INFO_V1 macro, too.
Both those macros produce C function definitions under the hood,
and said functions need to not be name-mangled by C++.

regards, tom lane



Re: PostgreSQL C Language Extension with C++ Code

2018-08-12 Thread TalGloz
I did it with the macros

extern "C" {
Datum sum_of_numbers(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(sum_of_numbers);
}

But now my compiler throws some other errors when running make:

g++ --std=c++17 -fPIC -Wall -Werror -g3 -O0 -o myfunc.o -c myfunc.cpp
-I/usr/pgsql-10/include/server -L"/usr/local/lib/libseal.a"
-I"/usr/local/include/seal" -I"/usr/local/include/cppcodec"
myfunc.cpp:29:10: error: conflicting declaration of C function ‘int64_t
sum_of_numbers()’
  int64_t sum_of_numbers(){
  ^~
In file included from /usr/pgsql-10/include/server/utils/array.h:64,
 from /usr/pgsql-10/include/server/utils/acl.h:38,
 from
/usr/pgsql-10/include/server/catalog/objectaddress.h:18,
 from
/usr/pgsql-10/include/server/catalog/pg_publication.h:21,
 from /usr/pgsql-10/include/server/utils/rel.h:21,
 from myfunc.cpp:3:
myfunc.cpp:25:21: note: previous declaration ‘Datum
sum_of_numbers(FunctionCallInfo)’
 PG_FUNCTION_INFO_V1(sum_of_numbers);
 ^~
/usr/pgsql-10/include/server/fmgr.h:374:14: note: in definition of macro
‘PG_FUNCTION_INFO_V1’
 extern Datum funcname(PG_FUNCTION_ARGS); \
  ^~~~
make: *** [Makefile:16: myfunc.o] Error 1

Why is it so hard to use C++ with PostgreSQL for a C extension? :)

Regards,
Tal






Re: PostgreSQL C Language Extension with C++ Code

2018-08-12 Thread Dmitry Igrishin
Sun, 12 Aug 2018 at 21:40, TalGloz :
>
> I did it with the macros
>
> extern "C" {
> Datum sum_of_numbers(PG_FUNCTION_ARGS);
> PG_FUNCTION_INFO_V1(sum_of_numbers);
> }
>
> But now my compiler throws some other errors when running make:
>
> g++ --std=c++17 -fPIC -Wall -Werror -g3 -O0 -o myfunc.o -c myfunc.cpp
> -I/usr/pgsql-10/include/server -L"/usr/local/lib/libseal.a"
> -I"/usr/local/include/seal" -I"/usr/local/include/cppcodec"
> myfunc.cpp:29:10: error: conflicting declaration of C function ‘int64_t
> sum_of_numbers()’
>   int64_t sum_of_numbers(){
>   ^~
> In file included from /usr/pgsql-10/include/server/utils/array.h:64,
>  from /usr/pgsql-10/include/server/utils/acl.h:38,
>  from
> /usr/pgsql-10/include/server/catalog/objectaddress.h:18,
>  from
> /usr/pgsql-10/include/server/catalog/pg_publication.h:21,
>  from /usr/pgsql-10/include/server/utils/rel.h:21,
>  from myfunc.cpp:3:
> myfunc.cpp:25:21: note: previous declaration ‘Datum
> sum_of_numbers(FunctionCallInfo)’
>  PG_FUNCTION_INFO_V1(sum_of_numbers);
>  ^~
> /usr/pgsql-10/include/server/fmgr.h:374:14: note: in definition of macro
> ‘PG_FUNCTION_INFO_V1’
>  extern Datum funcname(PG_FUNCTION_ARGS); \
>   ^~~~
> make: *** [Makefile:16: myfunc.o] Error 1
>
> Why is it so hard to use C++ with PostgreSQL for a C extension? :)
It is not.



Re: Replication failure, slave requesting old segments

2018-08-12 Thread Phil Endecott

Hi Stephen,

Stephen Frost wrote:

* Phil Endecott (spam_from_pgsql_li...@chezphil.org) wrote:

archive_command = 'ssh backup test ! -f backup/postgresql/archivedir/%f &&
   scp %p backup:backup/postgresql/archivedir/%f'


This is really not a sufficient or particularly intelligent
archive_command.  In general, I'd strongly recommend against trying to
roll your own archiving/backup solution for PostgreSQL.  In particular,
the above is quite expensive, doesn't address the case where a file is
only partially copied to the backup server, and doesn't ensure that the
file is completely written and fsync'd on the backup server meaning that
a failure will likely result in WAL going missing.  There's much better
solutions out there and which you really should be using instead of
trying to build your own.  In particular, my preference is pgbackrest
(though I'm one of the folks in its development, to be clear), but
there's other options such as barman or WAL-E which I believe also
address the concerns raised above.


Hmmm, well I based it on what I read in the documentation:

  https://www.postgresql.org/docs/9.6/static/continuous-archiving.html

The expense is not an issue in this case.  I would be surprised if
scp were to exit(0) after only partially copying a file but perhaps
you know something I don't about that.

I know that rsync creates a temporary file and moves it into place when
all the data has been transferred, which scp doesn't do; perhaps that
would be safer.

The master was only writing WAL files maybe 3 or 4 times per day at
the time, so any scenario that requires its crash to have occurred
exactly as it was doing an scp involves a substantial coincidence.

I am limited in what I can run on the backup server.



Having it go back to D0 is at least a bit interesting.  I wonder if
somehow it was partial..  Have you looked at the files on the archive
server to make sure they're all 16MB and the same size?


Yes, all the files on the backup including D0 were 16 MB.  But:


2018-08-11 00:12:15.536 UTC [7954] LOG:  restored log file 
"0001000700D0" from archive
2018-08-11 00:12:15.797 UTC [7954] LOG:  redo starts at 7/D0F956C0
2018-08-11 00:12:16.068 UTC [7954] LOG:  consistent recovery state reached at 
7/D0FFF088


Are the last two log lines above telling us anything useful?  Is that
saying that, of the 16 MB (0x1000000 bytes) WAL file it restored only as
far as byte 0xf956c0 or 0xfff088?  Is that what we would expect?  Is it
trying to use the streaming connection to get the missing bytes from
088 to ?  Is that just an empty gap at the end of the file
due to the next record being too big to fit?


db=> create table waste_space as select * from large_table;

That caused four new WAL segments to be copied to the backup, and restored 
by the slave:


Note again that this was sufficient to make it start working again,
which I think is important.


Really hard to tell from this.  On the one hand, it seems like maybe the
primary was done with D0, but hadn't finished with D1 yet, but the
replica hadn't made that transition from D0 -> D1, hence why it was
still asking for D0.


Yes.  What is actually supposed to happen?  What exactly do you mean by
"make the transition"?


I'm certainly concerned that there might have
been some amount of partial-WAL-file copying happening, but PG is pretty
good at dealing with a variety of odd cases and so I tend to doubt that
the replica actually ended up being corrupted, but I'll again strongly
advocate for using a proven solution which avoids these risks by doing
things like copying into temporary files on the backup server and then
using an atomic 'mv' to make sure that anyone reading from the archive
wouldn't ever see a partially-written file, and, just to double-check,
also calculate a hash of every WAL file written into the archive and
re-verify that hash when reading it back out.  That's what we do in
pgbackrest, at least.


Noted.  But I don't see any evidence that that was actually the root
cause here, and it relies on an unlikely timing coincidence.  This is
the very first time that I've had a replica recover after a significant
downtime, and it failed.


Regards, Phil.







Re: PostgreSQL C Language Extension with C++ Code

2018-08-12 Thread Tom Lane
TalGloz  writes:
> But now my compiler throws some other errors when running make:

> myfunc.cpp:29:10: error: conflicting declaration of C function ‘int64_t
> sum_of_numbers()’
>   int64_t sum_of_numbers(){
>   ^~

Well, yeah, you forgot to repeat the argument list here.  For that
matter, spelling Datum as int64_t is a recipe for trouble, even if
it works on some particular platform.  Should be

Datum sum_of_numbers(PG_FUNCTION_ARGS) { ...

> Why is it so hard to use C++ with PostgreSQL for a C extension? :)

Probably because you've failed to break your bad C++ habits.

regards, tom lane



Re: Replication failure, slave requesting old segments

2018-08-12 Thread Phil Endecott

Hi Adrian,

Adrian Klaver wrote:

On 08/11/2018 12:42 PM, Phil Endecott wrote:

Hi Adrian,

Adrian Klaver wrote:
Looks like the master recycled the WAL's while the slave could not 
connect.


Yes but... why is that a problem?  The master is copying the WALs to
the backup server using scp, where they remain forever.  The slave gets


To me it looks like that did not happen:

2018-08-11 00:05:50.364 UTC [615] LOG:  restored log file 
"0001000700D0" from archive
scp: backup/postgresql/archivedir/0001000700D1: No such file 
or directory
2018-08-11 00:05:51.325 UTC [7208] LOG:  started streaming WAL from 
primary at 7/D000 on timeline 1
2018-08-11 00:05:51.325 UTC [7208] FATAL:  could not receive data from 
WAL stream: ERROR:  requested WAL segment 0001000700D0 has 
already been removed


Above 0001000700D0 is gone/recycled on the master and the 
archived version does not seem to be complete as the streaming 
replication is trying to find it.


The files on the backup server were all 16 MB.


Below you kick the master and it coughs up the files to the archive 
including *D0 and *D1 on up to *D4 and then the streaming picks up using *D5.


When I kicked it, the master wrote D1 to D4 to the backup.  It did not
change D0 (its modification time on the backup is from before the "kick").
The slave re-read D0, again, as it had been doing throughout this period,
and then read D1 to D4.



Best guess is the archiving did not work as expected during:

"(During this time the master was also down for a shorter period.)"


Around the time the master was down, the WAL segment names were CB and CC.
Files CD to CF were written between the master coming up and the slave
coming up.  The slave had no trouble restoring those segments when it started.
The problematic segments D0 and D1 were the ones that were "current" when the
slave restarted, at which time the master was up consistently.


Regards, Phil.







Re: Replication failure, slave requesting old segments

2018-08-12 Thread Phil Endecott
Phil Endecott wrote:
> On the master, I have:
>
> wal_level = replica
> archive_mode = on
> archive_command = 'ssh backup test ! -f backup/postgresql/archivedir/%f &&
>scp %p backup:backup/postgresql/archivedir/%f'
>
> On the slave I have:
>
> standby_mode = 'on'
> primary_conninfo = 'user=postgres host=master port=5432'
> restore_command = 'scp backup:backup/postgresql/archivedir/%f %p'
>
> hot_standby = on

> 2018-08-11 00:05:50.364 UTC [615] LOG:  restored log file 
> "0001000700D0" from archive
> scp: backup/postgresql/archivedir/0001000700D1: No such file or 
> directory
> 2018-08-11 00:05:51.325 UTC [7208] LOG:  started streaming WAL from primary 
> at 7/D000 on timeline 1
> 2018-08-11 00:05:51.325 UTC [7208] FATAL:  could not receive data from WAL 
> stream: ERROR:  requested WAL segment 0001000700D0 has already 
> been removed


I am wondering if I need to set wal_keep_segments to at least 1 or 2 for 
this to work.  I currently have it unset and I believe the default is 0.

My understanding was that when using archive_command/restore_command to copy 
WAL segments it would not be necessary to use wal_keep_segments to retain 
files in pg_xlog on the server; the slave can get everything using a 
combination of copying files using the restore_command and streaming. 
But these lines from the log:

2018-08-11 00:12:15.797 UTC [7954] LOG: redo starts at 7/D0F956C0
2018-08-11 00:12:16.068 UTC [7954] LOG: consistent recovery state reached at 
7/D0FFF088

make me think that there is an issue when the slave reaches the end of the 
copied WAL file.  I speculate that the useful content of this WAL segment 
ends at FFF088, which is followed by an empty gap due to record sizes.  But 
the slave tries to start streaming from this point, D0FFF088, not D100. 
If the master still had a copy of segment D0 then it would be able to stream 
this gap followed by the real content in the current segment D1.

Does that make any sense at all?


Regards, Phil.





Re: Replication failure, slave requesting old segments

2018-08-12 Thread Adrian Klaver

On 08/12/2018 12:25 PM, Phil Endecott wrote:

Hi Adrian,

Adrian Klaver wrote:

On 08/11/2018 12:42 PM, Phil Endecott wrote:

Hi Adrian,

Adrian Klaver wrote:
Looks like the master recycled the WAL's while the slave could not 
connect.


Yes but... why is that a problem?  The master is copying the WALs to
the backup server using scp, where they remain forever.  The slave gets


To me it looks like that did not happen:

2018-08-11 00:05:50.364 UTC [615] LOG:  restored log file 
"0001000700D0" from archive
scp: backup/postgresql/archivedir/0001000700D1: No such 
file or directory
2018-08-11 00:05:51.325 UTC [7208] LOG:  started streaming WAL from 
primary at 7/D000 on timeline 1
2018-08-11 00:05:51.325 UTC [7208] FATAL:  could not receive data from 
WAL stream: ERROR:  requested WAL segment 0001000700D0 has 
already been removed


Above 0001000700D0 is gone/recycled on the master and the 
archived version does not seem to be complete as the streaming 
replication is trying to find it.


The files on the backup server were all 16 MB.


WAL files are created/recycled as 16 MB files, which is not the same as 
saying they are complete for the purposes of restoring. In other words 
you could be looking at a 16 MB file full of 0's.





Below you kick the master and it coughs up the files to the archive 
including *D0 and *D1 on up to *D4 and then the streaming picks up using
*D5.


When I kicked it, the master wrote D1 to D4 to the backup.  It did not
change D0 (its modification time on the backup is from before the "kick").
The slave re-read D0, again, as it had been doing throughout this period,
and then read D1 to D4.


Well something happened because the slave could not get all the 
information it needed from the D0 in the archive and was trying to get 
it from the masters pg_xlog.






Best guess is the archiving did not work as expected during:

"(During this time the master was also down for a shorter period.)"


Around the time the master was down, the WAL segment names were CB and CC.
Files CD to CF were written between the master coming up and the slave
coming up.  The slave had no trouble restoring those segments when it 
started.
The problematic segments D0 and D1 were the ones that were "current" when the
slave restarted, at which time the master was up consistently.


Regards, Phil.








--
Adrian Klaver
adrian.kla...@aklaver.com



Re: PostgreSQL C Language Extension with C++ Code

2018-08-12 Thread TalGloz
Let me see if I understood you correctly. I can't have code like this:

extern "C" {
int sum_of_numbers(){
std::vector numbers {23, 445, 64};
int sum = 0;
for (auto &item : numbers){
sum += item;
}
return sum;
}
}

And I have to declare my function with the PG_FUNCTION_ARGS signature inside the extern "C" block:
Datum sum_of_numbers(PG_FUNCTION_ARGS){
// function code here
}

Sorry for the long mails. I'm having trouble understanding how to restructure
myfunc.cpp to make the code run.






Re: Replication failure, slave requesting old segments

2018-08-12 Thread Adrian Klaver

On 08/12/2018 12:53 PM, Phil Endecott wrote:

Phil Endecott wrote:

On the master, I have:

wal_level = replica
archive_mode = on
archive_command = 'ssh backup test ! -f backup/postgresql/archivedir/%f &&
scp %p backup:backup/postgresql/archivedir/%f'

On the slave I have:

standby_mode = 'on'
primary_conninfo = 'user=postgres host=master port=5432'
restore_command = 'scp backup:backup/postgresql/archivedir/%f %p'

hot_standby = on



2018-08-11 00:05:50.364 UTC [615] LOG:  restored log file 
"0001000700D0" from archive
scp: backup/postgresql/archivedir/0001000700D1: No such file or 
directory
2018-08-11 00:05:51.325 UTC [7208] LOG:  started streaming WAL from primary at 
7/D000 on timeline 1
2018-08-11 00:05:51.325 UTC [7208] FATAL:  could not receive data from WAL 
stream: ERROR:  requested WAL segment 0001000700D0 has already been 
removed



I am wondering if I need to set wal_keep_segments to at least 1 or 2 for
this to work.  I currently have it unset and I believe the default is 0.


Given that WAL's are only 16 MB I would probably bump it up to be on 
safe side, or use:


https://www.postgresql.org/docs/9.6/static/warm-standby.html

26.2.6. Replication Slots

Though the above does not limit storage of WAL's, so a long outage could 
result in WAL's piling up.
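
Concretely, the two options look roughly like this (the slot name and the
wal_keep_segments value are only illustrative):

-- on the master
SELECT pg_create_physical_replication_slot('standby1');

# recovery.conf on the slave, next to primary_conninfo
primary_slot_name = 'standby1'

# or, in postgresql.conf on the master, keep a fixed number of old segments
wal_keep_segments = 32    # 16 MB each, so at least ~512 MB of pg_xlog is retained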




My understanding was that when using archive_command/restore_command to copy
WAL segments it would not be necessary to use wal_keep_segments to retain
files in pg_xlog on the server; the slave can get everything using a
combination of copying files using the restore_command and streaming.
But these lines from the log:

2018-08-11 00:12:15.797 UTC [7954] LOG: redo starts at 7/D0F956C0
2018-08-11 00:12:16.068 UTC [7954] LOG: consistent recovery state reached at 
7/D0FFF088

make me think that there is an issue when the slave reaches the end of the
copied WAL file.  I speculate that the useful content of this WAL segment
ends at FFF088, which is followed by an empty gap due to record sizes.  But
the slave tries to start streaming from this point, D0FFF088, not D100.
If the master still had a copy of segment D0 then it would be able to stream
this gap followed by the real content in the current segment D1.

Does that make any sense at all?


Regards, Phil.







--
Adrian Klaver
adrian.kla...@aklaver.com



Re: PostgreSQL C Language Extension with C++ Code

2018-08-12 Thread TalGloz
OK, it worked. This is how I did it; hopefully it is right:

extern "C" {
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 

PG_MODULE_MAGIC;
}

#include 
#include 
#include 
#include 
#include  // external compiled c++ library linked on running
'make'


extern "C" {
Datum sum_of_numbers(PG_FUNCTION_ARGS){
std::vector numbers {23, 445, 64};
int sum = 0;
for (auto &item : numbers){
sum += item;
}
return sum;
};
PG_FUNCTION_INFO_V1(sum_of_numbers);
}

I've managed to create and execute the function in my PostgreSQL database.

So basically I can execute any CPP code as long as I declare my functions
like this:

extern "C" {
Datum function_name(PG_FUNCTION_ARGS){
 // CPP code here
};

PG_FUNCTION_INFO_V1(function_name);
}

In addition to that, all the C headers should be *inside an extern "C" { }*
block and all the CPP headers *outside the extern "C" { }* block, did I
get it right? 

Thanks,
Tal
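
For what it's worth, a slightly more conventional version of the same file is
sketched below; the header names, the <int> template argument and the
PG_RETURN_INT32 macro are not spelled out in the thread (the archive stripped
the angle brackets), so treat them as assumptions rather than a literal copy
of the working code:

extern "C" {
#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;
}

#include <vector>    // C++ headers stay outside the extern "C" block

extern "C" {
PG_FUNCTION_INFO_V1(sum_of_numbers);

Datum sum_of_numbers(PG_FUNCTION_ARGS)
{
    std::vector<int> numbers {23, 445, 64};
    int sum = 0;
    for (auto &item : numbers)
        sum += item;
    PG_RETURN_INT32(sum);    // matches the RETURNS integer in the CREATE FUNCTION
}
}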






Re: Replication failure, slave requesting old segments

2018-08-12 Thread Stephen Frost
Greetings,

* Phil Endecott (spam_from_pgsql_li...@chezphil.org) wrote:
> Stephen Frost wrote:
> >* Phil Endecott (spam_from_pgsql_li...@chezphil.org) wrote:
> >>archive_command = 'ssh backup test ! -f backup/postgresql/archivedir/%f &&
> >>   scp %p backup:backup/postgresql/archivedir/%f'
> >
> >This is really not a sufficient or particularly intelligent
> >archive_command.  In general, I'd strongly recommend against trying to
> >roll your own archiving/backup solution for PostgreSQL.  In particular,
> >the above is quite expensive, doesn't address the case where a file is
> >only partially copied to the backup server, and doesn't ensure that the
> >file is completely written and fsync'd on the backup server meaning that
> >a failure will likely result in WAL going missing.  There's much better
> >solutions out there and which you really should be using instead of
> >trying to build your own.  In particular, my preference is pgbackrest
> >(though I'm one of the folks in its development, to be clear), but
> >there's other options such as barman or WAL-E which I believe also
> >address the concerns raised above.
> 
> Hmmm, well I based it on what I read in the documentation:
> 
>   https://www.postgresql.org/docs/9.6/static/continuous-archiving.html
> 
> The expense is not an issue in this case.  I would be surprised if
> scp were to exit(0) after only partially copying a file but perhaps
> you know something I don't about that.

The scp might not, but so what?  If a partial file remains on the far
side, then a subsequent invocation of that archive_command will come
back saying that the file is there.  Further, the file isn't going to
actually be sync'd to the filesystem on the far side, so even if the scp
returns success, a failure on the backup server could result in loss of
the WAL file.

> I know that rsync creates a temporary file and moves it into place when
> all the data has been transferred, which scp doesn't do; perhaps that
> would be safer.

No, what would be better is using a tool which will actually fsync() the
file on the backup server after writing it out there, making sure it's
been stored to the filesystem, and then to perform an atomic mv
operation to put it in place, and then to fsync() the directory to
ensure that change has also been sync'd to disk.

Even better is to calculate a hash of the file and to reverify that on
restore, to make sure that it doesn't end up getting corrupted on the
backup server somehow.
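
Purely to illustrate that mechanism, a rough sketch (not a recommendation; it
assumes a GNU coreutils sync that accepts file and directory arguments, it has
none of the checksumming a purpose-built tool does, and the whole value goes
on one line in postgresql.conf):

archive_command = 'ssh backup "test ! -f backup/postgresql/archivedir/%f" &&
   scp %p backup:backup/postgresql/archivedir/%f.part &&
   ssh backup "sync backup/postgresql/archivedir/%f.part &&
     mv backup/postgresql/archivedir/%f.part backup/postgresql/archivedir/%f &&
     sync backup/postgresql/archivedir"'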

> The master was only writing WAL files maybe 3 or 4 times per day at
> the time, so any scenario that requires its crash to have occurred
> exactly as it was doing an scp involves a substantial coincidence.

While I get that it's unlikely to be relevant to this particular case, I
wouldn't recommend depending on never having that coincidence happen.
What I've seen quite a bit of, just so you know, is that things tend to
all break down at about the same time- so right when something bad is
happening on the primary, the backup server runs into issues too.  This
might be due to networking breaking down in a particular area all at
once or shared storage having some issue but it's happened often enough
that relying on uncommon things to avoid each other is something I've
got a hard time doing these days.

> I am limited in what I can run on the backup server.

You could consider backing up locally using a tool which is built for
PostgreSQL and then sync'ing the results of that backup over to the
backup server, perhaps.  Or point out that you need a backup server
where you don't have those restrictions in order to properly have
backups.  Or use an external service like s3 (or a clone) to back up to.

> >Having it go back to D0 is at least a bit interesting.  I wonder if
> >somehow it was partial..  Have you looked at the files on the archive
> >server to make sure they're all 16MB and the same size?
> 
> Yes, all the files on the backup including D0 were 16 MB.  But:

Ok.  Did you see any failures on the sending side when archive_command
was being run?  PG shouldn't start trying to archive the file until it's
done with it, of course, so unless something failed right at the end of
the scp and left the file at 16MB but it wasn't actually filled..

> >>2018-08-11 00:12:15.536 UTC [7954] LOG:  restored log file 
> >>"0001000700D0" from archive
> >>2018-08-11 00:12:15.797 UTC [7954] LOG:  redo starts at 7/D0F956C0
> >>2018-08-11 00:12:16.068 UTC [7954] LOG:  consistent recovery state reached 
> >>at 7/D0FFF088
> 
> Are the last two log lines above telling us anything useful?  Is that
> saying that, of the 16 MB (0x1000000 bytes) WAL file it restored only as
> far as byte 0xf956c0 or 0xfff088?  Is that what we would expect?  Is it
> trying to use the streaming connection to get the missing bytes from
> 088 to ?  Is that just an empty gap at the end of the file
> due to the next record being too big to fit?

The short answer is that, yes, the next record was likely too big to
fit, but that's what the replica was trying to figure out and couldn't
because D0 was gone from the primary already.  One thing which should
help this would be to use physical replication slots on the primary,
which would keep it from throwing away WAL files until it knows the
replica has them, but that runs the risk of ending up with lots of extra
WAL on the primary if the replica is gone for a while.  If you'd prefer
to avoid that then having wal_keep_segments set to '1' would avoid this
particular issue as well, I'd expect.

I do wonder if perhaps we should just default to having it as '1' to
avoid exactly this case, as it seems like perhaps PG archived D0 and
then flipped to D1 and got rid of D0, which is all pretty reasonable,
except that a replica trying to catch up is going to end up asking for
D0 from the primary because it didn't know if there was anything else
that should have been in D0..

Re: Replication failure, slave requesting old segments

2018-08-12 Thread Phil Endecott

Stephen Frost wrote:

* Phil Endecott (spam_from_pgsql_li...@chezphil.org) wrote:

Stephen Frost wrote:
>* Phil Endecott (spam_from_pgsql_li...@chezphil.org) wrote:
>>2018-08-11 00:12:15.536 UTC [7954] LOG:  restored log file 
"0001000700D0" from archive
>>2018-08-11 00:12:15.797 UTC [7954] LOG:  redo starts at 7/D0F956C0
>>2018-08-11 00:12:16.068 UTC [7954] LOG:  consistent recovery state reached at 
7/D0FFF088

Are the last two log lines above telling us anything useful?  Is that
saying that, of the 16 MB (0x1000000 bytes) WAL file it restored only as
far as byte 0xf956c0 or 0xfff088?  Is that what we would expect?  Is it
trying to use the streaming connection to get the missing bytes from
088 to ?  Is that just an empty gap at the end of the file
due to the next record being too big to fit?


The short answer is that, yes, the next record was likely too big to
fit, but that's what the replica was trying to figure out and couldn't
because D0 was gone from the primary already.  One thing which should
help this would be to use physical replication slots on the primary,
which would keep it from throwing away WAL files until it knows the
replica has them, but that runs the risk of ending up with lots of extra
WAL on the primary if the replica is gone for a while.  If you'd prefer
to avoid that then having wal_keep_segments set to '1' would avoid this
particular issue as well, I'd expect.

I do wonder if perhaps we should just default to having it as '1' to
avoid exactly this case, as it seems like perhaps PG archived D0 and
then flipped to D1 and got rid of D0, which is all pretty reasonable,
except that a replica trying to catch up is going to end up asking for
D0 from the primary because it didn't know if there was anything else
that should have been in D0..


OK.  I think this is perhaps a documentation bug, maybe a missing
warning when the master reads its configuration, and maybe (as you say)
a bad default value.

Specifically, section 26.2.5 of the docs says:

"If you use streaming replication without file-based continuous archiving,
the server might recycle old WAL segments before the standby has received
them. If this occurs, the standby will need to be reinitialized from a new
base backup. You can avoid this by setting wal_keep_segments to a value
large enough to ensure that WAL segments are not recycled too early, or by
configuring a replication slot for the standby. If you set up a WAL archive
that's accessible from the standby, these solutions are not required, since
the standby can always use the archive to catch up provided it retains enough
segments."

OR, maybe the WAL reader that processes the files that restore_command fetches
could be smart enough to realise that it can skip over the gap at the end?

Anyway.  Do others agree that my issue was the result of 
wal_keep_segments=0 ?



Regards, Phil.







Re: Replication failure, slave requesting old segments

2018-08-12 Thread Stephen Frost
Greetings,

* Phil Endecott (spam_from_pgsql_li...@chezphil.org) wrote:
> OK.  I think this is perhaps a documentation bug, maybe a missing 
> warning when the master reads its configuration, and maybe (as you say) 
> a bad default value.

If we consider it to be an issue worthy of a change then we should
probably just change the default value, and maybe not even allow it to
be set lower than '1'.

> Specifically, section 26.2.5 of the docs says:
> 
> "If you use streaming replication without file-based continuous archiving, 
> the server might recycle old WAL segments before the standby has received 
> them. If this occurs, the standby will need to be reinitialized from a new 
> base backup. You can avoid this by setting wal_keep_segments to a value 
> large enough to ensure that WAL segments are not recycled too early, or by 
> configuring a replication slot for the standby. If you set up a WAL archive 
> that's accessible from the standby, these solutions are not required, since 
> the standby can always use the archive to catch up provided it retains enough 
> segments."
> 
> OR, maybe the WAL reader that processes the files that restore_command fetches 
> could be smart enough to realise that it can skip over the gap at the end?

That strikes me as a whole lot more complication in something we'd
rather not introduce additional complications into without very good
reason.  Then again, there was just a nearby discussion about how it'd
be nice if the backend could realize when a WAL file is complete, and I
do think that'll be more of an issue when users start configuring larger
WAL files, so perhaps we should figure out a way to handle this..

> Anyway.  Do others agree that my issue was the result of wal_keep_segments=0 ?

Yeah, I've not spent much time actually looking at code around this, so
it'd be nice to get:

a) a reproducible case demonstrating it's happening
b) testing to see how long it's been this way
c) if setting wal_keep_segments=1 fixes it
d) perhaps some thought around if we could address this some other way

Thanks!

Stephen




Re: Replication failure, slave requesting old segments

2018-08-12 Thread Adrian Klaver

On 08/12/2018 02:56 PM, Phil Endecott wrote:

Stephen Frost wrote:




Specifically, section 26.2.5 of the docs says:

"If you use streaming replication without file-based continuous archiving,
the server might recycle old WAL segments before the standby has received
them. If this occurs, the standby will need to be reinitialized from a new
base backup. You can avoid this by setting wal_keep_segments to a value
large enough to ensure that WAL segments are not recycled too early, or by
configuring a replication slot for the standby. If you set up a WAL archive
that's accessible from the standby, these solutions are not required, since
the standby can always use the archive to catch up provided it retains enough
segments."

OR, maybe the WAL reader that processes the files that restore_command fetches
could be smart enough to realise that it can skip over the gap at the end?

Anyway.  Do others agree that my issue was the result of 
wal_keep_segments=0 ?


Only as a sub-issue of the slave losing contact with the master. The 
basic problem is maintaining two separate operations, archiving and 
streaming, in sync. If either or some combination of both lose 
synchronization then it is anyone's guess on what is appropriate for 
wal_keep_segments.






Regards, Phil.









--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Replication failure, slave requesting old segments

2018-08-12 Thread Adrian Klaver

On 08/12/2018 03:54 PM, Stephen Frost wrote:

Greetings,

* Phil Endecott (spam_from_pgsql_li...@chezphil.org) wrote:

OK.  I think this is perhaps a documentation bug, maybe a missing
warning when the master reads its configuration, and maybe (as you say)
a bad default value.


If we consider it to be an issue worthy of a change then we should
probably just change the default value, and maybe not even allow it to
be set lower than '1'.



I would say leave the default at 0 as it leaves no doubt that you are 
performing without a net. A setting of '1' implies you are covered and 
for a fast moving cluster or slow moving one with sufficient downtime 
that would not be the case. Better to let the end user know this is not 
a simple problem and some thought needs to go into configuration.



--
Adrian Klaver
adrian.kla...@aklaver.com



Safe operations?

2018-08-12 Thread Samuel Williams
I wish the documentation would include performance details, i.e. this
operation is O(N) or O(1) relative to the number of rows.

I found renaming a table was okay.

How about renaming a column? Is it O(1) or proportional to the amount of
data?

Is there any documentation about this?

Thanks
Samuel


Re: Safe operations?

2018-08-12 Thread Adrian Klaver

On 08/12/2018 05:41 PM, Samuel Williams wrote:
I wish the documentation would include performance details, i.e. this 
operation is O(N) or O(1) relative to the number of rows.


I found renaming a table was okay.

How about renaming a column? Is it O(1) or proportional to the amount of 
data?


Is there any documentation about this?


https://www.postgresql.org/docs/10/static/sql-altertable.html

"RENAME

The RENAME forms change the name of a table (or an index, sequence, 
view, materialized view, or foreign table), the name of an individual 
column in a table, or the name of a constraint of the table. There is no 
effect on the stored data.

"



Thanks
Samuel



--
Adrian Klaver
adrian.kla...@aklaver.com
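
In other words, the rename itself is a catalog-only change whose cost does not
depend on the number of rows. A sketch (table and column names are made up):

ALTER TABLE measurements RENAME COLUMN reading TO reading_celsius;
-- updates one row in pg_attribute; the heap and any indexes are not rewritten,
-- though the statement still needs a brief ACCESS EXCLUSIVE lock on the table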



Re: Safe operations?

2018-08-12 Thread Tim Cross
On Mon, 13 Aug 2018 at 11:24, Adrian Klaver 
wrote:

> On 08/12/2018 05:41 PM, Samuel Williams wrote:
> > I wish the documentation would include performance details, i.e. this
> > operation is O(N) or O(1) relative to the number of rows.
> >
> > I found renaming a table was okay.
> >
> > How about renaming a column? Is it O(1) or proportional to the amount of
> > data?
> >
> > Is there any documentation about this?
>
> https://www.postgresql.org/docs/10/static/sql-altertable.html
>
> "RENAME
>
>  The RENAME forms change the name of a table (or an index, sequence,
> view, materialized view, or foreign table), the name of an individual
> column in a table, or the name of a constraint of the table. There is no
> effect on the stored data.
> "
>
> Just wondering - what about the case when the column being renamed is also
> referenced in an index or check constraint? (I would guess you cannot
> rename a column used in a check constraint without first removing it, but
> for an index, would this result in the index being rebuilt (or do you have
> to take care of that manually or are such references abstracted such that
> the column name "text" is irrelevant to the actual structure of the
> index?).



>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>

-- 
regards,

Tim

--
Tim Cross


Re: Safe operations?

2018-08-12 Thread Olivier Gautherot
On Sun, Aug 12, 2018 at 11:06 PM, Tim Cross  wrote:

>
> On Mon, 13 Aug 2018 at 11:24, Adrian Klaver 
> wrote:
>
>> On 08/12/2018 05:41 PM, Samuel Williams wrote:
>> > I wish the documentation would include performance details, i.e. this
>> > operation is O(N) or O(1) relative to the number of rows.
>> >
>> > I found renaming a table was okay.
>> >
>> > How about renaming a column? Is it O(1) or proportional to the amount
>> of
>> > data?
>> >
>> > Is there any documentation about this?
>>
>> https://www.postgresql.org/docs/10/static/sql-altertable.html
>>
>> "RENAME
>>
>>  The RENAME forms change the name of a table (or an index, sequence,
>> view, materialized view, or foreign table), the name of an individual
>> column in a table, or the name of a constraint of the table. There is no
>> effect on the stored data.
>> "
>>
>> Just wondering - what about the case when the column being renamed is
>> also referenced in an index or check constraint? (I would guess you cannot
>> rename a column used in a check constraint without first removing it, but
>> for an index, would this result in the index being rebuilt (or do you have
>> to take care of that manually or are such references abstracted such that
>> the column name "text" is irrelevant to the actual structure of the
>> index?).
>
>
Tim, as far as I know, names are only an attribute tagged to an OID.
Internal relations are through these OIDs, not names, so renaming a column
is really one-shot. Names are mainly a more convenient way of referring to
objects.

Olivier


Re: Safe operations?

2018-08-12 Thread Tim Cross
On Mon, 13 Aug 2018 at 12:23, Olivier Gautherot 
wrote:

> On Sun, Aug 12, 2018 at 11:06 PM, Tim Cross  wrote:
>
>>
>> On Mon, 13 Aug 2018 at 11:24, Adrian Klaver 
>> wrote:
>>
>>> On 08/12/2018 05:41 PM, Samuel Williams wrote:
>>> > I wish the documentation would include performance details, i.e. this
>>> > operation is O(N) or O(1) relative to the number of rows.
>>> >
>>> > I found renaming a table was okay.
>>> >
>>> > How about renaming a column? Is it O(1) or proportional to the amount
>>> of
>>> > data?
>>> >
>>> > Is there any documentation about this?
>>>
>>> https://www.postgresql.org/docs/10/static/sql-altertable.html
>>>
>>> "RENAME
>>>
>>>  The RENAME forms change the name of a table (or an index, sequence,
>>> view, materialized view, or foreign table), the name of an individual
>>> column in a table, or the name of a constraint of the table. There is no
>>> effect on the stored data.
>>> "
>>>
>>> Just wondering - what about the case when the column being renamed is
>>> also referenced in an index or check constraint? (I would guess you cannot
>>> rename a column used in a check constraint without first removing it, but
>>> for an index, would this result in the index being rebuilt (or do you have
>>> to take care of that manually or are such references abstracted such that
>>> the column name "text" is irrelevant to the actual structure of the
>>> index?).
>>
>>
> Tim, as far as I know, names are only an attribute tagged to an OID.
> Internal relations are through these OIDs, not names, so renaming a column
> is really one-shot. Names are mainly a more convenient way of referring to
> objects.
>
> Olivier
>

thanks Olivier, that is what I suspected and your explanation fits with my
mental model. I had assumed table/column names are a convenience for humans
and that the system would use OIDs etc for internal references.
-- 
regards,

Tim

--
Tim Cross


Re: Safe operations?

2018-08-12 Thread Tom Lane
Tim Cross  writes:
> On Mon, 13 Aug 2018 at 12:23, Olivier Gautherot 
>> On Sun, Aug 12, 2018 at 11:06 PM, Tim Cross  wrote:
>>> Just wondering - what about the case when the column being renamed is
>>> also referenced in an index or check constraint?

>> Tim, as far as I know, names are only an attribute tagged to an OID.
>> Internal relations are through these OIDs, not names, so renaming a column
>> is really one-shot. Names are mainly a more convenient way of referring to
>> objects.

> thanks Olivier, that is what I suspected and your explanation fits with my
> mental model. I had assumed table/column names are a convenience for humans
> and that the system would use OIDs etc for internal references.

Right, catalog internal references are all via OIDs or column numbers,
so that the only thing the system thinks it needs to do is update the
"name" field in a single catalog row.  (A problem with this is that
user-defined function bodies are stored as text; so you may well have
to run around and fix your functions by hand.  But that doesn't
contribute to the cost of the RENAME operation per se.)

Getting back to Samuel's original question, the reason we don't try
to document performance issues like this is that there are just too
many moving parts.  Yeah, the update of the catalog row should be
more or less O(1), and then the required updates of the catalog's
indexes will be more or less O(log N) (N being the number of rows
in that catalog).  But in practice the main constraint is often the
need to obtain locks on the relevant database objects, and that's
really hard to give a prediction for.

regards, tom lane
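
One way to see what Tom and Olivier describe (object names are illustrative):
a column's identity in the catalogs is the pair (attrelid, attnum), and
attname is just the label that RENAME updates.

SELECT attrelid::regclass, attnum, attname
FROM   pg_attribute
WHERE  attrelid = 'measurements'::regclass AND attnum > 0;

ALTER TABLE measurements RENAME COLUMN reading TO reading_celsius;

-- running the first query again shows the same (attrelid, attnum) pairs
-- with only attname changed; indexes and constraints keep referencing attnum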