Re: Binding Postgres to port 0 for testing

2023-03-27 Thread Markus Pilman
Thanks for the suggestions. I didn't know about pg_virtualenv, that's
interesting. Though it seems to achieve something similar as to
containerize the test (potentially in a more platform independent way).
Though it seems pg_virtualenv is mostly doing what my test driver is
currently doing. Trying out the ports is obviously possible, but it seems a
bit hacky to me (though if there's no better way I don't think that's a
good show-stopper).

But I am still wondering: Is there a reason PostgreSQL doesn't allow me to
bind against port 0? I understand that in a production environment this is
almost never the thing you want to do, but I wouldn't consider this option
very dangerous.

Best Markus


On Sat, Mar 25, 2023 at 12:10 PM Erik Wienhold  wrote:

> > On 25/03/2023 18:01 CET Markus Pilman  wrote:
> >
> > I am building a simple integration test framework for an application
> that I
> > am building. For this project I am planning to use PostgreSQL.
> >
> > For testing purposes I'd like to always start with an empty database,
> > populate data, and, if the test was successful, delete everything. These
> > tests are single process tests.
> >
> > I'd like to run many processes in parallel and have one postgres server
> > process running for each. I realize that I could use one and use multiple
> > databases but I don't want to do this for a variety of reasons (one being
> > that I also want to test the control plane logic, the application is
> > multi-tenant and uses a database per tenant, having separate databases
> > simplifies debugging etc).
> >
> > Now the problem is that I need to find a TCP port for each running
> postgres
> > instance. There's multiple ways to do this, but by far the easiest one I
> > know is to bind to port 0. So my plan was to start postgres with "-p 0"
> and
> > then parse stdout to figure out which port it actually uses. But that
> > doesn't seem to work:
> >
> > >postgres -D data/ -p 0
> >  2023-03-25 16:39:54.271 GMT [13924] FATAL: 0 is outside the valid range
> for parameter "port" (1 .. 65535)
> >
> > What would be the recommended way of addressing my issue?
>
> I would try to start Postgres with every port number in a for loop starting
> with port number 1024.  The first one that works is your port number.  And
> you
> may not even have to parse stdout if you can pass that port number to your
> tests.
>
> Maybe you can also use pg_virtualenv[0] from Debian's postgresql-common.
> It
> tries every port number starting from 5432.
>
> [0]
> https://manpages.debian.org/testing/postgresql-common/pg_virtualenv.1.en.html
>
> --
> Erik
>


Re: Binding Postgres to port 0 for testing

2023-03-27 Thread Markus Pilman
Thanks Michael and Erik for the help, I really appreciate it!

Thanks for explaining the context why PostgreSQL doesn't allow binding
against port 0.

I somehow didn't consider looking at the postgres tests, though it makes
sense that they need to solve this problem. If I read the perl code
correctly though it seems that this could, in theory, cause a race? The
script checks first whether the port has been assigned to a test, then
binds a socket to check whether it is used by someone else, closes this
test socker, and then starts a server process. I guess it's unlikely
enough, but isn't there a risk that some other process (that isn't
controlled by this perl script) binds to the found port right after this
test bind but right before postgres calls bind? I guess it should be rare
enough so that it wouldn't cause flaky tests.

I decided to implement the following (this strategy works, though it might
be a bit brittle if PostgreSQL changes the error output format in the
future):
1. Loop, starting from port 5432, incrementing each iteration
2. Start postgres with the given port
3. Parse the output to check whether postgres either writes a line that
ends with "could not create any TCP/IP sockets" (in which case I continue)
or with "database system is ready to accept connections" (in which case I
break).

This is definitely not the most elegant solution, but it should do for now.
At the moment I want to be able to set up everything in one process. In my
experience this makes debugging problems a bit easier but comes at the cost
of a more complex test driver (I recognize that it is a bit weird that the
application layer initializes the runtime environment in this case).

Also, this is a hobby-project and I am more interested in fun learning than
reducing work :) Generally I would agree that reusing existing and testing
code to run this would be better unless there's a really good reason not to
do that.

On Sun, Mar 26, 2023 at 7:27 PM Michael Paquier  wrote:

> On Sat, Mar 25, 2023 at 11:01:33AM -0600, Markus Pilman wrote:
> > Now the problem is that I need to find a TCP port for each running
> postgres
> > instance. There's multiple ways to do this, but by far the easiest one I
> > know is to bind to port 0. So my plan was to start postgres with "-p 0"
> and
> > then parse stdout to figure out which port it actually uses. But that
> > doesn't seem to work:
>
> Note that you can find some inspiration about that in the code tree
> within src/test/perl/PostgreSQL/Test/Cluster.pm, particularly
> get_free_port(), where we have now accumulated a couple of years of
> experience in designing something that's rather safe, even if it comes
> with its own limits.  It is in perl so perhaps you could just reuse it
> rather than reinvent the wheel?  Of course, still it should not be
> complicated to translate that in a different language, but there may
> be no need to reinvent the wheel.  And seeing your first message with
> the requirements you list, this does what you are looking for:
> - Create an empty cluster.
> - Freely create databases, tablespaces, queries, etc.
> - Wipe out the whole.
>
> The test cases around src/test/recovery/t/ could be a good starting
> point, as well.
> --
> Michael
>


C function returning a tuple containing an array of tuples

2023-03-27 Thread Alberto
Hello,

I have a function written in C language returning a tuple. One of the fields of
that tuple is an array of tuples.

I obtain the TupleDesc of the outer tuple using get_call_result_type(fcinfo, ...
and is working fine.

I can't figure how to obtain the TupleDesc of the tuple of the array (the 4th
attribute of the outer tuple, in my case).
For example, fcinfo->args[] array has some information about the attributes but
I haven't found anything useful to call get_call_result_type().

Any suggest?

Best regards

Alberto





Re: PostgreSQL vs MariaDB

2023-03-27 Thread Thomas Guyot

On 2023-03-24 07:07, Inzamam Shafiq wrote:

Hi Team,

Hope you are doing well.

Can someone please list pros and cons of MariaDB vs PostgreSQL that 
actually needs serious consideration while choosing the right database 
for large OLTP DBs (Terabytes)?





Hi Inzamam,

I will have my take as well, but note I have much more experience with 
MySQL/MariaDB and mostly from 10 years ago (although I did use both in 
the last decade too, mostly for hobby and a bit of PostgreSQL at work, 
and I have both running on my workstation).



First of all unless you plan on licensing Oracle for MySQL support, you 
should probably go with MariaDB (which is what you seem to consider 
already). I've known and used MySQL before the MariaDB fork (and even 
before Sun's acquisition), and MariaDB is still heavily developed with 
open bug trackers and many 3rd party companies specializing in 
MySQL/MariaDB support.



Having a sysadmin background, I find MariaDB to be easier to understand 
and administer as a server application. In the main engines, tables are 
straight up files on disk (for InnoDB which is now the default engine, a 
file-per-table option also makes this possible). There isn't really a 
concept of tablespaces, OTOH you can just move some files and symlink 
them (while the DB is down of course) to get some tables onto bigger or 
faster disks.


Recent versions of InnoDB (shortly after the MariaDB fork at least) have 
had a lot of scalability and instrumentation improvement (a lot of it 
from Percona's XtraDB fork), and also allow you to further separate the 
common data files such as using separate files for the doublewrite 
buffer and redo logs (write-only except during crash recovery; perfect 
for spinning disks) from other read/write data files (containing undo 
logs and system tables amongst others, and table data when not using 
file-per-table).


There's obviously the plugable engines (it appears PostgreSQL is 
implementing this too now), I'm less familiar with the latest 
development of those and have mostly used InnoDB/XtraDB but there's 
quite a few very specialized engines too. One I find particularly 
interesting is MyRocks which is optimized for flash storage with 
compression and can do high performance bulk inserts from files.



OTOH my experience with PostgreSQL is that it seems to have greater 
support for some SQL features and concepts, or at least used to. I'm not 
sufficiently SQLiterate to give many specifics but I remember seeing a 
few examples in the past, one was lack of sequences which appears to 
have been added about 5 years ago (before that one could use 
auto_increment keys to get similar functionality).


From my perspective PostgreSQL appears to be more similar to other 
database engines when it comes to managing tablespaces, schemas, etc., 
that said I had only limited experience with using Oracle, Sybase, DB2 
and MSSQL, and not really anything about managing tablespaces/schemas. 
Also unlike MariaDB, Postgresql can version DDL too (in InnoDB they 
cause an implicit commit and rollbacks are no longer possible for the 
transaction executing it).


I feel there may also likely more edge cases that you may have to be 
aware for some specific operations with MariaDB (it's well documented 
too) esp. with replication... but maybe that's just me knowing it 
better, and it's mostly from 10y old experience (it tend to be getting 
better over time and I haven't worked on any replicated setup lately).



So, TL;DR if you're a real DBA with experience with other commercial DB 
engines, I think you will find yourself more at ease with PostgreSQL, 
and it will likely be easier to port statements from other engines.


Someone with a strong sysadmin background, will likely be more 
comfortable setting up and maintaining MariaDB, and some of its plugable 
engines may also be worth considering, but that really depend on the 
type of load and hardware you will be using.


I know there's very good instrumentation to troubleshoot performance 
issues with MariaDB/InnoDB, something I'm absolutely not familiar with 
PostgreSQL...


Regards,

Thomas




Re: PostgreSQL vs MariaDB

2023-03-27 Thread Achilleas Mantzios - cloud

On 3/28/23 06:44, Thomas Guyot wrote:


On 2023-03-24 07:07, Inzamam Shafiq wrote:

Hi Team,

Hope you are doing well.

Can someone please list pros and cons of MariaDB vs PostgreSQL that 
actually needs serious consideration while choosing the right 
database for large OLTP DBs (Terabytes)?






Someone with a strong sysadmin background, will likely be more 
comfortable setting up and maintaining MariaDB, and some of its 
plugable engines may also be worth considering, but that really depend 
on the type of load and hardware you will be using.


I believe this as well. Also PostgreSQL is more loved by developers due 
to its academic and scientific origins.


Regarding the rest of commercial DB systems, using my PgSQL experience I 
had no problems living with MS SQL Server, and I kinda felt at home when 
it came to monitor MS SQL Server as well. Of course our main DB is 
PostgreSQL.




Thomas







Re: Oracle to PostgreSQL Migration

2023-03-27 Thread Laurenz Albe
On Fri, 2023-03-24 at 08:00 +, Inzamam Shafiq wrote:
> Thanks, oracle_fdw can be used, but it doesn't provide real time sync, we 
> have to
> schedule jobs to insert data in actual schema from foreign tables, so some 
> delay might be there.

In addition to the commercial tools that Christophe mentioned, and which might 
be
your best option, let me point out that ora_migrator
(https://github.com/cybertec-postgresql/ora_migrator), which is open source and 
based
on oracle_fdw, offers some degree of what you are looking for using a 
cursor-based soution.
Certainly net refined and certainly not feasible in cases with high 
transactional
workload, but perhaps it can be useful for you.

Yours,
Laurenz Albe