Cluster for an appliance-type deployment

2023-11-06 Thread Matthias Leisi
Dear all,

I’m reasonably experienced with Postgres with simple (single, „rebuild and 
restore“) requirements, but would need some hints on what to look for in a more 
complex situation - deploying Postgres as the backend for a (virtual) 
appliance. 

This appliance can scale horizontally from a single to dozens of VMs 
(theoretically more, but most installations are small-ish). It is feasible to 
configure VMs for particular purposes (eg „you are [also] a DB node“), but 
basically all instances will/should be able to perform their main tasks besides 
(also) being a DB node. As the VMs may be installed in very different 
environments, network-based solutions are less feasible and we would prefer a 
DB-level solution. We assume that for most cases, primary/stand-by 
configurations would be sufficient in terms of availability / latency / 
throughput. 

We must also assume that there is no person who would be able to touch things 
if an error occurs. Data consistency and (as much as possible) automated 
recovery from error situations („VM down“, „network lost“, …) are therefor more 
important than „n nines". We can assume that the VMs can talk to each other 
over TCP (eg using SSH tunnels, direct Postgres connection, or some other 
suitable protocol). Scripting „around“ the database is available to initialize 
instances and for similar tasks.

Would Postgres’ own log-shipping (file-based + streaming replication, possibly 
with remote_write) be sufficient for such a set of requirements?

What aspects would you consider important for such a scenario? 

PS: In case it matters, OpenBSD is the base platform of the VMs.

Thanks for your inputs!

— Matthias






Database schema for "custom fields"

2024-09-09 Thread Matthias Leisi
I’m looking for input on a database design question. 

Suppose you have an application that allows the user to add some kind of field 
to the application („custom fields“, „user defined fields“, „extended fields“, 
…), which could be of different types (eg string, int, bool, date, array of 
, …), and which would have some additional properties (like a 
display name or description, or some access control flags).

The application would need to be able to do CRUD on field content, and 
potentially use them in queries („search in custom field“ or similar). It’s not 
expected to be a high-transaction database, and not go beyond ~100k records. 
Data integrity is more important than performance.


How would you design this from a DB point of view? I see a few options, but all 
have some drawbacks:

1) Allow the application to add actual database columns to a „custom fields 
table". Drawback: needs DDL privileges for the application user, makes future 
schema updates potentially more difficult. Pro: „proper“ DB-based approach, can 
use all features of the DB.

2) Use a text-based or JSON field to store the „extended“ data. Drawback: type 
validation, query efficiency?. Pro: Very flexible?

3) Use a „data table“ with one column per potential type (fieldid, valstring, 
valint, valbool, …). Drawback: complex to query, waste of storage? Pro: use all 
DB features on „true“ columns, but without needing DDL privileges.

Are these the right drawbacks and pro arguments? Do you see other options?

Thanks for your insights,
— Matthias

-- 
Matthias Leisi
Katzenrütistrasse 68, 8153 Rümlang
Mobile +41 79 377 04 43
matth...@leisi.net



repmgr(d) versions 5.5 vs 5.4 from apt.postgresql.org

2024-12-24 Thread Matthias Leisi
I installed Postgres 16/17 from apt.postgresql.org <http://apt.postgresql.org/> 
(with the intent of using version 17) on Debian 12/Bookworm. Both have the same 
version of repmgr:

postgres@dev-pgcl01:~$ /usr/lib/postgresql/17/bin/repmgr  --version
repmgr 5.5.0
postgres@dev-pgcl01:~$ /usr/lib/postgresql/16/bin/repmgr  --version
repmgr 5.5.0

root@dev-pgcl01:~# apt-cache show postgresql-17-repmgr | grep "Filename"
Filename: pool/main/r/repmgr/postgresql-17-repmgr_5.5.0-2.pgdg120+1_arm64.deb

In postgresql.conf, I explicitly load the version 17 .so (just to be sure it’s 
not loading something potentially different):

shared_preload_libraries = '/usr/lib/postgresql/17/lib/repmgr.so'

repmgr.conf is minimalistic:

node_id=1
node_name=node1
conninfo='host=10.20.70.26 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/var/lib/postgresql/17/'
failover=automatic
use_replication_slots=true
promote_command='/usr/bin/repmgr standby promote -f 
/etc/postgresql/17/main/repmgr.conf --log-to-file'
follow_command='/usr/bin/repmgr standby follow -f 
/etc/postgresql/17/main/repmgr.conf --log-to-file --upstream-node-id=%n‘

(plus a few configs about the logs)

I created the cluster with

/usr/lib/postgresql/17/bin/repmgr -f /etc/postgresql/17/main/repmgr.conf 
primary register

And it shows what I would expect:

postgres@dev-pgcl01:~$ /usr/lib/postgresql/17/bin/repmgr -f 
/etc/postgresql/17/main/repmgr.conf cluster show
DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr 
host=10.20.70.26 fallback_application_name=repmgr options=-csearch_path="
DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr 
host=10.20.70.26 fallback_application_name=repmgr options=-csearch_path="
 ID | Name  | Role| Status| Upstream | Location | Priority | Timeline | 
Connection string
+---+-+---+--+--+--+--+
 1  | node1 | primary | * running |  | default  | 100  | 1| 
host=10.20.70.26 user=repmgr dbname=repmgr connect_timeout=2


I attempt to start repmgrd with

postgres@dev-pgcl01:~$ /usr/lib/postgresql/17/bin/repmgrd -v -f 
/etc/postgresql/17/main/repmgr.conf --verbose

Which gives the following log output:

[2024-12-24 09:21:25] [NOTICE] repmgrd (repmgrd 5.5.0) starting up
[2024-12-24 09:21:25] [INFO] connecting to database "host=10.20.70.26 
user=repmgr dbname=repmgr connect_timeout=2"
[2024-12-24 09:21:25] [DEBUG] connecting to: "user=repmgr connect_timeout=2 
dbname=repmgr host=10.20.70.26 fallback_application_name=repmgr 
options=-csearch_path="
[2024-12-24 09:21:25] [DEBUG] set_config():
  SET synchronous_commit TO 'local'
[2024-12-24 09:21:25] [DEBUG] expected extension version: 50500; extension 
version: 50400
[2024-12-24 09:21:25] [ERROR] this "repmgr" version is newer than the installed 
"repmgr" extension version
[2024-12-24 09:21:25] [DETAIL] "repmgr" version 5.5.0 providing extension 
version 5.5.0 is installed but extension is version 5.4
[2024-12-24 09:21:25] [HINT] update the installed extension version by 
executing "ALTER EXTENSION repmgr UPDATE" in the repmgr database


SELECT * FROM pg_catalog.pg_extension WHERE extname = 'repmgr'

shows version 5.4

The suggested ALTER EXTENSION from the log hint does not do anything I could 
see.

The same effect if I try the same using version 16.

How to continue from here? What am I missing?

Thanks for your help,
— Matthias

-- 
Matthias Leisi
Katzenrütistrasse 68, 8153 Rümlang
Mobile +41 79 377 04 43
matth...@leisi.net





Re: repmgr(d) versions 5.5 vs 5.4 from apt.postgresql.org

2024-12-24 Thread Matthias Leisi

> 
> This [HINT] says that YOU are supposed to run the ALTER EXTENSION command:
> [quote]
> update the installed extension version by executing "ALTER EXTENSION repmgr 
> UPDATE"
> [/quote]

I did run that, but it seemed to have no effect. How should the update work, 
what should it do?

— Matthias