Re: PostgreSQL extension for processing Graph queries (Apache AGE)
On 2022-Nov-29, Young Seung Andrew Ko wrote: > Hello PostgreSQL users, > > https://github.com/apache/age > Apache AGE is an Apache 2-licensed open source PostgreSQL extension for > storing Graph data. > > The current version of Apache AGE is to enable PostgreSQL users to use > Neo4j's openCypher-based graph queries in unison with existing relational > tables Can you show some examples of this feature in action? What sort of data would I use it for, how would I query it? -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
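For reference, the project README linked above shows the general shape of the feature: graph data lives alongside ordinary tables and is queried through the cypher() set-returning function. A minimal sketch (graph name and data invented for illustration):

    CREATE EXTENSION age;
    LOAD 'age';
    SET search_path = ag_catalog, "$user", public;

    SELECT create_graph('demo_graph');

    -- create two vertices and an edge using openCypher
    SELECT * FROM cypher('demo_graph', $$
        CREATE (:Person {name: 'Alice'})-[:KNOWS]->(:Person {name: 'Bob'})
    $$) AS (result agtype);

    -- query the graph; the result set can be joined to ordinary relational tables
    SELECT * FROM cypher('demo_graph', $$
        MATCH (a:Person)-[:KNOWS]->(b:Person) RETURN a.name, b.name
    $$) AS (a_name agtype, b_name agtype);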
Finding free time period on non-continuous tstzrange field values
Hi

Given the following table, how do I find the free time periods?

CREATE TABLE test_time_range (
    id SERIAL PRIMARY KEY,
    time_range tstzrange);

Insert into test_time_range(time_range) values('[2022-11-28 08:00:00, 2022-11-28 20:00:00]');
Insert into test_time_range(time_range) values('[2022-11-29 12:30:00, 2022-11-29 22:00:00]');
Insert into test_time_range(time_range) values('[2022-11-30 05:00:00, 2022-11-30 19:00:00]');

In the above example, I would like the query to return something like this:

"2022-11-28 20:01:00  2022-11-29 11:29:00"
"2022-11-29 22:01:00  2022-11-30 04:59:00"

Apologies if this is a dumb question, but I am trying to use ranges for the first time and can't get my head around it.

Using PG14, can upgrade to 15 if that matters.

Amitabh
Re: Finding free time period on non-continuous tstzrange field values
> Given the following table, how do I find the free time periods?

https://www.crunchydata.com/blog/better-range-types-in-postgres-14-turning-100-lines-of-sql-into-3
Re: Finding free time period on non-continuous tstzrange field values
On Wed, Nov 30, 2022 at 7:20 PM Marcos Pegoraro wrote:
>> Given the following table, how do I find the free time periods?
>
> https://www.crunchydata.com/blog/better-range-types-in-postgres-14-turning-100-lines-of-sql-into-3

Thanks Marcos. I had seen this earlier but it somehow slipped my mind to use it for the implementation. Will try implementing it using tstzmultirange.
postgres large database backup
Hi, I have a 10TB database running on postgres 11 version running on centos 7 "on premises", I need to schedule the backup of this database in a faster way. The scheduled backup will be used for PITR purposes. So please let me know how I should do it in a quicker backup for my 10TB database ? Is there any tool to take backups and subsequently incremental backups in a faster way and restore it for PITR in a faster way when required. What should be the exact approach for scheduling such backups so that it can be restored in a faster way ? Regards.
Re: postgres large database backup
On Wed, Nov 30, 2022 at 8:40 AM Atul Kumar wrote:
>
> Hi,
>
> I have a 10TB database running on postgres 11 version running on centos 7 "on
> premises", I need to schedule the backup of this database in a faster way.
>
> The scheduled backup will be used for PITR purposes.
>
> So please let me know how I should do it in a quicker backup for my 10TB
> database ? Is there any tool to take backups and subsequently incremental
> backups in a faster way and restore it for PITR in a faster way when required.
>
> What should be the exact approach for scheduling such backups so that it can
> be restored in a faster way ?
>
> Regards.

Faster than *what*?

If speed is the primary criterion, filesystem snapshots would be the fastest to restore: use pg_start_backup() to put the DB cluster into backup mode, snapshot, then pg_stop_backup(), and capture the WALs generated alongside your FS snapshot, all on the same machine or shared storage. To restore, bring back the old snapshot plus the WALs captured with the DB shutdown/stopped; startup is then normal "crash recovery", or you can select a PITR target/LSN within the short pg_start_backup() ... pg_stop_backup() window. If you're properly archiving WALs beyond JUST the full backup, you can PITR to any point after the full backup snapshot, but the more transactions/WAL it has to process to get to the desired point, the longer the recovery.

pgbackrest can back up a PG cluster in multiple ways (including taking a base backup while actively streaming WALs, or acting as the WAL archiver), and a restore on the same machine as the backup repository would basically be limited by I/O (well, unless you've got all NVMe, in which case CPU, bus, or memory bandwidth constraints become the limiting factor).

Basically, no matter how you back up, 10TB takes a long time to copy, and except in the "local FS snapshot" method I outlined above, your limiting factor is going to be how fast you can move the data back to where you need it.

For critical DBs of this nature I've actually done almost exactly the method I just outlined, only the backup/snapshot process happens on a replica. *NORMAL* failure recovery in that replicated cluster is by failovers, but for an actual backup restore due to disaster or a need to go back in time (which is...extremely rare...) there's some manual intervention to bring up a snapshot and play back WALs to the point in time that we want the DB cluster at.

--
"Genius might be described as a supreme capacity for getting its possessors into trouble of all kinds." -- Samuel Butler
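A rough sketch of that snapshot sequence on a PostgreSQL 11 cluster (the label is arbitrary; the exclusive form shown here is the simplest to script but was later removed in PostgreSQL 15, where pg_backup_start()/pg_backup_stop() replace it):

    -- put the cluster into backup mode; 'true' requests an immediate checkpoint
    SELECT pg_start_backup('fs_snapshot', true);

    -- ... take the filesystem / LVM / SAN snapshot of the data directory here ...

    -- end backup mode; keep archiving every WAL segment from this point on
    SELECT pg_stop_backup();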
Re: postgres large database backup
> On Nov 30, 2022, at 07:40, Atul Kumar wrote: > So please let me know how I should do it in a quicker backup for my 10TB > database ? Is there any tool to take backups and subsequently incremental > backups in a faster way and restore it for PITR in a faster way when required. For an initial backup, you'll need to copy 10TB to another medium; there's no getting around that. pgBackRest supports incremental and differential backups, which are especially useful if there are tables that are largely for archival purposes and do not get frequently modified: https://pgbackrest.org
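Very roughly, and assuming a stanza named "main" has already been configured (see the linked docs for the actual setup), the backup cycle looks like this:

    pgbackrest --stanza=main stanza-create
    pgbackrest --stanza=main --type=full backup   # initial full backup
    pgbackrest --stanza=main --type=diff backup   # differential against the last full backup
    pgbackrest --stanza=main --type=incr backup   # incremental against the last backup of any type
    pgbackrest --stanza=main --delta restore      # restore, reusing files that have not changed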
Re: postgres large database backup
On Wed, Nov 30, 2022, 9:10 PM Atul Kumar wrote:
> Hi,
>
> I have a 10TB database running on postgres 11 version running on centos 7 "on premises", I need to schedule the backup of this database in a faster way.
>
> The scheduled backup will be used for PITR purposes.
>
> So please let me know how I should do it in a quicker backup for my 10TB database ? Is there any tool to take backups and subsequently incremental backups in a faster way and restore it for PITR in a faster way when required.
>
> What should be the exact approach for scheduling such backups so that it can be restored in a faster way ?

We had a 96-CPU, 385 GB RAM, NVMe-storage, 10G-network bare-metal server. We used pgbackrest for the full backup; it supports PITR and differential backups. A 28 TB db took 2.5 hours to back up to remote storage, and a restore from the remote storage took 3.5 hours when restored immediately (a lot of additional time goes to WAL replay afterwards to catch up).

pg_basebackup took 9 hours.
Re: postgresql 13.1: precision of spatial operations
On 11/30/2022 9:48 AM, Вадим Самохин wrote: Thank you so much Ivan, it worked! Can you give any more detail on which approach you took, for the sake of future followers?
Re: postgres large database backup
On 11/30/22 10:40, Atul Kumar wrote:
> Hi,
>
> I have a 10TB database running on postgres 11 version running on centos 7 "on premises", I need to schedule the backup of this database in a faster way.
>
> The scheduled backup will be used for PITR purposes.
>
> So please let me know how I should do it in a quicker backup for my 10TB database ? Is there any tool to take backups and subsequently incremental backups in a faster way and restore it for PITR in a faster way when required.
>
> What should be the exact approach for scheduling such backups so that it can be restored in a faster way ?

Well, that depends on your configuration details. Is your DB located on a SAN device or JBOD? If it's SAN, then it can do snapshots. All enterprise backup utilities like NetBackup, TSM, Avamar or Commvault support snapshots. They can all do full and incremental backups, and they can also do PITR. Your database is an enterprise-sized database and needs an enterprise-level tool. If you want a freebie, pgbackrest is the best thing no money can buy. Pgbackrest supports parallelism, full and incremental backups.

--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com
Re: postgres large database backup
On 11/30/22 11:41, Vijaykumar Jain wrote: On Wed, Nov 30, 2022, 9:10 PM Atul Kumar wrote: Hi, I have a 10TB database running on postgres 11 version running on centos 7 "on premises", I need to schedule the backup of this database in a faster way. The scheduled backup will be used for PITR purposes. So please let me know how I should do it in a quicker backup for my 10TB database ? Is there any tool to take backups and subsequently incremental backups in a faster way and restore it for PITR in a faster way when required. What should be the exact approach for scheduling such backups so that it can be restored in a faster way ? We had a 96 cpu, 385gb ram, nvme storage and 10g network baremetal server. We used pgbackrest for full backup. It supports pitr and differential backup. 28tb db took 2.5 hours for backup on remote storage, and restore from the remote storage took 3.5 hours when immediately restored (a lot of time is later due to wal replay to catch up) Another vote for pgbackrest. It supports AES encryption, and multiple forms of compression. -- Angular momentum makes the world go 'round.
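For example, repository encryption and compression in pgBackRest are just configuration options; a sketch with illustrative values only (the passphrase and worker count are made up):

    [global]
    repo1-cipher-type=aes-256-cbc
    repo1-cipher-pass=some-long-secret
    compress-type=zst
    compress-level=3
    process-max=8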
Re: postgres large database backup
Am 30.11.22 um 20:01 schrieb Mladen Gogala: On 11/30/22 10:40, Atul Kumar wrote: Hi, I have a 10TB database running on postgres 11 version running on centos 7 "on premises", I need to schedule the backup of this database in a faster way. The scheduled backup will be used for PITR purposes. Well, that depends on your configuration details. Is your DB located on a SAN device or JBOD? You could also use a filesystem that can do atomic snapshots - like ZFS. However, I'm wondering why none of the previous respondents mentioned it? Sure, ZFS may have its own performance implications... but my experience is that it makes "safety snapshots" and "on-demand throw-away testing environments" very convenient to use. Best regards, -hannes
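A sketch of that workflow (dataset and host names are invented; this assumes the whole cluster, including pg_wal, lives on the one dataset so the snapshot is crash-consistent):

    zfs snapshot tank/pgdata@nightly                  # atomic and near-instant
    zfs clone tank/pgdata@nightly tank/pgtest         # throw-away test environment from the snapshot
    zfs send tank/pgdata@nightly | ssh backuphost zfs receive backup/pgdata   # ship a copy off the box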
Re: postgres large database backup
On 11/30/22 18:19, Hannes Erven wrote: You could also use a filesystem that can do atomic snapshots - like ZFS. Uh, oh. Not so sure about that. Here is a page from the world of the big O: https://blog.docbert.org/oracle-on-zfs/ However, similar can be said about ZFS. ZFS snapshots will slow down the I/O considerably. I would definitely prefer snapshots done in hardware and not in software. My favorite file systems, depending on the type of disk, are F2FS and XFS. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: postgres large database backup
On Wed, Nov 30, 2022 at 18:03 Mladen Gogala wrote:
> On 11/30/22 18:19, Hannes Erven wrote:
> > You could also use a filesystem that can do atomic snapshots - like ZFS.
>
> Uh, oh. Not so sure about that. Here is a page from the world of the big
> O: https://blog.docbert.org/oracle-on-zfs/
>
> However, similar can be said about ZFS. ZFS snapshots will slow down the
> I/O considerably. I would definitely prefer snapshots done in hardware and
> not in software. My favorite file systems, depending on the type of disk,
> are F2FS and XFS.

ZFS snapshots don't typically have much if any performance impact versus not having a snapshot (and already being on ZFS), because it's already doing COW-style semantics. Postgres write performance on ZFS is difficult because it's super important to match up the underlying I/O sizes to the device/ZFS ashift, the ZFS recordsize, and the DB's page/WAL page sizes; not getting this right also causes performance issues without any snapshots, because, again, COW. If you're constantly breaking a record block or sector there's going to be a big impact. It won't be any worse (in my own testing) whether you have snapshots or not. Snapshots on ZFS don't cause any crazy write amplification by themselves (I'm not sure they cause any extra writes at all; I'd have to do some sleuthing).

ZFS will, yes, be slower than a raw disk (but that's not an option for Pg anyway), and may or may not be faster than a different filesystem on a HW RAID volume or storage array volume. It absolutely takes more care/clue/tuning to get Pg write performance on ZFS, and ZFS does duplicate some of Pg's resiliency, so there is duplicate work going on.

I'd say really that 2016 article is meaningless, as ZFS, Oracle, and Postgres have all evolved dramatically in six years. Even more so since there's nothing remotely like ASM for Postgres.

> --
> Mladen Gogala
> Database Consultant
> Tel: (347) 321-1217
> https://dbwhisperer.wordpress.com

--
"Genius might be described as a supreme capacity for getting its possessors into trouble of all kinds." -- Samuel Butler
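As an illustration of the matching-up being described above (typical starting values only; the right numbers depend on the devices, and the pool/dataset names are invented):

    zpool create -o ashift=12 tank mirror /dev/nvme0n1 /dev/nvme1n1            # ashift to match 4K physical sectors
    zfs create -o recordsize=8k -o compression=lz4 -o atime=off tank/pgdata    # recordsize to match PostgreSQL's 8kB pages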
Re: postgres large database backup
On 11/30/22 19:41, Michael Loftis wrote: On Wed, Nov 30, 2022 at 18:03 Mladen Gogala wrote: On 11/30/22 18:19, Hannes Erven wrote: You could also use a filesystem that can do atomic snapshots - like ZFS. Uh, oh. Not so sure about that. Here is a page from the world of the big O: https://blog.docbert.org/oracle-on-zfs/ However, similar can be said about ZFS. ZFS snapshots will slow down the I/O considerably. I would definitely prefer snapshots done in hardware and not in software. My favorite file systems, depending on the type of disk, are F2FS and XFS. ZFS snapshots don’t typically have much if any performance impact versus not having a snapshot (and already being on ZFS) because it’s already doing COW style semantics. Postgres write performance using ZFS is difficult because it’s super important to match up the underlying I/O sizes to the device/ZFS ashift, the ZFS recordsize, and the DB’s page/wal page sizes though, but not getting this right also cause performance issues without any snapshots, because again COW. If you’re constantly breaking a record block or sector there’s going to be a big impact. It won’t be any worse (in my own testing) regardless of if you have snapshots or not. Snapshots on ZFS don’t cause any crazy write amplification by themselves (I’m not sure they cause any extra writes at all, I’d have to do some sleuthing) ZFS will yes be slower than a raw disk (but that’s not an option for Pg anyway), and may or may not be faster than a different filesystem on a HW RAID volume or storage array volume. It absolutely takes more care/clue/tuning to get Pg write performance on ZFS, and ZFS does duplicate some of Pg’s resiliency so there is duplicate work going on. I wonder what percentage of /Big Databases/ (like Op's and Vijaykumar's) are still on physical servers, as opposed to VMs connected to SANs. Even many physical servers are connected to SANs. (That is, of course, in the dreaded Enterprise environment.) -- Angular momentum makes the world go 'round.
Librephotos on Synology DSM with Docker Problem with PostGres
Hello, looking for some help.

I am trying to put Librephotos on my NAS that's running DSM, using Docker, but I'm getting this error on the postgres image/container. I have the "librephotos-docker-main" folder placed into the docker folder on my host machine, and I tried creating a directory "db" in the docker folder as well. That didn't work. Does anyone know where this directory should be and why it's not putting it in place on its own?

[image: image.png]

Here is that container section of the "docker-compose.yml" also:

  db:
    image: postgres:13
    container_name: db
    restart: unless-stopped
    environment:
      - POSTGRES_USER=${dbUser}
      - POSTGRES_PASSWORD=${dbPass}
      - POSTGRES_DB=${dbName}
    volumes:
      - ${data}/db:/var/lib/postgresql/data
    command: postgres -c fsync=off -c synchronous_commit=off -c full_page_writes=off -c random_page_cost=1.0
    #Checking health of Postgres db
    healthcheck:
      test: psql -U ${dbUser} -d ${dbName} -c "SELECT 1;"
      interval: 5s
      timeout: 5s
      retries: 5
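For what it's worth, Docker Compose fills in ${data} (and ${dbUser} etc.) from the environment, typically via a .env file in the same folder as docker-compose.yml, so the host side of that bind mount ends up being <data>/db; that path has to exist (or be creatable by Docker) on the NAS. A made-up example of the relevant entry, with a path that is only illustrative for DSM:

    # .env, in the same folder as docker-compose.yml
    data=/volume1/docker/librephotos-docker-main/data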
Re: Finding free time period on non-continuous tstzrange field values
On Wed, Nov 30, 2022 at 7:13 PM Amitabh Kant wrote:
> Hi
>
> Given the following table, how do I find the free time periods?
>
> CREATE TABLE test_time_range (
>     id SERIAL PRIMARY KEY,
>     time_range tstzrange);
>
> Insert into test_time_range(time_range) values('[2022-11-28 08:00:00, 2022-11-28 20:00:00]');
> Insert into test_time_range(time_range) values('[2022-11-29 12:30:00, 2022-11-29 22:00:00]');
> Insert into test_time_range(time_range) values('[2022-11-30 05:00:00, 2022-11-30 19:00:00]');
>
> In the above example, I would like the query to return something like this:
> "2022-11-28 20:01:00  2022-11-29 11:29:00"
> "2022-11-29 22:01:00  2022-11-30 04:59:00"
>
> Apologies if this is a dumb question, but I am trying to use ranges for the first time and can't get my head around it.
>
> Using PG14, can upgrade to 15 if that matters.
>
> Amitabh

Based on Marcos' suggestion ( https://www.crunchydata.com/blog/better-range-types-in-postgres-14-turning-100-lines-of-sql-into-3 ), I tried the following query:

SELECT tstzmultirange(tstzrange('2022-11-25 00:00:00', '2022-11-30 00:00:00', '[]')) - range_agg(time_range) AS availability
FROM test_time_range
WHERE time_range && tstzrange('2022-11-25 00:00:00', '2022-11-30 00:00:00', '[]');

but then I receive the following error. My guess is I need to cast the tstzrange output, but I can't seem to find the correct cast.

ERROR:  function tstzmultirange(tstzrange) does not exist
LINE 1: SELECT tstzmultirange(tstzrange('2022-11-25 00:00:00', '2022...
               ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
SQL state: 42883
Character: 8
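One thing worth ruling out before hunting for a cast: multirange types and their constructors, including tstzmultirange(tstzrange), only exist in PostgreSQL 14 and later, so this particular error can simply mean the session is connected to an older server than expected. A rough check, and a sketch of the query with explicit casts (the date window is the same example as above):

    SELECT version();   -- must report 14 or newer for tstzmultirange() to exist

    -- on 14+ this form should work; the explicit timestamptz casts don't hurt either
    SELECT tstzmultirange(tstzrange('2022-11-25 00:00:00'::timestamptz,
                                    '2022-11-30 00:00:00'::timestamptz, '[]'))
           - range_agg(time_range) AS availability
    FROM test_time_range
    WHERE time_range && tstzrange('2022-11-25 00:00:00'::timestamptz,
                                  '2022-11-30 00:00:00'::timestamptz, '[]');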