Re: Question About PostgreSQL Extensibility

2023-10-17 Thread Chris Travers
On Mon, Oct 16, 2023 at 10:59 PM Laurenz Albe 
wrote:

> On Fri, 2023-10-13 at 13:55 +, felix.quin...@yahoo.com wrote:
> > For the same reason that you can use python or perl in postgresql. It's
> just another language.
> > I have .net code running on several sql servers and to change the
> database to postgresql I have to reprogram them.
>
> Yes, you'll have to rewrite them.
>

Or pick up the pldotnet handler and patch it to work on the new version.
My experience forward-porting such things is that it is usually trivial.

Of course, there may be other reasons to rewrite, but it really depends on a
lot of factors.


>
> Yours,
> Laurenz Albe
>
>
>

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


PITR partial timing

2023-10-17 Thread Rama Krishnan
Hi team,

I am testing PITR.

1. If I do a full recovery, whatever new tables were created do not appear.
How do I handle that?

2. Given my logs below, which log time should I choose as the target for a
partial (point-in-time) recovery?


2023-10-16 16:22:35 UTC [16561]: [13-1] user=postgres,db=postgresLOG:
 statement: create database pumastore;
2023-10-16 16:22:35 UTC [16561]: [14-1] user=postgres,db=postgresLOG:
 duration: 70.433 ms
2023-10-16 16:23:02 UTC [16561]: [15-1] user=postgres,db=postgresLOG:
 statement: CREATE TABLE categories (
   category integer NOT NULL,
   categoryname character varying(50) NOT NULL
);
2023-10-16 16:23:02 UTC [16561]: [16-1] user=postgres,db=postgresLOG:
 duration: 6.328 ms
2023-10-16 16:23:28 UTC [16561]: [17-1] user=postgres,db=postgresLOG:
 statement: COPY categories (category, categoryname) FROM stdin;
2023-10-16 16:23:41 UTC [16561]: [18-1] user=postgres,db=postgresLOG:
 duration: 12695.122 ms
2023-10-16 16:23:52 UTC [16561]: [19-1] user=postgres,db=postgresLOG:
 statement: table categories;
2023-10-16 16:23:52 UTC [16561]: [20-1] user=postgres,db=postgresLOG:
 duration: 0.573 ms
2023-10-16 16:24:20 UTC [16561]: [21-1] user=postgres,db=postgresLOG:
 statement: select pg_switch_wal();
2023-10-16 16:24:20 UTC [16561]: [22-1] user=postgres,db=postgresLOG:
 duration: 61.904 ms

2023-10-16 16:26:25 UTC [16561]: [23-1] user=postgres,db=postgresLOG:
 statement: drop table categories;
2023-10-16 16:26:25 UTC [16561]: [24-1] user=postgres,db=postgresLOG:
 duration: 5.165 ms
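For what it's worth, in the log above the accidental DROP TABLE happens at 16:26:25 UTC, so a recovery target just before that would keep the categories data. A sketch of the relevant settings (the archive path is an assumption; on version 11 and earlier these lines go in recovery.conf rather than postgresql.conf):

```
restore_command = 'cp /path/to/wal_archive/%f %p'   # assumed archive location
recovery_target_time = '2023-10-16 16:26:00 UTC'    # just before the DROP at 16:26:25
recovery_target_action = 'promote'
```

Tables created after the base backup do appear after recovery, as long as replay proceeds past the WAL records that created them.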


Re: PITR partial timing

2023-10-17 Thread Ron

Don't write your own PITR.  Use PgBackRest, barman, etc.
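With pgBackRest, for example, a point-in-time restore to just before the accidental DROP might look like this (the stanza name and target timestamp are assumptions; check your own backup configuration):

```
pgbackrest --stanza=main --delta --type=time \
  "--target=2023-10-16 16:26:00+00" \
  --target-action=promote restore
```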

On 10/16/23 12:08, Rama Krishnan wrote:

Hi team,

I am testing PITR.

1. If I do a full recovery, whatever new tables were created do not appear.
How do I handle that?

2. Given my logs, which log time should I choose as the target for a
partial (point-in-time) recovery?





--
Born in Arizona, moved to Babylonia.

Logical Replication - Adding Not-Null Column Without Default Value

2023-10-17 Thread Avi Weinberg
Hi Experts,

I'm using logical replication with Postgres 15.

I added a not-null column without a default value to one of my published tables.
The way I do it on the publisher side is to add the column as nullable, then
update the values based on some logic, and finally make the column not-null.

On the subscriber side, if I add the column as nullable, then add a dummy value,
and finally make it not-null (all in a single transaction), is it safe to say
that the correct values of this column will be logically synchronized from
publisher to subscriber and will override the dummy values? Would, at the end,
the two tables on publisher and subscriber have exactly the same values for
this column?

Do you see a scenario in which the tables will not be the same? Is there a
better way to add a not-null column without a default value on the subscriber side?
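The subscriber-side sequence described above can be sketched as follows (table and column names are placeholders):

```sql
-- On the subscriber, in a single transaction:
BEGIN;
ALTER TABLE my_table ADD COLUMN new_col integer;          -- nullable first
UPDATE my_table SET new_col = 0;                          -- dummy value
ALTER TABLE my_table ALTER COLUMN new_col SET NOT NULL;
COMMIT;
-- Afterwards, add the column on the publisher; replicated UPDATEs carry
-- the full new row values, so they overwrite the dummy values.
```

One caveat worth checking: rows that never receive a replicated UPDATE or INSERT after this point would keep their dummy values, which bears directly on the question above.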

Thanks

IMPORTANT - This email and any attachments is intended for the above named 
addressee(s), and may contain information which is confidential or privileged. 
If you are not the intended recipient, please inform the sender immediately and 
delete this email: you should not copy or use this e-mail for any purpose nor 
disclose its contents to any person.


Parsing libpq PQtrace files

2023-10-17 Thread Dominique Devienne
Hi. I'm tracing my client-side connection(s) for the first time.

The format seems to be line oriented.
Selected text columns with embedded newlines have those newlines
encoded as \x0a to remain single-line in the trace. So far so good.

But then on Parse, the query text, which also contains embedded newlines
as well as embedded double-quotes (for column aliases, table names, or even
a single-char literal like '"', i.e. single-quote, double-quote,
single-quote), seems to be merely wrapped in double quotes. So how am I
supposed to know where the SQL ends?

The message length (320 in the sample trace fragment below) does not seem
to correspond to the length in the trace file itself (perhaps it is the
original length, before encoding in the trace?).

So how am I supposed to parse the trace given those multi-line-SQL Parse
entries, which can contain an arbitrary number of embedded double-quotes,
any of which could look like the end of the SQL?

I'm probably missing something... I assume the trace format is meant to be
easily machine-parseable, no?

I'd appreciate any insights. Thanks, --DD

PS: And on the off chance there's code out there that parses these traces,
  if someone knows a public source of similar code, I'd appreciate a link
or two.
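One workable heuristic against the current output is to treat any line that starts with a timestamp as the beginning of a new trace message and fold continuation lines (the multi-line SQL) into the previous entry. A minimal sketch; the timestamp format is taken from the sample above, which is an assumption about the format, not a documented guarantee:

```python
import re

# A trace message starts with "YYYY-MM-DD HH:MM:SS.ffffff" then F or B.
MSG_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6} [FB] ")

def group_messages(lines):
    """Fold continuation lines (e.g. multi-line SQL in Parse) into the
    preceding message, yielding one string per protocol message."""
    current = None
    for line in lines:
        if MSG_START.match(line):
            if current is not None:
                yield current
            current = line.rstrip("\n")
        elif current is not None:
            current += "\n" + line.rstrip("\n")
    if current is not None:
        yield current

trace = """\
2023-10-17 15:27:36.334881 F 320 Parse "" "
select 1
 from t
" 0
2023-10-17 15:27:36.334889 F 14 Bind "" "" 0 0 1 1
"""
msgs = list(group_messages(trace.splitlines()))
print(len(msgs))            # 2 messages
print(msgs[0].count("\n"))  # the Parse entry folded 3 continuation lines
```

This is still heuristic (a SQL line could in principle start with something matching the timestamp pattern), which is exactly the brittleness complained about below.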

```
2023-10-17 15:27:36.334749 B 5 ReadyForQuery I
2023-10-17 15:27:36.334881 F 320 Parse "" "

select ..., ... as "name-mismatch", ...
  from "Name needing quoting".ztable1 s
  full outer join " Name needing quoting".ztable2 m using(col1)
 where ...
 order by ...

" 0
2023-10-17 15:27:36.334889 F 14 Bind "" "" 0 0 1 1
2023-10-17 15:27:36.334895 F 6 Describe P ""
2023-10-17 15:27:36.334900 F 9 Execute "" 0
2023-10-17 15:27:36.334904 F 4 Sync
2023-10-17 15:27:36.338536 B 4 ParseComplete
2023-10-17 15:27:36.338579 B 4 BindComplete
2023-10-17 15:27:36.338601 B 146 RowDescription 5 "col2" 333038706 9 25
65535 -1 1 "col3" 0 0 25 65535 -1 1 "name-mismatch" 0 0 16 1 -1 1 "col4" 0
0 25 65535 -1 1 "col5" 333038718 2 19 64 -1 1
2023-10-17 15:27:36.338659 B 108 DataRow 5 -1 7 ...
2023-10-17 15:27:36.338690 B 108 DataRow 5 -1 7 ...
2023-10-17 14:46:40.576933 B 226 DataRow 5 ... 139 'line1\x0aline2' 1
'\x01' 1 '\x00'
...
2023-10-17 15:27:36.338958 B 14 CommandComplete "SELECT 10"
2023-10-17 15:27:36.338971 B 5 ReadyForQuery I
```


Re: Parsing libpq PQtrace files

2023-10-17 Thread Alvaro Herrera
On 2023-Oct-17, Dominique Devienne wrote:

> But then on Parse, the query text, which also contains embedded
> newlines but also embedded double-quotes, for column aliases, or table
> names, or whatever, even why not a single char literal like '"' (i.e.
> single-quote, double-quote, single-quote), seems to be double-quoted
> only. So how am I supposed to know when the SQL ends?

Yeah, I've never heard of anybody trying to machine-parse these lines,
so if those points are not addressed in the trace output, that probably
explains why.  It would probably be a good idea to change that.  This
stuff is relatively new, and has thus far most likely only been used by
visually inspecting the contents of the trace file (at least that's how
*I* have used them), and only with very simple dummy queries.

So, patches welcome, I guess.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"En las profundidades de nuestro inconsciente hay una obsesiva necesidad
de un universo lógico y coherente. Pero el universo real se halla siempre
un paso más allá de la lógica" (Irulan)




Re: Parsing libpq PQtrace files

2023-10-17 Thread Dominique Devienne
On Tue, Oct 17, 2023 at 6:12 PM Alvaro Herrera 
wrote:

> On 2023-Oct-17, Dominique Devienne wrote:
>
> > But then on Parse, the query text, which also contains embedded
> > newlines but also embedded double-quotes, for column aliases, or table
> > names, or whatever, even why not a single char literal like '"' (i.e.
> > single-quote, double-quote, single-quote), seems to be double-quoted
> > only. So how am I supposed to know when the SQL ends?
>
> Yeah, I've never heard of anybody trying to machine-parse these lines,
> so if those points are not addressed in the trace output, that probably
> explains why.  It would probably be a good idea to change that.  This
> stuff is relatively new, and has thus far most likely only been used by
> visually inspecting the contents of the trace file (at least that's how
> *I* have used them), and only with very simple dummy queries.
>

Bummer... In fact, I'm not a fan of the whole FILE*-based API. I'd have
preferred a callback-based API where I decide what to do with each
message, with info about which connection the tracing is for, so I can
safely multiplex concurrent connections (from the same process) into the
same file, for example; or log thread-related information too; or split
the high-level info from the large row-info of a GB-sized COPY, possibly
keeping only the high-level part.

The docs don't mention whether using the same file for several concurrent
connections is safe, when things are flushed, or whether I can insert my
own lines into that FILE* without interleaving them with the fwrite()s
from libpq.


> So, patches welcome, I guess.
>

I often dump trace files to SQLite for analysis, in full or in part.
That's my main reason for wanting to machine parse those files.

I can of course use some heuristics to work around the multi-line Parse
entries, but I'm not a fan of brittle code, and I'm surprised the tracing
output is not more consistent and always line-based.

I'm not a libpq or PostgreSQL dev, I'm afraid, so I'm not likely to provide
any patches; I'm merely a user of both. Thanks for the input, Alvaro. --DD


Index based search have issues with 11.20

2023-10-17 Thread Hafeez Rahim
Hi

This is to check regarding an issue we came across on PostgreSQL community
version 11 with minor patch 11.20, along with a few other changes; the
change details are listed at the end of this mail.

The issue was noticed after the minor patching from PostgreSQL 11.16 to 11.20.
The database in question runs on a dockerized platform.

Issue:

A query on one of the tables does not work when string values are filtered
with an = clause, whereas it works with a LIKE clause using the '%' wildcard,
or using trim(column). The queries below illustrate the problem.

not working

select * from atlas_order.gdp_ship_logistic_food cgslf where
shipping_group_id = '2VxFRCOG9tc7SjQ2Ogg0bveve96';

working

select * from atlas_order.gdp_ship_logistic_food cgslf where
shipping_group_id like '%2VxFRCOG9tc7SjQ2Ogg0bveve96%';
select * from atlas_order.gdp_ship_logistic_food cgslf where
shipping_group_id like '2VxFR%COG9tc7SjQ2Ogg0bveve96';
select * from atlas_order.gdp_ship_logistic_food cgslf where
trim(shipping_group_id) = '2VxFR%COG9tc7SjQ2Ogg0bveve96';

Later, as per feedback from dev, we dropped and recreated the constraints
and rebuilt the indexes of the affected tables, and the data has been
accessible since.

A similar pattern was noticed across other tables with primary/foreign key
relationships.

Could you please provide feedback on whether similar issues have been
reported against the 11.20 release, or what the reason for the error could be?

Changes executed

   1. PostgreSQL minor version upgrade from 11.16 to 11.20
   2. Change docker container base image from Debian to Alpine  (Debian
   Bookworm - alpine3.18)
   3. Adding PGSENTINEL extension to collect active sessions details
   4. Changing session limit for superuser from 3 to 20
   5. Kernel Patching from version 3.10.0-1160.76.1.el7.x86_64 to
   3.10.0-1160.99.1.el7.x86_64

Note : The OS got restarted as part of the maintenance


-- 



Regards,
Hafeez
Database Technical Lead

Mindcurv TSPL
#6F, T-1, World Trade Centre,
Infopark CSEZ, P-1,
Kakkanad, Cochin - 682042

t: +91 (0)484 2100164
m: +91 9605445282
www.mindcurv.com

CONFIDENTIALITY NOTICE: This e-mail and any attachments hereto are
intended only for use by the addressee(s) named herein and may contain
legally privileged and/or confidential information. If you are not the
intended recipient of this e-mail, you are hereby notified that any
dissemination, distribution or copying of this e-mail, and any attachments
thereto, is strictly prohibited. If you have received this e-mail in error,
please notify the sender by replying to this message and permanently delete
the original and any copy of this e-mail and any printout thereof.



RE: Postgresql HA cluster

2023-10-17 Thread Jason Grammenos
Hello Jehan-Guillaume de Rorthais

Perhaps you could help me understand a few things better about how 
primary-standby works in postgresql with streaming replication.

If you have two PostgreSQL nodes hooked up to a load balancer (haproxy) and you
take node1 out of load balancing, you now have connections on node1 and
connections on node2 as the load balancer drains the connections off node1 and
over to node2. How does PostgreSQL handle this scenario when there are writes
happening on both nodes?

If instead you have two PostgreSQL nodes behind Pacemaker (controlling a floating
IP), what happens when you initiate a failover and move the floating IP? You
want the connections to drain off node1 and move to node2. Again, in this
scenario both nodes would be sustaining writes at some point in time. How does
the read-only replica get switched into write mode during this failover? How
does the old primary get switched to read-only after the connection drain is
complete?

Overall, I am trying to understand what it looks like operationally to run a
2-node PostgreSQL "cluster" and how patching both nodes would work (mentally
comparing it to what I currently do with MySQL).

You recommended that primary-standby could be sufficient and is much simpler,
so I am simply trying to wrap my head around what exactly running it would look
like. If primary-standby is simple enough to fail over, patch, reboot,
maintain, etc., then you could be correct that master-master may not be needed.

Thank You
Jason

Jason Grammenos | Operations & Infrastructure Analyst  
Pronouns: he/him
P: 613.232.7797 x1131
Toll-free: 866.545.3745 x1131
jason.gramme...@agilitypr.com
agilitypr.com
Learn new PR tips from our free resources.

-Original Message-
From: Jehan-Guillaume de Rorthais  
Sent: Friday, October 13, 2023 9:10 AM
To: Jason Grammenos 
Cc: pgsql-general@lists.postgresql.org
Subject: Re: Postgresql HA cluster

On Fri, 13 Oct 2023 12:02:53 +
Jason Grammenos  wrote:

> Thank you for the feedback,
> 
> I have used pacemaker for other purposes previously so am a little 
> familiar with it.

So you might be familiar with shared-storage clusters, which are the simplest
ones you could deploy (provided you have reliable HA storage available). But
that's not a multi-primary cluster.

> It appears that in this scenario pacemaker is being used to manage a 
> floating ip as well as deal with split brain scenarios.

There are also two different resource agents dealing with PostgreSQL itself:
pgsql and PAF. Each handles the multi-state resource differently from the
administration point of view.

> What isn’t clear is how effective master-> master replication is being 
> accomplished.

There's no master-master in PostgreSQL core. There are a few external solutions
out there, though, but double-check your real needs, the capacity of plain
primary-standby to answer those needs, and the various constraints M-M implies
before rushing there.

> Postgresql streaming replication to the best of my limited knowledge 
> only replicates in one direction, from the active to the standby 
> servers. The issue this presents to me is that once you failover from 
> the active to the standby (or one of the standby’s depending on how 
> many you have) none of the data written on the standby is replicated back to 
> the formerly active server.

It depends on whether this is a "controlled failover" (aka a "switchover") or a
real failover triggered by some failure. If it is a controlled failover, you can
hook your old primary back up as a standby with no trouble. PAF even handles
this for you.

Moreover, even in a failure scenario, there are solutions around to quickly fix
your old primary's data and get it back into production quickly as a standby
(pg_rewind, PITR/pgbackrest, etc).

You just have to plan for failure and write your procedures accordingly, to get
the cluster back on its feet quickly after a failover.

> Let us say that I have only 2 postgresql servers (absolute minimum 
> number) and I want to patch server A. Ideally, I would use a load 
> balancer (or other failover mechanism like pacemaker) and repoint the 
> floating ip to server B.
> Now traffic would “drain” down off server A, and slowly (or rapidly) 
> move to B. During the move some clients would still be writing to A 
> and some clients would be writing to B.

This doesn't exist, as PostgreSQL has no multi-primary solution in core. You can
do a rolling upgrade, but you'll have to pause production during the switchover
between the primary and the standby.

> In the above scenario, I do not understand how streaming replication 
> would handle the part of the scenario when there are clients writing to A and 
> B.

It will not.

> It would seem that something like `pgpool-ii` or `pgEdge` would be 
> required, but with my limited knowledge it is unclear if or which would be 
> appropriate.

External multi-primary solutions exist: pgpool-II, Bucardo, BDR, etc. But
you'll have to ask about and evaluate these t

Re: Index based search have issues with 11.20

2023-10-17 Thread Tom Lane
Hafeez Rahim  writes:
> Issue :

> One of the table query not woks when string values are filtered with =
> clause
> where as the query works with like clause using '%' flag  or using
> trim(column)
> below are query examples describes the problem

You've not really provided enough detail, but I'm going to guess
that your OS update included a change in collation definitions
that left any indexes on text columns essentially corrupt.
The LIKE queries probably weren't affected because they didn't
rely on indexes.

Reindexing the indexes should have been enough to fix it.
See

https://wiki.postgresql.org/wiki/Locale_data_changes

regards, tom lane
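For reference, the remediation the wiki describes boils down to verifying and rebuilding the affected indexes. A sketch; amcheck is the contrib extension for checking B-tree indexes, and the database name is a placeholder:

```sql
-- Identify corrupt B-tree indexes (an error indicates corruption):
CREATE EXTENSION IF NOT EXISTS amcheck;
SELECT bt_index_check(c.oid)
FROM pg_class c
JOIN pg_am am ON am.oid = c.relam
WHERE am.amname = 'btree' AND c.relkind = 'i';

-- Rebuild everything after a collation-data change:
REINDEX DATABASE mydb;  -- run while connected to mydb
```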




Re: Index based search have issues with 11.20

2023-10-17 Thread Adrian Klaver

On 10/17/23 11:24, Hafeez Rahim wrote:


Hi

This is to check regarding an issue we came across on postgreSQL 
community version 11 with minor patch 11.20 along with few other changes

change details mention on end part of this mail

The issue is notice after the minor patching from postgreSQL 11.16 to 11.20
The particular database runs on a dockerized platform

Issue :

One of the table query not woks when string values are filtered with = 
clause
where as the query works with like clause using '%' flag  or using 
trim(column)

below are query examples describes the problem

not working

select * from atlas_order.gdp_ship_logistic_food cgslf where 
shipping_group_id = '2VxFRCOG9tc7SjQ2Ogg0bveve96';


working

select * from atlas_order.gdp_ship_logistic_food cgslf where 
shipping_group_id like '%2VxFRCOG9tc7SjQ2Ogg0bveve96%';
select * from atlas_order.gdp_ship_logistic_food cgslf where 
shipping_group_id like '2VxFR%COG9tc7SjQ2Ogg0bveve96';
select * from atlas_order.gdp_ship_logistic_food cgslf where 
trim(shipping_group_id) = '2VxFR%COG9tc7SjQ2Ogg0bveve96';


Later as per the feedback from dev we have dropped and recreated the 
constraints and rebuild the indexes of the affected tables

and the data is accessible after that


I would say the above is due to the distro change below.


 2. Change docker container base image from Debian to Alpine (Debian
Bookworm - alpine3.18)


My guess is something along the lines of:

https://wiki.postgresql.org/wiki/Locale_data_changes

Changing distros midstream increases the likelihood there will be issues.


--
Adrian Klaver
adrian.kla...@aklaver.com





Inheritance in PostgreSQL

2023-10-17 Thread Luis Bruno
Hello, I'm in the process of developing a basic database structure that
utilizes inheritance as part of a test for my work. The database consists
of two tables: ‘PERSON' and ‘CUSTOMER' with the ‘PERSON' table serving as
the parent table for ‘CUSTOMER' .

Initially, I defined the 'CREATE TABLE' statement as follows:

CREATE TABLE PERSON (
id SERIAL PRIMARY KEY,
name VARCHAR(255) NOT NULL,
dob DATE
);

CREATE TABLE CUSTOMER (
registration_date DATE NOT NULL,
contact VARCHAR(255)
) INHERITS (person);

INSERT INTO PERSON VALUES (1, 'Fulano', '1965-06-07');
INSERT INTO CUSTOMER VALUES (2, 'Beltrano', '1980-10-07', '2023-10-10',
'55619');

With these ‘INSERTS’, we have three records, as expected.

The problem occurs when we try to add ‘Fulano’ as a customer:

INSERT INTO CUSTOMER (id, name, dob, registration_date, contact)
SELECT id, name, dob, '2023-10-17', 'cont...@example.com'
FROM person
WHERE id = 1;

The 'CUSTOMER' table looks as expected, but an issue arises in the 'PERSON'
table: the primary key is duplicated when I attempt to add 'Fulano' as a
customer.

After that, I attempted a slightly different approach in creating the
‘CUSTOMER’ table, as I'll show below:

CREATE TABLE customer (
  "id" int4 NOT NULL PRIMARY KEY DEFAULT nextval('person_id_seq'::regclass),
  name VARCHAR(255) NOT NULL,
  dob DATE,
  registration_date DATE,
  contact varchar(255)
) INHERITS (person);

But when I run the same ‘INSERTS’ above, the same problem occurs in the
‘PERSON’ table.

I would like to know where I might be going wrong with these simple
queries. To reinforce, my main question is: how do I create a record for a
‘CUSTOMER’ who already exists in the ‘PERSON’ table?

What surprised me was seeing, in the 'PERSON' table, a duplicate record
with the same 'id', considering that 'id' is a primary key.
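The duplication follows from a documented limitation of inheritance: indexes, including primary key and unique constraints, apply only to the table they are defined on, not to its children, so the PK on 'PERSON' cannot see rows physically stored in 'CUSTOMER'. A minimal illustration (a sketch against the tables defined above):

```sql
-- The PK index on person does not cover rows stored in customer:
INSERT INTO person VALUES (1, 'Fulano', '1965-06-07');
INSERT INTO customer (id, name, dob, registration_date, contact)
VALUES (1, 'Fulano', '1965-06-07', '2023-10-17', '55619');  -- no error

-- Selecting from person scans its children too, so id 1 appears twice:
SELECT id, count(*) FROM person GROUP BY id HAVING count(*) > 1;
```

This is why a separate customer table with a foreign key to person is the usual design.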

I'm particularly interested in the advantages of the inheritance concept in
PostgreSQL, considering that it can be easily applied to my business rules.
I'd also like to know if inheritance is commonly used. Any insights and
recommendations would be appreciated. Thank you.

My environment:

Oracle Linux Server 8.8

Postgres 15.4

This test was also performed in this environment:

Windows 10 Pro

Postgres 16


Re: Inheritance in PostgreSQL

2023-10-17 Thread David G. Johnston
On Tue, Oct 17, 2023 at 2:49 PM Luis Bruno  wrote:

> I'm particularly interested in the advantages of the inheritance concept
> in PostgreSQL
>

There really are none nowadays and the entire feature should be avoided,
IMO.  No one else has tried to implement it in their engines for good
reasons.  Stick with relational models in a relational database.

Also of note:
https://wiki.postgresql.org/wiki/Don%27t_Do_This#Don.27t_use_varchar.28n.29_by_default

I'm tempted to add "Don't use inheritance" to that page...but fortunately
it doesn't come up that often.

David J.


Re: Inheritance in PostgreSQL

2023-10-17 Thread Ron

On 10/17/23 16:42, Luis Bruno wrote:


Hello, I'm in the process of developing a basic database structure that 
utilizes inheritance as part of a test for my work. The database consists 
of two tables: ‘PERSON' and ‘CUSTOMER' with the ‘PERSON' table serving as 
the parent table for ‘CUSTOMER' .



Initially, I defined the 'CREATE TABLE' statement as follows:

CREATE TABLE PERSON (
    id SERIAL PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    dob DATE
);

CREATE TABLE CUSTOMER (
    registration_date DATE NOT NULL,
    contact VARCHAR(255)
) INHERITS (person);

INSERT INTO PERSON VALUES (1, 'Fulano', '1965-06-07');
INSERT INTO CUSTOMER VALUES (2, 'Beltrano', '1980-10-07', '2023-10-10',
'55619');



With these ‘INSERTS’, we have three records, as expected:


The problem occurs when we try add the ‘Fulano’ as a customer:

INSERT INTO CUSTOMER (id, name, dob, registration_date, contact)
SELECT id, name, dob, '2023-10-17', 'cont...@example.com'
FROM person
WHERE id = 1;


The 'CUSTOMER' table look like this:


However, this issue arises in the 'PERSON' table:

The primary key is duplicated when I attempted to add 'Fulano' as a customer.


After that, I attempted a slightly different approach in creating the 
‘CUSTOMER’ table, as I'll show below:


CREATE TABLE customer (
  "id" int4 NOT NULL PRIMARY KEY DEFAULT nextval('person_id_seq'::regclass),
  name VARCHAR(255) NOT NULL,
  dob DATE,
  registration_date DATE,
  contact varchar(255)
) INHERITS (person);


But, when I run the same ‘INSERTS’ above, the same problem occurs with the 
‘PERSON’ table:



I would like to know where I might be going wrong with these simple 
queries, and reinforce that my main question is: how to create a record 
for a ‘CUSTOMER’ who already exists in the ‘PERSON’ table?



A question that arose was to see that in the 'PERSON' table, there was a 
duplication of the record with the same 'id', considering that 'id' is a 
primary key.



I'm particularly interested in the advantages of the inheritance concept 
in PostgreSQL, considering that it can be easily applied to my business 
rules. I'd also like to know if inheritance is commonly used. Any insights 
and recommendations would be appreciated. Thank you.




Data Normalization was "invented" to eliminate this problem (and many others).

CREATE TABLE person (
    id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
    name TEXT,
    dob DATE );

CREATE TABLE customer (
    id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
    person_id BIGINT REFERENCES person(id),
    registration_date DATE NOT NULL,
    contact_info TEXT);


foo=# INSERT INTO person (name, dob) VALUES ('Fulano', '1965-06-07')
foo-# RETURNING id;
 id
----
  1
(1 row)

INSERT 0 1
foo=#
foo=# INSERT INTO customer (person_id, registration_date, contact_info)
foo-# values (1, '2023-10-10', '867-5309');
INSERT 0 1
foo=#
foo=# INSERT INTO person (name, dob) VALUES ('Beltrano', '1980-10-07')
    RETURNING id;
 id
----
  2
(1 row)

INSERT 0 1
foo=#
foo=# INSERT INTO customer (person_id, registration_date, contact_info)
values (2, '2023-10-12', '555-1212');
INSERT 0 1
foo=#
foo=# SELECT p.*, c.*
foo-# FROM person p, customer c
foo-# WHERE p.id = c.person_id;
 id |   name   |    dob     | id | person_id | registration_date | contact_info
----+----------+------------+----+-----------+-------------------+--------------
  1 | Fulano   | 1965-06-07 |  1 |         1 | 2023-10-10        | 867-5309
  2 | Beltrano | 1980-10-07 |  2 |         2 | 2023-10-12        | 555-1212
(2 rows)

foo=# INSERT INTO customer (person_id, registration_date, contact_info)
values (8, '2023-09-11', '(212)555-1212');
ERROR:  insert or update on table "customer" violates foreign key constraint 
"customer_person_id_fkey"

DETAIL:  Key (person_id)=(8) is not present in table "person".

--
Born in Arizona, moved to Babylonia.

RE: Inheritance in PostgreSQL

2023-10-17 Thread Jeff Laing
“Don’t use table inheritance” IS on that page

From: David G. Johnston 

Also of note:
https://wiki.postgresql.org/wiki/Don%27t_Do_This#Don.27t_use_varchar.28n.29_by_default

I'm tempted to add "Don't use inheritance" to that page...but fortunately it 
doesn't come up that often.

David J.



Re: Inheritance in PostgreSQL

2023-10-17 Thread David G. Johnston
On Tue, Oct 17, 2023 at 3:33 PM Jeff Laing 
wrote:

> “Don’t use table inheritance” IS on that page
>
>
>

Indeed - oddly under "tool usage"...along with rules and, an actual tool,
psql -W

David J.


Need efficient way to remove (almost) duplicate rows from a table

2023-10-17 Thread David Gauthier
v11.5 on linux (I know, I know, IT is working on the upgrade to 15.2)

I have a table with millions of records, and there are a lot of "almost"
duplicates that I want to get rid of in an efficient way.  Best to
illustrate with a simple example...

We'll talk about deleting leaves on a tree, where each twig can have many
leaves but never two or more leaves of the exact same surface area.  What I
have now are a lot of twigs with leaves having the same surface area (and
some different ones too), and I want to get rid of the duplicates for each
twig.

create table twigs (limb_id int, branch_id int, twig_id int, surfarea
float);
insert into twigs (limb_id,branch_id,twig_id,surfarea) values
(1,1,1,100.0),
(1,1,2,103.7),  -- duplicate
(1,1,3,103.7),  -- duplicate
(1,1,4,110.4),

(1,2,1,120.0),
(1,2,2,123.6),
(1,2,3,128.1),  -- duplicate
(1,2,4,128.1),  -- duplicate
(1,2,5,128.1),  -- duplicate

(2,1,1,100.0),
(2,1,3,104.4),
(2,1,4,131.9);

You can see the duplicates (the marked rows).  I want to get rid of all but
one of the dups.  Which "twig_id" is left behind doesn't matter.

This would do it...
delete from twigs where limb_id=1 and branch_id=1 and twig_id=3;
delete from twigs where limb_id=1 and branch_id=2 and twig_id in (4,5);

But there are millions of these duplicates and it'll take forever like this.

I was going to approach this with a perl/DBI script, shoving the duplicate
record identifiers (limb_id, branch_id, twig_id) into perl arrays and then
submitting the delete command in a prepared statement that accepts the
arrays as values ... (a-la... my $cnt = $sth->execute_array({
ArrayTupleStatus => \my
@tuple_status},\@limb_id_arr,\@branch_id_arr,\@twig_id_arr) or die "-F-
Failed to execute '$sqf'\n";)   But I'm not sure that'll be faster.
Depends on how perl/DBI handles it I suppose.

Then I was thinking it might just be better to create a parallel table and
insert records in there by copying from the table with the dups, taking
care to avoid inserting duplicates.  Once done, delete the original table
and rename the copy to the original's name.  If that's a good approach,
then how exactly do I articulate the insert statement to avoid the
duplicates ?  This approach might also be good in that I can do it outside
a transaction.  If the target table gets goofed up, the original data is
still OK, no harm done, can try again.

Any other ideas ?

Again, millions of duplicates and I don't want to overload any PG system
stuff in the process.

Thanks in Advance !
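For reference, a common set-based approach is a self-join on the duplicate key, using the system column ctid as a tiebreaker within each group (a sketch; test it on a copy of the table first):

```sql
-- Keep one arbitrary row per (limb_id, branch_id, surfarea) group:
DELETE FROM twigs a
USING twigs b
WHERE a.limb_id   = b.limb_id
  AND a.branch_id = b.branch_id
  AND a.surfarea  = b.surfarea
  AND a.ctid < b.ctid;   -- every row but one per group matches

-- Or the copy-and-swap variant described above:
CREATE TABLE twigs_dedup AS
SELECT DISTINCT ON (limb_id, branch_id, surfarea) *
FROM twigs
ORDER BY limb_id, branch_id, surfarea, twig_id;
-- then: DROP TABLE twigs; ALTER TABLE twigs_dedup RENAME TO twigs;
```

Either way the work happens in a single set-based statement per step, rather than millions of single-row deletes.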


Re: Need efficient way to remove (almost) duplicate rows from a table

2023-10-17 Thread Ron

https://dba.stackexchange.com/a/138327/63913

On 10/17/23 17:48, David Gauthier wrote:

v11.5 on linux (I know, I know, IT is working on the upgrade to 15.2)

I have a table with millions of records and there are a lot of "almost" 
duplicates that I want to get rid of in an efficient way.  Best to 
illustrate with a simple example...


We'll talk about deleting leaves on a tree where each twig can have many 
leaves, but never 2 or more leaves of the exact same surface area.  What I 
have how are a lot of twigs with leaves having the same surface area (and 
some different one too) and I want to get rid of the duplicates for that twig.


create table twigs (limb_id int, branch_id int, twig_id int, surfarea float);


--
Born in Arizona, moved to Babylonia.

Re: Need efficient way to remove (almost) duplicate rows from a table

2023-10-17 Thread Paul Jungwirth

On 10/17/23 15:48, David Gauthier wrote:
I have a table with millions of records and there are a lot of "almost" 
duplicates that I want to get rid of in an efficient way.  Best to 
illustrate with a simple example...


This sounds like a classic use-case for a "correlated subquery". 
Assuming you want to keep the lowest twig_id, you could do this:


```
paul=# select * from twigs;
 limb_id | branch_id | twig_id | surfarea
---------+-----------+---------+----------
       1 |         1 |       1 |      100
       1 |         1 |       2 |    103.7
       1 |         1 |       3 |    103.7
       1 |         1 |       4 |    110.4
       1 |         2 |       1 |      120
       1 |         2 |       2 |    123.6
       1 |         2 |       3 |    128.1
       1 |         2 |       4 |    128.1
       1 |         2 |       5 |    128.1
       2 |         1 |       1 |      100
       2 |         1 |       3 |    104.4
       2 |         1 |       4 |    131.9
(12 rows)

Time: 0.218 ms
paul=# delete from twigs as t1
where exists (select 1 from twigs as t2
              where (t1.limb_id, t1.branch_id, t1.surfarea)
                  = (t2.limb_id, t2.branch_id, t2.surfarea)
                and t1.twig_id > t2.twig_id);
DELETE 3
Time: 8.555 ms
paul=# select * from twigs;
 limb_id | branch_id | twig_id | surfarea
---------+-----------+---------+----------
       1 |         1 |       1 |      100
       1 |         1 |       2 |    103.7
       1 |         1 |       4 |    110.4
       1 |         2 |       1 |      120
       1 |         2 |       2 |    123.6
       1 |         2 |       3 |    128.1
       2 |         1 |       1 |      100
       2 |         1 |       3 |    104.4
       2 |         1 |       4 |    131.9
(9 rows)

Time: 0.231 ms
```

That says "delete from t1 where there is a matching t2 with a lower 
twig_id."


If you want to keep the highest-numbered twig_id, the sql is easy to adjust.
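[Editor's sketch of that adjustment, flipping the comparison; untested:]

```sql
-- Same correlated subquery, but now the lower-numbered duplicates are
-- the ones deleted, so the highest twig_id per group survives.
DELETE FROM twigs AS t1
WHERE EXISTS (
    SELECT 1
    FROM twigs AS t2
    WHERE (t1.limb_id, t1.branch_id, t1.surfarea)
        = (t2.limb_id, t2.branch_id, t2.surfarea)
      AND t1.twig_id < t2.twig_id
);
```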

Regards,

--
Paul  ~{:-)
p...@illuminatedcomputing.com




Re: Inheritance in PostgreSQL

2023-10-17 Thread Merlin Moncure
On Tue, Oct 17, 2023 at 5:36 PM Jeff Laing 
wrote:

> “Don’t use table inheritance” IS on that page
>

Yeah, inheritance probably would have been removed a long time ago except
that it was underneath the partitioning feature.  The basic idea of
implementing polymorphic record storage essentially hacks SQL in a way it
doesn't want to be hacked, thus the feature never really worked properly.
If you want variant storage, use hstore, json, etc., or use the various
techniques that split the entity across multiple tables.

The only thing you can't really do in SQL easily without writing nasty
triggers is a constraint like "this table must be linked from one and only
one of these candidate tables".  I think the language probably ought to
support this, but I don't think Postgres will unless the standard does.  I
try to avoid hand-writing RI in triggers when possible, but in this case
it's the only approach that works within the language's constraints and can
formally validate the model.
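[Editor's sketch of one common shape of that multi-table technique, with
hypothetical table names.  Note it only enforces "at most one" subtype row
per entity declaratively; the "exactly one" half is precisely what still
needs triggers or deferred constraints, which is Merlin's point:]

```sql
-- A type discriminator plus composite FKs pins each subtype table to
-- entities of its own kind, so one entity can't appear in both subtables.
CREATE TABLE entity (
    id   int  PRIMARY KEY,
    kind text NOT NULL CHECK (kind IN ('person', 'company')),
    UNIQUE (id, kind)
);

CREATE TABLE person (
    id   int  PRIMARY KEY,
    kind text NOT NULL DEFAULT 'person' CHECK (kind = 'person'),
    FOREIGN KEY (id, kind) REFERENCES entity (id, kind)
);

CREATE TABLE company (
    id   int  PRIMARY KEY,
    kind text NOT NULL DEFAULT 'company' CHECK (kind = 'company'),
    FOREIGN KEY (id, kind) REFERENCES entity (id, kind)
);
```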

merlin

>


Fwd: Need efficient way to remove (almost) duplicate rows from a table

2023-10-17 Thread Michael Nolan
I'm not clear which rows are duplicates.

It looks like limb_id, branch_id and twig_id make up a unique key.

If there was a row (2, 2, 7,103.7) would it be a duplicate of rows
(1,1,2,103.7) and (1,1,3,103.7)?
--
Mike Nolan




Partition Lock Issue

2023-10-17 Thread Saravanan P K
I have a Postgres database shared by two applications, A and B.  A is used
often (it is data-centric); B is used only once in a while, to create
customers for application A (for that purpose and nothing else).

When a customer is created in application B, monthly partitions are created
for the next 3 years or so, mostly for the data tables used by application
A. Default partitions are created during the initial database deployment.

Assume that during the initial deployment of these applications a few
customers are created in B, and application A starts functioning as
expected.  Application A accumulates data based on the customers' usage,
and that data is huge.

Here is the problem.  A new customer has to be added in B (say, after 6
months or so), and as part of that process we create partitions for the
next 3 years.  At the same time, the data tables are being used by
application A for select/edit and other CRUD operations.  Because of the
shared and exclusive locks held by A, customer creation becomes slow; until
the locks taken by the various features that access data in A are released,
the customer cannot be created in B.  The default partitions often hold the
blocking lock.
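[Editor's sketch for context, with hypothetical table and column names;
assumes PostgreSQL 12+.  Creating the partition as a plain table and then
attaching it reduces the lock footprint, because ATTACH PARTITION takes
only SHARE UPDATE EXCLUSIVE on the parent, while CREATE TABLE ... PARTITION
OF takes ACCESS EXCLUSIVE:]

```sql
-- Build the new partition as a standalone table first.
CREATE TABLE measurements_2024_01 (LIKE measurements INCLUDING ALL);

-- Adding a CHECK constraint matching the bound beforehand lets ATTACH
-- skip scanning the new table to validate it.
ALTER TABLE measurements_2024_01
    ADD CONSTRAINT bound_2024_01
    CHECK (created_at >= '2024-01-01' AND created_at < '2024-02-01');

ALTER TABLE measurements
    ATTACH PARTITION measurements_2024_01
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
```

[Caveat: if a populated default partition exists, Postgres must still scan
it at ATTACH time to prove it holds no rows for the new range, which can
itself contend with application A's locks.]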

How do we get through this situation?  Is there a better way to approach
this partitioning behaviour?  Customers can be created at any time, and
partitions have to be created for them for the next few years.