Re: [EXTERNAL]: Re: UPSERT in Postgres

2023-04-09 Thread Alban Hertroys



> On 7 Apr 2023, at 2:49, Louis Tian  wrote:

(…)

> I am not expecting an error here. The problem is with no conflict it always 
> go down the insert path first and results in a not null constraint error.
> While I am expecting the insert is never executed in the first place when 
> that row already exist (as identified by it primary key). So the update 
> execute without error.
> I hope the pesudo code above is enough to clarify the difference? 

Your assumption on what the problem is, is not correct. The problem is not with 
the conflict resolution, it is with your statement violating a not null 
constraint.

It doesn’t matter whether you insert first or update first, either operation is 
going to violate that constraint. You’re specifying a NULL value for a column 
that doesn’t accept that because it has a NOT NULL constraint. That is your 
problem.

Alban Hertroys
--
There is always an exception to always.








Re: [EXTERNAL]: Re: UPSERT in Postgres

2023-04-09 Thread Francisco Olarte
(not the OP on idempotency)

On Sat, 8 Apr 2023 at 18:33, Louis Tian  wrote:
> > In general UPSERT (or any definition of it that I can think of) does
> > not imply idempotency.
> "Idempotence is the property of certain operations in mathematics and 
> computer science whereby they can be applied multiple times without changing 
> the result beyond the initial application." from Wikipedia.
> the concept of Idempotence when applies to HTTP is consistent with the above. 
> https://developer.mozilla.org/en-US/docs/Glossary/Idempotent. Or are you 
> going by a different defintion that I am not aware of?
> If you execute the same upsert multiple times, the state of the database 
> remains the same as if only execute once.
> If a row already exists, the first statement will update the row so does any 
> subsequent statements. executing the same update multiple time is the same as 
> executing it only once.
> If the row doesn't exist, the first statement will insert that row and any 
> subsequent will try to update, but the update has no real effect since it the 
> value is exactly the same as the insert.
> So by defintion, upsert is idempotent.

Only on a narrow definition of upsert.

You are thinking on a narrow ( but very frequent ) use of "upsert"
statements, something like:

insert on users(id,name) values (1,'x') on conflict(id) update set name='x'

But upsert can be used for things like:

   insert into last_access(id,cuando) values (1,current_timestamp) on
conflict(id) set cuando=current_timestamp

   insert into access_count(id, access_count) values (1,1) on
conflict(id) set access_count=access_count+1

Which are not idempotent ( and also frequent, I use both variants )

Francisco Olarte.




Replicate data from one standby server to another standby

2023-04-09 Thread Atul Kumar
Hi,

I have Configured "sync" streaming replication  to replicate the data from
one primary node to one slave node.

Now I want one to add one more node as slave that will replicate the data
from previously created slave replica (not from primary replica as we do
traditionally).

So please let me know whether do we have any such flexibility where we can
replicate the data from one slave replica to another slave replica ?



Regards.


Re: Replicate data from one standby server to another standby

2023-04-09 Thread Inzamam Shafiq
Yes, you can replicate data from one slave to another.

Sent from Outlook for Android

From: Atul Kumar 
Sent: Sunday, April 9, 2023 3:14:06 PM
To: pgsql-general 
Subject: Replicate data from one standby server to another standby

Hi,

I have Configured "sync" streaming replication  to replicate the data from one 
primary node to one slave node.

Now I want one to add one more node as slave that will replicate the data from 
previously created slave replica (not from primary replica as we do 
traditionally).

So please let me know whether do we have any such flexibility where we can 
replicate the data from one slave replica to another slave replica ?



Regards.


Postgresql Upgrade from 10 to 14

2023-04-09 Thread thayanban thay
  Hi Team,

I need some help or suggestions for the below issue. Please provide answers
what to do now. We re in critical position to handle the issue.

Issue: We upgraded postgresql from 10 to 14 version, post to the upgrade ,
queries are running very slower than normal expected time. As well as the
queries are not getting completed . We did vacuum analyze, index recreation
and analyze statements etc but no luck still the job queries are running
slower like 10hrs , 14hrs . So please help what we need to do to fix the
issue.

Actually we have spark scala application that will use JDBC driver to make
connection to the postgresql and then execute the queries. Before upgrade
the same application used to complete in very lesser time like 10mts but
now its taking 14hrs and not getting completed.

Not sure why upgrade making slower in performance. Previously the job used
to complete in lesser time.

Please advise and provide suggestion what inorder to do fix the issue.
Kindly help and waiting for good reply with exact solutions


Re: [EXTERNAL]: Re: [EXTERNAL]: Re: UPSERT in Postgres

2023-04-09 Thread Louis Tian



From: Israel Brewster 
Sent: Sunday, April 9, 2023 3:09 AM
To: Louis Tian 
Cc: Peter Geoghegan ; pgsql-general@lists.postgresql.org 

Subject: [EXTERNAL]: Re: [EXTERNAL]: Re: UPSERT in Postgres

Thanks Israel. Your example really helped me to understand where we differ.

> Not necessarily. Consider the following UPSERT statement:
> INSERT INTO customer_order_counts (customer_id,order_count) VALUES (123456, 
> 1) ON CONFLICT (customer_id) DO UPDATE SET 
> order_count=customer_order_counts.order_count+1;

Yes, you are 100% right that is not idempotent statement. But that's not 
because UPSERT is idempotent ​*operation*. I am making a distinction between an 
operation and statement here.
This statement is not idempotent is not because UPSERT operation is not 
idempotent, it is because the value being upsert is different in the statement 
each time you run that statement.

> Not really. Generally when I am doing an UPSERT, I am NOT using the primary 
> key, but rather some other UNIQUE-ly indexed column(s). My primary key is 
> typically an ID column that is defined as a serial, > and automatically 
> generated by the database. The unique column I use for the upset, however, is 
> generally something that would identify the row to a human - such as Station 
> ID and timestamp
> columns in a database I have of seismic readings. Each reading gets a unique 
> ID (the primary key) that identifies it to the database and is used in joins. 
> However, occasionally a reading gets updated, so I > update the row, based 
> not on the primary key (which I don’t know for incoming data), but on the 
> station and timestamp. UPSERT, but not based on the primary key.

I am using "primary key" to refer to the "row identifier" in a conceptual way.
Definition from Wiki, "In the relational 
model of 
databases, a primary key is a specific 
choice of a minimal set of attributes 
(columns) that uniquely 
specify a tuple (row) in a 
relation 
(table)."
So, it's not necessarily the primary key you declared on the table (for 
example, the id column in your example).
To put it in another way, the Station ID and timestamp columns work in the 
capability of a primary key.
I was merely setting a scene in my orginal email. What I was trying to say is 
"UPSERT only makes sense when each row is uniquely identifiable".
So please don't get me wrong there, I am not arguing that being able to do on 
conflict on any unique index is wrong or bad.

Cheers,
Louis Tian

> On Apr 6, 2023, at 4:49 PM, Louis Tian  wrote:
>
> Hi Peter,
>
> Thanks for your reply. Appreciate the help and discussion.
>
>>  In general UPSERT (or any definition of it that I can think of) does
>> not imply idempotency.
>
> "Idempotence is the property of certain operations in mathematics and 
> computer science whereby they can be applied multiple times without changing 
> the result beyond the initial application." from Wikipedia.
> the concept of Idempotence when applies to HTTP is consistent with the above. 
> https://developer.mozilla.org/en-US/docs/Glossary/Idempotent. Or are you 
> going by a different defintion that I am not aware of?
> If you execute the same upsert multiple times, the state of the database 
> remains the same as if only execute once.

Not necessarily. Consider the following UPSERT statement:

INSERT INTO customer_order_counts (customer_id,order_count) VALUES (123456, 1) 
ON CONFLICT (customer_id) DO UPDATE SET 
order_count=customer_order_counts.order_count+1;

That is completely valid (I tested it), and actually makes sense as something 
you might want to do - keep track of how many orders a customer has placed, for 
example if you only keep the order records for 6 months, but still want to know 
the total number of orders the customer has placed. If it is a new customer, 
you insert a record for the customer with an order count of 1. Otherwise, you 
update the record to increment the order count. Clearly this is NOT an 
idempotent operation - every time you run it, it changes the order count, so 
the state of the database does NOT remain the same as if you only execute it 
once.

> If a row already exists, the first statement will update the row so does any 
> subsequent statements. executing the same update multiple time is the same as 
> executing it only once.
> If the row doesn't exist, the first statement will insert that row and any 
> subsequent will try to update, but the update has no real effect since it the 
> value is exactly the same as the insert.
> So by defintion, upsert is idempotent.
>
>> It could just be a unique index or a unique constraint. So you can
>> upsert on any individual unique constraint/index, or the primary 

Re: [EXTERNAL]: Re: [EXTERNAL]: Re: UPSERT in Postgres

2023-04-09 Thread Louis Tian
Hi Adrian, 

> No it is not as Israel Brewster pointed out.
I think we need make a distinction between an operation and a statement here
The examples Israel presented and yours are both non-idempotent statements 
because the different value being upserted each time the statement is executed. 
Not because upsert as an operation is not idempotent. 
Likewise, In HTTP, PUT method is defined as idempotent. No one would expect 
different PUT requests with different payload to be idempotent. 

Also, I don't think I am not the only one who think upsert is logically 
idempotent. Just a few seconds of searching on google.
- "The upsert action is a combination of insert and update. It allows us to 
write idempotent statements". 
(https://jonmeyers.io/blog/use-on-conflict-to-upsert-in-postgresql. )
-"One common example of an idempotent operation is an upsert (update or insert) 
statement in a database." 
(https://www.arecadata.com/core-data-engineering-concepts-idempotency)
- "A POST request means that the request is not idempotent, but an "upsert" 
request is idempotent." 
(https://softwareengineering.stackexchange.com/questions/426225/rest-how-to-upsert-a-resource-without-an-specific-resource-url#:~:text=A%20POST%20request%20means%20that,%22upsert%22%20request%20is%20idempotent.)
While no as direct as the article above, but the issue reported in those places 
are essentially stem from the same understanding. 
- 
https://stackoverflow.com/questions/48816629/on-conflict-do-nothing-in-postgres-with-a-not-null-constraint
- 
https://www.postgresql.org/message-id/flat/1996438.1604952901%40sss.pgh.pa.us#ace8adc1354265aca5672028faea0c0f
So at least I am not only one imaging things?
- "a ready made statement for idempotent data like Mongo's upsert" 
(https://www.compose.com/articles/the-potency-of-idempotent-with-rabbitmq-and-mongodb-upsert/)

> And therein lies your problem, you are imagining something that does not
> exist and more to the point will most likely not exist as it would break all 
> code that depends on above behavior.
I never argue the way postgresql does for insert ... on conflict is wrong and 
need to change. 
I said "I think it is doing what it reads quite literally" just like in your 
words "It does what is advertised on the tin". So there is no disagreement 
here. So I am not saying "insert on conflict" need to change. 
What I am point out here is, the "insert...on conflict do update" is not a true 
"upsert". 
I am saying in my opinion it would be nice see UPSERT statement to be added to 
PostgreSQL so upsert can work "out of box" even when there're not null 
constraints on that table.

It might never happen judging from the replies I am getting but that's fine. 
The solution you have showed me is good enough as a workaround (to not 
depreciate it's value, just not as easy/bullet proof as it could be in my 
opinion). 
It was just a question just to confirm my understanding so I got what I need, 
so thank you all for that. 


Cheers,
Louis Tian


From: Adrian Klaver 
Sent: Sunday, April 9, 2023 7:51 AM
To: Louis Tian ; Peter Geoghegan 
Cc: pgsql-general@lists.postgresql.org 
Subject: [EXTERNAL]: Re: [EXTERNAL]: Re: UPSERT in Postgres 
 
CAUTION: This email originated from outside of Envirada. Do not click links or 
open attachments unless you recognize the sender and know the content is safe.


On 4/6/23 17:49, Louis Tian wrote:
> Hi Peter,
>
> Thanks for your reply. Appreciate the help and discussion.
>
>>   In general UPSERT (or any definition of it that I can think of) does
>> not imply idempotency.
>
> "Idempotence is the property of certain operations in mathematics and 
> computer science whereby they can be applied multiple times without changing 
> the result beyond the initial application." from Wikipedia.
> the concept of Idempotence when applies to HTTP is consistent with the above. 
> https://developer.mozilla.org/en-US/docs/Glossary/Idempotent. Or are you 
> going by a different defintion that I am not aware of?
> If you execute the same upsert multiple times, the state of the database 
> remains the same as if only execute once.
> If a row already exists, the first statement will update the row so does any 
> subsequent statements. executing the same update multiple time is the same as 
> executing it only once.
> If the row doesn't exist, the first statement will insert that row and any 
> subsequent will try to update, but the update has no real effect since it the 
> value is exactly the same as the insert.
> So by defintion, upsert is idempotent.

No it is not as Israel Brewster pointed out.

To his example I would add:

alter some_table add column ts_upsert_update timestamptz;

insert into some_table values('foo', 'bar') on conflict(tbl_id) do
update set foo_fld = excluded.foo_fld, bar_fld = some_table.bar_fld,
ts_upsert_update = now();

You are substituting whatever definition you have in your head for the
definition as it actually exists.

>
>> It could just be a unique index or a uni

Re: Postgresql Upgrade from 10 to 14

2023-04-09 Thread Adrian Klaver

On 4/8/23 18:03, thayanban thay wrote:

   Hi Team,

I need some help or suggestions for the below issue. Please provide 
answers what to do now. We re in critical position to handle the issue.


Issue: We upgraded postgresql from 10 to 14 version, post to the upgrade 
, queries are running very slower than normal expected time. As well as 
the queries are not getting completed . We did vacuum analyze, index 
recreation and analyze statements etc but no luck still the job queries 
are running slower like 10hrs , 14hrs . So please help what we need to 
do to fix the issue.-


Actually we have spark scala application that will use JDBC driver to 
make connection to the postgresql and then execute the queries. Before 
upgrade the same application used to complete in very lesser time like 
10mts but now its taking 14hrs and not getting completed.


Not sure why upgrade making slower in performance. Previously the job 
used to complete in lesser time.


Please advise and provide suggestion what inorder to do fix the issue. 
Kindly help and waiting for good reply with exact solutions


More information needed:

1) OS and version?

2) Where is server hosted, on your machines or a service?

3) The hardware specifications for the machine?

4) How was the upgrade done?

5) An example of a slow query with the output of EXPLAIN 

6) What version of the JDBC driver?


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Postgresql Upgrade from 10 to 14

2023-04-09 Thread Tom Lane
thayanban thay  writes:
> Issue: We upgraded postgresql from 10 to 14 version, post to the upgrade ,
> queries are running very slower than normal expected time. As well as the
> queries are not getting completed . We did vacuum analyze, index recreation
> and analyze statements etc but no luck still the job queries are running
> slower like 10hrs , 14hrs . So please help what we need to do to fix the
> issue.

You are unlikely to get any useful advice when you've provided no details.
You need to identify which query or queries has gotten slower, and post
the problem queries along with related table schemas and EXPLAIN output.
See

https://wiki.postgresql.org/wiki/Slow_Query_Questions

for some advice about how to ask answerable performance questions.

Just a shot in the dark: some people upgrading past v12 have found that
queries using WITH got slower because WITH clauses can now be inlined
whereas the previous behavior was always the equivalent of WITH ... AS
MATERIALIZED.  The new behavior is usually better but we have seen
cases where it loses.  Your issue could be something else entirely
though.

regards, tom lane




RE: [EXTERNAL]: Re: [EXTERNAL]: Re: UPSERT in Postgres

2023-04-09 Thread Louis Tian
Hi Francisco, 

I think we need to make a distinction between an "operation" and a "statement".
The concept of idempotency applies to an "operation" not an entire statement. 
Like how HTTP "PUT" method is defined as "idempotent", you don't say actual 
HTTP PUT request is idempotent. 
With the "current_datetime" and "access_count+1", you are effectively changing 
the value passing to the UPSERT operator. 
Just like how you changed the payload of a PUT, then obviously there is no 
reason to expect the state of the database to remain the same. 

Cheers,
Louis Tian


-Original Message-
From: Francisco Olarte  
Sent: Sunday, April 9, 2023 7:32 PM
To: Louis Tian 
Cc: Peter Geoghegan ; pgsql-general@lists.postgresql.org
Subject: [EXTERNAL]: Re: [EXTERNAL]: Re: UPSERT in Postgres

CAUTION: This email originated from outside of Envirada. Do not click links or 
open attachments unless you recognize the sender and know the content is safe.


(not the OP on idempotency)

On Sat, 8 Apr 2023 at 18:33, Louis Tian  wrote:
> > In general UPSERT (or any definition of it that I can think of) does 
> > not imply idempotency.
> "Idempotence is the property of certain operations in mathematics and 
> computer science whereby they can be applied multiple times without changing 
> the result beyond the initial application." from Wikipedia.
> the concept of Idempotence when applies to HTTP is consistent with the above. 
> https://developer.mozilla.org/en-US/docs/Glossary/Idempotent. Or are you 
> going by a different defintion that I am not aware of?
> If you execute the same upsert multiple times, the state of the database 
> remains the same as if only execute once.
> If a row already exists, the first statement will update the row so does any 
> subsequent statements. executing the same update multiple time is the same as 
> executing it only once.
> If the row doesn't exist, the first statement will insert that row and any 
> subsequent will try to update, but the update has no real effect since it the 
> value is exactly the same as the insert.
> So by defintion, upsert is idempotent.

Only on a narrow definition of upsert.

You are thinking on a narrow ( but very frequent ) use of "upsert"
statements, something like:

insert on users(id,name) values (1,'x') on conflict(id) update set name='x'

But upsert can be used for things like:

   insert into last_access(id,cuando) values (1,current_timestamp) on
conflict(id) set cuando=current_timestamp

   insert into access_count(id, access_count) values (1,1) on
conflict(id) set access_count=access_count+1

Which are not idempotent ( and also frequent, I use both variants )

Francisco Olarte.
--
Message  protected by MailGuard: e-mail anti-virus, anti-spam and content 
filtering.https://www.mailguard.com.au/mg
Click here to report this message as spam:
https://console.mailguard.com.au/ras/25ayvBFOMd/7rJjrYVDtTx03A1wSEIfeL/0.1



RE: [EXTERNAL]: Re: [EXTERNAL]: Re: UPSERT in Postgres

2023-04-09 Thread Louis Tian
Hi Alban, 

"I am not expecting an error here", by "here" I means when doing a TRUE UPSERT 
(an upsert current does not exist in Postgres). 
I am NOT referring to an "Insert on conflict do update" (which despite its 
intention and wide acceptance is not fully equivalent to a true upsert).
I understand the error I am getting now is due to not null constraint given how 
"insert on conflict" works. 

An UPSERT checks whether a row exists, if so, it does an update, if not it does 
an insert. This is the literal definition.
An UPSERT is NOT defined as try do an INSERT first, if violate uniqueness 
constraint, do update. This is what on conflict do update is doing. 
We cannot define UPSERT with what "ON CONFLICT DO UPDATE" is doing. That is a 
logical fallacy.
 UPSERT is a higher-level abstract concept. ON CONFLICT DO UPDATE is an 
implementation of UPSERT. not the other way around.

When doing a true UPSERT, if a row already exists, there is no need provide an 
(not null) column, since I am doing an update. 

With `UPSERT person (id, is_active)` VALUES (0, true). Is it necessary to 
provide the not null "name" column here logically? 
Not really, I already specified the row with the `id` column, then I specify 
the column I want to update `is_active`. 
* the id does exist; the update can be executed without any issue or assumptions
* the id does not exit; then I am expecting a violate not null constraint. 

On contrast, with `INSERT person (id, is_active) value (0, true) ON CONFLICT DO 
UPDATE set is_active=true`,
this statement will always fail regardless of whether the id exists or not. 
So the behavior of ON CONFLICT DO UPDATE differs from a UPSERT with the 
presence of a not null column. 
Hence why ON CONFLICT DO UPDATE is not complete equivalent to a TRUE upsert.

Cheers,
Louis Tian


-Original Message-
From: Alban Hertroys  
Sent: Sunday, April 9, 2023 7:26 PM
To: Louis Tian 
Cc: Peter Geoghegan ; pgsql-general@lists.postgresql.org
Subject: [EXTERNAL]: Re: [EXTERNAL]: Re: UPSERT in Postgres

CAUTION: This email originated from outside of Envirada. Do not click links or 
open attachments unless you recognize the sender and know the content is safe.


> On 7 Apr 2023, at 2:49, Louis Tian  wrote:

(…)

> I am not expecting an error here. The problem is with no conflict it always 
> go down the insert path first and results in a not null constraint error.
> While I am expecting the insert is never executed in the first place when 
> that row already exist (as identified by it primary key). So the update 
> execute without error.
> I hope the pesudo code above is enough to clarify the difference?

Your assumption on what the problem is, is not correct. The problem is not with 
the conflict resolution, it is with your statement violating a not null 
constraint.

It doesn’t matter whether you insert first or update first, either operation is 
going to violate that constraint. You’re specifying a NULL value for a column 
that doesn’t accept that because it has a NOT NULL constraint. That is your 
problem.

Alban Hertroys
--
There is always an exception to always.




--
Message  protected by MailGuard: e-mail anti-virus, anti-spam and content 
filtering.https://www.mailguard.com.au/mg
Click here to report this message as spam:
https://console.mailguard.com.au/ras/25ayougcIv/4FIia1zrtWT2nnuHlesEOS/1.8



Re: [EXTERNAL]: Re: [EXTERNAL]: Re: UPSERT in Postgres

2023-04-09 Thread Rob Sargent

On 4/9/23 19:55, Louis Tian wrote:

Hi Alban,

"I am not expecting an error here", by "here" I means when doing a TRUE UPSERT 
(an upsert current does not exist in Postgres).
I am NOT referring to an "Insert on conflict do update" (which despite its 
intention and wide acceptance is not fully equivalent to a true upsert).
I understand the error I am getting now is due to not null constraint given how 
"insert on conflict" works.

An UPSERT checks whether a row exists, if so, it does an update, if not it does 
an insert. This is the literal definition.


This the part that's always eluded me: How does the client, the 
UPSERTer, come to hold an id and not know whether or not it's already in 
the database.






Re: [EXTERNAL]: Re: [EXTERNAL]: Re: UPSERT in Postgres

2023-04-09 Thread Adrian Klaver

On 4/9/23 19:16, Rob Sargent wrote:

On 4/9/23 19:55, Louis Tian wrote:

Hi Alban,

"I am not expecting an error here", by "here" I means when doing a 
TRUE UPSERT (an upsert current does not exist in Postgres).
I am NOT referring to an "Insert on conflict do update" (which despite 
its intention and wide acceptance is not fully equivalent to a true 
upsert).
I understand the error I am getting now is due to not null constraint 
given how "insert on conflict" works.


An UPSERT checks whether a row exists, if so, it does an update, if 
not it does an insert. This is the literal definition.


This the part that's always eluded me: How does the client, the 
UPSERTer, come to hold an id and not know whether or not it's already in 
the database.


My use case is for bulk loading data into a table I know has data that 
will create a PK/Unique violation with the inserted data. It's a quick 
and dirty way to avoid queries that look for potential violations ahead 
of time, basically 'Ask forgiveness' vs 'Get permission'.


--
Adrian Klaver
adrian.kla...@aklaver.com