AI for query-planning?

2024-06-22 Thread Andreas Joseph Krogh


Hi, are there any plans for using some kind of AI for query-planning?

Can someone with more knowledge about this than I have please explain why it 
might, or might not, be a good idea, and what the challenges are?



Thanks.






--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com 
www.visena.com 
 

Re: Help. The database was created using collation version 2.17, but the operating system provides version 2.34.

2024-06-22 Thread Dmitry O Litvintsev
Thank you very much for help and pointers to useful information.

Just want to make sure I understand (sorry, I am slow on the uptake). Should I first 
REINDEX and then ALTER DATABASE xxx REFRESH COLLATION VERSION, or first ALTER and then 
REINDEX, or does the order of these actions matter at all?

Thank you,
Dmitry


From: Daniel Verite 
Sent: Thursday, June 20, 2024 7:02 AM
To: Dmitry O Litvintsev
Cc: pgsql-generallists.postgresql.org
Subject: Re: Help. The database was created using collation version 2.17, but 
the operating system provides version 2.34.


Dmitry O Litvintsev wrote:

> WARNING:  database "xxx" has a collation version mismatch
> DETAIL:  The database was created using collation version 2.17, but the
> operating system provides version 2.34.
> HINT:  Rebuild all objects in this database that use the default collation
> and run ALTER DATABASE xxx REFRESH COLLATION VERSION, or build PostgreSQL
> with the right library version.

This upgrade spans the major collation change in GNU libc 2.28,
so text indexes created under 2.17 are indeed very likely unsafe to use
on your new server.
See https://wiki.postgresql.org/wiki/Locale_data_changes

>  REINDEX database xxx
> ALTER DATABASE xxx REFRESH COLLATION VERSION
> But this defeats the whole idea of having short downtime because REINDEX
> will take forever.

Indexes that don't involve collatable types (text, varchar), and those
that use the C collation, don't need to be reindexed.
You may be able to significantly reduce the downtime by reindexing only
the ones that matter.

The wiki page gives the query to obtain the list of affected indexes:

SELECT DISTINCT indrelid::regclass::text, indexrelid::regclass::text,
collname, pg_get_indexdef(indexrelid)
FROM (SELECT indexrelid, indrelid, indcollation[i] coll FROM pg_index,
generate_subscripts(indcollation, 1) g(i)) s
  JOIN pg_collation c ON coll=c.oid
WHERE collprovider IN ('d', 'c') AND collname NOT IN ('C', 'POSIX');
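
Once that query returns the affected indexes, the targeted rebuild might look
roughly like this (a sketch, assuming PostgreSQL 12 or later so that
REINDEX ... CONCURRENTLY is available; the index name is just an example):

REINDEX INDEX CONCURRENTLY public.some_text_index;   -- repeat for each index listed
ALTER DATABASE xxx REFRESH COLLATION VERSION;        -- once the rebuilds are done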

> I do not recall having similar issue when going from RH6 to RH7.

This warning was added relatively recently, in PostgreSQL 15 (October 2022).


Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite




Re: AI for query-planning?

2024-06-22 Thread Adrian Klaver

On 6/22/24 04:50, Andreas Joseph Krogh wrote:

Hi, are there any plans for using some kind of AI for query-planning?

Can someone with more knowledge about this than I have please explain 
why it might, or not, be a good idea, and what the challenges are?


1) Requires a large amount of resources.

2) Produces a high rate of incorrect answers.




Thanks.

--
*Andreas Joseph Krogh*
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com 
www.visena.com 



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: AI for query-planning?

2024-06-22 Thread Vijaykumar Jain
On Sat, Jun 22, 2024, 5:20 PM Andreas Joseph Krogh 
wrote:

> Hi, are there any plans for using some kind of AI for query-planning?
>
> Can someone with more knowledge about this than I have please explain why
> it might, or not, be a good idea, and what the challenges are?
>

https://github.com/ossc-db/pg_plan_advsr

https://github.com/s-hironobu/pg_plan_inspector

Not totally AI, but they can use data from your database to build some
metadata.

There are also online plan analyzers to which people paste their plans to get a
quick summary of problem areas; that data could be useful too for AI-based
plan optimisers.

But that said, PostgreSQL has a lot of knobs to tune, so ...


Re: AI for query-planning?

2024-06-22 Thread Christoph Moench-Tegeder
## Andreas Joseph Krogh (andr...@visena.com):

> Hi, are there any plans for using some kind of AI for query-planning?

Actually, we do have our GEQO - Genetic Query Optimization - already
in the planner: https://www.postgresql.org/docs/current/geqo.html
As per the common taxonomies, genetic algorithms are part of the
larger "Artificial Intelligence" domain - they're just not part of
the current hype cycle (some of us still remember the early iterations
of neural networks, or fuzzy logic, or...).
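
(For anyone curious, the genetic optimizer is controlled by ordinary planner
settings; a quick sketch of how to inspect them:)

SHOW geqo;             -- enabled by default
SHOW geqo_threshold;   -- number of FROM items at which GEQO takes over (default 12)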

> Can someone with more knowledge about this than I have please explain why it 
> might, or not, be a good idea, and what the challenges are?

Just waving your hand and saying "AI" three times in a row does not
manifest a usable algorithm. Going the full way from a problem to be
understood and solved, to an algorithm, to an implementation requires more
effort - and especially with all the "hot off the hype press" ideas,
that already fails in the very first stages. Remember just recently,
when some people were "predicting" that all databases would become
blockchains?

Regards,
Christoph

-- 
Spare Space




Re: AI for query-planning?

2024-06-22 Thread Pyrote
The current forms of "AI" have no concept of state or long-term memory. On each
invocation of the AI you have to tell it: this is a Postgres database, this is
my database schema, these are the indexes I have. After providing that
information the "AI" "might" generate a valid query for your particular
database, but it won't be optimal. The AI doesn't know how many rows are in
each table, the physical media each table is on, or any other attributes of
your database that would be used to calculate the cost of using an index versus
a table scan.

So then you could make the jump that an "AI" should be run locally and trained
exclusively on your database. Now you are using a general-purpose "AI"
algorithm for a very specific task, which would not be optimal. It would
require constant retraining, which would be computationally expensive. Then
let's say you want to write an "AI" algorithm just for Postgres. Now you have
basically rewritten the current Postgres optimizer in a roundabout way.

On Sat, Jun 22, 2024 at 09:40 Adrian Klaver  wrote:

On 6/22/24 04:50, Andreas Joseph Krogh wrote:
> Hi, are there any plans for using some kind of AI for query-planning?
> 
> Can someone with more knowledge about this than I have please explain 
> why it might, or not, be a good idea, and what the challenges are?

1) Require large amount of resources.

2) Produce high rate of incorrect answers.


> 
> Thanks.
> 
> --
> *Andreas Joseph Krogh*
> CTO / Partner - Visena AS
> Mobile: +47 909 56 963
> andr...@visena.com 
> www.visena.com 
> 

-- 
Adrian Klaver
adrian.kla...@aklaver.com





pg_dump restores as expected on some machines and reports duplicate keys on others

2024-06-22 Thread Shaheed Haque
Hi,

I am using Postgres 14 on AWS RDS and am seeing the output of pg_dump be
restored as expected by pg_restore on some database instances, and fail
with reports of duplicate keys on other database instances:

   - My deployments are always a pair, one "logic VM" for Django etc and
   one "RDS instance". The psql client runs on the logic VM. The Postgres
   version is the same in all cases; psql reports:


   - psql (14.12 (Ubuntu 14.12-0ubuntu0.22.04.1), server 14.9)


   - The pg_restore is done using the same script in both cases.
   - In the failing cases, there are always the same 26 errors (listed in
   detail below), but in summary, 3 distinct "child" tables complain of a
   duplicate id=1, id=2 and id=3 respectively.
   - These "child" tables are FK-related via some intermediate table to a
   top level table. They form a polymorphic set. There are other similar child
   tables which do not appear to be affected:
  - polymorphicmodel
 - companybankdetail
- companybankdetailde
- companybankdetailgb  <<< 1 duplicate, id=2
- companybankdetailus
- companypostaldetail
- companypostaldetailde
- companypostaldetailgb  <<< 1 duplicate, id=1
- companypostaldetailus
 - companytaxdetail
- companytaxdetailde
- companytaxdetailgb  <<< 1 duplicate, id=3
- companytaxdetailus
 - ...
 - several other hierarchies, all error free
 - ...
  - I've looked at the dumped .dat files but they contain no
   duplicates.
   - The one difference I can think of between deployment pairs which work
   ok and those which fail is that the logic VM (i.e. where the psql client
   script runs) uses a standard AWS Ubuntu image in the OK case, versus a
   custom AWS image in the failing case.
  - The custom image is a saved snapshot of one created using the
  standard image.

Why should the use of one type of VM image versus another cause pg_restore
to hallucinate the duplicate records?

Encls: 26 errors as mentioned...


pg_restore: while PROCESSING TOC:
pg_restore: from TOC entry 5635; 1262 53894 DATABASE foo postgres
pg_restore: error: could not execute query: ERROR:  database "foo" already
exists
Command was: CREATE DATABASE foo WITH TEMPLATE = template0 ENCODING =
'UTF8' LOCALE = 'en_US.UTF-8';


pg_restore: from TOC entry 4808; 2606 80439 CONSTRAINT
paiyroll_companybankdetail paiyroll_companybankdetail_pkey dbcoreuser
pg_restore: error: could not execute query: ERROR:  could not create unique
index "paiyroll_companybankdetail_pkey"
DETAIL:  Key (polymorphicmodel_ptr_id)=(2) is duplicated.
Command was: ALTER TABLE ONLY public.paiyroll_companybankdetail
   ADD CONSTRAINT paiyroll_companybankdetail_pkey PRIMARY KEY
(polymorphicmodel_ptr_id);


pg_restore: from TOC entry 4812; 2606 80443 CONSTRAINT
paiyroll_companybankdetailgb paiyroll_companybankdetailgb_pkey dbcoreuser
pg_restore: error: could not execute query: ERROR:  could not create unique
index "paiyroll_companybankdetailgb_pkey"
DETAIL:  Key (companybankdetail_ptr_id)=(2) is duplicated.
Command was: ALTER TABLE ONLY public.paiyroll_companybankdetailgb
   ADD CONSTRAINT paiyroll_companybankdetailgb_pkey PRIMARY KEY
(companybankdetail_ptr_id);


pg_restore: from TOC entry 4817; 2606 80447 CONSTRAINT
paiyroll_companypostaldetail paiyroll_companypostaldetail_pkey dbcoreuser
pg_restore: error: could not execute query: ERROR:  could not create unique
index "paiyroll_companypostaldetail_pkey"
DETAIL:  Key (polymorphicmodel_ptr_id)=(1) is duplicated.
Command was: ALTER TABLE ONLY public.paiyroll_companypostaldetail
   ADD CONSTRAINT paiyroll_companypostaldetail_pkey PRIMARY KEY
(polymorphicmodel_ptr_id);


pg_restore: from TOC entry 4821; 2606 80451 CONSTRAINT
paiyroll_companypostaldetailgb paiyroll_companypostaldetailgb_pkey
dbcoreuser
pg_restore: error: could not execute query: ERROR:  could not create unique
index "paiyroll_companypostaldetailgb_pkey"
DETAIL:  Key (companypostaldetail_ptr_id)=(1) is duplicated.
Command was: ALTER TABLE ONLY public.paiyroll_companypostaldetailgb
   ADD CONSTRAINT paiyroll_companypostaldetailgb_pkey PRIMARY KEY
(companypostaldetail_ptr_id);


pg_restore: from TOC entry 4826; 2606 80455 CONSTRAINT
paiyroll_companytaxdetail paiyroll_companytaxdetail_pkey dbcoreuser
pg_restore: error: could not execute query: ERROR:  could not create unique
index "paiyroll_companytaxdetail_pkey"
DETAIL:  Key (polymorphicmodel_ptr_id)=(3) is duplicated.
Command was: ALTER TABLE ONLY public.paiyroll_companytaxdetail
   ADD CONSTRAINT paiyroll_companytaxdetail_pkey PRIMARY KEY
(polymorphicmodel_ptr_id);


pg_restore: from TOC entry 4830; 2606 80459 CONSTRAINT
paiyroll_companytaxdetailgb paiyroll_companytaxdetailgb_pkey dbcoreuser
pg_restore: error: could not execute query: ERROR:  could not create unique
index "paiyroll_companytaxdetailgb_pkey"
DETAIL:  Key (companytaxdetail_pt

Re: pg_dump restores as expected on some machines and reports duplicate keys on others

2024-06-22 Thread Tom Lane
Shaheed Haque  writes:
>- The one difference I can think of between deployment pairs which work
>ok, and those which fail is that the logic VM (i.e. where the psql client
>script runs) is the use of a standard AWS ubuntu image for the OK case,
>versus a custom AWS image for the failing case.

Please go to AWS for support for custom AWS stuff.

regards, tom lane




RE: Autovacuum, dead tuples and bloat

2024-06-22 Thread Shenavai, Manuel
Thanks for the suggestion. This is what I found:

- pg_locks  shows only one entry for my DB (I filtered by db oid). The entry is 
related to the relation "pg_locks" (AccessShareLock).
- pg_stat_activity shows ~30 connections (since the DB is in use, this is 
expected)

Is there anything specific I should further look into in these tables?

Regarding my last post: did we see a problem in the logs I provided in my 
previous post? We have seen that there are 819294 n_live_tup in the 
toast table. Do we know how much space these tuples use? Do we know how much 
space one tuple uses?

Best regards,
Manuel

-Original Message-
From: Adrian Klaver 
Sent: 21 June 2024 22:39
To: Shenavai, Manuel ; Achilleas Mantzios 
; pgsql-general@lists.postgresql.org
Subject: Re: Autovacuum, dead tuples and bloat

On 6/21/24 12:31, Shenavai, Manuel wrote:
> Hi,
>
> Thanks for the suggestions. I found the following details to our
> autovacuum (see below). The related toast-table of my table shows some
> logs related the vacuum. This toast seems to consume all the data
> (27544451 pages * 8kb ≈ 210GB )

Those tuples(pages) are still live per the pg_stat entry in your second
post:

"n_dead_tup": 12,
"n_live_tup": 819294

So they are needed.

Now the question is why are they needed?

1) All transactions that touch that table are done and that is the data
that is left.

2) There are open transactions that still need to 'see' that data and
autovacuum cannot remove them yet. Take a look at:

pg_stat_activity:

https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-PG-STAT-ACTIVITY-VIEW

and

pg_locks

https://www.postgresql.org/docs/current/view-pg-locks.html

to see if there is a process holding that data open.

>
> Any thoughts on this?
>
> Best regards,
> Manuel
>


--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Autovacuum, dead tuples and bloat

2024-06-22 Thread Adrian Klaver

On 6/22/24 13:13, Shenavai, Manuel wrote:

Thanks for the suggestion. This is what I found:

- pg_locks  shows only one entry for my DB (I filtered by db oid). The entry is related 
to the relation "pg_locks" (AccessShareLock).


Which would be the SELECT you did on pg_locks.


- pg_stat_activity shows ~30 connections (since the DB is in use, this is 
expected)


The question then is, are any of those 30 connections holding a 
transaction open that needs to see the data in the affected table and is 
keeping autovacuum from recycling the tuples?


You might need to look at the Postgres logs to determine the above. 
Logging connections/disconnections helps, as does logging at least 'mod' statements.


See:
https://www.postgresql.org/docs/current/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHAT

for more information.
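
A minimal sketch of the kind of pg_stat_activity query that can surface such
sessions (backend_xmin, when set, is the xmin horizon that backend imposes,
i.e. the data it still forces vacuum to keep):

SELECT pid, state, xact_start, backend_xmin, left(query, 60) AS query
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY xact_start;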



Is there anything specific I should further look into in these tables?

Regarding my last post: Did we see a problem in the logs I provided in my 
previous post? We have seen that there are 819294 n_live_tup in the 
toast-table. Do we know how much space these tuple use?  Do we know how much 
space one tuple use?


You will want to read:

https://www.postgresql.org/docs/current/storage-toast.html

Also:

https://www.postgresql.org/docs/current/functions-admin.html

9.27.7. Database Object Management Functions

There are functions there that show table sizes among other things.
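
For example, a sketch of those size functions applied to one table (the table
name is just a placeholder):

SELECT pg_size_pretty(pg_relation_size('my_table'))       AS heap_only,
       pg_size_pretty(pg_table_size('my_table'))          AS heap_plus_toast,
       pg_size_pretty(pg_total_relation_size('my_table')) AS with_indexes_too;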



Best regards,
Manuel

-Original Message-
From: Adrian Klaver 
Sent: 21 June 2024 22:39
To: Shenavai, Manuel ; Achilleas Mantzios 
; pgsql-general@lists.postgresql.org
Subject: Re: Autovacuum, dead tuples and bloat

On 6/21/24 12:31, Shenavai, Manuel wrote:

Hi,

Thanks for the suggestions. I found the following details to our
autovacuum (see below). The related toast-table of my table shows some
logs related the vacuum. This toast seems to consume all the data
(27544451 pages * 8kb ≈ 210GB )


Those tuples(pages) are still live per the pg_stat entry in your second
post:

"n_dead_tup": 12,
"n_live_tup": 819294

So they are needed.

Now the question is why are they needed?

1) All transactions that touch that table are done and that is the data
that is left.

2) There are open transactions that still need to 'see' that data and
autovacuum cannot remove them yet. Take a look at:

pg_stat_activity:

https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-PG-STAT-ACTIVITY-VIEW

and

pg_locks

https://www.postgresql.org/docs/current/view-pg-locks.html

to see if there is a process holding that data open.



Any thoughts on this?

Best regards,
Manuel




--
Adrian Klaver
adrian.kla...@aklaver.com



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: pg_dump restores as expected on some machines and reports duplicate keys on others

2024-06-22 Thread Ron Johnson
On Sat, Jun 22, 2024 at 1:02 PM Shaheed Haque 
wrote:

> Hi,
>
> I am using Postgres 14 on AWS RDS and am seeing the output of pg_dump be
> restored as expected by pg_restore on some database instances, and fail
> with reports of duplicate keys on other database instances:
>
>- My deployments are always a pair, one "logic VM" for Django etc and
>one "RDS instance". The psql client runs on the logic VM. The Postgres
>version is the same in all cases; psql reports:
>
>
>- psql (14.12 (Ubuntu 14.12-0ubuntu0.22.04.1), server 14.9)
>
>
>- The pg_restore is done using the same script in both cases.
>- In the failing cases, there are always the same 26 errors (listed in
>detail below), but in summary, 3 distinct "child" tables complain of a
>duplicate id=1, id=2 and id=3 respectively.
>- These "child" tables are FK-related via some intermediate table to a
>top level table. They form a polymorphic set. There are other similar child
>tables which do not appear to be affected:
>   - polymorphicmodel
>  - companybankdetail
> - companybankdetailde
> - companybankdetailgb  <<< 1 duplicate, id=2
> - companybankdetailus
> - companypostaldetail
> - companypostaldetailde
> - companypostaldetailgb  <<< 1 duplicate, id=1
> - companypostaldetailus
>  - companytaxdetail
> - companytaxdetailde
> - companytaxdetailgb  <<< 1 duplicate, id=3
> - companytaxdetailus
>  - ...
>  - several other hierarchies, all error free
>  - ...
>   - I've looked at the dumped .dat files but they contain no
>duplicates.
>- The one difference I can think of between deployment pairs which
>work ok, and those which fail is that the logic VM (i.e. where the psql
>client script runs) is the use of a standard AWS ubuntu image for the OK
>case, versus a custom AWS image for the failing case.
>   - The custom image is a saved snapshot of one created using the
>   standard image.
>
> Why should the use of one type of VM image versus another cause pg_restore
> to hallucinate the duplicate records?
>
> Encls: 26 errors as mentioned...
>
> 
> pg_restore: while PROCESSING TOC:
> pg_restore: from TOC entry 5635; 1262 53894 DATABASE foo postgres
> pg_restore: error: could not execute query: ERROR:  database "foo" already
> exists
> Command was: CREATE DATABASE foo WITH TEMPLATE = template0 ENCODING =
> 'UTF8' LOCALE = 'en_US.UTF-8';
>

Check *all* of the client and server encodings.

99.99% of the time, that's the problem when the same dump file fails to
restore on different servers.
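
A quick way to compare them, as a sketch (run from psql against both the source
and the target clusters):

SHOW server_encoding;
SHOW client_encoding;
SELECT datname, pg_encoding_to_char(encoding) AS encoding, datcollate, datctype
FROM pg_database;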


Re: pg_dump restores as expected on some machines and reports duplicate keys on others

2024-06-22 Thread Adrian Klaver

On 6/22/24 10:01, Shaheed Haque wrote:

Hi,

I am using Postgres 14 on AWS RDS and am seeing the output of pg_dump be 
restored as expected by pg_restore on some database instances, and fail 
with reports of duplicate keys on other database instances:


  * My deployments are always a pair, one "logic VM" for Django etc and
one "RDS instance". The psql client runs on the logic VM. The
Postgres version is the same in all cases; psql reports:

  o psql (14.12 (Ubuntu 14.12-0ubuntu0.22.04.1), server 14.9)

  * The pg_restore is done using the same script in both cases.
  * In the failing cases, there are always the same 26 errors (listed in
detail below), but in summary, 3 distinct "child" tables complain of
a duplicate id=1, id=2 and id=3 respectively.
  * These "child" tables are FK-related via some intermediate table to a
top level table. They form a polymorphic set. There are other
similar child tables which do not appear to be affected:
  o polymorphicmodel
  + companybankdetail
  # companybankdetailde
  # companybankdetailgb  <<< 1 duplicate, id=2
  # companybankdetailus
  + companypostaldetail
  # companypostaldetailde
  # companypostaldetailgb  <<< 1 duplicate, id=1
  # companypostaldetailus
  + companytaxdetail
  # companytaxdetailde
  # companytaxdetailgb  <<< 1 duplicate, id=3
  # companytaxdetailus
  + ...
  + several other hierarchies, all error free
  + ...
  * I've looked at the dumped .dat files but they contain no duplicates.
  * The one difference I can think of between deployment pairs which
work ok, and those which fail is that the logic VM (i.e. where the
psql client script runs) is the use of a standard AWS ubuntu image
for the OK case, versus a custom AWS image for the failing case.
  o The custom image is a saved snapshot of one created using the
standard image.

Why should the use of one type of VM image versus another cause 
pg_restore to hallucinate the duplicate records?




1) Show the complete pg_restore script.

2) The first issue is related to trying to create a database that 
already exists. Does that database have data in it?


--
Adrian Klaver
adrian.kla...@aklaver.com





Password complexity/history - credcheck?

2024-06-22 Thread Martin Goodson

Hello.

Recently our security team have wanted to apply password complexity 
checks akin to Oracle's profile mechanism to PostgreSQL, checking that a 
password hasn't been used in x months etc, has minimum length, x special 
characters and x numeric characters, mixed case etc.


As far as I'm aware there's nothing part of the standard 'community 
edition' which gives us that, apart from passwordcheck - which doesn't 
give you a password history.


Can anyone recommend a good mechanism to accommodate this? Ideally we're 
looking for something well-established, reliable, and easily 
configurable. Does anything spring to mind?


A colleague has been looking around, and stumbled across 
https://github.com/MigOpsRepos/credcheck. Does anyone have any positive 
(or negative) experience with this? I'm happy to download and apply to a 
test database, obviously, but some indication of whether or not it's 
worth looking at first would be greatly appreciated. Is this something 
that the community would recommend?


Many thanks!

--
Martin Goodson.

"Have you thought up some clever plan, Doctor?"
"Yes, Jamie, I believe I have."
"What're you going to do?"
"Bung a rock at it."







Re: Password complexity/history - credcheck?

2024-06-22 Thread Tom Lane
Martin Goodson  writes:
> Recently our security team have wanted to apply password complexity 
> checks akin to Oracle's profile mechanism to PostgreSQL, checking that a 
> password hasn't been used in x months etc, has minimum length, x special 
> characters and x numeric characters, mixed case etc.

Don't suppose it would help to push back on whether your security
team knows what they're doing.

The really key reason why server-side password checks are not as
bright an idea as they sound is that they cannot be implemented
without forcing the client to transmit the password in cleartext.
It's widely considered best practice if the server *never* sees
the user's cleartext password, because then it can't leak, either
from sniffing the connection or scraping the postmaster log.

I believe that practices such as forcing a password change every
X amount of time are not viewed as favorably as they once were,
either.  (The argument is that that discourages users from putting
any serious thought into choosing an uncrackable password, and
might well encourage them to write down their current and last few
passwords somewhere.)

Anyway, considerations like these are why there's not features
of this sort in community PG.  You can use an extension that
applies some checks, but there's no good way around the "needs
cleartext password" problem for that.
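
(A minimal illustration of the client-side path, assuming psql and SCRAM:)

-- a sketch: with SCRAM, psql's \password builds the hashed verifier on the
-- client, so the server stores and sees only the verifier, never the cleartext
SET password_encryption = 'scram-sha-256';
-- then, in psql:  \password some_role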

regards, tom lane




Re: Password complexity/history - credcheck?

2024-06-22 Thread Ron Johnson
On Sat, Jun 22, 2024 at 7:28 PM Martin Goodson 
wrote:

> Hello.
>
> Recently our security team have wanted to apply password complexity
> checks akin to Oracle's profile mechanism to PostgreSQL, checking that a
> password hasn't been used in x months


There would have to be a pg_catalog table which stores login history.


> etc, has minimum length, x special
> characters and x numeric characters, mixed case etc.
>

Is that an after-the-fact scanner (with all the problems Tom mentioned), or
is it a client-side "check while you're typing in the *new* password"
scanner?


Re: pg_dump restores as expected on some machines and reports duplicate keys on others

2024-06-22 Thread Muhammad Ikram
Hi  Shaheed,

As pointed out above by Adrian Klaver, I suspect that you made multiple attempts,
which caused the "database already exists" error. (There must be data in the tables,
which the next attempt is trying to write again.) I can't think of any
scenario where restoration succeeds in one environment and fails in
another; if there were some locale difference it would have failed for
a different reason, not with duplicates.

Perform a cleanup and try again. Hope it succeeds.
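
One way to make the retry idempotent, as a sketch (assuming a custom-format
archive named foo.dump; --clean together with --create drops and recreates the
database before restoring):

pg_restore --create --clean --if-exists -d postgres foo.dump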

Regards,
Muhammad Ikram
Bitnine Global.


On Sun, Jun 23, 2024 at 2:59 AM Adrian Klaver 
wrote:

> On 6/22/24 10:01, Shaheed Haque wrote:
> > Hi,
> >
> > I am using Postgres 14 on AWS RDS and am seeing the output of pg_dump be
> > restored as expected by pg_restore on some database instances, and fail
> > with reports of duplicate keys on other database instances:
> >
> >   * My deployments are always a pair, one "logic VM" for Django etc and
> > one "RDS instance". The psql client runs on the logic VM. The
> > Postgres version is the same in all cases; psql reports:
> >
> >   o psql (14.12 (Ubuntu 14.12-0ubuntu0.22.04.1), server 14.9)
> >
> >   * The pg_restore is done using the same script in both cases.
> >   * In the failing cases, there are always the same 26 errors (listed in
> > detail below), but in summary, 3 distinct "child" tables complain of
> > a duplicate id=1, id=2 and id=3 respectively.
> >   * These "child" tables are FK-related via some intermediate table to a
> > top level table. They form a polymorphic set. There are other
> > similar child tables which do not appear to be affected:
> >   o polymorphicmodel
> >   + companybankdetail
> >   # companybankdetailde
> >   # companybankdetailgb  <<< 1 duplicate, id=2
> >   # companybankdetailus
> >   + companypostaldetail
> >   # companypostaldetailde
> >   # companypostaldetailgb  <<< 1 duplicate, id=1
> >   # companypostaldetailus
> >   + companytaxdetail
> >   # companytaxdetailde
> >   # companytaxdetailgb  <<< 1 duplicate, id=3
> >   # companytaxdetailus
> >   + ...
> >   + several other hierarchies, all error free
> >   + ...
> >   * I've looked at the dumped .dat files but they contain no
> duplicates.
> >   * The one difference I can think of between deployment pairs which
> > work ok, and those which fail is that the logic VM (i.e. where the
> > psql client script runs) is the use of a standard AWS ubuntu image
> > for the OK case, versus a custom AWS image for the failing case.
> >   o The custom image is a saved snapshot of one created using the
> > standard image.
> >
> > Why should the use of one type of VM image versus another cause
> > pg_restore to hallucinate the duplicate records?
> >
>
> 1) Show the complete pg_restore script.
>
> 2) The first issue is related to trying to create a database that
> already exists. Does that database have data in it?
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>
>
>

-- 
Muhammad Ikram


Re: pg_dump restores as expected on some machines and reports duplicate keys on others

2024-06-22 Thread David G. Johnston
On Saturday, June 22, 2024, Shaheed Haque  wrote:
>
>
>- The one difference I can think of between deployment pairs which
>work ok, and those which fail is that the logic VM (i.e. where the psql
>client script runs) is the use of a standard AWS ubuntu image for the OK
>case, versus a custom AWS image for the failing case.
>   - The custom image is a saved snapshot of one created using the
>   standard image.
>
> Why should the use of one type of VM image versus another cause pg_restore
> to hallucinate the duplicate records?
>

To tie the other comments to your description: you took/have a snapshot of
the base image after you created the database and added some records to
it.  Nothing wrong here - you just need to decide how you want to deal with
the situation.

David J.


Re: pg_dump restores as expected on some machines and reports duplicate keys on others

2024-06-22 Thread David G. Johnston
On Saturday, June 22, 2024, David G. Johnston 
wrote:

> On Saturday, June 22, 2024, Shaheed Haque  wrote:
>>
>>
>>- The one difference I can think of between deployment pairs which
>>work ok, and those which fail is that the logic VM (i.e. where the psql
>>client script runs) is the use of a standard AWS ubuntu image for the OK
>>case, versus a custom AWS image for the failing case.
>>   - The custom image is a saved snapshot of one created using the
>>   standard image.
>>
>> Why should the use of one type of VM image versus another cause
>> pg_restore to hallucinate the duplicate records?
>>
>
> To tie the other comments to your description: you took/have a snapshot of
> the base image after you created the database and added some records to
> it.  Nothing wrong here - you just need to decide how you want to deal with
> the situation.
>

Sorry, but to be clear/clarify - you must be using an RDS snapshot as well
as an EC2 snapshot; a freshly built RDS cluster isn't going to complain
about things (especially the database) already existing.

David J.