Re: Postgres 12.1 : UPPER() in WHERE clause restarts server

2020-02-08 Thread Marc

Adrian,

Everything was a clean install ( MacOS Mojave and Postgres )

Export and import were done with the latest version of PGAdmin.

Please advise if we can provide you with anything ( logging etc . . . )


Is there a possibility to downgrade to version 11 ?

We upgraded over the weekend because we experienced a crash on our 
production server with “toast” issues as a result.


Thanks in advance,

Marc


On 8 Feb 2020, at 21:16, Adrian Klaver wrote:


On 2/8/20 12:09 PM, Nick Renders wrote:

Hi,

We have just upgraded our Postgres 9.6 database to 12.1 (pg_dumpall 
-> pg_restore on a clean installation) and now we are having some 
issues with one of our tables.


When we do the following statement:

 SELECT * FROM f_gsxws_schedule WHERE UPPER(gwsc_dossier) = 
'TEST'


the Postgres service restarts.

It seems that using UPPER() in the WHERE clause is causing this. The 
same statement without UPPER() works just fine.


I have tried to emulate the issue with other tables, but 
f_gsxws_schedule seems to be the only one.
The table also has another character field that is indexed, and the 
same problem occurs there. Whenever we use UPPER() or LOWER() to do a 
case-insensitive search, the service reboots.


Looking at the table's definition, I don't see anything different 
with the other tables.


Here is what is logged:





Has anyone noticed anything like this before? Any idea how to fix 
this?


Was the upgrade on the same machine?

Or was the machine also upgraded/updated?

I ask as there have been similar reports having to do with changes in 
glibc version affecting collation.





Best regards,

Nick Renders





--
Adrian Klaver
adrian.kla...@aklaver.com


ARC - your Apple service partner


Re: Postgres 12.1 : UPPER() in WHERE clause restarts server

2020-02-08 Thread Marc

Adrian,

The old production server was postgres 9.6 on macOS 10.9, so much older 
than the “new” server (now macOS 10.14, Postgres 12.1).
After a sudden restart of the machine we started having issues: part of 
the data was lost (TOAST errors), and we also started having issues when 
TRUNCATING certain tables. The table where Nick reported the 
“UPPER” issue was not involved.


The data we restored today on the “new” server was from before the 
crash and, in our opinion, “healthy”; we didn’t run into any issues 
importing it.


We never had issues before, and the RAID system wasn’t giving any 
warnings either. We still presume a hardware failure, but haven’t been 
able to figure out what exactly.


I’ll provide a log tomorrow morning, CET.

Regarding the stack trace, we’ll dig into it to see what we can come 
up with.


Is downgrading possible? We need to have a working system by 
tomorrow evening CET.


Thanks in advance,


Marc



On 8 Feb 2020, at 22:18, Adrian Klaver wrote:


On 2/8/20 12:28 PM, Marc wrote:

Adrian,

Everything was a clean install ( MacOS Mojave and Postgres )

Export and import were done with the latest version of PGAdmin.

Please advise if we can provide you with anything ( logging etc . . . 
)



Is there a possibility to downgrade to version 11 ?


At this point it is hard to tell whether this is a version issue or something 
else. Probably best not to introduce too many more moving parts at 
this time.


Questions:

1) Are the OS and version you mention above the same as on the 
production server below?


2) What were the exact issues you had on the production server?
The actual error messages would be very helpful.

3) Getting a stack trace of the UPPER(), LOWER() issue would be nice. 
For more info on that:


Not that familiar with MacOS, so not sure if the Linux/BSD steps would 
apply or not, still:


https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend







We upgraded over the weekend because we experienced a crash on our 
production server with “toast” issues as a result.


Thanks in advance,

Marc





--
Adrian Klaver
adrian.kla...@aklaver.com


ARC - your Apple service partner


Re: Unable to startup postgres: Could not read from file "pg_clog/00EC"

2020-02-09 Thread Marc

Alvaro, Thomas,

Putting in place a 256 KB file full of 0x55 bytes (that's binary 01010101, 
i.e. four “committed” transaction statuses per byte) did the job: we 
were able to restart the server.
Judging from our data this is a “better” way than zeroes, with less 
garbage.
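For reference, a minimal sketch of how such a file can be produced
(shell; the size and the [dbpath]/00EC name follow the earlier post,
and 0x55 is octal 125 for tr). As discussed in the thread, this
fabricates "committed" statuses and is a last resort:

dd if=/dev/zero bs=256k count=1 | tr '\0' '\125' > [dbpath]/pg_clog/00EC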

The “TOAST” issues, however, are still present.

To spend our weekend well, we set up a new server with version 12.1 but 
had to fall back to 11.6 (see other post).


We kept our “old” server active to see if we can learn some more from 
these hard times.


Thanks for the help

Marc



On 5 Feb 2020, at 12:14, Nick Renders wrote:


Hello,

Yesterday, we experienced some issues with our Postgres installation 
(v9.6 running on macOS 10.12).
It seems that the machine was automatically rebooted for a yet unknown 
reason, and afterwards we were unable to start the Postgres service.


The postgres log shows the following:

2020-02-04 15:20:41 CET LOG:  database system was interrupted; last 
known up at 2020-02-04 15:18:34 CET
2020-02-04 15:20:43 CET LOG:  database system was not properly shut 
down; automatic recovery in progress
2020-02-04 15:20:44 CET LOG:  invalid record length at 14A/9E426DF8: 
wanted 24, got 0

2020-02-04 15:20:44 CET LOG:  redo is not required
2020-02-04 15:20:44 CET FATAL:  could not access status of transaction 
247890764
2020-02-04 15:20:44 CET DETAIL:  Could not read from file 
"pg_clog/00EC" at offset 106496: Undefined error: 0.
2020-02-04 15:20:44 CET LOG:  startup process (PID 403) exited with 
exit code 1
2020-02-04 15:20:44 CET LOG:  aborting startup due to startup process 
failure

2020-02-04 15:20:44 CET LOG:  database system is shut down


After some searching, I found someone who had had a similar issue and 
was able to resolve it by overwriting the file in pg_clog.

So I tried the following command:

dd if=/dev/zero of=[dbpath]/pg_clog/00EC bs=256k count=1

and now the service is running again.


But I am worried that there might still be some issues that we haven't 
noticed yet. I also have no idea what caused this error in the first 
place. It might have been the reboot, but maybe the reboot was a 
result of a Postgres issue.


Is there anything specific I should check in our postgres installation 
/ database to make sure it is running OK now? Any way to see what the 
consequences were of purging that one pg_clog file?


Best regards,

Nick Renders


ARC - your Apple service partner


Re: Postgres 12.1 : UPPER() in WHERE clause restarts server

2020-02-09 Thread Marc

Adrian, Christoph, Tom,

We identified the problem as being persistent on all tables with many 
records (600K+), and they all had a JSONB column (we feel that might 
be related).


Luckily we were able to downgrade to version 11.6 on the same system 
(macOS 10.14.6), so that an OS impact can be ruled out.


We will keep the 12.1 install in place so that we can run additional 
tests to help pinpoint the issue.


Feel free to ask but allow us to recover from these hectic days ;-)

Many thanks for the help !



Marc


On 8 Feb 2020, at 21:09, Nick Renders wrote:


Hi,

We have just upgraded our Postgres 9.6 database to 12.1 (pg_dumpall -> 
pg_restore on a clean installation) and now we are having some issues 
with one of our tables.


When we do the following statement:

SELECT * FROM f_gsxws_schedule WHERE UPPER(gwsc_dossier) = 'TEST'

the Postgres service restarts.

It seems that using UPPER() in the WHERE clause is causing this. The 
same statement without UPPER() works just fine.


I have tried to emulate the issue with other tables, but 
f_gsxws_schedule seems to be the only one.
The table also has another character field that is indexed, and the 
same problem occurs there. Whenever we use UPPER() or LOWER() to do a 
case-insensitive search, the service reboots.


Looking at the table's definition, I don't see anything different with 
the other tables.


Here is what is logged:

2020-02-08 20:21:19.942 CET [83892] LOG:  server process (PID 85456) 
was terminated by signal 9: Killed: 9
2020-02-08 20:21:19.942 CET [83892] DETAIL:  Failed process was 
running: SELECT * FROM f_gsxws_schedule WHERE UPPER(gwsc_dossier) = 
'TEST'
2020-02-08 20:21:19.942 CET [83892] LOG:  terminating any other active 
server processes
2020-02-08 20:21:19.943 CET [85364] WARNING:  terminating connection 
because of crash of another server process
2020-02-08 20:21:19.943 CET [85364] DETAIL:  The postmaster has 
commanded this server process to roll back the current transaction and 
exit, because another server process exited abnormally and possibly 
corrupted shared memory.
2020-02-08 20:21:19.943 CET [85364] HINT:  In a moment you should be 
able to reconnect to the database and repeat your command.
2020-02-08 20:21:19.943 CET [85360] WARNING:  terminating connection 
because of crash of another server process
2020-02-08 20:21:19.943 CET [85360] DETAIL:  The postmaster has 
commanded this server process to roll back the current transaction and 
exit, because another server process exited abnormally and possibly 
corrupted shared memory.
2020-02-08 20:21:19.943 CET [85360] HINT:  In a moment you should be 
able to reconnect to the database and repeat your command.
2020-02-08 20:21:19.943 CET [85269] WARNING:  terminating connection 
because of crash of another server process
2020-02-08 20:21:19.943 CET [85269] DETAIL:  The postmaster has 
commanded this server process to roll back the current transaction and 
exit, because another server process exited abnormally and possibly 
corrupted shared memory.
2020-02-08 20:21:19.943 CET [85269] HINT:  In a moment you should be 
able to reconnect to the database and repeat your command.
2020-02-08 20:21:19.946 CET [83892] LOG:  all server processes 
terminated; reinitializing
2020-02-08 20:21:19.988 CET [85686] LOG:  database system was 
interrupted; last known up at 2020-02-08 20:20:48 CET
2020-02-08 20:21:20.658 CET [85686] LOG:  database system was not 
properly shut down; automatic recovery in progress

2020-02-08 20:21:20.662 CET [85686] LOG:  redo starts at C/B99B45A0
2020-02-08 20:21:20.662 CET [85686] LOG:  invalid record length at 
C/B99B4688: wanted 24, got 0

2020-02-08 20:21:20.662 CET [85686] LOG:  redo done at C/B99B4650
2020-02-08 20:21:20.675 CET [83892] LOG:  database system is ready to 
accept connections



Has anyone noticed anything like this before? Any idea how to fix 
this?



Best regards,

Nick Renders


ARC - your Apple service partner


Re: Postgres 12.1 : UPPER() in WHERE clause restarts server

2020-02-27 Thread Marc

Hello Tom,

To whom do we report our findings regarding this issue?

I can offer you a Belgian waffle to go with your caffeine.

Kindest Regards,



Marc


On 25 Feb 2020, at 10:35, Nick Renders wrote:


Hi Tom,

1. we used the EDB installer.

2. turning JIT off did make the problem go away. So I guess this was 
causing the Postgres process to crash all along.
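For reference, a minimal sketch of the workaround being discussed (the
jit parameter exists since PostgreSQL 11; ALTER SYSTEM requires
superuser):

-- per session, then retry the failing query
SET jit = off;
SELECT * FROM f_gsxws_schedule WHERE UPPER(gwsc_dossier) = 'TEST';
-- cluster-wide, written to postgresql.auto.conf
ALTER SYSTEM SET jit = off;
SELECT pg_reload_conf();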


Thanks for the help,

Nick


On 24 Feb 2020, at 16:24, Tom Lane wrote:


"Nick Renders"  writes:
We have set up a new test environment running PostgreSQL v12.2 on 
macOS

10.14 and the issue is still there.


Some nearby threads prompt these two questions:

1. Are you using your own build, or is this from EDB's installer?

2. If the latter, does turning JIT off ("set jit = off") make the
problem go away?

There is as yet no "native" support for --with-llvm on macOS,
ie Apple themselves don't provide sufficient support for that.
EDB seem to have hacked up something that sort of works, but
only sort of.

regards, tom lane


ARC - your Apple service partner


Re: solved (was plain stupidity) Re: ubuntu 18: PostgreSQL does not start. how can I totally remove and reinstall it

2021-01-24 Thread Marc

Robert,

Otherwise mankind would constantly be in pain ;-)

Enjoy the weekend !

Marc


On 24 Jan 2021, at 8:13, robert rottermann wrote:


thanks a lot.
why does such stupidity not hurt?

have a nice weekend
robert

On 24.01.21 08:04, Julien Rouhaud wrote:
On Sun, Jan 24, 2021 at 2:58 PM rob...@redo2oo.ch  
wrote:

root@elfero-test:~/scripts# pg_lsclusters
Ver Cluster Port Status Owner    Data directory              Log file
10  main    5433 online postgres /var/lib/postgresql/10/main /var/log/postgresql/postgresql-10-main.log

[...]
psycopg2.OperationalError: could not connect to server: Connection refused
    Is the server running on host "localhost" (127.0.0.1) and accepting
    TCP/IP connections on port 5432?
It looks like your instance is configured to listen on port 5433, not 
5432.


ARC - your Apple service partner


Re: PG12: Any drawback of increasing wal_keep_segments

2022-03-31 Thread Marc

On 29 Mar 2022, at 17:17, Stephen Frost wrote:


Greetings,

* Alvaro Herrera (alvhe...@alvh.no-ip.org) wrote:

On 2022-Mar-22, Shukla, Pranjal wrote:
Are there any disadvantages of increasing the “wal_keep_segments” to a
higher number, say 500? Will it have any impact on performance of
streaming replication, on primary or secondary servers?


No.  It just means WAL will occupy more disk space.  I've seen people go
even as high as 5000 with no issues.


Yeah, though it makes the primary into essentially a WAL repository and,
really, you'd be better off having a dedicated repo that replicas can
pull from instead.  Consider that a replica might fall way behind and
then demand the primary send 5000 WAL segments to it.  The primary then
has to go pull that 80GB of data from disk and send it across the
network.  As to if that's an issue or not depends on the IOPS and
bandwidth available, of course, but it's not free.

Thanks,

Stephen


Hello Stephen,

How do you see a setup with ‘a dedicated repo that replicas can pull 
from’?
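For reference, a sketch of what such a setup typically looks like with
pgBackRest (the stanza name is an assumption): the primary pushes WAL
to a dedicated repository host, and replicas fetch missing segments
from it instead of demanding them from the primary:

# postgresql.conf on the primary
archive_mode = on
archive_command = 'pgbackrest --stanza=main archive-push %p'

# on each replica
restore_command = 'pgbackrest --stanza=main archive-get %f "%p"'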


Thanks in advance for the clarification.


Marc


pgBackRest on MacOS

2022-09-10 Thread Marc


Has anybody got pgBackRest running on macOS (Monterey)?

If yes, are you willing to share the how-to?

Or can anybody point us to an “easy” how-to?

Many thanks in advance,


Marc




Re: Autovacuum on sys tables

2023-01-21 Thread Marc
This is a test.

Apologies, but since 19/12 we have no longer been receiving the list mails





how to troubleshoot: FATAL: canceling authentication due to timeout

2021-03-17 Thread Marc
hi all,

We are facing a problem with a user logging into the database. It happens when 
there is a large load, and only from time to time.
Once we get this error, the user becomes unusable until the database is restarted. 
(That user is being used by multiple instances of the same application; it also 
happens using dedicated users for each application, resulting in one of those 
users being locked out while the rest keep working fine.)

The error is as follows:
LOG: pam_authenticate failed: Authentication failure
FATAL: canceling authentication due to timeout

Our setup:
3 nodes cluster
- Centos 7
- Streaming replication in place (async)
- WAL shipped to an external location
- pooling done at client side
- Centos joined to an Active Directory domain
- Authentication is using PAM module

The user is completely fine on the AD side, since I can use it to log in to a standby DB.
I guess there must be a lock that prevents this user from completing the first 
authentication step, but no idea how to find it. I’ve tried common queries to 
find locks but I can’t see anything relevant.
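For what it's worth, that FATAL message is the one PostgreSQL emits when
authentication takes longer than authentication_timeout (default 1min),
which points at the pam_authenticate call stalling rather than at a row
lock. A sketch for confirming that diagnosis (the value is an assumption):

# postgresql.conf
authentication_timeout = 2min   # default 1min; raising it only buys time
                                # for a stalled PAM call, it is not a fix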

I would appreciate it if someone could point me in the right direction!

Thanks a lot!
Marc.





Re: how to troubleshoot: FATAL: canceling authentication due to timeout

2021-03-17 Thread Marc
Hi,
Not much, we don't see any failed logins.

We have added debug logging to the sssd service, since we just found out that
restarting sssd released the user and it became usable again.
So there must be something wrong between postgres and the sssd/pam modules...

Waiting now for fresh logs if it happens again.

Thanks!
On Wed, 17 Mar 2021, 22:32 Diego,  wrote:

> hi!
>
> What do you see in the log files of CentOS ( /var/log )?
>
>
>
> On 17/03/2021 16:00, Marc wrote:
>
> hi all,
>
> We are facing a problem with a user login into database. It happens when 
> there is large load and only from time to time.
> Once we get this error, the user becomes unusable until database is 
> restarted. (That user is being used by multiple instances of the same 
> application, it also happens using dedicated users for each application, 
> resulting on one of those users being locked out, the rest keep working fine)
>
> The errors is as follows:
> LOG: pam_authenticate failed: Authentication failure
> FATAL: canceling authentication due to timeout
>
> Our setup:
> 3 nodes cluster
> - Centos 7
> - Streaming replication in place (async)
> - WAL shipped to an external location
> - pooling done at client side
> - Centos joined to an Active Directory domain
> - Authentication is using PAM module
>
> User is completely fine in AD side since i can use it to login to a standby 
> DB.
> I guess there must be a lock that prevents this user to do the first 
> authentication step, but no idea how to find it. I’ve tried common queries to 
> find locks but I can’t see anything relevant.
>
> I would appreciate if someone could point me to the right direction!
>
> Thanks a lot!
> Marc.
>
>
>
>
>


Re: Incremental Materialized Views

2021-08-23 Thread Marc
On 23 Aug 2021, at 11:55, Oliver Kohll wrote:

> Hi,
>
> Just wondering if anyone knows which release (if any) this is targeted for?
>
> https://wiki.postgresql.org/wiki/Incremental_View_Maintenance
>
> Asking because this could make a massive difference to some of our
> workload, even when limited to relatively simple queries. It's quite
> exciting.
>
> Regards
> Oliver
> www.agilebase.co.uk

Oliver,

According to this info, maybe version 15:

https://commitfest.postgresql.org/23/2138/

Regards,


Marc




pg_dump - increase in backup time - log

2021-08-27 Thread Marc

Hello,

The time required to back up with pg_dump suddenly increased drastically 
(+20 min on a backup time of 2 hours), with no comparable increase in 
data which could explain the behaviour.


We want to dig into it, but we lack a detailed pg_dump log. We used 
verbose mode, but unfortunately its output lacks a date/time stamp.
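One workaround, since pg_dump itself does not timestamp its verbose
output: prefix each stderr line externally. A sketch assuming GNU awk
and a custom-format dump written with -f, so only the progress messages
reach the pipe (the db name and path are placeholders):

pg_dump --verbose -Fc -f /backup/mydb.dump mydb 2>&1 \
    | awk '{ print strftime("%Y-%m-%d %H:%M:%S"), $0; fflush() }'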

No changes on the server either.

Postgres: version 11.13
pg_dump is running on the same machine as Postgres and no changes have 
been made to the configuration


Any ideas/help most welcome,
Thanks in advance,



Marc

ARC - your Apple service partner


Extract transactions from wals ??

2019-11-21 Thread Marc Millas
Hi,

due to a set of bugs and wrong manipulations, an inappropriate update was
done on a production DB.
After that, quite a long series of valuable inserts and updates has been
done and needs to be kept.
Obviously, restoring a backup and applying PITR will get us just before the
offending update.
Now we need to find a way of extracting, either from the ex-prod db or from
the WALs, the "good" transactions, to be able to re-apply them.

This did already happen on a Prod  Oracle DB, and recovering was possible
with a :
select * from table_name AS OF TIMESTAMP TO_TIMESTAMP('09052019 0900','DDMMYYYY HH24MI');
to get most things done after the problem.
As we are currently moving out of Oracle, we must prove to the business
people that our new postgres env is fine.
So, ... any idea ?

thanks,


Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com


Re: Extract transactions from wals ??

2019-11-21 Thread Marc Millas
Hi Laurenz,

you say "extract the data you need"
That is exactly the point of my question, as the PITR step was obvious.
How to guess "what is the data" I need ??

The timestamp feature within Oracle was providing exactly that: get all
modifications to a given table that occurred within a given timeframe.
Quite clearly, an option for the future would be to modify ALL tables and
add a timestamp column and a trigger to fill/update it.
A tad boring to do...
This is why I was wondering if another possibility exists, like getting,
from the WALs, a list of modified objects.

so ??

regards,

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Thu, Nov 21, 2019 at 3:54 PM Laurenz Albe 
wrote:

> On Thu, 2019-11-21 at 14:50 +0100, Marc Millas wrote:
> > due to a set of bugs and wrong manip, an inappropriate update have been
> done into a production DB.
> > After that, quite a long set of valuables inserts and updates have been
> done and needs to be kept.
> > Obviously getting a backup and applying pitr will get us just before the
> offending update.
> > Now, we need to find a way of extracting, either from the ex prod db, or
> from the wals, the "good" transactions to be able to re-apply them.
> >
> > This did already happen on a Prod  Oracle DB, and recovering was
> possible with a :
> > select * from table_name AS OF TIMESTAMP TO_TIMESTAMP('09052019
> 0900','MMDD HH24MI');
> > to get most things done after the problem.
> > As we are currently moving out of Oracle, we must prove to the business
> people that our new postgres env is fine.
> > So, ... any idea ?
>
> Sure.
>
> Restore a backup and perform point-in-time-recovery.
> Then extract the data you need.
>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
>
>


Re: Extract transaction from WAL

2019-11-21 Thread Marc Millas
Hi,
funny enough, this problem looks similar to mine.
The point is: how to work out what "the data I need" is?

Looks like we are looking for a way to ask postgres which transactions
have occurred between this point and that one.
Obviously, if we can have online both the db after the PITR and the
"corrupted" db, we can try to create a dblink from one to the other and
then try to extract the "differences".

But this is not always possible, hence the question about WALs.
Or?

regards,

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Thu, Nov 21, 2019 at 3:24 PM Michael Loftis  wrote:

>
>
> On Thu, Nov 21, 2019 at 04:56 Jill Jade  wrote:
>
>> Hello everyone,
>>
>> I am new to Postgres and I have a query.
>>
>>  I have updated a table which I should not have.
>>
>>  Is there a way to extract the transactions from the WAL and get back the
>> previous data?
>>
>> Is there a tool that can help to get back the transactions?
>>
>
> The normal way is to use a backup along with point in time recovery. But
> this requires you’ve setup backups and are archiving WALs F/ex with
> pgbackrest. You restore the last full backup from before the incident and
> play back to a time stamp or transaction ID. Either to the original server
> or elsewhere...in this case I would probably restore elsewhere and extract
> the data I needed using tools like pg_dump to restore the selected data.
>
> I’m personally unaware of other methods which may exist.
>
>>
>> Thanks in advance.
>>
>> Regards,
>> Jill
>>
>>
>> --
>
> "Genius might be described as a supreme capacity for getting its possessors
> into trouble of all kinds."
> -- Samuel Butler
>


Re: Extract transactions from wals ??

2019-11-21 Thread Marc Millas
Hi Laurenz,


I was writing select from ""table"" as a template. We have to do this for a
bunch of tables.
So, to my understanding, what you suggest is to PITR up to the first
timestamp, extract all meaningful tables, and then PITR to the second
timestamp, so as to be able to script a kind of "diff" between the two to
get what I want.

Yes ?

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Thu, Nov 21, 2019 at 5:16 PM Laurenz Albe 
wrote:

> On Thu, 2019-11-21 at 17:07 +0100, Marc Millas wrote:
> > you say "extract the data you need"
> > That is exactly the point of my question, as the PITR step was obvious.
> > How to guess "what is the data" I need ??
>
> Well, you asked for the contents of a table AS OF TIMESTAMP .
>
> That means you know which table and which timestamp.
>
> So restore the PostgreSQL cluster to that timestamp, connect
> and SELECT from the table.
>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
>
>


Re: Extract transactions from wals ??

2019-11-22 Thread Marc Millas
Yes !

We are looking for something providing a functionality similar to
Oracle's :-)
Through PITR, or a tool or extension around the WALs.
Still, as the WALs contain enough info for replication to work,
it should be possible to extract from the WALs a list of objects that have
been written, and elements about what was written.
The OS files written, and the positions, should be enough to provide this.

Something ?
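Partly, yes: since PostgreSQL 10, the pg_waldump utility decodes WAL
segments and shows, for each record, which relation (as
tablespace/database/relfilenode) its block references touch. A sketch
(segment names are illustrative):

# count WAL records per relation across a range of segments
pg_waldump 000000010000000000000010 000000010000000000000014 \
    | grep -o 'rel [0-9]*/[0-9]*/[0-9]*' | sort | uniq -c | sort -rn
# map a relfilenode back to a table name inside the database:
#   SELECT oid::regclass FROM pg_class WHERE relfilenode = <n>;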

thanks,


Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Fri, Nov 22, 2019 at 11:02 AM Ganesh Korde 
wrote:

> I think he wants to see data from different tables at different timestamp
> (like flashback query in Oracle). As per my understanding question here is
> can PITR be done for specific table and for specific timestamp.
>
> On Fri, Nov 22, 2019 at 2:37 PM Laurenz Albe 
> wrote:
>
>> On Thu, 2019-11-21 at 17:35 +0100, Marc Millas wrote:
>> > I was writing select from ""table"" as a template. We have to do this
>> for a bunch of tables.
>> > So, to my understanding, what you suggest is to PITR up to the first
>> timestamp,
>> > extract all meaningfull tables, and then pitr to the second timestamp
>> > so as to be able to script a kind of "diff" between the 2 to get what I
>> want.
>>
>> Sure, you can do that.
>>
>> The description of what you wanted to do was rather unclear, all I could
>> make out is that you want to query AS OF TIMESTAMP.
>>
>> Yours,
>> Laurenz Albe
>> --
>> Cybertec | https://www.cybertec-postgresql.com
>>
>>
>>
>>


install postgres

2019-12-20 Thread Marc Millas
Hi,

I may be overlooking something obvious...
I am just looking, on the download pages of postgresql.org, for a way to
download the RPMs (for Red Hat 6 and Red Hat 7 x64 machines):
NOT the noarch repo package, but the full distribution.
The reason is I have to install postgres on a network with NO internet access.
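A sketch of the usual approach using an online staging machine
(CentOS/RHEL 7 shown; the repo RPM URL follows the pattern published on
the postgresql.org yum pages, with an EL-6 analogue; version 12 is an
assumption): download the server RPMs plus dependencies, copy them over,
then yum localinstall on the offline hosts.

sudo yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
sudo yum install -y yum-utils
yumdownloader --resolve --destdir=/tmp/pgrpms \
    postgresql12 postgresql12-server postgresql12-contrib
# offline host:  yum localinstall /tmp/pgrpms/*.rpm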

Thanks,

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com


policies and extensions

2020-02-17 Thread Marc Munro
I tried to define a policy within an extension but the policy does not
seem to belong to the extension.  Is this the way it is supposed to be?

This is postgres 9.5.21

Here is the relevant code from the extension:

create table rls2 (
  username text not null,
  details  text not null
);

create policy only_owner on rls2
  for all
  to session_user
  using (username = user);

The table, as expected, depends on the extension but the policy does
not (determined by querying pg_depend).
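For reference, a sketch of the pg_depend query involved ('myext' is a
placeholder extension name); the table shows up with deptype 'e'
(extension membership), while the policy only reaches the extension
indirectly through its dependency on the table:

SELECT classid::regclass AS catalog, objid, deptype
FROM pg_depend
WHERE refclassid = 'pg_extension'::regclass
  AND refobjid = (SELECT oid FROM pg_extension WHERE extname = 'myext');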

Am I missing something special about policies or is this an oversight?

__
Marc




Re: policies and extensions

2020-02-18 Thread Marc Munro
On Mon, 2020-02-17 at 22:48 -0500, Tom Lane wrote:
> Marc Munro  writes:
> > 
> An RLS policy is a table "subsidiary object" so it only depends
> indirectly on the extension that owns the table.

Yep, I get that, and I see the dependency chain in the catalog. 

However an extension can create the table with or without the policy,
and a table created by an extension without policy can later have a
policy added, and, unless I'm missing something, the same dependency
chain exists in either case.

This means that I cannot tell whether the policy was added by the
extension or not.

I can see use cases where an extension writer might create an extension
with policies on tables, and others where a user might want to create
policies on tables from an extension provided by someone else.

Unfortunately, there is no way after the fact of determining which case
applies.

My use case is a tool that determines the state of a database for
performing diffs, etc.  It can generate ddl from database diffs to
create or alter tables, etc, and can also deal with policies and
extensions but will not be able to deal with policies created in
extensions, which is disappointing.

I can live with it though.  I'll document it as an oddity that the tool
is unable to deal with and generate commented ddl if the policy applies
to a table defined in an extension.

Thanks for the response.

__
Marc




Re: policies and extensions

2020-02-18 Thread Marc Munro
On Tue, 2020-02-18 at 15:06 -0500, Stephen Frost wrote:
> 
> Policies, also being part of the overall privilege system, could
> certainly be looked at in a similar light as being different from
> triggers and indexes...

While I think I agree with Stephen here, I don't have a vested interest
 in any particular solution and am not advocating for change.  

I am kinda surprised that policies are not explicitly tracked as part
of an extension but I can live with the status quo now that it has been
explained.  

I think it *may* be worth stating something explicitly in the
documentation but again I am not advocating.

Thanks again.

__
Marc





GPG signing

2020-05-26 Thread Marc Munro
I need to be able to cryptographically sign objects in my database
using a public key scheme.

Is my only option to install plpython or some such thing?   Python
generally makes me unhappy as I'm never convinced of its stability or
the quality of its APIs, and it is not obvious to me which of the many
different gpg-ish packages I should choose.

Any other options?  Am I missing something?

Thanks.

__
Marc





Re: GPG signing

2020-05-26 Thread Marc Munro
On Tue, 2020-05-26 at 12:04 -0700, Adrian Klaver wrote:
> On 5/26/20 12:01 PM, Marc Munro wrote:
> > I need to be able to cryptographically sign objects in my database
> > using a public key scheme.
> > [ . . . ]
> > Any other options?  Am I missing something?
> 
> https://www.postgresql.org/docs/12/pgcrypto.html#id-1.11.7.34.7

I looked at that but I must be missing something.  In order to usefully
sign something, the private (secret) key must be used to encrypt a
digest of the thing being signed (something of a simplification, but
that's the gist).  This can then be verified, by anyone, using the
public key.

But the pgcrypto functions, for good reasons, do not allow the private
(secret) key to be used in this way.  Encryption and signing algorithms
are necessarily different as the secret key must be protected; and we
don't want signatures to be huge, and it seems that pgcrypto has not
implemented signing algorithms.

What am I missing?

__
Marc




Re: GPG signing

2020-05-28 Thread Marc Munro
On Wed, 2020-05-27 at 14:42 -0700, Michel Pelletier wrote:
> Hi Marc,
> 
> You can sign content with pgsodium:
> 
> https://github.com/michelp/pgsodium

Michel,
Yay!  A modern crypto implementation.  And it seems to do most of what
I need right out of the box with way less work than pgcrypto.

Any chance that crypto_sign_detached() and
crypto_sign_verify_detached() will be implemented soon?

I'll implement it and provide a patch if you'd like.
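For readers landing here, a sketch of the combined-mode signing pgsodium
already exposes (function names per the pgsodium README; treat the exact
signatures as an assumption): crypto_sign() appends the signature to the
message, and crypto_sign_open() verifies it and returns the original,
erroring on a bad signature:

CREATE EXTENSION IF NOT EXISTS pgsodium;
WITH kp AS (SELECT * FROM pgsodium.crypto_sign_new_keypair())
SELECT convert_from(
         pgsodium.crypto_sign_open(
           pgsodium.crypto_sign('attack at dawn'::bytea, kp.secret),
           kp.public),
         'utf8')
FROM kp;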

__
Marc






table name

2020-06-11 Thread Marc Millas
sorry if my question is tooo simple :-)

I got a shapefile from the french gov.
I import it with postgis 3.01 utility.
fine !
the table created by this utility is named regions-20180101
with the dash in the middle.
I see that table name in pg_class, and, also, in the geometry_columns view.


obviously if I ask:
select * from regions-20180101;
I get a syntax error.
if I try select * from $$regions_20180101$$;
I get another syntax error.
If I try to rename that table, same thing.
if I try a cte, same thing.

What should I do ?
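(For reference: $$...$$ is PostgreSQL's dollar-quoting syntax for string
literals, not for identifiers; an identifier containing a dash has to be
double-quoted:)

SELECT * FROM "regions-20180101";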

thanks,


Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com


Re: table name

2020-06-11 Thread Marc Millas
damn..
thanks

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Thu, Jun 11, 2020 at 8:55 PM Paul Ramsey 
wrote:

> ALTER TABLE "regions-20180101" rename to regions_20180101;
>
>
> > On Jun 11, 2020, at 11:54 AM, Marc Millas 
> wrote:
> >
> > sorry if my question is tooo simple :-)
> >
> > I got a shapefile from the french gov.
> > I import it with postgis 3.01 utility.
> > fine !
> > the table created by this utility is named regions-20180101
> > with the dash in the middle.
> > I see that table name in pg_class, and, also, in the geometry_columns
> view.
> >
> >
> > obviously if I ask:
> > select * from regions-20180101;
> > I get a syntax error.
> > if I try select * from $$regions_20180101$$;
> > I get another syntax error.
> > If I try to rename that table, same thing.
> > if I try a cte, same thing.
> >
> > What should I do ?
> >
> > thanks,
> >
> >
> > Marc MILLAS
> > Senior Architect
> > +33607850334
> > www.mokadb.com
> >
>
>


some random() clarification needed

2020-07-14 Thread Marc Millas
Hi,

when, in psql, on a postgres 12.3, I write:
select ceiling(random()*2582);
it does provide the expected answer, i.e. a number between 1 and 2582,
inclusive. Always.
When I decide to use this to get a random row within a table prenoms having
2 columns, an id serial and a prenom varchar, with exactly 2582 rows, no
gaps, I write:
select id, prenom from prenoms where id=ceiling(random()*2582);

expecting to always get one line.
But it's not the case:
around 15% of the time I get 0 lines, which is already quite strange to me,
but 10% of the time I get a random number of lines, so far up to 4.
Even weirder (to me !)

so, can someone please clarify ?

thanks,
regards,


Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com


Re: some random() clarification needed

2020-07-14 Thread Marc Millas
Ok, thanks for the clarification.

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Tue, Jul 14, 2020 at 5:24 PM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Tue, Jul 14, 2020 at 8:15 AM Marc Millas 
> wrote:
>
>> select id, prenom from prenoms where id=ceiling(random()*2582);
>>
>> expecting to get, allways, one line.
>> But its not the case.
>> around 15% of time I get 0 lines which is already quite strange to me.
>> but 10% of time, I get a random number of lines, until now up to 4.
>> even weirder (to me !)
>>
>> so, can someone please clarify ?
>>
>>
> You are basically asking:
>
> For each row in my table compare the id to some random number and if they
> match return that row, otherwise skip it.  The random number being compared
> to is different for each row because random() is volatile and thus
> evaluated for each row.
>
> David J.
>
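A quick sketch of the difference being described (generate_series used
for illustration): a bare random() is re-evaluated for every row, while
wrapping it in a scalar sub-select makes it evaluate once per query:

SELECT i, random()          AS per_row FROM generate_series(1, 3) AS g(i);
SELECT i, (SELECT random()) AS once    FROM generate_series(1, 3) AS g(i);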


Re: some random() clarification needed

2020-07-14 Thread Marc Millas
Hi,
your answer helps me understand my first problem.
so, I rewrote a simple loop so as to avoid the "volatile" behaviour.
(at least I was thinking I did... looks like I was wrong !)
step by step loop:
DO $$
BEGIN
  FOR counter IN 1..1000 LOOP
    DECLARE
      id1  integer := ceiling(random()*2582);
      id3  date    := '2000-01-01';
      id2  date;
      pren varchar;
    BEGIN
      id2 := id3 + (random()*7200)::integer;
      SELECT prenom FROM prenoms WHERE id = id1 INTO pren;
      INSERT INTO testparttransac (datenaissance, prenoms) VALUES (id2, pren);
    END;
  END LOOP;
END; $$;

I truncated the table, executed the loop with no errors, and expected a
select count(*) to answer 1000!
No: it varies, from less than 1000 (much less, something like 900) to
more than 1000 (up to 1094).

so... what is "volatile" in the loop ?

BTW the testparttransac table is partitioned on datenaissance, with a
default partition.

thanks,
regards,


Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Tue, Jul 14, 2020 at 5:24 PM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Tue, Jul 14, 2020 at 8:15 AM Marc Millas 
> wrote:
>
>> select id, prenom from prenoms where id=ceiling(random()*2582);
>>
>> expecting to get, allways, one line.
>> But its not the case.
>> around 15% of time I get 0 lines which is already quite strange to me.
>> but 10% of time, I get a random number of lines, until now up to 4.
>> even weirder (to me !)
>>
>> so, can someone please clarify ?
>>
>>
> You are basically asking:
>
> For each row in my table compare the id to some random number and if they
> match return that row, otherwise skip it.  The random number being compared
> to is different for each row because random() is volatile and thus
> evaluated for each row.
>
> David J.
>


how to "explain" some ddl

2020-07-14 Thread Marc Millas
Hi,
We would like to understand where an ALTER TABLE ... ATTACH PARTITION
spends its time; to my understanding, EXPLAIN doesn't do this.

For a BI job we have a partitioned table with 1800+ partitions.
The feeding process of this table leads to detaching and attaching
partitions. Attaching does take time, something like 12 seconds per
partition. We need to understand where that time is spent (check
constraints, or a check within the default partition, or ...).

So, how to ?

thanks,
regards,

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com


Re: how to "explain" some ddl

2020-07-14 Thread Marc Millas
Hi,
thanks for the answer.
The problem is that the fact table does get modifications to "old" data,
so the current scheme implies truncating partitions and recreating them,
copying from ODS to DM, etc., which is better than (tens of) millions of
deletes and the ensuing vacuuming.
And so the partitioning scheme is based on one day's data, hence the 1800+.

The other problem we have is the very long planning time for most requests:
it was 120 sec in release 11, down to 60 sec in 12, versus an execution
time of around 4 sec. Looks like the number of indexes has a paramount
impact.
Can you point me to any doc about optimizing the index scheme for a fact
table with 40 dimensions ?
thanks
regards,




Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Tue, Jul 14, 2020 at 7:05 PM Tom Lane  wrote:

> Marc Millas  writes:
> > We would like to understand where an alter table attach partition spend
> its
> > time.
> > to my understanding, explain doesnt do this.
>
> Nope :-(.  As our DDL commands have gotten more complicated, there's
> been some discussion of adding that, but nothing's really been done
> yet.
>
> There is some progress-monitoring support for some DDL commands now,
> but that's not quite the same thing.
>
> > for a BI job we have a partitionned table with 1800+ partitions.
>
> TBH I'd recommend scaling that down by at least a factor of ten.
> We are not at a point where you can expect that all operations will
> be cheap even with thousands of partitions.  We may never be at that
> point, although people continue to chip away at the bottlenecks.
>
> regards, tom lane
>


Re: how to "explain" some ddl

2020-07-14 Thread Marc Millas
Hi Tom,
a few tests later.
Looks like when you add a partition as default, all tuples of it are read,
even if there is an index on the column that is the partition key.
This does explain our attach time. We are going to clean the default
partition...
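For reference, the documented way to avoid those validation scans is to
pre-create matching CHECK constraints before ATTACH PARTITION; a sketch
with hypothetical names (facts, facts_default, facts_20200714,
day-ranged on column d):

-- on the incoming partition: skips scanning it for out-of-range rows
ALTER TABLE facts_20200714 ADD CONSTRAINT chk_d
  CHECK (d >= DATE '2020-07-14' AND d < DATE '2020-07-15');
-- on the default partition: skips scanning it for rows that would
-- belong to the new partition
ALTER TABLE facts_default ADD CONSTRAINT excl_d
  CHECK (NOT (d >= DATE '2020-07-14' AND d < DATE '2020-07-15'));
ALTER TABLE facts ATTACH PARTITION facts_20200714
  FOR VALUES FROM ('2020-07-14') TO ('2020-07-15');
-- both helper constraints are redundant afterwards and can be dropped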

regards,

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Tue, Jul 14, 2020 at 7:05 PM Tom Lane  wrote:

> Marc Millas  writes:
> > We would like to understand where an alter table attach partition spend
> its
> > time.
> > to my understanding, explain doesnt do this.
>
> Nope :-(.  As our DDL commands have gotten more complicated, there's
> been some discussion of adding that, but nothing's really been done
> yet.
>
> There is some progress-monitoring support for some DDL commands now,
> but that's not quite the same thing.
>
> > for a BI job we have a partitionned table with 1800+ partitions.
>
> TBH I'd recommend scaling that down by at least a factor of ten.
> We are not at a point where you can expect that all operations will
> be cheap even with thousands of partitions.  We may never be at that
> point, although people continue to chip away at the bottlenecks.
>
> regards, tom lane
>


Re: some random() clarification needed

2020-07-15 Thread Marc Millas
Thanks!
makes it clearer :-)
It's not that obvious to guess the consequences of the "volatile" behaviour.
regards,


Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Wed, Jul 15, 2020 at 1:53 AM David Rowley  wrote:

> On Wed, 15 Jul 2020 at 04:01, Marc Millas  wrote:
> > your answer helps me understand my first problem.
> > so, I rewrote a simple loop so as to avoid the "volatile" behaviour.
>
> Not sure what you're trying to do with the plpgsql, but you can just
> escape the multiple evaluations by putting the volatile function in a
> sub-select with no FROM clause.
>
> SELECT ... FROM ... WHERE id = (SELECT ceiling(random()*2582));
>
> Or the more traditional way to get a random row is:
>
> SELECT ... FROM ... WHERE id BETWEEN 0 AND 2585 ORDER BY random() LIMIT 1;
>
> David
>


ransomware

2021-02-01 Thread Marc Millas
Hi,
I have been asked the following question:
is there any way, from within postgres, to detect any ""abnormal"" disk
writing activity ?
The obvious goal would be to alert if...
It's quite clear that the underlying OS is the place to do the checks, but
still:

-- to my understanding, a simple script can check various inner counters
(see the sketch below), but this only applies if the "undesired" software
uses postgres itself to do the encrypting (any experience with this ???)
-- another approach would be based on the fact that, if anything changes
any postgres file (data, current WAL, ...), postgres should somehow "hang"

There are various ways to do those checks, but I was wondering if any
"standard" solution exists within the postgres ecosystem, or if someone
has any feedback on the topic.
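A sketch of the counters idea (sampling cadence and alert thresholds are
left to the monitoring side; this only catches write activity that goes
through postgres itself):

-- sample periodically and alert on abnormal deltas between samples
SELECT now() AS sampled_at,
       sum(tup_inserted + tup_updated + tup_deleted) AS row_writes
FROM pg_stat_database;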

thanks for your help

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com


Re: ransomware

2021-02-02 Thread Marc Millas
Hi,

I know it's quite general. That is because I don't know what approaches may exist.

The requirement is extremely simple: is there any way, from a running postgres
standpoint, to be aware that ransomware is currently encrypting your data ?

The answer can be as simple as: when postgres crashes.

Something else ?

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Tue, Feb 2, 2021 at 2:37 AM Michael Paquier  wrote:

> On Mon, Feb 01, 2021 at 03:38:35PM +0100, Marc Millas wrote:
> > there are various ways to do those checks but I was wandering if any
> > ""standard''" solution exist within postgres ecosystem, or someone do
> have
> > any feedback on the topic.
>
> It seems to me that you should first write down on a sheet of paper a
> list of all the requirements you are trying to satisfy.  What you are
> describing here is a rather general problem line, so nobody can help
> without knowing what you are trying to achieve, precisely.
> --
> Michael
>


prepare in a do loop

2021-02-15 Thread Marc Millas
Hi,

in psql, with a postgres 12.5 db on a centos 7 intel:
I do create a function named randname() returning a varchar, and a table
matable with a column prenoms varchar(50). Then:
postgres=# prepare moninsert(varchar) as
postgres-# insert into matable(prenoms) values($1);
PREPARE

I test it:
postgres=# execute moninsert(randname());
INSERT 0 1

up to now, everything fine. then:
do $$ begin for counter in 1..100 loop execute
moninsert(randname());end loop;end;$$;
ERREUR:  la fonction moninsert(character varying) n'existe pas
LIGNE 1 : SELECT moninsert(randname())
(ERROR: function moninsert(character varying) does not exist
LINE 1: SELECT moninsert(randname()))

someone can explain ?

thanks

(it's a french db, so error messages are in french :-)

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com


Re: prepare in a do loop

2021-02-15 Thread Marc Millas
Hi Tom,

I do read the doc, and understand the caching behaviour of plpgsql.
if in psql I write: begin; execute moninsert(randname()); execute
moninsert(randname()); end;
it does work.  And if I put this (begin ... execute ... end) inside a DO
loop, it doesn't any more.
OK: the begin/execute/end is ""pure"" SQL, and the same thing within a DO
loop is plpgsql,
so
postgres=# create function testexec()returns void as $$
postgres$# execute moninsert(randname());
postgres$# end;
postgres$# $$ language plpgsql;
ERREUR:  erreur de syntaxe sur ou près de « execute »
LIGNE 2 : execute moninsert(randname());
(ERROR: syntax error at or near « execute »)
fine, quite coherent.
then
postgres=# create function testexec()returns void as $$
execute moninsert(randname());
end;
$$ language sql;
CREATE FUNCTION
as SQL, legal syntax.. ok
but
postgres=# select testexec();
ERREUR:  COMMIT n'est pas autorisé dans une fonction SQL
CONTEXTE : fonction SQL « testexec » lors du lancement
(ERROR: COMMIT is not allowed in an SQL function
CONTEXT: SQL function « testexec » during startup)
a bit more difficult to understand, as such (where is the COMMIT ??).
so.. the prepare/execute thing can only be used in plain session SQL (not
in plpgsql, nor in SQL functions).
The doc states:
The SQL standard includes a PREPARE statement, but it is only for use in
embedded SQL. This version of the PREPARE statement also uses a somewhat
different syntax.
??? where is the difference for the prepare context thing (I don't mean the
different syntax part) ??

thanks for clarification


Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Mon, Feb 15, 2021 at 5:27 PM Tom Lane  wrote:

> Marc Millas  writes:
> > in psql, with a postgres 12.5 db on a centos 7 intel:
> > I do create a function named randname() returning a varchar, and a table
> > matable with a column prenom varchar(50). then
> > postgres=# prepare moninsert(varchar) as
> > postgres-# insert into matable(prenoms) values($1);
> > PREPARE
>
> > I test it:
> > postgres=# execute moninsert(randname());
> > INSERT 0 1
>
> > up to now, everything fine. then:
> > do $$ begin for counter in 1..100 loop execute
> > moninsert(randname());end loop;end;$$;
> > ERREUR:  la fonction moninsert(character varying) n'existe pas
> > LIGNE 1 : SELECT moninsert(randname())
>
> > someone can explain ?
>
> EXECUTE in plpgsql has nothing whatever to do with the SQL EXECUTE
> command.  See the respective documentation.
>
> You don't actually need to use SQL PREPARE/EXECUTE in plpgsql.
> If you just write "insert into ..." as a command in a
> plpgsql function, it's automatically prepared behind the scenes.
> Indeed, one of the common uses for plpgsql's EXECUTE is to stop
> a prepared plan from being used when you don't want that ... so
> far from being the same thing, they're more nearly opposites.
> Perhaps a different name should have been chosen, but we're
> stuck now.
>
> regards, tom lane
>
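A sketch of the loop rewritten the way Tom describes: the INSERT written
directly in plpgsql is prepared and cached behind the scenes, with no
SQL-level PREPARE involved:

DO $$
BEGIN
  FOR counter IN 1..100 LOOP
    -- plpgsql automatically prepares and caches this statement
    INSERT INTO matable(prenoms) VALUES (randname());
  END LOOP;
END;
$$;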


postgis

2022-07-18 Thread Marc Millas
Hi,

I would like to install PostGIS 3.0.4 on a Debian 11.

Digging into various web sites, I didn't find the name of that package.

can someone help ?

thanks

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com


Re: postgis

2022-07-18 Thread Marc Millas
Hi,
postgres 12.
I may use whatever repo.

I install postgres from postgresql.org...



Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Mon, Jul 18, 2022 at 7:15 PM Adrian Klaver 
wrote:

> On 7/18/22 10:08, Marc Millas wrote:
> > Hi,
> >
> > I would like to install postgis 3.04 on a debian 11.
> >
> > digging into various web sites, I didnt found the name of that packet.
> >
> > can someone help ?
>
> Should have added to previous response, what version of Postgres?
> >
> > thanks
> >
> > Marc MILLAS
> > Senior Architect
> > +33607850334
> > www.mokadb.com <http://www.mokadb.com>
> >
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: postgis

2022-07-19 Thread Marc Millas
Thanks Adrian

still, I see a bunch of files, and nothing that can be installed via
apt-get install.
The only "things" big enough to contain a postgis distrib, like
https://apt.postgresql.org/pub/repos/apt/pool/main/p/postgis/postgis_3.0.3%2Bdfsg.orig.tar.xz

can obviously not be installed via apt.

what am I missing ??


Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Tue, Jul 19, 2022 at 12:09 AM Adrian Klaver 
wrote:

> On 7/18/22 11:48, Marc Millas wrote:
> > Hi,
> > postgres 12.
> > I may use whatever repo.
> >
> > I install postgres from postgresql.org...
>
> You pretty much need to install from the same repo for PostGIS.
> Otherwise you will most likely run into compatibility issues.
>
>  From here:
>
> https://apt.postgresql.org/pub/repos/apt/pool/main/p/postgis/
>
> I only see 3.0.1 and 3.0.3 as options.
>
> A more complete answer will need to come from one of the maintainers.
>
> >
> >
> >
> > Marc MILLAS
> > Senior Architect
> > +33607850334
> > www.mokadb.com <http://www.mokadb.com>
> >
> >
>
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: postgis

2022-07-19 Thread Marc Millas
just... nothing !

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Tue, Jul 19, 2022 at 6:36 PM Adrian Klaver 
wrote:

> On 7/19/22 9:01 AM, Marc Millas wrote:
> > Thanks Adrian
> >
> > still, I see a bunch of files, and nothing that can be installed via
> > apt-get install.
> > The only "things" big enough to contain a postgis distrib, like
> >
> https://apt.postgresql.org/pub/repos/apt/pool/main/p/postgis/postgis_3.0.3%2Bdfsg.orig.tar.xz
> > <
> https://apt.postgresql.org/pub/repos/apt/pool/main/p/postgis/postgis_3.0.3%2Bdfsg.orig.tar.xz
> >
> >
> > can obviously not be installed via apt.
> >
>
> > what am I missing ??
>
> What does:
>
> apt-cache search postgresql-12-postgis
>
> return?
>
> >
> >
> > Marc MILLAS
> > Senior Architect
> > +33607850334
> > www.mokadb.com <http://www.mokadb.com>
> >
> >
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: postgis

2022-07-19 Thread Marc Millas
It is. I do begin with the postgres site script:

sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt
$(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'


so...


Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Tue, Jul 19, 2022 at 6:54 PM Adrian Klaver 
wrote:

> On 7/19/22 9:51 AM, Marc Millas wrote:
> > just... nothing !
>
> I thought you said you used the Postgres community repo to install
> Postgres.
>
> Is that not the case?
>
>
> >
> > Marc MILLAS
> > Senior Architect
> > +33607850334
> > www.mokadb.com <http://www.mokadb.com>
> >
> >
>
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: postgis

2022-07-19 Thread Marc Millas
Hi,

from your message, I understand that for debian 11, I can NOT get any 3.0.x
version.
right ?

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Tue, Jul 19, 2022 at 7:20 PM Imre Samu  wrote:

> > I would like to install postgis 3.04 on a debian 11
> > postgres 12.
> > I may use whatever repo.
> > I install postgres from postgresql.org...
>
> As I see - from the official postgresql.org debian11 repo,
> you can only install the "postgresql-12-postgis-3" package  ( now:
> Postgis=3.2.1 )
>
> docker run -it --rm postgres:12 bash
>   apt update && apt search postgis | grep postgresql-12-postgis-3
>---> *postgresql-12-postgis-3/bullseye-pgdg 3.2.1+dfsg-1.pgdg110+1
> amd64*
>   cat /etc/os-release | grep VERSION
>---> VERSION="11 (bullseye)
>
> Now the latest 3.0 version is http://postgis.net/2022/02/02/postgis-3.0.5/
>
> regards,
>   Imre
>
> Marc Millas  ezt írta (időpont: 2022. júl. 18.,
> H, 20:48):
>
>> Hi,
>> postgres 12.
>> I may use whatever repo.
>>
>> I install postgres from postgresql.org...
>>
>>
>>
>> Marc MILLAS
>> Senior Architect
>> +33607850334
>> www.mokadb.com
>>
>>
>>
>> On Mon, Jul 18, 2022 at 7:15 PM Adrian Klaver 
>> wrote:
>>
>>> On 7/18/22 10:08, Marc Millas wrote:
>>> > Hi,
>>> >
>>> > I would like to install postgis 3.04 on a debian 11.
>>> >
>>> > digging into various web sites, I didnt found the name of that packet.
>>> >
>>> > can someone help ?
>>>
>>> Should have added to previous response, what version of Postgres?
>>> >
>>> > thanks
>>> >
>>> > Marc MILLAS
>>> > Senior Architect
>>> > +33607850334
>>> > www.mokadb.com <http://www.mokadb.com>
>>> >
>>>
>>>
>>> --
>>> Adrian Klaver
>>> adrian.kla...@aklaver.com
>>>
>>


Re: postgis

2022-07-19 Thread Marc Millas
I did run each step of the script and did install a postgres 12.11.
then destroyed the instance created by the script, and, then
pg_createcluster a new one, which is running fine.
Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Tue, Jul 19, 2022 at 8:42 PM Adrian Klaver 
wrote:

> On 7/19/22 11:19, Marc Millas wrote:
> > It is. I do begin with the postgres site script:
> >
> > sudo sh -c 'echo "debhttp://apt.postgresql.org/pub/repos/apt  <
> http://apt.postgresql.org/pub/repos/apt>  $(lsb_release -cs)-pgdg main" >
> /etc/apt/sources.list.d/pgdg.list'
> >
> >
> > so...
>
> Have you run?:
>
> sudo apt update
>
>
> >
> >
> > Marc MILLAS
> > Senior Architect
> > +33607850334
> > www.mokadb.com <http://www.mokadb.com>
> >
>
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: postgis

2022-07-19 Thread Marc Millas
Postgres installed, but not postgis.. which is why I need some help...

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Tue, Jul 19, 2022 at 11:43 PM Adrian Klaver 
wrote:

> On 7/19/22 2:09 PM, Marc Millas wrote:
> > I did run each step of the script and did install a postgres 12.11.
> > then destroyed the instance created by the script, and, then
> > pg_createcluster a new one, which is running fine.
>
> Does this mean you have PostGIS installed now?
>
> > Marc MILLAS
> > Senior Architect
> > +33607850334
> > www.mokadb.com <http://www.mokadb.com>
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: postgis

2022-07-20 Thread Marc Millas
???

I did describe precisely what I did:

On 7/19/22 2:09 PM, Marc Millas wrote:
> > I did run each step of the script and did install a postgres 12.11.
> > then destroyed the instance created by the script, and, then
> > pg_createcluster a new one, which is running fine.

no error messages.. Postgres is fine.

My question (from the beginning) is about the availability of a *postgis* 3.0.x
distro for postgres 12 on debian 11,
and, if yes, the name of the package and the address of the repo.

To my understanding, such a *postgis* distro does not exist in the
postgresql.org repos
so through this mail list, I was trying to know if it may exist somewhere.

I know that I am supposed to post such a question on the postgis mail list.
But.. I registered through the postgis web site and didn't get any answer.
...

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Wed, Jul 20, 2022 at 10:25 AM Ron  wrote:

> You've never shown us *exactly what you did*, along with any *error
> messages*.
>
> On 7/19/22 22:07, Marc Millas wrote:
>
> Postgres installed, but not postgis.. which is why I need some help...
>
> Marc MILLAS
> Senior Architect
> +33607850334
> www.mokadb.com
>
>
>
> On Tue, Jul 19, 2022 at 11:43 PM Adrian Klaver 
> wrote:
>
>> On 7/19/22 2:09 PM, Marc Millas wrote:
>> > I did run each step of the script and did install a postgres 12.11.
>> > then destroyed the instance created by the script, and, then
>> > pg_createcluster a new one, which is running fine.
>>
>> Does this mean you have PostGIS installed now?
>>
>> > Marc MILLAS
>> > Senior Architect
>> > +33607850334
>> > www.mokadb.com <http://www.mokadb.com>
>>
>>
>> --
>> Adrian Klaver
>> adrian.kla...@aklaver.com
>>
>
> --
> Angular momentum makes the world go 'round.
>


Re: postgis

2022-07-20 Thread Marc Millas
???
I didn't get any error, as I don't know the name of the package to be
installed !!!

my question was, and still is:
"Hi,
I would like to install PostGIS 3.0.4 on a Debian 11.

Digging into various web sites, I didn't find the name of that package.

can someone help ?"

so..
the only info on the Debian PostGIS page I was able to find is the name of
a package:
postgresql-12-postgis-3, which installs a PostGIS 3.2.1,
not a PostGIS 3.0.x.
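For what it's worth, a sketch of how to check whether any 3.0.x build is
still published for the configured repos, and to pin it if so (the exact
version string is a placeholder; older pgdg builds are sometimes kept on
apt-archive.postgresql.org):

apt-cache madison postgresql-12-postgis-3     # lists every available version
sudo apt-get install postgresql-12-postgis-3=<exact 3.0.x version string>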






Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Wed, Jul 20, 2022 at 3:52 PM Ron  wrote:

> This long drama is about *POSTGIS*, not Postgresql.  What error do you
> get when trying to install *POSTGIS*?
>
> On 7/20/22 08:26, Marc Millas wrote:
>
> ???
>
> I did describe precisely what I did:
>
> On 7/19/22 2:09 PM, Marc Millas wrote:
>> > I did run each step of the script and did install a postgres 12.11.
>> > then destroyed the instance created by the script, and, then
>> > pg_createcluster a new one, which is running fine.
>
> no error messages.. Postgres is fine.
>
> My question i(from the beginning) s about the availability of a *postgis 
> *3.0.x
> distro for postgres 12 on debian 11.
> and, if yes, the name of the package, and the @ of the repo.
>
> To my understanding, such a *postgis* distro does not exist in the
> postgresql.org repos
> so through this mail list, I was trying to know if it may exist somewhere.
>
> I know that I am supposed to post such a question on the postgis mail list.
> But.. I register thru the postgis web site, and didnt get any answer.
> ...
>
> Marc MILLAS
> Senior Architect
> +33607850334
> www.mokadb.com
>
>
>
> On Wed, Jul 20, 2022 at 10:25 AM Ron  wrote:
>
>> You've never shown us *exactly what you did*, along with any *error
>> messages*.
>>
>> On 7/19/22 22:07, Marc Millas wrote:
>>
>> Postgres installed, but not postgis.. which is why I need some help...
>>
>> Marc MILLAS
>> Senior Architect
>> +33607850334
>> www.mokadb.com
>>
>>
>>
>> On Tue, Jul 19, 2022 at 11:43 PM Adrian Klaver 
>> wrote:
>>
>>> On 7/19/22 2:09 PM, Marc Millas wrote:
>>> > I did run each step of the script and did install a postgres 12.11.
>>> > then destroyed the instance created by the script, and, then
>>> > pg_createcluster a new one, which is running fine.
>>>
>>> Does this mean you have PostGIS installed now?
>>>
>>> > Marc MILLAS
>>> > Senior Architect
>>> > +33607850334
>>> > www.mokadb.com <http://www.mokadb.com>
>>>
>>>
>>> --
>>> Adrian Klaver
>>> adrian.kla...@aklaver.com
>>>
>>
>> --
>> Angular momentum makes the world go 'round.
>>
>
> --
> Angular momentum makes the world go 'round.
>


Re: postgis

2022-07-20 Thread Marc Millas
Thanks for your answer.
I would like to avoid compiling as much as possible.
I know that PostGIS 3.2.1 is available and installs without problems, but
that DB runs an app which takes very long to test, so I need to stick to a
PostGIS 3.0.x.

regards,

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Wed, Jul 20, 2022 at 4:16 PM jian he  wrote:

>
> Can you try compiling from source: https://postgis.net/source/
>
> postgis 3.2.1 is OK. postgresql & postgis version info:
> https://trac.osgeo.org/postgis/wiki/UsersWikiPostgreSQLPostGIS
>
> if you already installed check postgis version:
> https://postgis.net/docs/PostGIS_Version.html
>
> On Wed, Jul 20, 2022 at 7:37 PM Marc Millas 
> wrote:
>
>> ???
>> I didn't get any error, as I don't know the name of the package to be
>> installed!
>>
>> my question was, and still is:
>> "Hi,
>> I would like to install PostGIS 3.0.4 on a Debian 11.
>>
>> Digging into various web sites, I didn't find the name of that package.
>>
>> Can someone help?"
>>
>> So...
>> The only info on the Debian PostGIS page I was able to find is the name
>> of a package: postgresql-12-postgis-3, which installs a PostGIS 3.2.1,
>> not a PostGIS 3.0.x.
>>
>>
>>
>>
>>
>>
>> Marc MILLAS
>> Senior Architect
>> +33607850334
>> www.mokadb.com
>>
>>
>>
>> On Wed, Jul 20, 2022 at 3:52 PM Ron  wrote:
>>
>>> This long drama is about *POSTGIS*, not Postgresql.  What error do you
>>> get when trying to install *POSTGIS*?
>>>
>>> On 7/20/22 08:26, Marc Millas wrote:
>>>
>>> ???
>>>
>>> I did describe precisely what I did:
>>>
>>> On 7/19/22 2:09 PM, Marc Millas wrote:
>>>> > I did run each step of the script and did install a postgres 12.11.
>>>> > then destroyed the instance created by the script, and, then
>>>> > pg_createcluster a new one, which is running fine.
>>>
>>> no error messages... Postgres is fine.
>>>
>>> My question (from the beginning) is about the availability of a *postgis*
>>> 3.0.x distro for postgres 12 on Debian 11,
>>> and, if yes, the name of the package and the address of the repo.
>>>
>>> To my understanding, such a *postgis* distro does not exist in the
>>> postgresql.org repos, so through this mailing list I was trying to find
>>> out whether it may exist somewhere.
>>>
>>> I know that I am supposed to post such a question on the PostGIS mailing
>>> list.
>>> But... I registered through the PostGIS web site and didn't get any answer.
>>> ...
>>>
>>> Marc MILLAS
>>> Senior Architect
>>> +33607850334
>>> www.mokadb.com
>>>
>>>
>>>
>>> On Wed, Jul 20, 2022 at 10:25 AM Ron  wrote:
>>>
>>>> You've never shown us *exactly what you did*, along with any *error
>>>> messages*.
>>>>
>>>> On 7/19/22 22:07, Marc Millas wrote:
>>>>
>>>> Postgres installed, but not postgis.. which is why I need some help...
>>>>
>>>> Marc MILLAS
>>>> Senior Architect
>>>> +33607850334
>>>> www.mokadb.com
>>>>
>>>>
>>>>
>>>> On Tue, Jul 19, 2022 at 11:43 PM Adrian Klaver <
>>>> adrian.kla...@aklaver.com> wrote:
>>>>
>>>>> On 7/19/22 2:09 PM, Marc Millas wrote:
>>>>> > I did run each step of the script and did install a postgres 12.11.
>>>>> > then destroyed the instance created by the script, and, then
>>>>> > pg_createcluster a new one, which is running fine.
>>>>>
>>>>> Does this mean you have PostGIS installed now?
>>>>>
>>>>> > Marc MILLAS
>>>>> > Senior Architect
>>>>> > +33607850334
>>>>> > www.mokadb.com <http://www.mokadb.com>
>>>>>
>>>>>
>>>>> --
>>>>> Adrian Klaver
>>>>> adrian.kla...@aklaver.com
>>>>>
>>>>
>>>> --
>>>> Angular momentum makes the world go 'round.
>>>>
>>>
>>> --
>>> Angular momentum makes the world go 'round.
>>>
>>
>
> --
>  I recommend David Deutsch's <>
>
>   Jian
>
>
>


Re: postgis

2022-07-20 Thread Marc Millas
Right.
So I scratched the Debian VM, installed a CentOS 7, and within minutes I had
a postgres 12 with postgis 3.0.4 running.
So easy.

regards.
Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Wed, Jul 20, 2022 at 7:27 PM Imre Samu  wrote:

> >  I would expect the 35 packages implied by the version policies of those
> two projects.
>
> Based on my docker-postgis support, the "geos" version is also important.
> Now the Bullseye (Debian 11) geos version is 3.9, and this is likely to
> continue until the end of the cycle (so no upgrade expected to 3.10, 3.11).
>
> And the (next) PostGIS 3.3.0 release does not enable all new features
> with the current Bullseye GEOS version:
> https://git.osgeo.org/gitea/postgis/postgis/raw/tag/3.3.0beta2/NEWS
>
> *"This version requires PostgreSQL 11 or higher, GEOS 3.6 or higher, and
> Proj 5.2+.*
>
> *Additional features are enabled if you are running GEOS 3.9+,*
> *ST_MakeValid enhancements with 3.10+, *
> *numerous additional enhancements with GEOS 3.11+. *
> *Requires SFCGAL 1.4.1+ for ST_AlphaShape and ST_OptimalAlphaShape.*
> *"*
>
> And PostGIS 3.2 also has some enhancements working only with GEOS 3.10+
> (ST_MakeValid enhancements).
> And "Bookworm" (Debian 12) is expected >= mid-2023,
> so not easy ...
>
> Imre
>
>
> David G. Johnston wrote (on Wed, 20 Jul 2022, 18:31):
>
>> On Wed, Jul 20, 2022 at 9:21 AM Imre Samu  wrote:
>>
>>> > My general impression is that the packaging, at least for Debian,
>>> > doesn’t actually understand how the PostGIS project handles versioning
>>> support.
>>> > But i may be missing something
>>>
>>> "PostGIS Pre-built Binary Distributions for various OS"
>>> --->  https://trac.osgeo.org/postgis/wiki/UsersWikiPackages
>>>
>>> Debian is a conservative Linux.
>>>
>>> IMHO:
>>> Packaging is not so easy, [
>>> https://trac.osgeo.org/postgis/wiki/UsersWikiPostgreSQLPostGIS  ]
>>> - there are [n.=7] Postgres version [9.6,10,11,12,13,14,15 ]   [ now:
>>> all supported in bullseye ]
>>> - there are [g.=9 ] Geos version
>>> [3.3,3.4,3.5,3.6,3.7,3.8,3.9,3.10,3.11]  [ now: bullseye = 3.9.0 ]
>>> - there are [p.=7 ] Proj version [ 4.8,4.9,5.x,6.x,7.x,8.x,9.x ][
>>> now: bullseye = 7.2.1 ]
>>> - there are [d.= 7 ] Gdal version [ 2.4,3.0,3.1,3.2,3.3,3.4,3.5][
>>> now: bullseye = 3.2.2 ]
>>> - there are [m.=5] Postgis version [2.4,2.5,3.0,3.1,3.2,3.3]   [now:
>>> bullseye= 3.2.1 ]
>>>
>>> And there are also projects based on PostGIS.
>>> - Pgrouting [r.=7 ]   [2.3,2.4,2.5,2.6,3.0,3.1,3.2,3.3]  [ now:
>>> bullseye= 3.3.0 ; postgresql-12-pgrouting ]
>>>
>>> So the ideal "end user" combination =  n*g*p*d*m*r  = 7*9*7*7*5*7  =
>>> 108045
>>>
>>> // disclaimer:   I am a Postgis user and a
>>> https://github.com/postgis/docker-postgis contributor
>>>
>>>>
>>>>
>> Yes, my expectation may be naive, but as the package name is
>> "postgresql-[version]-postgis-[version]" I would expect the 35 packages
>> implied by the version policies of those two projects.  So that one can
>> choose their combination and focus on patch releases within those two named
>> projects.  The OP seems to as well.  Or maybe a functional subset so that
>> some number less than 35 may exist but, say, you cannot combine v14 and 3.0,
>> since 3.2 was the most recent release of PostGIS when PostgreSQL
>> v14 came out.
>>
>> In any case it does sound like the request by the OP is not something the
>> community has chosen to provide.  Which means a choice on their part - move
>> up PostGIS or compile from source.
>>
>> David J.
>>
>>
>>


limits, indexes, views and query planner

2022-08-15 Thread Marc Mamin

hello,
in the example below, we can see that the view test_ab prevents the usage of
the index to retrieve the top rows.
This is a special case, as the where clause excludes data from the second
table, and the explain output does not reference it at all.
I wonder if the planner could exclude table_b earlier in its plan and
possibly fall back to a plan equivalent to the first one.
With a view on a single table (test_av), the index is used.

An oddity in the plan is the expected row count in "Append" (110000) ...

( tested on postgres 14 )

Regards,
Marc Mamin


create temp table table_a as (select * from generate_series(1,100000)x);
create temp table table_b as (select * from generate_series(1,100000)x);
create index i_a on table_a (x);
create index i_b on table_b (x);

analyze table_a;
analyze table_b;

CREATE VIEW test_ab AS

  select 'a' as src, x from table_a
  UNION
  select 'b' as src, x from table_b
;


explain analyze select * from table_a order by x desc limit 10;
Limit  (cost=0.29..0.60 rows=10 width=4) (actual time=0.056..0.060 rows=10 loops=1)
  ->  Index Only Scan Backward using i_a on table_a  (cost=0.29..3050.29 rows=100000 width=4) (actual time=0.055..0.058 rows=10 loops=1)
        Heap Fetches: 10


explain analyze select * from test_ab where src='a' order by x desc limit 10;

Limit  (cost=17895.92..17895.94 rows=10 width=36) (actual time=89.678..89.681 rows=10 loops=1)
  ->  Sort  (cost=17895.92..18145.92 rows=110000 width=36) (actual time=89.677..89.679 rows=10 loops=1)
        Sort Key: table_a.x DESC
        Sort Method: top-N heapsort  Memory: 25kB
        ->  Unique  (cost=13984.92..14734.92 rows=110000 width=36) (actual time=47.684..75.574 rows=100000 loops=1)
              ->  Sort  (cost=13984.92..14234.92 rows=110000 width=36) (actual time=47.682..60.869 rows=100000 loops=1)
                    Sort Key: ('a'::text), table_a.x
                    Sort Method: external merge  Disk: 1768kB
                    ->  Append  (cost=0.00..2943.01 rows=110000 width=36) (actual time=0.012..21.268 rows=100000 loops=1)
                          ->  Seq Scan on table_a  (cost=0.00..1443.00 rows=100000 width=36) (actual time=0.011..14.078 rows=100000 loops=1)
                          ->  Result  (cost=0.00..0.00 rows=0 width=36) (actual time=0.001..0.002 rows=0 loops=1)
                                One-Time Filter: false
Planning Time: 0.107 ms
Execution Time: 90.139 ms

CREATE VIEW test_av AS
  select 'a' as src, x from table_a;

explain analyze select * from test_av order by x desc limit 10;
  ->  Index Only Scan Backward using i_a on table_a  (cost=0.29..3050.29 rows=100000 width=36) (actual time=0.017..0.019 rows=10 loops=1)

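A hedged aside on the behaviour above: UNION implies a DISTINCT over the
combined rows, which is what forces the Sort/Unique above the Append and
hides the index. When the two branches cannot produce duplicates (as here,
since the src literals differ), UNION ALL is equivalent and leaves the
planner free to prune the excluded branch and use the index. A minimal
sketch, assuming the same tables as above:

-- a minimal sketch, assuming the same table_a/table_b as above:
-- UNION ALL avoids the implicit DISTINCT, so the excluded branch collapses
-- to a one-time "false" filter and the backward index scan can feed the LIMIT.
CREATE VIEW test_ab_all AS
  select 'a' as src, x from table_a
  UNION ALL
  select 'b' as src, x from table_b;

explain analyze select * from test_ab_all where src='a' order by x desc limit 10;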


impact join syntax ?? and gist index ??

2023-01-07 Thread Marc Millas
Hi,

postgres 12, postgis 3.0

I have a small table A, 11 rows, with a varchar column x and a geometry
column y.
There is a gist index on the geometry column.
The geometry contains multipolygons (regions on a map).
I have a second table B, same structure, around 420 000 rows.
No index;
the geometry contains points.
All geometries are in SRID 4326.

If I ask to count the points in each multipolygon:

select A.x, count(B.x) from A, B where st_within(B.y, A.y) group by A.x;
it takes 11 seconds (everything in shared buffers).
If I do the very same thing as:
select A.x, count(B.x) from A left join B on st_within(B.y, A.y) group by
A.x;
same result, but 85 seconds (everything in shared buffers, again).
If I rerun it with explain (analyze, buffers), the plan is very different.


If I create a gist index on the geometry column of the big table,
both syntaxes take 21 seconds.

I get the feeling I am missing something... (at least 2 things...)
Can someone shed some light??

thanks


Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com


Re: impact join syntax ?? and gist index ??

2023-01-07 Thread Marc Millas
Yes, I know the 2 syntaxes provide a different result: one provides the 6
meaningful lines, while the left join adds 5 lines with a count of 0...
...

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Sat, Jan 7, 2023 at 8:46 PM Marc Millas  wrote:

> Hi,
>
> postgres 12, postgis 3.0
>
> I have a small table A, 11 rows, with a varchar column x and a geometry
> column y.
> There is a gist index on the geometry column.
> The geometry contains multipolygons (regions on a map).
> I have a second table B, same structure, around 420 000 rows.
> No index;
> the geometry contains points.
> All geometries are in SRID 4326.
>
> If I ask to count the points in each multipolygon:
>
> select A.x, count(B.x) from A, B where st_within(B.y, A.y) group by A.x;
> it takes 11 seconds (everything in shared buffers).
> If I do the very same thing as:
> select A.x, count(B.x) from A left join B on st_within(B.y, A.y) group by
> A.x;
> same result, but 85 seconds (everything in shared buffers, again).
> If I rerun it with explain (analyze, buffers), the plan is very different.
>
>
> If I create a gist index on the geometry column of the big table,
> both syntaxes take 21 seconds.
>
> I get the feeling I am missing something... (at least 2 things...)
> Can someone shed some light??
>
> thanks
>
>
> Marc MILLAS
> Senior Architect
> +33607850334
> www.mokadb.com
>
>


Re: impact join syntax ?? and gist index ??

2023-01-07 Thread Marc Millas
On postgres 15 and postgis 3.3, with the very same dataset,
without a gist index on the 420k-row table, the syntax with the left join
takes 25 seconds, and without it 770 ms.
So to get the 5 empty lines it's 30 times slower.
If I add the gist index, both syntaxes take 770 ms...

At least this closes the discussion about the versions my project will use
:-)


Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Sat, Jan 7, 2023 at 8:46 PM Marc Millas  wrote:

> Hi,
>
> postgres 12, postgis 3.0
>
> I have a small table A, 11 rows, with a varchar column x and a geometry
> column y.
> There is a gist index on the geometry column.
> The geometry contains multipolygons (regions on a map).
> I have a second table B, same structure, around 420 000 rows.
> No index;
> the geometry contains points.
> All geometries are in SRID 4326.
>
> If I ask to count the points in each multipolygon:
>
> select A.x, count(B.x) from A, B where st_within(B.y, A.y) group by A.x;
> it takes 11 seconds (everything in shared buffers).
> If I do the very same thing as:
> select A.x, count(B.x) from A left join B on st_within(B.y, A.y) group by
> A.x;
> same result, but 85 seconds (everything in shared buffers, again).
> If I rerun it with explain (analyze, buffers), the plan is very different.
>
>
> If I create a gist index on the geometry column of the big table,
> both syntaxes take 21 seconds.
>
> I get the feeling I am missing something... (at least 2 things...)
> Can someone shed some light??
>
> thanks
>
>
> Marc MILLAS
> Senior Architect
> +33607850334
> www.mokadb.com
>
>


Re: impact join syntax ?? and gist index ??

2023-01-07 Thread Marc Millas
Here they are (I replaced the column and table names). I also post 2 more
remarks: one on the left join, and one on the test I did on postgres 15 /
postgis 3.3...

2023=# explain (analyze, buffers) select A.x, count(B.x) from A left join B
on st_within(B.geom, A.geom) group by A.x;
QUERY PLAN
--
 GroupAggregate  (cost=212638398.98..212701792.16 rows=20 width=16) (actual
time=86717.857..86757.820 rows=11 loops=1)
   Group Key: A.x
   Buffers: shared hit=4243867
   ->  Sort  (cost=212638398.98..212659529.97 rows=8452398 width=16)
(actual time=86717.851..86727.334 rows=421307 loops=1)
 Sort Key: A.x
 Sort Method: quicksort  Memory: 37963kB
 Buffers: shared hit=4243867
 ->  Nested Loop Left Join  (cost=0.00..211521459.31 rows=8452398
width=16) (actual time=17.473..86642.332 rows=421307 loops=1)
   Join Filter: st_within(B.geom, A.geom)
   Rows Removed by Join Filter: 4229377
   Buffers: shared hit=4243867
   ->  Seq Scan on A  (cost=0.00..9.20 rows=20 width=17752)
(actual time=0.009..0.043 rows=11 loops=1)
 Buffers: shared hit=9
   ->  Materialize  (cost=0.00..22309.83 rows=422789 width=40)
(actual time=0.001..23.392 rows=422789 loops=11)
 Buffers: shared hit=15968
 ->  Seq Scan on B  (cost=0.00..20195.89 rows=422789
width=40) (actual time=0.006..57.651 rows=422789 loops=1)
   Buffers: shared hit=15968
 Planning Time: 0.693 ms
 Execution Time: 86763.087 ms
(19 rows)


2023=# explain (analyze, buffers) select A.x, count(B.x) from A, B where
st_within(B.geom, A.geom) group by A.x;

 QUERY PLAN

 Finalize GroupAggregate  (cost=6301606.00..6301608.60 rows=20 width=16)
(actual time=11857.363..11863.212 rows=6 loops=1)
   Group Key: A.x
   Buffers: shared hit=2128836
   ->  Gather Merge  (cost=6301606.00..6301608.30 rows=20 width=16) (actual
time=11857.359..11863.207 rows=12 loops=1)
 Workers Planned: 1
 Workers Launched: 1
 Buffers: shared hit=2128836
 ->  Sort  (cost=6300605.99..6300606.04 rows=20 width=16) (actual
time=11840.355..11840.356 rows=6 loops=2)
   Sort Key: A.x
   Sort Method: quicksort  Memory: 25kB
   Worker 0:  Sort Method: quicksort  Memory: 25kB
   Buffers: shared hit=2128836
   ->  Partial HashAggregate  (cost=6300605.36..6300605.56
rows=20 width=16) (actual time=11840.331..11840.332 rows=6 loops=2)
 Group Key: A.x
 Buffers: shared hit=2128825
 ->  Nested Loop  (cost=0.13..6275745.36 rows=4971999
width=16) (actual time=0.505..11781.817 rows=210651 loops=2)
   Buffers: shared hit=2128825
   ->  Parallel Seq Scan on B  (cost=0.00..18454.99
rows=248699 width=40) (actual time=0.005..22.859 rows=211395 loops=2)
 Buffers: shared hit=15968
   ->  Index Scan using A_geom_idx on A
(cost=0.13..25.15 rows=1 width=17752) (actual time=0.054..0.055 rows=1
loops=422789)
 Index Cond: (geom ~ B.geom)
 Filter: st_within(B.geom, geom)
 Rows Removed by Filter: 0
 Buffers: shared hit=2112857
 Planning Time: 0.252 ms
 Execution Time: 11863.357 ms
(26 rows)




Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Sat, Jan 7, 2023 at 9:40 PM Erik Wienhold  wrote:

> > On 07/01/2023 20:46 CET Marc Millas  wrote:
> >
> > Hi,
> >
> > postgres 12, postgis 3.0
> >
> > I have a small table A, 11 rows with a varchar column x and a geometry
> column y.
> > gist index on the geometry column.
> > the geometry contains multipolygons (regions on a map).
> > I have a second table B, same structure, around 420 000 rows.
> > no index;
> > the geometry contains points.
> > all geometries are in SRID 4326.
> >
> > If I ask to count the points in each multipolygon:
> >
> > select A.x, count(B.x) from A, B where st_within(B.y, A.y) group by A.x;
> > it takes 11 seconds (everything in shared buffers).
> > If I do the very same thing as:
> > select A.x, count(B.x) from A left join B on st_within(B.y, A.y) group
> by A.x;
> > same result, but 85 seconds (everything in shared buffers, again)
> > if I redo asking with explain analyze, buffers, the pl

Re: impact join syntax ?? and gist index ??

2023-01-07 Thread Marc Millas
I read your answer, Tom, but I cannot connect it to my measurements: why
did adding the index make the query twice as slow??

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Sat, Jan 7, 2023 at 10:33 PM Tom Lane  wrote:

> Marc Millas  writes:
> > 2023=# explain (analyze, buffers) select A.x, count(B.x) from A left
> join B
> > on st_within(B.geom, A.geom) group by A.x;
>
> So the problem with this is that the only decently-performant way to
> do the join is like
>
> >  ->  Nested Loop  (cost=0.13..6275745.36
> rows=4971999 width=16) (actual time=0.505..11781.817 rows=210651 loops=2)
> >->  Parallel Seq Scan on B
> (cost=0.00..18454.99 rows=248699 width=40) (actual time=0.005..22.859
> rows=211395 loops=2)
> >->  Index Scan using A_geom_idx on A
> (cost=0.13..25.15 rows=1 width=17752) (actual time=0.054..0.055 rows=1
> loops=422789)
> >  Index Cond: (geom ~ B.geom)
> >  Filter: st_within(B.geom, geom)
>
> (Ignore the parallelism, it's not very relevant here.)  There's no
> chance for merge or hash join because those require simple equality
> join conditions.  The only way to avoid a stupid
> compare-every-row-of-A-to-every-row-of-B nestloop is to use a
> parameterized inner indexscan, as this plan does.  But that only works
> if the join is inner or has the indexed table on the nullable side.
> We have no support for nestloop right join, which is what would be
> needed to make things run fast with no index on B.
>
> regards, tom lane
>
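In other words (a hedged restatement of the above): without an index on B's
geometry there is no parameterized inner path for the left join at all, so
the planner falls back to comparing every row of A to every row of B. The
index the thread already mentions is what gives the fast plan; a sketch,
using the anonymized names A, B and geometry column y from this thread:

-- with B indexed, the nestloop can put B on the inner (nullable) side and
-- probe the index once per row of A, for both the inner and the left join:
CREATE INDEX IF NOT EXISTS b_y_gist ON B USING gist (y);
ANALYZE B;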


alter table impact on view

2023-01-30 Thread Marc Millas
Hello,

To my understanding, if I do alter table rename column, Postgres changes the
name of said column and modifies the views using that table so that they
keep working (NOT Oracle behaviour...).
Fine.
But if I alter table to change a column that is a varchar(20) into a
varchar(21),
Postgres refuses, saying that it cannot because of the view's return rule
using said column.

Why?? As the view is not a materialized object, the length of a column of
an underlying table clearly changes the description of the view, but I
don't see where the difficulty is "hidden". Can someone
enlighten me?

thanks,

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com


Re: alter table impact on view

2023-01-30 Thread Marc Millas
A bit sad.

Thanks..

On Mon, 30 Jan 2023 at 13:53, David Rowley  wrote:

> On Tue, 31 Jan 2023 at 01:14, Marc Millas  wrote:
> > But if I alter table to change a column that is a varchar 20 into a
> varchar 21
> > postgres refuse saying that it cannot due to the return rule... using
> said column
> >
> > why ?? as the view is not a materialized object, the impact of the
> length of a column of an underlying table do change the description of the
> view, clearly, but I dont see where the difficulty is "hidden". Can someone
> enlighten me?
>
> Primarily because nobody has written the required code.
>
> In [1], which is now quite old, there was some discussion about
> various aspects of making this better. Perhaps changing the typmod is
> easier than changing the type completely, but we still don't have any
> code for it. So for now, you're just stuck manually dropping and
> recreating your views.
>
> David
>
> [1]
> https://www.postgresql.org/message-id/603c8f070807291912x37412373q7cd7dc36dd55a...@mail.gmail.com
>
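Until such code exists, the manual drop-and-recreate David mentions can at
least be made atomic; a hedged sketch, with hypothetical object names:

-- hypothetical names; doing it in one transaction means concurrent readers
-- never observe the view as missing:
BEGIN;
DROP VIEW my_view;
ALTER TABLE my_table ALTER COLUMN my_col TYPE varchar(21);
CREATE VIEW my_view AS SELECT my_col FROM my_table;
COMMIT;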


Re: Best Open Source OS for Postgresql

2023-01-31 Thread Marc Millas
Sorry for the inappropriate "reply".

If you check the Debian postgis repo, you'll find that it's NOT possible
to choose a postgis version.
It's possible for postgis 2.4 and 2.5, but then ALL 3.x versions are
inaccessible except one, which changed from time to time.
(You MUST ask for postgis 3 without being able to ask for 3.0 or 3.1 or...;
it's like asking for postgres 9 without .5 or .6.)
2 of my customers reverted to a RH-family Linux because they had been
unable to install the requested postgres/postgis version on Debian.
When I asked the team, the reply was: we cannot package for all cross
possibilities (i.e. 5 postgres x 6 postgis, less some impossibilities
according to the postgis matrix).

so...


Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Tue, Jan 31, 2023 at 1:23 PM hubert depesz lubaczewski 
wrote:

> On Tue, Jan 31, 2023 at 01:18:08PM +0100, Marc Millas wrote:
> > Did you check postgis debian repo? ??
>
> Not sure why:
> 1. you ask me that
> 2. you ask me that off list
>
> but no, i haven't.
>
> depesz
>


Re: Best Open Source OS for Postgresql

2023-02-01 Thread Marc Millas
Hello,
What about postgis :
3.0,
3.3,
3.4
??

On Wed, 1 Feb 2023 at 07:20, Tony Shelver  wrote:

>
> Copied to the list
> On Wed, 1 Feb 2023 at 08:18, Tony Shelver  wrote:
>
>>
>>
>> On Wed, 1 Feb 2023 at 08:04, Tony Shelver  wrote:
>>
>>>
>>> On Tue, 31 Jan 2023 at 15:10, Marc Millas 
>>> wrote:
>>>
>>>> Sorry for the inappropriate "reply".
>>>>
>>>> If you check the Debian postgis repo, you'll find that it's NOT
>>>> possible to choose a postgis version.
>>>> It's possible for postgis 2.4 and 2.5, but then ALL 3.x versions are
>>>> inaccessible except one, which changed from time to time.
>>>> (You MUST ask for postgis 3 without being able to ask for 3.0 or 3.1
>>>> or...; it's like asking for postgres 9 without .5 or .6.)
>>>> 2 of my customers reverted to a RH-family Linux because they had been
>>>> unable to install the requested postgres/postgis version on Debian.
>>>> When I asked the team, the reply was: we cannot package for all cross
>>>> possibilities (i.e. 5 postgres x 6 postgis, less some impossibilities
>>>> according to the postgis matrix).
>>>>
>>>>
>> Maybe I am not understanding this, but have you checked the UbuntuGIS
>> source?  I know not pure Debian, but...
>>
>> Sent before complete:
>>
>> UbuntuGIS stable shows PostGIS 2.4 and 2.5 available, as well as 3.1 and
>> 3.2  :
>> https://launchpad.net/%7Eubuntugis/+archive/ubuntu/ppa/+index?batch=75&memo=75&start=75
>> Got this link from https://wiki.ubuntu.com/UbuntuGIS  via
>> https://postgis.net/install/
>> Also note that UbuntuGIS is based off the DebianGIS project:
>> https://wiki.debian.org/DebianGis
>>
>> We run both Ubuntu and Centos servers.  The change to Centos licensing
>> has led us to support it only for one specific application that is embedded
>> in Centos that we are trying to end-of-life ASAP.  At least Ubuntu server
>> has a 5 year support window, and Ubuntu has now announced a limited 'free'
>> Pro option for smaller businesses that provides a 10 year window.
>>
>>
>> Regards
>>
>>
>>>


Re: Best Open Source OS for Postgresql

2023-02-01 Thread Marc Millas
If I remember well, I tried every repo I ever heard of, plus basic Google
searches,
till I asked someone through a postgis mailing list.
If you find something, like how to install a postgres 12 with a postgis 3.0
on any Debian-based distro, please tell, I will read with interest. Also,
please, without recompiling the whole thing with all associated libs...
(which was my customer's requirement).
Thanks

On Tue, 31 Jan 2023 at 17:38, Adrian Klaver  wrote:

> On 1/31/23 05:09, Marc Millas wrote:
> > Sorry for the inappropriate "reply".
> >
> > If you check the Debian postgis repo, you'll find that it's NOT
> > possible to choose a postgis version.
>
> Are you talking about the repo hosted by Debian or the Postgres
> community repo's here:
>
> https://www.postgresql.org/download/linux/debian/
>
> > It's possible for postgis 2.4 and 2.5, but then ALL 3.x versions are
> > inaccessible except one, which changed from time to time.
> > (You MUST ask for postgis 3 without being able to ask for 3.0 or 3.1
> > or...; it's like asking for postgres 9 without .5 or .6.)
> > 2 of my customers reverted to a RH-family Linux because they had been
> > unable to install the requested postgres/postgis version on Debian.
> > When I asked the team, the reply was: we cannot package for all cross
> > possibilities (i.e. 5 postgres x 6 postgis, less some impossibilities
> > according to the postgis matrix).
> >
> > so...
> >
> >
> > Marc MILLAS
> > Senior Architect
> > +33607850334
> > www.mokadb.com <http://www.mokadb.com>
> >
> >
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>


Re: Postgres undeterministically uses a bad plan, how to convince it otherwise?

2023-02-17 Thread Marc Millas
If I may...
this answer looks more "philosophical" than "practical".
Oracle (maybe 10, I don't remember well) introduced the possibility
to explicitly store an execution plan, so that a given query uses THAT plan,
i.e. doesn't go through the planner's job.
OK, if someone does stupid things, one may get stupid results... it was an
"experts only" functionality :-)
Still, it was very useful to manage the rare cases where the
planner cannot, for whatever reason, do a good job.

OK, it's not the way postgres behaves. Still, in some cases...



Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Thu, Feb 16, 2023 at 5:08 PM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Thu, Feb 16, 2023 at 8:48 AM cen  wrote:
>
>>
>> - does the planner take previous runs of the same query and it's
>> execution time into account? If not, why?
>>
>
> No, because that isn't how it works.  And while I'm no planner expert I'm
> not imagining any particularly compelling argument for why it would even
> make sense to try.  The book-keeping would be expensive and dealing with
> supposedly an ever-changing dataset would in many cases make any such
> comparisons be meaningless.
>
>
>> - assuming the query to be immutable, would it be possible for the
>> planner to microbenchmark a few different plans instead of trying to
>> estimate the cost?
>> As in, actually executing the query with different plans and caching the
>> best one.
>>
>
> No, the planner may not cause execution.  While I could imagine extending
> EXPLAIN to somehow retrieve and maybe even try alternative plans that have
> been fully constructed today I'm not holding my breath.
>
> There is little reason for the project to give any real weight to
> "assuming the query to be immutable".  We do want to fix the planner to
> behave better if it is mis-behaving, otherwise you do have access to cost
> parameters, and potentially other planner toggles if you've truly run into
> an intractable problem.
>
> David J.
>
>


pg_reload_conf()

2023-02-20 Thread Marc Millas
Hi,



maybe I am missing something...

with superuser rights, on a postgres 14 and postgres 15 version:
select  setting, boot_val, reset_val from pg_settings where
name='log_connections';
off off off
alter system set log_connections to 'on';
select  setting, boot_val, reset_val from pg_settings where
name='log_connections';
off off off
... strange
select pg_reload_conf();
t
select  setting, boot_val, reset_val from pg_settings where
name='log_connections';
off off off

very strange.

cat postgresql.auto.conf
log_connections = 'on'

obviously, if I stop and restart the DB,
select  setting, boot_val, reset_val from pg_settings where
name='log_connections';
on off on

So I am perplexed: what is pg_reload_conf() doing / not doing?

thanks


Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com


Re: pg_reload_conf()

2023-02-20 Thread Marc Millas
Thanks Adrian,

The fact that the changes are visible only to new backends, and NOT to
the backend which made the changes, was what I had missed.

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Mon, Feb 20, 2023 at 6:01 PM Adrian Klaver 
wrote:

> On 2/20/23 08:57, Marc Millas wrote:
> > select  setting, boot_val, reset_val from pg_settings where
> > name='log_connections';
>
> # select  setting, boot_val, reset_val, context from pg_settings where
> name='log_connections';
>   setting | boot_val | reset_val |  context
> -+--+---+---
>   on  | off  | on| superuser-backend
>
>
> https://www.postgresql.org/docs/15/view-pg-settings.html
>
> There are several possible values of context. In order of decreasing
> difficulty of changing the setting, they are:
>
>
> ...
>
> superuser-backend
>
>  Changes to these settings can be made in postgresql.conf without
> restarting the server. They can also be set for a particular session in
> the connection request packet (for example, via libpq's PGOPTIONS
> environment variable), but only if the connecting user is a superuser or
> has been granted the appropriate SET privilege. However, these settings
> never change in a session after it is started. If you change them in
> postgresql.conf, send a SIGHUP signal to the postmaster to cause it to
> re-read postgresql.conf. The new values will only affect
> subsequently-launched sessions.
>
> ...
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
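A quick way to see this class of behaviour coming is to look at the
setting's context column up front; a small sketch:

-- context tells you how the setting can be changed; for a superuser-backend
-- setting the current session keeps its old value after a reload, but a
-- brand-new session picks up the new one:
select name, setting, context, pending_restart
from pg_settings
where name = 'log_connections';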
>


Re: Oracle to PostgreSQL Migration

2023-03-21 Thread Marc Millas
EDB does have a replication server which can be used to transfer real-time
data from Oracle to Postgres.
I don't know if it can be used to get to "no downtime".
BTW, what do you call "no downtime", as any switchover, as fast as it can
be, does take "some" time?


Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Mon, Mar 20, 2023 at 1:58 PM Inzamam Shafiq 
wrote:

> Hi,
>
> Hope everyone is fine.
>
> Can someone help or guide regarding Open Source tools for Oracle to
> PostgreSQL migration with real time CDC. along with this is there any
> possibility to change the structure of the database? Let me explain a
> little more,
>
> We have an Oracle DB which is around 1TB and we want to migrate to
> PostgreSQL that have a new table structure, so we want to perform data
> transformation and real time CDC from Oracle to PostgreSQL. Do we have any
> good open source tool to achieve this with No Coding involved.??
>
> Thanks.
>
> Regards,
>
> *Inzamam Shafiq*
> *Sr. DBA*
>


Size 0 in pg_stat_file, pg_stat_progress_copy

2023-04-14 Thread Marc Millas
Hi,

On a Debian machine, with a Postgres 14.2 server,
the logs are in a dedicated directory (not log/).
When logged in as superuser, I get:
-- pg_stat_file for the current logfile says size 0,
-- pg_ls_logdir answers 0 files,
-- pg_ls_dir for the log directory provides the postgres.csv, postgres.json, ...
list of files,
-- and when running a COPY from the current logfile with a where clause to
choose only the current day's logs, the bytes_total column is 0.

What can produce such behaviour?

thanks,


Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com


Re: Size 0 in pg_stat_file, pg_stat_progress_copy

2023-04-14 Thread Marc Millas
Sorry...
someone set up the log as a named pipe...

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Fri, Apr 14, 2023 at 4:26 PM Marc Millas  wrote:

> Hi,
>
> On a Debian machine, with a Postgres 14.2 server,
>
> the logs are in a dedicated directory (not log/).
> When logged in as superuser, I get:
> -- pg_stat_file for the current logfile says size 0,
> -- pg_ls_logdir answers 0 files,
> -- pg_ls_dir for the log directory provides the postgres.csv,
> postgres.json, ... list of files,
> -- and when running a COPY from the current logfile with a where clause to
> choose only the current day's logs, the bytes_total column is 0.
>
> What can produce such behaviour?
>
> thanks,
>
>
> Marc MILLAS
> Senior Architect
> +33607850334
> www.mokadb.com
>
>


missing something about json syntax

2023-04-20 Thread Marc Millas
Hi,

postgres 15

Looks like I am missing something, maybe obvious :-(
In a table with a json column (_data), if I ask psql to select _data from
mytable with a where clause to get only one line, I get something
beginning with
 
{"time":"2023-04-19T16:28:01.19780551+02:00","stream":"stderr","_p":"F","log":"{\"level\":\"info\",\"ts\":\"2023-04-19T14:28:01Z\",\"logger\":\"_audit\",\"msg\":\"record\",\"logging_pod\":\"cluster-pgsql\",\"record\":{\"log_time\":\"2023-04-19
14:28:01.197 UTC\",\
etc...
If I create table anothertable as select _data as _data from mytable, it
creates and feeds that new table with all the appropriate data, and when I
ask psql \d anothertable it says that it's a table with a json column named
_data.
Fine!

Now if I select json_object_keys(_data) from mytable, I get a list of keys:
time, stream, _p, log. Fine.
Now, if I select json_object_keys(_data) from anothertable, I get an error:
cannot call json_object_keys on a scalar.

???
Both columns are fed and of type json, and postgres didn't throw any error
feeding them.
If I create a table with a jsonb column and feed it with the
anothertable json column, same thing: fine... but still unusable.

And it's unusable with all the other ways I did try, like simply select
_data->'log'->>'level' from mytable, or select _data->'level' from
anothertable.

Sure, if I look at the json field, one is shown as { "tag": "value", ...
and the other is shown as "{\"tag\":\"value\", ...

Not the very same.

So, 2 questions:
1) How can postgres feed a json or jsonb column and yet NOT be able to use
the values in it??
2) How to "transform" the inappropriate json into a usable one?


of course, if what I am missing is very obvious, I apologize...

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com


Re: missing something about json syntax

2023-04-20 Thread Marc Millas
Thanks for your input.

select (_data->>'log')::json->'level' from mytable;
This does work.
But it doesn't explain how postgres is able to put a scalar in a json or
jsonb column without problems:
I don't understand how this ('"{\"t\"}"') can be considered a valid enough
json to be inserted in a json column
and at the same time invalid for all other json uses.
It is just as if postgres were allowing me to insert things that are not of
the column type.

It's the first time I have encountered this kind of behaviour from postgres.





Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Thu, Apr 20, 2023 at 7:47 PM Erik Wienhold  wrote:

> > On 20/04/2023 18:35 CEST Marc Millas  wrote:
> >
> > Hi,
> >
> > postgres 15
> >
> > looks Iike I am missing something, maybe obvious :-(
> > In a table with a json column (_data) if I ask psql to select _data from
> > mytable with a where clause to get only one line,, I get something
> beginning
> > by
> >
> {"time":"2023-04-19T16:28:01.19780551+02:00","stream":"stderr","_p":"F","log":"{\"level\":\"info\",\"ts\":\"2023-04-19T14:28:01Z\",\"logger\":\"_audit\",\"msg\":\"record\",\"logging_pod\":\"cluster-pgsql\",\"record\":{\"log_time\":\"2023-04-19
> 14:28:01.197 UTC\",\
> > etc...
>
> The value of property "log" is a string, not an object.  Notice the escaped
> double quotes (\").
>
> > if I create table anothertable as select _data as _data from mytable, it
> > creates and feed that new table with all the appropriate data, and when
> I ask
> > psql \d anothertable it says that its a table with a json column.named
> _data.
> > fine !
> >
> > now if I select json_object_keys(_data) from mytable, I get a list of
> tags.
> > time, stream, _p, log, fine.
> > now, if i select json_object_keys(_data) from anothettable, I get an
> error:
> > cannot call json_objet_keys on a scalar..
> >
> > ???
> > both columns are fed and of type json. and postgres didn't throw any
> error
> > feeding them.
> > if I create a table with a jsonb column and feed it with the
> anothertable json
> > column, same, fine... but still unusable.
> >
> > and unusable with all the other ways I did try, like simply
> > select _data->'log'->>'level' from mytable, or
> > select _data->'level' from anothertable
> >
> > sure if I look at the json field one is showed { "tag": "value", ...
> > and the other is showed "{\"tag\":\"value\", ...
>
> You executed
>
> create table anothertable as select _data->'log' as _data from
> mytable;
>
> and not
>
> create table anothertable as select _data as _data from mytable;
>
> So you end up with the scalar value of property "log" in
> anothertable._data.
>
> > not the very same
> >
> > so 2 questions:
> > 1) how postgres can feed a json or jsonb column and CANNOT use the
> values in
> >it ??
> > 2) how to "transform" the inappropriate json into a usable one ?
> >
> > of course, if what I am missing is very obvious, I apologize...
>
> Get the log value with operator ->> and cast the returned text to json:
>
> select (_data->>'log')::json->'level' from mytable;
>
> --
> Erik
>


Re: missing something about json syntax

2023-04-21 Thread Marc Millas
Ok, thanks.

On Thu, 20 Apr 2023 at 22:42, Tom Lane  wrote:

> Marc Millas  writes:
> > but it doesnt explain how postgres is able to put a scalar in a json or
> > jsonb column without pb:
> > I don't understand how this ('"{\"t\"}"') can be considered  a valid
> enough
> > json to be inserted in a json column
> > and at the same time invalid for all other json uses.
>
> That's a bare string (it's not an object).  That's valid JSON per
> RFC 7159:
>
>JSON can represent four primitive types (strings, numbers, booleans,
>and null) and two structured types (objects and arrays).
>...
>A JSON text is a serialized value.  Note that certain previous
>specifications of JSON constrained a JSON text to be an object or an
>array.
>
> However, there certainly are some operations that require the top-level
> value to be an object or array.
>
> regards, tom lane
>
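A tiny demo of the distinction Tom describes, using json_typeof:

-- the escaped value from this thread is a JSON *string* (a scalar), while
-- the unescaped form is a JSON *object*; both are valid JSON documents:
select json_typeof('"{\"t\"}"'::json);      -- 'string'
select json_typeof('{"t": 1}'::json);       -- 'object'
select json_object_keys('{"t": 1}'::json);  -- works, returns 't'
-- json_object_keys('"{\"t\"}"'::json) raises:
--   cannot call json_object_keys on a scalar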


Death postgres

2023-05-05 Thread Marc Millas
Hi,

Postgres 14.2 on Red Hat Linux.

temp_file_limit set around 210 GB.

A select request with 2 left joins has crashed the server (OOM killer)
after the postgres disk occupation grew from 15 TB to 16 TB.

What are the cases where postgres may grow without caring about
temp_file_limit?

thanks,




Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com
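One documented subtlety that may matter here (a hedged note): per the
manual, temp_file_limit caps temporary-file space per *process*, not per
query, and each parallel worker is its own process.

-- a plan with W parallel workers can therefore write up to
-- (W + 1) x temp_file_limit of temporary files before any single process
-- errors out:
show temp_file_limit;                   -- ~210 GB in the setup described above
show max_parallel_workers_per_gather;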


Re: Death postgres

2023-05-06 Thread Marc Millas
On Sat, 6 May 2023 at 09:46, Peter J. Holzer  wrote:

> On 2023-05-06 03:14:20 +0200, Marc Millas wrote:
> > postgres 14.2 on Linux redhat
> >
> > temp_file_limit set around 210 GB.
> >
> > a select request with 2 left join have crashed the server (oom killer)
> after
> > the postgres disk occupation did grow from 15TB to 16 TB.
>
> temp_file_limit limits the space a process may use on disk while the OOM
> killer gets activated when the system runs out of RAM. So these seem to
> be unrelated.
>
> hp
>
It's clear that the OOM killer is triggered by RAM and temp_file_limit is a
disk thing...
But the sudden growth of disk space usage and RAM happened at exactly the
same time, with only one user connected and only one query running...

>
> --
>_  | Peter J. Holzer| Story must make more sense than reality.
> |_|_) ||
> | |   | h...@hjp.at |-- Charles Stross, "Creative writing
> __/   | http://www.hjp.at/ |   challenge!"
>


Re: Death postgres

2023-05-06 Thread Marc Millas
On Sat, 6 May 2023 at 06:18, Adrian Klaver  wrote:

> On 5/5/23 18:14, Marc Millas wrote:
> > Hi,
> >
> > postgres 14.2 on Linux redhat
> >
> > temp_file_limit set around 210 GB.
> >
> > a select request with 2 left join have crashed the server (oom killer)
> > after the postgres disk occupation did grow from 15TB to 16 TB.
>
> The result of EXPLAIN  would be helpful.
> Sure!
>
But one of the tables has looked "inaccessible" since.
I.e. even an explain select * from the_table didn't answer and had to be
killed with Ctrl-C.

> >
> > What are the cases where postgres may grow without caring about
> > temp_file_limit ?
> >
> > thanks,
> >
> >
> >
> >
> > Marc MILLAS
> > Senior Architect
> > +33607850334
> > www.mokadb.com <http://www.mokadb.com>
> >
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>


Re: Death postgres

2023-05-06 Thread Marc Millas
On Sat, 6 May 2023 at 15:15, Ron  wrote:

> On 5/6/23 07:19, Marc Millas wrote:
>
>
>
> On Sat, 6 May 2023 at 09:46, Peter J. Holzer  wrote:
>
>> On 2023-05-06 03:14:20 +0200, Marc Millas wrote:
>> > postgres 14.2 on Linux redhat
>> >
>> > temp_file_limit set around 210 GB.
>> >
>> > a select request with 2 left join have crashed the server (oom killer)
>> after
>> > the postgres disk occupation did grow from 15TB to 16 TB.
>>
>
> "15TB" and "16TB" are pretty low-resolution.  For example, 15.4TB rounds
> *down* to 15TB, while 15.6TB rounds *up* to 16TB, while they are in fact
> only 200GB apart.
>
> Heck, even 15.4TB and 15.6TB are low-resolution.  temp_file_limit may
> actually be working.
>

It was... 15.2 and became 16.3...

>
>
>> temp_file_limit limits the space a process may use on disk while the OOM
>> killer gets activated when the system runs out of RAM. So these seem to
>> be unrelated.
>>
>> hp
>>
> It's clear that the OOM killer is triggered by RAM and temp_file_limit is
> a disk thing...
> But the sudden growth of disk space usage and RAM happened at exactly
> the same time, with only one user connected and only one query
> running...
>
>
> If your question is about temp_file_limit, don't distract us with OOM
> issues.
>
> --
> Born in Arizona, moved to Babylonia.
>


Re: Death postgres

2023-05-06 Thread Marc Millas
On Sat, 6 May 2023 at 15:15, Ron  wrote:

> On 5/6/23 07:19, Marc Millas wrote:
>
>
>
> On Sat, 6 May 2023 at 09:46, Peter J. Holzer  wrote:
>
>> On 2023-05-06 03:14:20 +0200, Marc Millas wrote:
>> > postgres 14.2 on Linux redhat
>> >
>> > temp_file_limit set around 210 GB.
>> >
>> > a select request with 2 left join have crashed the server (oom killer)
>> after
>> > the postgres disk occupation did grow from 15TB to 16 TB.
>>
>
> "15TB" and "16TB" are pretty low-resolution.  For example, 15.4TB rounds
> *down* to 15TB, while 15.6TB rounds *up* to 16TB, while they are in fact
> only 200GB apart.
>
> Heck, even 15.4TB and 15.6TB are low-resolution.  temp_file_limit may
> actually be working.
>
>
>> temp_file_limit limits the space a process may use on disk while the OOM
>> killer gets activated when the system runs out of RAM. So these seem to
>> be unrelated.
>>
>> hp
>>
> It's clear that the OOM killer is triggered by RAM and temp_file_limit is
> a disk thing...
> But the sudden growth of disk space usage and RAM happened at exactly
> the same time, with only one user connected and only one query
> running...
>
>
> If your question is about temp_file_limit, don't distract us with OOM
> issues.
>
My question is how postgres can use space without caring about
temp_file_limit. The OOM info is a kind of hint about the context since, as
said, one select generated both things.

>
> --
> Born in Arizona, moved to Babylonia.
>


Re: Death postgres

2023-05-06 Thread Marc Millas
On Sat, 6 May 2023 at 18:11, Adrian Klaver  wrote:

> On 5/6/23 05:25, Marc Millas wrote:
> >
> >
> > Le sam. 6 mai 2023 à 06:18, Adrian Klaver  > <mailto:adrian.kla...@aklaver.com>> a écrit :
> >
> > On 5/5/23 18:14, Marc Millas wrote:
> >  > Hi,
> >  >
> >  > postgres 14.2 on Linux redhat
> >  >
> >  > temp_file_limit set around 210 GB.
> >  >
> >  > a select request with 2 left join have crashed the server (oom
> > killer)
> >  > after the postgres disk occupation did grow from 15TB to 16 TB.
> >
> > The result of EXPLAIN  would be helpful.
> > Sure!
> >
> > But. One of the table looks "inaccessible" since.
> > Ie. Even explain select * from the_table didnt answer and must be killed
> > by control c
>
> When you restarted the server where there any warnings shown?
>
Sadly, I cannot. It will be done next Tuesday.

>
> Using psql can you \d ?
>
Yes, and no problem checking pg_statistic, ...

>
> Can you select from any other table in the database?
>
Yes

>
>
>
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>


Re: Death postgres

2023-05-07 Thread Marc Millas
On Sat, 6 May 2023 at 21:46, Adrian Klaver  wrote:

> On 5/6/23 10:13, Marc Millas wrote:
> >
> >
>
> > When you restarted the server where there any warnings shown?
> >
> > Sadly, I cannot. Will be done next tuesday.
>
> Cannot do what:
>
> 1) Get to the log to see if there are warnings?
>
> 2) Restart the server?
>
> Your original post said the server crashed.
>
> If that was the case how can you do any of the below without restarting it?
> Infrastructure is managed by another company. The db is within a
> container.
>
So... if I run a pg_ctl restart, the container... vanishes.
So for each main thing, I must ask... by mail...

> >
> >
> > Using psql can you \d ?
> >
> > Yes, and no pb to check pg_statistic,...
> >
> >
> > Can you select from any other table in the database?
> >
> > Yes
> >
> >
> >
> >
> >
> >
> > --
> > Adrian Klaver
> > adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com>
> >
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>


Re: Death postgres

2023-05-10 Thread Marc Millas
On Sun, May 7, 2023 at 8:42 PM Laurenz Albe 
wrote:

> On Sat, 2023-05-06 at 03:14 +0200, Marc Millas wrote:
> > postgres 14.2 on Linux redhat
> >
> > temp_file_limit set around 210 GB.
> >
> > a select request with 2 left join have crashed the server (oom killer)
> after the postgres
> > disk occupation did grow from 15TB to 16 TB.
> >
> > What are the cases where postgres may grow without caring about
> temp_file_limit ?
>
> That's too little information for a decent answer.
> One obvious answer is: if it is not writing temporary files.
>
> Yours,
> Laurenz Albe
>

Logical ...

so here is a little more info:

db=# analyze myschema.table_a;
ANALYZE

db=# with ta as(select 'myschema' as s, 'table_a' as t), p as (select *
from information_schema.columns, ta where table_schema=ta.s and
table_name=ta.t),  tableid as(select relid from pg_stat_user_tables, ta
where schemaname=ta.s and relname=ta.t) select staattnum, column_name,
stanullfrac, stadistinct from tableid, pg_statistic join p on
p.ordinal_position=staattnum where starelid=tableid.relid order by
staattnum;
 staattnum | column_name  | stanullfrac | stadistinct
---+--+-+-
 1 | col_ne |   0 |  -0.6100224
 2 | col_brg|  0.0208 |   6
 3 | col_ano|   0 |  447302
 4 | col_ine|   0 |  -0.5341927
 5 | col_cha|   0 |  11
(5 rows)

db=# select count(*) from myschema.table_a;
  count
--
 13080776
(1 row)

db=# select count(distinct col_ano) from myschema.table_a;
  count
--
 10149937
(1 row)


// The stats look somewhat absurd, as ANALYZE guesses 20 times fewer
distinct values than a select distinct finds on column col_ano...
db=# analyze myschema.table_b;
ANALYZE
db=# with ta as(select 'myschema' as s, 'table_b' as t), p as (select *
from information_schema.columns, ta where table_schema=ta.s and
table_name=ta.t),  tableid as(select relid from pg_stat_user_tables, ta
where schemaname=ta.s and relname=ta.t) select staattnum, column_name,
stanullfrac, stadistinct from tableid, pg_statistic join p on
p.ordinal_position=staattnum where starelid=tableid.relid order by
staattnum;
 staattnum |   column_name   | stanullfrac | stadistinct
---+-+-+-
 1 | col_nerg  |   0 |  161828
 2 | col_ibi   |   0 |  362161
 3 | col_imi   |   0 |  381023
 4 | col_ipi   |   0 |  391915
 5 | col_ne|   0 | -0.53864235
 6 | col_ano   |   0 |  482225
(6 rows)

db=# select count(*) from myschema.table_b;
  count
--
 14811485
(1 row)

db=# select count(distinct col_ano) from myschema.table_b;
  count
--
 10149937
(1 row)

//same remark


db=# explain  select distinct t1.col_ine,  case when t2.col_ibi is null
then t3.col_ibi else t2.col_ibi end  from myschema.table_a t1 left join
myschema.table_b t2 on t1.col_ano=t2.col_ano left join myschema.table_b t3
on t1.col_ne=t3.col_ne;
QUERY PLAN

---
 Unique  (cost=72377463163.02..201012533981.80 rows=1021522829864 width=97)
   ->  Gather Merge  (cost=72377463163.02..195904919832.48
rows=1021522829864 width=97)
 Workers Planned: 5
 ->  Sort  (cost=72377462162.94..72888223577.87 rows=204304565973
width=97)
   Sort Key: t1.col_ine, (CASE WHEN (t2.col_ibi IS NULL) THEN
t3.col_ibi ELSE t2.col_ibi END)
   ->  Parallel Hash Left Join  (cost=604502.76..1276224253.51
rows=204304565973 width=97)
 Hash Cond: ((t1.col_ano)::text = (t2.col_ano)::text)
 ->  Parallel Hash Left Join
 (cost=300803.38..582295.38 rows=4857277 width=52)
   Hash Cond: ((t1.col_ne)::text =
(t3.col_ne)::text)
   ->  Parallel Seq Scan on table_a t1
 (cost=0.00..184052.35 rows=2616335 width=53)
   ->  Parallel Hash  (cost=243466.06..243466.06
rows=2965306 width=31)
 ->  Parallel Seq Scan on table_b t3
 (cost=0.00..243466.06 rows=2965306 width=31)
 ->  Parallel Hash  (cost=243466.06..243466.06
rows=2965306 width=34)
   ->  Parallel Seq Scan on table_b t2
 (cost=0.00..243466.06 rows=2965306 width=34)
 JIT:
   Functions: 19
   Options: Inlining true, Optimization true, Expressions true, Deforming
true
(17 rows)

// So... the planner guesses that those 2 joins will generate 1000 billion
rows...

// So, I try to change the stats:

db=# alter table myschema.table_a alter column col_ano 
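The command above is cut off in the archive; presumably it was a per-column
statistics override. A hypothetical reconstruction, assuming the intent was
to correct the roughly 20x-underestimated distinct count shown earlier:

-- hypothetical sketch (the original command is truncated above):
alter table myschema.table_a alter column col_ano set (n_distinct = 10149937);
analyze myschema.table_a;  -- the override only takes effect at the next ANALYZE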

Re: Death postgres

2023-05-10 Thread Marc Millas
On Wed, May 10, 2023 at 7:24 PM Peter J. Holzer  wrote:

> On 2023-05-10 16:35:04 +0200, Marc Millas wrote:
> >  Unique  (cost=72377463163.02..201012533981.80 rows=1021522829864
> width=97)
> >->  Gather Merge  (cost=72377463163.02..195904919832.48
> rows=1021522829864 width=97)
> ...
> >->  Parallel Hash Left Join
>  (cost=604502.76..1276224253.51 rows=204304565973 width=97)
> >  Hash Cond: ((t1.col_ano)::text = (t2.col_ano)::text)
> ...
> >
> > //so.. the planner guess that those 2 join will generate 1000 billions
> rows...
>
> Are some of the col_ano values very frequent? If say the value 42 occurs
> 1 million times in both table_a and table_b, the join will create 1
> trillion rows for that value alone. That doesn't explain the crash or the
> disk usage, but it would explain the crazy cost (and would probably be a
> hint that this query is unlikely to finish in any reasonable time).
>
> hp
>
good guess, even if a bit surprising: there is one (and only one) "value"
which fits your supposition: NULL.
75 in each table, which perfectly fits the planner's row estimate.
One question: what is postgres doing when it plans to hash 1000 billion
rows?
Did postgres create an appropriate ""space"" to handle those 1000 billion
hash values?
 thanks,
MM

> --
>_  | Peter J. Holzer| Story must make more sense than reality.
> |_|_) ||
> | |   | h...@hjp.at |-- Charles Stross, "Creative writing
> __/   | http://www.hjp.at/ |   challenge!"
>


Re: Death postgres

2023-05-11 Thread Marc Millas
On Thu, May 11, 2023 at 1:56 AM Peter J. Holzer  wrote:

> On 2023-05-10 22:52:47 +0200, Marc Millas wrote:
> > On Wed, May 10, 2023 at 7:24 PM Peter J. Holzer 
> wrote:
> >
> > On 2023-05-10 16:35:04 +0200, Marc Millas wrote:
> > >  Unique  (cost=72377463163.02..201012533981.80 rows=1021522829864
> width=
> > 97)
> > >->  Gather Merge  (cost=72377463163.02..195904919832.48 rows=
> > 1021522829864 width=97)
> > ...
> > >->  Parallel Hash Left Join  (cost=
> > 604502.76..1276224253.51 rows=204304565973 width=97)
> > >  Hash Cond: ((t1.col_ano)::text =
> (t2.col_ano)::text)
> > ...
> > >
> > > //so.. the planner guess that those 2 join will generate 1000
> billions
> > rows...
> >
> > Are some of the col_ano values very frequent? If say the value 42
> occurs
> > 1 million times in both table_a and table_b, the join will create 1
> > trillion rows for that value alone. That doesn't explain the crash
> or the
> > disk usage, but it would explain the crazy cost (and would probably
> be a
> > hint that this query is unlikely to finish in any reasonable time).
> >
> >
> > good guess, even if a bit surprising: there is one (and only one)
> "value" which
> > fit your supposition: NULL
>
> But NULL doesn't equal NULL, so that would result in only one row in the
> left join. So that's not it.
>

if so... how ???

>
> hp
>
> --
>_  | Peter J. Holzer| Story must make more sense than reality.
> |_|_) ||
> | |   | h...@hjp.at |-- Charles Stross, "Creative writing
> __/   | http://www.hjp.at/ |   challenge!"
>
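The semantics Peter refers to are easy to check directly (a small demo):

-- NULL = NULL is not true, so NULL keys never match in an equi-join;
-- a row with a NULL join key survives a LEFT JOIN exactly once, unmatched:
select count(*)
from (values (null::text)) as a(k)
join (values (null::text)) as b(k) on a.k = b.k;        -- 0
select count(*)
from (values (null::text)) as a(k)
left join (values (null::text)) as b(k) on a.k = b.k;   -- 1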


order by

2023-05-11 Thread Marc Millas
Hi,

I keep on investigating the "death postgres" subject,
but I open a new thread as I don't know if it's related to my problem.

I have 2 different clusters, on 2 different machines, one is prod, the
second test.
Same data volumes.

On prod, if I do
select col_a, count(col_a) from table_a group by col_a order by col_a desc,
I get the NULLs on top.
To get the NULLs on top on the test db, I have to
select col_a, count(col_a) from table_a group by col_a order by col_a asc.

So it looks like there is something different within the b-tree operator
class of varchar (?!?)
between those 2 clusters.

What can I check to explain this difference, as to my understanding
it's not a postgresql.conf parameter?

thanks

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com
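Two checks that can narrow this down (a hedged sketch): SQL NULL placement
in ORDER BY is governed by the NULLS FIRST/LAST defaults, not by collation,
so if real NULLs sort differently between the clusters something else is
going on; if the "NULLs" are actually strings, a collation difference
between the databases would explain it.

-- 1) are the "NULLs" real SQL NULLs, empty strings, or the literal text 'NULL'?
select col_a is null as is_null, col_a = '' as is_empty, count(*)
from table_a group by 1, 2;

-- 2) do the two clusters sort text under different collations?
select datname, datcollate, datctype
from pg_database where datname = current_database();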


gather merge

2023-05-11 Thread Marc Millas
Hi,

Another new thread related to "death postgres":
how to stop Gather Merge from going parallel?
I.e. without forcing parallelism down to one thread by limiting
max_parallel_workers(_per_gather).

thanks,



Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com


Re: order by

2023-05-11 Thread Marc Millas
Thanks,

I do know about index options.

That table has NO (zero) indexes.

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Thu, May 11, 2023 at 4:48 PM Adam Scott  wrote:

> Check the index creation has NULLS FIRST (or LAST) on both indexes that
> are used. Use explain to see what indexes are used
>
> See docs for create index:
> https://www.postgresql.org/docs/current/sql-createindex.html
>
> On Thu, May 11, 2023, 7:30 AM Marc Millas  wrote:
>
>> Hi,
>>
>> I keep on investigating on the "death postgres" subject
>> but open a new thread as I don't know if it's related to my pb.
>>
>> I have 2 different clusters, on 2 different machines, one is prod, the
>> second test.
>> Same data volumes.
>>
>> On prod if I do
>> select col_a, count(col_a) from table_a group by col_a order by col_a
>> desc,
>> I get the numbers of NULL on top.
>> To get the number of NULL on top on the test db, I have to
>> select col_a, count(col_a) from table_a group by col_a order by col_a asc.
>>
>> so, it looks like there is something different within the b-tree operator
>> class of varchar (?!?)
>> between those 2 clusters.
>>
>> What can I check to explain this difference, as to my understanding,
>> it's not a postgresql.conf parameter.
>>
>> thanks
>>
>> Marc MILLAS
>> Senior Architect
>> +33607850334
>> www.mokadb.com
>>
>>


Re: order by

2023-05-11 Thread Marc Millas
On Thu, May 11, 2023 at 4:43 PM Adrian Klaver 
wrote:

> On 5/11/23 07:29, Marc Millas wrote:
> > Hi,
> >
> > I keep on investigating on the "death postgres" subject
> > but open a new thread as I don't know if it's related to my pb.
> >
> > I have 2 different clusters, on 2 different machines, one is prod, the
> > second test.
> > Same data volumes.
>
> How can they be sharing the same data 'volume'?
>
Roughly: one table is 1308 lines and the second is 1310 lines;
the data comes from yet another DB.

> Those 2 tables have no indexes. They are used to build kinds of aggregates
> through multiple left joins.
>


> Do you mean you are doing dump/restore between them?
>
no

>
> Postgres version for each cluster is?
14.2
>



> >
> > On prod if I do
> > select col_a, count(col_a) from table_a group by col_a order by col_a
> desc,
> > I get the numbers of NULL on top.
> > To get the number of NULL on top on the test db, I have to
> > select col_a, count(col_a) from table_a group by col_a order by col_a
> asc.
> >
> > so, it looks like there is something different within the b-tree
> > operator class of varchar (?!?)
> > between those 2 clusters.
> >
> > What can I check to explain this difference as, to my understanding,
> > it's not a postgresql.conf parameter.
> >
> > thanks
> >
> > Marc MILLAS
> > Senior Architect
> > +33607850334
> > www.mokadb.com
> >
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>


Re: order by

2023-05-11 Thread Marc Millas
On Thu, May 11, 2023 at 5:23 PM Adrian Klaver 
wrote:

> On 5/11/23 08:00, Marc Millas wrote:
> >
> > On Thu, May 11, 2023 at 4:43 PM Adrian Klaver <adrian.kla...@aklaver.com> wrote:
> >
> > On 5/11/23 07:29, Marc Millas wrote:
> >  > Hi,
> >  >
> >  > I keep on investigating on the "death postgres" subject
> >  > but open a new thread as I don't know if it's related to my pb.
> >  >
> >  > I have 2 different clusters, on 2 different machines, one is
> > prod, the
> >  > second test.
> >  > Same data volumes.
> >
> > How can they be sharing the same data 'volume'?
> >
> >  roughly: one table is 1308 lines and the second is 1310
> > lines, the data comes from yet another DB.
> >
> > those 2 tables have no indexes. they are used to build kind of
> > aggregates thru multiple left joins.
> >
> > Do you mean you are doing dump/restore between them?
> >
> > no
>
> So how is the data getting from the third database to the prod and test
> clusters?
>
> For the machines hosting the third db, the prod and test clusters what
> are?:
>

Should I understand that you're suggesting that the way the data is inserted
does change the behaviour of the ORDER BY clause??

>
> OS
>
> OS version
>
> locale
>
>
> >
> >
> > Postgres version for each cluster is?
> 14.2
>
> FYI, 14.8 has just been released so the clusters are behind by 6 bug fix
> releases.
Sadly... I know.
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>


Re: gather merge

2023-05-11 Thread Marc Millas
So, I set max_parallel_workers_per_gather to 0, and it does work: no more
parallel execution.

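For reference, that setting can be applied per session or cluster-wide; a
minimal sketch of both forms (ALTER SYSTEM needs superuser and a
pg_reload_conf() call to take effect):

set max_parallel_workers_per_gather = 0;                -- current session only
alter system set max_parallel_workers_per_gather = 0;   -- writes postgresql.auto.conf
select pg_reload_conf();
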
Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Thu, May 11, 2023 at 4:38 PM Marc Millas  wrote:

> Hi,
>
> another new thread related to "death postgres":
> how to stop Gather Merge from going parallel?
> i.e. without forcing parallelism down to one thread by limiting
> max_parallel_workers_per_gather?
>
> thanks,
>
>
>
> Marc MILLAS
> Senior Architect
> +33607850334
> www.mokadb.com
>
>


Re: Death postgres

2023-05-11 Thread Marc Millas
On Thu, May 11, 2023 at 1:56 AM Peter J. Holzer  wrote:

> On 2023-05-10 22:52:47 +0200, Marc Millas wrote:
> > On Wed, May 10, 2023 at 7:24 PM Peter J. Holzer 
> wrote:
> >
> > On 2023-05-10 16:35:04 +0200, Marc Millas wrote:
> > >  Unique  (cost=72377463163.02..201012533981.80 rows=1021522829864 width=97)
> > >    ->  Gather Merge  (cost=72377463163.02..195904919832.48 rows=1021522829864 width=97)
> > > ...
> > >          ->  Parallel Hash Left Join  (cost=604502.76..1276224253.51 rows=204304565973 width=97)
> > >                Hash Cond: ((t1.col_ano)::text = (t2.col_ano)::text)
> > > ...
> > >
> > > //so.. the planner guesses that those 2 joins will generate 1000 billion rows...
> >
> > Are some of the col_ano values very frequent? If say the value 42 occurs
> > 1 million times in both table_a and table_b, the join will create 1
> > trillion rows for that value alone. That doesn't explain the crash or the
> > disk usage, but it would explain the crazy cost (and would probably be a
> > hint that this query is unlikely to finish in any reasonable time).
> >
> >
> > good guess, even if a bit surprising: there is one (and only one)
> > "value" which fits your supposition: NULL
>
> But NULL doesn't equal NULL, so that would result in only one row in the
> left join. So that's not it.
>


So, apologies...

the 75 lines in each table are not NULLs but '' (empty varchar), which,
obviously, is not the same thing,
and which readily generates 500 billion lines for the left join.
So, no planner or statistics problems; apologies for the time wasted.
Back to the initial problem:
if, with temp_file_limit set to 210 GB, I try to run the select *
from table_a left join table_b on col_a (which contains the 75 ''
on both tables),
then postgres does crash, killed by the OOM killer, after having taken 1.1 TB
of additional disk space.
the explain plan guesses 512 planned partitions (obviously, I cannot provide
an EXPLAIN ANALYZE...).

To my understanding, before Postgres 13, hash aggregate could eat RAM
without limit in such circumstances.
But in 14.2??
(I know, 14.8 is out...)


> hp
>
> --
>_  | Peter J. Holzer| Story must make more sense than reality.
> |_|_) ||
> | |   | h...@hjp.at |-- Charles Stross, "Creative writing
> __/   | http://www.hjp.at/ |   challenge!"
>



Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com

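Given how this turned out, a cheap pre-flight check before such a join is to
look for heavily duplicated join keys, since a key duplicated n times on both
sides contributes n x n rows (500 billion corresponds to n around 700,000).
A sketch using the thread's names:

select col_a, count(*) as n
from table_a
group by col_a
order by n desc
limit 10;
-- repeat for table_b; multiply the per-key counts to estimate the join size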

Re: Death postgres

2023-05-12 Thread Marc Millas
On Fri, May 12, 2023 at 8:31 AM Peter J. Holzer  wrote:

> On 2023-05-11 21:27:57 +0200, Marc Millas wrote:
> > the 75 lines in each table are not NULLs but '' (empty varchar),
> > which, obviously, is not the same thing,
> > and which readily generates 500 billion lines for the left join.
> > So, no planner or statistics problems; apologies for the time wasted.
>
> No problem. Glad to have solved that puzzle.
>
> > Back to the initial problem: if, with temp_file_limit set to 210 GB,
> > I try to run the select * from table_a left join table_b on col_a
> > (which contains the 75 '' on both tables)
> > then postgres does crash, killed by oom, after having taken 1.1 TB of
> > additional disk space.
>
> My guess is that the amount of parallelism is the problem.
>
> work_mem is a per-node limit. Even a single process can use a multiple of
> work_mem if the query contains nested nodes (which almost every query
> does, but most nodes don't need much memory). With 5 parallel workers,
> the total consumption will be 5 times that. So to prevent the OOM
> condition you would need to reduce work_mem or max_parallel_workers (at
> least for this query).
>

We have more than 100 GB of RAM and only 1 user, with one request running.
work_mem is set to 10 MB. For the OOM killer to act because of work_mem, it
would mean that for one request with 2 left joins, postgres needs more than
10,000 work_mem buffers.
To me, that seems difficult to believe. But that postgres may need that RAM
space for hashing or some other similar purpose seems more probable, no?

>
> The description temp_file_limit says "...the maximum amount of disk
> space that *a process* can use...". So with 5 workers that's 210*5 =
> 1050 GB total. Again, you may want to reduce either temp_file_limit or
> max_parallel_workers.
>
Yes, but if so, we should have had the request cancelled due to
temp_file_limit, which was not the case.
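
Spelling out that arithmetic with the thread's numbers: if the 1.1 TB of temp
space was spread across the leader plus Peter's assumed 5 workers, that is
roughly 1.1 TB / 6, about 185 GB per process, just under the 210 GB
per-process temp_file_limit, which would explain why the limit never fired
even though the total far exceeded 210 GB.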

>
> > to my understanding, before postgres 13, hash aggregate did eat RAM
> limitless
> > in such circumstances.
> > but in 14.2 ??
> > (I know, 14.8 is up...)
>
> Maybe the older version of postgres didn't use as many workers for that
> query (or maybe not parallelize it at all)?
>
> hp
>
> --
>_  | Peter J. Holzer| Story must make more sense than reality.
> |_|_) ||
> | |   | h...@hjp.at |-- Charles Stross, "Creative writing
> __/   | http://www.hjp.at/ |   challenge!"
>


Re: order by

2023-05-13 Thread Marc Millas
On Thu, May 11, 2023 at 11:08 PM Ron  wrote:

> On 5/11/23 09:55, Marc Millas wrote:
>
> Thanks,
>
> I do know about index options.
>
> that table has NO (zero) indexes.
>
>
> If the table has no indices, then why did you write "it looks like there
> is something different within the *b-tree operator* class of varchar"?
> After all, you only care about b-trees when you have b-tree indices.
>
to my understanding, the b-tree operator class is the default used for any
sort, like an ORDER BY, on varchar, text, etc. types.

> --
> Born in Arizona, moved to Babylonia.
>

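That reading can be checked directly in the catalogs; a sketch (pg_opclass and
pg_am are standard system catalogs) listing the default b-tree operator class
that backs sorting of text:

select am.amname, oc.opcname, oc.opcdefault
from pg_opclass oc
join pg_am am on am.oid = oc.opcmethod
where am.amname = 'btree'
  and oc.opcintype = 'text'::regtype;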

Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?

2023-05-21 Thread Marc Millas
Oracle has One (1) DB per instance (in Oracle it's not named a cluster
as... there is only one!). So ...

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Thu, May 18, 2023 at 9:30 PM Adrian Klaver 
wrote:

> On 5/18/23 11:56, Ron wrote:
> > On 5/18/23 12:54, Rob Sargent wrote:
> >> On 5/18/23 11:49, Ron wrote:
>
> > We need to keep costs down, too.
> >
> > Oracle (I think) does it at the DB level, and so does SQL Server. Upper
> > Management hears us say "sorry, no can do" and wonders what bunch of
> > amateurs are developing PostgreSQL.
>
> Looks like you will be migrating to Oracle or SQL Server.
>
> Good luck on keeping costs down.
>
> >
> > --
> > Born in Arizona, moved to Babylonia.
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>
>
>


Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?

2023-05-21 Thread Marc Millas
?? the sqlite thing ??

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Sun, May 21, 2023 at 7:15 PM Theodore M Rolle, Jr. 
wrote:

> What about sqlcipher?
>
> On Sun, May 21, 2023, 07:16 Marc Millas  wrote:
>
>> Oracle has One (1) DB per instance (in Oracle it's not named a cluster
>> as... there is only one!). So ...
>>
>> Marc MILLAS
>> Senior Architect
>> +33607850334
>> www.mokadb.com
>>
>>
>>
>> On Thu, May 18, 2023 at 9:30 PM Adrian Klaver 
>> wrote:
>>
>>> On 5/18/23 11:56, Ron wrote:
>>> > On 5/18/23 12:54, Rob Sargent wrote:
>>> >> On 5/18/23 11:49, Ron wrote:
>>>
>>> > We need to keep costs down, too.
>>> >
>>> > Oracle (I think) does it at the DB level, and so does SQL Server.
>>> Upper
>>> > Management hears us say "sorry, no can do" and wonders what bunch of
>>> > amateurs are developing PostgreSQL.
>>>
>>> Looks like you will be migrating to Oracle or SQL Server.
>>>
>>> Good luck on keeping costs down.
>>>
>>> >
>>> > --
>>> > Born in Arizona, moved to Babylonia.
>>>
>>> --
>>> Adrian Klaver
>>> adrian.kla...@aklaver.com
>>>
>>>
>>>
>>>


Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?

2023-05-21 Thread Marc Millas
Hi Thomas,

to my understanding, there is ONE master key for TDE in 12c.

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Sun, May 21, 2023 at 9:47 PM Thomas Kellerer  wrote:

>
> Marc Millas wrote on 21.05.2023 at 13:13:
> > Oracle has One (1) DB per instance (in Oracle it's not named a cluster
> > as... there is only one!). So ...
>
> Oracle can have multiple (pluggable) databases per instance since Oracle 12
>
>
>
>


syntax pb

2023-05-30 Thread Marc Millas
Hi,

I have always had difficulties understanding syntax. So...

If I have:
create table t1 (t text);
create table t2 (a text, b text, c text, d numeric);
insert into t1 values ('azerty');
INSERT 0 1
fine !


so, now, if I write:
Select distinct test1.t, 'abc' as b, NULL as c, NULL as d
From t1 test1;
   t    |  b  | c | d
--------+-----+---+---
 azerty | abc |   |
(1 row)

ok.

and, now, if I want to insert that:
Insert into t2 (a, b, c, d)
Select distinct test1.t, 'abc' as b, NULL as c, NULL as d
From t1 test1;

I get:
ERROR:  column "d" is of type numeric but expression is of type text
LINE 2: Select distinct test1.t, 'abc' as b, NULL as c, NULL as d

HINT:  You will need to rewrite or cast the expression.

Can someone give a short SQL syntax hint ?

thanks,



Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com


Re: syntax pb

2023-05-30 Thread Marc Millas
The above should have been:
>
> Insert into t2 (a, b, c, d)
> Select distinct test1.t, 'abc' as b, NULL::text, NULL::numeric
>  From t1 test1;
>
> >
> > which results in:
> >
> > select * from t2;
> >    a    |  b  |  c   |  d
> > --------+-----+------+------
> >  azerty | abc | NULL | NULL
> >
> >
>

Thanks Adrian, but if the query becomes more complex, for example with a
few more joins, then even casting doesn't work.
This comes from a prod environment, and even casting NULLs (which is more
than strange, BTW) generates absurd errors.
To my understanding, it looks like the parser does not parse the SELECT
DISTINCT as we think it does.



> >> Marc MILLAS
> >> Senior Architect
> >> +33607850334
> >> www.mokadb.com
> >>
> >
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>


Re: syntax pb

2023-05-30 Thread Marc Millas
Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Tue, May 30, 2023 at 3:51 PM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Tuesday, May 30, 2023, Marc Millas  wrote:
>
>>
>> I get:
>> ERROR:  column "d" is of type numeric but expression is of type text
>> LINE 2: Select distinct test1.t, 'abc' as b, NULL as c, NULL as d
>>
>> HINT:  You will need to rewrite or cast the expression.
>>
>> Can someone give a short SQL syntax hint ?
>>
>
>
> https://www.postgresql.org/docs/current/sql-expressions.html#SQL-SYNTAX-TYPE-CASTS
>

I plainly agree on that...
but it's NOT what's happening. The doc you point to states:
"An explicit type cast can usually be omitted if there is no ambiguity as
to the type that a value expression must produce (for example, when it is
assigned to a table column);"

In the SQL I provide, I ask to put a NULL in a numeric column.
Can you tell where the ambiguity is?



>
> David J.
>


Re: syntax pb

2023-05-30 Thread Marc Millas
Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Tue, May 30, 2023 at 7:12 PM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Tue, May 30, 2023 at 8:53 AM Marc Millas  wrote
>
>
>> This comes from a prod environment and even casting NULLs (which is more
>> than strange, BTW) generates absurd errors.
>>
>
> If you want an input to be anything other than plain text (numbers
> partially exempted) you need to cast it.  Sure, some limited cases allow
> for other parts of a query to infer untyped literals, but literals defined
> at the top-level of a SELECT is not one of those places.
>
>> To my understanding it looks like the parser does not parse the SELECT
>> DISTINCT as we think it does.
>>
>
> The DISTINCT clause doesn't really come into play here at all, so if you
> think it does you indeed have a misunderstanding.
> Inputting literal NULLs, and using DISTINCT, are both, IMO, considered
> code smells and seldom used.  You still need to be able to interpret error
> messages but if you are running actual queries with these things you may
> have larger model design and query writing concerns to deal with in
> addition to being able to identify the problems specific error messages are
> pointing out and trying to fix them.
>

Hi David, my guess about the DISTINCT syntax was just because if I take the
DISTINCT out, the SQL works fine. Nothing more, nothing less...

>
> David J.
>
>


Re: syntax pb

2023-05-30 Thread Marc Millas
Thanks for the explanation. Crystal clear, thanks

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Tue, May 30, 2023 at 7:31 PM Tom Lane  wrote:

> "David G. Johnston"  writes:
> > On Tue, May 30, 2023 at 8:53 AM Marc Millas 
> wrote
> >> To my understanding it looks like the parser does not parse the SELECT
> >> DISTINCT as we think it does.
>
> > The DISTINCT clause doesn't really come into play here at all, so if you
> > think it does you indeed have a misunderstanding.
>
> No, he's correct:
>
> postgres=# create table z (f1 int);
> CREATE TABLE
> postgres=# insert into z values(null);
> INSERT 0 1
> postgres=# insert into z select null;
> INSERT 0 1
> postgres=# insert into z select distinct null;
> ERROR:  column "f1" is of type integer but expression is of type text
> LINE 1: insert into z select distinct null;
>   ^
> HINT:  You will need to rewrite or cast the expression.
>
> The first two INSERTs are accepted because there's nothing
> "between" the untyped NULL and the INSERT, so we can resolve
> the NULL as being of type int.  But use of DISTINCT requires
> resolving the type of the value (else how do you know what's
> distinct from what?) and by default we'll resolve to text,
> and then that doesn't match what the INSERT needs.
>
> regards, tom lane
>

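Following Tom's explanation, the failing statement in his transcript works
once the type is resolved before DISTINCT sees it, e.g. with an explicit cast
(a sketch on his table z):

insert into z select distinct null::int;
INSERT 0 1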

Re: Hash Index on Partitioned Table

2023-06-01 Thread Marc Millas
Hi Peter,

In Postgres 13, CREATE INDEX should be parallelized by default,
so, subject to the parallelization parameter values in postgresql.conf, your
machine should use more than one core while creating the indexes.
Also, you can set the maintenance_work_mem parameter to the max for such a
job, as you have some RAM.

In my own experience of indexing big partitioned tables, I created a few
scripts to create each index (there is one index for each partition) and,
when finished, created the 'global' index on the parent, which, as the job is
already done, is fast
(check the ONLY parameter in the CREATE INDEX doc; a sketch follows below).
Doing this it was easy to somewhat optimize the process according to the
number of available cores, RAM and storage.

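A sketch of that sequence (table and index names are illustrative, not from
the thread; the ON ONLY / ATTACH PARTITION syntax exists since Postgres 11):

create index parent_idx on only parent_table (id);   -- stays invalid until every partition is attached
create index part1_idx on parent_part1 (id);         -- one per partition, can be scripted in parallel
alter index parent_idx attach partition part1_idx;   -- repeat per partition; parent_idx then becomes valid
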
hf

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Wed, May 31, 2023 at 7:53 PM peter.boris...@kartographia.com <
peter.boris...@kartographia.com> wrote:

> Hi Tom,
>
> Thanks so much for your quick response. As luck would have it, the
> index FINALLY finished about an hour ago. For a size comparison:
>
>
>
> BTree: 6,433 GB
>
> Hash: 8,647 GB
>
>
>
> Although I don't have a proper benchmark to compare performance, I can say
> the hash is working as well as, if not faster than, the BTree for my use case
> (web application).
>
>
>
> I guess I was getting a little nervous waiting for the index to complete
> and seeing such a huge difference in file size but I'm ok now :-)
>
>
>
> Thanks again,
>
> Peter
>
>
>
>
>
> -Original Message-
> From: "Tom Lane" 
> Sent: Wednesday, May 31, 2023 10:07am
> To: "peter.boris...@kartographia.com" 
> Cc: pgsql-general@lists.postgresql.org
> Subject: Re: Hash Index on Partitioned Table
>
> "peter.boris...@kartographia.com" 
> writes:
> > I have a rather large database with ~250 billion records in a
> partitioned table. The database has been running and ingesting data
> continuously for about 3 years.
>
> > I had a "regular" BTree index on one of the fields (a unique bigint
> column) but it was getting too big for the disk it was on. The index was
> consuming 6.4 TB of disk space.
>
> That's ... really about par for the course. Each row requires an 8-byte
> index entry, plus 12 bytes index overhead. If I'm doing the math right
> then the index is physically about 78% full which is typical to good for
> a b-tree. Reindexing would remove most of the extra space, but only
> temporarily.
>
> > After doing some research I decided to try to create a hash index
> instead of a BTree. For my purposes, the index is only used to find
> specific numbers ("=" and "IN" queries). From what I read, the hash index
> should run a little faster than btree for my use case and should use less
> disk space.
>
> I'm skeptical. The thing to bear in mind is that btree is the mainstream
> use-case and has been refined and optimized far more than the hash index
> logic.
>
> > (1) Why is the hash index consuming more disk space than the btree
> index? Is it because the hash of the bigint values larger than the storing
> the bigints in the btree?
>
> From memory, the index entries will be the same size in this case,
> but hash might have more wasted space.
>
> > (4) Is there any way to estimate when the index process will complete?
>
> An index on a partitioned table isn't a single object, it's one index per
> partition. So you should be able to look at how many partitions have
> indexes so far. You might have to drill down to the point of counting how
> many files in the database's directory, if the individual indexes aren't
> showing up as committed catalog entries yet.
>
> regards, tom lane
>

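Spelling out Tom's estimate, treating both sizes as decimal terabytes:
250 billion rows x (8-byte entry + 12 bytes overhead) = 5.0 TB of entries,
and 5.0 TB / 6.4 TB on disk is about 78% fill.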

date format

2023-06-14 Thread Marc Millas
Hi,

I would like to load data from a file via file_fdw or COPY; it's a Postgres
14 cluster.

But one date (timestamp) column is written in French order and another
column is written in English order. The data comes from a state-owned entity,
so asking for normalization may take ages.

Obviously I could load the columns as char and then apply an appropriate
transformation, no problem.
But is there a direct way to do this?

thanks


Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com


Re: date format

2023-06-14 Thread Marc Millas
On Wed, Jun 14, 2023 at 7:27 PM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Wed, Jun 14, 2023 at 9:42 AM Marc Millas 
> wrote:
>
>> Hi,
>>
>> I would like to load data from a file via file_fdw or COPY.. its a
>> postgres 14 cluster
>>
>> but.. One date (timestamp) column is written french order and another
>> column is written english order. Data comes from a state owned entity so
>> asking for a normalization may take ages.
>>
>> obviously I could load as char and then apply an appropriate
>> transformation. no pb.
>> But is there a direct way to do this ?
>>
>
> Probably no - casting formats via locale cannot be specified at that scope
> when using copy.  Either the cast for a given single setting produces the
> correct result or it doesn't.  If you need a custom cast like this you have
> to get away from COPY first.  Usually that is best done after importing
> data to a temporary table as text.
>
> David J.
>

So: create a foreign table with varchar columns, then do the INSERT ...
SELECT with the appropriate format. Clear.
Somewhat sad, as it was a one-step process with the former Oracle DB we got
rid of.

Marc

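A sketch of that two-step load (staging/target names, column names and format
strings are assumptions, not from the thread; to_timestamp's format argument
is what lets each column carry its own order):

create temp table staging (d_fr text, d_en text);    -- plus the file's other columns
copy staging from '/path/to/file.csv' (format csv);  -- server-side path

insert into target (d_fr, d_en)
select to_timestamp(d_fr, 'DD/MM/YYYY HH24:MI:SS'),
       to_timestamp(d_en, 'MM/DD/YYYY HH24:MI:SS')
from staging;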

  1   2   >