Re: Who am I? Where am I connected?

2022-05-18 Thread Erik Wienhold
Hi Dominique,

you can use \conninfo in psql to show the database, user, host (or socket in my 
example), and port:

ewie@desktop ~ $ psql test
Null display is "".
psql (14.3)
Type "help" for help.

test=# \conninfo
You are connected to database "test" as user "ewie" via socket in 
"/run/postgresql" at port "5432".


- Erik


> On 18/05/2022 12:07 Dominique Devienne  wrote:
> 
>  
> LibPQ has various defaults for the host, user, and DB name.
> There's also the password file, the service file and service name.
> In the example below, I can connect with a "naked" psql invocation.
> 
> Once connected, can I find out all aspects of the connection string?
> Or where they came from, like a pgpass.conf or service file?
> 
> How to get the host, port, db name once connected?
> SHOW and pg_settings do not appear to be it, at first glance.
> 
> Thanks, --DD
> 
> c:\Users\ddevienne>psql
> psql (12.1, server 14.2)
> WARNING: psql major version 12, server major version 14.
>  Some psql features might not work.
> WARNING: Console code page (437) differs from Windows code page (1252)
>  8-bit characters might not work correctly. See psql reference
>  page "Notes for Windows users" for details.
> Type "help" for help.
> 
> ddevienne=>




Re: Support for dates before 4713 BC

2022-08-23 Thread Erik Wienhold


> On 23/08/2022 14:38 CEST Peter J. Holzer  wrote:
> 
>  
> On 2022-08-23 10:25:12 +0100, Simon Riggs wrote:
> > On Mon, 22 Aug 2022 at 11:14, stefan eichert  
> > wrote:
> > > In order to deal with all dates, historical and prehistoric ones, in
> > > a consistent way, the implementation of timestamps/dates for before
> > > 4713 BC would be very helpful, as we really do have dates before
> > > 4713 BC we are working with, that in some cases also have
> > > information on months respectively days.
> > 
> > One possibility is to store dates as the INTERVAL datatype, using the
> > convention for Before Present, rather than worrying about BC/AD.
> > 
> > create table arch (i interval year);
> > insert into arch values ('-5000 years');
> > select * from arch;
> > 
> >   i
> > -
> >  -5000 years
> 
> [Disclaimer: I am not an archeologist]

Me neither ;)

> I think this works well if you know an approximate age. If something is
> about 5000 years old now, it will still be about 5000 years old next
> year and even in ten years.
> 
> But it breaks down if you already have a relatively precise date.
> 
> 4980 years before now probably should be 4981 years before now next year
> and definitely 4990 years before now in ten years. So you would have to
> continuously update those values.

Would be easier to store the estimated age and the assessment date so the age
doesn't have to be updated until it's reassessed.  The time from assessment to
now can be handled as you described: either the age is still considered roughly
the same or it increases as time between assessment and now increases.
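
A minimal sketch of that approach (table and column names are made up):

create table artifact (
  id bigint generated always as identity primary key,
  estimated_age_years int not null,  -- age in years at assessment time
  assessed_on date not null default current_date
);

-- Approximate age as of today:
select estimated_age_years + extract(year from age(current_date, assessed_on)) as age_years
from artifact;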

--
Erik




Re: 10.22 Windows binaries download? (zip "invalid" on Enterprisedb)

2022-09-21 Thread Erik Wienhold
> On 21/09/2022 15:34 CEST Thomas, Richard  
> wrote:
> 
> I’m trying to download the Windows x64 binaries only for PostgreSQL 10.22 in 
> order to perform a minor upgrade. However, the download 
> (postgresql-10.22-1-windows-x64-binaries.zip) from my usual go-to place: 
> (https://www.enterprisedb.com/download-postgresql-binaries) gives me a file 
> that Windows zip declares invalid.

There's already a thread on that issue:

https://www.postgresql.org/message-id/flat/CA%2BRCjaw%3DGH-Lyxe5%3D5-un6sQjO-cQjhTeBDomiXoWQDrvVYwxA%40mail.gmail.com

--
Erik




Re: Playing with pgcrypto

2022-10-10 Thread Erik Wienhold
Hi Howard,

> On 11/10/2022 00:25 CEST howardn...@selestial.com wrote:
> 
> I am trying out a few pgcrypto functions. I was expecting the final 
> select statement to return the row I just inserted - Can anyone tell me 
> what I am not understanding here?
> 
> create table test (p1 bytea);
> insert into test (pgp_sym_encrypt('123', 'secret'));

Your INSERT is malformed.  It's missing a VALUES clause or SELECT.

> select * from test where pgp_sym_encrypt('123', 'secret') = p1;

pgp_sym_encrypt uses a random salt each time, so you cannot compare the output
to p1 like you would do with crypt to verify a given plaintext.  Instead, use
pgp_sym_decrypt with p1 as input to get the plaintext.
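
For example, a sketch based on your table:

-- Corrected INSERT with a VALUES clause:
insert into test (p1) values (pgp_sym_encrypt('123', 'secret'));

-- Compare by decrypting instead of encrypting again:
select * from test where pgp_sym_decrypt(p1, 'secret') = '123';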

--
Erik




Re: Exponentiation confusion

2022-10-13 Thread Erik Wienhold
> On 13/10/2022 18:20 CEST Adrian Klaver  wrote:
> 
> In trying to answer an SO question I ran across this:
> 
> Postgres version 14.5
> 
> select 10^(-1 * 18);
>   ?column?
> --
>  1e-18
> 
> select 10^(-1 * 18::numeric);
>?column?
> 
>   0.
> 
> 
> Same for power:
> 
> select power(10, -18);
>   power
> ---
>   1e-18
> (1 row)
> 
> select power(10, -18::numeric);
> power
> 
>   0.
> 
> 
> Why is the cast throwing off the result?

power has two overloads: 
https://www.postgresql.org/docs/14/functions-math.html#id-1.5.8.9.6.2.2.19.1.1.1

Calling power(numeric, numeric) is what I expect in that case instead of
downcasting the exponent argument to double precision, thus losing precision.

select
  pg_typeof(power(10, -18)),
  pg_typeof(power(10, -18::numeric));

    pg_typeof     | pg_typeof
------------------+-----------
 double precision | numeric
(1 row)

Determining the right function is described in 
https://www.postgresql.org/docs/14/typeconv-func.html

--
Erik




Re: Exponentiation confusion

2022-10-13 Thread Erik Wienhold
> On 13/10/2022 19:16 CEST Tom Lane  wrote:
> 
> Erik Wienhold  writes:
> > On 13/10/2022 18:20 CEST Adrian Klaver  wrote:
> >> select power(10, -18::numeric);
> >> power
> >> 
> >> 0.
> >> 
> >> Why is the cast throwing off the result?
> 
> > Calling power(numeric, numeric) is what I expect in that case instead of
> > downcasting the exponent argument to double precision, thus losing 
> > precision.
> 
> An inexact result isn't surprising, but it shouldn't be *that* inexact.

Ah, now I see the problem.  I saw a bunch of zeros but not that it's *all*
zeros.  Nevermind.

--
Erik




Re: Cannot restore windows dump on linux

2022-10-19 Thread Erik Wienhold
> On 19/10/2022 17:43 CEST ertan.kucuko...@1nar.com.tr wrote:
>
> I am using PostgreSQL 14.5
>
> I tried to move a Linux database to Windows. Both identical version.
> Linux dump successfully loaded on Windows system.
> Reason for changing system didn’t work out and now I am trying to move it back
> because it has some modifications.
> I just dumped a database backup on windows to a dump file. I see dump file
> contains “\r\n” as new line identifier (hence windows dump). Just to clarify,
> this is not the \N character that I am mixing it up with.
> When I try to restore windows dump on regular Linux system, I get a lot of
> errors and it stops at this one below where this is a pure data load position.
>
> ERROR: syntax error at or near "43589"
> LINE 1: 43589 7102e523-f401-4cce-852d-e537f863886f…
>
> I also tried to stop at first error when restoring, in this case it stops at
> below error
>
> root@app:/home/ek# psql -v ON_ERROR_STOP=1 -U myuser -d mydb < last_backup.bak
> SET
> SET
> SET
> SET
> SET
> set_config
> 
>
> (1 satır)
>
> SET
> SET
> SET
> SET
> ERROR: index "ix_xrates_date" does not exist
>
> I searched the internet and I could not solve my problem.
>
> How can I successfully dump on Windows and restore on Linux?

On Windows, did you run pg_dump with --file option or redirect the output to
a file?

pg_dump --file last_backup.bak -U myuser mydb

vs.

pg_dump -U myuser mydb > last_backup.bak

The redirect variant tripped me up before because Windows 10 writes the file as
UTF-16 which is not expected by psql.  I don't know if psql can be configured
to handle UTF-16.  The redirect is also the cause for CRLF line endings.

--
Erik




Re: Column value derived from generated column in INSERT?

2022-10-19 Thread Erik Wienhold
> On 19/10/2022 23:51 CEST Adrian Klaver  wrote:
>
> On 10/19/22 12:58 PM, Adrian Klaver wrote:
> > On 10/19/22 12:48, Mark Raynsford wrote:
> >> On 2022-10-19T12:43:31 -0700
> >> Adrian Klaver  wrote:
> >>>
> >>> HINT:  There is an entry for table "t", but it cannot be referenced from
> >>> this part of the query.
> >>>
> >>> HINT:  There is a column named "x" in table "t", but it cannot be
> >>> referenced from this part of the query.
> >>
> >> Yes, I saw those, hence asking on the list if there was a way to do it.
> >
> > Using a trigger.
>
> To expand:
>
> create table t (
>  x integer not null generated always as identity,
>  y integer not null
>);
> insert into t(y) values (1);
>
> select * from t;
>
> x | y
> ---+---
>   1 | 1
> (1 row)
>
>
>
> CREATE FUNCTION identity_test( )
>   RETURNS trigger
>   LANGUAGE plpgsql
> AS $function$
> BEGIN
>  NEW.y = NEW.x * 2;
>  RETURN NEW;
> END;
>
> $function$
> ;
>
> create trigger identity_trg before insert on t for each row execute
> function identity_test();
>
> insert into t(y) values (0);
>
> select * from t;
>
>   x | y
> ---+---
>   1 | 1
>   2 | 4
> (2 rows)

Make t.y a generated column and avoid the trigger:

create table t (
  x int not null generated always as identity,
  y int not null generated always as (x * 2) stored
);

insert into t (x) values (default), (default);

select * from t;

 x | y
---+---
 1 | 2
 2 | 4
(2 rows)

But I think Mark wants to specify the expression in the INSERT and not define
it as part of the database schema, if I understand it correctly.

--
Erik




Re: How to remove the partition from table .

2022-10-25 Thread Erik Wienhold
> On 25/10/2022 12:47 CEST Rakesh Nashine  wrote:
>
> We would like to remove the partition from one of the tables , although that
> partition has some business data. Now business doesn't need that partition
> any more .. How could we remove it? although they need the data .

Detach the partition:

https://www.postgresql.org/docs/15/sql-altertable.html#SQL-ALTERTABLE-DETACH-PARTITION
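
A minimal sketch, assuming a partitioned table "orders" with a partition
"orders_2020" (names are made up):

alter table orders detach partition orders_2020;

The detached partition becomes a standalone table that still contains its
rows, so the data is kept while the partitioned table no longer sees it.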

--
Erik




Re: How to add a variable to a timestamp.

2022-10-29 Thread Erik Wienhold
> On 29/10/2022 19:35 CEST Eagna  wrote:
>
> I'm trying to do something like this.
>
> SELECT
> d.i,
> h.i,
>
> '2022-10-31 00:00:00'::TIMESTAMP + INTERVAL 'd.i DAY'
> FROM
> GENERATE_SERIES(0, 6) AS d(i),
> GENERATE_SERIES(0, 23) AS h(i);
>
> where I add d.i days (and also h.i hours) to a timestamp.
>
> I can't seem to get this to work. Any ideas appreciated.

Create the interval with make_interval(days => d.i, hours => h.i).

https://www.postgresql.org/docs/15/functions-datetime.html#id-1.5.8.15.6.2.2.28.1.1.1
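
Applied to your query, that would be something like:

select
  d.i,
  h.i,
  timestamp '2022-10-31 00:00:00' + make_interval(days => d.i, hours => h.i)
from
  generate_series(0, 6) as d(i),
  generate_series(0, 23) as h(i);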

--
Erik




Re: Off-topic? How to extract database statements from JPA?

2022-10-31 Thread Erik Wienhold
> On 31/10/2022 11:50 CET Gus Spier  wrote:
>
> I apologize if this is off-topic, but I've become exceedingly frustrated and
> need help.
>
> The assignment is to evaluate SQL code for an application that has been built
> with the Java Spring Persistence API and, if appropriate, offer suggestions to
> improve database performance.
>
> The problem is that I am not a Java guy. I can usually read code and determine
> what it is trying to do. But here, I am at a loss. Where does the JPA hide the
> SQL code? I do not really expect a definitive, explicit answer, but if anybody
> could point me to documentation or a working aid that lays out where the JPA
> stores the DDL and DML, I would be truly grateful.

My JPA knowledge is a bit rusty and I only worked with Hibernate before.
Which JPA implementation do you use?

In case of Hibernate you can enable logging:

org.hibernate.SQL => DEBUG
org.hibernate.type.descriptor.sql => TRACE

If you also use Spring you can try the following config instead:

spring.jpa.show-sql = true
spring.jpa.properties.hibernate.format_sql = true

--
Erik




Re: an difficult SQL

2022-11-06 Thread Erik Wienhold
> On 06/11/2022 13:48 CET Rafal Pietrak  wrote:
>
> W dniu 5.11.2022 o 19:05, Thiemo Kellner pisze:
> >
> > You first could select the three users with the most recent entries with
> > a windowing function
> > (https://www.postgresql.org/docs/15/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS)
>
> surely I'm missing something crucial here:
> select row_number() over w,* from eventlog where row_number() over w < 5
>   window w as (partition by user);
> ERROR:  window functions are not allowed in WHERE
>
> So I'm unable to pick a limited number of rows within the user
> "group-window" ranges.
>
> Without that, I cannot proceed.
>
> Any suggestions?

Window functions are only permitted in SELECT and ORDER BY because they are
executed after WHERE, GROUP BY, and HAVING[1].

You need a derived table to filter on row_number:

with ranked as (
    select *, row_number() over w
    from eventlog
    window w as (partition by user)
)
select *
from ranked
where row_number < 5;

[1] https://www.postgresql.org/docs/15/tutorial-window.html

--
Erik




Re: Information to CVE-2022-42889

2022-11-08 Thread Erik Wienhold
> On 08/11/2022 11:50 CET Cedric Aaron Towstyka 
>  wrote:
>
> the german bureau for IT-Security "BSI" (Bundesamt für Sicherheit in der
> Informationstechnik) has issued a warning for CVE-2022-42889 with the name
> commons-text. Insurance companies are obliged to analyse the installed
> software for vulnerabilities of this type.  As the Barmenia is using your
> product PostgreSQL Server it is necessary to obtain all information regarding
> any vulnerability against above CVE.  We kindly ask you to provide information
> if the above product is affected by the CVE and if yes, when a fix will be
> available.

Postgres does not use Java and should not be affected.  Maybe if you use
PL/Java[1].

This CVE reminds me of Log4j from last year[2].

[1] https://tada.github.io/pljava/
[2] https://www.postgresql.org/message-id/flat/30390f0b07fd4d90b1aacb683ebfae45%40pictet.com

--
Erik




Re: copying json data and backslashes

2022-11-22 Thread Erik Wienhold
> On 22/11/2022 15:23 CET Alastair McKinley  
> wrote:
>
> Hi all,
>
> I have come across this apparently common issue COPY-ing json and wondering if
> there is potentially a better solution.
>
> I am copying data into a jsonb column originating from a 3rd party API. The
> data may have literal \r,\t,\n and also double backslashes.
>
> I discovered that I can cast this data to a jsonb value directly but I can't
> COPY the data without pre-processing.
>
> The example below illustrates my issue (only with \r, but the problem extends
> to other \X combinations).
>
> > do $$
> > lines=[r'{"test" : "\r this data has a carriage return"}']
> >
> > with open("/tmp/test1.json","w") as f:
> > for line in lines:
> > f.write(line.strip() + "\n")
> >
> > $$ language plpython3u;
> >
> > create temp table testing (data jsonb);
> >
> > -- this works
> > insert into testing (data)
> > select l::jsonb
> > from pg_read_file('/tmp/test1.json') f,
> > lateral regexp_split_to_table(f,'\n') l where l <> '';
> >
> > -- fails
> > copy testing (data) from '/tmp/test1.json';
> >
> > -- works
> > copy testing (data) from program $c$ sed -e 's/\\r/u000a/g' 
> > /tmp/test1.json $c$;
> >
>
> Is there any other solution with COPY that doesn't require manual
> implementation of search/replace to handle these edge cases?
> Why does ::jsonb work but COPY doesn't? It seems a bit inconsistent.

COPY handles special backslash sequences[1].  The \r in your sample JSON,
although properly escaped according to JSON, is replaced with an actual
carriage return by COPY before casting to jsonb.  The error results from JSON
prohibiting unescaped control characters in strings[2].

You must double-escape the backslashes to pass those characters through COPY.

See how COPY outputs backslash sequences:

-- Actual carriage return:
copy (select e'\r') to stdout;
\r

-- Backslash sequence for carriage return:
copy (select '\r') to stdout;
\\r
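
A round trip with a doubled backslash (a minimal sketch): COPY reduces \\ to \
and the jsonb cast then parses \r as a JSON escape:

copy testing (data) from stdin;
{"test" : "\\r this data has a carriage return"}
\.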

[1] https://www.postgresql.org/docs/current/sql-copy.html#id-1.9.3.55.9.2
[2] https://www.json.org/json-en.html

--
Erik




Re: copying json data and backslashes

2022-11-22 Thread Erik Wienhold
> On 22/11/2022 20:11 CET p...@cmicdo.com  wrote:
>
> On Tuesday, November 22, 2022 at 01:16:02 PM EST, Peter J. Holzer 
>  wrote:
>
> > On 2022-11-22 17:39:04 +, Alastair McKinley wrote:
>  > > > \copy footable from 'input.json' (format csv, escape '^B', delimiter '^C', quote '^E')
>  > > >
>  > > > where the control characters are the actual control char, not the
>  > > > caret-letter, and it requires no escaping escapes. I realize this
>  > > > won't work for all
>  > > > situations.
>  > >
>  > > Thanks for the suggestion, this is interesting to me to try but I am
>  > > not quite sure how this works.
>  > > As far as I understand, escape/quote/delimiter have to be a single
>  > > character, and CTRL-C etc. are multiple characters.
>  >
>
> Yes, Alastair, Peter said what I would have...
>
>  > You may have to hit several Keys[1] on your keyboard, but Ctrl-C is a
>  > single character, just like Shift-C is (the former has code 0003, the
>  > latter 0043).
>  >
>  > On Unix-like systems you can usually type the control characters by
>  > typing Ctrl-V first:
>  >
>  > At the psql prompt, type
>  > select ascii('
>  > then hit V while holding the ctrl key
>  > then hit C while holding the ctrl key
>  > The terminal should display that as ^C
>  > then complete the line with
>  > ');
>  > so that it looks like
>  > select ascii('^C');
>  > and hit return:
>  >
>  >
>  > [1] There are usually four Ctrl-Characters which need only a single
>  > key: Ctrl-I (TAB), Ctrl-M (CR), Ctrl-[ (ESC) and Ctrl-H (BS) or Ctrl-?
>  > (DEL).
>  >
>  > (On Unix systems CR is normally translated to LF, on Windows to CRLF)
>  >

Or use C-style escapes[1]:

\copy footable from 'input.json' (format csv, escape e'\2', delimiter e'\3', quote e'\5')

[1] https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS-ESCAPE

--
Erik




Re: Get table catalog from pg_indexes

2022-11-27 Thread Erik Wienhold
> On 27/11/2022 18:22 CET Igor Korot  wrote:
>
> Table pg_indexes does not contain a field for a catalog.
>
> So how do I get that?
>
> SELECT 1 FROM pg_indexes WHERE indexname = $1 AND tablename = $2 AND
> schemaname = $3

Use SELECT current_database() if you need to know the catalog.
pg_indexes only covers the current database[1].
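
For example (the index name is hypothetical):

select current_database() as table_catalog, schemaname, tablename, indexname
from pg_indexes
where indexname = 'my_index';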

[1] https://www.postgresql.org/docs/current/view-pg-indexes.html

--
Erik




Re: Views "missing" from information_schema.view_table_usage

2022-12-02 Thread Erik Wienhold
> On 02/12/2022 21:51 CET Jonathan Lemig  wrote:
>
> Has anybody ever encountered this, and if so, did you find a resolution?
> Or perhaps there other limitations with the VTU that I'm unaware of?

Is the one view you cannot find in view_table_usage a materialized view?
Because those are not covered by view_table_usage[1].

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/catalog/information_schema.sql;h=18725a02d1fb6ffda3d218033b972a0ff23aac3b;hb=HEAD#l2605

--
Erik




Re: Views "missing" from information_schema.view_table_usage

2022-12-02 Thread Erik Wienhold
> On 02/12/2022 22:33 CET Erik Wienhold  wrote:
> 
>  
> > On 02/12/2022 21:51 CET Jonathan Lemig  wrote:
> >
> > Has anybody ever encountered this, and if so, did you find a resolution?
> > Or perhaps there other limitations with the VTU that I'm unaware of?
> 
> Is the one view you cannot find in view_table_usage a materialized view?
> Because those are not covered by view_table_usage[1].
> 
> [1] 
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/catalog/information_schema.sql;h=18725a02d1fb6ffda3d218033b972a0ff23aac3b;hb=HEAD#l2605

Never mind.  I forgot that you wrote that it appears in pg_views.  So it's a 
plain view.

--
Erik




Re: Views "missing" from information_schema.view_table_usage

2022-12-02 Thread Erik Wienhold
> On 02/12/2022 22:55 CET Jonathan Lemig  wrote:
>
> drps=> select viewowner, schemaname, viewname from pg_views where viewname = 
> 'platform_version_v';
>  viewowner | schemaname | viewname
> ---++
>  drps | event | platform_version_v
> (1 row)
>
> drps=> select * from information_schema.view_table_usage where view_name = 
> 'platform_version_v';
>  view_catalog | view_schema | view_name | table_catalog | table_schema | 
> table_name
> --+-+---+---+--+
> (0 rows)
>
> The event.platform_version_v doesn't show up when querying the 
> view_table_usage view.

Does platform_version_v reference any tables?  That view name suggests that it
provides some version info, e.g. with a definition like:

create view event.platform_version_v as select '1.0.0' as version;

In that case it won't appear in view_table_usage.

--
Erik




Re: Views "missing" from information_schema.view_table_usage

2022-12-02 Thread Erik Wienhold
> On 02/12/2022 23:22 CET Jonathan Lemig  wrote:
>
> It probably wouldn't hurt to have that added to the documentation. I'll post
> a message to pgsql-docs. Thanks again!
>
> Jon

Good idea!

Could it be a bug?  Materialized views are a Postgres extension[1] (I always
thought they were standard.)  But I'd expect them to be included when talking
about "views".  Maybe they are excluded because they are considered closer to
physical tables[2] than to views.  Yet their dependencies would justify
inclusion in view_table_usage.

[1] https://www.postgresql.org/docs/15/sql-creatematerializedview.html, see Compatibility
[2] https://www.postgresql.org/docs/9.3/release-9-3.html#AEN119452

> On Fri, Dec 2, 2022 at 4:14 PM Jonathan Lemig  wrote:
> > Hi Erik - sorry I missed your reply when I replied to David's. That is
> > indeed the issue. The object that the view is querying is a materialized
> > view.
> >
> > Thanks for the link.
> >
> > Jon
> >
> >
> > On Fri, Dec 2, 2022 at 3:33 PM Erik Wienhold  wrote:
> > > > On 02/12/2022 21:51 CET Jonathan Lemig  wrote:
> > >  >
> > >  > Has anybody ever encountered this, and if so, did you find a 
> > > resolution?
> > >  > Or perhaps there other limitations with the VTU that I'm unaware of?
> > >
> > >  Is the one view you cannot find in view_table_usage a materialized view?
> > >  Because those are not covered by view_table_usage[1].
> > >
> > >  [1] 
> > > https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/catalog/information_schema.sql;h=18725a02d1fb6ffda3d218033b972a0ff23aac3b;hb=HEAD#l2605
> > >
> > >  --
> > >  Erik

--
Erik




Re: impact join syntax ?? and gist index ??

2023-01-07 Thread Erik Wienhold
> On 07/01/2023 20:46 CET Marc Millas  wrote:
>
> Hi,
>
> postgres 12, postgis 3.0
>
> I have a small table A, 11 rows with a varchar column x and a geometry column 
> y.
> gist index on the geometry column.
> the geometry contains multipolygons (regions on a map)
> I have a second table B , same structure, around 420 000 rows.
> no index,
> the geometry contains points.
> all geometries use SRID 4326.
>
> If i ask to count points in each multipolygons:
>
> select A.x, count(B.x) from A, B where st_within(B.y, A.y) group by A.x;
> it takes 11 seconds (everything in shared buffers).
> If I do the very same thing as:
> select A.x, count(B.x) from A left join B on st_within(B.y, A.y) group by A.x;
> same result, but 85 seconds (everything in shared buffers, again)
> if I redo asking with explain analyze, buffers, the plan is very different.
>
>
> if I do create a gist index on geometry column of the big table, both syntax 
> takes 21 seconds.
>
> I get the feeling I am missing something.. (at least 2 things...)
> can someone shed some light ??

Please provide the execution plans for both queries, with and without the index
on B.y.

--
Erik




Re: How do the Linux distributions create the Linux user/group "postgres"?

2023-01-09 Thread Erik Wienhold
> On 09/01/2023 13:41 CET Matthias Apitz  wrote:
>
> Hello,
>
> Please note: I'm talking about the user and group "postgres" in the
> Linux OS and not in the PostgreSQL server.
>
> We're compiling PostgreSQL from source (actually 14.1) and distribute that
> to our customers. They're asked to setup user and group "postgres"
> before creating the cluster. As nowadays there are a lot of setup such
> things in bigger installations, like LDAP or AD, etc. I'd like to know
> how other installations for Linux deal with this?

Arch Linux uses sysusers.d[1] to create user postgres[2][3].

[1] https://man.archlinux.org/man/sysusers.d.5
[2] https://github.com/archlinux/svntogit-packages/blob/packages/postgresql/trunk/PKGBUILD#L204
[3] https://github.com/archlinux/svntogit-packages/blob/packages/postgresql/trunk/postgresql.sysusers

--
Erik




Re: Why is a Read-only Table Gets Autovacuumed "to prevent wraparound"

2023-01-16 Thread Erik Wienhold
> On 16/01/2023 13:48 CET Fred Habash  wrote:
>
> This is a puzzle I have not been able to crack yet.
>
> We have a single-page table with 28 rows that is purely read-only. There isn't
> a way in postgres to make a table RO, but I say this with confidence because
> pg_stat_user_tables has always showed 0 updates/deletes/inserts.
>
> Furthermore, the schema app developers know, for certain, this table does not
> get changed at all.

The only way to ensure that is to have your app connect as a database user other
than the table owner or a superuser.  Then you can GRANT only the necessary
privileges, such as SELECT for read-only access.
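
A minimal sketch (role, password, and table name are made up):

create role app_reader login password 'changeme';
grant select on config to app_reader;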

> We installed scripts that run every few minutes that do a 'select *' and over
> a period of days, we have not seen a change.
>
> We disabled autovacuum on this table '{autovacuum_enabled=false}'. But,
> despite the fact that this table is read-only (by design) and autovac id is
> disabled, it got autovac'd twice in less than 10 days and on both occasions,
> pg_stat_activity showed the worker with 'to prevent wraparound'. This explains
> why autovac did not honor the disabled status.
>
> But why is this table autovac'd at all?

Wraparound protection is always performed even if autovacuum is disabled:
https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND

> I have a hypothesis, but I need it validated and may be indicate if it is
> scientifically plausible. It goes like this ...
>
> 1. Application initiates a T1 transaction
> 2. App. reads multiple tables to get product metadata and this small table is
>one of them.
> 3. At some point, app. locks a row on one of the tables (not the small one).
> 4. Client app. keeps session 'idle in transaction' while it refreshes a
>webpage to render the data.
> 4. Once the client app verifies the web app has rendered the data correctly,
>it comes back to the database to finish the transaction.
>
> So, even if the small table is never changed, it is part of a transaction to
> be queried. Will this use-case cause the table to qualify for an aggressive
> autovac to prevent wraparound.
>
> If not, why else is a table with zero DML changes ever gets autovac'd?

--
Erik




Re: Interpreting postgres execution plan along with AND/OR precedence

2023-01-18 Thread Erik Wienhold
> On 18/01/2023 18:56 CET Tom Lane  wrote:
>
> "Dirschel, Steve"  writes:
> >  Filter: ... (alternatives: SubPlan 1 or hashed SubPlan 2) ...
> >  SubPlan 1
> >->  Index Scan using ix_lm_cc on lm_queue lmq2  
> > (cost=0.40..177.93 rows=1 width=0)
> >  Index Cond: ((collection_name)::text = 
> > (lmq1.collection_name)::text)
> >  Filter: ((lm_id <> '0'::numeric) AND ((process_type)::text 
> > <> 'EXTRACT'::text))
> >  SubPlan 2
> >->  Seq Scan on lm_queue lmq2_1  (cost=0.00..124999.06 rows=12 
> > width=32)
> >  Filter: ((lm_id <> '0'::numeric) AND ((process_type)::text 
> > <> 'EXTRACT'::text))
> 
> > I understand SubPlan 1 above- it is joining into the NOT EXISTS via the
> > lmq1.COLLECTION_NAME = lmq2.COLLECTION_NAME and then applying the other
> > filtering inside the NOT EXISTS.  But I don't understand SubPlan 2.
> > Given the filter conditions under SubPlan 2 it is also coming from the
> > NOT EXISTS because that is where LM_ID <> 0 and PROCESS_TYPE <> EXTRACT
> > exist but I don't understand the scenario where this query would need to
> > use SubPlan 2.  Would anyone be able to explain under what condition(s)
> > SubPlan 2 would get executed?
> 
> The key is the "alternatives:" notation.  SubPlan 1 can be used in a
> "retail" fashion by invoking it once per outer row, passing a new
> value of lmq1.collection_name each time.  SubPlan 2 is meant to be
> invoked just once, and its output (ie, all the relevant values of
> lmq2.collection_name) will be loaded into an in-memory hash table
> which is then probed for each outer row.  At the point where these
> subplans are made, we don't have enough information about how many
> outer rows there will be to decide which way is better, so we create
> both subplans and postpone the decision till execution.  That's all
> just related to the EXISTS clause, though.
>
> (Since v14 we don't do it like that anymore, so that this confusing
> EXPLAIN notation is gone.)

EXPLAIN ANALYZE shows which subplan gets executed.  Look for "never executed".

--
Erik




Re: Interpreting postgres execution plan along with AND/OR precedence

2023-01-18 Thread Erik Wienhold
> On 18/01/2023 18:03 CET Dirschel, Steve  
> wrote:
>
> The plan changes:
>
> Sort (cost=9382.94..9382.97 rows=12 width=169)
> Sort Key: lmq1.priority DESC, lmq1.request_time
> -> Bitmap Heap Scan on lm_queue lmq1 (cost=4572.59..9382.73 rows=12 width=169)
> Recheck Cond: ((client_name)::text = 'WLCASES'::text)
> Filter: ((NOT (alternatives: SubPlan 1 or hashed SubPlan 2)) OR 
> (((process_type)::text = ANY ('{UNLOCK,"UNLOCK RERUN"}'::text[])) AND (lm_id 
> = '0'::numeric) AND ((host_name)::text = 'WLLOADB'::text) AND (hold_flag = 
> 'Y'::bpchar) AND ((host_name)::text = 'WLLOADB'::text) AND (status_code = 
> '1'::numeric)))
> -> Bitmap Index Scan on ix_lm_chl (cost=0.00..4572.58 rows=25 width=0)
> Index Cond: ((client_name)::text = 'WLCASES'::text)
> SubPlan 1
> -> Bitmap Heap Scan on lm_queue lmq2 (cost=164.44..188.42 rows=1 width=0)
> Recheck Cond: ((lmq1.collection_name)::text = (collection_name)::text)
> Filter: ((lm_id <> '0'::numeric) AND ((process_type)::text <> 
> 'EXTRACT'::text))
> -> Bitmap Index Scan on ix_lm_cc (cost=0.00..164.44 rows=6 width=0)
> Index Cond: ((collection_name)::text = (lmq1.collection_name)::text)
> SubPlan 2
> -> Seq Scan on lm_queue lmq2_1 (cost=0.00..124999.49 rows=25 width=32)
> Filter: ((lm_id <> '0'::numeric) AND ((process_type)::text <> 
> 'EXTRACT'::text))
>
> In the original plan above I believe the query drives off index ix_lm_chl
> applying both client_name = WLCASES and host_name = WLLOADB to the index cond.
> In the plan directly above I believe it also drives off index ix_lm_chl but it
> is only applying client_name = WLCASES to the index cond.
>
> If AND’s are applied first then why wouldn’t the modified query apply both
> client_name = WLCASES and host_name = WLLOADB to the index cond? Is it because
> those are moved below the OR condition?

Yes, because those two conditions are combined with OR, and that's handled by
the bitmap scans in your second execution plan.  See also:
https://www.postgresql.org/docs/14/indexes-bitmap-scans.html

--
Erik




Re: Database size different on Primary and Standby?

2023-01-18 Thread Erik Wienhold
> On 19/01/2023 00:09 CET Hilbert, Karin  wrote:
>
> I manage some PostgreSQL clusters on Linux. We have a Primary & two Standby
> servers & for Production, there is also a DR server. We use repmgr for our HA
> solution & the Standbys are cloned from the Primary using the repmgr standby
> clone command.
>
> My manager asked for a report of all the user databases & their sizes for each
> server in the cluster. I used the psql "\l+" command & then extracted the
> database name & the size from the output. I expected the databases to be the
> same size on the Standbys as on the Primary, but I found that some of the
> databases were smaller on the Standby servers than on the Primary.
>
> For example, the output on the Primary for one of the user databases showed
> as: 8997 kB, but on the Standbys, it was 8849 kB.

The standbys could be missing some indexes because schema changes are not
replicated and must be applied manually.

> I even dropped the database on the Primary & then restored it from a backup.
> Then checked the sizes again & they still showed the difference.
>
> I also found that the template1 database on the Primary was 7821 kB, but on
> the Standbys, it was 7673 kB. Is this normal? Why would the sizes be 
> different?

Is template1 identical (schema and data) on primary and standby?

Could also be different page sizes. But that's a compilation option. What does
SHOW block_size say on those systems?

--
Erik




Re: Database size different on Primary and Standby?

2023-01-18 Thread Erik Wienhold
> On 19/01/2023 01:23 CET Ian Lawrence Barwick  wrote:
>
> On Thu, 19 Jan 2023 at 08:50 Erik Wienhold wrote:
> >
> > > On 19/01/2023 00:09 CET Hilbert, Karin  wrote:
> > >
> > > I manage some PostgreSQL clusters on Linux. We have a Primary & two 
> > > Standby
> > > servers & for Production, there is also a DR server. We use repmgr for 
> > > our HA
> > > solution & the Standbys are cloned from the Primary using the repmgr 
> > > standby
> > > clone command.
> > >
> > > My manager asked for a report of all the user databases & their sizes for 
> > > each
> > > server in the cluster. I used the psql "\l+" command & then extracted the
> > > database name & the size from the output. I expected the databases to be 
> > > the
> > > same size on the Standbys as on the Primary, but I found that some of the
> > > databases were smaller on the Standby servers than on the Primary.
> > >
> > > For example, the output on the Primary for one of the user databases 
> > > showed
> > > as: 8997 kB, but on the Standbys, it was 8849 kB.
> >
> > The standbys could be missing some indexes because schema changes are not
> > replicated and must be applied manually.
>
> This is incorrect; with streaming replication all changes applied on the 
> primary
> are applied on the standby.

Thanks. I was thinking about logical replication.

> (...)
> > Could also be different page sizes. But that's a compilation option. What 
> > does
> > SHOW block_size say on those systems?
>
> It is impossible to start a standby using binaries built with a
> different block size to
> the primary.

Makes sense for streaming replication.

--
Erik




Re: Sequence vs UUID

2023-01-26 Thread Erik Wienhold
> On 26/01/2023 20:17 CET veem v  wrote:
>
> Hello, We were trying to understand whether we should use UUID or Sequence in
> general for primary keys. In many of the blogs (one is below) across multiple
> databases, I saw over the internet and all are mostly stating the sequence is
> better as compared to UUID. But I think in the case of concurrent data load
> scenarios UUID will spread the contention point whereas sequence can be a
> single point of contention.
>
> So we want to understand from experts here, if there are any clear rules
> available or if we have any pros vs cons list available for each of those to
> understand the exact scenario in which we should go for one over other?
> Basically I wanted to see if we can perform some test on sample data to see
> the percentage of overhead on read and write performances of the query in
> presence of UUID VS Sequence to draw some conclusion in general? And also
> considering open source postgres as the base for many databases like redshift
> etc, so the results which apply to progress would apply to others as well.
>
> https://www.percona.com/blog/2019/11/22/uuids-are-popular-but-bad-for-performance-lets-discuss/

I think that[1] provides a good summary.  Performance consideration is just one
aspect.  Is there a technical requirement for using UUID over sequential values?

If there's a single generator of primary keys use bigint sequences.  In case of
multiple generators (multi-master replication, sharding, clients generating IDs)
consider UUID.
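
For example, a sequence-backed bigint key (a minimal sketch; the table is
made up):

create table account (
  -- identity column, backed by a sequence as the single ID generator:
  id bigint generated always as identity primary key,
  name text not null
);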

There are arguments against sequential PK, e.g. they give away too much info and
allow attacks such as forced browsing[2].  The first I can understand: you may
not want to reveal the number of users or customers.  But access control should
prevent forced browsing.

[1] https://www.cybertec-postgresql.com/en/uuid-serial-or-identity-columns-for-postgresql-auto-generated-primary-keys/
[2] https://owasp.org/www-community/attacks/Forced_browsing

--
Erik




Re: Why is this SELECT evaluated?

2023-01-28 Thread Erik Wienhold
> On 28/01/2023 20:29 CET Miles Elam  wrote:
>
> Postgres v15
>
> Given this example of a conversion from a byte array to an int8
> masquerading as an "unsigned" int4
> 
>   SELECT (get_byte(bytes, byte_offset)::int8 << 24)
>  | (get_byte(bytes, byte_offset + 1) << 16)
>  | (get_byte(bytes, byte_offset + 2) << 8)
>  | (get_byte(bytes, byte_offset + 3))
> FROM ( VALUES ('\x01'::bytea, 0) ) b(bytes, byte_offset)
>WHERE length(bytes) >= (4 + byte_offset)
>   ;
> 
> Why does this error result?
>
> ERROR: index 3 out of valid range, 0..2
> SQL state: 2202E
>
> I was under the impression that if the WHERE clause evaluated to
> false, the SELECT clause would not be evaluated.

Yes, according to 
https://www.postgresql.org/docs/15/sql-select.html#id-1.9.3.172.7
the WHERE clause is evaluated before the SELECT list.

> Why is get_byte(...) ever run in the first place even though length(bytes)
> is 3?

Postgres also applies constant folding which can be observed in the execution
plan (I removed the 4th get_byte call).  The WHERE clause is always false:

                QUERY PLAN
-------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=8)
   One-Time Filter: false
(2 rows)

And with those constants the SELECT list is evaluated before the statement is
processed in the documented order.

Does the SQL standard say anything about constant folding and when or if it can
be applied?  I assume it's just an implementation detail of Postgres.  Without
the constant folding I would also expect that query to just return the empty 
set.

get_byte checks the index at runtime.  Adding a fourth byte (index is 0-based)
works as expected with index 3:

test=# select get_byte('\x010203', 3);
ERROR:  index 3 out of valid range, 0..2

test=# select get_byte('\x01020304', 3);
 get_byte
----------
        4
(1 row)

test=# select get_byte('\x01020304', 4);
ERROR:  index 4 out of valid range, 0..3

With a random bytea length the query is processed in the expected order and
returns one or zero rows but never raises an error:

SELECT
  b,
  (get_byte(bytes, byte_offset)::int8 << 24)
| (get_byte(bytes, byte_offset + 1) << 16)
| (get_byte(bytes, byte_offset + 2) << 8)
| (get_byte(bytes, byte_offset + 3))
FROM (
  VALUES (substring('\x01020304'::bytea from 1 for (random() * 4)::int), 0)
) b(bytes, byte_offset)
WHERE
  length(bytes) >= (4 + byte_offset);

--
Erik




Re: Sequence vs UUID

2023-01-28 Thread Erik Wienhold
> On 27/01/2023 01:48 CET Ron  wrote:
>
> On 1/26/23 15:55, Erik Wienhold wrote:
> >
> > There are arguments against sequential PK, e.g. they give away too much 
> > info and
> > allow attacks such as forced browsing[2].  The first I can understand: you 
> > may
> > not want to reveal the number of users or customers.  But access control 
> > should
> > prevent forced browsing.
> 
> Shouldn't your application layer isolate the users from the database?  UUIDs 
> are all over the DBs I manage, but the PKs are all sequences.

Yes, I meant the application layer, not Postgres' access control.

--
Erik




Re: Download file from COPY ... TO with pgadmin

2023-01-31 Thread Erik Wienhold
> On 31/01/2023 14:02 CET Marco Lechner  wrote:
>
> Using COPY …TO it is possible to store e.g. the “value” of a bytea cell in a
> directory on the server. E.g. by this:
>
> COPY (SELECT content FROM d_doc WHERE id = 'x123456') TO 
> CONCAT('/tmp/mydoc.pdf’) (FORMAT binary);
>
> As we do not have access to the postgresql fileserver (ssh, …), but can SELECT
> the value of the bytea cell, is It possible to download the bytea directly
> into a file on a client computer when using pgadmin? Is there a solution with
> the COPY … TO command, any pgadmin specific feature or any other simple 
> solution?

Use psql with \copy, which targets the client file system.  pgAdmin 5.4+ can also
launch psql.
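
For example, the client-side counterpart of your COPY command (a sketch; the
file path is then on the machine where psql runs):

\copy (select content from d_doc where id = 'x123456') to '/tmp/mydoc.pdf' (format binary)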

--
Erik




Re: How to create a new operator inpg for spec data type?

2023-01-31 Thread Erik Wienhold
> On 01/02/2023 06:40 CET jack...@gmail.com  wrote:
>
> I need to create a new operator like '<->' and its syntax is that text1 <-> 
> text2,
> for the usage like this: 'a' <-> 'b' = 'a1b1', so how could I realize this 
> one?
> Can you give me some exmaples.

https://www.postgresql.org/docs/current/sql-createoperator.html

But why use an operator and not just the function that you must create anyway?
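
A minimal sketch of both, assuming 'a' <-> 'b' = 'a1b1' means appending '1'
after each operand (function and operator names are up to you; note that
pg_trgm also defines <-> for text if you have that extension installed):

create function append_ones(text, text) returns text
  language sql immutable
  as $$ select $1 || '1' || $2 || '1' $$;

create operator <-> (
  leftarg = text,
  rightarg = text,
  function = append_ones
);

select 'a' <-> 'b';  -- returns a1b1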

--
Erik




Re: database postgres not found

2023-02-01 Thread Erik Wienhold
> On 01/02/2023 11:33 CET Matthias Apitz  wrote:
>
> Hello,
>
> I've a problem with a PostgreSQL 12.x server not setup or managed by me. that
> the database 'postgres' is not found:
>
> $ psql -Upostgres postgres
> psql: error: ERROR:  no such database: postgres
>
> but the database is there as a SELECT shows:
>
> $ psql -Upostgres lbs_lbsoclc01_dev_r1
> psql (12.11)
> Type "help" for help.
>
> lbs_lbsoclc01_dev_r1=# select * from pg_database where datname = 'postgres' ;
>   oid  | datname  | datdba | encoding | datcollate  |  datctype   | 
> datistemplate | datallowconn | datconnlimit | datlastsysoid | datfrozenxid | 
> datminmxid | dattablespace | datacl
> ---+--++--+-+-+---+--+--+---+--++---+
>  14344 | postgres | 10 |6 | en_US.UTF-8 | en_US.UTF-8 | f 
> | t|   -1 | 14343 |  479 |  1 
> |  1663 |
> (1 row)
>
> What does this mean?

Error message "no such database: postgres" is not from Postgres but likely from
PgBouncer.  This means the database is not configured in PgBouncer for clients
to connect to.

--
Erik




Re: moving a database to a new 15.1 server

2023-02-01 Thread Erik Wienhold
> On 01/02/2023 13:24 CET Matthias Apitz  wrote:
>
> The source database in the 12.11 server shows this:
>
> $ psql -Ulbs_lbsoclc01_dev_r1_dbo_u $DATABASE
> psql (12.11)
> Type "help" for help.
>
> lbs_lbsoclc01_dev_r1=>
> lbs_lbsoclc01_dev_r1=> \d
>   List of relations
>  Schema | Name  | Type  |   Owner
> +---+---+
>  dbo| accession_index   | table | lbs_lbsoclc01_dev_r1_dbo_u
>  dbo| acq_bind  | table | lbs_lbsoclc01_dev_r1_dbo_u
>  ...
>
> I dumped this with pg_dump
>
> $ pg_dump --file=dmp.gz --compress=9 --dbname=$DATABASE
>
> Created in the new server the database
>
> $ createdb -U lbs_lbsoclc01_dev_r1_dbo_u -T template0 lbs_lbsoclc01_dev_r1
>
> and the role for thw user with
>
> CREATE ROLE lbs_lbsoclc01_dev_r1_dbo_u WITH SUPERUSER CREATEDB LOGIN PASSWORD 
> 'xx' ;
> CREATE ROLE
>
> and loaded the dump with
>
> $ gzip -dc dmp.gz | psql -U lbs_lbsoclc01_dev_r1_dbo_u lbs_lbsoclc01_dev_r1
>
> This all went fine. But when I now look into the database:
>
> $ psql -Ulbs_lbsoclc01_dev_r1_dbo_u lbs_lbsoclc01_dev_r1
> psql (15.1)
> Type "help" for help.
>
> lbs_lbsoclc01_dev_r1=#
> lbs_lbsoclc01_dev_r1=# \d
> Did not find any relations.
> lbs_lbsoclc01_dev_r1=# \d dbo.accession_index
> Table "dbo.accession_index"
>   Column  | Type  | Collation | Nullable | Default
> --+---+---+--+-
>  iln  | smallint  |   | not null |
>
> lbs_lbsoclc01_dev_r1=# select count(*) from counter;
> ERROR:  relation "counter" does not exist
> LINE 1: select count(*) from counter;
>  ^
> lbs_lbsoclc01_dev_r1=# select count(*) from dbo.counter;
>  count
> ---
> 41
>
> i.e. I have to specify the schema 'dbo' to access the tables.
>
> What I am missing here in this move?

Your search_path does not contain dbo.  Check  SHOW search_path;  in the old
database and set the search_path with ALTER DATABASE in the new database 
accordingly.
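
For example (the schema list is a guess, adjust as needed):

alter database lbs_lbsoclc01_dev_r1 set search_path = dbo, public;

The new setting takes effect for new sessions.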

--
Erik




Re: From Clause Conditional

2023-02-02 Thread Erik Wienhold
> On 02/02/2023 13:54 CET Zahir Lalani  wrote:
>
> Confidential
>
> Hello All
>
> We are testing a upgrade from pg11 to pg14 and have some issues to overcome.
> One of these is that we have upgraded pgsodium to the latest and there is a
> functional change – this question is not about sodium BTW.
>
> So here is a sample bit of code that I will use to explain the issue – this
> would usually be params passed in, but I have hard coded one particular case
> that does not need the decryption. The code below would return decrypted data
> if the key is supplied (non 0) otherwise return null. This code used to work
> because the secret box call would allow null params. It no longer does. When
> the key is 0, the data passed in would be null.
>
> LEFT JOIN lateral (
> SELECT
> CASE
> WHEN (0 > 0) THEN
> convert_from(crypto_secretbox_open, 'utf8')::JSON
> ELSE
> NULL
> END AS edata
> FROM
> crypto_secretbox_open(coalesce(null, '')::bytea, coalesce(null, '')::bytea,0)
> where (0>0)
> ) enc ON true
>
> The issue is that, even when the key is 0, the select is still run (its part
> of a lateral join) and what we need to achieve is to effectively have a
> conditional where we only run the select if the key > 0 otherwise we return
> null – I have a brain freeze on this! I am sure there is an easy solution,
> but right now I can’t see it.

Looks similar to a post from a few days ago:
https://www.postgresql.org/message-id/flat/CAALojA-nHoxDr7B2k0e1-EtGMPsGPZiCVeS_ds0aHG0SEOrPxg%40mail.gmail.com

I don't know pgsodium but the query optimizer will evaluate crypto_secretbox_open
if the function is immutable and is called with constant arguments.

How are the key and the data passed to this query?  Is it a prepared statement or
is the query generated on the fly for specific key and data that is already
known?  In the latter case you can handle the case expression with two different
queries (one with crypto_secret_box and one without) depending on whether a key
exists or not.

--
Erik

PS: Please don't hijack threads on this mailing list.




Switching identity column to serial

2023-02-03 Thread Erik Wienhold
I was wondering if it's possible to drop a column identity (not the column
itself) while keeping the attached sequence.  This would avoid recreating
an identical sequence (especially with a correct start value and owner).

Changing the sequence owner to NONE before dropping identity is not allowed.
Also changing pg_class.relowner to some role did not help.  The sequence is
still dropped together with the column identity.

But I managed it by clearing pg_attribute.attidentity.  See the following
psql session:

test=# create table t (id int generated always as identity, x char);
CREATE TABLE

test=# insert into t (x) values ('a'), ('b') returning *;
 id | x
----+---
  1 | a
  2 | b
(2 rows)

INSERT 0 2

test=# select pg_get_serial_sequence('t', 'id');
 pg_get_serial_sequence
------------------------
 public.t_id_seq
(1 row)

test=# update pg_attribute set attidentity = '' where (attrelid, attname) = ('t'::regclass, 'id');
UPDATE 1

test=# alter table t alter id drop identity;
psql:1.sql:6: ERROR:  column "id" of relation "t" is not an identity column

test=# select pg_get_serial_sequence('t', 'id');
 pg_get_serial_sequence
------------------------
 public.t_id_seq
(1 row)

test=# alter table t alter id set default nextval('t_id_seq');
ALTER TABLE

test=# insert into t (x) values ('c'), ('d') returning *;
 id | x
----+---
  3 | c
  4 | d
(2 rows)

INSERT 0 2

test=# insert into t (id, x) values (-1, 'e') returning *;
 id | x
----+---
 -1 | e
(1 row)

INSERT 0 1

test=# select * from t;
 id | x
----+---
  1 | a
  2 | b
  3 | c
  4 | d
 -1 | e
(5 rows)

Is this sufficient, or am I missing some detail so that messing around with
pg_catalog is not enough (in addition to being risky)?

Some context:

I have to change identity columns to a form that resembles a definition as
serial.  Creating a new column and migrating the primary key constraint is
not an option.

Why is this change necessary?

My team is importing data with QGIS which fails to properly handle identity
columns.  QGIS uses INSERT with OVERRIDING SYSTEM VALUE but tries to insert
NULL although identity columns imply NOT NULL (also it's the primary key).
QGIS tries to generate an ID with nextval but does not schema-qualify the
sequence name although the sequence's schema is not on the search_path.  It's
weird that QGIS thinks that it should generate the ID instead of delegating
this to the database, yet it uses RETURNING id.  Maybe it needs the ID in
advance for reference.  I don't know.

The "serial" style with nextval as column default works as expected.
Probably because QGIS just uses the column default expression which should
reference the correct sequence.  Oh, did I mention yet that QGIS generates
the ID before issuing an INSERT with RETURNING id?

I'll still open a bug ticket with QGIS but right now there's no other way
than ditching identity columns.

--
Erik




Re: Switching identity column to serial

2023-02-03 Thread Erik Wienhold
> On 04/02/2023 01:54 CET Erik Wienhold  wrote:
>
> I was wondering if it's possible to drop a column identity (not the column
> itself) while keeping the attached sequence.  This would avoid recreating
> an identical sequence (especially with a correct start value and owner).
>
> Changing the sequence owner to NONE before dropping identity is not allowed.
> Also changing pg_class.relowner to some role did not help.  The sequence is
> still dropped together with the column identity.
>
> But I managed it by clearing pg_attribute.attidentity.  See the following
> psql session:

Forgot to mention: tested on 12.13 and 15.1.

>
>   test=# create table t (id int generated always as identity, x char);
>   CREATE TABLE
>
>   test=# insert into t (x) values ('a'), ('b') returning *;
>id | x
>   +---
> 1 | a
> 2 | b
>   (2 rows)
>
>   INSERT 0 2
>
>   test=# select pg_get_serial_sequence('t', 'id');
>pg_get_serial_sequence
>   
>public.t_id_seq
>   (1 row)
>
>   test=# update pg_attribute set attidentity = '' where (attrelid, 
> attname) = ('t'::regclass, 'id');
>   UPDATE 1
>
>   test=# alter table t alter id drop identity;
>   psql:1.sql:6: ERROR:  column "id" of relation "t" is not an identity 
> column
>
>   test=# select pg_get_serial_sequence('t', 'id');
>pg_get_serial_sequence
>   
>public.t_id_seq
>   (1 row)
>
>   test=# alter table t alter id set default nextval('t_id_seq');
>   ALTER TABLE
>
>   test=# insert into t (x) values ('c'), ('d') returning *;
>id | x
>   +---
> 3 | c
> 4 | d
>   (2 rows)
>
>   INSERT 0 2
>
>   test=# insert into t (id, x) values (-1, 'e') returning *;
>id | x
>   +---
>-1 | e
>   (1 row)
>
>   INSERT 0 1
>
>   test=# select * from t;
>id | x
>   +---
> 1 | a
> 2 | b
> 3 | c
> 4 | d
>-1 | e
>   (5 rows)
>
> Is this sufficient or am I missing some detail and messing around with
> pg_catalog is not enough (in addition to being risky)?
>
> Some context:
>
> I have to change identity columns to a form that resembles a definition as
> serial.  Creating a new column and migrating the primary key constraint is
> not an option.
>
> Why is this change necessary?
>
> My team is importing data with QGIS which fails to properly handle identity
> columns.  QGIS uses INSERT with OVERRIDING SYSTEM VALUE but tries to insert
> NULL although identity columns imply NOT NULL (also it's the primary key).
> QGIS tries to generate an ID with nextval but does not use the qualified
> sequence name although search_path does not contain the namespace.  It's
> weird that QGIS thinks that it should generate the ID instead of delegating
> this to the database, yet it uses RETURNING id.  Maybe it needs the ID in
> advance for reference.  I don't know.
>
> The "serial" style with nextval as column default works as expected.
> Probably because QGIS just uses the column default expression which should
> reference the correct sequence.  Oh, did I mention yet that QGIS generates
> the ID before issuing an INSERT with RETURNING id?
>
> I'll still open a bug ticket with QGIS but right now there's no other way
> than ditching identity columns.
>
> --
> Erik




Re: Switching identity column to serial

2023-02-04 Thread Erik Wienhold
> On 04/02/2023 05:41 CET Ron  wrote:
>
> On 2/3/23 18:54, Erik Wienhold wrote:
>
> > I was wondering if it's possible to drop a column identity (not the column
> > itself) while keeping the attached sequence.  This would avoid recreating
> > an identical sequence (especially with a correct start value and owner).
>
>  Why doesn't this work?
>  BEGIN;
>  DROP SEQUENCE t_id;
>  CREATE SEQUENCE new_t_id_seq AS INTEGER OWNED BY t.id;
>  ALTER SEQUENCE new_t_id_seq OWNER TO new_owner;
>  SELECT setval('new_t_id', (SELECT MAX(id) FROM t));
>  SELECT nextval('new_t_id');
>  COMMIT;

This should work but I want to preserve the existing sequence instead of
re-creating it with the same properties.  That's why I was looking for a
shortcut (also code golfing and sheer curiosity).

I haven't thought about using setval but I would need to dynamically
generate the CREATE SEQUENCE anyway to preserve the old sequence definition
with info from pg_sequence.  I assume the sequences were created with
default settings, e.g. cache 1, no cycle.  But I haven't checked the ~100
affected sequences in detail.

Also setting the current value to max(id) is not the same as preserving the
sequence state which may be past max(id) if rows were deleted in the
meantime.  We log DML in audit tables and record the ID of deleted rows.
Therefore I don't want sequences to generate previous values.  This can be
handled by getting nextval from the old sequence before dropping it and
using that as start for the new sequence.

> > Changing the sequence owner to NONE before dropping identity is not allowed.
> > Also changing pg_class.relowner to some role did not help.  The sequence is
> > still dropped together with the column identity.
>
>  Manually digging around the system catalog is never recommended.

I had the idea from relocating PostGIS a few weeks ago which describes
setting pg_extension.extrelocatable = true. [1]

Now I also checked the implementation of DROP IDENTITY on 12.13 and 15.1:

  1. check column attributes
  2. clear pg_attribute.attidentity
  3. invoke post alter hook (it's a no-op without sepgsql)
  4. drop sequence

My approach is identical to steps 1 and 2.  Of course future releases may
change that.

[1] https://www.postgis.net/2017/11/07/tip-move-postgis-schema/

--
Erik




Re: Slow down dev database transactions/second for testing?

2023-02-05 Thread Erik Wienhold
> On 05/02/2023 23:17 CET Richard Brockie  wrote:
>
> I maintain a Django webapp that uses postgresql and can create inefficient
> queries if I'm not careful. I'm looking for ways to mimic a congested db
> server in development to expose these queries.

pgbench is what you're looking for: https://www.postgresql.org/docs/current/pgbench.html

You can run custom statements with the --file option.  Get the statements that
Django generates and let pgbench run those to analyze the bottlenecks.  Or let
pgbench create load for some time (see option --time) while you debug your
Django app.

> The configuration of postgresql is complicated - is there a simple method by
> which I could, for example limit the number of transactions/second to a
> certain level by adjusting postgresql.conf?

No.  Postgres will execute as fast as possible with the available resources.

--
Erik




Re: Understanding years part of Interval

2023-02-06 Thread Erik Wienhold
> On 06/02/2023 12:20 CET Marcos Pegoraro  wrote:
>
> I was just playing with some random timestamps for a week, for a month,
> for a year ...
>
> select distinct current_date+((random()::numeric)||'month')::interval from generate_series(1,100) order by 1;
> It's got the distinct clause because if you change that 'month' to 'year'
> it'll return only 12 rows instead of 100. So why does the years part of an
> interval work differently than any other?
>
> select '1.01 week'::interval; --> 0 years 0 mons 7 days 1 hours 40 mins 48.00 secs
> select '1.01 month'::interval; --> 0 years 1 mons 0 days 7 hours 12 mins 0.00 secs
> select '1.01 year'::interval; --> 1 years 0 mons 0 days 0 hours 0 mins 0.00 secs

Explained in https://www.postgresql.org/docs/15/datatype-datetime.html#DATATYPE-INTERVAL-INPUT:

Field values can have fractional parts: for example, '1.5 weeks' or
'01:02:03.45'. However, because interval internally stores only
three integer units (months, days, microseconds), fractional units
must be spilled to smaller units. Fractional parts of units greater
than months are rounded to be an integer number of months, e.g.
'1.5 years' becomes '1 year 6 mons'. Fractional parts of weeks and
days are computed to be an integer number of days and microseconds,
assuming 30 days per month and 24 hours per day, e.g., '1.75 months'
becomes 1 mon 22 days 12:00:00. Only seconds will ever be shown as
fractional on output.

Internally interval values are stored as months, days, and
microseconds. This is done because the number of days in a month
varies, and a day can have 23 or 25 hours if a daylight savings time
adjustment is involved.

--
Erik




Re: Understanding years part of Interval

2023-02-06 Thread Erik Wienhold
> On 06/02/2023 18:33 CET Marcos Pegoraro  wrote:
>
> Em seg., 6 de fev. de 2023 às 10:59, Erik Wienhold  escreveu:
> > > On 06/02/2023 12:20 CET Marcos Pegoraro  wrote:
> >  >
> >  > I was just playing with some random timestamps for a week, for a month,
> >  > for a year ...
> >  >
> >  > select distinct current_date+((random()::numeric)||'month')::interval 
> > from generate_series(1,100) order by 1;
> >  > It´s with distinct clause because if you change that 'month' for a 'year'
> >  > it´ll return only 12 rows, instead of 100. So, why years part of interval
> >  > works differently than any other ?
> >  >
> >  > select '1.01 week'::interval; --> 0 years 0 mons 7 days 1 hours 40 mins 
> > 48.00 secs
> >  > select '1.01 month'::interval; --> 0 years 1 mons 0 days 7 hours 12 mins 
> > 0.00 secs
> >  > select '1.01 year'::interval; --> 1 years 0 mons 0 days 0 hours 0 mins 
> > 0.00 secs
> >
> >  Explained in 
> > https://www.postgresql.org/docs/15/datatype-datetime.html#DATATYPE-INTERVAL-INPUT:
> >
> >  Field values can have fractional parts: for example, '1.5 weeks' or
> >  '01:02:03.45'. However, because interval internally stores only
> >  three integer units (months, days, microseconds), fractional units
> >  must be spilled to smaller units. Fractional parts of units greater
> >  than months are rounded to be an integer number of months, e.g.
> >  '1.5 years' becomes '1 year 6 mons'. Fractional parts of weeks and
> >  days are computed to be an integer number of days and microseconds,
> >  assuming 30 days per month and 24 hours per day, e.g., '1.75 months'
> >  becomes 1 mon 22 days 12:00:00. Only seconds will ever be shown as
> >  fractional on output.
> >
> >  Internally interval values are stored as months, days, and
> >  microseconds. This is done because the number of days in a month
> >  varies, and a day can have 23 or 25 hours if a daylight savings time
> >  adjustment is involved.
> >
> I´ve sent this message initially to general and Erik told me it's documented,
> so it's better to hackers help me if this has an explaining why it's done 
> that way.
>
> select '1 year'::interval = '1.05 year'::interval -->true ?
> I cannot agree that this select returns true.

The years are converted to months and the fractional month is rounded half up:

1.05 year = 12.6 month
=> 1 year 0.6 month
=> 1 year 1 month(after rounding)

Compare that to 12.5 months to see when the rounding occurs:

12.5 month / 12 month
=> 1.0416... years

Plug 1.0416 and 1.0417 into the interval to observe the rounding:

=# select '1.0416 year'::interval, '1.0417 year'::interval;
 interval |   interval
----------+--------------
 1 year   | 1 year 1 mon

--
Erik




Re: How to create directory format backup

2023-02-08 Thread Erik Wienhold
> On 08/02/2023 21:59 CET Andrus  wrote:
>
> How to create backup in format from which tables can selectively restored?

Dump as custom-format archive (-F custom) and use that with pg_restore and
options --table or --list/--use-list to select what should be restored.
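
A sketch with hypothetical names:

    pg_dump -F custom -f mydb.dump mydb
    pg_restore -d mydb --table=mytable mydb.dump

    # or select items by editing the archive's table of contents:
    pg_restore --list mydb.dump > mydb.list
    # edit mydb.list, then:
    pg_restore -d mydb --use-list=mydb.list mydb.dump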

--
Erik




Re: How to create directory format backup

2023-02-08 Thread Erik Wienhold
> On 08/02/2023 22:37 CET Andrus  wrote:
>
> > > How to create backup in format from which tables can selectively
> > > restored?
> > >
> > Dump as custom-format archive (-F custom) and use that with pg_restore
> > and options --table or --list/--use-list to select what should be
> > restored.
> >
> How to select tables interactively like pgAdmin allows to select when
> directory format is used ?
>
> Database contains hundreds of schemas. I need to restore public and other
> other schema.
>
> Whole backup file is scanned to restore only two schemas. It takes lot of
> time.

pg_dump also accepts options --table and --schema to only dump what you need.

> Also directory format allows to use all cores with --jobs=32 parameter.
> Dump and partial restore using custom format are much slower.

Run multiple pg_dump processes in parallel where each processes a subset of
tables with the options mentioned above.
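
For example (a sketch; the schema names are placeholders):

    pg_dump -F custom --schema=public -f public.dump mydb &
    pg_dump -F custom --schema=other  -f other.dump  mydb &
    wait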

--
Erik




Re: Multi-column index: Which column order

2023-02-14 Thread Erik Wienhold
> On 14/02/2023 18:53 CET Sebastien Flaesch  wrote:
>
> Hello!
>
> When creating an index on multiple columns, does the order of the columns
> matter? (I guess so)

Maybe, depending on the queries.

> It's mostly for SELECT statements using a condition that include ALL
> columns of the index (pkey):
>
> SELECT * FROM art WHERE etb='L1' and code='ART345'
>
> I would naturally put the columns with the most various values first, and
>
> For example, if the "code" column contains thousands of various item ids
> like 'SXZ874', 'ERF345', ... while the "etb" column contains a dozen of
> values like "L1", "LT" and "BX".
>
> Which one is best?
>
> CREATE UNIQUE INDEX ix1 ON art (code, etb)
> or
> CREATE UNIQUE INDEX ix1 ON art (etb, code)
>
> (or its PRIMARY KEY equivalent)

It should not make any difference for the query above.  It can make a
difference for queries that only filter by the second index column or use
inequality constraints on those columns.
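
For example (a sketch based on the table above):

    -- Can use index (etb, code); with (code, etb) it generally cannot,
    -- because the leading index column is not constrained:
    SELECT * FROM art WHERE etb = 'L1';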

> Does it depend on the type of index (Btree, GiST, etc) ?
>
> I could not find that information in the doc.

Yes, see the documentation on multicolumn indexes with details on how they
are used: https://www.postgresql.org/docs/current/indexes-multicolumn.html

But you're limited to btree anyway if you're only interested in unique
indexes.

--
Erik




Re: DELETE trigger, direct or indirect?

2023-02-16 Thread Erik Wienhold
> On 16/02/2023 14:23 CET Dominique Devienne  wrote:
>
> Hi. This is a bit unusual. We have a foreign key between two tables, with
> ON DELETE CASCADE, to preserve referential integrity. But we apparently
> also need to preserve the severed reference (by natural key, i.e. its name),
> to later on reconnect the two entities after-the-fact, should the parent
> row re-appear later on (in the same transaction or not it still unclear).
>
> To achieve this weird requirement, I'd like to know if it is possible in an
> ON DELETE trigger to know whether the deletion is coming from a direct-DELETE
> in the "child table", or whether the deletion is coming from the "parent
> table" CASCADEd to the child table.

Not to my knowledge.  ON DELETE CASCADE behaves like a manual DELETE on the
child table that happens before the DELETE on the parent table.

The process you describe shows that it's not known until the end of the
transaction which parent rows can be deleted.  You can instead track the
parent rows as candidates for deletion in a temp table.  Insert the primary
key of parent rows if you deem them deletable and delete the primary key if
you detect the opposite.  At the end the temp table only contains IDs of
parent rows that can be deleted for sure.
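
A minimal sketch of that bookkeeping, assuming a parent table "parent" with
primary key "id":

    CREATE TEMP TABLE delete_candidates (parent_id int PRIMARY KEY) ON COMMIT DROP;

    -- mark a parent row as deletable:
    INSERT INTO delete_candidates VALUES (42) ON CONFLICT DO NOTHING;

    -- unmark it if it must be kept after all:
    DELETE FROM delete_candidates WHERE parent_id = 42;

    -- at the end of the transaction:
    DELETE FROM parent WHERE id IN (SELECT parent_id FROM delete_candidates);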

--
Erik




Re: can't get psql authentication against Active Directory working

2023-02-18 Thread Erik Wienhold
> On 18/02/2023 15:02 CET Tomas Pospisek  wrote:
>
> so I'm trying to authenticate psql (on Windows) -> postgres (on Linux)
> via Active Directory.
>
> psql (Linux) -> postgres (Linux) with authentication against Active
> Directory does work.
>
> However the same with psql.exe on Windows does not. I get:
>
>  D:\>C:\OSGeo4W\bin\psql.exe service=the_db
>  psql: error: connection to server at "dbserver.example.lan
>  (192.168.4.104), port 5432 failed: could not initiate GSSAPI
>  security context: No credentials were supplied, or the credentials
>  were unavailable or inaccessible: Internal credentials cache error
>
> psql.exe from the OSGeo4W QGIS Installer *does* include GSS support. (I
> have tried with a different psql.exe without GSS support and it would
> tell me that it does not support GSS).
>
> The .pg_service.conf file in the users $HOME directory looks like this:
>
>  [the_db]
>  host=dbserver.example.lan
>  port=5432
>  user=u...@example.lan
>  gssencmode=require
>
> This same pg_service.conf does work for psql (Linux).

On Windows the service file is not read from $home/.pg_service.conf but from
$env:appdata/postgresql/.pg_service.conf (or %appdata%/postgresql/.pg_service.conf
when using cmd.exe).

--
Erik




Re: ERROR: unsupported Unicode escape sequence - in JSON-type column

2023-02-27 Thread Erik Wienhold
> On 27/02/2023 13:13 CET Laurenz Albe  wrote:
>
> I'd be curious to know how the customer managed to do that.
> Perhaps there is a loophole in PostgreSQL that needs to be fixed.

Probably via some data access layer and not directly via Postgres.  It's easy
to reproduce with psycopg:

import psycopg

with psycopg.connect() as con:
    con.execute('create temp table jsontab (jsoncol json)')
    con.execute(
        'insert into jsontab (jsoncol) values (%s)',
        [psycopg.types.json.Json('\0')],
    )

    with con.execute('select jsoncol from jsontab') as cur:
        print(cur.fetchall())

    try:
        with con.execute('select jsoncol::jsonb from jsontab') as cur:
            pass
        raise AssertionError("jsonb should fail")
    except psycopg.errors.UntranslatableCharacter:
        pass

Another reason to prefer jsonb over json: such inputs are rejected right away.
The documentation states that json does not validate input, in contrast to
jsonb.

Of course the OP now has to deal with json.  The data can be sanitized by
replacing all null character escape sequences:

update jsontab
set jsoncol = replace(jsoncol::text, '\u0000', '')::json
where strpos(jsoncol::text, '\u0000') > 0;

But the data access layer (or whatever got the json into the database) must be
fixed as well to reject or sanitize those inputs in the future.

--
Erik




Re: CREATE/DROP ROLE transactional? GRANT/REVOKE?

2023-03-06 Thread Erik Wienhold
> On 06/03/2023 14:19 CET Dominique Devienne  wrote:
>
> Perhaps I missed it in the doc (e.g. [1]), but are DDLs around ROLEs and
> GRANTs transactional?

Have you tried?  DDL is transactional unless stated otherwise (cf. CREATE DATABASE,
CREATE INDEX CONCURRENTLY, CREATE TABLESPACE).

Run the following psql script:

drop role if exists alice, bob;

\du

begin;
create role alice;
\du
rollback;

\du

begin;
create role alice;
create role bob;
commit;

\du

begin;
grant alice to bob;
\du
rollback;

\du

begin;
drop role alice;
\du
rollback;

\du

Output:

DROP ROLE
                                 List of roles
 Role name |                         Attributes                          | Member of
-----------+-------------------------------------------------------------+-----------
 postgres  | Superuser, Create role, Create DB, Replication, Bypass RLS | {}

BEGIN
CREATE ROLE
                                 List of roles
 Role name |                         Attributes                          | Member of
-----------+-------------------------------------------------------------+-----------
 alice     | Cannot login                                                | {}
 postgres  | Superuser, Create role, Create DB, Replication, Bypass RLS | {}

ROLLBACK
                                 List of roles
 Role name |                         Attributes                          | Member of
-----------+-------------------------------------------------------------+-----------
 postgres  | Superuser, Create role, Create DB, Replication, Bypass RLS | {}

BEGIN
CREATE ROLE
CREATE ROLE
COMMIT
                                 List of roles
 Role name |                         Attributes                          | Member of
-----------+-------------------------------------------------------------+-----------
 alice     | Cannot login                                                | {}
 bob       | Cannot login                                                | {}
 postgres  | Superuser, Create role, Create DB, Replication, Bypass RLS | {}

BEGIN
GRANT ROLE
                                 List of roles
 Role name |                         Attributes                          | Member of
-----------+-------------------------------------------------------------+-----------
 alice     | Cannot login                                                | {}
 bob       | Cannot login                                                | {alice}
 postgres  | Superuser, Create role, Create DB, Replication, Bypass RLS | {}

ROLLBACK
                                 List of roles
 Role name |                         Attributes                          | Member of
-----------+-------------------------------------------------------------+-----------
 alice     | Cannot login                                                | {}
 bob       | Cannot login                                                | {}
 postgres  | Superuser, Create role, Create DB, Replication, Bypass RLS | {}

BEGIN
DROP ROLE
                                 List of roles
 Role name |                         Attributes                          | Member of
-----------+-------------------------------------------------------------+-----------
 bob       | Cannot login                                                | {}
 postgres  | Superuser, Create role, Create DB, Replication, Bypass RLS | {}

ROLLBACK
                                 List of roles
 Role name |                         Attributes                          | Member of
-----------+-------------------------------------------------------------+-----------
 alice     | Cannot login                                                | {}
 bob       | Cannot login                                                | {}
 postgres  | Superuser, Create role, Create DB, Replication, Bypass RLS | {}

> Since I'm creating many ROLEs and making many GRANTs, based info I read from
> PostgreSQL itself (in pg_catalog and elsewhere), should everything be in a
> single transaction?

If it should be atomic and the commands are allowed in transactions, then yes,
use transactions.

--
Erik




Re: psql \conninfo in tabular form?

2023-03-07 Thread Erik Wienhold
> On 07/03/2023 18:58 CET Ron  wrote:
>
> v13.10
>
> Instead of a sentence like this:
> You are connected to database "postgres" as user "postgres" via socket in
> "/var/run/postgresql" at port "5433".
>
> I'd rather have something tabular like:
>     keyword  |   value
> +---
>    database  | postgres
>    user      | postgres
>    host      | /var/run/postgresql
>    port      |  5433

Define variable conninfo with the query in ~/.psqlrc:

\set conninfo 'select * from (values (''database'', current_database()), (''user'', session_user), (''host'', coalesce(inet_server_addr()::text, current_setting(''unix_socket_directories''))), (''port'', coalesce(inet_server_port()::text, current_setting(''port'')))) t(keyword, value);'

And run it like that:

postgres=# :conninfo
 keyword  |      value
----------+-----------------
 database | postgres
 user     | ewie
 host     | /run/postgresql
 port     | 5432
(4 rows)

--
Erik




Re: Removing trailing zeros (decimal places) from a numeric (pre trim_scale()) with unexpected behaviour

2023-03-15 Thread Erik Wienhold
> On 15/03/2023 14:51 CET magog...@web.de wrote:
>
> I want to remove not needed decimal places / trailing zeros from a numeric.
> I know this can be done starting PG >=13 with TRIM_SCALE(numeric) which would
> solve my issue (with an additional CAST to TEXT at the end).  Unfortunately
> the production database is still running with PostgreSQL 12.x and this is
> something I currently can't change.
>
> So to get rid of the not needed decimal places I tried TO_CHAR(..., 'FM')
> in combination with TRUNC() as shown below with examples. This does not remove
> the decimal places separator if the complete scale digits are zero (60.000).

Cast the to_char result to numeric and then to text.  This will also remove
trailing zeros.

select
  to_char('60.000'::numeric, 'FM999.999')::numeric::text,
  to_char('60.100'::numeric, 'FM999.999')::numeric::text;

 to_char | to_char
---------+---------
 60      | 60.1
(1 row)

> The current behaviour might be intentional but it 'smells like a bug' to me.

It follows Oracle's to_char behavior:

select to_char('60.000', 'FM999.999') from dual;

TO_CHAR('60.000','FM999.999')
-----------------------------
60.

--
Erik




Re: NULL pg_database.datacl

2023-03-20 Thread Erik Wienhold
> On 20/03/2023 11:52 CET Dominique Devienne  wrote:
>
> Hi. I'm surprised, I thought ACLs would never be empty for a database.
> Does that mean nobody can connect to this database?
> I guess SUPERUSER and/or its datDBA can?
> What does a NULL AclItem[] mean exactly?

It means that the object has default privileges (before any GRANT or REVOKE
is executed).  For databases this means full privileges for the database owner
and the CONNECT and TEMPORARY privileges for PUBLIC.  So any user can connect
if allowed by pg_hba.conf.

https://www.postgresql.org/docs/current/ddl-priv.html

--
Erik




Re: NULL pg_database.datacl

2023-03-20 Thread Erik Wienhold
> On 20/03/2023 13:50 CET Dominique Devienne  wrote:
>
> On Mon, Mar 20, 2023 at 1:18 PM Erik Wienhold  wrote:
> > > On 20/03/2023 11:52 CET Dominique Devienne  wrote:
> > > What does a NULL AclItem[] mean exactly?
> >
> >  It means that the object has default privileges (before any GRANT or REVOKE
> >  is executed). For databases this means full privileges for the database 
> > owner
> >  and the CONNECT and TEMPORARY privileges for PUBLIC. So any user can 
> > connect
> >  if allowed by pg_hba.conf.
> >
> >  https://www.postgresql.org/docs/current/ddl-priv.html
>
> Thanks Erik. But then, how come aclexplode() is not showing these
> default/implicit privileges?
>
> Is there a SQL function returning those per-type default provileges?
> That I could then coalesce() datacl with?

Use acldefault.  Pass in ownerId=0 to get the privileges for PUBLIC.

https://www.postgresql.org/docs/current/functions-info.html#FUNCTIONS-ACLITEM-FN-TABLE
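
For example, to get the effective privileges with the defaults as fallback
(a sketch):

    SELECT coalesce(datacl, acldefault('d', datdba))
    FROM pg_database
    WHERE datname = current_database();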

--
Erik




Re: Binding Postgres to port 0 for testing

2023-03-25 Thread Erik Wienhold
> On 25/03/2023 18:01 CET Markus Pilman  wrote:
>
> I am building a simple integration test framework for an application that I
> am building. For this project I am planning to use PostgreSQL.
>
> For testing purposes I'd like to always start with an empty database,
> populate data, and, if the test was successful, delete everything. These
> tests are single process tests.
>
> I'd like to run many processes in parallel and have one postgres server
> process running for each. I realize that I could use one and use multiple
> databases but I don't want to do this for a variety of reasons (one being
> that I also want to test the control plane logic, the application is
> multi-tenant and uses a database per tenant, having separate databases
> simplifies debugging etc).
>
> Now the problem is that I need to find a TCP port for each running postgres
> instance. There's multiple ways to do this, but by far the easiest one I
> know is to bind to port 0. So my plan was to start postgres with "-p 0" and
> then parse stdout to figure out which port it actually uses. But that
> doesn't seem to work:
>
> >postgres -D data/ -p 0
>  2023-03-25 16:39:54.271 GMT [13924] FATAL: 0 is outside the valid range for 
> parameter "port" (1 .. 65535)
>
> What would be the recommended way of addressing my issue?

I would try to start Postgres with every port number in a for loop starting
with port number 1024.  The first one that works is your port number.  And you
may not even have to parse stdout if you can pass that port number to your 
tests.
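
A rough sketch in bash, reusing the data directory from your example:

    for port in $(seq 1024 65535); do
      if pg_ctl -D data -o "-p $port" -w start; then
        echo "listening on port $port"
        break
      fi
    done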

Maybe you can also use pg_virtualenv[0] from Debian's postgresql-common.  It
tries every port number starting from 5432.

[0] 
https://manpages.debian.org/testing/postgresql-common/pg_virtualenv.1.en.html

--
Erik




Re: Binding Postgres to port 0 for testing

2023-03-25 Thread Erik Wienhold
> On 25/03/2023 20:10 CET Markus Pilman  wrote:
>
> Thanks for the suggestions. I didn't know about pg_virtualenv, that's
> interesting. Though it seems to achieve something similar as to containerize
> the test (potentially in a more platform independent way). Though it seems
> pg_virtualenv is mostly doing what my test driver is currently doing. Trying
> out the ports is obviously possible, but it seems a bit hacky to me (though
> if there's no better way I don't think that's a good show-stopper).

You can of course also use Docker and have it map port 5432 to a random host
port.  Use docker-port to find the mapped host port:

docker port CONTAINER 5432/tcp

Testcontainers may also be an option if you want to use Docker:

* https://www.testcontainers.org/modules/databases/postgres/
* https://testcontainers-python.readthedocs.io/en/latest/postgres/README.html

> But I am still wondering: Is there a reason PostgreSQL doesn't allow me to
> bind against port 0? I understand that in a production environment this is
> almost never the thing you want to do, but I wouldn't consider this option
> very dangerous.

One reason for not allowing port zero is Postgres' naming convention of Unix
domain sockets.  The port number is included in the socket filename.

https://www.postgresql.org/docs/current/runtime-config-connection.html#GUC-UNIX-SOCKET-DIRECTORIES

Accepting port zero for a Unix domain socket would not behave the same as
binding a TCP socket to port zero.

Another benefit is that the bound port number is available through the config.
Postgres does not have to keep track of any "random" port number picked by the
operating system.

--
Erik




Re: ​jsonb @@ jsonpath operator doc: ​Only the first item of the result is taken into account

2023-04-02 Thread Erik Wienhold
> On 01/04/2023 08:02 CEST jian he  wrote:
>
> Hi,
> https://www.postgresql.org/docs/current/functions-json.html
> > jsonb @@ jsonpath → boolean
> > Returns the result of a JSON path predicate check for the specified JSON
> > value. Only the first item of the result is taken into account. If the
> > result is not Boolean, then NULL is returned.
> > '{"a":[1,2,3,4,5]}'::jsonb @@ '$.a[*] > 2' → t
>
> select jsonb_path_query('{"a":[1,2,3,4,5]}', '$.a[*]');
> return
>
> > jsonb_path_query
> > --
> >  1
> >  2
> >  3
> >  4
> >  5
> > (5 rows)
>
> I don't understand:"Only the first item of the result is taken into account.".
>
> Here,JSON path predicate check for the specified JSON valuereturn true, some
> return false. (1 > 2 is false, 2 > 2 is false).

The result is true if any array element matches the predicate because predicates
are evaluated on sequences.  The documentation for executePredicate in
src/backend/utils/adt/jsonpath_exec.c explains it:

> Predicates have existence semantics, because their operands are item
> sequences.  Pairs of items from the left and right operand's sequences are
> checked.  TRUE returned only if any pair satisfying the condition is found.
> In strict mode, even if the desired pair has already been found, all pairs
> still need to be examined to check the absence of errors.  If any error
> occurs, UNKNOWN (analogous to SQL NULL) is returned.

https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/utils/adt/jsonpath_exec.c;h=b561f0e7e803f0e5a546ad118a47f625225b9708;hb=HEAD#l1461

Difference between using a predicate as path expression vs filter expression:

=# select jsonb_path_query('{"a":[1,2,3,4,5]}', '$.a[*] > 2');
 jsonb_path_query
------------------
 true
(1 row)

=# select jsonb_path_query('{"a":[1,2,3,4,5]}', '$.a[*] ? (@ > 2)');
 jsonb_path_query
------------------
 3
 4
 5
(3 rows)

If you want the predicate result for each element, you must apply the predicate
to the rows returned from jsonb_path_query:

=# select elem, elem::float > 2 as pred
   from jsonb_path_query('{"a":[1,2,3,4,5]}', '$.a[*]') elem;
 elem | pred
------+------
 1    | f
 2    | f
 3    | t
 4    | t
 5    | t
(5 rows)

--
Erik




Re: DEFINER / INVOKER conundrum

2023-04-03 Thread Erik Wienhold
> On 03/04/2023 13:18 CEST Dominique Devienne  wrote:
>
> My goal is to have clients connect to PostgreSQL,
> and call a function that return a JWT token.
>
> The JWT is supposed to capture the user (login role),
> and the current_role (which has meaning in our app),
> and sign it using a secret corresponding to a mid-tier
> service the client will connect to later.
>
> I've found https://github.com/michelp/pgjwt which seems
> perfect for my use case, but I'm struggling with something.
>
> On the one hand, I want a INVOKER security function,
> to be able to capture the login and current ROLEs.
>
> On the other hand, I want a DEFINER security function,
> to be able to access the secret to sign the JWT with.
>
> That secret will be in a table that regular users of our DB
> do NOT have access to, of course. But that the function
> doing the JWT signing does need access to, of course (again).
>
> I thought I'd have two layers of functions, one INVOKER
> that captures the ROLEs, which then calls the DEFINER one,
> passing the ROLEs captured, but since the INVOKER function
> must also be able to call the DEFINER function, what prevents
> the client from calling it directly, with different (spoofed) ROLEs?
>
> Is there a way out of that conundrum?

A single DEFINER function works if you capture current_user with a parameter
and default value.  Let's call it claimed_role.  Use pg_has_role[0] to check
that session_user has the privilege for claimed_role (in case the function is
called with an explicit value), otherwise raise an exception.

Connect as postgres:

CREATE FUNCTION f(claimed_role text DEFAULT current_user)
  RETURNS TABLE (claimed text, curr_user text, sess_user text)
  SECURITY DEFINER
  LANGUAGE sql
  AS $$ SELECT claimed_role, current_user, session_user $$;

Connect as alice:

SELECT * FROM f();

 claimed | curr_user | sess_user
---------+-----------+-----------
 alice   | postgres  | alice
(1 row)

[0] 
https://www.postgresql.org/docs/current/functions-info.html#FUNCTIONS-INFO-ACCESS-TABLE
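
A sketch of the pg_has_role check described above, this time in plpgsql:

    CREATE FUNCTION g(claimed_role text DEFAULT current_user)
      RETURNS text
      SECURITY DEFINER
      LANGUAGE plpgsql
      AS $$
    BEGIN
      IF NOT pg_has_role(session_user, claimed_role, 'MEMBER') THEN
        RAISE EXCEPTION 'role % not granted to %', claimed_role, session_user;
      END IF;
      RETURN claimed_role;
    END $$;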

--
Erik




Re: ​jsonb @@ jsonpath operator doc: ​Only the first item of the result is taken into account

2023-04-03 Thread Erik Wienhold
> On 02/04/2023 17:40 CEST Adrian Klaver  wrote:
>
> That is a long way from:
>
> jsonb @@ jsonpath → boolean
>
> Returns the result of a JSON path predicate check for the specified JSON
> value. Only the first item of the result is taken into account. If the
> result is not Boolean, then NULL is returned.

What do you mean?  I responded to the OP's question.  It's not a suggestion
to update the docs.  Obviously it's quite a mouthful and needs to be boiled
down for the docs.  Any suggestions?

--
Erik




Re: ​jsonb @@ jsonpath operator doc: ​Only the first item of the result is taken into account

2023-04-03 Thread Erik Wienhold
> On 03/04/2023 17:36 CEST Adrian Klaver  wrote:
>
> On 4/3/23 08:11, Erik Wienhold wrote:
> >> On 02/04/2023 17:40 CEST Adrian Klaver  wrote:
> >>
> >> That is a long way from:
> >>
> >> jsonb @@ jsonpath → boolean
> >>
> >> Returns the result of a JSON path predicate check for the specified JSON
> >> value. Only the first item of the result is taken into account. If the
> >> result is not Boolean, then NULL is returned.
> >
> > What do you mean?  I responded to the OP's question.  It's not a suggestion
> > to update the docs.  Obviously it's quite a mouthful and needs to be boiled
> > down for the docs.  Any suggestions?
>
> For me I don't see how:
>
> Predicates have existence semantics, because their operands are item
> sequences.  Pairs of items from the left and right operand's sequences
> are checked.  TRUE returned only if any pair satisfying the condition is
> found. In strict mode, even if the desired pair has already been found,
> all pairs still need to be examined to check the absence of errors.  If
> any error occurs, UNKNOWN (analogous to SQL NULL) is returned.
>
> resolves to :
>
> Only the first item of the result is taken into account.
>
> In other words reconciling "TRUE returned only if any pair satisfying
> the condition is found."  and "...first item of the result..."

I see.

Thinking about it now, I believe that "first item of the result" is redundant
(and causing the OP's confusion) because the path predicate produces only a
single item: true, false, or null.  That's what I wanted to show with the first
two jsonb_path_query examples in my initial response, where the second example
returns multiple items.

I think the gist of @@ and json_path_match is:

"Returns true if any JSON value at the given path matches the predicate.
 Returns NULL when not a path predicate or comparing different types."

--
Erik




Re: ​jsonb @@ jsonpath operator doc: ​Only the first item of the result is taken into account

2023-04-03 Thread Erik Wienhold
> On 03/04/2023 18:37 CEST Adrian Klaver  wrote:
>
> On 4/3/23 09:21, Erik Wienhold wrote:
> >> On 03/04/2023 17:36 CEST Adrian Klaver  wrote:
> >>
> >> On 4/3/23 08:11, Erik Wienhold wrote:
> >>>> On 02/04/2023 17:40 CEST Adrian Klaver  wrote:
> >>>>
> >>>> That is a long way from:
> >>>>
> >>>> jsonb @@ jsonpath → boolean
> >>>>
> >>>> Returns the result of a JSON path predicate check for the specified JSON
> >>>> value. Only the first item of the result is taken into account. If the
> >>>> result is not Boolean, then NULL is returned.
> >>>
> >>> What do you mean?  I responded to the OP's question.  It's not a 
> >>> suggestion
> >>> to update the docs.  Obviously it's quite a mouthful and needs to be 
> >>> boiled
> >>> down for the docs.  Any suggestions?
> >>
> >> For me I don't see how:
> >>
> >> Predicates have existence semantics, because their operands are item
> >> sequences.  Pairs of items from the left and right operand's sequences
> >> are checked.  TRUE returned only if any pair satisfying the condition is
> >> found. In strict mode, even if the desired pair has already been found,
> >> all pairs still need to be examined to check the absence of errors.  If
> >> any error occurs, UNKNOWN (analogous to SQL NULL) is returned.
> >>
> >> resolves to :
> >>
> >> Only the first item of the result is taken into account.
> >>
> >> In other words reconciling "TRUE returned only if any pair satisfying
> >> the condition is found."  and "...first item of the result..."
> >
> > I see.
> >
> > Thinking about it now, I believe that "first item of the result" is 
> > redundant
> > (and causing the OP's confusion) because the path predicate produces only a
> > single item: true, false, or null.  That's what I wanted to show with the 
> > first
> > two jsonb_path_query examples in my initial response, where the second 
> > example
> > returns multiple items.
> >
> > I think the gist of @@ and json_path_match is:
> >
> > "Returns true if any JSON value at the given path matches the predicate.
> >   Returns NULL when not a path predicate or comparing different types."
>
> So basically a variation of jsonb @? jsonpath that returns NULL instead
> of false when confused:
>
> select '{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] ? (@ == "test")' ;
> ?column?
> --
>   f

The filter expression does not match any values because predicate '@ == "test"'
returns unknown.  This follows SQL's three-valued logic.

"? (condition)

 [...] The result of that step is filtered to include only those items that
 satisfy the provided condition. SQL/JSON defines three-valued logic, so the
 condition can be true, false, or unknown. The unknown value plays the same role
 as SQL NULL and can be tested for with the is unknown predicate. Further path
 evaluation steps use only those items for which the filter expression returned
 true."https://www.postgresql.org/docs/current/functions-json.html

> select '{"a":[1,2,3,4,5]}'::jsonb @@ '$.a[*] == "test"' ;
>   ?column?
> --
>   NULL

In this case @@ returns null because the predicate returns unknown for all array
elements.  It gets interesting in strict mode.

Lax mode (default) with an array element of matching type found by the 
predicate:

select '{"a":[1,2,3,4,5,"test"]}'::jsonb @@ '$.a[*] == "test"';
 ?column?
--
 t
(1 row)

In strict mode the unknown result for the first array element causes the
predicate evaluation to short-circuit and return unknown right away instead
of testing the remaining elements:

select '{"a":[1,2,3,4,5,"test"]}'::jsonb @@ 'strict $.a[*] == "test"';
 ?column?
--
 NULL
(1 row)

> Otherwise it does the same thing:
>
> select '{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] ? (@ > 2)' ;
>   ?column?
> --
>   t
>
>   select '{"a":[1,2,3,4,5]}'::jsonb @@ '$.a[*] > 2' ;
>   ?column?
> --
>   t

Yes, if the LHS and RHS types match.  The equivalence is also supported by
documentation in src/backend/utils/adt/jsonb_gin.c:

The operators support, among the others, "jsonb @? jsonpath" and
"jsonb @@ jsonpath".  Expressions containing these operators are easily
expressed through each other.

jb @? 'path' <=> jb @@ 'EXISTS(path)'
jb @@ 'expr' <=> jb @? '$ ? (expr)'

https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/utils/adt/jsonb_gin.c;h=e941439d7493365f8954c791f0e2368c080189b8;hb=HEAD#l15

--
Erik




Re: DEFINER / INVOKER conundrum

2023-04-04 Thread Erik Wienhold
> On 04/04/2023 07:55 CEST walt...@technowledgy.de wrote:
>
> Erik Wienhold:
> > A single DEFINER function works if you capture current_user with a parameter
> > and default value.  Let's call it claimed_role.  Use pg_has_role[0] to check
> > that session_user has the privilege for claimed_role (in case the function 
> > is
> > called with an explicit value), otherwise raise an exception.
> >
> > Connect as postgres:
> >
> > CREATE FUNCTION f(claimed_role text DEFAULT current_user)
> >   RETURNS TABLE (claimed text, curr_user text, sess_user text)
> >   SECURITY DEFINER
> >   LANGUAGE sql
> >   AS $$ SELECT claimed_role, current_user, session_user $$;
>
> For me, checking whether session_user has the privilege for claimed_role
> is not enough, so I add a DOMAIN to the mix:
>
> CREATE DOMAIN current_user_only AS NAME CHECK (VALUE = CURRENT_USER);
>
> CREATE FUNCTION f(calling_user current_user_only DEFAULT CURRENT_USER)
> ...
> SECURITY DEFINER;
>
> This works, because the domain check is evaluated in the calling context.

Nice.  It's equivalent to my version without the domain if the client can
execute SET ROLE before calling f, thereby injecting any role for which
pg_has_role(session_user, calling_user, 'MEMBER') returns true.

Dominique did not say whether he controls the clients or not.

--
Erik




Re: ​jsonb @@ jsonpath operator doc: ​Only the first item of the result is taken into account

2023-04-04 Thread Erik Wienhold
> On 04/04/2023 03:50 CEST jian he  wrote:
>
> > "Returns true if any JSON value at the given path matches the predicate.
> >  Returns NULL when not a path predicate or comparing different types."
>
> in first sentence, should we add something "otherwise return false." ?

I omitted the "otherwise false" part because of the corner cases which I did
not want to gloss over.  But the corner cases also apply if the predicate
matches some value, depending on strict mode, as I noticed later and wrote in
my previous message.

Suggestion:

"Returns true if any JSON value at the given path matches the predicate,
 otherwise returns false.  Unless the predicate compares different types
 (depending on strict mode) or the jsonpath is not a path predicate, in
 which case NULL is returned."

I guess it's best to document the corner cases in detail in the notes section
as Adrian pointed out and have the function doc refer to the notes.

> also, should it be "Return true"? (since only one value returned)?

The third-person singular "returns" is correct in this case.  It does not refer
to the number of returned values.

--
Erik




Re: Possible old and fixed bug in Postgres?

2023-04-05 Thread Erik Wienhold
> On 05/04/2023 11:18 CEST Steve Rogerson  
> wrote:
>
> I was looking at perl CPAN Module (DateTime::Format::Pg) and saw that it did 
> something that seemed odd to me with time zones, based on the comment:
>
>      # For very early and late dates, PostgreSQL always returns times in
>      # UTC and does not tell us that it did so.
>
> Early is before 1901-12-14 and late after 2038-01-18
>
> A quick test setting my time zone to be America/Chicago I got
>
> select '1900-01-01 00:00:00'::timestamptz;
>    timestamptz
> 
>   1900-01-01 00:00:00-06
> (1 row)
>
> and
>
> select '2040-01-01 00:00:00'::timestamptz;
>    timestamptz
> 
>   2040-01-01 00:00:00-06
>
>
> These seemed correct to me. I'm guessing this might have been a bug/feature 
> of 
> pg in the long ago.

Judging by the commit message and changed test cases, probably:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=921d749bd4c34c3349f1c254d5faa2f1cec03911

--
Erik




Re: Call a Normal function inside a Trigger Function

2023-04-16 Thread Erik Wienhold
> On 16/04/2023 16:18 CEST FOUTE K. Jaurès  wrote:
>
> Is it possible to call a function inside a trigger function ?
> Any idea or link are welcome. Thanks in advance

Depends on what you want to do with the return value.  Use PERFORM to ignore
the result. [0]  Use SELECT INTO to handle a single-row result. [1]

PERFORM myfunc();
SELECT myfunc() INTO myresult;

[0] 
https://www.postgresql.org/docs/current/plpgsql-statements.html#PLPGSQL-STATEMENTS-GENERAL-SQL
[1] 
https://www.postgresql.org/docs/current/plpgsql-statements.html#PLPGSQL-STATEMENTS-SQL-ONEROW
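
For example, a minimal trigger function that calls myfunc only for its side
effects (a sketch):

    CREATE FUNCTION mytrigger() RETURNS trigger
      LANGUAGE plpgsql
      AS $$
    BEGIN
      PERFORM myfunc();
      RETURN NEW;
    END $$;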

--
Erik




Re: Suppress logging of "pg_hba.conf rejects connection for host"

2023-04-16 Thread Erik Wienhold
> On 16/04/2023 17:02 CEST ertan.kucuko...@1nar.com.tr wrote:
>
> One of the systems running PostgreSQL 14.7 receive a lot of lines like in
> the subject. I have below pg_hba.conf line and that line causes these to be
> logged.
>
> host all all 0.0.0.0/0 reject
>
> If possible, I do not want to see these lines in my logs. But, I failed to
> find a parameter for it.
>
> Is it possible to turn this specific message logging off?

There's no special config for this specific error message.  It is logged as
FATAL so the only way to silence it *and any other messages from DEBUG5 to 
FATAL*
is to set log_min_messages = PANIC.  I don't recommend it.  It also complicates
troubleshooting failing connections in the future if you don't log this message.

When logging to syslog you may be able to discard specific messages.
rsyslog has property-based filters[0] for example:

:msg, contains, "pg_hba.conf rejects connection for host" ~

You should also investigate the clients that try connecting ("a lot" as you
write) and figure out why they keep connecting if you want to reject their
attempts anyway.

[0] 
https://rsyslog.readthedocs.io/en/latest/configuration/filters.html#property-based-filters

--
Erik




Re: psql:t_mstr.sql:994: ERROR: function to_char(numeric) does not exist

2023-04-19 Thread Erik Wienhold
> On 19/04/2023 15:24 CEST gzh  wrote:
>
> Hi,
>
> I upgraded the version of PostgreSQL from 12.10 to 12.13,

Better upgrade to latest release 12.14.

> when I insert data into the t_mstr table, the to_char function in the t_mstr's
> trigger caused the following error.
>
> psql:t_mstr.sql:994: ERROR: function to_char(numeric) does not exist
>
> There is no problem before the upgrade and to_char(numeric) function comes
> from the Orafce extension.
> The configuration of the old and new databases is as follows.
>
> Database server (old): PostgreSQL 12.10(orafce3.15)
> Database server (new): PostgreSQL 12.13(orafce3.24)
>
> The new database has successfully installed the orafce 3.24 extension.
> It does not occur in "PostgreSQL 12.10 and orafce 3.15",
> but occurs in "PostgreSQL 12.13 and orafce 3.24",
> so either the difference between 12.10 and 12.13 or the difference between
> orafce 3.15 and 3.24 is suspicious.
>
> What is the reason for the problem?

orafce 3.22 moved functions to schema oracle:

https://github.com/orafce/orafce/blob/b492a0f50d5ee866c9870f886401d7c34ad8ccb3/NEWS#L4,L5
https://github.com/orafce/orafce/commit/86a1b51440ca33a04ef47fc3bb704dee26d16753
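
To fix it, either qualify the calls or put schema oracle back on the search
path, e.g. per database (a sketch; mydb is a placeholder):

    ALTER DATABASE mydb SET search_path = "$user", public, oracle;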

--
Erik




Re: Question about accessing partitions whose name includes the schema name and a period - is this correct?

2023-04-19 Thread Erik Wienhold
> On 20/04/2023 00:50 CEST Jay Stanley  wrote:
>
> postgres=# alter table my_schema.my_table drop partition 
> "my_schema"."my_schema.my_table_should_not_work";
> ERROR: syntax error at or near ""my_schema""
> LINE 1: alter table my_schema.my_table drop partition "my_schema"."m...
>^
> -or-
> postgres=# alter table my_schema.my_table drop partition 
> my_schema."my_schema.my_table_should_not_work";
> ERROR: syntax error at or near "my_schema"
> LINE 1: alter table my_schema.my_table drop partition my_schema."my_...

The command you're looking for is ALTER TABLE DETACH PARTITION.

DROP PARTITION means dropping the column named "partition".  The syntax error
comes from the parser expecting an optional CASCADE or RESTRICT after DROP 
PARTITION.
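
For example (a sketch with the names from your session):

    ALTER TABLE my_schema.my_table
      DETACH PARTITION my_schema."my_schema.my_table_should_not_work";
    DROP TABLE my_schema."my_schema.my_table_should_not_work";  -- if desired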

--
Erik




Re: missing something about json syntax

2023-04-20 Thread Erik Wienhold
> On 20/04/2023 18:35 CEST Marc Millas  wrote:
>
> Hi,
>
> postgres 15
>
> looks Iike I am missing something, maybe obvious :-(
> In a table with a json column (_data) if I ask psql to select _data from
> mytable with a where clause to get only one line,, I get something beginning
> by
> {"time":"2023-04-19T16:28:01.19780551+02:00","stream":"stderr","_p":"F","log":"{\"level\":\"info\",\"ts\":\"2023-04-19T14:28:01Z\",\"logger\":\"_audit\",\"msg\":\"record\",\"logging_pod\":\"cluster-pgsql\",\"record\":{\"log_time\":\"2023-04-19
>  14:28:01.197 UTC\",\
> etc...

The value of property "log" is a string, not an object.  Notice the escaped
double quotes (\").

> if I create table anothertable as select _data as _data from mytable, it
> creates and feed that new table with all the appropriate data, and when I ask
> psql \d anothertable it says that its a table with a json column.named _data.
> fine !
>
> now if I select json_object_keys(_data) from mytable, I get a list of tags.
> time, stream, _p, log, fine.
> now, if i select json_object_keys(_data) from anothettable, I get an error:
> cannot call json_objet_keys on a scalar..
>
> ???
> both columns are fed and of type json. and postgres didn't throw any error
> feeding them.
> if I create a table with a jsonb column and feed it with the anothertable json
> column, same, fine... but still unusable.
>
> and unusable with all the other ways I did try, like simply
> select _data->'log'->>'level' from mytable, or
> select _data->'level' from anothertable
>
> sure if I look at the json field one is showed { "tag": "value", ...
> and the other is showed "{\"tag\":\"value\", ...

You executed

create table anothertable as select _data->'log' as _data from mytable;

and not

create table anothertable as select _data as _data from mytable;

So you end up with the scalar value of property "log" in anothertable._data.

> not the very same
>
> so 2 questions:
> 1) how postgres can feed a json or jsonb column and CANNOT use the values in
>it ??
> 2) how to "transform" the inappropriate json into a usable one ?
>
> of course, if what I am missing is very obvious, I apologize...

Get the log value with operator ->> and cast the returned text to json:

select (_data->>'log')::json->'level' from mytable;

--
Erik




Re: FW: Error!

2023-04-24 Thread Erik Wienhold
> On 25/04/2023 01:34 CEST Adrian Klaver  wrote:
>
> On 4/24/23 16:16, Arquimedes Aguirre wrote:
> > I got question, because can’t you send a screenshots, with the image
> > it’s much easier to identify the error or problem and you can also read
> > the message, I don’t understand?
>
> Many folks on this list only use text email so screenshots have to be
> opened in another program. Also it is simple to get the text. In the
> terminal/console right click and click on Select All and then hit Enter
> to copy the content and then paste to your email.

And this list is text searchable which helps others who may stumble on the
same or a similar issue in the future.

--
Erik




Re: murmur3 hash binary data migration from Oracle to PostgreSQL

2023-04-24 Thread Erik Wienhold
> On 25/04/2023 03:21 CEST Jagmohan Kaintura  wrote:
>
> We are doing Migration from Oracle to PostgreSQL. In SOurce database we have
> Binary data stored using murmur3 hashing function. In Oracle this data is
> being generated from the Java code and inserted into the Oracle database.

Do you store the hash and the binary data?  The hash is a key to the binary
data?

> As part of Migration processes the reference data on which this murmur3 is
> generated is also getting changed while migrating to PostgreSQL.

Why is the data changing during migration?  Shouldn't a migration preserve
the data and only adapt it if the database model needs to change?

> In PostgreSQL do we have any mechanism for fetching this murmur3 hash
> function for any UUID.

I don't understand what you mean by that.  What does it have to do with UUID?

Do you want to generate the MurmurHash in Postgres?  Postgres has no builtin
support for that hash function and I can't find any extension in a quick
online search.

Or do you want to just look up rows by the MurmurHash?  That's a trivial
SELECT statement.  Store the hash in an indexed column of type bytea to have
performant lookups.
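
A sketch of such a lookup table (hypothetical names; the hash is computed by
the application):

    CREATE TABLE blobs (
      hash bytea PRIMARY KEY,  -- MurmurHash3 of the reference data
      data bytea NOT NULL
    );

    SELECT data FROM blobs WHERE hash = '\x1a2b3c4d';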

--
Erik




Re: murmur3 hash binary data migration from Oracle to PostgreSQL

2023-04-25 Thread Erik Wienhold
> On 25/04/2023 12:44 CEST Jagmohan Kaintura  wrote:
>
> No we want to generate murmur3 format only.
>
> > On 25-Apr-2023, at 8:52 AM, Jeffrey Walton  wrote:
> >
> > My apologies if I misparsed a couple of statements. I am having
> > trouble determining if you are migrating away from Murmur3.
> >
> > If you are selecting a new digest, then SipHash would be an excellent
> > choice. It was designed to avoid collisions and be fast. Plus it was
> > designed by Jean-Philippe Aumasson and Daniel J. Bernstein. It doesn't
> > get much better than those two fellows.

I don't understand.  In your original message you write:

> On 25/04/2023 03:21 CEST Jagmohan Kaintura  wrote:
>
> In Oracle this data is being generated from the Java code and inserted into
> the Oracle database.

Do you generate the hash in Java or in Oracle?  I don't know if Oracle
Database provides a MurmurHash function but in Java you could use
apache-commons[0] and change your Java code to also insert the hash (if you
don't already do so).

[0] 
https://commons.apache.org/proper/commons-codec/apidocs/org/apache/commons/codec/digest/MurmurHash3.html

PS: Please don't top post. 
https://wiki.postgresql.org/wiki/Mailing_Lists#Email_etiquette_mechanics

--
Erik




Re: psql:t_mstr.sql:994: ERROR: function to_char(numeric) does not exist

2023-04-25 Thread Erik Wienhold
> On 25/04/2023 13:34 CEST gzh  wrote:
>
> >The solution is the same whether you upgrade or not: you need
> >to adjust your search_path to include the "oracle" schema,
> >or else explicitly qualify references to orafce functions.
> Thank you very much for your help.
>
> To use the to_date functions of Orafce 3.0.1, we created the following
> to_date function in the public schema of the old database.
>
> -
> CREATE OR REPLACE FUNCTION to_date(str text) RETURNS date AS $$ SELECT 
> $1::date; $$ LANGUAGE sql IMMUTABLE STRICT; COMMENT ON FUNCTION 
> public.to_date(text) IS 'Convert string to date';
> -
>
> To avoid using a to_date function with the same name and parameter in the
> pg_catalog schema first, the search_path of the old database is set as
> follows:
>
> "$user", public, pg_catalog
>
> Make sure that public is searched before pg_catalog.
> After the database is upgraded, in order to solve the changes in Oracle
> 3.24, we have added oracle schema to the search_path, as shown below:
>
> "$user", public, oracle, pg_catalog
>
> The following error occurred when I ran my application.
>
> 42P13:ERROR:42P13: return type mismatch in function declared to return
> pg_catalog.date
>
> When I put the oracle schema at the end of the search_path, the problem was
> solved.
> The search_path settings without problems are as follows:
>
> "$user", public, pg_catalog, oracle
>
> Why does it report an error when i put oracle between public and pg_catalog?

When you created function to_date(text) your search_path was probably

"$user", public, pg_catalog

Thereby the function was created with return type pg_catalog.date and without
a search_path setting.

The cast to date in the function body, however, is unqualified and thus relies
on the session search_path.  When adding oracle to the session search_path
before pg_catalog, the cast will be to oracle.date (orafce defines its own
date type) instead of pg_catalog.date.  The function return type, however, is
still declared as pg_catalog.date.

To fix this create the function with an explicit search_path, i.e.

CREATE FUNCTION to_date(text)
  RETURNS oracle.date
  SET search_path = oracle
  ...

Or write the cast as $1::oracle.date to not rely on the search_path at all.
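
A complete sketch of that second variant:

    CREATE OR REPLACE FUNCTION to_date(str text)
      RETURNS oracle.date
      AS $$ SELECT $1::oracle.date $$
      LANGUAGE sql IMMUTABLE STRICT;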

--
Erik




Re: Differential Backups in Windows server

2023-04-27 Thread Erik Wienhold
> On 27/04/2023 13:24 CEST Rajmohan Masa  wrote:
>
> Is it possible to take differential Backup inWindows Server ?If possible
> please explain clearly?

Not to my knowledge.  Postgres itself only allows full backups via pg_basebackup
or pg_dumpall.

Barman[0] allows differential backup via rsync+ssh which is, however,
not supported on Windows.  Barman supports Windows with pg_basebackup.

pgBackRest[1] and pg_rman[2] also provide differential backup but neither one
supports Windows.

I found SQLBackupAndFTP when searching for a Windows solution.  It advertises
with support for differential backups, but that's possible with SQL Server
only. [3]

I went with Barman to have PITR on that one Windows server I have to manage
and accept that it only allows full backups.  My Linux Postgres clusters are
covered by rsync+ssh though.

> I tried with different scenarios but I'm unable to take Diff Backup of the
> postgresql database in Windows server.

What have you tried?

[0] https://pgbarman.org
[1] https://pgbackrest.org/
[2] https://github.com/ossc-db/pg_rman
[3] https://sqlbackupandftp.com/features

--
Erik




Re: Re:Re: psql:t_mstr.sql:994: ERROR: function to_char(numeric) does not exist

2023-04-27 Thread Erik Wienhold
> On 27/04/2023 13:20 CEST gzh  wrote:
>
> When the return type is set to oracle.date, there are hours, minutes, and
> seconds of the date value in the SQL execution result.
> Why is there such a difference and how to solve it?

orafce defines oracle.date as timestamp(0) [0] because Oracle's DATE type has
a precision of one second [1].  That's the point of orafce: to provide Oracle
compatibility.

You can cast oracle.date to pg_catalog.date but then you're in Postgres
territory again.  Depends on what you want to achieve.  If it's just formatting
use oracle.to_char:

SELECT oracle.to_char('2023-04-27'::oracle.date, 'YYYY-MM-DD');

[0] https://github.com/orafce/orafce/blob/VERSION_3_24_4/orafce--3.24.sql#L343
[1] 
https://oracle-base.com/articles/misc/oracle-dates-timestamps-and-intervals#date

--
Erik




Re: PL/pgSQL doesn't support variables in queries?

2023-05-03 Thread Erik Wienhold
> On 03/05/2023 14:25 CEST J.A.  wrote:
>
> ms-sql person here migrating over to pgsql. One of the first thing's I noticed
> with pgsql (or more specifically, PL/pgSQL) is that it doesn't support
> "variables" in a query?
>
> for example, here's some T-SQL:
>
> DECLARE @fkId INTEGER
>
> SELECT @fkId = fkId FROM SomeTable WHERE id = 1
>
> -- and then do something with that value..
>
> SELECT * FROM AnotherTable WHERE Id = @fkId
> SELECT * FROM YetAnotherTable WHERE FKId = @fkId
> -- etc..

plpgsql does support variable declarations [0] but does not use any special
notation like T-SQL.  An equivalent to your example would be:

DO $$
DECLARE
  v_fkid int;
  v_rec record;
BEGIN
  SELECT fkid INTO v_fkid FROM SomeTable WHERE id = 1;
  SELECT * INTO v_rec FROM AnotherTable WHERE Id = v_fkid;
  -- Do something with v_rec ...
END $$;

Prefixing variable names with v_ is just a convention to avoid ambiguous column
references (assuming that column names are not prefixed with v_) [1].

[0] https://www.postgresql.org/docs/current/plpgsql-declarations.html
[1] 
https://www.postgresql.org/docs/current/plpgsql-implementation.html#PLPGSQL-VAR-SUBST

--
Erik




Re: PL/pgSQL doesn't support variables in queries?

2023-05-03 Thread Erik Wienhold
> On 03/05/2023 14:51 CEST J.A.  wrote:
>
> Oh wow folks! I totally misunderstood the docs then. (I also tried to Read The
> Manual before I posted here, too :blush:)
>
> I must admit, I did try doing something like you suggested Erik. I tried
> things like:
>
> DO $$
>  DECLARE
>  v_application_id uuid;
>  BEGIN
>  SELECT application_id INTO v_application_id FROM applications WHERE code = 
> 'pg-test-cc';
>
> SELECT * FROM application_foo WHERE application_id = v_application_id;
>  -- more SELECT * FROM child tables
>
>  END $$;
>
> but that never worked, with warning:
>
> ERROR: query has no destination for result data
> HINT: If you want to discard the results of a SELECT, use PERFORM instead.
> CONTEXT: PL/pgSQL function inline_code_block line 7 at SQL statement SQL 
> state: 42601
>
> Which is why i (incorrectly?) thought this cannot be done?

plpgsql requires you to either store query results in variables or discard them,
as the hint in the error message says.  PERFORM is mainly used to execute
functions for their side effects only, e.g. PERFORM pg_reload_conf(), or to
execute a query where you only want to know whether rows were found by checking
the special variable FOUND afterwards.

> So is there another trick to doing this instead? Is it maybe via the v_record
> "record" variable instead?

Depends on what you want to do with those application_foo rows.  SELECT INTO
only considers the first row.  I assume you want to loop over the entire result
set.  Then you must use FOR v_rec IN ... LOOP:

DO $$
DECLARE
  v_application_id uuid;
  v_rec record;
BEGIN
  SELECT application_id INTO v_application_id
  FROM applications
  WHERE code = 'pg-test-cc';

  FOR v_rec IN
    SELECT * FROM application_foo WHERE application_id = v_application_id
  LOOP
    RAISE NOTICE 'v_rec = %', v_rec;  -- Prints each result.
  END LOOP;
END $$;

--
Erik




Re: Invoking SQL function while doing CREATE OR REPLACE on it

2023-05-03 Thread Erik Wienhold
> On 03/05/2023 20:17 CEST Nagendra Mahesh (namahesh)  
> wrote:
>
> I have a Postgres 14.4 cluster (AWS Aurora) to which I connect from my
> application using JDBC.
>
> I use liquibase for schema management - not only tables, but also a bunch of
> SQL stored procedures and functions. Basically, there is one liquibase
> changeSet that runs last and executes a set of SQL files which contain stored
> procedures and functions.
>
> CREATE OR REPLACE FUNCTION bar(arg1 text, arg2 text) RETURNS record LANGUAGE 
> "plpgsql" AS '
> BEGIN
>// function body
> END;
> ';
>
> These functions / procedures are replaced ONLY when there is a change in one /
> more SQL files which are part of this changeSet. (runOnChange: true).
>
> Whenever I do a rolling deployment of my application (say, with a change in
> the function body of bar()), liquibase will execute the CREATE OR REPLACE 
> FUNCTION bar()
> as part of a transaction.
>
> In the few milliseconds while bar() is being replaced, there are other ongoing
> transactions (from other replicas of my application) which are continuously
> trying to invoke bar().
>
> Only in this tiny time window, few transactions fail with the following error:
>
> ERROR: function bar(arg1 => text, arg2 => text) does not exist
>   Hint: No function matches the given name and argument types. You might need 
> to add explicit type casts.
> Position: 4 : errorCode = 42883

CREATE OR REPLACE FUNCTION should be atomic and cannot change the function
signature.  I don't see how a function cannot exist at some point in this case.

Are you sure that Liquibase is not dropping the function before re-creating it?
If Liquibase drops and re-creates the function in separate transactions, the
transactions trying to execute that function may find it dropped when using the
read committed isolation level.

There's also a race condition bug in v14.4 that may be relevant.  It got fixed
in v14.5.  See "Fix race condition when checking transaction visibility" in
https://www.postgresql.org/docs/14/release-14-5.html.
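
If it turns out that Liquibase drops and re-creates the function, wrapping both
statements in a single transaction avoids the window (a sketch with a
placeholder body):

    BEGIN;
    DROP FUNCTION IF EXISTS bar(text, text);
    CREATE FUNCTION bar(arg1 text, arg2 text) RETURNS record LANGUAGE plpgsql AS '
    BEGIN
      RETURN NULL;  -- placeholder body
    END;
    ';
    COMMIT;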

--
Erik




Re: The logfile stop upgrade after a vim write

2023-05-04 Thread Erik Wienhold
> On 04/05/2023 11:54 CEST lz ma  wrote:
>
> 1. pg_ctl -D data -l MyLog
> 2. vim MyLog : add some words, save and exit
> 3. after vim operation, MyLog will never upgrade except restart server
> I know it caused by file descripter only open once at the start by postgres,
> and vim operation rename the file to MyLog~, so postgres can't upgrade

Set backupcopy=yes in .vimrc so vim overwrites the file in place instead of
renaming it and writing a new one.

But why would you edit active log files?

--
Erik




Re: Maintaining accents with "COPY" ?

2023-05-25 Thread Erik Wienhold
> On 25/05/2023 09:14 CEST Laura Smith  
> wrote:
>
> I'm currently doing a CSV export using COPY:
>
> COPY (select * from foo where bar='foo') TO '/tmp/bar.csv' DELIMITER ',' CSV 
> HEADER;
>
> This works great apart from accents are not preserved in the output, for
> example é gets converted to random characters, e.g. √© or similar.
>
> How can I preserve accents ?

Looks like an encoding issue and a mismatch between database encoding and client
encoding.  You can check both with:

SHOW server_encoding;
SHOW client_encoding;

Then either set the client encoding or use COPY's encoding option to match the
database encoding (I assume utf8 in this example):

SET client_encoding = 'utf8';
COPY (...) TO '/tmp/bar.csv' DELIMITER ',' CSV HEADER ENCODING 'utf8';

--
Erik




Re: Maintaining accents with "COPY" ?

2023-05-25 Thread Erik Wienhold
> On 25/05/2023 13:26 CEST Peter J. Holzer  wrote:
>
> On 2023-05-25 07:14:40 +, Laura Smith wrote:
> > I'm currently doing a CSV export using COPY:
> >
> > COPY (select * from foo where bar='foo') TO '/tmp/bar.csv' DELIMITER ',' 
> > CSV HEADER;
> >
> > This works great apart from accents are not preserved in the output,
> > for example é gets converted to random characters, e.g. √© or similar.
>
> How do you check the output?
>
> If a single character is turned into 2 or 3 characters the issue is
> usually that the program which produces the output (in the case of COPY
> I think that would be the PostgreSQL server, not the client) produces
> UTF-8, but the program consuming it expects an 8-bit character set
> (typically windows-1252). See if oyu can tell that program that the file
> is in UTF-8.
>
> > How can I preserve accents ?
>
> They probably already are preserved.

You're right.  The bytes are probably interpreted as Mac OS Roman:

$ echo é | iconv -f macintosh
√©
$ echo -n é | xxd
00000000: c3a9                                     ..

--
Erik




Re: Maintaining accents with "COPY" ?

2023-05-25 Thread Erik Wienhold
> On 25/05/2023 12:08 CEST Laura Smith  
> wrote:
>
> > Looks like an encoding issue and a mismatch between database encoding and
> > client encoding. You can check both with:
> >
> > SHOW server_encoding;
> > SHOW client_encoding;
> >
> > Then either set the client encoding or use COPY's encoding option to match
> > the database encoding (I assume utf8 in this example):
> >
> > SET client_encoding = 'utf8';
> > COPY (...) TO /tmp/bar.csv DELIMITER ',' CSV HEADER ENCODING 'utf8';
>
> Hi Erik,
>
> Looks like you could well be right about encoding:
>
> postgres=# SHOW server_encoding;
>  server_encoding
> -
>  UTF8
> (1 row)
>
> postgres=# SHOW client_encoding;
>  client_encoding
> -
>  SQL_ASCII
> (1 row)
>
> I will try your suggestion...

The client encoding is not the problem here.  Using SQL_ASCII effectively uses
the server encoding.  SQL_ASCII basically means uninterpreted bytes/characters.

From https://www.postgresql.org/docs/15/multibyte.html#id-1.6.11.5.7:

"If the client character set is defined as SQL_ASCII, encoding conversion is
 disabled, regardless of the server's character set. (However, if the server's
 character set is not SQL_ASCII, the server will still check that incoming data
 is valid for that encoding; so the net effect is as though the client character
 set were the same as the server's.) Just as for the server, use of SQL_ASCII is
 unwise unless you are working with all-ASCII data."

--
Erik




Re: CREATE TEMPORARY TABLE LIKE

2023-05-25 Thread Erik Wienhold
> On 25/05/2023 15:06 CEST Jim Vanns  wrote:
>
> When basing a temporary table of a source table, are triggers included
> by default? I have this statement;
>
> CREATE TEMPORARY TABLE dev_main (
> LIKE prod_main
> INCLUDING ALL
> EXCLUDING INDEXES
> EXCLUDING CONSTRAINTS
> ) ON COMMIT DELETE ROWS;
>
> And wondering if there is a trigger (row-based after) on prod_main
> it'll fire also on dev_main? I can't find anything in the
> documentation that suggests either way nor can I see an explicit
> EXCLUDING option to be sure triggers aren't copied.

You can check if triggers exist with psql:

\d dev_main

or by checking catalog pg_trigger:

select * from pg_trigger where tgrelid = 'dev_main'::regclass;

But no.  Triggers are not included when creating tables like that.
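
If you do need the trigger, create it on the temporary table separately, e.g.
(prod_main_audit is a hypothetical trigger function):

CREATE TRIGGER dev_main_audit
AFTER INSERT OR UPDATE OR DELETE ON dev_main
FOR EACH ROW EXECUTE FUNCTION prod_main_audit();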

--
Erik




Re: event trigger should provide more details

2023-05-30 Thread Erik Wienhold
> On 30/05/2023 22:23 CEST Lian Jiang  wrote:
>
> I plan to create an event trigger to detect schema change (e.g. add/remove
> a column, change column type), and write it into a separate table (e.g.
> EVENTS). Then a process periodically reads this table to send schema change
> notification. However, the event trigger 
> (https://www.postgresql.org/docs/current/plpgsql-trigger.html)
> (43.10.2. Triggers on Events) does not provide me info such as which table
> is altered, old and new schema. Am I missing something? Thanks very much for
> any hints.

You must use ddl_command_end event triggers[0] and call function
pg_event_trigger_ddl_commands[1] to get info such as altered table and column.

[0] https://www.postgresql.org/docs/current/event-trigger-definition.html
[1] https://www.postgresql.org/docs/current/functions-event-triggers.html
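
A minimal sketch, assuming your EVENTS table has columns matching what you want
to record:

CREATE FUNCTION log_ddl() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
  r record;
BEGIN
  -- pg_event_trigger_ddl_commands() is only callable in ddl_command_end triggers
  FOR r IN SELECT * FROM pg_event_trigger_ddl_commands() LOOP
    INSERT INTO events (command_tag, object_type, object_identity)
    VALUES (r.command_tag, r.object_type, r.object_identity);
  END LOOP;
END $$;

CREATE EVENT TRIGGER track_ddl ON ddl_command_end
EXECUTE FUNCTION log_ddl();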

--
Erik




Re: event trigger should provide more details

2023-05-30 Thread Erik Wienhold
> On 31/05/2023 00:28 CEST Lian Jiang  wrote:
>
> The info useful for me is command_tag, object_type, object_identity.
> classid, objid is not useful since object_identity is more explicit.
> objsubid is not useful because I don't need comment
> (https://www.postgresql.org/message-id/pine.lnx.4.33.0212091822050.15095-100...@leary.csoft.net)
> information for schema change.

You need objsubid to identify the column in pg_attribute to get its type, not
just to get the comment from pg_description as the linked thread says.

> Besides table name, I still need:
> * which columns are added and their types.
> * which columns have type change, the old and new types.
> * which columns are dropped.
>
> Will command field provide this info? I don't have an example and decoding it
> needs C code 
> (https://www.postgresql.org/message-id/2019071343.GA26924%40alvherre.pgsql).
> If I cannot get such info from pg_event_trigger_ddl_commands, I may need to
> maintain schema snapshots myself and diff the old and new snapshots upon an
> alter table/view event. Which way should I go? Thanks a lot.

Right off the bat, I would combine it with a ddl_command_start event trigger to
record the necessary info (current columns and their types) in a temp table.
Query this table in the ddl_command_end event trigger to figure out which
columns have changes.  This can be done entirely in plpgsql without using the
command column.
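
Rough sketch of that idea (untested; the TAG filter keeps the triggers from
firing on the DDL inside snapshot_cols itself, and in practice you would also
diff for dropped columns and type changes, not just additions):

-- ddl_command_start: snapshot the current columns
CREATE FUNCTION snapshot_cols() RETURNS event_trigger
LANGUAGE plpgsql AS $$
BEGIN
  DROP TABLE IF EXISTS cols_before;
  CREATE TEMP TABLE cols_before AS
    SELECT attrelid, attname, atttypid
    FROM pg_attribute
    WHERE attnum > 0 AND NOT attisdropped;
END $$;

-- ddl_command_end: diff against the snapshot, here only for added columns
CREATE FUNCTION diff_cols() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
  r record;
BEGIN
  FOR r IN
    SELECT attrelid::regclass AS tab, attname, atttypid::regtype AS typ
    FROM pg_attribute
    WHERE attnum > 0 AND NOT attisdropped
    EXCEPT
    SELECT attrelid::regclass, attname, atttypid::regtype
    FROM cols_before
  LOOP
    RAISE NOTICE 'added column: %.% (%)', r.tab, r.attname, r.typ;
  END LOOP;
END $$;

CREATE EVENT TRIGGER cols_start ON ddl_command_start
WHEN TAG IN ('ALTER TABLE') EXECUTE FUNCTION snapshot_cols();
CREATE EVENT TRIGGER cols_end ON ddl_command_end
WHEN TAG IN ('ALTER TABLE') EXECUTE FUNCTION diff_cols();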

--
Erik




Re: How to remove user specific grant and revoke

2023-06-03 Thread Erik Wienhold
> On 03/06/2023 09:16 CEST Andrus  wrote:
>
> User groups table is defined as
>
> CREATE TABLE IF NOT EXISTS public.kaspriv
> (
> id serial primary key,
> user character(10) NOT NULL,
> group character(35) NOT NULL
> ...
> )
>
> There are hundreds of users. Earlier time grant and revoke commands were
> executed for every user separately. Later revoke and grant commands for
> public were added:
>
> REVOKE ALL ON TABLE public.kaspriv FROM PUBLIC;
> GRANT SELECT ON TABLE public.kaspriv TO PUBLIC;
>
> pgAdmin SQL tab still shows revoke and grant commands for every user also:
>
> REVOKE ALL ON TABLE public.kaspriv FROM PUBLIC;
> REVOKE ALL ON TABLE public.kaspriv FROM someuser;
> REVOKE ALL ON TABLE public.kaspriv FROM someotheruser;
> ...
> GRANT SELECT ON TABLE public.kaspriv TO PUBLIC;
> GRANT SELECT ON TABLE public.kaspriv TO someuser;
> GRANT SELECT ON TABLE public.kaspriv TO someother;
> ...
>
> How to remove those unnecessary user-specific GRANT and REVOKE commands to
> make rights cleaner? pgAdmin does not have delete option for those.

When you run

REVOKE SELECT ON TABLE public.kaspriv FROM someuser;

does it also remove the accompanying REVOKE ALL statement for that user?
That REVOKE SELECT should remove the ACL for someuser from pg_class.relacl and
pgAdmin should no longer find any ACL for that role and thus no longer emit
REVOKE ALL.

> Something like
>
> DROP REVOKE ALL ON TABLE public.kaspriv FROM all EXCEPT public;
> DROP GRANT SELECT ON TABLE public.kaspriv FROM all EXCEPT public;
>
> This will be one-time action. It can be done manually in pgadmin or using
> some script running once.

Automate this with aclexplode[0] to get the privileges for specific grantees.
Loop over the result set in a DO block, generate the REVOKE commands, and
EXECUTE them.

SELECT acl.grantee::regrole, acl.privilege_type
FROM pg_class, aclexplode(relacl) acl
WHERE oid = 'public.kaspriv'::regclass;
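
For example as a DO block (untested sketch; grantee 0 stands for PUBLIC and is
skipped here):

DO $$
DECLARE
  r record;
BEGIN
  FOR r IN
    SELECT DISTINCT acl.grantee::regrole::text AS grantee
    FROM pg_class, aclexplode(relacl) acl
    WHERE oid = 'public.kaspriv'::regclass
      AND acl.grantee <> 0  -- skip PUBLIC
  LOOP
    -- regrole::text already quotes the role name if necessary
    EXECUTE format('REVOKE ALL ON TABLE public.kaspriv FROM %s', r.grantee);
  END LOOP;
END $$;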

> Using
>
> PostgreSQL 12.2 (Debian 12.2-2.pgdg100+1) on x86_64-pc-linux-gnu,
> compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
>
> and latest pgAdmin 7.2

[0] https://www.postgresql.org/docs/12/functions-info.html#FUNCTIONS-ACLITEM-FN-TABLE

--
Erik




Re: How to remove user specific grant and revoke

2023-06-03 Thread Erik Wienhold
> On 03/06/2023 14:46 CEST Erik Wienhold  wrote:
>
> > On 03/06/2023 09:16 CEST Andrus  wrote:
> >
> > DROP REVOKE ALL ON TABLE public.kaspriv FROM all EXCEPT public;
> > DROP GRANT SELECT ON TABLE public.kaspriv FROM all EXCEPT public;
> >
> > This will be one-time action. It can be done manually in pgadmin or using
> > some script running once.
>
> Automate this with aclexplode[0] to get the privileges for specific grantees.
> Loop over the result set in a DO block, generate the REVOKE commands, and
> EXECUTE them.
>
>   SELECT acl.grantee::regrole, acl.privilege_type
>   FROM pg_class, aclexplode(relacl) acl
>   WHERE oid = 'public.kaspriv'::regclass;

Or just execute those REVOKE ALL commands (except for PUBLIC) that pgAdmin
already gives you.

--
Erik




Re: How to remove user specific grant and revoke

2023-06-03 Thread Erik Wienhold
> On 03/06/2023 18:14 CEST Andrus  wrote:
>
> > Or just execute those REVOKE ALL commands (except for PUBLIC) that pgAdmin
> > already gives you.
> pgAdmin gives
> REVOKE ALL ON TABLE public.kaspriv FROM someuser;
> I ran it but pgAdmin still gives this statement.

What are the current table privileges when you run this in psql:

\dp public.kaspriv

--
Erik




Re: How to remove user specific grant and revoke

2023-06-03 Thread Erik Wienhold
> On 03/06/2023 22:33 CEST Andrus  wrote:
>
> I tried
> alter role alekspoluh reset all

This only resets role-specific settings, not privileges.

> After this command pgAdmin still shows revoke and grant commands for
> alekspoluh role.
> How to remove all grant and revoke assignments for role ?

Please confirm that \dp public.kaspriv no longer shows an ACL for alekspoluh
after running:

REVOKE ALL ON public.kaspriv FROM alekspoluh;

There must be something wrong with pgAdmin if it still shows REVOKE ALL for
that role after its ACL is gone.  Looking at the code, pgAdmin emits REVOKE ALL
for any grantee it finds in the ACL.

https://github.com/pgadmin-org/pgadmin4/blob/REL-7_2/web/pgadmin/browser/server_groups/servers/databases/schemas/tables/utils.py#L712

--
Erik




Re: How to remove user specific grant and revoke

2023-06-03 Thread Erik Wienhold
> On 03/06/2023 23:34 CEST Andrus  wrote:
>
> psql (12.2 (Debian 12.2-2.pgdg100+1))
>  Type "help" for help.
>
>  sba=# REVOKE ALL ON public.kaspriv FROM alekspoluh;
>  REVOKE
>  sba=# \o result.txt
>  sba=# \dp public.kaspriv
>
> sba=# \q
> #grep alekspoluh result.txt
> Returns nothing. So output does not contain this role.
>
> I re-opened pgadmin. alekspoluh role is no more displayed in kaspriv table
> sql window.
> pgadmin shows only single reset role command. Now it shows
> REVOKE ALL ON TABLE public.kaspriv FROM yllelohmus;
>
> I ran
>
> REVOKE ALL ON TABLE public.kaspriv FROM yllelohmus;
> After that pgadmin shows next single revoke command:
> REVOKE ALL ON TABLE public.kaspriv FROM villuuus;
> It looks like pgAdmin shows only one REVOKE command but actually there are
> more revokes.

From your first message I was under the impression that pgAdmin shows one
REVOKE ALL for every GRANT, i.e. all REVOKE commands at once.  If that is not
the case you may have found a bug in pgAdmin.  Please ask on the pgadmin-support
list or open a GitHub issue.

Speaking of which, I found https://github.com/pgadmin-org/pgadmin4/issues/5926
which looks like the behavior you're describing.  But this was already fixed in
7.2 and your original post says that you're using 7.2.  Please check if your
version is correct.

--
Erik




Re: How to remove user specific grant and revoke

2023-06-03 Thread Erik Wienhold
> On 04/06/2023 00:08 CEST Andrus  wrote:
>
> > Should I ran separate revoke commands for every user to remove those
> > revokes ?
> > How to remove user-spefic grants ?
> After running revoke commands in psql, GRANT commands disappeared magically.
> It looks like pgAdmin does not allow execute REVOKO commands.

I don't think so.  There's nothing special about REVOKE that pgAdmin may
disallow.

> After running script which adds user group tabel modification rights for
> admin users:
> CREATE POLICY kaspriv_sel_policy ON kaspriv FOR SELECT USING (true);
>  CREATE POLICY kaspriv_mod_policy ON kaspriv USING (
>  lower(kasutaja)= current_user OR kasutaja in
>  ( select kasutaja from kasutaja where ','||firmad||','
>  LIKE '%,'|| (select firmad from kasutaja where lower(kasutaja)= 
> current_user) || ',%'
>  )
>  );
>  ALTER TABLE kaspriv ENABLE ROW LEVEL SECURITY;
>  revoke all on kaspriv from public;
>  grant select on kaspriv to public;
>  grant insert, update, delete on kaspriv to admin1, admin2;
>
> pgAdmin shows revoke commands for those users:
> REVOKE ALL ON TABLE public.kaspriv FROM admin1;
>  REVOKE ALL ON TABLE public.kaspriv FROM admin2;
> How to prevent pgAdmin to show those revokes?

pgAdmin includes the REVOKE commands so that roles will only get the privileges
listed in the subsequent GRANT commands when executing that script.  This makes
sure that the script will reproduce the current privileges regardless of what
may be granted at some later point (in case of an already existing table and
CREATE TABLE IF NOT EXISTS is used) and regardless of any default privileges
that may be defined when creating a new table.

--
Erik




Re: Drivers users by connections

2023-06-06 Thread Erik Wienhold
> On 06/06/2023 09:18 CEST Hrishikesh (Richie) Rode  wrote:
>
> We are not able to find table which give session connection details about
> drivers. In pg_stat_activity details information is not there. Please let us
> know where we can find these information.

Column pg_stat_activity.application_name is the only one I know of that may
provide such details.  But it relies on the driver setting application_name
when connecting.  Some drivers set a default, e.g. pgjdbc, but also allow
overriding with a custom application name that may include additional info
such as driver version.  But in the end application_name is just text without
any inherent structure.
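
You can inspect it with:

SELECT pid, usename, application_name, client_addr
FROM pg_stat_activity;

With pgjdbc, for example, the ApplicationName connection parameter sets it
(hypothetical URL):

jdbc:postgresql://dbhost:5432/mydb?ApplicationName=myservice-1.2.3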

--
Erik




Re: How to securely isolate databases/users in a multi-tenant Postgresql?

2023-06-09 Thread Erik Wienhold
> On 09/06/2023 08:54 CEST Alex Lee  wrote:
>
> I want to make a service that gives each of my users their own PG user and
> database. I want to keep them isolated from each other. There are no special
> extensions installed, it's a pretty vanilla PG cluster.
>
> Are there any considerations beyond making each person their own user and
> owner of their own database like this, and letting them connect to the
> database?
>
> ```
> create user u2745;
> create database d2745 owner u2745;
> -- etc.
> ```

This works but you must revoke the CONNECT privileges on each database from
PUBLIC if you do not restrict connections in pg_hba.conf.  By default every
user can connect to any database if allowed by pg_hba.conf.  It then depends
on the default privileges granted to PUBLIC what users can do in a database
that is not their database.  I would therefore also restrict connections with
pg_hba.conf:

https://www.postgresql.org/docs/current/auth-pg-hba-conf.html
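
For example, per database:

REVOKE CONNECT ON DATABASE d2745 FROM PUBLIC;
-- the owner u2745 can still connect; other roles would need an explicit GRANT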

Because you have to manage privileges anyway, it may be easier to use a single
database, define a separate schema for each user, and grant only the USAGE and
CREATE privileges on that schema to its user.  But be aware of default privileges that
are granted to PUBLIC.  That is described in the docs as the secure schema usage
pattern:

https://www.postgresql.org/docs/current/ddl-schemas.html#DDL-SCHEMAS-PATTERNS
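
A sketch of the schema-per-user variant; naming the schema after the user makes
the default search_path entry "$user" resolve to it:

REVOKE CREATE ON SCHEMA public FROM PUBLIC;  -- once, to lock down public
CREATE USER u2745;
CREATE SCHEMA AUTHORIZATION u2745;  -- creates schema u2745 owned by u2745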

--
Erik




Re: How to store query result into another table using stored procedure

2023-06-09 Thread Erik Wienhold
> On 09/06/2023 07:51 CEST Rama Krishnan  wrote:
>
> CREATE or REPLACE PROCEDURE deleted_cards_count_test(start_date TIMESTAMP, 
> end_date TIMESTAMP) AS $$
> DECLARE
> current_date TIMESTAMP;
> month_start_date TIMESTAMP;
> month_end_date TIMESTAMP;
> month24_end_date TIMESTAMP;
> no_deleted_cards bigint;
> BEGIN
> current_date := start_date;
> month_end_date := to_char(date_trunc('month', current_date) + interval '24 
> month - 1 day' + interval '23 hours 59 minutes 5 seconds','-MM-DD 
> HH24:MI:SS');
> Create temporary table if not exists temp_teport_results(
> month_start_date TIMESTAMP,
> no_deleted_cards bigint
> );
>
> EXECUTE format('
> SELECT COUNT(1) filter (where status =''Undigitized'' and reason is null and 
> updated_date between %L and %L) no_deleted_cards from digi_card where 
> created_date between %L and %L
> group by months',current_date,month_end_date)INTO no_deleted_cards;
> 
> Insert into temp_teport_results (month_start_date,no_deleted_cards) VALUES 
> (month_start_date,no_deleted_cards);
> --- display result
> select * from temp_teport_results;
> END;
> $$ LANGUAGE plpgsql;
>
> It was created successfully, but when I called this procedure with parameters.
> i am getting this below error ,Pls guide me to fix the issue
>
> CALL deleted_cards_count_test( '2019-03-01 00:00:00', '2021-03-31 23:59:59');
> ERROR: too few arguments for format()
> CONTEXT: PL/pgSQL function deleted_cards_count_test(timestamp without time 
> zone,timestamp without time zone) line 16 at EXECUTE

The problem is that format needs four arguments to fill the four %L, but only
two are passed.  You can reuse the two arguments by using %1$L and %2$L for the
third and fourth occurrence of %L.
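
For example:

SELECT format('between %1$L and %2$L ... between %1$L and %2$L',
              '2019-03-01', '2021-03-31');
-- between '2019-03-01' and '2021-03-31' ... between '2019-03-01' and '2021-03-31'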

But I don't think you need EXECUTE format() at all.  You can instead write
an INSERT SELECT statement and use the plpgsql variables in place of the format
placeholders %L:

INSERT INTO temp_teport_results (month_start_date, no_deleted_cards)
SELECT count(1) FILTER (
  WHERE status = 'Undigitized' AND reason IS NULL
  AND updated_date BETWEEN current_date AND month_end_date  -- uses the variables
)
...

You may want to prefix the variable names with v_ to easily spot them and
reduce the likelihood of conflicts with column names.  Otherwise qualify the
variable names with the procedure name to avoid conflicts.  See the docs on
variable substitution:

https://www.postgresql.org/docs/current/plpgsql-implementation.html#PLPGSQL-VAR-SUBST

--
Erik




Re: How To: A large [2D] matrix, 100,000+ rows/columns

2023-06-09 Thread Erik Wienhold
> On 09/06/2023 16:00 CEST Wim Bertels  wrote:
>
> Joe Conway schreef op vr 09-06-2023 om 09:16 [-0400]:
> > On 6/8/23 22:17, Pat Trainor wrote:
> > > I need to have a very large matrix to maintain & query, and if not
> > > (1,600 column limit), then how could such data be broken down to
> > > work?
> >
> >   100,000 rows *
> >   100,000 columns *
> >   8 bytes (assuming float8)
> > = about 80 GB per matrix if I got the math correct.
>
> based on my personal experience i would not use postgres in the case
> where you need many columns, u can work around this with json for
> example, but it will likely end up being less easy to work with
>
> as Joe replied: R or Python are probably a better fit,
> or another database that can easily handle a lot of columns,
> postgres is a great database, but not when you need a lot of columns
>
> (as you noted+:
> there might be another backend storage for postgres that can handle
> this better (or in the future?), but i don't think there is one;
> also there is the header for which standard 8K is provisioned anyway,
> so that is the first bottleneck (you can change this value, if you
> compile postgres yourself)
> https://www.postgresql.org/docs/current/limits.html )

Rasdaman may also be an option.  Saw it a few weeks ago on this very list.

https://rasdaman.org
https://www.postgresql.org/message-id/CAFj8pRDjE0mdL6_b86ZDawHtNeRPQLciWos3m3PGJueJ5COSjQ%40mail.gmail.com

--
Erik




Re: pg_service file questions

2023-06-20 Thread Erik Wienhold
> On 20/06/2023 08:11 CEST JUN ZHI  wrote:
>
> I was scanning through the postgresql documentations when i came across this
> webpage:PostgreSQL: Documentation: 15: 34.17. The Connection Service File
> (https://www.postgresql.org/docs/current/libpq-pgservice.html). I am fairly
> new to database and i have a few questions regarding this:
>
> 1. Is pg_service.conf and .pg_service.conf (with a dot at the front)
> different files?

Yes, they're different files.  pg_service.conf is the global service file and
.pg_service.conf the user-specific file in your home directory.

> 2. The documentation stated that the .pg_service.conf is named
> %APPDATA%\postgresql.pg_service.conf on windows which is a directory i can
> not find.

%APPDATA% should resolve to C:/Users/<username>/AppData/Roaming

You can run echo %APPDATA% in cmd.exe or echo $env:APPDATA in PowerShell to
show the actual path.  Or enter %APPDATA% in the File Explorer address bar.

You have to create the directory %APPDATA%/postgresql yourself.

> 3. The documentation also stated that we can check for the sysconfigdir
> environment variable and point it to somewhere else, but when i checked for
> the sysconfigdir path, it is pointing to C:/PROGRA~1/POSTGR~1/15/etc which
> again, is a directory i can not find : to be specific, i can not find the etc
> file stated in the pathing.

You have to create that etc directory yourself if you want to put config files there.

> So where should i put this pg_service file and does it link with the database
> itself?

I would go with the user service file because it takes precedence over the
system-wide file.  The default path of the system-wide file only works on the
database server where Postgres is running.

What do you mean with "link with the database"?  The service file is read by
libpq before opening a database connection.
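
For example, a user service file with a hypothetical service name:

# ~/.pg_service.conf, or %APPDATA%\postgresql\.pg_service.conf on Windows
[mydb]
host=db.example.com
port=5432
dbname=test
user=someuser

Then connect with:

psql "service=mydb"
# or
PGSERVICE=mydb psql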

--
Erik




Re: [Beginner Question] How to print the call link graph?

2023-07-01 Thread Erik Wienhold
> On 01/07/2023 09:10 CEST Wen Yi  wrote:
> 
> I use the gdb to track the postgres like this:
> 
> ...
> pq_getbyte () at pqcomm.c:980
> 980 in pqcomm.c
> (gdb) next
> 985 in pqcomm.c
> (gdb) next
> 986 in pqcomm.c
> (gdb) next
> SocketBackend (inBuf=0x7ffc8f7e1310) at postgres.c:372
> 
> 372 postgres.c: Directory not empty.
> (gdb) next
> 403 in postgres.c
> (gdb) next
> 406 in postgres.c
> (gdb) next
> 407 in postgres.c
> (gdb) next
> ...
> 
> But the question is:
> It's too slow to input 'next' to run the postgres, I used to try to use the
> 'continut', but the gdb will run the postgres directly and not print the
> function name and code line
> 
> I want to it print like this:
> 
> ... -> pq_getbyte () at pqcomm.c:980 -> SocketBackend (inBuf=0x7ffc8f7e1310) 
> at postgres.c:372 -> ...
> 
> Can someone provide me some advice?
> Thanks in advance!

The Postgres wiki has a page on this topic:

https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD

Look for "backtrace" and gdb's bt command.

--
Erik




Re: psql -c command parse "select $$hello$$" failed

2023-07-04 Thread Erik Wienhold
> On 04/07/2023 14:21 CEST jian he  wrote:
>
> not sure this is the expected result.
>
> /home/jian/postgres/pg16_test/bin/psql -d test_dev -p 5455 -c "select 
> $$hello$$"
> 2023-07-04 20:15:51.066 CST [1562050] ERROR:  trailing junk after
> numeric literal at or near "884070h" at character 8
> 2023-07-04 20:15:51.066 CST [1562050] STATEMENT:  select 884070hello884070
> ERROR:  trailing junk after numeric literal at or near "884070h"
> LINE 1: select 884070hello884070

The error is expected because the shell replaces $$ with its process ID inside
double quoted strings.  Check out Bash quoting [1] (assuming that you use bash
but this applies to every(?) shell).

Either use single quotes around the statement or escape every $ with \$:

psql -c 'select $$hello$$'
psql -c "select \$\$hello\$\$\"

[1] https://www.gnu.org/software/bash/manual/bash.html#Quoting

--
Erik




Re: Strange behaviour on function

2023-07-05 Thread Erik Wienhold
> On 05/07/2023 14:23 CEST Lorusso Domenico  wrote:
>
> Hello guys,
> here a simple function
>
> CREATE OR REPLACE FUNCTION bind_action(
>  sqlstr text,
>  hrec hstore)
>  RETURNS text
>  LANGUAGE 'plpgsql'
>  COST 100
>  immutable PARALLEL SAFE
> AS $BODY$
> declare
>  _sqlstr text=sqlstr;
>  _k text;
>  _debug text;
> begin
>  _debug= '--Start' || _sqlstr;
>  foreach _k in array akeys(hrec) loop
>  _debug =_debug || format($$
>  hstore: %s %s
>  sqlStr:$$, _k, hrec[_k]);
>  _sqlstr=replace(_sqlstr, ':'||_k||':', hrec[_k]);
>  _debug =_debug || _sqlstr;
>
>  end loop;
>
>  raise notice 'final %',_debug;
>  return _sqlstr;
> end;
> $BODY$;
>
> and here a simple test
> do
> $$
> declare
>  sqlstr text=':id::bignt,:surpa:,:disfa:';
>  hs hstore;
> begin
>  hs['id']=789;
>  hs['disfa']='';
>  raise notice '%',bind_action(sqlstr,hs);
> end;
> $$;
>
> and it works.
> But...
> When I call this function in a function called by a trigger it down't work
> _debug variable becomes null, also _sqlstr becomes null...

Does the hstore contain nulls?  Function replace returns null in that case.
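
For example:

SELECT replace('foo :id: bar', ':id:', NULL) IS NULL;  -- returns true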

Please show us the trigger, its function, and a reproducer.

--
Erik




Re: Strange behaviour on function

2023-07-05 Thread Erik Wienhold
> On 05/07/2023 17:16 CEST Adrian Klaver  wrote:
>
> https://www.postgresql.org/docs/current/plpgsql-trigger.html
>
> 1)
> "A trigger function must return either NULL or a record/row value having
> exactly the structure of the table the trigger was fired for."
>
> 2) I am not seeing where you use:
>
> "TG_ARGV[]
>
>  Data type array of text; the arguments from the CREATE TRIGGER
> statement. The index counts from 0. Invalid indexes (less than 0 or
> greater than or equal to tg_nargs) result in a null value."
>
> So I don't see how sqlstr is being set?

Domenico did not provide the trigger definition, only function bind_action
which he calls from a trigger function.  Also bind_action cannot be a trigger
function because it does not return trigger.

--
Erik




Re: function to_char(unknown) is not unique at character 8

2023-07-06 Thread Erik Wienhold
> On 06/07/2023 11:19 CEST gzh  wrote:
>
> I upgraded the version of PostgreSQL from 12.6 to 12.13,
> when I execute the sql below , the to_char function caused the following 
> error.
>
> ---SQL--
> select TO_CHAR('100');
>
> ERROR: function to_char(unknown) is not unique at character 8
> HINT: Could not choose a best candidate function. You might need to add 
> explicit type casts.
>
> There is no problem before the upgrade and to_char function comes from the
> Orafce extension.
> The configuration of the old and new databases is as follows.
>
> Database server (old): PostgreSQL 12.6(orafce3.13)
> Database server (new): PostgreSQL 12.13(orafce3.24)
>
> The new database has successfully installed the orafce 3.24 extension.
> It does not occur in "PostgreSQL 12.6 and orafce 3.13",
> but occurs in "PostgreSQL 12.13 and orafce 3.24",
> so either the difference between 12.6 and 12.13 or the difference between
> orafce 3.13 and 3.24 is suspicious.
>
> What is the reason for the problem and how to fix the error?

This rings a bell:

https://www.postgresql.org/message-id/1597875806.606392.1681915893771%40office.mailbox.org

Either qualify functions with namespace oracle, e.g. oracle.to_char('100'),
or place oracle on the search path.

--
Erik




Re: INSERT UNIQUE row?

2023-07-10 Thread Erik Wienhold
> On 10/07/2023 04:25 CEST p...@pfortin.com wrote:
>
> On Sun, 9 Jul 2023 17:04:03 -0700 Adrian Klaver wrote:
>
> >On 7/9/23 15:58, p...@pfortin.com wrote:
> >> Hi,
> >>
> >> Trying to figure out how to insert new property addresses into an
> >> existing table.
> >>
> >> Can a UNIQUE constraint be applied to an entire row?  Adding UNIQUE to
> >> each column won't work in such a case since there are multiple properties
> >> * on the same street
> >> * in the same town
> >> * with the same number on different streets
> >> * etc...
> >
> >Does the locality you are in have something like the Property ID# and/or 
> >Parcel # / Geo ID shown here:
> >
> >https://property.whatcomcounty.us/propertyaccess/PropertySearch.aspx?cid=0
>
> Thanks!  Just getting started on this issue and this made me realize my
> current data source may not be the best...  Much appreciated!

Also keep in mind that it's not trivial to model addresses, even in a single
country.  Some database constraints may become a footgun.

https://www.mjt.me.uk/posts/falsehoods-programmers-believe-about-addresses/

--
Erik



