Re: Nested Stored Procedures - ERROR: invalid transaction termination 2D000

2025-03-22 Thread Kevin Stephenson
Christophe and Tom, thank you for your responses, but I'm still a bit confused. 
In my original email, the Test 2 case is allowing a ROLLBACK in the EXCEPTION 
clause without throwing an error. Is it a NOP ROLLBACK being applied to an 
aborted subTX, a real full ROLLBACK, or something else? Please advise.

Thanks,
Kevin Stephenson

From: Tom Lane 
Sent: Saturday, March 22, 2025 7:59 AM
To: Christophe Pettus 
Cc: Kevin Stephenson ; pgsql-gene...@postgresql.org 

Subject: Re: Nested Stored Procedures - ERROR: invalid transaction termination 
2D000

Christophe Pettus  writes:
> A procedure cannot issue top-level transaction control statements from within 
> an exception block, and attempting to do so raises the error you saw.  This 
> includes procedures that are called from within an exception block.

Yeah.  Postgres doesn't have autonomous transactions (not yet anyway),
and you can't fake them like that.

A way that does work, I believe, is to set up a second session with
dblink[1] and use that to issue the autonomous transaction.  Ugly
and inefficient for sure, but if you've gotta have it...

regards, tom lane

[1] 
https://www.postgresql.org/docs/current/dblink.html
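
A minimal sketch of that dblink-based workaround (hypothetical helper name and
connection options; assumes the dblink extension is installed and that a
loopback connection to the same database is permitted):

-- Hypothetical helper: logs the error from a second session via dblink, so the
-- INSERT commits independently of the caller's (possibly aborted) transaction.
CREATE EXTENSION IF NOT EXISTS dblink;

CREATE OR REPLACE PROCEDURE public.log_error_autonomous(
    p_routine text, p_sqlstate text, p_sqlerrm text)
LANGUAGE plpgsql AS $$
BEGIN
    PERFORM dblink_exec(
        'dbname=' || current_database(),
        format('INSERT INTO public.error_log (logging_routine_name, sqlstate, sqlerrm)
                VALUES (%L, %L, %L)', p_routine, p_sqlstate, p_sqlerrm));
END;$$;

An EXCEPTION handler could then CALL public.log_error_autonomous('inner_proc',
SQLSTATE, SQLERRM) instead of issuing ROLLBACK/COMMIT, leaving the caller's
transaction state untouched.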


Nested Stored Procedures - ERROR: invalid transaction termination 2D000

2025-03-22 Thread Kevin Stephenson
Hi all,

I'm assessing the feasibility of implementing a full-featured "DB as API" 
concept in PostgreSQL (PG) and have run across an apparent inconsistency in the 
transaction (TX) handling behavior with nested stored procedures. The 
inconsistency is present in both 16.4 and 17.4 running on Linux. I'm using 
pgAdmin and other clients with the default autocommit behavior (i.e., no 
AUTOCOMMIT OFF magic and no explicit START TRANSACTION; commands are being sent 
by the clients).

Per my understanding of the docs and some PG source code review:


  * When a top-level stored procedure is called, it implicitly creates a TX if there is no current TX.
  * When a nested stored procedure is called, it implicitly creates a subTX for that invocation.
  * When a BEGIN/EXCEPTION block is used, it implicitly creates a subTX for that block.

(It is not clear if a top-level procedure that uses BEGIN/EXCEPTION in the 
outermost block bothers with a subTX as it would be logically coincident with 
the main TX. A similar situation exists for nested procedures that use 
BEGIN/EXCEPTION, i.e., is there a coincident subTX inside a subTX?)

In my testing, as shown in the script below, using structured exception 
handling in nested stored procedures together with an autonomous TX workaround 
(for error logging) results in error 2D000 (see Test 3). Verbose logging shows 
it to be raised by function _SPI_rollback() at line 400 (16.4) or 399 (17.4) in 
spi.c. What seems inconsistent is that if the outer procedure does not use an 
EXCEPTION block (see Test 2), 2D000 is not thrown and the autonomous TX 
workaround works as desired.

Please advise if this is expected behavior.

Much thanks,
Kevin Stephenson

-- WARNING: Script contains DROP statements.
-- Greatly simplified schema for demonstration.
DROP TABLE IF EXISTS public.error_log;
CREATE TABLE public.error_log (
logging_routine_name text NULL,
sqlstate text NULL,
sqlerrm text NULL
);

DROP TABLE IF EXISTS public.dummy;
CREATE TABLE public.dummy (
data text NULL
);

CREATE OR REPLACE PROCEDURE public.inner_proc()
LANGUAGE plpgsql AS $$
DECLARE
dummy_var int;

BEGIN
-- Assuming subTX implicitly starts under (main) TX 'A'
INSERT INTO public.dummy (data) VALUES ('inner_proc');
dummy_var = 1/0;

EXCEPTION
-- Assuming only subTX implicitly rolled back
WHEN OTHERS THEN
-- Autonomous TX workaround.
ROLLBACK; -- rollback TX 'A' and start new TX 'B'

INSERT INTO public.error_log (logging_routine_name, sqlstate, sqlerrm)
VALUES ('inner_proc', SQLSTATE, SQLERRM);

-- commit TX 'B' and start new TX 'C'
COMMIT;
-- Autonomous TX workaround finished.

-- Rethrow for caller to handle.
RAISE;

END;$$;

CREATE OR REPLACE PROCEDURE public.outer_proc_simple()
LANGUAGE plpgsql AS $$
BEGIN
-- TX 'A' starts here
-- Simple example with no exception handling in outer proc.
INSERT INTO public.dummy (data) VALUES ('outer_proc_simple');
CALL public.inner_proc();
-- TX 'C' in aborted state with uncaught exception bubbling up to caller.
END;$$;

CREATE OR REPLACE PROCEDURE public.outer_proc_complex()
LANGUAGE plpgsql AS $$
BEGIN
-- TX 'A' starts here
-- Complex example that allows additional error logging.
INSERT INTO public.dummy (data) VALUES ('outer_proc_complex');
CALL public.inner_proc();

EXCEPTION
WHEN OTHERS THEN
-- TX 'C' should already be in aborted state. Finish it off and start TX 'D'.
ROLLBACK;

INSERT INTO public.error_log (logging_routine_name, sqlstate, sqlerrm)
VALUES ('outer_proc', SQLSTATE, SQLERRM);

-- We want to rethrow again so commit TX 'D'.
COMMIT;
RAISE; -- app layer can handle as appropriate
END;$$;

-- Test 1 (Works as expected.)
CALL public.inner_proc();
/*
ERROR:  division by zero
CONTEXT:  ... (truncated for brevity)
*/
SELECT * FROM public.dummy;
-- empty result set
SELECT * FROM public.error_log;
-- inner_proc, 22012, division by zero

-- Test 2 (Works as expected.)
TRUNCATE TABLE public.dummy;
TRUNCATE TABLE public.error_log;
-- Note: Do not run TRUNCATEs and CALL in a single batch.
-- Creates an outer TX that would not be done in real use.
CALL public.outer_proc_simple();
/*
ERROR:  division by zero
CONTEXT:  ... (truncated for brevity)
*/
SELECT * FROM public.dummy;
-- empty result set
SELECT * FROM public.error_log;
-- inner_proc, 22012, division by zero

-- Test 3 (Fails?)
TRUNCATE TABLE public.dummy;
TRUNCATE TABLE public.error_log;
--
CALL public.outer_proc_complex();
/*
ERROR:  invalid transaction termination
CONTEXT:  PL/pgSQL function inner_proc() line 14 at ROLLBACK
SQL statement "CALL public.inner_proc()"
PL/pgSQL function 

Re: BTREE index: field ordering

2025-03-29 Thread Kevin Stephenson
Moreno,

You have two lines in your WHERE clause commented out: the first is a duplicate 
check on flag = TRUE, and the other involves several checks on the "tipo" 
column. Will the final query, or set of related queries, actually need to 
filter on the "tipo" column? You currently have "tipo" in second place in your 
candidate index.

Thanks,
Kevin

From: Moreno Andreo 
Sent: Friday, March 28, 2025 5:38 AM
To: PostgreSQL mailing lists 
Subject: BTREE index: field ordering

Hi,
Postgres 16.4 (planning to go on 17.4)
I'm creating some indexes based on slow queries reported in the logs.
These queries involve a WHERE clause with more than 5 fields, matched with =, 
<>, LIKE and IN().
I read that equality fields should come first in the index, then the others.
Is that correct?

Based on this query

SELECT COUNT(id) AS total
FROM nx.tbl1
WHERE
(date_order >= '2025-03-21')
AND (date_order <= '2025-03-29')
AND (flag = TRUE)
AND ((
  -- (flag = TRUE)
  -- AND (((tipo <> 'C') AND (tipo <> 'V') AND (tipo <> 'F')) OR (tipo IS NULL) OR (tipo = ''))
  (((op <> 'C') OR (op IS NULL)) OR (tipo = 'F'))
  AND (s_state IN ('ENQ','WFR','BLQ','BLR','WFA','FRW','FRO','','0001'))
  AND (tiporic IS NOT NULL)
  AND (tiporic NOT LIKE '%cart%')
) OR (
  (tiporic LIKE '%cart%') AND (s_state <> 'CON') AND (s_state <> '0002') AND ((op <> 'C') OR (op IS NULL))
))
AND (priv IS NULL OR priv = false OR (priv = true AND idpriv = 'TEST'));

Should the following index be correct?

CREATE INDEX IF NOT EXISTS tbl1_count_idx on nx.tbl1 USING BTREE(flag, tipo, 
op, priv, idpriv, date_order, s_state, tiporic);

Would it be better to create a separate GIN/GIST index for the field matched 
with LIKE?
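For instance, something along these lines (a sketch only, assuming the pg_trgm 
extension is available; the index name is just illustrative):

-- Trigram GIN index to help the tiporic LIKE '%cart%' filter, which a plain
-- btree index cannot use for a leading-wildcard pattern.
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX IF NOT EXISTS tbl1_tiporic_trgm_idx
    ON nx.tbl1 USING GIN (tiporic gin_trgm_ops);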

Thanks in advance,
Moreno


Re: Quesion about querying distributed databases

2025-03-29 Thread Kevin Stephenson
Bumping this old thread to clarify a few points.

As an initial note on terminology, a "server" can be called a server, an 
instance, a server instance, or in PostgreSQL's (PG hereafter) case, a "DB 
Cluster." They all are used interchangeably in the wild to mean the same thing, 
with perhaps "server" being the most ambiguous. And that thing is a running 
process that is listening to (typically) a single port and handing connections 
off to individual threads or processes (or some other construct). The comments 
below use the word "instance" for consistency and brevity.

MariaDB
As mentioned by others, MariaDB (and MySQL proper, both hereafter referred to 
as MySQL) instances only support a single database. The CREATE DATABASE 
statement is a de facto synonym for CREATE SCHEMA (it is disappointing that 
MySQL and MariaDB docs still do not mention this significant fact). To disabuse 
anyone who believes MySQL supports multiple databases, query any of the 
INFORMATION_SCHEMA tables that have a *_CATALOG column (CATALOG is a synonym 
for DATABASE): every row reports the same catalog, "def", as in default. 
Further, the identifier used in CREATE DATABASE shows up in the 
INFORMATION_SCHEMA tables in the *_SCHEMA columns.
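
For example, a quick check (this should behave the same on MySQL and MariaDB):

-- Every table, view, etc. reports the same catalog, "def",
-- no matter how many CREATE DATABASE statements have been run.
SELECT DISTINCT table_catalog FROM information_schema.tables;
-- => def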

MySQL supports one- and two-part object naming for DML, and it supports foreign 
data access using the Federated (deprecated in MariaDB) and FederatedX (MariaDB) 
storage engines. One-part names use the currently selected schema (via the USE 
statement) along with the <object_name>. Two-part names are 
<schema_name>.<object_name>. Foreign data (e.g. tables) is also accessed using 
two-part names, and there are limitations on what queries can be used with them 
(I believe they can be organized in a way that one-part names would work as 
well). I'm unaware of how advanced the pushdown optimizations have become, but 
the concept of pushdown is critical to understand, both for the MySQL storage 
engines and for the Federated* "foreign data wrappers" (FDW).
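
To illustrate the Federated/FederatedX idea, a hypothetical sketch (the local 
definition must mirror the remote table; names and credentials are placeholders):

-- MySQL/MariaDB: a local table whose storage engine forwards reads and writes
-- to a table living on another instance.
CREATE TABLE remote_orders (
    id INT NOT NULL,
    amount DECIMAL(10,2),
    PRIMARY KEY (id)
) ENGINE=FEDERATED
  CONNECTION='mysql://app_user:app_pass@remote-host:3306/remote_schema/orders';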

To summarize, all MySQL instances have a single database called "def" and there 
is no way to create another database on a single instance at this time (MySQL 
9.2, MariaDB 11.8). To have more than one database currently requires running 
additional instances of MySQL on a different port. The idea that there are 
"high performance cross-database queries" in MySQL is simply incorrect at this 
time. There are high-performance cross-schema queries in MySQL, as there are in 
Microsoft SQL Server (MSSQL hereafter) and PG.

Note: MariaDB is planning to support real databases in 12.0, and they will be 
called CATALOGs, as some other RDBMSs do. See:
https://mariadb.com/kb/en/create-catalog/

It's a big TBD how well this will perform with InnoDB and other storage 
engines for cross-catalog queries (if that will even be possible).

MSSQL (On-prem, VM, Azure SQL Managed Instance)
MSSQL supports one-, two-, three- and four-part object naming.

One-part names are just the <object_name> and use the user's currently selected 
database (via the USE <database_name> command, or as set in the connection 
string), along with the user's default schema, as prefixes. That schema prefix 
is typically dbo (similar to public in PG). Behind the scenes, object names 
always resolve to three-part names: 
<database_name>.<schema_name>.<object_name>. The default schema can be changed, 
similar to the PG search path being changed.

Two-part names are <schema_name>.<object_name> and again use the user's 
currently selected database to get the fully qualified three-part name.

Where MSSQL differs from both MySQL and PG is in its support for three-part 
names. These are <database_name>.<schema_name>.<object_name>, and you can in 
fact run real "cross-database" MSSQL queries without FDW trickery, as shown 
below. Historically, I believe there were some light limitations on their 
usage, such as in foreign keys and triggers, but that may no longer be the case.
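
For example (database, schema, and table names are hypothetical):

-- T-SQL: one query joining tables in two databases on the same instance,
-- no FDW-style setup required.
SELECT o.order_id, c.customer_name
FROM SalesDb.dbo.orders AS o
JOIN CrmDb.dbo.customers AS c ON c.customer_id = o.customer_id;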

Linked Servers are the MSSQL near-equivalent of FDW and are used in four-part 
names: <linked_server>.<database_name>.<schema_name>.<object_name>, for example:
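
-- T-SQL: four-part name through a linked server (names are hypothetical;
-- the linked server must already be defined, e.g. with sp_addlinkedserver).
SELECT TOP (10) *
FROM ReportingServer.ArchiveDb.dbo.old_orders;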

Two other MSSQL concepts are CONTAINED DATABASES and Azure SQL Database (with 
"Elastic Queries" as FDW, shockingly in "preview" for about a decade now). 
Those are beyond the scope of this email.

PG
On the PG side (which I know the least about), it does have real databases with 
schemas, and it does support FDW. But since it uses a shared WAL (at least MSSQL 
separates logs per database; unclear on InnoDB et al.) and a shared security 
system, I don't understand why it doesn't support cross-database queries. That 
question would be best asked in a separate thread (and maybe it's already in 
the archives anyway). Putting all relevant objects in a single database with 
different schemas has already been mentioned as a workaround.
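
For completeness, a minimal sketch of the FDW route in PG (postgres_fdw; server, 
schema, and credential names below are placeholders):

-- postgres_fdw: expose tables from another database (or instance) as foreign
-- tables in the local database.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER other_db_srv
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'localhost', dbname 'other_db');

CREATE USER MAPPING FOR CURRENT_USER
    SERVER other_db_srv
    OPTIONS (user 'app_user', password 'app_pass');

CREATE SCHEMA IF NOT EXISTS other_db;

IMPORT FOREIGN SCHEMA public
    FROM SERVER other_db_srv
    INTO other_db;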

As far as some of the other back and forth on database (and system) 
architecture, that's really an enormous (and specialty) topic. Having 
previously worked on one of the largest MSSQL installations in the world, I 
have not seen anything in PG that would prevent a proper scaling strategy as 
long as "DB as API" was not being att