PQresultMemorySize of ResultSet larger than expected
Hi. I've instrumented my libpq code to show the byte size of each result-set, and I'm surprised that it's never smaller than 2,264 bytes; the next size I see is 3,288, etc., even when receiving no rows at all. I've trimmed out the SQL, but these are all SELECT statements. The output below is from Windows, if that matters. The two larger ones at the bottom are fetching bytea blobs.

So, is this expected? I'm surprised 2KB+ is used to return basically no rows. Where's the space going? Is the reported space the same as (or similar to) the actual bytes returned on the wire? Or does it somehow add up local allocations client-side? Is there any way to know how many bytes were actually returned on the wire? I'm basically curious, and would appreciate pointers to better understand this. Thanks, --DD

Running 1 test case...
... `-> got 1 rows x 16 cols, 3,288 bytes (2.7 MB/s) in 1.159ms
... `-> got 1 rows x 18 cols, 3,288 bytes (2.2 MB/s) in 1.417ms
... `-> got 1 rows x 18 cols, 3,288 bytes (4.1 MB/s) in 0.771ms
... `-> got 0 rows x 15 cols, 2,264 bytes (2.1 MB/s) in 1.041ms
... `-> got 0 rows x 3 cols, 2,264 bytes (2.2 MB/s) in 0.991ms
... `-> got 0 rows x 15 cols, 2,264 bytes (3.3 MB/s) in 0.662ms
... `-> got 0 rows x 3 cols, 2,264 bytes (4.3 MB/s) in 0.503ms
... `-> got 1 rows x 14 cols, 3,288 bytes (3.6 MB/s) in 0.863ms
... `-> got 1 rows x 16 cols, 3,288 bytes (5.0 MB/s) in 0.623ms
... `-> got 0 rows x 15 cols, 2,264 bytes (3.6 MB/s) in 0.602ms
... `-> got 0 rows x 3 cols, 2,264 bytes (4.4 MB/s) in 0.494ms
... `-> got 0 rows x 15 cols, 2,264 bytes (3.7 MB/s) in 0.591ms
... `-> got 0 rows x 3 cols, 2,264 bytes (3.9 MB/s) in 0.548ms
... `-> got 1 rows x 1 cols, 818,441 bytes (78.1 MB/s) in 9.996ms
... `-> got 1 rows x 1 cols, 3,288 bytes (7.9 MB/s) in 0.396ms
... `-> got 1 rows x 1 cols, 36,345 bytes (33.4 MB/s) in 1.038ms
*** No errors detected
*** tests_acme_foo in 2.926s (user: 0.219s) 38 MB
Press any key to continue . . .
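For reference, a simplified sketch of how such a number can be obtained (this is not my actual instrumentation, and the reporting wrapper in the log above is custom). Note that PQresultMemorySize() is documented to report the memory allocated client-side for the PGresult object, which is part of what I'm asking about, versus the wire size.

```
// Minimal sketch: report rows, columns and PQresultMemorySize() for a query.
#include <libpq-fe.h>
#include <cstdio>

void report_result_size(PGconn* conn, const char* sql) {
    PGresult* res = PQexecParams(conn, sql, 0, nullptr, nullptr,
                                 nullptr, nullptr, /*binary results*/1);
    if (PQresultStatus(res) == PGRES_TUPLES_OK) {
        std::printf("got %d rows x %d cols, %zu bytes\n",
                    PQntuples(res), PQnfields(res), PQresultMemorySize(res));
    }
    PQclear(res);
}
```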
COPY TO STDOUT WITH (FORMAT CSV, HEADER), and embedded newlines
Hi. It's my first time using COPY TO, and my first time using the built-in CSV support. It performs well. BUT...

The code below (real code, but using a custom libpq wrapper lib) is run on a few tables, with unit tests that verify the number of lines of the output file. For a few of those tables there's a mismatch: the output from PostgreSQL "has too many lines". I've tracked these down to text values in the DB with embedded newlines. These values are 'normal'. I'm not used to CSV, but I suppose such newlines must be encoded, perhaps as \n, since AFAIK CSV needs to be 1 line per row, no? So how is one supposed to configure the CSV output for a DB with embedded newlines? Thanks, --DD

auto rset = my_exec(*conn_, "COPY MY_TAB TO STDOUT WITH (FORMAT CSV, HEADER)");
if (rset.status() != PGRES_COPY_OUT) {
    raise("CSV Export via SQL COPY error: ", rset.error_msg());
}
std::ofstream os(file_name);
bool done = false;
while (!done) {
    auto buf = pq::CopyOutBuffer::get(*conn_);
    switch (buf.status()) {
    case pq::CopyOutStatus::eWait: assert(false); continue;
    case pq::CopyOutStatus::eDone: done = true; continue;
    case pq::CopyOutStatus::eError: raise("PQgetCopyData: {}", conn_->error_msg());
    case pq::CopyOutStatus::eData: break; // process it below
    }
    // Each buffer seems to be a single line of output,
    // with a Unix newline at the end, on all platforms.
    os.write(buf.data(), buf.size());
}
os.close();
Re: COPY TO STDOUT WITH (FORMAT CSV, HEADER), and embedded newlines
On Fri, Mar 11, 2022 at 7:38 PM Daniel Verite wrote:
> > Dominique Devienne wrote:
> > These values are 'normal'. I'm not used to CSV, but I suppose such newlines
> > must be encoded, perhaps as \n, since AFAIK CSV needs to be 1 line per row, no?
>
> No, but such fields must be enclosed by double quotes, as documented
> in RFC 4180 https://datatracker.ietf.org/doc/html/rfc4180

Hi Daniel. OK, good to know, thanks.

> Consider this output:
> psql> COPY (values (1, E'ab\ncd'), (2,'efgh')) TO STDOUT CSV;
> 1,"ab
> cd"
> 2,efgh

Yes, this is consistent with what I am seeing in the debugger. The value is indeed double-quoted, and definitely contains a _raw_ '\n' newline, not an encoded "backslash then n", as David was writing.

> That's 2 records on 3 lines.
> If you feed this to a parser and it chokes on it, it means that it's not a
> valid CSV parser.

I never claimed that parser was a CSV parser :). It's a naive "wc -l" equivalent written by someone else. And it's comparing PostgreSQL's CSV output to some old legacy "CSV" output that's home-grown, which does encode newlines as '\\' and 'n', and which most likely is not RFC compliant (I'm not surprised! :))). In my case, backward compatibility is more important than "compliancy" (if that's a word!), so I can easily do that "\n" encoding myself, as post-processing on the buffer I get back. Thank you for the help. --DD

PS: And David, no, it's not my wrapper that's at fault here :). It's a thin wrapper that's just easier, terser, and safer (RAII) to use compared to naked libpq.
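For the record, a rough sketch of the post-processing I have in mind (a hypothetical helper, untested): walk each COPY buffer, track whether we're inside a double-quoted CSV field, and rewrite raw newlines inside quoted fields as a backslash followed by 'n', the way the legacy output does.

```
// Hypothetical sketch: encode raw newlines found inside double-quoted CSV
// fields as the two characters '\' and 'n'. Assumes RFC-4180-style quoting,
// where "" inside a quoted field is an escaped quote (toggling twice is harmless).
#include <string>
#include <string_view>

std::string encode_embedded_newlines(std::string_view csv) {
    std::string out;
    out.reserve(csv.size());
    bool in_quotes = false;
    for (char c : csv) {
        if (c == '"') {
            in_quotes = !in_quotes;
            out += c;
        } else if (c == '\n' && in_quotes) {
            out += "\\n";   // embedded newline -> backslash + n
        } else {
            out += c;
        }
    }
    return out;
}
```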
Re: COPY TO STDOUT WITH (FORMAT CSV, HEADER), and embedded newlines
On Fri, Mar 11, 2022 at 8:32 PM Daniel Verite wrote:
> Dominique Devienne wrote:
> > so I can easily do that "\n" encoding myself, as a post-processing on
> > the buffer I get back.
>
> Alternatively, it might be easier to use the default TEXT format of
> COPY rather than CSV, as the TEXT format already produces \n for
> line feeds, along with half a dozen other special backslash sequences.
> See https://www.postgresql.org/docs/current/sql-copy.html

I actually just submitted CSV+PostProcessing :) But sure, if TEXT does the kind of pseudo-CSV I need, I'd change it to use that. I'll look into it next week. Thanks again Daniel. --DD
PSQL bug?
Made a typo, using } instead of ) to test a weird table name, and got disconnected. Normal? --DD

ddevienne=> create table "t |" ( id int};
ddevienne(> );
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
ddevienne=> create table "t |" ( id int);
ddevienne=> drop table "t |";
ddevienne=> \! psql -V
psql (PostgreSQL) 12.1
ddevienne=> select version();
                                                version
--------------------------------------------------------------------------------------------------------
 PostgreSQL 12.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
(1 row)
Re: PSQL bug?
On Thu, Mar 17, 2022 at 11:38 AM Pavel Stehule wrote:
> > Made a typo, [...], and got disconnected. Normal? --DD
>
> I don't see any problem on master

OK, good. Must have been fixed already then. Or it was a fluke. Thanks for checking. --DD
Re: Major release of advanced PostgreSQL driver written in C++
On Tue, Apr 12, 2022 at 12:17 PM Dmitry Igrishin wrote:
> I'm excited to announce Pgfe 2.0.0, an advanced and feature rich
> PostgreSQL driver written in C++.

Congrats on the release. I've been following your progress, and saw good stuff in your commits.

Could you please state more clearly in the project's README.md the minimal version of C++ required, and whether the interface or implementation changes with higher versions of the standard? Your hello-world example (extracted below) uses lambda-auto (C++14), probably variadic templates (C++11 and up, C++17 for fold-expressions), etc. So if you require C++20, that's important to know (since that would be out of bounds for me, for example).

In the same vein, which platforms / compilers / versions it's built and tested on is also useful to know upfront (again, as an example, I need to be cross-platform and cross-compiler, including Windows and MS VS).

My $0.02. And thanks for sharing your work. --DD

// Executing statement with positional parameters.
conn.execute([](auto&& r) {
    std::printf("Number %i\n", to(r.data()));
}, "select generate_series($1::int, $2::int)", 1, 3);
Who am I? Where am I connected?
LibPQ has various defaults for the host, user, and DB name. There's also the password file, the service file and service name. In the example below, I can connect with a "naked" psql invocation.

Once connected, can I find out all aspects of the connection string? Or where they came from, like a pgpass.conf or service file?

How do I get the host, port, and DB name once connected? SHOW and pg_settings do not appear to be it, at first glance. Thanks, --DD

c:\Users\ddevienne>psql
psql (12.1, server 14.2)
WARNING: psql major version 12, server major version 14.
         Some psql features might not work.
WARNING: Console code page (437) differs from Windows code page (1252)
         8-bit characters might not work correctly. See psql reference
         page "Notes for Windows users" for details.
Type "help" for help.

ddevienne=>
Re: Who am I? Where am I connected?
On Wed, May 18, 2022 at 12:07 PM Dominique Devienne wrote:
> LibPQ has various defaults for the host, user, and DB name.
> There's also the password file, the service file and service name.
> In the example below, I can connect with a "naked" psql invocation.
>
> Once connected, can I find out all aspects of the connection string?

Thank you all for \conninfo. I was more thinking at the time about the SQL way to get that info, not the psql way. But thanks to https://www.postgresql.org/docs/current/functions-info.html I managed to emulate it, modulo resolving the server's IP into a hostname.

ddevienne=> \conninfo
You are connected to database "ddevienne" as user "ddevienne" on host "localhost" (address "::1") at port "5432".
ddevienne=> select current_database() || ' ' || session_user || ' ' || inet_server_addr() || ':' || inet_server_port();
            ?column?
---------------------------------
 ddevienne ddevienne ::1/128:5432
(1 row)

> Or where they came from, like a pgpass.conf or service file?

OTOH, no one replied to that part of the question. How do I know if the user or database name was defaulted? Or came from a service file, using a given service name? Is there no way, except by reverse-engineering the logic of the env-vars and built-in defaults?
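For completeness, the client-side (libpq) way to read the effective values looks like the sketch below; it answers the "what are they" half, but not the "where did they come from" half.

```
// Sketch: reading the effective connection parameters from libpq itself.
// These report the values in use, not whether they came from defaults,
// environment variables, pgpass.conf or a service file.
#include <libpq-fe.h>
#include <cstdio>

void print_conninfo(PGconn* conn) {
    std::printf("db=%s user=%s host=%s port=%s\n",
                PQdb(conn), PQuser(conn), PQhost(conn), PQport(conn));
}
```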
Re: Who am I? Where am I connected?
On Wed, May 18, 2022 at 5:43 PM David G. Johnston wrote:
> On Wed, May 18, 2022 at 3:08 AM Dominique Devienne wrote:
> > Once connected, can I find out all aspects of the connection string?
> > Or where they came from, like a pgpass.conf or service file?
> >
> > How to get the host, port, db name once connected?
> > SHOW and pg_settings do not appear to be it, at first glance.
>
> The server has no clue how the values sent to it came into existence - nor
> should it.
>
> Whether and how any particular client might expose this kind of debugging
> information (or upgrade it to proper state info) is up to the client. I do
> not know what options psql offers.

AFAIK, it's not psql that does this though, it's libpq, the official client API. And the fact libpq has no way to surface that info seems like an important oversight.
No default for (user-specific) service file location on Windows?
Hi. The doc is explicit about defaults for the password file.

From https://www.postgresql.org/docs/current/libpq-pgpass.html
  Linux: ~/.pgpass
  Windows: %APPDATA%\postgresql\pgpass.conf

But for the service file, OTOH, only the Linux default is documented.

From https://www.postgresql.org/docs/current/libpq-pgservice.html
  Linux: ~/.pg_service.conf
  Windows: N/A

I tried the obvious %APPDATA%\postgresql\pg_service.conf, with no luck.

So is this a lack in the documentation for the service file? Or is it instead a lack of symmetry in the implementations of these two mechanisms? If the latter, any chance the symmetry would be restored (with a corresponding doc update)?

I guess the only Windows-specific work-around at this point, if the latter, would be defining the PGSERVICEFILE env-var explicitly? --DD
Re: No default for (user-specific) service file location on Windows?
On Thu, May 19, 2022 at 10:17 AM Julien Rouhaud wrote:
> > Or is it instead a lack of symmetry in the implementations of these
> > two mechanisms?
>
> As far as I can see from the code it should be
> %APPDATA%\postgresql\.pg_service.conf
>
> Does this one work?

Yes it does! Thanks.

> There is a discrepancy (apart from the documentation) as the default
> file name is the same on Linux and Windows.

The doc is explicit about Windows for the password file; it probably should be for the service file as well, IMHO. It's unfortunate that the naming conventions are different. That might need to be warned about in the doc too. My $0.02.

It could also be changed to try both on Windows, with or w/o the leading dot, but I suspect this kind of change won't be considered worth implementing :). Thanks again Julien, --DD
Re: No default for (user-specific) service file location on Windows?
On Fri, May 20, 2022 at 5:31 AM Julien Rouhaud wrote:
> On Thu, May 19, 2022 at 06:38:46PM -0400, Tom Lane wrote:
> > Julien Rouhaud writes:
> > Pushed, with some minor twiddling to make ... descriptions more alike.
>
> Thanks a lot!

And thank you both! --DD
Re: accessing postgres from c++
On Tue, Jun 21, 2022 at 7:59 AM Laurenz Albe wrote:
> I am surprised that you choose to be fettered by the constraints of a generic
> API like ODBC.
> For me, that only makes sense if you want to be portable to different
> databases.

Ditto.

> I would use ... C API of libpq, or, .. C++ boilerplate around it, use libpqxx.

There's also this modern C++ libpq wrapper, announced on this list earlier this year: https://github.com/dmitigr/pgfe

PS: Haven't used it, but I followed its development (to inform my own non-OSS C++ wrapper dev), and it looked solid from a distance.

PPS: I think we tried libpqxx in the past, and I kinda remember it forced you into non-binary mode, which was a non-starter for performance.
About revoking large number of privileges; And the PUBLIC role.
Hi. I'm struggling to delete databases because of grants to roles on objects of those DBs.

These DBs can have a large-ish number of schemas (100-300 is typical), and define a bunch of ROLEs "specific" to those schemas. Normally "login user" ROLEs are never granted explicit access to objects; instead only the "db specific" ROLEs get those grants, and regular users are granted some of those ROLEs.

So my goal is to delete all those "db specific" ROLEs, then the DB with all its schemas. Which implies REVOKE'ing grants on all those "db specific" ROLEs first. (And "login users" just implicitly lose membership in "db specific" roles when the latter are dropped.)

OK, so to help me achieve that "mass revoking", I thought I'd use pg_shdepend, but it turns out to not be that easy to figure this shared catalog out... I've made some progress on that, but then, trying various manual REVOKEs, I often don't see any changes in pg_shdepend, and I'm not sure whether it's because of PUBLIC, or because of DEFAULT PRIVILEGES, or because I'm not always revoking using the same ROLE as the one that did the grants, or something else I don't yet know about... So I thought I'd seek clarifications here, and go back to asking a few basic first-principles questions.

The way the GRANTs are made right now is via:
  ALTER DEFAULT PRIVILEGES IN SCHEMA $schema GRANT $priv TO $role
with (so far) $priv taking all these values:
- "SELECT ON TABLES",
- "USAGE, SELECT ON SEQUENCES",
- "EXECUTE ON ROUTINES",
- "USAGE ON TYPES".

The default privileges are updated *before* schema objects are created, so all those "db specific" (and "schema specific" too) roles are getting their object grants via DEFAULT PRIVILEGES. So my first question is whether revoking from the DEFAULT PRIVILEGES is enough to "ungrant" all those object privileges? My reading of the doc seems to imply it does, but after, for example,
  ALTER DEFAULT PRIVILEGES IN SCHEMA $schema REVOKE SELECT ON TABLES FROM $role
nothing in pg_shdepend changed. Could it be related to the PUBLIC role?

Which brings me to questions on PUBLIC. I suspect there's something I don't understand here. From my reading, all ROLEs are implicitly members of PUBLIC, and you cannot avoid that? And it seems many GRANTs are implicitly made to PUBLIC, without my realizing it? Just recently, I discovered any user could connect to new databases I created, when I didn't want that. I've started to explicitly `revoke all on database {} from public` every time I create a new DB. But I just don't quite understand why I can't seem to avoid PUBLIC having implicit access. Am I missing something?

So similarly, is PUBLIC getting implicit access via my DEFAULT PRIVILEGES too? My pg_shdepend results seem to indicate so, although I'm not 100% sure (which is why I'm here now). What steps do I need to take to ensure PUBLIC gets "nothing" on the DBs and SCHEMAs I create? I always want grants to be made explicitly, on specific ROLEs I control, never implicitly.

Right now, I'm doing most of my experimentation as a SUPERUSER, which could be skewing my (non-conclusive) results. That's unlikely to be the case in the "real world". I suppose it matters who revokes privileges? How can I translate rows in pg_shdepend into the proper REVOKE calls?
```
select c.relnamespace::regnamespace::text, r.rolname, count(*)
  from pg_database db
  join pg_shdepend dep on db.oid = dep.dbid
  join pg_roles r on dep.refobjid = r.oid
  join pg_class c on dep.objid = c.oid
 where datname = current_database()
   and refclassid::regclass::text = 'pg_authid'
   and classid::regclass::text = 'pg_class'
 group by 1, 2
```

So far, my (naive?) attempt above tells me how many dependencies I have between ROLEs and SCHEMAs (I think, at least!), but I don't know how to turn what the aggregation is hiding into the proper REVOKE calls, from what's inside pg_shdepend. Can someone help with that? Should it be a separate thread, just on that?

Finally, a tangentially related question, to end this first email, about what is transactional and what isn't. The doc doesn't seem to say much on this subject.
- Are all GRANT/REVOKE statements transactional?
- Is DB create/drop transactional?

I'm trying to get to grips with dropping DBs and ROLEs, and would appreciate this list's expert opinions. I suspect this is only the first message of a thread; I tried to put in just enough info I thought relevant to my first questions. Thanks, --DD
Re: About revoking large number of privileges; And the PUBLIC role.
On Thu, Jul 7, 2022 at 3:52 PM Tom Lane wrote:
> Dominique Devienne writes:
> > Hi. I'm struggling to delete databases because of grants to roles on
> > objects of those DBs.
> >
> > These DBs can have a large'ish number of schemas, 100-300 is typical,
> > and define a bunch of ROLEs "specific" to those schemas. Normally "login user"
> > ROLEs are never granted explicit access to objects, instead only the
> > "db specific" ROLEs get those grants, and regular users are granted some of those ROLEs.
> >
> > So my goal is to delete all those "db specific" ROLEs, then the DB
> > with all its schemas.
> > Which implies REVOKE'ing grants on all those "db specific" ROLEs first.
>
> You should not really have to revoke those manually.
> The normal process for that is to use DROP OWNED BY.

Except we already went through that: DROP OWNED BY acquires too many locks. Increasing max_locks_per_transaction when it fails is just not an option IMHO. One user had to raise it to 32K for his particular DB, which is not even that large. Or are you saying setting it to 1M or 1B is "safe", and should be required setup for users?

Is revoking privileges taking locks? Is dropping a DB taking locks? If neither are, then I can work around the limitations of DROP OWNED BY. So will the community help me figure this out?

BTW, I'm also hoping revoking privs, and dropping roles and DBs, will be faster than DROP OWNED BY. That DB took 30 min to 1 h to get rid of via DROP OWNED BY, which seems too long to delete a bunch of files, no?
Number of updated rows with LibPQ
Hi. Is there a way to programmatically know how many rows an UPDATE did update? I've read about [PQcmdTuples][1], but surely I shouldn't have to parse that string, no? For selects, I have [PQntuples][2], but what to do on INSERT, UPDATE, DELETE? Parse the result of PQcmdTuples myself??? If so, what's the 0 in the INSERT below? Is the output of PQcmdTuples "stable", i.e. "official"? Not locale-dependent? Thanks, --DD

[1]: https://www.postgresql.org/docs/14/libpq-exec.html#LIBPQ-PQCMDTUPLES
[2]: https://www.postgresql.org/docs/14/libpq-exec.html#LIBPQ-PQNTUPLES

postgres=# create table foo (v int);
CREATE TABLE
postgres=# insert into foo values (1), (2), (3);
INSERT 0 3
postgres=# update foo set v = 2*v where v = 2;
UPDATE 1
postgres=# delete from foo where v = 3;
DELETE 1
postgres=# select v from foo where v > 1;
...
(1 row)
postgres=# delete from foo;
DELETE 2
postgres=# drop table foo;
DROP TABLE
postgres=#
[libpq] OIDs of extension types? Of custom types?
Hi. I'm using binary binds and results for DMLs, as well as for COPY. So far, I've stayed within built-in scalar and array types, so I could hardcode the OIDs of values "in the type-system" (in C++, via traits). But I'd like to venture into extension (e.g. hstore) and custom (enums, domains, etc.) types. Thus I'm wondering:

1) whether "official" extensions have fixed/stable OIDs, as in my hstore example. If so, where are they defined?
2) how should I be looking up OIDs for custom (or extension?) types with libpq? Any specific APIs? Or do I need to do it in SQL instead?
3) If I duplicate custom types per-schema, to keep them standalone, they'll get different OIDs, right?

Thanks for any insights on the above. --DD
Re: Number of updated rows with LibPQ
On Wed, Oct 5, 2022 at 8:17 PM Tom Lane wrote:
> Laurenz Albe writes:
> > On Wed, 2022-10-05 at 16:38 +0200, Dominique Devienne wrote:
> > Yes, you have to use PQcmdTuples(), and you have to convert the string to
> > an integer.
> > But don't worry: the result will *not* be "INSERT 0 5", it will be just "5"

Thanks. What's the leading 0 though, then? I guessed it might be the number of rows returned, but it isn't, see below:

postgres=# create table foo (id serial primary key, v int);
CREATE TABLE
postgres=# insert into foo (v) values (1), (2) returning id;
 id
----
  1
  2
(2 rows)

INSERT 0 2

> Yeah, just applying atoi() or atol() to the result should be enough.

Thanks too. Since I'm in C++, I used something else instead, and discovered it can be empty, not 0. I guess atoi() would have hidden that distinction, and worked anyway (returning 0).

On the same topic, I've noticed an INSERT returns PGRES_COMMAND_OK, while an INSERT ... RETURNING returns PGRES_TUPLES_OK. So there's no way to use the status to distinguish a SELECT from anything else? A RETURNING clause makes any statement supporting it a hybrid of a command and a query, but then how does one know the exact "kind" of the statement? E.g. how does psql show INSERT in either case? By parsing the SQL itself, client-side? Or is there a libpq API on PGresult that would allow getting the type of statement the result is from? Thanks, --DD
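For the archives, a sketch along those lines (not necessarily the exact code I used), treating the empty string that PQcmdTuples() returns for commands without a row count as "no count" rather than 0:

```
// Sketch: affected-row count from PQcmdTuples(), distinguishing
// "no count available" (empty string) from an actual 0.
#include <libpq-fe.h>
#include <charconv>
#include <cstring>
#include <optional>

std::optional<long long> affected_rows(PGresult* res) {
    const char* s = PQcmdTuples(res);
    if (s == nullptr || *s == '\0') {
        return std::nullopt;              // e.g. CREATE TABLE, SET, ...
    }
    long long n = 0;
    auto [ptr, ec] = std::from_chars(s, s + std::strlen(s), n);
    return ec == std::errc{} ? std::optional<long long>{n} : std::nullopt;
}
```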
Re: [libpq] OIDs of extension types? Of custom types?
On Fri, Oct 14, 2022 at 2:31 PM Laurenz Albe wrote:
> You use the #defines like TEXTOID for the built-in Oids, right?

I don't. I used https://github.com/postgres/postgres/blob/master/src/include/catalog/pg_type.dat as a reference. I suspect that should be fairly stable, right? I have at least 2 or 3 dozen OID pairs (scalar + array) of primitives and other types (Oid, Name, Text, Bytea, Uuid, etc.). Are there #defines for all of those? Where?

template<> struct OidTraits {
    // boolean, true/false
    static constexpr Type type{ "bool", 1, Oid{ 16 }, Oid{ 1000 } };
};
template<> struct OidTraits {
    // variable-length string, binary values escaped
    static constexpr Type type{ "bytea", -1, Oid{ 17 }, Oid{ 1001 } };
};
etc...

> For types from an extension, you would run a query on "pg_type".

OK, thanks.
Re: [libpq] OIDs of extension types? Of custom types?
On Fri, Oct 14, 2022 at 4:35 PM Tom Lane wrote:
> > I don't. I used
> > https://github.com/postgres/postgres/blob/master/src/include/catalog/pg_type.dat
>
> They're stable

Good to know, thanks.

> but writing magic numbers leads to unreadable code.
> Use the macros from catalog/pg_type_d.h.

OK. I see that header. But it seems to be a "server" header, not a client one, i.e. I'm not sure it's a good idea to depend on such a header for pure client-side libpq code.

> >> For types from an extension, you would run a query on "pg_type".
> > OK, thanks.
>
> In SQL queries, you can avoid hard-wiring anything by writing
> things like "'hstore'::regtype". It may or may not be possible
> to avoid fetching the OID altogether that way.

Something like below, you mean? Thanks for the tip. I do need the OIDs, on binds and results (defines, in OCI speak), in my C++ code, because I try to enforce strong type safety between the C++ world and the libpq "bytes". I don't check OIDs of the values on binds, I just give them to libpq on execute(). But for results, I compare the actual OID (from PGresult) against the expected OID for the C++ type (via the traits).

ddevienne=> select 'hstore'::regtype;
 regtype
---------
 hstore
(1 row)

ddevienne=> select 'hstore'::regtype::oid;
    oid
-----------
 207025799
(1 row)

ddevienne=> select 'uuid'::regtype::oid;
 oid
------
 2950
(1 row)

ddevienne=> select 'uuid[]'::regtype::oid;
 oid
------
 2951
(1 row)
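And for the archives, a sketch of the kind of one-time lookup I'd do for extension or custom types (the function below and any caching around it are mine, hypothetical, not libpq API):

```
// Sketch: resolve a type name to its scalar and array OIDs once, at startup,
// instead of hard-coding OIDs that are only stable for built-in types.
#include <libpq-fe.h>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <utility>

std::pair<Oid, Oid> lookup_type_oids(PGconn* conn, const std::string& type_name) {
    const char* params[1] = { type_name.c_str() };
    PGresult* res = PQexecParams(conn,
        "select t.oid, t.typarray from pg_type t where t.oid = to_regtype($1::text)::oid",
        1, nullptr, params, nullptr, nullptr, /*text results*/0);
    if (PQresultStatus(res) != PGRES_TUPLES_OK || PQntuples(res) != 1) {
        PQclear(res);
        throw std::runtime_error("unknown type: " + type_name);
    }
    Oid scalar = static_cast<Oid>(std::strtoul(PQgetvalue(res, 0, 0), nullptr, 10));
    Oid array  = static_cast<Oid>(std::strtoul(PQgetvalue(res, 0, 1), nullptr, 10));
    PQclear(res);
    return { scalar, array };
}
```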
Re: Number of updated rows with LibPQ
On Fri, Oct 14, 2022 at 4:39 PM Daniel Verite wrote:
> Or a SELECT, or an UPDATE RETURNING, or a DELETE RETURNING.
> The code still needs to look at PQcmdStatus() to learn which kind it is,

Thanks Daniel. This is exactly what I needed. --DD

PS: Now I wonder whether PQcmdTuples() points to the same buffer as PQcmdStatus(), except with an offset...
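For future readers, a sketch of what "look at PQcmdStatus()" amounts to in code: the kind of statement is the first word of the command tag.

```
// Sketch: derive the statement "kind" from the command tag, since
// PGRES_TUPLES_OK alone can't tell SELECT from INSERT ... RETURNING.
#include <libpq-fe.h>
#include <string_view>

std::string_view command_tag(PGresult* res) {
    std::string_view status = PQcmdStatus(res);  // e.g. "INSERT 0 2", "UPDATE 1", "SELECT 3"
    auto space = status.find(' ');
    return space == std::string_view::npos ? status : status.substr(0, space);
}

bool is_insert(PGresult* res) { return command_tag(res) == "INSERT"; }
```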
byte-size of column values
Hi. I'm surprised by the result for bit(3) and char when calling pg_column_size(). Why 6, instead of 1? The doc does mention 5-8 bytes of overhead, but I expected those for varying bit, not fixed-sized bit-typed values. How come?

Similarly, why 2 for char? Is it linked to Unicode? 1 byte for the varying UTF-8 encoded length, then the (potentially) multi-byte encoding? (The very bit pattern of UTF-8 allows inferring the encoded length, so storing the length explicitly is theoretically not even necessary.)

Similarly, enums are always 4 bytes, I read, despite rarely exceeding cardinalities beyond what a single byte can store. How does one store several small enums as compactly as possible, on millions of rows? And is the minimum column size always 2?

I'm sure many will call this "premature optimization", but isn't using 32 bits instead of 2, 3 (or 8, to round to a byte) wasteful, in disk space and thus ultimately energy? (OK, that last one is pushing it :) ). I'm sure there are reasons for the above. And I guess I'm curious about them. Thanks, --DD

ddevienne=> create table foo (b3 bit(3), i2 int2, i4 int4, i8 int8, c char);
CREATE TABLE
ddevienne=> insert into foo values (b'101', 1002, 1004, 1008, 'C');
INSERT 0 1
ddevienne=> select pg_column_size(b3), pg_column_size(i2), pg_column_size(i4), pg_column_size(i8), pg_column_size(c) from foo;
 pg_column_size | pg_column_size | pg_column_size | pg_column_size | pg_column_size
----------------+----------------+----------------+----------------+----------------
              6 |              2 |              4 |              8 |              2
(1 row)
Re: byte-size of column values
On Tue, Oct 18, 2022 at 6:04 PM David G. Johnston wrote:
> On Tue, Oct 18, 2022 at 8:53 AM Dominique Devienne wrote:
> > I'm surprised by the result for bit(3) and char, when calling
> > pg_column_size().
>
> The base type is what matters; if the length of the actual type is a parameter
> (the (n) part) the underlying type must be variable.

Thanks. Interesting. Didn't know (n)-suffixed "fixed-length" types were always based on variable-size ones.

> > How does one store as compactly as possible several small enums
>
> int2

OK, I see. Thanks again.

> p.s., pretend char doesn't even exist.

I realize that now. Wasn't obvious to me, despite the warning in the doc.
Re: byte-size of column values
On Tue, Oct 18, 2022 at 6:04 PM Tom Lane wrote:
> Dominique Devienne writes:
> > I'm surprised by the result for bit(3) and char
> > The doc does mention 5-8 bytes overhead, but I expected
> > those for varying bit, not fixed-sized bit typed values.
>
> Your expectation is incorrect. Postgres always treats these types
> as variable-length, whether or not the column has a length constraint.

OK. Still, it wasn't such a stretch to assume that, no? Now I know better, thanks to you and David. I'm not sure the doc on types talks about that either. Didn't see it, for sure (but it could still be there and I missed it).

> Thus, there's always a header to store the actual length. That can
> be either 1 or 4 bytes (I think the doc you are looking at might be
> a little out of date on that point).

Even the doc on v15 (or devel) still says 5-to-8: https://www.postgresql.org/docs/15/datatype-bit.html
And on my v12, that's borne out by my experimentation. Being used to SQLite's varints, I'd have expected fewer overhead bytes for the size, like your 1-to-4.

> Because of the popularity of variable-width character encodings,
> a column declared as N characters wide isn't necessarily a fixed
> number of bytes wide, making it a lot less useful than you might
> think to have optimizations for fixed-width storage. Between that
> and the fact that most Postgres developers regard CHAR(N) as an
> obsolete hangover from the days of punched cards, no such
> optimizations have been attempted.

Thanks for the background. I definitely appreciate PostgreSQL's large 1GB limit on text and bytea columns, coming from Oracle's tiny 4K one, which created all kinds of headaches for us.

For kicks, I looked at bpchar, blank-padded char, and its extra byte, which I assume is again some kind of length; there at least the overhead is small compared to bit(n). 1 byte versus 5 bytes is no small difference.

ddevienne=> create table bar (bpc bpchar(16));
CREATE TABLE
ddevienne=> insert into bar values ('foo'), ('bar baz');
INSERT 0 2
ddevienne=> select length(bpc), pg_column_size(bpc), '<'||bpc||'>' from bar;
 length | pg_column_size | ?column?
--------+----------------+----------
      3 |             17 |
      7 |             17 |
(2 rows)
How to store "blobs" efficiently for small and large sizes, with random access
Hi. I'd like some advice on storing blobs (millions of them), ranging from very small to large (> 1GB). I know about bytea versus lo, and I have probably read most of what's out there about them :)

Upfront, I have to state that I'm not keen on lo, because of security considerations. We store blobs in many different schemas, and users can access some schemas and not others. So the fact that the lo table is unique for the whole database would allow users to see blobs from any schema, as I understand it. Right? OTOH, lo has random access, which I also need... (I'm also not a fan of lo needing triggers for lifetime management, nor of the requirement to have an explicit transaction to use lo.)

Here are the main requirements I need to fulfil:
1) store literally millions of rows, 1 "blob" per row (scientific data).
2) most "blobs" are smallish, below a few KBs. bytea's perfect for that.
3) yet many blobs are in the dozens of MBs. bytea's still OK.
4) but some blobs exceed the 1GB bytea limit. Found at least a dozen just in our test data, so clients will have them.
5) for accessing larger blobs, the API I must implement accesses contiguous chunks of the blobs. Thus I need random access.

I'm porting the backend of that API from Oracle to PostgreSQL. In Oracle we used RAW for smaller values, and SecureFile blobs for larger ones. Oracle LOBs are similar to lo, I guess, providing random access, with higher performance for large sizes, but slower than bytea for smaller ones. And a PITA to deal with, with locators, breaking pre-fetching, etc. PostgreSQL bytea is much better and simpler, except limited to 1GB...

bytea also has no direct random access, except via substr[ing], but how efficient and "random access" is that? For example, SQLite stores large values in overflow pages, and has to traverse all those pages to pointer-chase the chain of them for large blobs, so its substr() "random access" is NOT O(1) and involves way too much IO (i.e. O(N)). Do TOAST'ed large values (text or bytea) fare better in terms of substr() efficiency, close to O(log N) and only the IO necessary for what's actually requested by the substr() (modulo the ~2K chunks of TOAST)?

Some posts on the internet also peg lo to be faster than bytea above 20MB, for example. Is that even true? One post I read (from Daniel Verite, I think) kind of hinted the lo table is not that different from TOAST ones, with 2K chunks, so I don't see why lo would be any faster than a TOAST'ed bytea, for example. Any insights someone with knowledge of the internals can share on this?

At this point, I have not yet redone the extensive benchmarking we did a few years ago between Oracle and PostgreSQL, which compared lo versus bytea then. And benchmarks are hard to do, not having easy access to different kinds of servers with different storage backends, or cloud-hosted PG. It's too easy to get bias from a local setup, leading to a design that would not perform optimally in a different one. That's why I'm asking a more open-ended question to the experts on this list.

Because of #4 above, I need to either use lo (but see above, notably the security concern), or roll my own TOAST^2 (squared), as I call it, where I manually "shard" / chunk large blobs into an auxiliary table, itself TOAST'ed of course, with some threshold for chunks (e.g. 1 or 4 or 16 MB perhaps). The latter keeps the blobs in the schema (good for security), lifetime is managed by FKs (as usual), and the sharding limits the ill effects of "emulating" random access with substr() if necessary.
I've already done things like this in SQLite land (which also has the 1GB limit for its text and blob types). So is this a terrible idea? What alternatives do I have? I'd really appreciate some expert advice on the above, before I go too far down the rabbit hole. Thanks, --DD

PS: Another drawback of lo is that, because it's a single table, it is still subject to the 32 TB limit on a relation. The 4TB limit per lo is way more than we need, but the 32 TB limit may actually be more of an issue for our larger clients, which have thousands of projects, each with upwards of a few million of those blobs. bytea values being stored in different schemas (per-project, a design constraint) puts that limit per-project, which will be more than enough. For the sum of all projects, maybe not... I.e. with a real client case of 3K projects, that puts an average of only 10GB of lo's per project (i.e. schema), which could very well be problematic...
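To make the "TOAST squared" idea above more concrete, here's a rough sketch of the kind of auxiliary chunk table and write loop I have in mind (the table layout, names, and 4 MB chunk size are illustrative only; nothing is settled):

```
// Sketch of manual sharding of large blobs into a per-schema chunk table.
//
//   create table blob_chunk (
//     blob_id bigint not null,   -- FK to the owning row
//     seq     int    not null,   -- chunk index, 0-based
//     data    bytea  not null,
//     primary key (blob_id, seq)
//   );
#include <libpq-fe.h>
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

constexpr size_t kChunkSize = 4 * 1024 * 1024;

void write_blob_chunks(PGconn* conn, int64_t blob_id, const char* bytes, size_t size) {
    for (size_t off = 0, seq = 0; off < size || seq == 0; off += kChunkSize, ++seq) {
        size_t len = std::min(kChunkSize, size - off);
        std::string id_s = std::to_string(blob_id), seq_s = std::to_string(seq);
        const char* values[3] = { id_s.c_str(), seq_s.c_str(), bytes + off };
        const int lengths[3]  = { 0, 0, static_cast<int>(len) };
        const int formats[3]  = { 0, 0, 1 };   // id/seq as text, data as binary
        PGresult* res = PQexecParams(conn,
            "insert into blob_chunk(blob_id, seq, data) values ($1, $2, $3)",
            3, nullptr, values, lengths, formats, 0);
        bool ok = PQresultStatus(res) == PGRES_COMMAND_OK;
        PQclear(res);
        if (!ok) throw std::runtime_error(PQerrorMessage(conn));
    }
}
```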
Re: How to store "blobs" efficiently for small and large sizes, with random access
On Wed, Oct 19, 2022 at 12:17 PM Andreas Joseph Krogh wrote:
> First advice, don't do it. We started off storing blobs in DB for “TX safety”

Not really an option, I'm afraid.

> , but backup/restore quickly became too cumbersome so we ended up moving
> all blobs out and only store reference in DB.
> This required us to make a “vacuum system” that cleans up the blob-storage
> regularly as ROLLBACK/crash can make it out of sync.

Note the fact the data is spread across many mostly independent schemas. Might that ease the backup/restore? I'm not much of a DBA though...

> We chose storing as LO because with it, streaming large blobs (not using
> much memory) actually worked, with JDBC at least.

I'm in C++, with (I believe) efficient use of binary binds and results, and use of COPY as much as possible, so as good as it gets, I think (that I know of, with libpq), in terms of performance. Haven't looked at libpq's new-in-v14 pipeline mode yet, though. Investigated Cursor vs Statement too, and it's a tradeoff between latency and throughput. Not sure how good the JDBC drivers for PostgreSQL are. I'm doing my best to go fast with libpq.

In any case, thanks for your input. But it's not really a question of "if". But of "how". Putting thousands of large blobs in the file system is a no-go, assuming the clients can even see the file system the server sees. This is a 2-tier system; there's no mid-tier that would somehow magically handle proper security and lifetime management of these blobs. Thanks, --DD
Re: How to store "blobs" efficiently for small and large sizes, with random access
On Wed, Oct 19, 2022 at 1:00 PM Andreas Joseph Krogh wrote:
> Ok, just something to think about;

Thank you. I do appreciate the feedback.

> Will your database grow beyond 10TB with blobs?

The largest internal store I've seen (for the subset of data that goes in the DB) is shy of 3TB. But we are an ISV, not one of our clients, which have truly massive scale for data. And they don't share the exact scale of their proprietary data with me...

> If so try to calculate how long it takes to restore, and comply with SLA,
> and how long it would have taken to restore without the blobs.

Something I don't quite get is why backup is somehow no longer needed if the large blobs are external. I.e. are you saying backups are so much worse in PostgreSQL than with the FS? I'm curious now. Also, managing the PostgreSQL server will mostly be the client's own concern. We are not into SaaS here.

As hinted above, the truly massive data is already not in the DB, is used by different systems, and is processed down to the GB-sized inputs that all the data put in the DB is generated from. It's a scientific-data-heavy environment. And one where security of the data is paramount, for contractual and legal reasons. Files make that harder, IMHO. Anyway, this is straying from the main theme of this post, I'm afraid. Hopefully we can come back to the main one too. --DD
Re: How to store "blobs" efficiently for small and large sizes, with random access
On Wed, Oct 19, 2022 at 1:38 PM Andreas Joseph Krogh wrote:
> There's a reason “everybody” advices to move blobs out of DB, I've learned.

I get that. I really do. But the alternative has some real downsides too, especially around security, as I already mentioned. That's why I'd like, if possible, to get input on the technical questions of my initial post.

That's not to say we wouldn't ultimately move the big blobs outside the DB. But given how much that would complicate the project, I do believe it is better to do it as a second step, once the full system is up and running and testing at scale has actually been performed.

We've already moved another kind of data to PostgreSQL, from SQLite DBs (thousands of them) that time, and ported "as-is" the sharding done on the SQLite side to PostgreSQL (despite TOAST). And so far, so good. With good ingestion rates. And decent runtime access to data too, in the albeit limited testing we've had so far.

Now we need to move this other kind of data, from proprietary DB-like files this time (thousands too), to finish our system and be able to finally test the whole system in earnest, and at (our limited internal) scale.

So you see, I'm not completely ignoring your advice. But for now, I'm inquiring as to the *best* way to put that data *in* PostgreSQL, with the requirements / constraints I've listed in the first post. It may indeed be a bad idea long term. But let's make the most of it for now. Makes sense? Am I being unreasonable here? --DD
Re: How to store "blobs" efficiently for small and large sizes, with random access
On Wed, Oct 19, 2022 at 3:01 PM Daniel Verite wrote:
> Dominique Devienne wrote:
> > the fact the lo table is unique for the whole database would allow
> > users to see blobs from any schema, as I understand it.
>
> Each large object has its own set of permissions. This is a significant
> difference with bytea, since every creation of a new large object
> may need to be followed by GRANT statements.
> Also if the roles and the access policies are changed in the
> lifetime of the app, that might imply massive REVOKE/GRANT
> statements to apply to existing objects.

Thank you Daniel. Very interesting, and something I definitely didn't know. I believe that's doable, given our design on ROLEs, but it would for sure be both a PITA and additional management / code to deal with. At least GRANTs are transactional, like the new LO oids themselves, I think, so now I know it would be possible to properly secure the LOs. This insight is greatly appreciated. --DD
Re: How to store "blobs" efficiently for small and large sizes, with random access
On Wed, Oct 19, 2022 at 3:05 PM Alvaro Herrera wrote:
> On 2022-Oct-19, Dominique Devienne wrote:
> > OTOH, lo has random access, which I also need...
>
> Generally speaking, bytea sucks for random access, because if a TOAST
> item is compressed, it has to be always read from the beginning in order
> to decompress correctly. However, if you set
> ALTER TABLE tab ALTER COLUMN byt SET STORAGE EXTERNAL
> then compression is not used, and random access becomes fast.

Thank you Álvaro. Quite insightful as well. I was aware of TOAST compression, but didn't make the connection to the side-effects on random access. But now that TOAST has LZ4 support, which decompresses extremely fast compared to zlib (I have experience with LZ4 for a WebSocket-based server's messages), and choosing an appropriately small shard/chunk size, that might be mitigated somewhat. Would need testing / benchmarking to compare uncompressed vs LZ4, at various chunk and subset/offset sizes, of course.

Does anybody have an answer to my question regarding how substr() works on bytea values? I.e. is it "pushed down" / optimized enough that it avoids reading the whole N-byte value, only to then pass it to substr(), which then returns an M-byte value (where M < N)? If TOAST stores 2,000-byte chunks, and those chunks' PKs are numbers, simple arithmetic should be able to select only the chunks of interest, thus incurring only the necessary IO for the selected range, no? Or does the fact that subsetting a bytea currently requires substr() prevent such a scenario? And if so, why not support a native subsetting notation that did support that scenario, like the obvious bytea_col[offset, count] or bytea_col[start:end]? It seems to me that efficient native subsetting of varlength values would be quite valuable.
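For context, the access pattern I'm after looks roughly like the sketch below (table and column names are made up; whether the substring gets pushed down efficiently is exactly my question):

```
// Sketch: random access into a bytea column via substring(), in binary mode.
// Offsets are 0-based on the C++ side and converted to 1-based for SQL.
#include <libpq-fe.h>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

std::vector<char> read_blob_range(PGconn* conn, int64_t blob_id, int32_t offset, int32_t count) {
    std::string id_s = std::to_string(blob_id);
    std::string from_s = std::to_string(offset + 1);   // SQL substring() is 1-based
    std::string for_s = std::to_string(count);
    const char* values[3] = { id_s.c_str(), from_s.c_str(), for_s.c_str() };
    PGresult* res = PQexecParams(conn,
        "select substring(data from $2::int4 for $3::int4) from my_blobs where id = $1::int8",
        3, nullptr, values, nullptr, nullptr, /*binary result*/1);
    if (PQresultStatus(res) != PGRES_TUPLES_OK || PQntuples(res) != 1) {
        PQclear(res);
        throw std::runtime_error(PQerrorMessage(conn));
    }
    const char* p = PQgetvalue(res, 0, 0);
    std::vector<char> out(p, p + PQgetlength(res, 0, 0));
    PQclear(res);
    return out;
}
```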
Re: How to store "blobs" efficiently for small and large sizes, with random access
On Wed, Oct 19, 2022 at 4:29 PM Daniel Verite wrote:
> In your case I would probably opt for bytea (as opposed to large
> objects), and slicing the blobs in the application in chunks

Thanks for the advice, and the valuable info on LO permissions. --DD
Re: How to store "blobs" efficiently for small and large sizes, with random access
On Wed, Oct 19, 2022 at 5:30 PM Alvaro Herrera wrote:
> That's exactly what I was trying to say.
> If there's no compression, we don't read prior chunks.

Great to read that. I'll probably try to benchmark with and without compression eventually. I need to deal with other issues first, so it will take a while to report back on this.

> This can be seen in detoast_attr_slice()

Thank you for the pointer. I'll be sure to have a look at that code. And thanks again for chiming in with very useful info, Alvaro.
Re: How to store "blobs" efficiently for small and large sizes, with random access
On Wed, Oct 19, 2022 at 5:05 PM Laurenz Albe wrote:
> On Wed, 2022-10-19 at 12:48 +0200, Dominique Devienne wrote:
> > On Wed, Oct 19, 2022 at 12:17 PM Andreas Joseph Krogh wrote:
> > > First advice, don't do it. We started off storing blobs in DB for “TX safety”
> > Not really an option, I'm afraid.
>
> You should reconsider. Ruling out that option now might get you into trouble
> later. Large Objects mean trouble.

Andreas, Ericson, Laurenz, thanks for the advice. I'll be sure to discuss these concerns with the team. We have other (bigger) data in the file system, albeit of a more read-only nature perhaps. And this is an area where I'm not familiar with how security is handled, so I'll investigate whether a path forward to externalize the largish blobs (currently destined to live in the DB) is possible. So I hope you can see I'm not dismissing what you guys are saying.

But before I finish with this thread for now, I'd like to add that I consider it unfortunate that NOT putting the data in the DB is the mostly agreed-upon advice. It IMHO points to a weak point of PostgreSQL, which does not invest in those use cases with large data, perhaps with more file-system-like techniques. Probably because most of the large users of PostgreSQL are more on the "business" side (lots of rows, but on the smaller side) than the "scientific" side, which (too often) uses files and file-in-a-file formats like HDF5.

FWIW, when Oracle introduced SecureFile blobs years ago in v11, they represented a leap forward in performance, and back then we were seeing them being 3x faster than LO at GB sizes, if I recall correctly, with throughput that challenged regular networked file systems like NFS. That was over 10 years ago, so who knows where we are now. And from the posts here, the issues with large blobs may be more related to backup/restore, perhaps, than runtime performance.

Having all the data in the DB, under a single security model, is a big win for consistency and simplicity. And the fact that it's not really possible now is a pity, in my mind. My (probably uninformed) opinion is that large blobs are handled just like other relational data, in paged storage designed for smaller data. I.e. file-like blobs are shoehorned into structures which are inappropriate for them, and a rethink and redesign is necessary specifically for them, similar to the Oracle SecureFile one of old.

I have similar gripes with SQLite, which is otherwise a fantastic embedded DB. Just see how the SQLite-based Fossil-SCM fails to scale for very large repos with big (e.g. game) assets, and how it similarly failed to scale in SVN a long time ago, to be replaced by a forest-of-files (which Git also uses). DBs like PostgreSQL and SQLite should be better at this. And I hope they get there eventually.

Sorry to turn a bit philosophical here. It's not a critique per se, more the personal musings of a dev in this space for a long time. FWIW. Thanks, --DD
Re: How to store "blobs" efficiently for small and large sizes, with random access
On Thu, Oct 20, 2022 at 12:21 PM Laurenz Albe wrote:
> On Thu, 2022-10-20 at 10:32 +0200, Dominique Devienne wrote:
> > FWIW, when Oracle introduced SecureFile blobs years ago in v11, it
> > represented a leap forward in performance, and back then we were seeing
> > them being 3x faster than LO at GB sizes, if I recall correctly,
>
> I don't know what exactly they are, but I suspect that they are just
> files (segments?) in Oracle's "file system" (tablespaces/datafiles).
> So pretty much what we recommend.

I don't think so :) Pretty much the opposite, in fact. Don't let the term SecureFile fool you. From the DB client's perspective, they are a server-side value like any other, living in a row/col (relation/tuple), pretty much like a BYTEA value. But unlike BYTEA, and like LO, what you SELECT or UPDATE is a LOB locator, not the content itself, which is then used in separate APIs, so very much like LO. So you get the best of both BYTEA (acts like a value, in a tab/col) and LO (random access). Fully transactional.

In fact, the LOB locator is a handle into the MVCC machinery, so you can get the locator(s) within a transaction, commit that transaction, and later, if you read from the locator, you are still AS OF that transaction (i.e. read-consistent). That's super handy for lazy-loading the data in the app on demand. You of course expose yourself to "snapshot too old". We missed that dearly in PostgreSQL. We could emulate it with an explicit SNAPSHOT left open, but that's not nearly as convenient.

And there are specialized APIs that allow operating on *multiple* LOBs in a single server round-trip, which was essential for performance with smaller ones. In Oracle, anything above 32KB had to be a LOB (at the time), yet you don't want to do a round-trip for each and every 32KB chunk of data when you have thousands to load (unless you shard yourself "manually", but good luck matching the perf of SecureFile LOBs).

I'm not privy to how they were implemented internally. But I do believe the old blobs they were replacing (at the time) were more like LO is, i.e. handled (mostly) like the other datatypes, in the table infrastructure; while these new (in v11) blobs were handled internally completely differently, more in a file-system manner. Thus the large performance gap between the OLD and NEW Oracle LOBs. But from the outside, that's an "implementation detail". They were in the DB, transactional, value-like (modulo the level of indirection for random access), and importantly, efficient. I really wish PostgreSQL had an equivalent.

There's apparently an Oracle ACE on this list, so they can fill in the gaps above, or correct any falsehoods I wrote. This is over 10 years old for me; back then I was a big fan of another ACE, Tom Kyte, whose books helped me a lot, and I was neck-deep in OCI for a few years, but I was just a mostly self-taught Oracle developer, so definitely not an expert like an ACE. FWIW, --DD
Re: Getting PSQL in Windows to support TAB/Autocomplete via modified readline...
On Mon, Nov 21, 2022 at 4:58 PM Tom Lane wrote:
> Kirk Wolak writes:
> > We have our team of Windows developers, leveraging PSQL. But honestly,
> > it feels crippled after using PSQL in Linux for any length of time. Losing
> > auto-complete sucks (multi-line query recall/editing is lost as well).
> >
> > In researching this problem, it appears that the decision was made like
> > 17yrs ago, when windows did not have a realistic "terminal" type
> > interface. Assuming we target Windows 8.1 or higher, I believe this goes
> > away.
>
> It'd certainly be nice if we could use Readline on Windows. I do not
> think we want to buy into maintaining our own fork of Readline, if that's
> what you're trying to suggest. If it "just works" now, that'd be great.
> Otherwise, maybe you should be speaking to the Readline maintainers
> about what changes are needed in it?

FWIW, I've been using https://github.com/arangodb/linenoise-ng for Linux and Windows, but that's C++ and no longer supported (it worked well enough for me, though). The main https://github.com/antirez/linenoise itself does not build as-is on Windows still, I believe. --DD
Re: Getting PSQL in Windows to support TAB/Autocomplete via modified readline...
On Mon, Nov 21, 2022 at 6:12 PM Kirk Wolak wrote:
> On Mon, Nov 21, 2022 at 11:01 AM Dominique Devienne wrote:
> > FWIW, I've been using https://github.com/arangodb/linenoise-ng for Linux
> > and Windows,
>
> I've looked at linenoise and without tab/autocomplete it's not worth the
> effort, IMO.

Not sure I understand... Linenoise* has completion hooks. Completion is something done in the app, not in readline/linenoise; only the app has the context to know what to complete the text with. But perhaps I'm missing something? --DD

https://github.com/arangodb/linenoise-ng/blob/master/include/linenoise.h
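What I mean by completion hooks, roughly (this is the upstream linenoise C API as I remember it; linenoise-ng kept it largely source-compatible, but double-check against the header linked above):

```
// Sketch of app-provided completion with the linenoise API:
// the library only calls back into the application, which decides what to offer.
#include "linenoise.h"
#include <cstdlib>
#include <cstring>

static void psql_like_completion(const char* buf, linenoiseCompletions* lc) {
    if (std::strncmp(buf, "se", 2) == 0) {
        linenoiseAddCompletion(lc, "select ");
    }
    // ... a real client would consult its own keyword/catalog knowledge here
}

int main() {
    linenoiseSetCompletionCallback(psql_like_completion);
    while (char* line = linenoise("db> ")) {
        // handle the line ...
        std::free(line);
    }
}
```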
Re: how to implement add using upsert and trigger?
On Mon, Nov 28, 2022 at 1:37 PM yin@163.com wrote:
> on conflict(itemid) do update
> set value = excluded.value + new.value, cnt = excluded.cnt + 1
> where excluded.itemid = new.itemid;

OT, but isn't `where excluded.itemid = new.itemid` redundant, given `on conflict(itemid)`? I'm asking more because I'm not sure, for my own education. Thanks, --DD
Re: Stored procedure code no longer stored in v14 and v15, changed behaviour
On Thu, Dec 1, 2022 at 4:23 PM Tom Lane wrote:
> "David G. Johnston" writes:
> > On Thu, Dec 1, 2022 at 7:59 AM Martijn Tonies (Upscene Productions) wrote:
> >> Can you modify the server code to store the original body in proc.prosrc
> >> again? It would be very helpful.
> >
> > I seem to recall that this option had been discussed and rejected when this
> > feature went in. The parsed body is a feature because its contents can be
> > updated due to, e.g., renaming of objects. The text source of the original
> > command would be unable to be updated in the same way and so it is possible
> > the text and the parsed contents would diverge over time, which is a
> > situation we do not want to have.
>
> Indeed. We used to have a similar situation with respect to column
> default expressions and CHECK constraint expressions. Eventually we got
> rid of the textual storage of both, because it couldn't be maintained
> in a reasonable way.
>
> I think the answer here is "don't use the new syntax if you want the
> function body stored textually". You can have one set of benefits,
> or the other set, but not both at once.

FWIW, we have a custom ad-hoc schema introspection and diff'ing framework, and the fact that the original SQL is not preserved as-is has also created issues for us. On Oracle, our SQL was preserved as-is, so it could be compared reliably. While on PostgreSQL, some names-in-SQL are rewritten, the text is reformatted, etc., so this byte-perfect comparison is no longer possible, and we must rely on heuristics (a.k.a. hacks...) to do a fuzzy compare (since we lack a real PG-compatible SQL parser to do an infoset comparison instead, at the AST level for example).

So it's not just a matter of browsing the schema. For us, it's a *functional* issue. --DD
Re: Stored procedure code no longer stored in v14 and v15, changed behaviour
On Thu, Dec 1, 2022 at 5:10 PM Laurenz Albe wrote:
> On Thu, 2022-12-01 at 16:38 +0100, Dominique Devienne wrote:
> > FWIW, we have a custom ad-hoc schema introspection and diff'ing framework,
>
> This is arguable, but my opinion is that this is not a robust way to
> do development. You should use a schema versioning tool like Liquibase,
> develop schema migration scripts and maintain the SQL code in a source
> repository like other software.

We don't maintain SQL. We maintain a *logical* model, and generate the physical model from it. FKs, NKs, enums, CHECK constraints, indexes, etc.: all those "implementation details" are programmatically generated, and always consistent, from a much higher-level and simpler model. And you also get auto-upgrade most of the time, greatly increasing development velocity too. I would argue that NOT doing it this way is the non-robust way, myself :)

We've been refactoring a large data model maintained manually, as you advocate, and I can't tell you how many anomalies we've discovered and had to fix, using the more robust formalism of a high-level logical model and (DDL) code generation.

I guess it's a DBA-versus-developer point-of-view difference. --DD
Re: Stored procedure code no longer stored in v14 and v15, changed behaviour
On Thu, Dec 1, 2022 at 6:41 PM Adrian Klaver wrote:
> On 12/1/22 09:24, Dominique Devienne wrote:
> > I guess it's a DBA-versus-developer point-of-view difference. --DD
>
> What this points to is that there are multiple ways to handle this, many
> external to the server itself. My take is that the system catalogs are
> there for the proper operation of the server and that is their task,
> first and foremost. If you can piggyback off that then great, but with
> the knowledge that the information may change to meet the needs of the
> server, not external users.

I see. Still, Oracle preserves SQL as-is. SQLite preserves SQL as-is. It would be nice if PostgreSQL did too. That's all I'm saying.

Having functionality in libpq that allows doing the same kind of SQL normalization / rewriting done server-side would help. Then I could use that to generate the DDL "just right" the first time.

For now, our work-around is a two-step process. We first generate, at build time, the DDL using "our syntax" and instantiate the schema. Then we introspect that and re-generate code with the "rewritten syntax". Subsequent generation (it's dynamic, at runtime) uses the re-generated code that matches the syntax rewrite, so the introspection and diff'ing now match the in-memory DDL. Still, that's a PITA.

I of course don't contest that PostgreSQL maintains what it needs. But ALSO maintaining the original, at least until a rewrite is necessary on renames, would go A LONG WAY toward satisfying the OP and myself in our use cases. FWIW. --DD
Re: Stored procedure code no longer stored in v14 and v15, changed behaviour
On Thu, Dec 1, 2022 at 8:09 PM Christophe Pettus wrote:
> > On Dec 1, 2022, at 11:05, Dominique Devienne wrote:
> > I see. Still, Oracle preserves SQL as-is. SQLite preserves SQL as-is.
> > It would be nice if PostgreSQL did too. That's all I'm saying.
>
> Since this is a custom-built system, there is nothing keeping you from
> creating your own table in the database that stores the original text of the
> function.

That's not the point. If a DBA updates one of our triggers or procs or whatever else, the recorded info in a custom table won't be affected. We are diff'ing the server-side schema against the expected in-memory model of the physical model. Thus the dictionaries are the only source of truth we can trust for the current state of the schema. And besides minor syntactic differences, and some more troublesome object-name rewrites, this is exactly what we want. The system itself needs to preserve the original DDL, IMHO. --DD
Re: Stored procedure code no longer stored in v14 and v15, changed behaviour
On Fri, Dec 2, 2022 at 1:37 PM Laurenz Albe wrote: > Great; then go ahead and use those databases, if it is important for you. Now come on. We all love PostgreSQL. But that doesn't mean we can't disagree on some decisions. Especially when you are a USER of PostgreSQL, not a DEV of it, and it's too late by the time you are even aware of the changes. Not everyone can chime in on the dev-list when those are discussed. From a user point of view, it can also be seen as a "regression", to some extent, when an observable property of the system changes in a different / incompatible way. I'm not saying it is, but still, it is a change one discovers too late, that creates pain for some, and that is worth both reporting and discussing. Given the tone of the discussion here though, I don't think there's much to hope for a middle ground... > In the same vein, I don't think any of those databases have ... bloom indexes. SQLite has bloom filters now :) (not persistent indexes)
Re: Stored procedure code no longer stored in v14 and v15, changed behaviour
On Thu, Dec 1, 2022 at 8:51 PM Tom Lane wrote: > Do you really fail to see the contradictions in this? You want the > database to preserve the original DDL, but you also want it to update > in response to subsequent alterations. You can't have both those Hi. I probably didn't express myself correctly. I don't think there's a contradiction. I originally wrote: "maintaining the original, at least until a re-write is necessary on renames". By that I meant that the SQL would be preserved as-is, *initially*. But that if/when a rename affecting that SQL happens, then it's fair game to re-write it. Because then the diff between my in-memory code-generated DDL and the server-side DDL is no longer a false positive, as it is now from the "pre-emptive" re-write. What is causing me pain is the fact that the re-write of the SQL is *eager* instead of *lazy*. I.e. I'm paying for the rewrite even when it's not strictly necessary (from my POV at least). I hope that makes more sense. Thanks, --DD
Re: Tools for moving normalized data around
On Wed, Jan 18, 2023 at 9:03 PM Jeremy Smith wrote: >> How to do this with two tables connected via one foreign key, that is >> explained a dozen times in Stackoverflow. But what if the tree is 50 >> tables and 120 foreign key columns? >> It can be done. But probably not manually. >> >> So, as this seems a very usual use-case for normalized data, is there >> any tooling available? Anywhere? (I searched, I didn't find.) > > I haven't used it, but this seems like the problem that Jailer is trying to > solve: https://github.com/Wisser/Jailer Seems to do the subsetting alright, but it doesn't mention conflict-resolution for inserting into an already populated target schema. We've done something similar, merging many same-schema DBs into a single local SQLite DB (with conflicts, thus no constraints), iteratively resolving the conflicts (SK/PK and NK) per-table in topological order, propagating PK changes to FKs in child tables later. Then load the result into a fully-constrained PostgreSQL DB. Clever'er minds can probably do it all in SQL, but we did it with a combination of imperative code and SQL. Not exactly your use case Peter, but close enough I think. I don't think a tool could have done what we did; it's too ad-hoc and specific to our use case. Took a while, and required lots of testing (unit tests, and functional QA tests). FWIW. --DD
Re: Sequence vs UUID
On Mon, Jan 30, 2023 at 5:11 PM veem v wrote: > CREATE TABLE test1_UUID ( id bigint,source_id varchar(36) PRIMARY KEY, Name > varchar(20) ); Maybe if you used a "native" 16-byte uuid, instead of its textual serialization with dashes (36 bytes + length overhead), the gap would narrow.
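For concreteness, a minimal sketch of what I mean (mirroring the quoted example; the table name is hypothetical):

```
CREATE TABLE test1_uuid_native (
    id        bigint,
    source_id uuid PRIMARY KEY,  -- 16 bytes, vs. varchar(36) + length overhead
    name      varchar(20)
);
```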
Fwd: Sequence vs UUID
Copying the list... -- Forwarded message - From: Dominique Devienne Date: Fri, Feb 3, 2023 at 4:57 PM Subject: Re: Sequence vs UUID To: veem v On Thu, Feb 2, 2023 at 8:47 PM veem v wrote: > Tested the UUIDv7 generator for postgres as below. > With regards to performance , It's still way behind the sequence. [...] > explain analyze select count(nextval('myseq') ) from > generate_series(1,100000); > Execution Time: 59.687 ms > > explain analyze select count(gen_random_uuid()) from > generate_series(1,100000); > Execution Time: 904.868 ms > > explain analyze select count(uuid_generate_v7()) from > generate_series(1,100000); > Execution Time: 1711.187 ms > Something's off regarding Guid generation IMHO... You generate 100K Guids in ~1s. While we generate (in C++, Windows Release, using Boost) 16M of them in +/- the same time:

Enabling Performance tests
generate 16'000'000 guids in 0.980s (user: 0.984s) 12 MB
generate 16'000'000 guids in parallel on 4 CPUs in 0.309s (user: 1.188s) 12 MB

That's 2 orders of magnitude faster. Sure there's some overhead from the SQL, but still. Something seems fishy. And that's on a 2.5y old desktop. --DD
Re: Sequence vs UUID
On Fri, Feb 3, 2023 at 5:48 PM veem v wrote: > Actually I did the testing by connecting to "https://dbfiddle.uk/", > postgres version 15. > > PostgreSQL 15.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 > 20210514 (Red Hat 8.5.0-10), 64-bit > > Am I doing it wrong, please confirm? > > No clue. But even SQLite can generate 1M uuids (16-byte random blobs being equivalent) in 1/3 of the time, so 30x faster. So your timings of generating 100K uuids and counting them seem way too slow to me. --DD

sqlite> select count(randomblob(16)) from generate_series(1,1000*1000);
QUERY PLAN
`--SCAN generate_series VIRTUAL TABLE INDEX 3:
[VDBE bytecode listing elided]
┌───────────────────────┐
│ count(randomblob(16)) │
├───────────────────────┤
│ 1000000               │
└───────────────────────┘
Run Time: real 0.278 user 0.250000 sys 0.00

> On Fri, 3 Feb 2023 at 21:28, Dominique Devienne > wrote: > >> Something's off regarding Guid generation IMHO... >> You generate 100K Guids in ~1s. While we generate (in C++, Windows >> Release, using Boost) 16M of them in +/- the same time: >> >
Re: Sequence vs UUID
On Tue, Feb 7, 2023 at 3:47 PM Merlin Moncure wrote: > On Mon, Feb 6, 2023 at 1:22 PM Peter J. Holzer wrote: > >> On 2023-02-06 20:04:39 +0100, Julian Backes wrote: >> But UUIDs are random and that plays havoc with locality. > > > This is really key. [...] the databases I've seen that are written with > the > UUID pattern appear to be written by developers oblivious to this fact. > Well, perhaps these developers are not dealing with temporally clustered data like a commerce-related DB, but rather with more scientific data? In any case, this developer will definitely investigate ULIDs vs UUIDs vs Ints (sequences), based on the info from this thread. I'm aware of fragmentation issues, and cache hit/miss issues etc, in general; but was probably not sufficiently making the mental connection with UUIDs and PostgreSQL. So thanks everyone.
Using PostgreSQL for service discovery and health-check
Hi. I'm requesting advice on something I'm about to start on. In a normally 2-tier system, where "desktop" native apps connect directly to PostgreSQL to access data, some things must be mediated via a middle-tier service (i.e. 3-tier architecture). That service is HTTP based thus clients (the "desktop" native apps) must know the URL (host+port+path) of a server providing the requested service. Since clients must already have access to PostgreSQL to operate, I wanted to add a table in PostgreSQL for services, and server(s) on startup (probably using a random port) would register in that table, and deregister on shutdown. Also, since crashes are a fact of life, the server would regularly update the DB with a "heartbeat" on some frequency (e.g. 1s?), so clients would select (or see) only registered services with a "fresh enough" heartbeat timestamp. That's basically my plan. Now come the questions: 1) will updating a row every second (for example) create issues? 2) if yes to #1 above, what would be good mitigation tactics? Use different table for service vs heartbeat? Special kind of table? or configured in a particular way? 3) if a service crashes, it won't remove its row(s), obviously. What kind of mechanism exists to "reap" "zombie" services? 4) Related to #3 above, I think built-in "cron"-like services are only available via extensions, not in PostgreSQL proper. Why? Seems like such an essential service. 5) Which cron-like extension to use? Especially since we run both on-prem but also in managed-PostgreSQL on the cloud? I'd appreciate community input. Thanks, --DD PS: Note that there could be more than 1 server registered, for the same service, possibly on the same machine, for redundancy. But I think that's mostly orthogonal to my questions above.
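For concreteness, a minimal sketch of the registration table and queries I have in mind (all names, the heartbeat frequency, and the freshness threshold are hypothetical):

```
CREATE TABLE service_registry (
    service_name text NOT NULL,
    host         text NOT NULL,
    port         int  NOT NULL,
    started_at   timestamptz NOT NULL DEFAULT now(),
    heartbeat_at timestamptz NOT NULL DEFAULT now(),
    PRIMARY KEY (service_name, host, port)
);

-- Heartbeat, run by each server every second or so:
UPDATE service_registry
   SET heartbeat_at = now()
 WHERE service_name = 'acme-svc' AND host = 'srv1' AND port = 54321;

-- Client-side lookup, considering only "fresh enough" registrations:
SELECT host, port
  FROM service_registry
 WHERE service_name = 'acme-svc'
   AND heartbeat_at > now() - interval '5 seconds';
```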
WHERE col = ANY($1) extended to 2 or more columns?
Hi. We are implementing an API which takes a list of row keys, and must return info about those rows. To implement that efficiently, in as few round-trips as possible, we bind a (binary) array of keys (ints, uuids, or strings) and that works great, but only if the key is a scalar one. Now we'd like to do the same for composite keys, and I don't know how to do that. Is it possible? Could someone please help out or demo such a thing? We are doing it in C++ using libpq, but a pure SQL or PL/pgSQL demo would still help (I think). Thanks, --DD
Re: Using PostgreSQL for service discovery and health-check
On Thu, Feb 9, 2023 at 4:46 PM David G. Johnston wrote: > On Thu, Feb 9, 2023 at 8:30 AM Dominique Devienne > wrote: > >> That's basically my plan. Now come the questions: >> 1) will updating a row every second (for example) create issues? >> 2) if yes to #1 above, what would be good mitigation tactics? Use >> different table for service vs heartbeat? Special kind of table? or >> configured in a particular way? >> 3) if a service crashes, it won't remove its row(s), obviously. What kind >> of mechanism exists to "reap" "zombie" services? >> 4) Related to #3 above, I think built-in "cron"-like services are only >> available via extensions, not in PostgreSQL proper. Why? Seems like such an >> essential service. >> 5) Which cron-like extension to use? Especially since we run both on-prem >> but also in managed-PostgreSQL on the cloud? >> >> > You can probably get good mileage from CREATE UNLOGGED TABLE. > Thanks. Although I guess the fact it's not replicated to standby servers could be a problem? That's not something we test now, and also something the on-prem DBA (i.e. a client of ours) might want to set up on his/her own, on top of our client/server arch I guess. I have no experience with standbys (replication) and HA in PostgreSQL. Would having the main service table be a regular one, and the service_heartbeat be an unlogged one, be replication friendly? I.e. if failover to the standby happens, the service table is still there and populated, but the service_heartbeat is empty, but then the services would start populating it "transparently", no? I.e. would using 2 tables instead of 1 be a better design? A sketch of that split is below.
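A hedged sketch of that two-table split (names hypothetical): the durable registration is a regular, replicated table, while the high-churn heartbeat is unlogged. Note that an unlogged table may reference a permanent one, but not the other way around.

```
CREATE TABLE service (
    id   bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name text NOT NULL,
    host text NOT NULL,
    port int  NOT NULL,
    UNIQUE (name, host, port)
);

CREATE UNLOGGED TABLE service_heartbeat (
    service_id   bigint PRIMARY KEY REFERENCES service(id) ON DELETE CASCADE,
    heartbeat_at timestamptz NOT NULL DEFAULT now()
);
```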
Re: WHERE col = ANY($1) extended to 2 or more columns?
On Thu, Feb 9, 2023 at 4:51 PM David G. Johnston wrote: > On Thu, Feb 9, 2023 at 8:41 AM Dominique Devienne > wrote: > >> Now we'd like to do the same for composite keys, and I don't know how to >> do that. >> > > An array-of-composites is simply: > SELECT ARRAY[ ROW(1, 'one'), ROW(2, 'two') ]::composite_type[]; > Thanks. I don't consider that "simple" myself :). But I'm definitely not an advanced PostgreSQL user! Would still appreciate a more fleshed out demo, if anyone is kind enough to provide it. Thanks, --DD
Re: WHERE col = ANY($1) extended to 2 or more columns?
On Thu, Feb 9, 2023 at 5:03 PM Dominique Devienne wrote: > On Thu, Feb 9, 2023 at 4:51 PM David G. Johnston < > david.g.johns...@gmail.com> wrote: > >> On Thu, Feb 9, 2023 at 8:41 AM Dominique Devienne >> wrote: > >>> Now we'd like to do the same for composite keys, and I don't know how to >>> do that. >>> >> >> An array-of-composites is simply: >> SELECT ARRAY[ ROW(1, 'one'), ROW(2, 'two') ]::composite_type[]; >> > > Thanks. I don't consider that "simple" myself :). But I'm definitely not > an advanced PostgreSQL user! > Would still appreciate a more fleshed out demo, if anyone is kind enough > to provide it. Thanks, --DD > Also, I'm still not sure how to write that WHERE clause, with the $1 being an array of a composite type. And since this is a binary bind, what kind of OIDs to use. Your example seems to generate a type on-the-fly, David. Will we need to create custom types just so we have OIDs we can use to assemble the bytes of the array-of-composite bind? I believe there's an implicit ROW type created per table. Are there also implicit types for composite PKs and/or UNIQUE constraints? Lots of questions...
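As a hedged sketch of what I think such a query could look like, using an explicitly created composite type so the array has a concrete OID to bind against (table, type, and column names are hypothetical; AFAIK there is only an implicit row type per table, not one per PK or UNIQUE constraint):

```
CREATE TABLE tab (k1 int, k2 text, info text, PRIMARY KEY (k1, k2));
CREATE TYPE tab_key AS (k1 int, k2 text);

-- $1 would be bound (in binary) as a tab_key[]; spelled out literally here:
SELECT t.*
  FROM tab t
 WHERE ROW(t.k1, t.k2)::tab_key = ANY (ARRAY[ROW(1, 'one'), ROW(2, 'two')]::tab_key[]);
```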
Re: Using PostgreSQL for service discovery and health-check
On Thu, Feb 9, 2023 at 5:05 PM Adrian Klaver wrote: > On 2/9/23 07:30, Dominique Devienne wrote: > > In a normally 2-tier system, where "desktop" native apps connect > > directly to PostgreSQL to access data, some things must be mediated via > > a middle-tier service (i.e. 3-tier architecture). That service is HTTP > > based thus clients (the "desktop" native apps) must know the URL > > (host+port+path) of a server providing the requested service. > > Is there more then one server providing the same service? > Yes. That was my PS: basically. The client can connect to any one, randomly. We need at least one of course. But there could be more than 1, yes. > > Since clients must already have access to PostgreSQL to operate, I > > wanted to add a table in PostgreSQL for services, and server(s) on > > startup (probably using a random port) would register in that table, and > > deregister on shutdown. Also, since crashes are a fact of life, the > > server would regularly update the DB with a "heartbeat" on some > > frequency (e.g. 1s?), so clients would select (or see) only registered > > services with a "fresh enough" heartbeat timestamp. > > Would it no be easier to not have random ports and just attempt > connections to the servers either: > 1) In the client with reattempt to different port on failure. > 2) From Postgres server and update table to have current up servers. > I'm sorry, but I'm not following. Can you perhaps rephrase? Regarding ports, once you have registration of services, it just seems easier to me to NOT have a fixed port, and let the host assign any port to the HTTP server. Those servers are not user-facing directly; from the client side, it calls an API, and the lookup of the service and the connection to the HTTP server are transparent implementation details, so the port used doesn't matter. In-DB registration of (HTTP) servers makes the whole URL an implementation detail.
Re: WHERE col = ANY($1) extended to 2 or more columns?
On Thu, Feb 9, 2023 at 5:37 PM David G. Johnston wrote: > On Thu, Feb 9, 2023 at 9:28 AM Alban Hertroys wrote: > >> > On 9 Feb 2023, at 16:41, Dominique Devienne >> wrote: >> > Now we'd like to do the same for composite keys, and I don't know how >> to do that. >> >> This works: >> => select (1, 'one'::text) in ((1, 'two'::text), (2, 'one'::text), (1, >> 'one'::text), (2, 'two'::text)); >> > But you cannot write the right-side of the IN as a single parameter which > seems to be the primary constraint trying to be conformed to. > Right. The goal is to (re)use a prepared statement (i.e. plan once), and bind the RHS (binary) array and do a single exec (single round-trip) to get the matching rows. AFAIK, this is the fastest way. If there's a better/faster way, I'm interested. --DD
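For the scalar-key case, the SQL-level shape of that pattern is something like this hedged sketch (table and column names hypothetical); in libpq it maps to PQprepare plus PQexecPrepared with a binary array parameter:

```
CREATE TABLE tab (id int PRIMARY KEY, info text);

PREPARE fetch_by_ids(int[]) AS
    SELECT * FROM tab WHERE id = ANY ($1);

EXECUTE fetch_by_ids(ARRAY[1, 2, 3]);
```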
Re: Using PostgreSQL for service discovery and health-check
On Thu, Feb 9, 2023 at 5:51 PM Adrian Klaver wrote: > On 2/9/23 08:16, Dominique Devienne wrote: > > On Thu, Feb 9, 2023 at 5:05 PM Adrian Klaver wrote: > The flip side of that is that with known ports it would be easier to > have a process on the Postgres machine or in the database that checks > the ports on a regular basis. And as part of that process mark any non > responding ports as inactive. That would solve the zombie problem. > That's one possibility. But the "reaper" process could just as well scan the service table, and probe those too. So again, I'm not sure what the fixed-port approach gains me, besides perhaps the reaper not having to connect to PostgreSQL itself. I'm OK with connecting. Thanks for your input. Always good to have one's arguments challenged by experts.
Losing my latin on Ordering...
Hi. Porting a unit test to PostgreSQL, we got a failure related to ordering. We've distilled it to the below. The DB is en_US.UTF-8, and the sorting we get does not make sense to me. The same prefix can be sorted differently based on the suffix apparently, which doesn't make any sense to me. Surely sorting should be "constant left-to-right", no? What are we missing? I'm already surprised (star) comes before (space), when the latter "comes before" the former in both ASCII and UTF-8, but that the two "Foo*" and "Foo " prefixed pairs are not clustered after sorting is just mystifying to me. So how come? For now we can work around this by explicitly adding the `collate "C"` on the queries underlying that particular test, but that would be wrong in the general case of international strings to sort, so I'd really like to understand what's going on. Thanks, --DD PS: if I try "en_US.UTF-8" or "en_US"."UTF-8" for the collate, it fails. How come what pg_database.datcollate displays is not a valid value for collate? PPS: We tried on v12 and v13 I believe. Somehow my v14.2 on Windows doesn't have en_US as a collation...

ddevienne=> with t(v) as (values ('Foo All'), ('Foo*'), ('Foo Brief'), ('Foo*All'))
            select '<'||v||'>', v::bytea from t order by v collate "en_US";
  ?column?   |          v
-------------+----------------------
 <Foo*>      | \x466f6f2a
 <Foo All>   | \x466f6f20416c6c
 <Foo*All>   | \x466f6f2a416c6c
 <Foo Brief> | \x466f6f204272696566
(4 rows)

ddevienne=> with t(v) as (values ('Foo All'), ('Foo*'), ('Foo Brief'), ('Foo*All'))
            select '<'||v||'>', v::bytea from t order by v collate "C";
  ?column?   |          v
-------------+----------------------
 <Foo All>   | \x466f6f20416c6c
 <Foo Brief> | \x466f6f204272696566
 <Foo*>      | \x466f6f2a
 <Foo*All>   | \x466f6f2a416c6c
(4 rows)
Re: Losing my latin on Ordering...
On Tue, Feb 14, 2023 at 11:23 AM Laurenz Albe wrote: > On Tue, 2023-02-14 at 10:31 +0100, Dominique Devienne wrote: > > Hi. Porting a unit test to PostgreSQL, we got a failure related to > ordering. > > > > We've distilled it to the below. The DB is en_US.UTF-8, and the sorting > we get > > does not make sense to me. The same prefix can be sorted differently > based on > > the suffix apparently, which doesn't make any sense to me. > > > > Surely sorting should be "constant left-to-right", no? What are we > missing? > > No, it isn't. That's not how natural language collations work. > Honestly, who expects the same prefix to sort differently based on what comes after, in left-to-right languages? How does one even find out what the (capricious?) rules for sorting in a given collation are? I'm aware of natural sort orders where numbers embedded in the text influence the sort, so "Foo10" comes after "Foo9", but that's not what we are discussing here. "Foo*" and "Foo " have no logical relationship, like 9 and 10 do. > > I'm already surprised (star) comes before (space), when the latter "comes > > before" the former in both ASCII and UTF-8, but that the two "Foo*" and > "Foo " > > prefixed pairs are not clustered after sorting is just mystifying to me. > So how come? > > Because they compare identical on the first three levels. Any difference > in > letters, accents or case weighs stronger, even if it occurs to the right > of these substrings. > That's completely unintuitive... > > For now we can work around this by explicitly adding the `collate "C"` on > > the queries underlying that particular test, but that would be wrong in > the > > general case of international strings to sort, so I'd really like to > understand > > what's going on. > > Yes, it soulds like the "C" collation may be best for you. That is, if > you don't > mind that "Z" < "a". > I would mind if I asked for case-insensitive comparisons. So the "C" collation is fine with general UTF-8 encoding? I.e. it will be codepoint-ordered, OK?
Re: Losing my latin on Ordering...
On Tue, Feb 14, 2023 at 12:35 PM Alvaro Herrera wrote: > On 2023-Feb-14, Dominique Devienne wrote: > > Honestly, who expects the same prefix to sort differently based on what > > comes after, in left-to-right languages? > Look, we don't define the collation rules. > Ok, ok, sorry. To you, Laurenz, and everyone. I obviously disagree with these rules, but I'm a nobody, so who cares :) > > So the "C" collation is fine with general UTF-8 encoding? > > I.e. it will be codepoint ordered OK? > > Sure, just make sure to use the definition of C that uses UTF-8 encoding > (I think it's typically called C.UTF-8). > OK, so for new DBs, sounds like we need to CREATE DATABASE ... WITH LOCALE 'C.UTF-8' ENCODING UTF8 Correct? But what about existing DBs? Can the collation be changed a posteriori? ALTER DATABASE does not seem to support the same options. We don't want to have to sprinkle COLLATE "C" all over the place in the code. And there are quite a few DBs out there already. What to do about them?
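For the record, a hedged sketch of the CREATE DATABASE incantation I have in mind (database name hypothetical); note that specifying a locale different from template1's typically requires basing the new DB on template0, and that C.UTF-8 must be provided by the OS:

```
CREATE DATABASE mydb
    TEMPLATE template0
    ENCODING 'UTF8'
    LC_COLLATE 'C.UTF-8'
    LC_CTYPE   'C.UTF-8';
```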
DELETE trigger, direct or indirect?
Hi. This is a bit unusual. We have a foreign key between two tables, with ON DELETE CASCADE, to preserve referential integrity. But we apparently also need to preserve the severed reference (by natural key, i.e. its name), to later on reconnect the two entities after-the-fact, should the parent row re-appear later on (in the same transaction or not it still unclear). To achieve this weird requirement, I'd like to know if it is possible in an ON DELETE trigger to know whether the deletion is coming from a direct-DELETE in the "child table", or whether the deletion is coming from the "parent table" CASCADEd to the child table. Thanks, --DD
Re: DELETE trigger, direct or indirect?
On Thu, Feb 16, 2023 at 5:59 PM Adrian Klaver wrote: > On 2/16/23 08:55, David G. Johnston wrote: > > On Thu, Feb 16, 2023 at 9:46 AM Adrian Klaver > <mailto:adrian.kla...@aklaver.com>> wrote: > > > > On 2/16/23 05:23, Dominique Devienne wrote: > > > Hi. This is a bit unusual. We have a foreign key between two > tables, > > > with ON DELETE CASCADE, to preserve referential integrity. But we > > > apparently also need to preserve the severed reference (by > > natural key, > > > i.e. its name), to later on reconnect the two entities > > after-the-fact, > > > should the parent row re-appear later on (in the same transaction > > or not > > > it still unclear). > > > > This is going to need a more detailed description of the relationship > > between the two tables: > > > > 1) The actual FK relationship. > > > > 2) What "...preserve the severed reference (by natural key, i.e. its > > name)" means? > > > > 3) What information will be used to reconnect the child rows to the > > parent rows? > > > > > > Maybe the OP should be using ON DELETE SET NULL instead of CASCADE? > > That is where I am headed, however it will need more information to > determine whether that makes sense or not. > OK, I started writing SET NULL won't help, but I'll back up and try to give more info, as requested. Pseudo SQL at this point.

create table entity (name text primary key, ...);
create table enity_list (name text primary key, ...);
create table entity_list_member(
    list_name text not null references entity_list(name) on delete cascade on update cascade,
    entity_name text not null references entity(name) on delete cascade on update cascade,
    primary key (list_name, entity_name)
);

Above is the current situation. When the entity is deleted, it's implicitly deleted from all lists that mention it. Referential Integrity 101 I guess. But apparently, it's common enough for an entity to be deleted and reloaded, not necessarily in the same transaction, that losing the list(s) membership on delete is considered "a bug". One solution is to not do any referential integrity in the lists. But that opens the door to garbage-in a little too wide I think. So on second thought, maybe the SET NULL could be of use. I'd add a second non-FK column on the member assoc-table, transferring the old entity name to it thanks to an UPDATE on entity_name, thus preserving the old name. Then an INSERT trigger on entity could locate any (indexed) "stashed" entity names in that extra non-FK column in entity_list_member, to retransfer the name back to the primary FK column. I'd need to adjust the PK to a coalesce(), and ensure the two columns are mutually exclusive. Sounds like that might work, no?
How to avoid Trigger ping/pong / infinite loop
Hi. I have a large "legacy" code base that writes information necessary for Row-Level-Security in a highly denormalized custom ad-hoc text format for values, in key-value pairs in a table, which would be either impossible or too slow to base any RLS policy on. The values are basically lists or maps, depending on the key. I also have new code that will read and write that same information, but would much prefer to use a normalized data model, the same one that would be appropriate for efficient RLS. So I'm wondering if I can have my cake and eat it too, by synchronizing the normalized and denormalized information (necessarily duplicated then), server-side via triggers, w/o running into infinite loops. Are there techniques for situations like this? This question is not too far from my earlier question, in the sense that a trigger would need to know the context in which it was triggered, i.e. directly (then update the other model), or indirectly (don't update, the change is boomerang'ing around from our own change). Any thoughts? --DD PS: At this point, changing the legacy code base is not really an option...
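One guard I'm aware of against such trigger ping-pong is pg_trigger_depth(): only react to "outside" writes, and skip writes made by the companion trigger on the other model. A hedged, heavily simplified sketch (names hypothetical, the actual translation logic elided):

```
CREATE TABLE denormalized_kv (key text PRIMARY KEY, value text NOT NULL);

CREATE FUNCTION sync_to_normalized() RETURNS trigger AS $$
BEGIN
    -- ... translate NEW.key / NEW.value into the normalized tables here ...
    RETURN NULL;  -- return value is ignored for AFTER triggers
END $$ LANGUAGE plpgsql;

-- Fire only for top-level writes, not for writes coming from the
-- companion trigger on the normalized side (which runs at depth >= 1).
CREATE TRIGGER denorm_sync
    AFTER INSERT OR UPDATE ON denormalized_kv
    FOR EACH ROW
    WHEN (pg_trigger_depth() < 1)
    EXECUTE FUNCTION sync_to_normalized();
```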
Re: How to avoid Trigger ping/pong / infinite loop
On Thu, Feb 16, 2023 at 6:58 PM Adrian Klaver wrote: > > "... which would be either impossible or too slow to base any RLS policy > on." > > and > > "At this point, changing the legacy code base is not really an option..." > > seem to be at odds. > I don't see why you say that. > So is the current system working or not? > The legacy system is 3-tier, so uses the denormalized info in C++ in the mid-tier, while the new system uses PostgreSQL and is 2-tier, so the same denormalized info must drive both modes of execution. The same code-base is used for both, but with different backends. Making the code base work under two back ends is already hard enough, w/o changing it more extensively to use a new normalized model even in the legacy case. So is that clearer? But that's a bit orthogonal to my question too. > > I also have new code that will read and write that same information, but > > would much prefer to use a normalized data model, the same one that > > would be appropriate for efficient RLS. > > > > So I'm wondering if I can have my cake and eat it too, by synchronizing > > the normalized and denormalized information (necessarily duplicated > > then), server-side via triggers, w/o running into infinite loops. > > A VIEW over both sets of data? > I'm not following. How is that related to mutual synchronization via triggers? Keeping two copies of the data is of course denormalization, but cannot be avoided. One copy is basically an optimization for RLS, so could be read-only I guess, making the sync one-way and simpler, but then that would force any new code to also use the old denormalized way to update the info. Thus I'd prefer the new model to be read-write, but then that requires two-way sync. Thus my question.
Re: DELETE trigger, direct or indirect?
On Thu, Feb 16, 2023 at 6:51 PM Adrian Klaver wrote: > You have two tables with list in their name, so are rows deleted from > both. Just to be clear enity_list should actually be entity_list? > > Also how are entity and enity_list related? > I have to confess that your questions surprise me a bit. I thought the model was pretty clear from the SQL. We have two entities, Foo (my entity table), and another Bar (my entity_list table), with Bar referencing 0 or more Foos, recording which ones it references in an association table. > Are the values for the name field in entity and enity(entity)_list the > same for a given entity? > The names of Foo and Bar are completely independent. > > transferring the old entity name to it thanks to an UPDATE on > > entity_name, thus preserving the old name. > > How? > Again how would you determine where the action started? > Deleting a Foo cascades to the _member assoc-table. If the cascade is SET NULL, then I can have an UPDATE TRIGGER on _member. I no longer care where the action started. > > Then an INSERT trigger on entity could locate any (indexed) "stashed" > > entity names in that extra non-FK column in entity_list_member, > > How would it locate it if the name that defined the FK(entity(name)) was > NULL? > In the extra non-FK column I mentioned explicitly, in the _member assoc-table. A sketch of the whole idea is below.
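A hedged sketch of that design (names hypothetical, simplified: a unique expression index replaces the original PK on the member table):

```
CREATE TABLE entity      (name text PRIMARY KEY);
CREATE TABLE entity_list (name text PRIMARY KEY);

CREATE TABLE entity_list_member (
    list_name    text NOT NULL REFERENCES entity_list(name)
                     ON DELETE CASCADE ON UPDATE CASCADE,
    entity_name  text REFERENCES entity(name)
                     ON DELETE SET NULL ON UPDATE CASCADE,
    severed_name text,  -- stash of the name of a deleted entity
    CHECK (entity_name IS NOT NULL OR severed_name IS NOT NULL)
);

CREATE UNIQUE INDEX entity_list_member_key
    ON entity_list_member (list_name, coalesce(entity_name, severed_name));

-- When the SET NULL cascade clears the FK, remember the old name.
CREATE FUNCTION stash_severed_name() RETURNS trigger AS $$
BEGIN
    IF NEW.entity_name IS NULL AND OLD.entity_name IS NOT NULL THEN
        NEW.severed_name := OLD.entity_name;
    END IF;
    RETURN NEW;
END $$ LANGUAGE plpgsql;

CREATE TRIGGER member_stash BEFORE UPDATE OF entity_name ON entity_list_member
    FOR EACH ROW EXECUTE FUNCTION stash_severed_name();

-- When an entity (re)appears, re-attach any stashed references to it.
CREATE FUNCTION reattach_members() RETURNS trigger AS $$
BEGIN
    UPDATE entity_list_member
       SET entity_name = NEW.name, severed_name = NULL
     WHERE severed_name = NEW.name;
    RETURN NULL;  -- AFTER trigger: return value is ignored
END $$ LANGUAGE plpgsql;

CREATE TRIGGER entity_reattach AFTER INSERT ON entity
    FOR EACH ROW EXECUTE FUNCTION reattach_members();
```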
Re: Postgres Index and Updates
On Thu, Mar 2, 2023 at 10:08 AM Laurenz Albe wrote: > On Thu, 2023-03-02 at 15:53 +0800, Navindren Baskaran wrote: > If the other column is updated, it depends. If the updated column is not > indexed and there is enough room for the new row version in the same > table block, the index doesn't have to be modified. Otherwise it is. > > See > https://www.cybertec-postgresql.com/en/hot-updates-in-postgresql-for-better-performance/ Very interesting article, thanks Laurenz. I wasn't aware of HOT and its relation to fillfactor. At the risk of highjacking this thread a bit, although still on topic I think. What's the 2023 status of that zheap table storage work? And more specifically about a particular data-model of mine. I have a 3-level hierarchical parents-children-grandchildren table structure: 1) a "root" table, with tens to thousands (~200K max) of (small) rows. 2) a "folder" table, with 20 to 50 (small) rows *per* "root" row. 3) several "value" tables, with again 20 to 100 (large to very large) rows per "folder" row. The root and folder tables must maintain a "last modified" timestamp for their respective subtrees, which must be maintained via triggers (how else?). That makes those tables update-heavy, no? So from your article, those two tables, with smaller rows (and fewer rows total in general), should have a lower fillfactor (leaving free space in each block) to increase the chances of a HOT update? Am I interpreting your article (and its linked articles) correctly for this situation? TIA, --DD
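Concretely, the kind of tuning I'm thinking of is something like this hedged sketch (the table names are from my description above, and 70 is an arbitrary starting value to be validated by measurement):

```
ALTER TABLE root   SET (fillfactor = 70);
ALTER TABLE folder SET (fillfactor = 70);
-- Only newly written pages honor the new setting; a rewrite applies it everywhere:
VACUUM FULL root;
VACUUM FULL folder;
```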
CREATE/DROP ROLE transactional? GRANT/REVOKE?
Hi. Perhaps I missed it in the doc (e.g. [1]), but are DDLs around ROLEs and GRANTs transactional? Since I'm creating many ROLEs and making many GRANTs, based on info I read from PostgreSQL itself (in pg_catalog and elsewhere), should everything be in a single transaction? FWIW, I come from Oracle (where DDL is non-transactional and an implicit COMMIT), and SQLite (where ROLEs and GRANTs don't apply), thus this perhaps silly question. I'm already aware that SCHEMAs, TABLEs, etc... are transactionally created. But given that ROLEs are cluster-wide, and the doc on DDLs says nothing, I prefer to ask. Thanks, --DD [1]: https://www.postgresql.org/docs/current/sql-droprole.html
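For concreteness, the kind of script in question would be along these lines (a hedged sketch; role, database, and password values are hypothetical):

```
BEGIN;
CREATE ROLE app_owner NOLOGIN;
CREATE ROLE app_user LOGIN PASSWORD 'secret';
GRANT app_owner TO app_user;
GRANT CONNECT ON DATABASE mydb TO app_user;
COMMIT;  -- everything above becomes visible atomically, or nothing does on ROLLBACK
```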
Re: CREATE/DROP ROLE transactional? GRANT/REVOKE?
On Mon, Mar 6, 2023 at 4:06 PM Tom Lane wrote: > Dominique Devienne writes: > > Hi. Perhaps I missed it in the doc (e.g. [1]), but are DDLs around ROLEs > > and GRANTs transactional? > > Your expectation is set wrongly. Thanks for the precision Tom. Although I'm not sure where you read that my expectation was wrong. But nevermind. > DDL commands in Postgres are > transactional unless their man page explicitly says they're not. > OK, didn't know that. Glad to read it. Could you point to where in the doc this is stated? It's hard to be sure about something the doc does *not* mention, when what is *implied* is hard to find, or at a distance. Especially since at least one other well-known RDBMS differs in that department. I much prefer the way PostgreSQL handles DDL, still it's not obvious. My $0.02. Thanks, --DD
Re: CREATE/DROP ROLE transactional? GRANT/REVOKE?
On Mon, Mar 6, 2023 at 3:02 PM Erik Wienhold wrote: > > On 06/03/2023 14:19 CET Dominique Devienne wrote: > > Perhaps I missed it in the doc (e.g. [1]), but are DDLs around ROLEs and > > GRANTs transactional? > > Have you tried? Nope. I thought about it, but then I could have convinced myself with an invalid demo. So I preferred to ask the experts. > DDL is transactional unless stated otherwise (cf. CREATE DATABASE, > CREATE INDEX CONCURRENTLY, CREATE TABLESPACE). > I now see (and fully appreciate) the Note below: - CREATE DATABASE cannot be executed inside a transaction block. I didn't realize the fact CREATE ROLE didn't have that note was significant. > Run the following psql script: [...] > Thanks for the demo. Appreciated. > > Since I'm creating many ROLEs and making many GRANTs, based on info I read from > > PostgreSQL itself (in pg_catalog and elsewhere), should everything be in a > > single transaction? > > If it should be atomic and the commands are allowed in transactions, then > yes, > use transactions. > Thanks again. --DD
Re: CREATE/DROP ROLE transactional? GRANT/REVOKE?
On Mon, Mar 6, 2023 at 4:31 PM Tom Lane wrote: > Dominique Devienne writes: > > On Mon, Mar 6, 2023 at 4:06 PM Tom Lane wrote: > >> DDL commands in Postgres are > >> transactional unless their man page explicitly says they're not. > > > Could you point to where in the doc this is stated? > > For example, for CREATE DATABASE the first "Note" para in [1] is > > CREATE DATABASE cannot be executed inside a transaction block. > > I don't think we are entirely consistent about whether this is > mentioned in "Description" or "Notes", but it's there somewhere. > Yes, I noticed that, once I read Erik's email (it went to GMail's SPAM folder...). Still, my little brain didn't leap to the conclusion that other DDLs were implicitly transactional, because of the absence of a note that another DDL's doc has. All I'm saying is that it's not as obvious as you experienced PostgreSQL folks seem to think it is. FWIW. Thanks again, --DD
public schema grants to PUBLIC role
Hi. I've recently realized via a post (or article?) from Laurenz that the PUBLIC role has CREATE privilege on the 'public' schema by default (see query below). I guess it can't be avoided? OK, then I'll REVOKE that privilege when creating a new DB. Like I already revoked the default CONNECT to PUBLIC on the DB. But I'm wondering about unexpected side-effects. In particular, we need extensions, which are loaded in public by default. Will USAGE on public be enough for LOGIN users having access to the DB to use extensions? More broadly, we want to secure the DB so that all DB access and schema access are explicit. Anything else to be aware of please, beside the two mentioned above? Thanks, --DD
```
=> select grantor::regrole::text,
          case grantee when 0 then 'PUBLIC' else grantee::regrole::text end,
          privilege_type as priv, is_grantable as adm
   from pg_namespace, lateral aclexplode(nspacl)
   where nspname = 'public';
 grantor  | grantee  |  priv  | adm
----------+----------+--------+-----
 postgres | postgres | USAGE  | f
 postgres | postgres | CREATE | f
 postgres | PUBLIC   | USAGE  | f
 postgres | PUBLIC   | CREATE | f
(4 rows)
```
Re: public schema grants to PUBLIC role
On Thu, Mar 9, 2023 at 2:13 PM Laurenz Albe wrote: > Hi, > On Thu, 2023-03-09 at 10:34 +0100, Dominique Devienne wrote: > > Hi. I've recently realized via a post (or article?) from Laurenz that > the PUBLIC > > role has CREATE privilege on the 'public' schema by default (see query > below). > > I guess it can't be avoided? > > It can be avoided if you connect to "template1" and > > REVOKE CREATE ON SCHEMA public FROM PUBLIC; > > there *before* you create a new database. > Right. Didn't think of that. Thanks. > Or, as Christoph said, if you use v15 or better. > Because Managed Azure is still stuck at 14.2, that's currently not an option. We need both on-prem and managed Azure. > > More broadly, we want to secure the DB so that all DB access and schema > access are explicit. > > Anything else to be aware of please, beside the two mentioned above? > > Avoid SECURITY DEFINER functions with no "search_path" set: > https://www.cybertec-postgresql.com/en/abusing-security-definer-functions/ Thanks for the reminder. We already set "search_path" on our functions, because the client code does not always set the search_path at all, which was resulting in errors. No DEFINER functions either, yet. I also plan to look at the new function syntax, that eagerly resolves references at DDL time, rather than runtime, to avoid the search_path dependency at runtime completely. Although I'm worried about the introspection rewriting already discussed recently... --DD
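For reference, a hedged sketch of what "set search_path on our functions" looks like in practice (schema, table, and function names are hypothetical):

```
CREATE SCHEMA IF NOT EXISTS app;
CREATE TABLE IF NOT EXISTS app.widget (id int PRIMARY KEY);

CREATE OR REPLACE FUNCTION app.widget_count() RETURNS bigint
LANGUAGE sql STABLE
SET search_path = app, pg_temp
AS $$ SELECT count(*) FROM app.widget $$;
```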
Seq Scan because of stats or because of cast?
I'm interested in the members of specific roles, providing the roles of interest to the query via an array of integers (binary bind in code, not a textual array literal like I had to use to have the EXPLAIN work, see below). I figured that query would use the "pg_auth_members_role_member_index" index, but instead it's using a sequential scan. And I'm wondering if this is because the cardinality of that catalog is small (172), which is just an artifact of my dev-testing, or whether that's because I cast roleid to an int4, preventing the use of the index? In production, the cardinality will be much greater, which is why I worry a bit. Also, I don't really need the grantor and admin_option columns for now, thus it could even be an index-only scan, IF the index was used by the plan. I tried changing the cast around, or allowing an index-only scan, but it's still a Seq Scan on the table (see below). Is there a way to know why the index is not used, in any of my attempts? My code currently does not support (binary) binding of Oids, thus the ::int4 casts. Would supporting binding actual Oid arrays instead of Int4 arrays help in this case? I'd appreciate some insights here. Thanks, --DD PS: libpq uses unsigned int for Oid. So do OIDs go above 2GB in practice? I'm asking, since I'm casting to ::int4, thus if they do, then that cast might overflow. PPS: Are OIDs recycled / reused? Or are they monotonically increasing? What happens when the Cluster runs out of OIDs? Are they Cluster-wide unique or does it depend on the OID type?

dd_pns2=> explain SELECT roleid::int4, member::int4, grantor::int4, admin_option
dd_pns2->   FROM pg_auth_members
dd_pns2->  WHERE roleid::int4 = ANY($1);
ERROR:  there is no parameter $1
LINE 3:  WHERE roleid::int4 = ANY($1);
                                  ^

dd_pns2=> explain SELECT roleid::int4, member::int4, grantor::int4, admin_option
dd_pns2->   FROM pg_auth_members
dd_pns2->  WHERE roleid::int4 = ANY(array[1,2,3]);
                           QUERY PLAN
-----------------------------------------------------------------
 Seq Scan on pg_auth_members  (cost=0.00..5.33 rows=3 width=13)
   Filter: ((roleid)::integer = ANY ('{1,2,3}'::integer[]))
(2 rows)

dd_pns2=> explain SELECT roleid::int4, member::int4, grantor::int4, admin_option
dd_pns2->   FROM pg_auth_members
dd_pns2->  WHERE roleid = ANY(array[1::oid,2::oid,3::oid]);
                           QUERY PLAN
-----------------------------------------------------------------
 Seq Scan on pg_auth_members  (cost=0.00..5.33 rows=3 width=13)
   Filter: (roleid = ANY ('{1,2,3}'::oid[]))
(2 rows)

dd_pns2=> explain SELECT roleid::int4, member::int4
dd_pns2->   FROM pg_auth_members
dd_pns2->  WHERE roleid = ANY(array[1::oid,2::oid,3::oid]);
                          QUERY PLAN
----------------------------------------------------------------
 Seq Scan on pg_auth_members  (cost=0.00..5.33 rows=3 width=8)
   Filter: (roleid = ANY ('{1,2,3}'::oid[]))
(2 rows)

dd_pns2=> \d pg_auth_members
          Table "pg_catalog.pg_auth_members"
    Column    |  Type   | Collation | Nullable | Default
--------------+---------+-----------+----------+---------
 roleid       | oid     |           | not null |
 member       | oid     |           | not null |
 grantor      | oid     |           | not null |
 admin_option | boolean |           | not null |
Indexes:
    "pg_auth_members_member_role_index" UNIQUE, btree (member, roleid), tablespace "pg_global"
    "pg_auth_members_role_member_index" UNIQUE, btree (roleid, member), tablespace "pg_global"
Tablespace: "pg_global"

dd_pns2=> select count(*) from pg_auth_members;
 count
-------
   172
(1 row)
Re: Seq Scan because of stats or because of cast?
On Mon, Mar 13, 2023 at 2:53 PM Tom Lane wrote: > regards, tom lane > Thank you very much Tom. Very informative.
Re: Seq Scan because of stats or because of cast?
On Mon, Mar 13, 2023 at 2:53 PM Tom Lane wrote: > > PS: libpq uses unsigned int for Oid. So do OIDs go above 2GB in practice? > > Yes, eventually. > OK, I've added support for native OID values (i.e. unsigned int, 4 bytes) in my libpq wrapper. Tested with binary binding and getting of scalar and array values. But to truly test this is working OK, I'd need OIDs in the range [2^31, 2^32), while the OIDs in my DB only reach the 200M range. So, any way to force the DB to create OIDs in that special range? Without hosing my DB / Cluster that is... This is not a throw-away DB / Cluster. Thanks, --DD
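In the meantime, one hedged workaround I can think of, to exercise the same binary send/receive path without making the cluster actually consume ~2 billion OIDs, is to round-trip literal values in the high range:

```
SELECT '4000000000'::oid                AS big_oid,
       '{4000000000,4294967294}'::oid[] AS big_oid_array;
```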
Re: Is the PL/pgSQL refcursor useful in a modern three-tier app?
On Thu, Mar 16, 2023 at 6:48 AM Laurenz Albe wrote: > As we wrote, some of us think that cursors are useful, and we tried to > explain why we think that. If you don't think that cursors are useful, > don't use them. We are not out to convince you otherwise. > Perhaps OT (I only skimmed this thread) but when I compared Cursors to regular Statements / Queries from a LIBPQ client application perspective, on the same "streamable" queries (i.e. w/o a sort), Cursors shined in terms of time-to-first-row, compared to waiting for the whole ResultSet, but getting the full result OTOH took 2x as long with a Cursor, compared to the regular SELECT Statement. Thus in my mind, it really depends on what you value in a particular situation, latency or throughput. --DD PS: In my testing, I used forward-only cursors PPS: I don't recall the ResultSet cardinality or byte size, nor the batching used with the Cursor.
Re: Is the PL/pgSQL refcursor useful in a modern three-tier app?
On Thu, Mar 16, 2023 at 9:23 AM Pavel Stehule wrote: > čt 16. 3. 2023 v 9:18 odesílatel Dominique Devienne > napsal: > >> [...] depends on what you value in a particular situation, latency or >> throughput. --DD >> > > cursors are optimized for minimal cost of first row, queries are optimized > for minimal cost of last row > That's a nice way to put it Pavel. And to have it both ways, use COPY in binary protocol? That way the rows are streamed to you in arbitrary chunks as soon as available (I hope), and the burden is on you the client to decode and use those rows in parallel as they are "streamed" to you. I've yet to test that (thus the 'i hope' above). I used COPY binary for INSERTs, and COPY text/json for SELECTs, not yet COPY binary for SELECTs. I'm hoping the latency of COPY will be small compared to a regular SELECT where I have to wait for LIBPQ to assemble the whole ResultSet. Are my hopes unfounded? --DD
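For reference, the statement shape I have in mind is something like this hedged sketch (table and columns hypothetical); on the libpq side the rows then arrive via PQgetCopyData:

```
COPY (SELECT id, name, payload
        FROM big_table
       WHERE kind = 42)
  TO STDOUT WITH (FORMAT binary);
```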
Re: Is the PL/pgSQL refcursor useful in a modern three-tier app?
On Thu, Mar 16, 2023 at 11:15 AM Pavel Stehule wrote: > čt 16. 3. 2023 v 9:55 odesílatel Dominique Devienne > napsal: > >> On Thu, Mar 16, 2023 at 9:23 AM Pavel Stehule >> wrote: >> >>> čt 16. 3. 2023 v 9:18 odesílatel Dominique Devienne >>> napsal: >>> >>>> [...] depends on what you value in a particular situation, latency or >>>> throughput. --DD >>>> >>> >>> cursors are optimized for minimal cost of first row, queries are >>> optimized for minimal cost of last row >>> >> >> That's a nice way to put it Pavel. >> >> And to have it both ways, use COPY in binary protocol? >> > > COPY is a different creature - it has no execution plan, and it is not > interpreted by the executor. > OK. Not sure what that means exactly. There's still a SELECT, with possibly WHERE clauses and/or JOINs, no? Doesn't that imply an execution plan? I'm a bit confused. > Using COPY SELECT instead SELECT looks like premature optimization. > Possible. But this is not an e-commerce web-site with a PostgreSQL backend here. This is classical client-server with heavy weight desktop apps loading heavy weight data (in number and size) from PostgreSQL. So performance (throughput) does matter a lot to us. And I measure that performance in both rows/sec and MB/sec, not (itsy bitsy) transactions / sec. > The performance benefit will be minimal ([...]). > COPY matters on INSERT for sure performance-wise. So why wouldn't COPY matter for SELECTs too? > Cursors, queries can use binary protocol, if the client can support it. > I already do. But we need all the speed we can get. In any case, I'll have to try and see/test for myself eventually. We cannot afford to leave any performance gains on the table.
NULL pg_database.datacl
Hi. I'm surprised, I thought ACLs would never be empty for a database. Does that mean nobody can connect to this database? I guess SUPERUSER and/or its datDBA can? What does a NULL AclItem[] mean exactly? (BTW, datallowconn is true) Thanks, --DD

ddb=> select datname, datdba::regrole::text from pg_database where datacl is null;
 datname | datdba
---------+--------
 qadb    | qauser
(1 row)

PS: Was also surprised with #42883: ERROR: no binary output function available for type aclitem Thankfully datacl::text[] works fine in binary mode.
Re: NULL pg_database.datacl
On Mon, Mar 20, 2023 at 1:18 PM Erik Wienhold wrote: > > On 20/03/2023 11:52 CET Dominique Devienne wrote: > > What does a NULL AclItem[] mean exactly? > > It means that the object has default privileges (before any GRANT or REVOKE > is executed). For databases this means full privileges for the database > owner > and the CONNECT and TEMPORARY privileges for PUBLIC. So any user can > connect > if allowed by pg_hba.conf. > > https://www.postgresql.org/docs/current/ddl-priv.html Thanks Erik. But then, how come aclexplode() is not showing these default/implicit privileges? Is there a SQL function returning those per-type default privileges? That I could then coalesce() datacl with? Or some other means to list actual / effective privileges, even those implicit ones? Thanks, --DD

ddb=> select datname, grantor::regrole::text, grantee::regrole::text, privilege_type, is_grantable
ddb->   from pg_database
ddb->   left join lateral aclexplode(datacl) on true
ddb->  where datacl is null;
 datname | grantor | grantee | privilege_type | is_grantable
---------+---------+---------+----------------+--------------
 qadb    |         |         |                |
(1 row)
Re: NULL pg_database.datacl
On Mon, Mar 20, 2023 at 2:18 PM Erik Wienhold wrote: > > On 20/03/2023 13:50 CET Dominique Devienne wrote: > > > > On Mon, Mar 20, 2023 at 1:18 PM Erik Wienhold wrote: > > > > On 20/03/2023 11:52 CET Dominique Devienne > wrote: > > > > What does a NULL AclItem[] mean exactly? > > > > > > It means that the object has default privileges [...] > > > https://www.postgresql.org/docs/current/ddl-priv.html > > > > Is there a SQL function returning those per-type default privileges? > > Use acldefault. Pass in ownerId=0 to get the privileges for PUBLIC. > > > https://www.postgresql.org/docs/current/functions-info.html#FUNCTIONS-ACLITEM-FN-TABLE Thanks again. After experimenting a little, sounds like acldefault('d', datdba) is what I want, since it seems to give something closer to the AclItem[] actually instantiated on the first GRANT or REVOKE. This has been an education for me, once again. I appreciate the help. Thanks, --DD

ddb=> select datname, coalesce(datacl, acldefault('d', 0)) from pg_database where datacl is null;
 datname |    coalesce
---------+----------------
 qadb    | {=Tc/0,=CTc/0}
(1 row)

ddb=> select datname, coalesce(datacl, acldefault('d', datdba)) from pg_database where datacl is null;
 datname |            coalesce
---------+--------------------------------
 qadb    | {=Tc/qauser,qauser=CTc/qauser}
(1 row)

ddb=# revoke all on database qadb from public;
REVOKE
ddb=# select datname, datacl from pg_database where datname = 'qadb';
 datname |       datacl
---------+---------------------
 qadb    | {qauser=CTc/qauser}
(1 row)

ddb=# grant connect, temporary on database qadb to public;
GRANT
ddb=# select datname, datacl from pg_database where datname = 'qadb';
 datname |             datacl
---------+--------------------------------
 qadb    | {qauser=CTc/qauser,=Tc/qauser}
(1 row)

ddb=# select count(*) from pg_database where datacl is null;
 count
-------
     0
(1 row)
Convert pg_constraint.conkey array to same-order array of column names
We have a query returning 1 row per constraint column, which until recently we didn't realize wasn't preserving order of the columns. A colleague fixed that, with something like below: SELECT ... FROM pg_catalog.pg_constraint cnstr ... CROSS JOIN LATERAL UNNEST(cnstr.conkey) WITH ORDINALITY AS cols(value, rank) ORDER BY cols.rank But I'm wondering about getting 1 row per constraint instead, and fetching an array of column names. So is there a way to "convert" int2[] conkey array into a text[] of those column names?
Re: Convert pg_constraint.conkey array to same-order array of column names
On Thu, Mar 23, 2023 at 4:20 PM Adrian Klaver wrote: > On 3/23/23 04:12, Dominique Devienne wrote: > > CROSS JOIN LATERAL UNNEST(cnstr.conkey) WITH ORDINALITY AS cols(value, > rank) > > ORDER BY cols.rank > A before coffee solution: > Thanks for answering Adrian. And sorry for the delay in responding. > WITH ck AS ( > SELECT > conrelid, > unnest(conkey) AS ky > FROM > pg_constraint > WHERE > conrelid = 'cell_per'::regclass > ) > This part surprised me. I didn't know a table-valued function could be used like this in the select-clause. Both queries below yield the same rows for me, in the same order: => select conname, unnest(conkey), conrelid::regclass::text from pg_constraint where conrelid::regclass::text like ... and cardinality(conkey) = 8; => select conname, key.value, conrelid::regclass::text from pg_constraint cross join lateral unnest(conkey) as key(value) where conrelid::regclass::text like ... and cardinality(conkey) = 8; So your compact form is equivalent to the second form? What about the order? Is it guaranteed? I was "raised" on the "order is unspecified w/o an order-by-clause". Why would it be different here? In our case, the query is more complex, with joins on pg_namespace, pg_class, and pg_attribute, on all constraints of a schema, and the order came out wrong w/o adding WITH ORDINALITY and ordering on it. Thus I worry the order is plan-dependent, and not guaranteed. Am I wrong to worry? The form you provide seems no different from our old form, to my non-expert eye. --DD
Re: Convert pg_constraint.conkey array to same-order array of column names
On Fri, Mar 24, 2023 at 11:51 AM Alvaro Herrera wrote: > On 2023-Mar-23, Dominique Devienne wrote: > > But I'm wondering about getting 1 row per constraint instead, > > and fetching an array of column names. > > IIRC you can use array_agg(unnest ORDER BY ordinality), grouping by > constraint OID or name. > Thanks. This lead me to a better search, yielding two informative results: https://stackoverflow.com/questions/60877627/perform-unnest-inner-join-and-then-array-agg-as-part-of-an-update-query http://amyszczepanski.com/2019/03/25/unnest-and-array-agg-respecting-order.html
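Based on that hint, a hedged sketch of the conversion (assuming a table named some_table; a starting point to adapt, not production code):

```
SELECT c.conname,
       array_agg(a.attname ORDER BY k.ord) AS key_columns
  FROM pg_constraint c
  CROSS JOIN LATERAL unnest(c.conkey) WITH ORDINALITY AS k(attnum, ord)
  JOIN pg_attribute a
    ON a.attrelid = c.conrelid AND a.attnum = k.attnum
 WHERE c.conrelid = 'some_table'::regclass
 GROUP BY c.oid, c.conname;
```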
Re: PostgreSQL vs MariaDB
On Fri, Mar 24, 2023 at 3:50 PM Jeffrey Walton wrote: > On Fri, Mar 24, 2023 at 7:07 AM Inzamam Shafiq > > Can someone please list pros and cons of MariaDB vs PostgreSQL [...] > > > > [...] MySQL has over [1700 CVEs going back to 1999.][1] > In contrast, PostgreSQL has about [240 CVEs going back to 1999.][2] > > [1]: https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=MySQL > [2]: https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=PostgreSQL OTOH, I checked a few, and they are not about PostgreSQL per-se, but projects using PostgreSQL or extensions to PostgreSQL. Thus I assume the same happens for MySQL. Of course, I didn't check that many, just a small sample. Also, there seems to be a low signal-to-noise ratio on CVEs. See what [Richard Hipp of SQLite fame says about CVEs][3]. This (low quality of CVEs) comes up often on the SQLite Forum. FWIW. Still, your input is insightful, qualitatively. Thanks, --DD [3]: https://sqlite.org/cves.html
Re: Convert pg_constraint.conkey array to same-order array of column names
On Fri, Mar 24, 2023 at 5:40 PM Adrian Klaver wrote: > On 3/24/23 03:28, Dominique Devienne wrote: > > On Thu, Mar 23, 2023 at 4:20 PM Adrian Klaver > <mailto:adrian.kla...@aklaver.com>> wrote: > Your original question was: > > "But I'm wondering about getting 1 row per constraint instead, > and fetching an array of column names. > > So is there a way to "convert" int2[] conkey array into a text[] of those > column names?" > > That is what I showed as a simple example. > Sorry, didn't mean to offend you. My post started with "preserving order of the columns", but I probably didn't make it clear enough preserving the source array's order was also a requirement.
Plans for ON DELETE CASCADE? Which index is used, if at all?
Hi. Given the classical parent/child schema below, with an ON DELETE CASCADE FK constraint, I'd like to know which index is used (if any) to delete child rows affected by the CASCADE. But explain (analyze) does not give me that. The reason I'm asking, is because we currently (automatically) add indexes on FK columns, to avoid full-scan of child tables when a parent row is deleted (i.e. un-indexed FKs), but I realized that index we create explicitly is also a *prefix* of the natural-key constraint, thus I suspect redundant with the index supporting that NK constraint. But before getting rid of that index, which I suspect is redundant, I wanted to verify. And I happen to discover I don't know how to verify it. My PostgreSQL Fu is not great... So, could someone please: 1) teach me how to know which index is used when cascading FKs in child tables? 2) confirm my suspicion that the child_parent_idx index below is redundant, given the child_parent_name_key one? Thanks, --DD
```
dd=> create table parent (id int generated always as identity primary key, name text not null unique);
CREATE TABLE
dd=> create table child (id int generated always as identity primary key, parent int not null references parent(id) on delete cascade, name text not null, unique(parent, name));
CREATE TABLE
dd=> create index child_parent_idx on child(parent);
CREATE INDEX
dd=> insert into parent(name) values ('foo');
INSERT 0 1
dd=> insert into child(parent, name) values (1, 'c1');
INSERT 0 1
dd=> insert into child(parent, name) values (1, 'c2');
INSERT 0 1
dd=> insert into parent(name) values ('bar');
INSERT 0 1
dd=> insert into child(parent, name) values (2, 'c3');
INSERT 0 1
dd=> explain (analyze) delete from parent where id = 2;
                                                       QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Delete on parent  (cost=0.15..8.17 rows=1 width=6) (actual time=0.052..0.053 rows=0 loops=1)
   ->  Index Scan using parent_pkey on parent  (cost=0.15..8.17 rows=1 width=6) (actual time=0.021..0.024 rows=1 loops=1)
         Index Cond: (id = 2)
 Planning Time: 0.090 ms
 Trigger for constraint child_parent_fkey: time=0.242 calls=1
 Execution Time: 0.329 ms
(6 rows)

dd=> \d+ child
                                                  Table "public.child"
 Column |  Type   | Collation | Nullable |            Default            | Storage  | Stats target | Description
--------+---------+-----------+----------+-------------------------------+----------+--------------+-------------
 id     | integer |           | not null | generated always as identity  | plain    |              |
 parent | integer |           | not null |                               | plain    |              |
 name   | text    |           | not null |                               | extended |              |
Indexes:
    "child_pkey" PRIMARY KEY, btree (id)
    "child_parent_name_key" UNIQUE CONSTRAINT, btree (parent, name)
    "child_parent_idx" btree (parent)
Foreign-key constraints:
    "child_parent_fkey" FOREIGN KEY (parent) REFERENCES parent(id) ON DELETE CASCADE
Access method: heap
```
Cluster table based on grand parent?
Hi again, I just sent a question regarding parent/child and cascading FKs. But in reality, our schema has not 2 but 3 "layers", with an additional grandchild "leaf" table (see below). Given that many access patterns are parent-based, i.e. get all children of a given parent, or all grandchildren of a given child, I can use [CLUSTER][1] leveraging the natural-key (parent, name) constraint/index. But for grandchild rows, doesn't that mean the rows for a given (grand)parent won't be fully clustered? Yes, our software often accesses rows in child and grandchild for a given parent row. So can grandchild table(s) be "fully" clustered per-(grand)parent? Would that require denormalizing, and adding an extra grandparent column FK in grandchild, to achieve that? And if that's the case, then there are two "paths" to CASCADE a delete from parent; Would that be a problem? (w.r.t. performance or otherwise?) Finally, does cluster affect associated toast tables too? (the doc doesn't say) Thanks for any insights. --DD PS: At this point, I don't even know how much cluster affects performance. But because it can affect the schema structure (by denormalizing), I'd rather know early. [1]: https://www.postgresql.org/docs/current/sql-cluster.html
```
dd=> create table parent (id int generated always as identity primary key, name text not null unique);
CREATE TABLE
dd=> create table child (id int generated always as identity primary key, parent int not null references parent(id) on delete cascade, name text not null, unique(parent, name));
CREATE TABLE
dd=> create table grandchild (id int generated always as identity primary key, parent int not null references child(id) on delete cascade, name text not null, unique(parent, name));
CREATE TABLE
```
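For concreteness, a hedged sketch of the denormalized variant I'm asking about (shown under a different table name to avoid clashing with the definition above):

```
create table grandchild_denorm (
    id          int generated always as identity primary key,
    grandparent int not null references parent(id) on delete cascade,
    parent      int not null references child(id)  on delete cascade,
    name        text not null,
    constraint grandchild_denorm_nk unique (grandparent, parent, name)
);

-- then clustering per (grand)parent would presumably be:
cluster grandchild_denorm using grandchild_denorm_nk;
```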
Re: Plans for ON DELETE CASCADE? Which index is used, if at all?
On Tue, Mar 28, 2023 at 3:23 PM Tom Lane wrote:
> Dominique Devienne writes:
> > Hi. Given the classical parent/child schema below, with an ON DELETE
> > CASCADE FK constraint, I'd like to know which index is used (if any) to
> > delete child rows affected by the CASCADE.
>
> > But explain (analyze) does not give me that.
>
> Yeah, it will just report the time spent in the FK triggers,
> not what they were doing exactly.
>
> IIRC, you can see the CASCADE operations with contrib/auto_explain,
> if you enable auto_explain.log_nested_statements.

Thanks. Looks like this won't be easily available to me :(. --DD

dd=> select * from pg_available_extensions where name like '%auto%';
   name   | default_version | installed_version |                comment
----------+-----------------+-------------------+----------------------------------------
 autoinc  | 1.0             |                   | functions for autoincrementing fields
(1 row)

dd=> select * from pg_available_extensions where name like '%explain%';
 name | default_version | installed_version | comment
------+-----------------+-------------------+---------
(0 rows)
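For the record, auto_explain is a loadable module rather than an extension, so it never shows up in pg_available_extensions; it only needs the contrib files installed on the server, plus either superuser rights or a server-config change. A sketch of how it would be enabled, assuming that access is available:

```
-- Requires superuser (or session_preload_libraries/shared_preload_libraries
-- set by an admin); the contrib package must be installed on the server.
LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;        -- log plans for every statement
SET auto_explain.log_nested_statements = on;  -- include the FK-trigger statements
SET auto_explain.log_analyze = on;            -- EXPLAIN ANALYZE style output

-- The server log now shows the plan of the cascaded DELETE on child,
-- including which index (if any) it used.
DELETE FROM parent WHERE id = 2;
```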
Re: Cluster table based on grand parent?
On Tue, Mar 28, 2023 at 5:08 PM Peter J. Holzer wrote:
> On 2023-03-28 07:26:47 -0700, Adrian Klaver wrote:
> > On 3/28/23 06:17, Dominique Devienne wrote:
> > > PS: At this point, I don't even know how much cluster affects performance.
>
> I think that this depends a lot on your access patterns

As I wrote, per-parent access to child and grandchild rows is typical. So w/o parent-based clustering of grandchild table(s), accessing those rows could potentially seek to several (~50, see below) smaller clusters with arbitrary gaps. Cardinality is a few to ~20K on parent, x10-x50 on child, x20-x100 on grandchild. So the total row count rarely exceeds the 1M - 10M range. But there are LOBs/BYTEAs...

> (especially on how much you update the grandchild table and whether those
> updates can be HOT), so you will probably have to measure it yourself with
> a realistic work load.

In this particular case, there aren't many UPDATEs, because of a deficiency of the client applications, which mostly do DELETE+INSERT instead of UPDATEs. Although we do have to cascade modified dates up the parent hierarchy, so some UPDATEs occur, but mostly on the less numerous child and parent tables.

> (Personally I doubt the impact is large, but I don't know your data or
> your access patterns.)

OK.

> > > But because it can affect the schema structure (by denormalizing),
> > > I'd rather know early.
> >
> > You will need to explain to me how it denormalizes? It reorders rows by
> > index definition and does not maintain that order over updates and inserts.
>
> I think he means that in order to cluster the grandchild table by the parent.id
> he would have to denormalize the table.

Exactly. Thanks for your input.
Re: Cluster table based on grand parent?
On Tue, Mar 28, 2023 at 6:06 PM Ron wrote:
> You can only get from parent to grandchild via child.id to
> grandchild.parent, so why not cluster grandchild on grandchild.parent?

Hi. I don't understand your question. Yes, of course, if I want all grand-children of a given parent, I'll do for example:

select p.id, c.id, c.name, gc.*
from grandchild gc
join child c on gc.parent = c.id
join parent p on c.parent = p.id
where p.name = $1

But w/o clustering on a denormalized grandchild.grandparent FK column, as Peter showed, and clustering only on grandchild.parent, that's not going to access a mostly contiguous range of pages to fetch all the grandchild rows for that one parent, but probably 10 to 50 "row clusters", given the fan-out I mentioned earlier at the child-table level. Or am I missing something?
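In case it's useful to sketch what the denormalization would look like: one variant keeps the extra grandparent column consistent via a composite FK through child, so there is still a single cascade path (parent -> child -> grandchild) instead of two independent ones. Table and index names below are illustrative, not from the real schema:

```
-- Needed so the composite FK below has a unique target to reference.
alter table child add constraint child_parent_id_key unique (parent, id);

create table grandchild (
  id          int  generated always as identity primary key,
  grandparent int  not null,  -- denormalized copy of child.parent
  parent      int  not null,
  name        text not null,
  unique (parent, name),
  foreign key (grandparent, parent)
    references child (parent, id) on delete cascade
);

-- Non-unique btree ordered by top-level parent, usable by CLUSTER.
create index grandchild_grandparent_idx on grandchild (grandparent, parent, name);
cluster grandchild using grandchild_grandparent_idx;
```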
libpq: empty arrays have rank 0 in binary results? whatever the type's rank?
Hi. Selecting/fetching an empty 1D array using a binary-mode PGresult gives me back 12 bytes, and the first 4, the rank, is 0, something I was not expecting. I was expecting dims_rank = 1, then first_dim = 0. Normal? The next two ints are kinda useless given the 0 rank? It's easy to fix, to support rank=0 for empty arrays, I just want to confirm. Thanks, --DD

PS: On a side note, where can I find the sending code for arrays? I didn't find the usual _send() and _recv() functions (but looking on github online...)

PPS: The relevant part of my code is below, for context:

uint32_t dims_rank = read();
if (dims_rank != 1) {
    throw std::runtime_error(fmt::format(
        "Unsupported array rank: Got {}; Want 1", dims_rank
    ));
}
uint32_t has_nulls = read();
uint32_t actual_elem_oid = read();
uint32_t first_dim = read();
uint32_t lower_idx = read();
ensure_oid(Oid{ actual_elem_oid }, expected_elem_oid);
UNUSED(has_nulls);
assert(lower_idx == 1);
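For what it's worth, the "no dimensions at all" representation of empty arrays is visible from SQL too; a quick check (assuming any empty array literal):

```
-- Empty arrays carry no dimension info, so the dimension functions
-- return NULL rather than 0 or '[1:0]'.
select array_ndims('{}'::int[]),  -- NULL
       array_dims('{}'::int[]),   -- NULL
       cardinality('{}'::int[]);  -- 0
```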
Re: libpq: empty arrays have rank 0 in binary results? whatever the type's rank?
On Wed, Mar 29, 2023 at 3:45 PM Tom Lane wrote:
> Dominique Devienne writes:
> > Hi. Selecting/fetching an empty 1D array using a binary-mode PGresult,
> > gives me back 12 bytes, and the first 4, the rank, is 0, something I was
> > not expecting.
>
> Yeah, empty arrays have zero dimensions.
>
> > PS: On a side note; where can I find the sending code for arrays?
> > I didn't find the usual _send() and _recv() functions (but looking on
> > github online...)
>
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/utils/adt/arrayfuncs.c;h=87c987fb2704761c59333bf8c1fee47e5c14c598;hb=HEAD#l1583

Great. Thanks Tom. --DD
Re: Using CTID system column as a "temporary" primary key
On Wed, Mar 29, 2023 at 9:23 PM Adrian Klaver wrote:
> On 3/29/23 12:11, Sebastien Flaesch wrote:
> > Oh the use of default keyword is new to me, thanks for that.
> >
> > But to make PostgreSQL more Informix-compatible,
> > zero should have been considered as well. Perhaps.
>
> 1) Why? Down the road to compatibility with some undetermined group of
> databases lies mayhem.

Sure. Unless it's opt-in, see below.

> 2) 0 can be a valid sequence value:

Of course. Yet, as above, if that is opt-in as specified in the `create table` DDL somehow, then why not?

BTW, DEFAULT and 0 are not the same thing. You cannot bind "default" in place of an integer-valued prepared-statement placeholder in a binary-mode insert. So it is definitely not the same thing.

So while I can accept that not implementing that particular Informix-compatibility wart is a perfectly valid position, given the implementation and maintenance cost, the arguments I've read so far can be "easily" side-stepped from a technical perspective, I suspect. FWIW.
Re: Using CTID system column as a "temporary" primary key
On Thu, Mar 30, 2023 at 11:42 AM Francisco Olarte wrote:
> On Thu, 30 Mar 2023 at 10:01, Dominique Devienne wrote:
> > BTW, default and 0 are not the same thing. You cannot bind "default" in place of
> > an integer-valued prepared-statement placeholder, in a binary mode insert. So it is
> > definitely not the same thing.
>
> IMNSHO if you need to select between default and explicit in an insert
> via binding you have a design problem, and down this path lies madness.

I fail to see that myself, sorry. You can bind NULL, you can bind values, so why wouldn't you be able to bind DEFAULT too? I see that more as a failing of the binding API myself :)

But I guess it can be worked around with something like `... values(coalesce($1, default), ...)`, abusing NULL to mean DEFAULT on a case-by-case basis. Assuming DEFAULT can be used in this way (didn't try), of course.
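A note for readers tempted by the coalesce work-around: DEFAULT is only accepted as a bare item in an INSERT ... VALUES list, not inside an expression, so `coalesce($1, default)` does not parse. In practice that means one statement text per case, e.g. (table and statement names are illustrative):

```
create table t (id int generated by default as identity, name text);

-- DEFAULT must appear literally in the SQL text; it cannot be bound.
prepare ins_default (text)       as insert into t (id, name) values (default, $1);
prepare ins_explicit (int, text) as insert into t (id, name) values ($1, $2);

execute ins_default ('server-generated id');
execute ins_explicit (42, 'client-supplied id');
```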
libpq: COPY FROM STDIN BINARY of arrays
Hi, I already use arrays fine (e.g. text[]) in BINARY-mode binds (with prepared statements or not). I already use COPY FROM STDIN BINARY fine, with scalar values. But for the 1st time, I'm combining COPY with arrays, and it is NOT fine yet :(

#08P01: ERROR: insufficient data left in message
CONTEXT: COPY UT_LibPQ_copy_arrays, line 1, column ptys

Does the wire format of arrays differ between binary binds and binary COPY? Any pointers to what I could look at, to resolve this? Doc? Code? Example? Thanks, --DD

PS: If that helps, I'm willing to detail the break-down of all values / bytes I'm sending
Re: libpq: COPY FROM STDIN BINARY of arrays
On Thu, Mar 30, 2023 at 6:00 PM Tom Lane wrote:
> Dominique Devienne writes:
> > Does the wire-format of arrays differ between binary-binds and binary-copy?
>
> No.

An update on this: I think I figured it out, by comparing with COPY TO STDOUT WITH (FORMAT BINARY). I was missing the byte-count "header" for the array. Since I'm reusing my binary-bind code for COPY, I was missing the 4 bytes for the array's "total encoded byte length" before the array data proper. Now that I understand the issue, it should be a simple matter to fix. Thanks for chiming in Tom. --DD
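For anyone hitting the same mismatch, an easy way to obtain reference bytes to compare against is to let the server encode a known value and hex-dump the result. A sketch (the query and file name are illustrative; \copy is psql's client-side variant):

```
-- Each field in a binary COPY row is preceded by its 4-byte length,
-- and for an array that length covers the whole array payload.
\copy (select array['a','b','c']::text[] as ptys) to 'ptys.bin' with (format binary)
```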
DEFINER / INVOKER conundrum
My goal is to have clients connect to PostgreSQL and call a function that returns a JWT. The JWT is supposed to capture the user (login role) and the current_role (which has meaning in our app), and to be signed using a secret corresponding to a mid-tier service the client will connect to later. I've found https://github.com/michelp/pgjwt which seems perfect for my use case, but I'm struggling with something.

On the one hand, I want an INVOKER-security function, to be able to capture the login and current ROLEs. On the other hand, I want a DEFINER-security function, to be able to access the secret to sign the JWT with. That secret will be in a table that regular users of our DB do NOT have access to, of course. But that the function doing the JWT signing does need access to, of course (again).

I thought I'd have two layers of functions: an INVOKER one that captures the ROLEs, which then calls the DEFINER one, passing the ROLEs captured. But since the INVOKER function must also be able to call the DEFINER function, what prevents the client from calling it directly, with different (spoofed) ROLEs? Is there a way out of that conundrum?

I also thought about pg_stat_activity.usename with pg_backend_pid(), but there's no current_role in there, and I'm also not sure how that would play with connection pooling!?

So is there a way to somehow mix INVOKER and DEFINER to achieve the stated goal? Or more broadly, to mix information from the session and "private" information (the JWT secret, part of the "app") in a server-side SQL function/procedure? Thanks, --DD
Re: DEFINER / INVOKER conundrum
First, let me say I was holding off replying/thanking everyone, to have the time to properly test this. Erik's quasi-question makes me break that silence.

On Tue, Apr 4, 2023 at 10:46 AM Erik Wienhold wrote:
> > On 04/04/2023 07:55 CEST walt...@technowledgy.de wrote:
> > For me, checking whether session_user has the privilege for claimed_role
> > is not enough, so I add a DOMAIN to the mix:
> > CREATE DOMAIN current_user_only AS NAME CHECK (VALUE = CURRENT_USER);
> > CREATE FUNCTION f(calling_user current_user_only DEFAULT CURRENT_USER)
>
> Nice. It's equivalent to my version without the domain if the client can
> execute SET ROLE before calling f, thereby injecting any role for which
> pg_has_role(session_user, calling_user, 'MEMBER') returns true.

First, thanks to Christoph, Erik, and Walther for helping out. Second, very nice indeed. I'll test it out soon.

> Dominique did not say whether he controls the clients or not.

I would say I don't. The decision was made to go with a 2-tier architecture, so the desktop apps connect to PostgreSQL using a typically per-OS-user PostgreSQL LOGIN user. So even though it's our app's code that accesses the DB, and we control that part, nothing prevents those users from connecting directly via psql or pgAdmin or any libpq or protocol-talking client, and trying to do mischief or worse. Thus the server-side permission model must be as tight as it can be.

Which means "regular" users don't have DDL privs, of course; thus the need for a mid-tier "more privileged" service to "mediate" the DDLs we sometimes need to do on behalf of "regular" users; thus the need to properly authenticate those users with the mid-tier services, based on the fact they can connect to the DB and its schema(s) and access/execute that JWT-returning function. Hopefully that's clear :)
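To make the quoted pattern concrete, here is a rough sketch of how the domain and a SECURITY DEFINER function could fit together. Table and function names are illustrative, and sign() is assumed to come from the pgjwt extension; the key point is that the default argument and the domain check are evaluated in the caller's context, before the definer switch, while session_user is unaffected by SECURITY DEFINER:

```
create domain current_user_only as name check (value = current_user);

-- Only the function owner (definer) can read this table.
create table jwt_secret (secret text not null);
revoke all on jwt_secret from public;

create function issue_jwt(calling_user current_user_only default current_user)
returns text
language sql
security definer
set search_path = public, pg_temp  -- avoid search_path hijacking in a DEFINER function
as $$
  select sign(
           json_build_object('login', session_user, 'role', calling_user),
           (select secret from jwt_secret))
$$;
```

A caller passing a spoofed role fails the domain check, since the value must equal their own current_user at call time (modulo SET ROLE to a role they genuinely belong to, as Erik points out).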
COPY RETURNING?
Hi. We are switching a schema type's surrogate/primary key from `uuid` to `int`. That schema has parent-child relationships enforced with foreign keys. Our ingestion/schema-loading code uses COPY FROM STDIN BINARY. Before, the SK/PK was generated client-side, as a random uuid. The ingestion code maintained client-side maps from NKs to uuids, to be used for FK columns in child tables COPY'd later. But now that the SK/PK is an integer identity column generated server-side, thanks to an implicitly created sequence, we need that generated per-row `int` PK.

With a normal prepared statement, we'd use a RETURNING clause, to avoid a separate round-trip to the server. Can the same somehow be achieved with COPY? I'm afraid the answer is no, but I want to ask anyway; maybe there's a way or a work-around?

At the end of the COPY, we do get a ResultSet, so API-wise it would be possible for it to contain some rows, I guess, but I'm not sure the COPY protocol supports returning rows, nor what the syntax would be for a COPY RETURNING form of COPY. Thanks for any insights. --DD
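In case it helps, one common work-around (a sketch only, reusing the parent table from the earlier threads) is to COPY into a temporary staging table and then INSERT ... SELECT with RETURNING, which does hand back the server-generated keys in a single extra statement:

```
-- Stream the natural keys into a staging table first.
create temp table staging_parent (name text not null);
copy staging_parent (name) from stdin with (format binary);
-- ... client streams the binary rows here, then ends the COPY ...

-- Move them into the real table and collect the generated ids,
-- to rebuild the client-side NK -> PK map.
insert into parent (name)
select name from staging_parent
returning id, name;
```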