Re: Slow planning time for simple query
> "Amit" == Amit Kapila writes: >> Presumably the problem is that the standby isn't authorized to change >> the btree index's "entry is dead" bits, Amit> I don't see anything like that in the code. We use _bt_killitems Amit> to mark the items as dead and neither that function or any of its Amit> caller has any such assumption. See index_fetch_heap: /* * If we scanned a whole HOT chain and found only dead tuples, tell index * AM to kill its entry for that TID (this will take effect in the next * amgettuple call, in index_getnext_tid). We do not do this when in * recovery because it may violate MVCC to do so. See comments in * RelationGetIndexScan(). */ if (!scan->xactStartedInRecovery) scan->kill_prior_tuple = all_dead; (this is the only place where kill_prior_tuple can be set to true) -- Andrew (irc:RhodiumToad)
Re: Text-indexing UTF-8 bytea, convert_from() immutability, null bytes...
> "Phil" == Phil Endecott writes: Phil> Next I tried Phil> to_tsvector('english',convert_from(body::text,'UTF-8')). That Phil> doesn't work because convert_from is not immutable. (This is 9.6; Phil> maybe that has changed.) Is there a good reason for that? I would guess because conversions are controlled by the pg_conversion table which can be modified by create/drop conversion. Phil> Maybe because I might change the client encoding? No, because convert_from converts from the specified encoding to the server_encoding, not the client_encoding, and the server_encoding can't be changed except at db creation time. Phil> As a hack I tried ALTER FUNCTION to make it immutable, A better approach is to wrap it in a function of your own which is declared immutable, rather than hacking the catalogs: create function from_utf8(bytea) returns text language plpgsql immutable as $$ begin return convert_from($1, 'UTF8'); end; $$; Phil> and now I get: Phil> ERROR: invalid byte sequence for encoding "UTF8": 0x00 Phil> Hmm, as far as I'm aware 0x00 is fine in UTF-8; what's that mean? PG doesn't allow 0x00 in text values regardless of encoding. Phil> But actually I'd be more than happy to ignore invalid UTF-8 here, Phil> since I'm only using it for text search; there may be some truly Phil> invalid UTF-8 in the data. Is there a "permissive" mode for Phil> charset conversion? Unfortunately not. Phil> (That error also suggests that the convert_from is not optimising Phil> the conversion from UTF-8 to UTF-8 to a no-op.) Indeed not, because it must validate that the data really is UTF-8 before treating it as such. Phil> Anyway: given the problem of creating a text search index over Phil> bytea data that contains UTF-8 text, which may include oddities Phil> like null bytes, what would you do? You can search for 0x00 in a bytea using position() or LIKE. What do you want to do with values that contain null bytes? or values which you think are supposed to be valid utf8 text but are not? -- Andrew (irc:RhodiumToad)
Re: SELECT UNION into a text[]
> "David" == David G Johnston writes: >> SELECT ARRAY_AGG(hashed) >> FROM words_nouns >> WHERE added > TO_TIMESTAMP(1539100913) >> UNION >> SELECT ARRAY_AGG(hashed) >> FROM words_verbs >> WHERE added > TO_TIMESTAMP(1539100913); David> SELECT array_agg(words) FROM ( David> SELECT hashed FROM words_nouns David> UNION ALL David> SELECT hashed FROM words_verbs David> ) AS src Or even better, SELECT ARRAY(SELECT hashed FROM words_nouns UNION SELECT hashed FROM words_verbs); -- Andrew (irc:RhodiumToad)
Re: Measuring the Query Optimizer Effect: Turning off the QO?
> "Tom" == Tom Mercha writes: Tom> Hi All Tom> As we know, a query goes through number of stages before it is Tom> executed. One of these stages is query optimization (QO). That's not really true at all. One of the stages is query _planning_, which takes the (rewritten) query as input and converts it to something that the executor can take action on. There isn't actually any separate "optimization" phase. -- Andrew (irc:RhodiumToad)
Re: Measuring the Query Optimizer Effect: Turning off the QO?
> "Tom" == Tom Lane writes: Tom> Two I'd particularly draw your attention to are Tom> join_collapse_limit and from_collapse_limit --- if you set both to Tom> 1, that'll effectively disable searching for a good join order, Tom> causing the join order to match the syntactic structure of the Tom> FROM clause. For instance "FROM a,b,c" will always be done by Tom> joining a to b first FROM a,b,c can always be planned in any join order. If you want to force the join order you have to set join_collapse_limit=1 AND write it as FROM a JOIN b ON ... JOIN c ON ... For an example, try: explain select * from onek o1, tenk1 t, onek o2 where o1.unique1=t.unique1 and t.unique1=o2.unique1 and o1.unique2<10 and o2.unique2<10; which (at least for me) joins o1 and o2 together first even with the collapse limits set to 1. -- Andrew (irc:RhodiumToad)
Re: SCRAM-SHA-256, is it possible to retrieve enough information from PG server (pg_authid etc) to perform authentication as a client
> "Vladimir" == Vladimir Soldatov writes: Vladimir> Hi guys, Vladimir> I am new to PostgreSQL, so sorry for maybe stupid question. I Vladimir> am working on some application implementing Frontend/Backend Vladimir> PG protocol and one of the goals - having only "admin" users Vladimir> credentials (like postgres user) be able to retrieve enough Vladimir> information from PG server (for example, from pg_authid Vladimir> table) to perform authentication for any user created in PG Vladimir> (without any user interaction, so we don't know the user's Vladimir> password). It's an explicit goal of SCRAM to make it impossible to use the server's stored authentication data to actually authenticate from a client (without first breaking the preimage resistance of the hash function). Specifically, the authentication exchange proves to the server that the client knows key_c, but the server only stores H(key_c); the server can validate the client message just by applying the hash function, but the correct value of key_c can't be determined in advance on the server without a successful preimage attack on H(key_c). The right way to allow a privileged user to operate as if they were someone else is to use SET ROLE or SET SESSION AUTHORIZATION rather than actually trying to log in as the other user. -- Andrew (irc:RhodiumToad)
Re: Use of 'get diagnostics'?
> "Thiemo" == Thiemo Kellner writes: Thiemo> Hi all Thiemo> I try to create a function (code at Thiemo> https://pastebin.com/mTs18B90) Paste sites are for IRC, on the mailing list you should always attach the necessary details to your message. Thiemo> using 'get diagnostics' to retrieve the number of affected Thiemo> rows. However, it throws Thiemo> the following exception was thrown: Thiemo> SQLSTATE: 42703 Thiemo> column "row_count" does not exist line 44 of your paste: V_TEXT := V_TEXT || ROW_COUNT || ' row.'; should be V_ROW_COUNT, I suspect. Likewise line 46. (The CONTEXT lines of the error message would have identified the offending line of the function for you.) -- Andrew (irc:RhodiumToad)
Re: How to get timezone offset in timestamp with time zone AT TIME ZONE output.
> "Paul" == Paul McGarry writes: Paul> Hi there, Paul> Does anyone have a good way of doing: Paul> = Paul> select '2020-04-04 15:00:00+00'::timestamp with time zone AT TIME ZONE Paul> 'Australia/Sydney'; Paul> timezone Paul> - Paul> 2020-04-05 02:00:00 Paul> select '2020-04-04 16:00:00+00'::timestamp with time zone AT TIME ZONE Paul> 'Australia/Sydney'; Paul> timezone Paul> - Paul> 2020-04-05 02:00:00 Paul> = Paul> but with the output including the offset, eg: Paul> 2020-04-05 02:00:00+11 Paul> 2020-04-05 02:00:00+10 This is ugly in some respects but minimizes the possible hazards (since using a function-local SET clause ensures that the timezone is reset on exit): create function ts_to_char(t timestamptz, z text) returns text language plpgsql immutable set timezone = 'GMT' as $$ begin perform set_config('timezone', z, true); return t::text; end; $$; select ts_to_char(timestamptz '2020-04-04 16:00:00+00', 'Australia/Sydney'); ts_to_char 2020-04-05 02:00:00+10 -- Andrew (irc:RhodiumToad)
Re: How to get timezone offset in timestamp with time zone AT TIME ZONE output.
> "Adrian" == Adrian Klaver writes: Adrian> This has come up before and the general suggestion has been to Adrian> have a column for a naive(timestamp w/o tz) timestamp and a Adrian> column for the timezone. No, that's usually (not always) backwards, and in any event wouldn't solve this particular issue. -- Andrew (irc:RhodiumToad)
Re: How to get timezone offset in timestamp with time zone AT TIME ZONE output.
> "Paul" == Paul McGarry writes: > On Wed, Sep 25, 2019 at 5:44 AM Adrian Klaver > wrote: >> >> "Therefore whatever renders the offset needs to be capable of doing >> it per row, independently of the server/session time zone." The key point here is that storing the timestamp as WITHOUT rather than WITH time zone doesn't help you solve the problem of outputting the data with a different time zone offset shown for different rows. (Since timestamp without time zone values will be output without an offset, whereas timestamp with time zone values will be output with the offset corresponding to the server's timezone, which shouldn't be changing globally between rows - local changes within a function are OK.) Paul> Andrew's function seems plausible and I need to find some time to Paul> test it, but I am slightly surprised there isn't a native way to Paul> get the output, as it seems like something that would be fairly Paul> useful when dealing with dates. The set of functions for dealing with timestamps has, like most of postgres, largely grown in an unplanned fashion and therefore often has deficiencies. Also, the fact that we don't follow the spec's semantics for WITH TIME ZONE (for good reason, the spec can't handle DST boundaries or historical timezone changes _at all_) complicates choices of functions and operators to provide. Possible functions we could add: strftime('format', t [,timezone]) -- like the C function to_char(t, 'format', timezone) -- 3-arg version of existing to_char You can do a getOffset(timestamptz,timezone) function like this: create function getOffset(t timestamptz, zone text) returns interval language sql immutable as $$ select (t at time zone zone) - (t at time zone 'GMT'); $$; but formatting the interval result as text is a little more challenging due to needing explicit + signs: create function getOffsetStr(t timestamptz, zone text) returns text language sql stable as $$ select regexp_replace(to_char(getOffset(t,zone), 'HH24:MI'), '^(?!-)', '+'); $$; -- Andrew (irc:RhodiumToad)
Re: How to get timezone offset in timestamp with time zone AT TIME ZONE output.
> "rob" == rob stone writes: rob> You can EXTRACT timezone, timezone_hour and timezone_minute from a rob> timestamp. Using 'timezone' returns a value representing seconds rob> from UTC. Yes, but this always outputs a value representing the server timezone; there's no way (other than the same function set_config trick I gave above) to make it return a value that represents a different timezone per row. -- Andrew (irc:RhodiumToad)
Re: Possible bug: SQL function parameter in window frame definition
> "Alastair" == Alastair McKinley writes: Alastair> Hi all, Alastair> I noticed this strange behaviour whilst trying to write a Alastair> function for Postgres 11.5 (PostgreSQL 11.5 on Alastair> x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 Alastair> (Red Hat 4.8.5-36), 64-bit) and reduced it to this minimal Alastair> example. Using a function parameter in the window frame Alastair> definition seems to be the cause of the error. [...] Alastair> This appears to be a bug to me. Yes, it's a bug, related to function inlining (the select f(3); is not inlined and therefore works, but the select * from f(3); is being inlined, but the original Param is somehow making it into the final plan rather than being substituted with its value). Looking into why. -- Andrew (irc:RhodiumToad)
Re: Possible bug: SQL function parameter in window frame definition
> "Tom" == Tom Lane writes: Tom> It looks to me that the reason is that query_tree_mutator Tom> (likewise query_tree_walker) fails to visit query->windowClause, I noticed this too. I spent some time looking at what might break if that was changed (found two places so far, see attached draft patch). Tom> which is a bug of the first magnitude if we allow those to contain Tom> expressions. Not sure how we've missed that up to now. I suspect because the partition/order by expressions are actually in the targetlist instead (with only SortGroupClause nodes in the windowClause), so only window framing expressions are being missed. Tom> Looking at struct Query, it seems like that's not the only Tom> questionable omission. We're also not descending into Tom> Node *utilityStmt;/* non-null if commandType == CMD_UTILITY */ I assume that utility statements are doing any necessary expression processing themselves... Tom> List *groupClause;/* a list of SortGroupClause's */ There's at least one place that walks this (and the distinct and sort clauses) explicitly (find_expr_references_walker) but most places just aren't interested in SortGroupClause nodes given that the actual expressions are elsewhere. Tom> List *groupingSets; /* a list of GroupingSet's if present */ Likewise, GroupingSet nodes are not any form of expression, they only reference the groupClause entries. Tom> List *distinctClause; /* a list of SortGroupClause's */ Tom> List *sortClause; /* a list of SortGroupClause's */ Same goes as for groupClause. Tom> List *rowMarks; /* a list of RowMarkClause's */ Tom> Now probably this is never called on utility statements, and maybe Tom> there is never a reason for anyone to examine or mutate Tom> SortGroupClauses, GroupingSets, or RowMarkClauses, but I'm not Tom> sure it's any business of this module to assume that. I think the logic that query_tree_walker is specifically there to walk places that might contain _expressions_ is reasonably valid. That said, the fact that we do have one caller that finds it necessary to explicitly walk some of the places that query_tree_walker omits suggests that this decision may have been a mistake. 
-- Andrew (irc:RhodiumToad) diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c index dd0a7d8dac..2862c47314 100644 --- a/src/backend/catalog/dependency.c +++ b/src/backend/catalog/dependency.c @@ -2217,7 +2217,6 @@ find_expr_references_walker(Node *node, /* query_tree_walker ignores ORDER BY etc, but we need those opers */ find_expr_references_walker((Node *) query->sortClause, context); find_expr_references_walker((Node *) query->groupClause, context); - find_expr_references_walker((Node *) query->windowClause, context); find_expr_references_walker((Node *) query->distinctClause, context); /* Examine substructure of query */ diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c index 18bd5ac903..7f485ae29a 100644 --- a/src/backend/nodes/nodeFuncs.c +++ b/src/backend/nodes/nodeFuncs.c @@ -2292,6 +2292,8 @@ query_tree_walker(Query *query, return true; if (walker(query->havingQual, context)) return true; + if (walker(query->windowClause, context)) + return true; if (walker(query->limitOffset, context)) return true; if (walker(query->limitCount, context)) @@ -3151,6 +3153,7 @@ query_tree_mutator(Query *query, MUTATE(query->jointree, query->jointree, FromExpr *); MUTATE(query->setOperations, query->setOperations, Node *); MUTATE(query->havingQual, query->havingQual, Node *); + MUTATE(query->windowClause, query->windowClause, List *); MUTATE(query->limitOffset, query->limitOffset, Node *); MUTATE(query->limitCount, query->limitCount, Node *); if (!(flags & QTW_IGNORE_CTE_SUBQUERIES)) diff --git a/src/backend/parser/parse_collate.c b/src/backend/parser/parse_collate.c index 31a5f702a9..dabd904999 100644 --- a/src/backend/parser/parse_collate.c +++ b/src/backend/parser/parse_collate.c @@ -485,6 +485,7 @@ assign_collations_walker(Node *node, assign_collations_context *context) case T_FromExpr: case T_OnConflictExpr: case T_SortGroupClause: + case T_WindowClause: (void) expression_tree_walker(node, assign_collations_walker, (void *) &loccontext);
Re: Possible bug: SQL function parameter in window frame definition
> "Tom" == Tom Lane writes: Tom> However, we need to fix this in all active branches, and I Tom> definitely agree with minimizing the amount of change to back Tom> branches. The fact that the minimal change breaks (or exposes an Tom> oversight in) assign_collations_walker makes it very plausible Tom> that it will also break somebody's third-party code. If we push Tom> the API change further we increase the risk of breaking stuff. Tom> That seems OK in HEAD but not in back branches. We could minimize the chance of breakage in a back-patched fix by having query_tree_walker/mutator iterate the windowClause list itself and invoke the walker only on offset expressions; is it worth it? Walkers that follow the recommended code structure should be unaffected; it only shows up in the collations walker because that treats expressions as the "default" case and tries to explicitly handle all non-expression nodes. -- Andrew (irc:RhodiumToad)
Re: Possible bug: SQL function parameter in window frame definition
>>>>> "Andrew" == Andrew Gierth writes: Andrew> We could minimize the chance of breakage in a back-patched fix Andrew> by having query_tree_walker/mutator iterate the windowClause Andrew> list itself Here is a draft patch along those lines; the intent of this one is that no existing walker or mutator should need to change (the change to the dependency code is basically cosmetic I believe, just avoids walking some things twice). Also added some tests. -- Andrew (irc:RhodiumToad) diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c index dd0a7d8dac..03582781f6 100644 --- a/src/backend/catalog/dependency.c +++ b/src/backend/catalog/dependency.c @@ -2214,18 +2214,13 @@ find_expr_references_walker(Node *node, context->addrs); } - /* query_tree_walker ignores ORDER BY etc, but we need those opers */ - find_expr_references_walker((Node *) query->sortClause, context); - find_expr_references_walker((Node *) query->groupClause, context); - find_expr_references_walker((Node *) query->windowClause, context); - find_expr_references_walker((Node *) query->distinctClause, context); - /* Examine substructure of query */ context->rtables = lcons(query->rtable, context->rtables); result = query_tree_walker(query, find_expr_references_walker, (void *) context, - QTW_IGNORE_JOINALIASES); + QTW_IGNORE_JOINALIASES | + QTW_EXAMINE_SORTGROUP); context->rtables = list_delete_first(context->rtables); return result; } diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c index 18bd5ac903..d063bee271 100644 --- a/src/backend/nodes/nodeFuncs.c +++ b/src/backend/nodes/nodeFuncs.c @@ -2296,6 +2296,33 @@ query_tree_walker(Query *query, return true; if (walker(query->limitCount, context)) return true; + if ((flags & QTW_EXAMINE_SORTGROUP)) + { + if (walker((Node *) query->groupClause, context)) + return true; + if (walker((Node *) query->windowClause, context)) + return true; + if (walker((Node *) query->sortClause, context)) + return true; + if (walker((Node *) query->distinctClause, context)) + return true; + } + else + { + /* + * We need to walk the expressions in WindowClause nodes even if we're + * not interested in SortGroupClause nodes. + */ + ListCell *lc; + foreach(lc, query->windowClause) + { + WindowClause *wc = lfirst_node(WindowClause, lc); + if (walker(wc->startOffset, context)) +return true; + if (walker(wc->endOffset, context)) +return true; + } + } if (!(flags & QTW_IGNORE_CTE_SUBQUERIES)) { if (walker((Node *) query->cteList, context)) @@ -3153,6 +3180,38 @@ query_tree_mutator(Query *query, MUTATE(query->havingQual, query->havingQual, Node *); MUTATE(query->limitOffset, query->limitOffset, Node *); MUTATE(query->limitCount, query->limitCount, Node *); + + if ((flags & QTW_EXAMINE_SORTGROUP)) + { + MUTATE(query->groupClause, query->groupClause, List *); + MUTATE(query->windowClause, query->windowClause, List *); + MUTATE(query->sortClause, query->sortClause, List *); + MUTATE(query->distinctClause, query->distinctClause, List *); + } + else + { + /* + * We need to mutate the expressions in WindowClause nodes even if + * we're not interested in SortGroupClause nodes. 
+ */ + List *resultlist; + ListCell *temp; + + resultlist = NIL; + foreach(temp, query->windowClause) + { + WindowClause *wc = lfirst_node(WindowClause, temp); + WindowClause *newnode; + + FLATCOPY(newnode, wc, WindowClause); + MUTATE(newnode->startOffset, wc->startOffset, Node *); + MUTATE(newnode->endOffset, wc->endOffset, Node *); + + resultlist = lappend(resultlist, (Node *) newnode); + } + query->windowClause = resultlist; + } + if (!(flags & QTW_IGNORE_CTE_SUBQUERIES)) MUTATE(query->cteList, query->cteList, List *); else /* else copy CTE list as-is */ diff --git a/src/include/nodes/nodeFuncs.h b/src/include/nodes/nodeFuncs.h index 0cb931c82c..4b5408fa9b 100644 --- a/src/include/nodes/nodeFuncs.h +++ b/src/include/nodes/nodeFuncs.h @@ -27,6 +27,7 @@ #define QTW_EXAMINE_RTES_AFTER 0x20 /* examine RTE nodes after their * contents */ #define QTW_DONT_COPY_QUERY 0x40 /* do not copy top Query */ +#define QTW_EXAMINE_SORTGROUP 0x80 /* include SortGroupNode lists */ /* callback function for check_functions_in_node */ typedef bool (*check_function_callback) (Oid func_id, void *context); diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out index edc93d5729..d5fd4045f9 100644 --- a/src/test/regress/expected/window.out +++ b/src/test/regress/expected/window.out @@ -3821,3 +3821,45 @@ SELECT i, b, bool_and(b) OVER w, bool_or(b) OVER w 5 | t | t| t (5 rows)
Re: Possible bug: SQL function parameter in window frame definition
> "Tom" == Tom Lane writes: >> Here is a draft patch along those lines; the intent of this one is >> that no existing walker or mutator should need to change (the change >> to the dependency code is basically cosmetic I believe, just avoids >> walking some things twice). Tom> Hmm. I think this is a reasonable direction to go in, but Tom> what about groupingSets and rowMarks? groupingSets ultimately contains nothing but numbers which are meaningless without reference to the matching groupClause list. So anything that cares about those is really going to have to process them in its Query case in the walker function in order to get at both clauses. Similarly, rowMarks contains indexes into the rangetable (and no recursive substructure at all), so it's likewise better processed at the Query level. Tom> Also, in HEAD I'd be inclined to add assertions about utilityStmt Tom> being NULL. Yup. -- Andrew (irc:RhodiumToad)
Re: Arrays and ANY problem
> "David" == David Salisbury writes: David> I didn't specify the real problem as it's all wrapped up in David> layers and I didn't want to post a "can someone write the query David> for me". The real problem was I have a table with a string David> holding comma separated numbers, and needed to go to a lookup David> table and replace each of those numbers with it's correlated David> value. So '12,2,10' gets converted to 'twelve,two,ten'. David> Tom's "I'd suggest that making his sub-select return a rowset David> result rather than an array" was spot on and lead me to David> "unnest". For my posted problem this was the simple solution. David> Sorry to narrow things down to my specific array method. David> select name from table_name_ds_tmp where categoryid = ANY ( select David> unnest(string_to_array( '200,400', ',')::bigint[]) ); This won't necessarily preserve the order of elements (it might sometimes look like it does, but it's fooling you). It also won't handle duplicate numbers. The right solution that does preserve order would be: select name from unnest(string_to_array( '200,400', ',')::bigint[]) with ordinality as u(id,ord) join table_name_ds_tmp t on (t.category_id=u.id) order by u.ord; (wrap an ARRAY( ) around that if you need the result as a single array rather than as rows, or use string_agg(name, ',' order by ord) if you want a comma-separated string result) regexp_split_to_table might be a better method than unnest/string_to_array. -- Andrew (irc:RhodiumToad)
Re: JSON vs. JSONB storage size
> "Thomas" == Thomas Kellerer writes: Thomas> The table size with jsonb was bigger in general, but the one Thomas> with the "integer" value was even bigger than the one with the Thomas> "string" storage. jsonb stores numeric values as "numeric", not as integers or floats, so the storage needed will depend on the number of decimal digits. The size results you're seeing are mainly the consequence of the fact that jsonb stores the whole Numeric datum, varlena header included (and without packing the header), so there's an extra 4 bytes you might not have accounted for: 1234567890 is three numeric "digits" (2 bytes each) plus a 2 byte numeric header (for weight/scale/sign) plus the 4 byte varlena header, for 12 bytes total, whereas "1234567890" takes only 10 (since the length is encoded in the jsonb value offsets). Furthermore, there may be up to 3 padding bytes before the numeric value. I think in your test, the extra 3 bytes is pushing the size of a single row up to the next multiple of MAXALIGN, so you're getting slightly fewer rows per page. I don't know what Windows is doing, but on my system (freebsd amd64) I get 136 rows/page vs. 120 rows/page, which would make a million rows take 57MB or 65MB. (Your use of pg_total_relation_size is including the pkey index, which confuses the results a bit.) -- Andrew (irc:RhodiumToad)
Re: day interval
> "Abraham" == Abraham, Danny writes: Abraham> Hi Abraham> A question on day interval Abraham> select date('20191001') - date('20190923'); Abraham> Will provide sometimes '8' - an integer , but sometimes '8 Abraham> day' - a string No, it'll always return an integer. You will only get an interval result if you subtract timestamps rather than dates, for example if one of the operands is actually an expression returning a timestamp. Give an example of an actual expression you used that returned an interval instead, and we may be able to tell you how to fix it. -- Andrew (irc:RhodiumToad)
Re: day interval
> "Abraham" == Abraham, Danny writes: Abraham> The problematic code is: Abraham> select date(cm_curr_date) - date(CM_DATE) into diff_days from CMS_SYSPRM; This will always return an integer, unless either the date() cast or the -(date,date) operator have been redefined or modified. Abraham> The fix is: Abraham> select date_part ('day', age( date(cm_curr_date), date(CM_DATE))) into diff_days from CMS_SYSPRM; This doesn't do the same thing, it will give a different result if the dates differ by a month or more. Abraham> The problem: Abraham> How to recreate the problem. (You know - QA). Abraham> Tried changing lc_time, timezone and datestyle .. but nothing Abraham> seems to work None of these things can affect data types. -- Andrew (irc:RhodiumToad)
Re: day interval
> "Abraham" == Abraham, Danny writes: Abraham> Thanks for the clarification. Abraham> The problem is still this: Abraham> select date('20191001') - date('20190101') ; Abraham> in my servers it is always '273'. Abraham> In the customer's DB it is '273 days'; Then you need to establish why that is. For example, try these in psql on the customer's db and show us the outputs: \dT *.date \df *.date select castsource::regtype, casttarget::regtype, castfunc::regprocedure, castcontext, castmethod from pg_cast c join pg_type t on (casttarget=t.oid) where typname='date'; select oprresult::regtype from pg_operator join pg_type t1 on (t1.oid=oprleft) join pg_type t2 on (t2.oid=oprright) where oprname='-' and t1.typname='date' and t2.typname='date'; -- Andrew (irc:RhodiumToad)
Re: SELECT d02name::bytea FROM ... && DBI::Pg
> "Matthias" == Matthias Apitz writes: Matthias> Hello, Matthias> I can SELECT && print a column in hex with: Matthias> pos71=# select d02name::bytea from d02ben where d02bnr = '1048313' ; If your intention is to see the raw bytes of the stored text value, for example to check the encoding, then you should not do it that way. Casting from text to bytea does NOT do what you think. Instead use convert_to(d02name,'SQL_ASCII') (yes, always specify SQL_ASCII regardless of what encoding you think it should be in, since the actual meaning of SQL_ASCII is "no conversions"). For readability, you may then want to wrap that as encode(convert_to(d02name,'SQL_ASCII'),'escape') which will keep the ASCII characters but use \nnn escapes for non-ascii. -- Andrew (irc:RhodiumToad)
Re: Inserting multiple rows wtih a SELECt in the values clause
> "stan" == stan writes: stan> I suspect this may be because the SELECT in the values clause stan> returns multiple rows? Understand this: VALUES is really just a special form of SELECT that returns only the specific rows that you tell it to construct. Every single row returned by a VALUES clause is separately constructed. i.e. VALUES (...),(...); will return exactly two rows regardless of what is inside the (...). VALUES (...); is always exactly one row. And so on. The general form of INSERT is actually: INSERT INTO table(columns) where is any valid query returning any number of rows. The use of VALUES for the is just a convenient shorthand for cases where the exact number of rows to be inserted, and their content, is known in advance. So, if you're inserting some set of rows generated from a query, the word VALUES should not appear in the top-level statement. What you want is: INSERT INTO rate(employee_key, project_key, work_type_key, rate) SELECT employee.employee_key, project.project_key, work_type.work_type_key, 1 as rate FROM employee CROSS JOIN project CROSS JOIN work_type; -- Andrew (irc:RhodiumToad)
Re: Where does postgres take its timezone information from?
> "Tom" == Tom Lane writes: >>> Ugh. It doesn't have the old backward compatibility names like >>> US/Pacific installed by default, which is a problem if that's what >>> initdb picked for your cluster (or you've stored references to any >>> of those names in other ways). >> One quick fix is to revert the change. Tom thinks this is not reason >> to revert. Would it be enough to edit the postgresql.conf to use the >> correct "modern" name for US/Pacific (PST?)? In rhar case, an update >> note might be sufficient? Tom> I think the "official" name of that zone is America/Los_Angeles. Tom> But initdb might seize on the US/Pacific alias, if available, And now you know why I have been saying for so many years that initdb should use the official names! -- Andrew (irc:RhodiumToad)
Re: Where does postgres take its timezone information from?
>>>>> "Tom" == Tom Lane writes: Tom> Andrew Gierth writes: Tom> I think the "official" name of that zone is America/Los_Angeles. Tom> But initdb might seize on the US/Pacific alias, if available, >> And now you know why I have been saying for so many years that initdb >> should use the official names! Tom> [ shrug... ] The installed format doesn't provide any way to Tom> distinguish which are the "official" names. They're typically all Tom> hardlinks to the same file. zone.tab / zone1970.tab. Which I first pointed out before 8.2 came out... -- Andrew.
Re: Encoding/collation question
> "Rich" == Rich Shepard writes: Rich> I doubt that my use will notice meaningful differences. Since Rich> there are only two or three databases in UTF8 and its collation Rich> perhaps I'll convert those to LATIN1 and C. Note that it's perfectly fine to use UTF8 encoding and C collation (this has the effect of sorting strings in Unicode codepoint order); this is as fast for comparisons as LATIN1/C is. For those cases where you need data to be sorted in a culturally-meaningful order rather than in codepoint order, you can set collations on specific columns or in individual queries. -- Andrew (irc:RhodiumToad)
Re: Max locks
> "Peter" == Peter Eisentraut writes: Peter> max_locks_per_transactions only affects relation locks (also Peter> known as heavy weight locks), but pg_locks also shows other Peter> kinds of locks. pg_locks shows exactly two types of locks: "heavy" locks (which are not merely relation locks but also object, tuple, extension, transaction and advisory locks), and predicate locks (SIReadLock) which are limited by a separate configuration parameter (and are not relevant in this case based on the prior IRC discussion). Peter> Filter by locktype = 'relation' to get the appropriate view. This is incorrect (some predicate locks may be tagged 'relation' as well, and some heavy locks will not be relation locks). There's also the question of fastpath locks, but I believe there can only be a small number of these (16?) per backend, so that wouldn't account for this. I think what's going on is that the max size of the lock hashtable isn't strictly enforced; it'll add enough memory space for the configured number of locks to the total size of the shared memory segment, but it won't actually report an error until shared memory is actually exhausted, and it's possible that there may be unused space. (Performance may degrade if there are more locks than the configured maximum, because the hash table will have been sized for that maximum and can't be grown.) See comments for ShmemInitHash. -- Andrew (irc:RhodiumToad)
Re: Are PostgreSQL functions that return sets or tables evaluated lazily or eagerly?
> "Gerald" == Gerald Britton writes: Gerald> That leads me to ask: Gerald> If (and under what circumstances) PostgreSQL evaluates Gerald> functions lazily (returning rows as requested by the caller) or Gerald> eagerly (evaluation all rows before returning the first one)? This isn't trivial to answer because it depends significantly on the language the function is written in and how it was called. The first matching rule below controls what happens. 1. LANGUAGE SQL with inlining Table functions in language SQL are candidates for inlining, see https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions If an SQL-language function is inlined, then it behaves exactly as though the function body was written inline, which means it is evaluated as lazily as the query plan otherwise permits (for example, if there's a non-indexable ORDER BY clause, then clearly all the values have to be fetched before any are returned). 2. Table function called in the FROM clause Table functions in the FROM clause, e.g. SELECT ... FROM myfunc(); are always evaluated eagerly. 3. LANGUAGE SQL without inlining, in the select-list If the final statement of an SQL function is a plain select with no wCTEs, then it is evaluated lazily: the first fetch will execute everything up to the first row of the final select, and subsequently one row will be fetched at a time. If the final statement is a DML statement with a RETURNING clause, or contains wCTEs, then it is evaluated eagerly. 4. LANGUAGE C / INTERNAL C-language functions (and therefore internal functions too) can choose whether to use value-per-call mode or materialize mode. Materialize mode is always "eager", but value-per-call mode is sometimes still eager (as in case 2 above); it can only be lazy if no preceding rule forced it to be otherwise. Most built-in table functions use value-per-call mode (a notable exception being the functions in the tablefunc module). 5. LANGUAGE PLPGSQL, PLPERL, PLTCL Plpgsql, plperl, and pltcl functions are always evaluated eagerly. 6. LANGUAGE plpython Plpython functions that return an iterator run in value-per-call mode, with a "next" call on the iterator for each row. To what extent this is a lazy or eager evaluation depends on the python code. 7. Other PL languages For non-core PL languages the documentation or source code may indicate whether the language uses materialize mode or value-per-call mode. (Most languages are probably not well equipped to do value-per-call mode. One that does allow it is pl/lua, which runs table functions as coroutines.) -- Andrew (irc:RhodiumToad)
Re: When should parameters be passed as text v. binary?
> "Paula" == Paula Kirsch writes: Paula> I'm just trying to understand the trade-offs between sending Paula> everything always as text, all integer parameters as binary, Paula> floats as binary, etc. For passing data from client to server, there's no particular reason not to use the binary format for any data type that you understand (and where you're passing the data type oid explicitly in the query, rather than just leaving it as unknown). For results, things are harder, because libpq is currently all-or-nothing about result type formats, and if you start using extension types then not all of them even _have_ a binary format. And to decode a binary result you need to know the type, and have code to handle every specific type's binary format. -- Andrew (irc:RhodiumToad)
Re: \COPY to accept non UTF-8 chars in CHAR columns
> "Thomas" == Thomas Munro writes: Thomas> Something like this approach might be useful for fixing the CSV file: Thomas> https://codereview.stackexchange.com/questions/185821/convert-a-mix-of-latin-1-and-utf-8-to-proper-utf-8 Or: perl -MEncode -pe ' use bytes; sub c { decode("UTF-8",shift,sub { decode("windows-1252", chr(shift)) }); } s/([\x80-\xFF]+)/encode("UTF-8",c($1))/eg' outfile -- Andrew (irc:RhodiumToad)
Re: \COPY to accept non UTF-8 chars in CHAR columns
> "Rory" == Rory Campbell-Lange writes: Rory> Or: Rory> iconv -f WINDOWS-1252 -t UTF-8 -c < tempfile2 > tempfile3 No. That's just a conversion of win1252 to utf8 without regard for any UTF8 that might already be present in the input. Any such input will end up double-encoded, requiring further work to fix. -- Andrew (irc:RhodiumToad)
Re: \COPY to accept non UTF-8 chars in CHAR columns
> "Matthias" == Matthias Apitz writes: Matthias> i.e. 0xc3 is translated to 0xc383 and the 2nd half, the Matthias> 0xbc to 0xc2bc, both translations have nothing to do with Matthias> the original split 0xc3bc, and perhaps in this case it Matthias> would be better to spill out a blank 0x40 for each of the Matthias> bytes which formed the 0xc3bc. If the only malformed sequences are there as a result of splitting up valid sequences, then you could do something like convert all invalid sequences to (sequences of) noncharacters, then once the data is imported, fix it up by adjusting how the data is split and regenerating the correct sequence (assuming your application allows this). For example you could encode an arbitrary byte xy as a sequence of two codepoints U+FDDx U+FDEy (the range FDD0-FDEF are all defined as noncharacters). -- Andrew (irc:RhodiumToad)
Re: Compiling C Extension Functions against PostgreSQL 12
> "TalGloz" == TalGloz writes: TalGloz> Which is good, my seal_mean_cxx_v2.so being created and copied TalGloz> to /usr/pgsql-12/lib/. But right after that I get this and it TalGloz> doesn't seem to effect my seal_mean_cxx_v2.so library: What's happening here is that it's attempting to build LLVM bitcode files for your library for the benefit of the JIT compiler that exists in recent postgres versions - without these files, your extension functions cannot be inlined into generated code. This requires that your code be compilable using clang (as well as gcc if that's what was used to build PG itself), and there's clearly some disagreement going on between clang and your system header files that's causing the failure. I didn't see an easy way of disabling bitcode emission for a module, though I think that has been discussed before. -- Andrew (irc:RhodiumToad)
Re: How to declare PG version for compiling extensions.
> "GPT" == GPT writes: GPT> Unfortunately, I had not installed the following package: GPT> "postgresql-server-dev-11" GPT> By the way, shouldn't a warning message appear while trying to run: GPT> `PATH=/usr/lib/postgresql/11/bin:$PATH make USE_PGXS=1` GPT> warning the user that some files are missing. It's more reliable to do (if the makefile is correctly written): make USE_PGXS=1 PG_CONFIG=/usr/lib/postgresql/11/bin/pg_config By specifying the pg_config binary explicitly rather than relying on the PATH, you avoid the chance of picking up an incorrect copy by mistake. This is why the standard form for pgxs makefiles has an assignment for PG_CONFIG before the PGXS assignment line; the command-line option overrides it. -- Andrew (irc:RhodiumToad)
Re: Shell Command within function
> "Mike" == Mike Martin writes: Mike> Is this possible? In an untrusted pl language, yes, but it's a bad idea because: Mike> I have a script which imports csvlogs into a table, and it would Mike> be useful to truncate the log files after import If you had an error after truncating the file but before committing the transaction, then the imported data would be gone, but the file would be gone too, so you'd have lost it. Things like this need to be handled _after_ commit, not before, so a function is the wrong place for it. -- Andrew (irc:RhodiumToad)
Re: Broken postgres links need to find callers
> "Rich" == Rich Shepard writes: Rich> I managed to mess up postgresql-10.3 on this Slackware-14.2 Rich> desktop server/workstation. It worked OK until I tried adding Rich> access to an another application. Rich> waiting for server to start2018-10-31 10:02:01.312 PDT [1285] FATAL: Rich> could not open directory "/usr/share/postgresql-10.2/timezonesets": No such Rich> file or directory 2018-10-31 10:02:01.312 PDT [1285] HINT: This may indicate Rich> an incomplete PostgreSQL installation, or that the file Rich> "/usr/lib/postgresql/10.3/bin/postgres" has been moved away from its proper Rich> location. Is there a pg_config binary in /usr/lib/postgresql/10.3/bin/ and if so, what is the output of /usr/lib/postgresql/10.3/bin/pg_config --sharedir Also what is the output of /usr/lib/postgresql/10.3/bin/postgres -V The most plausible explanation I can see for what you're seeing there is that what you have as /usr/lib/postgresql/10.3/bin/postgres is not actually the 10.3 binary but rather the 10.2 one. There should be no symlinks involved there - the path that is reported in the error message is the one that the postgres binary actually did try to open. -- Andrew (irc:RhodiumToad)
Re: Broken postgres links need to find callers
> "Rich" == Rich Shepard writes: Rich> Yes, pg_config is present but pointing to the wrong directory: Rich> # /usr/lib/postgresql/10.3/bin/pg_config --sharedir Rich> /usr/share/postgresql-10.2 What this says is that you somehow have a pg 10.3 binary which has been compiled with ./configure --datadir=/usr/share/postgresql-10.2 which seems, to say the least, somewhat odd. pg_config isn't used by the postgres binary to find paths, so "fixing" it wouldn't help. The same paths that were compiled into pg_config are compiled into the postgres binary, and pg_config and postgres contain the same relocation logic. -- Andrew (irc:RhodiumToad)
Re: recursion in plpgsql
> "David" == David Gauthier writes: David> Tom: I seem to remember (but am probably wrong) that cursors David> were locally scoped (or could be made so). This was several David> years ago with an earlier v8 version. Was that sort of thing David> around back then ? There are two distinct objects here being called "cursor": one is the plpgsql variable, which is locally scoped, and the other is the actual open portal, which must have a unique name within the session. By default, plpgsql explicit bound cursors (but not plain "refcursor" variables) take their portal name from the plpgsql variable name, and hence don't work recursively by default. This is a convenience so that code outside the function can use the same name to refer to the open portal. However, plpgsql cursor variables (whether declared bound or unbound) can be assigned a text value or NULL _before_ being opened, and if so, that value will be used for the portal name, or if NULL, a name of "" will be uniquely generated. (_After_ the open, the variable's text value is the actually assigned portal name.) Unbound refcursor variables default to NULL, so they are assigned unique portal names on opening. So in your example, adding child_node_curr := NULL; immediately before the OPEN statement should be sufficient. -- Andrew (irc:RhodiumToad)
Re: Largest & Smallest Functions
> "Ken" == Ken Tanzer writes: Ken> Hi. Building on the [type]_larger and _smaller functions (and Ken> lifting from the documentation), I put together a couple of Ken> functions that will take any number of arguments: Ken> CREATE FUNCTION largest(VARIADIC anyarray) RETURNS anyelement AS Ken> $$ SELECT max($1[i]) FROM generate_subscripts($1, 1) g(i); $$ Ken> LANGUAGE SQL IMMUTABLE; Ken> So far so good. I can do smallest(2,4,7), etc. But to take this a Ken> convenient step further, sometimes I want the smallest or largest Ken> from values already in an array. So I can create these functions: But you don't need to create more functions, because you can do this: select largest(variadic array[1,2,3]); largest - 3 Ken> So here's my questions: Ken> 1) Is there any way to collapse those four functions into two? (Or Ken> is there a better way to go about this?) See above Ken> 2) Is there any particular reason functions like that aren't built Ken> into Postgres? They seem like they would be useful. (Or maybe I Ken> missed them?) As already pointed out, greatest() and least() exist (though they were added before VARIADIC was, so they don't use it) Ken> 3) Bonus question--how come all the existing _larger and _smaller Ken> functions are specific to each data type, as opposed to more Ken> general smaller/larger functions? Because it saves looking up the type comparison function and doing an indirect call. -- Andrew (irc:RhodiumToad)
Re: Largest & Smallest Functions
> "Pavel" == Pavel Stehule writes: Pavel> The variadic parameters should not be a arrays - can be of "any" Pavel> type. But this functionality is available only for C language Pavel> functions. You mean (VARIADIC "any")? - that is not actually restricted to C language functions, any pl/* handler can choose to support it (it's just that all the built-in ones don't). -- Andrew (irc:RhodiumToad)
Re: Full list of operations that constitute a "maintenance" operation?
> "Lance" == Lance Luvaul writes: Lance> Hi all, I've read on the Postgres documentation for Lance> 'maintenance_work_mem' that VACUUM, CREATE INDEX, and ALTER Lance> TABLE ADD FOREIGN KEY are considered maintenance operations, but Lance> are there others? For example I use ALTER TABLE ADD COLUMN and Lance> ALTER TABLE SET LOGGED in my scripts... are they maintenance Lance> operations that would cause a maintenance_work_mem-sized chunk Lance> of memory (or more than 1 such chunk) to be allocated? There are essentially two classes of operations that use maintenance_work_mem: 1. Specific operations: non-full VACUUM (for the deleted tid list), CLUSTER (for sorting the table content), ALTER ... ADD FOREIGN KEY (for the validation query, which might want to use sorts or a hashjoin). (It's not impossible, though it should be rare, for an FK validation to use two chunks of maintenance_work_mem - with the right table sizes and data types, the validation query could plan as a merge anti-join with explicit sorts on both sides.) 2. Anything that (re)builds an index for any reason. This includes a large set of operations: CREATE INDEX and REINDEX are obvious, likewise VACUUM FULL and CLUSTER, but also any form of ALTER TABLE that rewrites the heap (which _includes_ SET LOGGED, but does not include adding a column with no default, or in pg11+ adding a column with a default). It also includes non-concurrent refresh of a materialized view. (TRUNCATE also technically rewrites the table heap, but since the new heap is empty, memory consumption during reindex is not an issue.) ALTER TABLE to add a unique, primary key or exclusion constraint also creates an index to implement the constraint with (unless USING INDEX was used to adopt an existing index) so that counts too. -- Andrew (irc:RhodiumToad)
Re: query patterns for multipass aggregating
> "Rob" == Rob Nikander writes: Rob> I want a query to list items, with their colors and images. Each Rob> result row is an item, so the colors and images must be aggregated Rob> into arrays or json. Rob> If there were only one other table, it’s a simple join and group… Rob> select items.*, array_agg(color_name) Rob> from items join colors on items.id = colors.item_id Rob> group by items.id Method 1: select items.*, c.colors, i.images from items left join (select item_id, array_agg(color_name) as colors from colors group by item_id) c on c.item_id=items.id left join (select item_id, array_agg(image) as images from images group by item_id) i on i.item_id=items.id; Method 2: select items.*, c.colors, i.images from items left join lateral (select array_agg(c0.color_name) as colors from colors c0 where c0.item_id=items.id) c on true left join lateral (select array_agg(i0.image) as images from images i0 where i0.item_id=items.id) i on true; Unfortunately, the planner isn't smart enough yet to know that these two are equivalent, so they generate different sets of possible query plans. Method 1 gets plans that work well if the entire items table is being selected, since it will read the whole of the images and colors tables in one go, and it will also get plans that work well for reading a _single_ item selected by WHERE items.id=? because equivalence-class processing will propagate a copy of that condition down to below the grouping clauses. It will _not_ get a good plan for reading any other small subset of items (e.g. selected by other criteria); for this you need method 2, which in turn doesn't get very good plans when you fetch the whole items table. Don't be tempted to use CTEs for the subqueries in either plan; that'll only make it much worse. -- Andrew (irc:RhodiumToad)
Re: query patterns for multipass aggregating
>>>>> "Andrew" == Andrew Gierth writes: Andrew> Unfortunately, the planner isn't smart enough yet to know that Andrew> these two are equivalent, oops, I edited the second one before posting in a way that made them not be equivalent: adding a "group by x0.item_id" in both subqueries in method 2 makes them equivalent again. Without that, the result differs slightly if there are no matching color or image rows (NULL vs. empty array). -- Andrew (irc:RhodiumToad)
Re: query patterns for multipass aggregating
> "Ondřej" == Ondřej Bouda writes: Ondřej> What about subqueries? Ondřej> SELECT Ondřej> items.*, Ondřej> (SELECT array_agg(color_name) FROM colors WHERE item_id = Ondřej> items.id) AS color_names, Ondřej> (SELECT array_agg(image_file) FROM images WHERE item_id = Ondřej> items.id) AS image_files Ondřej> FROM items Ondřej> According to my experience, not only the code is readable (no Ondřej> questions about how many rows are there for each item), but it Ondřej> also leads to a better query plan. This is (generally speaking) no improvement over the LATERAL method I showed, and is less flexible (for example it's very painful to return more than one value from the subqueries). -- Andrew (irc:RhodiumToad)
Re: Move cluster to new host, upgraded version
> "Rich" == Rich Shepard writes: Rich> My current desktop server/workstation is running version 10.5. Rich> I'm configuring a replacement desktop and have installed version Rich> 11.1 on it. To copy all databases from the 10.5 version to the Rich> 11.1 version I assume that I should do a pg_dumpall on the Rich> current host and read in that file on the replacement host. Is Rich> this the proper procedure? The most reliable and preferred procedure is to use the _new_ version's pg_dumpall, for example by allowing access to the old host from the new one (possibly using an ssh port forward), or (on OSes that make it easy to do package installs of multiple versions) to install the new pg_dump and pg_dumpall on the old system. Using the old version's pg_dumpall also generally speaking works, but there may be occasional rough edges. -- Andrew (irc:RhodiumToad)
Re: Debian : No echo after pg_dump | psql
> "Moreno" == Moreno Andreo writes: Moreno> The command I'm using is Moreno> root@x:~# pg_dump -v -C -h -p 6543 -U postgres Moreno> | psql -h localhost -p 6543 -U postgres Moreno> It presents a double password prompt after I run it: Moreno> Password: Password for user postgres: This is going to prompt once for the remote host's password and twice for the local one (because -C), and the concurrently-running commands are going to be fighting over access to the terminal to do it. Best avoided by using pgpass or non-password-based auth methods. More seriously, you're misunderstanding how -C works. When you use -C, the database you specify to psql (or pg_restore) is NOT the database you're restoring into - the restored db will ALWAYS have the same name as it had when dumped (if that's not what you want then don't use -C). Instead, the database you specify to psql or pg_restore is the database to connect to to issue the CREATE DATABASE command, which should usually be 'postgres'. That explains this bit: Moreno> If I create database (just database, not schema) on target Moreno> machine, I receive the error "database already exists" but Moreno> the dump goes on If I don't create it, I receive the error Moreno> "database does not exist" and processing aborts. -- Andrew (irc:RhodiumToad)
Re: Transition Tables doesn´t have OID
> "PegoraroF10" == PegoraroF10 writes: PegoraroF10> I would like to user OID value because we change our PegoraroF10> primary keys, sometimes, OID doesn´t. "oid" as a special system column and the WITH OIDS option when creating tables are being removed in pg12 (having been deprecated for something like 12 years now), you might want to start thinking about alternatives. PegoraroF10> So, there is a way to get OID on transition tables ? Well, arguably it is an oversight in the implementation of transition tables that they were not added to the various places in the parser that treat "oid" as a system column name. However, I not sure that you'll get any agreement to fix that in light of the demise of "oid" as mentioned above. (Not least because backpatching it could be dangerous in that it could break queries that now work, by making "oid" an ambiguous column reference.) I tried various workarounds, but they were defeated by the fact that evaluation of a whole-row Var does not copy the oid value (if any). (I'm not sure if this ever really worked, so calling it a bug may be a bit of a stretch.) -- Andrew (irc:RhodiumToad)
Re: using a function in where
> "Glenn" == Glenn Schultz writes: Glenn> All, Glenn> I am using the function below to convert a continuous variable Glenn> to a binned value. Sometimes a value other than zero is passed Glenn> through the query. For example -.5 result value is passed to the Glenn> query result. The basic of the query is below. Glenn> select Glenn> incentivebin(wac, rate, .25) Glenn> from Glenn> my_table Glenn> where incentivebin(was, rate, .25) = 0 Those two calls to incentivebin() have different parameters - is that an error in the original query, or a typo in transcribing it to email? -- Andrew (irc:RhodiumToad)
Re: Query never completes with an OR condition
> "Kenneth" == Kenneth Marshall writes: Kenneth> The individual queries run as expected, but when the OR Kenneth> condition is added, it never finishes. http://blog.rhodiumtoad.org.uk/2017/01/22/performance-issues-with-ored-conditions/ -- Andrew (irc:RhodiumToad)
Re: != ANY(array) does not behave as expected
> "Chris" == Chris Wilson writes: Chris> However, if we try to invert it by using the != operator, then Chris> we get unexpected results: Mr. De Morgan would like a word. https://en.wikipedia.org/wiki/De_Morgan%27s_laws In short, if you have a condition of the form (a OR b) and you want to negate it, then you find that: NOT (a OR b) is equivalent to (NOT a) AND (NOT b) Since x = ANY (array[1,2]) is equivalent to (x = 1) OR (x = 2), then the negation would be (x != 1) AND (x != 2), not OR. Which can be conveniently expressed as x != ALL (array[1,2]). So just as you interchange AND and OR when inverting the sense of a condition, you also interchange ALL and ANY for exactly the same reasons. Chris> expression NOT IN (subquery) https://wiki.postgresql.org/wiki/Don%27t_Do_This#Don.27t_use_NOT_IN Chris> And is it a bug that one can't use unnest in a NOT IN expression Chris> in the WHERE clause? No. -- Andrew (irc:RhodiumToad)
Re: Code for getting particular day of week number from month
> "Mike" == Mike Martin writes: Mike> Hi Mike> For a particular sequence I needed to do (schedule 2nd monday in Mike> month for coming year) I created the following query That doesn't look like the best way - you're generating and discarding a lot of rows. "second monday in month X" can be expressed as follows: "second monday in X" = "(first monday in X) + 1 week" "first monday in X" = "first day of X + N days, where N is (dow(Monday) - dow(1st)) reduced to 0..6 mod 7" i.e. if the month starts on Monday, N=0 .. on Tuesday, N = 6 (1 - 2 == 6 mod 7) .. on Wednesday, N = 5 etc. So: select to_char(d, 'Day DD/MM/') from (select month + interval '1 week' + ((1 - extract(dow from month)::integer + 7) % 7) * interval '1 day' as d from generate_series(timestamp '2018-12-01', timestamp '2020-12-01', interval '1 month') month) s; -- Andrew (irc:RhodiumToad)
Re: How to build a btree index with integer values on jsonb data?
> "Johann" == Johann Spies writes: Johann> How can I transform the following definition to index pubyear Johann> as integer and not text? Johann> CREATE INDEX pubyear_idx Johann> ON some_table_where_data_field_is_of_type_jsonb USING btree Johann> ((data -> 'REC'::text) -> 'static_data'::text) -> Johann> 'summary'::text) -> 'pub_info'::text) ->> '@pubyear'::text) COLLATE Johann> pg_catalog."default"); Johann> While I can cast the value in a SELECT statement to integer I Johann> have been able to do the same while creating the index. Laurenz' answer was almost correct, just got the position of the parens wrong. When you use an expression in an index, the outermost level of the expression must either be (syntactically) a function call, or it must have parens around its _outermost_ level. You can simplify selecting from nested json using #>> in place of the -> and ->> operators. (x #>> array['foo','bar']) is equivalent to doing ((x -> 'foo') ->> 'bar') So: CREATE INDEX pubyear_idx ON some_table_where_data_field_is_of_type_jsonb USING btree ( ((data #>> array['REC','static_data','summary','pub_info','@pubyear'])::integer) ); Note the ::integer is inside the parens that define the column value within the outermost ( ) which enclose the column _list_. -- Andrew (irc:RhodiumToad)
Re: Format an Update with calculation
> "Bret" == Bret Stern writes: Bret> My statement below updates the pricing no problem, but I want it Bret> to be formatted with 2 dec points eg (43.23). UPDATE ... SET suggested_retail_price = round(suggested_retail_price*1.13, 2) WHERE ... assuming suggested_retail_price is already of type "numeric". You could also have given the columm a type of numeric(18,2) or similar, which would round all assigned values to 2 places. -- Andrew (irc:RhodiumToad)
Re: jsonb : find row by array object attribute
> "Rory" == Rory Campbell-Lange writes: Rory> and data like this: Rory> j Rory> -- Rory> {"a": 1, "people": [{"id": 2002}, {"id": 2004}]} Rory> {"a": 3, "people": [{"id": 2003}, {"id": 2005}]} Rory> (2 rows) Rory> I'd like to be able to find any row with a particular people id Rory> attribute. where j @> '{"people":[{"id":2003}]}' (meaning: j contains a key "people" whose value is an array containing an element {"id":2003}) Since @> can use GIN indexes, you should usually try and find a search condition using it before resorting to playing with -> or expanding out array values. -- Andrew (irc:RhodiumToad)
Re: jsonb : find row by array object attribute
>>>>> "Andrew" == Andrew Gierth writes: >>>>> "Rory" == Rory Campbell-Lange writes: Rory> and data like this: Rory> j Rory> -- Rory> {"a": 1, "people": [{"id": 2002}, {"id": 2004}]} Rory> {"a": 3, "people": [{"id": 2003}, {"id": 2005}]} Rory> (2 rows) Rory> I'd like to be able to find any row with a particular people id Rory> attribute. Andrew> where j @> '{"people":[{"id":2003}]}' Andrew> (meaning: j contains a key "people" whose value is an array Andrew> containing an element {"id":2003}) or to be more precise: j is an object containing a key "people" whose value is an array containing an element which is an object containing a key "id" with value 2003 i.e. {"a":3, "people": [{"id":2003,"blah":123},{"id":2004}]} would match the condition too. -- Andrew (irc:RhodiumToad)
Re: Can't quote_literal with COPY FROM PROGRAM
> "Mark" == Mark Mikulec writes: Mark> To be honest this whole affair with COPY FROM program seems like Mark> a bug to me though. Perhaps you're misunderstanding what COPY FROM PROGRAM is actually for. Its purpose is to do exactly what COPY does, that is to say, take as input a file in either PG's tab-delimited format or in CSV format, break it into records and fields, and insert the data into a table. Note that JSON is not a supported input format for COPY, though of course JSON _values_ can appear as data within a field inside either the tab-delimited or CSV formats. COPY FROM PROGRAM simply does COPY but with the input (whether in tab or CSV format) taken from the output of the program rather than a file. In tab-delimited format, the delimiter can be changed to something other than a tab, but the escape character is fixed as \ and the characters NL, CR, \, and the delimiter character are required to be escaped. Thus, any literal \ in the data MUST be escaped as \\ before passing the data to COPY in this mode. In CSV mode, CSV quoting and escaping rules are followed. It's not COPY's job to read a single datum, whether in JSON format or anything else. -- Andrew (irc:RhodiumToad)
Re: Function `set_config` doesn't work in with query?
> "Zexuan" == Zexuan Luo writes: Zexuan> For instance: Zexuan> ``` Zexuan> with t as ( Zexuan> select set_config('blah', '1', false) Zexuan> ) Zexuan> select current_setting('blah'); A CTE containing a SELECT query which is not referenced anywhere will not be executed, even if it contains volatile functions. (CTEs containing INSERT/UPDATE/DELETE that are not referenced _will_ still be executed.) -- Andrew (irc:RhodiumToad)
Re: Immutable way to cast timestamp TEXT to DATE? (for index)
> "Ken" == Ken Tanzer writes: Ken> Hi. I've got a text field in a table that holds this style of Ken> timestamp: Ken> 2014-10-23T00:00:00 You can't make this a field of type "timestamp" rather than text? Casts from text to either date or timestamp are mutable because they depend on the current DateStyle value (_you_ might know that your values are always ISO format, but the code doesn't). You can't even avoid this with to_date or to_timestamp and a fixed format, because those functions are also mutable since some of the format options are locale-dependent (again, you might know that you're not using those, but the code doesn't). If the column type were timestamp, though, then you could do comparisons directly, or cast the value to "date" (which is an immutable cast). If you absolutely can't change the column type, then one option would be to do your own fixed-format date parsing function (and label it immutable), e.g. create function iso_timestamp(text) returns timestamp without time zone as $$ select case when $1 ~ '^\d\d\d\d-?\d\d-?\d\dT\d\d:?\d\d:?\d\d$' then $1::timestamp else null end $$ set DateStyle = 'ISO,YMD' language sql immutable strict; or create function iso_date(text) returns date as $$ select case when $1 ~ '^\d\d\d\d-?\d\d-?\d\d(?![^T])' then substring($1 from '^\d\d\d\d-?\d\d-?\d\d')::date else null end $$ set DateStyle = 'ISO,YMD' language sql immutable strict; -- Andrew (irc:RhodiumToad)
Re: Immutable way to cast timestamp TEXT to DATE? (for index)
> "Ken" == Ken Tanzer writes: >> If you absolutely can't change the column type, then one option >> would be to do your own fixed-format date parsing function (and >> label it immutable), e.g. >> >> create function iso_timestamp(text) >> returns timestamp without time zone >> as $$ select case when $1 ~ '^\d\d\d\d-?\d\d-?\d\dT\d\d:?\d\d:?\d\d$' >> then $1::timestamp >> else null end $$ >> set DateStyle = 'ISO,YMD' >> language sql immutable strict; >> >> or >> >> create function iso_date(text) >> returns date >> as $$ select case when $1 ~ '^\d\d\d\d-?\d\d-?\d\d(?![^T])' >> then substring($1 from '^\d\d\d\d-?\d\d-?\d\d')::date >> else null end $$ >> set DateStyle = 'ISO,YMD' >> language sql immutable strict; Ken> Yeah, I thought I might have to do that, but when I create that Ken> index it still doesn't seem to use the index for queries. It won't use the index unless you use the same function in the query too. i.e. CREATE INDEX ON ... (iso_date("Service_Date")); SELECT * FROM ... WHERE iso_date("Service_Date") BETWEEN ... AND ...; -- Andrew (irc:RhodiumToad)
Re: Adding LEFT JOIN to a query has increased execution time 10 times
> "AF" == Alexander Farber writes: AF> Here are the only modified settings in postgresql.conf: AF> max_connections = 120 # (change requires restart) AF> work_mem = 8MB # min 64kB AF> maintenance_work_mem = 128MB # min 1MB AF> 90% of the backend source code are JSON-emitting stored functions AF> and there is one function which is the main core of the game and is AF> a SELECT query over 7 tables. AF> It is called for every Websocket-connected client and delivers a AF> JSON list of active games for the player. AF> Until recently the query needed 1-2 seconds for completion, That seems slow in itself, even before adding the extra join - the explain suggests that you're both short on indexes and you're getting pretty bad plans, possibly due to exceeding join_collapse_limit. (You might try increasing that in your config, along with from_collapse_limit; the default values are a legacy of the days when CPUs were much slower and planning time more of an issue.) AF> but after I have added a LEFT JOIN with the following table, the AF> query takes 7-10 seconds for completion and makes the game AF> unpleasant to play: AF> # \d words_geoip; AF> Table "public.words_geoip" AF> Column | Type | Collation | Nullable | Default AF> +--+---+--+- AF> block | inet | | not null | AF> lat| double precision | | | AF> lng| double precision | | | AF> Indexes: AF> "words_geoip_pkey" PRIMARY KEY, btree (block) And here's yet another missing index, resulting in your query having to process and discard 27 million rows in the course of generating a result of only 9 rows: Join Filter: (CASE WHEN (g.player1 = 5) THEN u2.ip ELSE u1.ip END << i2.block) Rows Removed by Join Filter: 27660682 (you probably wanted <<= rather than << as that comparison, if there's any chance your geoip table might have entries for single IPs) Fortunately, this being pg10, you can use either of these indexes: CREATE INDEX ON words_geoip USING gist (block inet_ops); or CREATE INDEX ON words_geoip USING spgist (block); As for the rest of the query, here are places you could probably work on: AF> LEFT JOIN words_moves m ON m.gid = g.gid AF> AND NOT EXISTS (SELECT 1 AF> FROM words_moves m2 AF> WHERE m2.gid = m.gid AF> AND m2.played > m.played) Whar you're asking for here is that the words_moves row that you're joining not have a matching row with a larger "played" value. You can do this far more efficiently with a lateral join, given the right index. AF> LEFT JOIN words_social s1 ON s1.uid = 5 AF> AND NOT EXISTS (SELECT 1 AF> FROM words_social s AF> WHERE s1.uid = s.uid AF> AND s.stamp > s1.stamp) AF> LEFT JOIN words_social s2 ON s2.uid = (CASE WHEN g.player1 = 5 THEN AF> g.player2 ELSE g.player1 END) AF> AND NOT EXISTS (SELECT 1 AF> FROM words_social s AF> WHERE s2.uid = s.uid AF> AND s.stamp > s2.stamp) Similar considerations apply to both of the above. AF> WHERE 5 IN (g.player1, g.player2) AF> AND (g.finished IS NULL OR g.finished > CURRENT_TIMESTAMP - INTERVAL '1 AF> day'); This WHERE clause could be written as WHERE 5 IN (g.player1, g.player2) AND coalesce(g.finished,'infinity') > (current_timestamp - interval '1 day') and you could then create the following indexes, CREATE INDEX ON words_games (player1, coalesce(finished,'infinity')); CREATE INDEX ON words_games (player2, coalesce(finished,'infinity')); which should get you a BitmapOr plan for that condition. 
AF> I have also asked my question at [dba.stack] If you ask questions like this on the IRC channel (#postgresql on chat.freenode.net - see http://freenode.net for info or web-based client access), you can usually get feedback in real time (I rarely answer performance questions in email because getting responses just takes too long). You may have to be patient. -- Andrew (irc:RhodiumToad)
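For reference, the lateral-join shape suggested above for the words_moves part would look something like this (a sketch only; the exact column list is guessed from the fragments quoted here):

... LEFT JOIN LATERAL (
        SELECT m.*
          FROM words_moves m
         WHERE m.gid = g.gid
         ORDER BY m.played DESC
         LIMIT 1
    ) m ON true ...

-- supported by:
CREATE INDEX ON words_moves (gid, played);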
Re: Adding LEFT JOIN to a query has increased execution time 10 times
> "Alexander" == Alexander Farber writes: Alexander> Good evening, thank you for the useful hints! Alexander> With the further help of the IRC folks the query has been Alexander> optimized (7-10 seconds -> 0.3 second) 0.3 MILLIseconds, actually. (You chanced not to catch me around on IRC, but I see that didn't matter.) Alexander> by adding the following indices: Alexander> CREATE INDEX ON words_games (player1, COALESCE(finished, 'INFINITY')); Alexander> CREATE INDEX ON words_games (player2, COALESCE(finished, 'INFINITY')); Alexander> CREATE INDEX ON words_moves (gid, played DESC); Alexander> CREATE INDEX ON words_social (uid, stamp DESC); I'm not a big fan of using DESC on indexes; it's almost never needed, because any btree index can be scanned in reverse. (You only actually need it if you're mixing ASC/DESC orderings in an ORDER BY and want an index that matches it.) Alexander> Also I have increased the following parameters in Alexander> postgresql.conf - Alexander> from_collapse_limit = 24 Alexander> join_collapse_limit = 24 Ironically, I think these settings don't affect the query now since removing the EXISTS conditions (which count as joins) means there are now less than 8 joined tables. But keeping them high is probably a good idea so that you don't get problems if you ever add another join or two. Alexander> Now the whole query looks as following and the EXPLAIN Alexander> output pasted is below - Just for future reference, when you include explain output in your email (which you should, for the benefit of the archives - paste sites and explain.depesz.com have limited lifetimes), it's best if you can make sure your email client doesn't word-wrap them into near-unreadability. Alexander> I was told that it still could be improved (by rearranging Alexander> WHERE clauses?) Maybe, but once you're down to sub-millisecond execution times, further optimization is usually only worthwhile for very heavily executed queries. -- Andrew (irc:RhodiumToad)
Re: Not sure which part of the query needs optimization
> "Alexander" == Alexander Farber writes: Alexander> Good afternoon, Alexander> for each visitor of my website I generate a JSON list of 30 Alexander> top players ( https://slova.de/words/top.php ), who played Alexander> in the past week, with their average scores and average time Alexander> between moves. -> Parallel Seq Scan on words_moves m_1 (cost=0.00..73600.05 rows=8666 width=16) (actual time=0.761..130.844 rows=11074 loops=3) Filter: (played > (CURRENT_TIMESTAMP - '7 days'::interval)) Rows Removed by Filter: 492241 This is telling you that an index on words_moves(played) would likely help. But the real hot point in the query is here (reformatted for clarity): -> Aggregate (cost=6097.83..6097.84 rows=1 width=32) (actual time=19.401..19.401 rows=1 loops=30) -> Bitmap Heap Scan on words_moves (cost=33.97..6093.45 rows=1748 width=4) (actual time=1.680..18.153 rows=15011 loops=30) Recheck Cond: (uid = u.uid) Heap Blocks: exact=216312 -> Bitmap Index Scan on words_moves_uid_idx (cost=0.00..33.54 rows=1748 width=0) (actual time=0.979..0.979 rows=15011 loops=30) Index Cond: (uid = u.uid) (the explain.depesz.com view points this out with an orange highlight) This corresponds to this subquery: (SELECT ROUND(AVG(score), 1) FROM words_moves WHERE uid = u.uid) AS avg_score, The basic issue here is that you're calculating an average over around 15k rows per user, for each user in the query result (so 30 times, making 450k rows). You can see from the "Heap Blocks" stat that this is having to scan a lot of data; it's taking on average 19.4ms per user, but multiply that by 30 users and you get ~580ms total, or about 70% of the total execution time. The obvious thing to do is to keep a computed average score for each user - either in a separate table which you update based on changes to words_moves, which you could do with a trigger, or using a materialized view which you refresh at suitable intervals (this has the drawback that the data will not be immediately up-to-date). Combining these two changes should get you to under 100ms, maybe. -- Andrew (irc:RhodiumToad)
Re: How to always run UPDATE FROM despite missing records in the source table?
> "Alexander" == Alexander Farber writes: Alexander> However the following query does not work as intended and Alexander> does not update any fields, because there is no matching Alexander> block in the geoip table found: Alexander> UPDATE users u SET Alexander> visited = now(), -- HOW TO ALWAYS UPDATE THIS FIELD? Alexander> ip = '20.20.20.20'::inet, -- HOW TO ALWAYS UPDATE THIS FIELD? Alexander> lat = i.lat, Alexander> lng = i.lng Alexander> FROM geoip i Alexander> WHERE u.uid = 2 AND '20.20.20.20'::inet <<= i.block; Alexander> The field visited and ip however should be always updated - Alexander> regardless if the block was found or not. Alexander> Kind of LEFT JOIN, but for UPDATE - how to achieve this please? It can be done like this (this assumes you want to preserve the previous values of u.lat/u.lng if the block was not found; if you want to set them to null instead, then remove the coalesce() calls): UPDATE users u SET visited = now(), ip = v.ip, lat = coalesce(i.lat, u.lat), lng = coalesce(i.lng, u.lng) FROM (VALUES ('20.20.20.20'::inet)) v(ip) LEFT JOIN geoip i ON (v.ip <<= i.block) WHERE u.uid = 2; Alexander> But that would run the same subquery twice (correct?) and my Alexander> geoip table is already slow with 3073410 records Slow even with a gist or spgist index? what does the explain analyze look like? (You could also try using the ip4r module; I've not done any serious benchmarking to see if it's faster than the built-in index types, though it has some theoretical advantages due to not being restricted to CIDR ranges. In pg versions before the built-in inet type got a gist index method, ip4r was _the_ way to do ip block lookups for geoip etc.) -- Andrew (irc:RhodiumToad)
Re: How to always run UPDATE FROM despite missing records in the source table?
> "Alexander" == Alexander Farber writes: Alexander> even though I wonder what is the (VALUES Alexander> ('20.20.20.20'::inet)) v(ip) construct there, some temporary Alexander> table which is then LEGT JOINed to the geoip table? The SQL spec calls it a . The VALUES clause behaves like a SELECT that returns a fixed number of rows (1 or more) whose columns contain the results of the specified expressions. The v(ip) part is just a table and column alias (I omitted the optional AS keyword out of long habit) to name the constructed table. -- Andrew (irc:RhodiumToad)
Re: lost "left join"
> "Олег" == Олег Самойлов writes: Олег> Hi, all. Олег> I got some mystic behaviour of PostgreSQL, perhaps this is a bug. Feature, actually. Олег> But things begin be strange if I add validation by time. Олег> => explain select * from node as parent left join link on Олег> parent.node_id=link.parent left join node as child on Олег> link.child=child.node_id where parent.node_id=1 and current_date Олег> <@ parent.valid and current_date <@ link.valid and current_date Олег> <@ child.valid; The problem here is that (for example) child.valid is null if there was no matching child row in the join, and the planner knows that x <@ NULL is not true (since the chosen <@ operator is defined as strict), and therefore it knows that the left join is unnecessary and can be reduced to an inner join. At least 90% of the time when you refer to values from the nullable side of a join in a WHERE clause, you're making a mistake (e.g. the condition should have been in the ON clause instead). The other 10% or less of the time, you have to make sure you use non-strict conditions, i.e. take account of the fact that the values might be null. -- Andrew (irc:RhodiumToad)
Re: Weird behaviour of ROLLUP/GROUPING
> "Guillaume" == Guillaume Lelarge writes: Guillaume> CASE grouping(to_char(b, 'MM')) Guillaume> ERROR: arguments to GROUPING must be grouping expressions of the Guillaume> associated query level Guillaume> LINE 3: CASE grouping(to_char(b, 'MM')) WHEN 1 THEN 'some date' ... Guillaume> AFAICT, both queries should behave the same, though their Guillaume> actual behaviours are quite opposite. Working fine for the Guillaume> first, erroring out on the second. Guillaume> Does anyone has any idea what's going on here? Not yet. But I will find out, since it seems to be a bug. -- Andrew (irc:RhodiumToad)
Re: Weird behaviour of ROLLUP/GROUPING
>>>>> "Andrew" == Andrew Gierth writes: >>>>> "Guillaume" == Guillaume Lelarge writes: Guillaume> CASE grouping(to_char(b, 'MM')) Guillaume> ERROR: arguments to GROUPING must be grouping expressions of the Guillaume> associated query level Guillaume> LINE 3: CASE grouping(to_char(b, 'MM')) WHEN 1 THEN 'some date' ... Guillaume> AFAICT, both queries should behave the same, though their Guillaume> actual behaviours are quite opposite. Working fine for the Guillaume> first, erroring out on the second. Guillaume> Does anyone has any idea what's going on here? Andrew> Not yet. But I will find out, since it seems to be a bug. It is definitely a bug, to do with assignment of collations. It specifically happens when you use GROUPING which contains any subexpression of a collatable type, inside a CASE expr WHEN clause, since that assigns collations to the expression much earlier in parsing than the rest of the query, so the code that validates GROUPING ends up trying to compare an expression which has had collations assigned to it to one which has not, and so it thinks they differ. I will see about fixing this, somehow. -- Andrew (irc:RhodiumToad)
Re: Question about array_to_string()'s behavior and the behavior might be changed in the future or not
> "Egashira" == Egashira, Yusuke writes: Egashira> Hi, Egashira> I have a question about array_to_string(). Egashira> I think array_to_string() concatenates always array elements Egashira> from the "beginning" of the array elements and this rule will Egashira> not be changed on the future minor releases. Egashira> Is my understanding correct? The order of elements in the original array will always be preserved in the string output, yes. Egashira> For example, I understand that array_to_string() "never" Egashira> output like following result now and in the future, is my Egashira> understanding correct? Yes. -- Andrew (irc:RhodiumToad)
Re: Weird behaviour of ROLLUP/GROUPING
> "Guillaume" == Guillaume Lelarge writes: >> I will see about fixing this, somehow. Guillaume> Thanks a lot. I've committed a fix (to all supported branches, since this bug actually precedes the addition of GROUPING SETS and can be triggered with a simple GROUP BY if you try hard enough). The regression test says it works now, but it'd be good if you could try it again on REL_11_STABLE (at commit e74d8c5085 or later) to check that it fixes your case. -- Andrew (irc:RhodiumToad)
Re: commit within a procedure loop - cannot commite with subtransaction
> "andyterry" == andyterry writes: andyterry> Hi, andyterry> Using a procedure for the first time to run some processing andyterry> for each row in a table, generating output to a target andyterry> table. The following works without COMMIT the example below andyterry> gives: andyterry> INFO: Error Name:cannot commit while a subtransaction is active andyterry> INFO: Error State:2D000 What client are you using to execute this? Some clients may insert SAVEPOINT statements (which create subtransactions) behind your back (e.g. psql with \set ON_ERROR_ROLLBACK does this) in order to recover from errors without aborting the whole transaction. (turn on log_statement=all in your config, or for the user you're executing this as, and look for the command in the server log) -- Andrew (irc:RhodiumToad)
Re: Implementing an expanded object in C
> "Michel" == Michel Pelletier writes: Michel> Replying to my own problem here, I guess this was a situation Michel> where explaining it in detail revealed the problem to me. By Michel> specifying my type is 'passedbyvalue' That cannot possibly be appropriate. Michel> CREATE TYPE matrix ( Michel> internallength = 8, Your type is clearly not a fixed-length type, because fixed-length types cannot have expanded datums. A fixed-length type must contain its entire representation within the fixed length - it is not allowed to be a pointer to something else. -- Andrew (irc:RhodiumToad)
Re: Displaying Comments in Views
> "Susan" == Susan Hurst writes: Susan> What is the trick for displaying column comments in views? The Susan> query below works as expected when the table_schema includes Susan> tables, however it shows nothing when the table_schema contains Susan> only views. I tried putting the query into an inline statement Susan> as a column selection in a wrapper query...I got all the Susan> table/column data but the comment column values were all null. Susan> from pg_catalog.pg_statio_all_tables st That is the wrong place to look for the purposes of this query, since as the name implies it only shows tables (and not views, since non-materialized views don't have or need I/O statistics). Also, it's _NOT_ the place to look when you just want a list of tables in the db or to look up tables by oid; use pg_class for that. I'd have gone with something along the lines of: select n.nspname as table_schema, c.relname as table_name, a.attname as column_name, pd.description as description from pg_class c join pg_namespace n on (n.oid=c.relnamespace) join pg_attribute a on (a.attrelid=c.oid and a.attnum > 0 and not a.attisdropped) join pg_description pd on (pd.classoid='pg_class'::regclass and pd.objoid=c.oid and pd.objsubid=a.attnum) where n.nspname = 'devops' order by n.nspname, c.relname, a.attname; -- Andrew (irc:RhodiumToad)
Re: postgresql v11.1 Segmentation fault: signal 11: by running SELECT... JIT Issue?
> "pabloa98" == pabloa98 writes: pabloa98> the table baseline_denull has 1765 columns, Uhh... #define MaxHeapAttributeNumber 1600/* 8 * 200 */ Did you modify that? (The back of my envelope says that on 64bit, the largest usable t_hoff would be 248, of which 23 is fixed overhead leaving 225 as the max null bitmap size, giving a hard limit of 1800 for MaxTupleAttributeNumber and 1799 for MaxHeapAttributeNumber. And the concerns expressed in the comments above those #defines would obviously apply.) -- Andrew (irc:RhodiumToad)
Re: postgresql v11.1 Segmentation fault: signal 11: by running SELECT... JIT Issue?
> "pabloa98" == pabloa98 writes: pabloa98> I did not modify it. Then how did you create a table with more than 1600 columns? If I try and create a table with 1765 columns, I get: ERROR: tables can have at most 1600 columns -- Andrew (irc:RhodiumToad)
Re: postgresql v11.1 Segmentation fault: signal 11: by running SELECT... JIT Issue?
> "pabloa98" == pabloa98 writes: pabloa98> I found this article: pabloa98> https://manual.limesurvey.org/Instructions_for_increasing_the_maximum_number_of_columns_in_PostgreSQL_on_Linux Those instructions contain obvious errors. pabloa98> It seems I should modify: uint8 t_hoff; pabloa98> and replace it with something like: uint32 t_hoff; or uint64 t_hoff; At the very least, that ought to be uint16 t_hoff; since there is never any possibility of hoff being larger than 32k since that's the largest allowed pagesize. However, if you modify that, it's then up to you to ensure that all the code that assumes it's a uint8 is found and fixed. I have no idea what else would break. -- Andrew (irc:RhodiumToad)
Re: Help : Update and insert record based on several value in the parameter
> "Hengky" == Hengky Lie writes: Hengky> Hi, Hengky> I want to create a function to update my table (flag='9') and Hengky> insert new record (flag='0') with the rfidnumber specified in a Hengky> parameter. rfidnumber is stored as text/varchar? if it's something else, change the suggestions below accordingly. Hengky> This parameter may have several value seperated by space (ie. Hengky> 11 22 33 44) update ... and rfidnumber = any (string_to_array(znumber,' ')) Or, do the split just once: DECLARE z_ids text[] := string_to_array(znumber, ' '); BEGIN update ... where flag='0' and rfidnumber =any (z_ids); insert into ... select localtimestamp, '0', id from unnest(z_ids) as u(id); -- Andrew (irc:RhodiumToad)
Re: Date calculation
> "Ron" == Ron writes: Ron> Hi, Ron> v9.6.6 Ron> Is there a built in function to calculate, for example, next Ron> Sunday? No, but such things aren't hard to calculate using the available primitives. To get "next Xday", for example, you can add 7 days and then do "previous or current Xday". In turn, "previous or current Xday" can be done by subtracting (X-Monday), doing date_trunc 'week', and adding (X-Monday) again. select current_date, date_trunc('week', (current_date + 7 - 6)::timestamp)::date + 6; current_date | ?column? --+ 2019-01-31 | 2019-02-03 (1 row) If you do this sort of thing a lot, then define your own functions for it: -- create this to override the cast to timestamptz that otherwise messes -- things up: create function date_trunc(text,date) returns date language sql immutable as $f$ select date_trunc($1, $2::timestamp)::date; $f$; -- perfect hash function for weekday names, with Monday=0 -- (accepts upper, lower or mixed case) create function dayno(text) returns integer language sql immutable as $f$ select (( ((ascii(substring($1 from 3)) & 22)*10) # (ascii($1) & 23) )*5 + 2) % 7; $f$; create function next_dow(start_date date, day_name text) returns date language sql immutable as $f$ select date_trunc('week', (start_date + 7 - dayno(day_name))) + dayno(day_name); $f$; select current_date, next_dow(current_date, 'Thursday'), next_dow(current_date, 'Friday'); current_date | next_dow | next_dow --++ 2019-01-31 | 2019-02-07 | 2019-02-01 -- Andrew (irc:RhodiumToad)
Re: Date calculation
> "Bruce" == Bruce Momjian writes: Bruce> Oh, right, you want date, so use: Bruce> SELECT date_trunc('week', CURRENT_DATE) + '6 days'; Three major things wrong with this: 1. If you do this on Sunday, it gives you the current day not the _next_ Sunday. 2. If you try and do this for other days of the week it doesn't work at all, instead giving you the specified day of the current week whether or not it's before or after the current day. 3. It's letting PG cast the date to a timestamptz, which is inefficient, possibly incorrect, and mutable; you want to force it to cast to timestamp without timezone instead. (A good rule of thumb is that you virtually never want to cast dates to timestamptz; the natural cast from date is to timestamp _without_ timezone.) -- Andrew (irc:RhodiumToad)
Re: Unused files in the database directory after crashed VACUUM FULL
> "Tom" == Tom Lane writes: > Hannes Erven writes: >> I've just had a "VACUUM FULL " crash due to 100% disk usage. >> Clearly my fault, I was expecting the new table to be small enough. Tom> What do you mean by "crash" exactly? A normal transactional Tom> failure should've cleaned up orphaned files. I suppose if the Tom> kernel decided to kill -9 the vacuum process, that wouldn't happen Tom> --- but that's not the usual response to out-of-disk-space. The failure mode of this kind we see reported most often is getting a PANIC failure from inability to extend the WAL. -- Andrew (irc:RhodiumToad)
Re: Subquery to select max(date) value
> "Rich" == Rich Shepard writes: Rich> Will probably try DISTINCT ON, too, if that makes it simpler or Rich> faster. You want LATERAL. -- Andrew (irc:RhodiumToad)
Re: Subquery to select max(date) value
> "Adrian" == Adrian Klaver writes: Adrian> Close to your last posted query. person_id 2 and 3 have NULL Adrian> values for activities data as there is no record for 2 and 3 is Adrian> out of the date range.: Adrian> select Adrian>p.person_id, Adrian>p.desc_fld, Adrian>a.next_contact Adrian> from Adrian>people as p Adrian>LEFT JOIN ( Adrian>SELECT Adrian>DISTINCT ON (person_id) [...] Adrian>) a USING (person_id) Adrian> ; DISTINCT ON with no matching ORDER BY at the _same_ query level is non-deterministic. Also DISTINCT ON isn't efficient. Consider instead something along the lines of: select p.*, a.* -- for illustration from people p join lateral (select * from activities a1 where a1.person_id = p.person_id and a1.next_contact > '2018-12-31' and a1.next_contact <= 'today' order by a1.next_contact desc limit 1) a on true; (make sure to have an index on activities(person_id,next_contact)) -- Andrew (irc:RhodiumToad)
Re: Size estimation of postgres core files
> "Jeremy" == Jeremy Finzel writes: Jeremy> Yes Linux. This is very helpful, thanks. A follow-up question - Jeremy> will it take postgres a really long time to crash (and Jeremy> hopefully recover) if I have say 1T of RAM because it has to Jeremy> write that all out to a core file first? It doesn't write out all of RAM, only the amount in use by the particular backend that crashed (plus all the shared segments attached by that backend, including the main shared_buffers, unless you disable that as previously mentioned). And yes, it can take a long time to generate a large core file. -- Andrew (irc:RhodiumToad)
Re: Subquery to select max(date) value
> "Rich" == Rich Shepard writes: Rich> I found a couple of web pages describing the lateral join yet Rich> have not correctly applied them. The manual's page did not help Rich> me get the correct syntax, either. Think I'm close, however: Rich> select p.person_id, p.lname, p.fname, p.direct_phone, o.org_name, a.next_contact Rich> from people as p, organizations as o Rich> lateral Rich> (select a.next_contact LATERAL (SELECT ...) is syntactically like (SELECT ...) in that it comes _after_ a "," in the from-clause or after a [LEFT] JOIN keyword. Don't think of LATERAL as being a type of join, think of it as qualifying the (SELECT ...) that follows. Rich> from activities as a Rich> where a.next_contact is not null and a.next_contact <= 'today' and Rich> a.next_contact > '2018-12-31' You'd want a condition here that references the "people" table; the whole point of LATERAL is that it opens up the scope of column references in the subquery to include those tables which are to its left in the from-clause. Rich> order by person_id,next_contact); and I'm guessing you want that ordered by next_contact alone, possibly with LIMIT 1 to get just the nearest following next_contact time. -- Andrew (irc:RhodiumToad)
Re: Subquery to select max(date) value
> "Rich" == Rich Shepard writes: Rich> Using LIMIT 1 produces only the first returned row. This Rich> statement (using max() for next_contact) produces no error Rich> message, but also no results so I killed the process after 30 Rich> seconds. Without a syntax error for guidance I don't know how to Rich> proceed. I've not before run 'explain' on a query. Would that be Rich> appropriate here? Yes. Rich> select p.person_id, p.lname, p.fname, p.direct_phone, o.org_name Rich> from people as p, organizations as o, Rich> lateral Rich> (select p.person_id, p.lname, p.fname, p.direct_phone, o.org_name, Rich> max(a.next_contact) Rich> from people as p, organizations as o, activities as a Rich> where a.next_contact > '2018-12-31' and Rich> a.next_contact <= 'today' and Rich> a.next_contact is not null Rich> group by p.person_id, o.org_name, a.next_contact Rich> order by p.person_id, o.org_name, a.next_contact) sq; The problem here is that you have no join conditions at all, so the result set of this query is massive. And you've duplicated many tables inside the subquery which is not necessary or appropriate. select p.person_id, p.lname, p.fname, p.direct_phone, o.org_name, sq.* from people as p join organizations as o on p.organization_id=o.id -- OR WHATEVER cross join lateral (select a.next_contact from activities as a where a.person_id=p.person_id --VERY IMPORTANT and a.next_contact > '2018-12-31' and a.next_contact <= 'today' and a.next_contact is not null order by a.next_contact DESC limit 1) sq; Ordering by DESC with a limit 1 is used to get the max next_contact value rather than the smallest; this is similar to max(), but makes it trivial to also access the other columns of the _same_ activities row which is being selected. -- Andrew (irc:RhodiumToad)
Re: HAVING query structured wrong
> "Chuck" == Chuck Martin writes: Chuck> I am trying to create a query that returns all transactions for Chuck> each person who has a balance over a given amount. I thought Chuck> HAVING was the answer, but if so, I'm mis-using it. This query Chuck> returns only transactions exceeding the given amount rather than Chuck> transactions for people whose balance is over the amount: [snip] Chuck> Since that returned the wrong set of records, I created another Chuck> that returns the correct set of people with balances over the Chuck> given amount. But I can't figure out how to use this to get all Chuck> the transactions for people returned by this query: Chuck> SELECT case_pkey Chuck> FROM trans,ombcase,status Chuck> WHERE case_fkey = case_pkey Chuck> AND status_fkey = status_pkey Chuck> AND statusopen = 1 Chuck> AND transistrust <> 1 Chuck> AND transcleared <> 1 Chuck> GROUP BY case_pkey Chuck> HAVING sum(transamount) >= 50 Chuck> ORDER BY case_pkey Chuck> So how do I get all transactions for each case_pkey? You can join the result of any subquery as if it were a table, either with or without using a CTE: SELECT ... FROM (select case_pkey from ... having ...) AS cases, trans WHERE trans.case_fkey = cases.case_pkey; (incidentally, please qualify all the column references in your query with a table name or alias, otherwise people reading your code have no idea which column is supposed to be in which table) or with a CTE, WITH cases AS (select ... from ... having ...) SELECT ... FROM cases, trans WHERE trans.case_fkey = cases.case_pkey; There's also a third method with window functions instead of GROUP BY, which is to do something like SELECT ... FROM (select ..., sum(transamount) over (partition by case_pkey) as total_amt from ...) s WHERE total_amt > 50; -- Andrew (irc:RhodiumToad)
Re: RECURSIVE allowed only ONCE in a CTE
> "Jitendra" == Jitendra Loyal writes: Jitendra> I find that the RECURSIVE can be used only once in a CTE. RECURSIVE can be specified only once, but it applies to all CTEs at that level. That is to say, RECURSIVE qualifies the preceding WITH, _not_ the following CTE. Note that just specifying RECURSIVE doesn't mean that any CTE is recursive, it simply changes the name scoping rules such that CTEs _can_ be recursive. (Without it, a CTE's name is not in scope in its own body so recursion is impossible.) -- Andrew (irc:RhodiumToad)
Re: crosstab function
> "Martin" == Martin Mueller writes: Martin> I run Postgres 10.5. I understand that there is something Martin> called tablefunc and it includes a crosstab function. On Stack Martin> Overflow I learn that you import this function. But from where Martin> and how? The Postgres documentation is quite clear and Martin> intelligible to a retired English professor like me, but there Martin> is nothing in the Postgres documentation about how to do the Martin> import, and a search for ‘import modules’ yields nothing. Martin> I tried to emulate a Stack overflow query that does what I want Martin> to do, but got an error message saying that the crosstab Martin> function doesn’t exist. I tried Martin> CREATE EXTENSION IF NOT EXISTS tablefunc; Martin> but it did nothing. It doesn’t seem to work as import Martin> statements in Python do CREATE EXTENSION causes the functions to be defined in the current database; that's the only "import" that is needed. In psql, you can do this: \df+ *.crosstab* to see what functions of that name are defined and what schema they are in (should be "public" by default). Likewise \dx lists installed extensions in the current database. Remember that you need to do the CREATE EXTENSION command actually in the database in which you want to use the functions, not in any other database. Errors about functions not existing are usually caused by a problem with the number or type of parameters, since function names aren't unique (overloaded functions are allowed). You didn't tell us the actual error you got, so we can't say exactly what the problem is there. -- Andrew (irc:RhodiumToad)
Re: Overloaded && operator from intarray module prevents index usage.
> "Thomas" == Thomas Kellerer writes: [intarray woes] Thomas> Is this expected behaviour? Is this caused by the Postgres core Thomas> (e.g. the optimizer to taking the opclass into account) or is Thomas> it a "problem" in the way the intarray module defines its Thomas> operators? It's basically a conflict between intarray (which is historically older) and the built-in array indexing support. The reason it happens is that the operator resolution logic matches an (integer[] && integer[]) operator in preference to (anyarray && anyarray) regardless of their relative position on the search_path. This resolution happens before anything is known about any indexes that might be applicable. Then later, at planning time, an index is chosen based on the operator, not the reverse. My own recommendation for most cases is to never install intarray on the search path, and invoke its functions via explicit qualification or wrap them in your own functions. -- Andrew (irc:RhodiumToad)
Re: Overloaded && operator from intarray module prevents index usage.
> "Ron" == Ron writes: Ron> Arrays are -- by definition -- not atomic, and so they Ron> fundamentally break the model that relational databases are Ron> founded upon. If you want to be a good database designer, don't Ron> use arrays. "In theory there is no difference between theory and practice, but in practice there is." Sometimes a good database designer has to tell the theoreticians where to get off, and do something more pragmatic. -- Andrew (irc:RhodiumToad)
Re: Camel case identifiers and folding
> "Morris" == Morris de Oryx writes: Morris> UUIDs as a type are an interesting case in Postgres. They're Morris> stored as a large numeric for efficiency (good!), but are Morris> presented by default in the 36-byte format with the dashes. Morris> However, you can also search using the dashes 32-character Morris> formatand it all works. Case-insensitively. That works because UUIDs have a convenient canonical form (the raw bytes) which all input is converted to before comparison. Text is ... not like this. Even citext is really only a hack - it assumes that comparisons can be done by conversion to lowercase, which may work well enough for English but I'm pretty sure it does not correctly handle the edge cases in, for example, German (consider 'SS', 'ss', 'ß') or Greek (final sigma). Doing it better would require proper application of case-folding rules, and even that would require handling of edge cases (the Unicode case folding algorithm is designed to be language-independent, which means that it breaks for Turkish without special-case exceptions). -- Andrew (irc:RhodiumToad)
Re: printing JsonbPair values of input JSONB on server side?
> "T" == T L writes: T> //Problem line!!! T> //elog(NOTICE, "print_kv_pair(): k = %s, v = %s", T> ptr-> key.val.string.val, numeric_out(ptr->value.val.numeric)); string.val isn't a C string (notice the "not null terminated" comment in the structure definition), and you can't call numeric_out like that. Either of those would crash it. You could use pnstrdup to get a valid C string, and use DatumGetCString(DirectFunctionCall1( numeric_out, NumericGetDatum(ptr->value.val.numeric))) to get the numeric value as a C string. -- Andrew (irc:RhodiumToad)
Re: printing JsonbPair values of input JSONB on server side?
> "T" == T L writes: T> //problem lines!!! //either elog crashes pg server T> char *buf = pnstrdup(ptr->key.val.string.val, T> ptr-> key.val.string.len); T> elog(NOTICE, "print_kv_pair(): k = %s", T> (ptr->key).val.string.val); //debug It doesn't help to make a null-terminated copy of the string if you're then just going to try and print the original. elog(NOTICE, "print_kv_pair(): k = %s", buf); T> elog(NOTICE, "print_kv_pair(): v = %s", T> DatumGetCString(DirectFunctionCall1(numeric_out, T> NumericGetDatum(ptr->value.val.numeric))) ); //debug That should work, _provided_ that value.type == jbvNumeric - did you consider checking that first? -- Andrew (irc:RhodiumToad)
Re: printing JsonbPair values of input JSONB on server side?
> "T" == T L writes: T> Below is my test. It prints a strange character instead of "a"; and T> says that the value isn't numeric. Yeah, there's plenty else wrong with your code. Did you look at how JsonbToCStringWorker does it? that looks like the best example I can find on a quick scan. -- Andrew (irc:RhodiumToad)
Re: FreeBSD 12 and Postgres build failures
> "Karl" == Karl Denninger writes: Karl> Both 10.7 and 11.2 fail on FreeBSD 12-STABLE if --with-openssl is Karl> part of the configure string. Nope, they work fine for me, running under 12-stable as of a couple of weeks ago. -- Andrew (irc:RhodiumToad)
Re: FreeBSD 12 and Postgres build failures
> "Karl" == Karl Denninger writes: Karl> That was on a bare, new grab of the source from your repository and a Karl> straight-up run of "./configure --with-openssl --with-perl", then "gmake". That's a configure line that has never worked on FreeBSD: % ./configure --with-openssl --with-perl [...] configure: error: readline header not found If you have libedit already installed, see config.log for details on the failure. It is possible the compiler isn't looking in the proper directory. Use --without-readline to disable libedit support. % (If you don't use --without-readline then you must add --with-includes=/usr/local/include and --with-libs=/usr/local/lib to pick up either ports readline or ports libedit. Or you might be able to use --with-includes=/usr/include/edit to get base system libedit, but that's not really the recommended method. Besides libedit sucks for actual use so you want readline anyway.) If you didn't explicitly specify any of those, but configure found a readline to use anyway, then it means you have defined some non-standard compiler options or have messed with the include paths in some other way, which could be relevant to your problem. -- Andrew (irc:RhodiumToad)
Re: FreeBSD 12 and Postgres build failures
> "Karl" == Karl Denninger writes: >> That's a configure line that has never worked on FreeBSD: >> >> % ./configure --with-openssl --with-perl >> [...] >> configure: error: readline header not found >> If you have libedit already installed, see config.log for details on the >> failure. It is possible the compiler isn't looking in the proper directory. >> Use --without-readline to disable libedit support. >> % Karl> Uh, it has worked on all my FreeBSD systems for a very long time; Then you have done something non-standard, because "cc" on FreeBSD does not look in /usr/local/include or /usr/local/lib without being told to do so, and the configure checks for readline will fail without that. Possible explanations include having a "gcc" (without a specific version suffix) on the $PATH (gcc will be chosen over cc in that case), or having configured CFLAGS or CPPFLAGS in the environment to ./configure. You might post your full output of ./configure rather than leaving us to guess at it. -- Andrew (irc:RhodiumToad)
Re: FreeBSD 12 and Postgres build failures
> "Karl" == Karl Denninger writes: Karl> "cc" on FreeBSD is now (and has been for quite some time) clang: Karl> If it will compile Postgres now it definitely would not when it Karl> was first shifted to; People have been building PG with clang since at least as far back as clang 3.1, and the FreeBSD 10.0 (which I believe was the first FreeBSD release where clang was the default) package builds of PG were built with clang 3.3. (Now, of course, clang is _required_ if you want to try out the new JIT features.) Building with gcc on FreeBSD is not the default on a clean system, and as we've recently seen with another report, there are issues with mixing the gcc and llvm toolchains on some architectures that are not in any way PG's problem to solve. (Though the fact that we prefer gcc to cc in configure if both are found is a decision that should probably be revisited, given that the reasons for that preference are mostly consigned to the graveyard of commercial Unix variants.) You might also have considered that the fact that package builds exist for pg 10 and 11 on FreeBSD 12 is a demonstration that building them is, in fact, possible on a clean system... Karl> root@NewFS:/home/karl # which gcc Karl> /usr/local/bin/gcc I had not previously noticed that the lang/gcc metaport (which I have never installed, though I have lang/gcc8 installed at the moment) installs a symlink to the selected gcc version as just "gcc". That explains a lot. -- Andrew (irc:RhodiumToad)
Re: Subquery to select max(date) value
> "Rich" == Rich Shepard writes: Rich> Tried this and did not do it correctly. Should there be two Rich> 'order by', one in the sub-query, the other in the top-level Rich> query? Yes. Rich> This does not return the desired order: Rich> select p.person_id, p.lname, p.fname, p.direct_phone, p.active, o.org_name, sq.* Rich> from people as p Rich> join organizations as o on p.org_id = o.org_id Rich> cross join Rich> lateral Rich> (select a.next_contact Rich> from activities as a Rich> where a.person_id = p.person_id and Rich> p.active='True' and Rich> a.next_contact is not null Rich> order by a.next_contact DESC Rich> limit 1) sq Rich> order by sq.next_contact DESC; That query seems correct assuming you want the result in descending order of next_contact. How did the actual result differ from your expectation? -- Andrew (irc:RhodiumToad)
Re: Postgres comparison bugfixes between arbitrary versions
> "Thomas" == Thomas Kellerer writes: Thomas> Hello, Thomas> some time ago someone published a website where it was possible Thomas> to select two arbitrary Postgres version and then see a list of Thomas> Bugfixes (and features) that are missing in the older version Thomas> of the two. why-upgrade.depesz.com -- Andrew (irc:RhodiumToad)
Re: Computed index on transformation of jsonb key set
> "Steven" == Steven Schlansker writes: Steven> I figured I'd end up with significantly better storage and Steven> performance characteristics if I first compute a uuid[] value Steven> and build the GIN over that, and use the array operator class Steven> instead. Additionally, this eliminates possible confusion about Steven> uuid casing (text is case sensitive, uuid is not) and this has Steven> already caused at least one bug in our application. Steven> I attempted to optimize a query like: Steven> select * from tbl where array(select jsonb_object_keys(mapData)::uuid) && array['320982a7-cfaa-572a-b5ea-2074d7f3b014'::uuid]; Obvious solution: create function uuid_keys(mapData jsonb) returns uuid[] language plpgsql immutable strict as $$ begin return array(select jsonb_object_keys(mapData)::uuid); end; $$; create index on tbl using gin (uuid_keys(mapData)); select * from tbl where uuid_keys(mapData) && array[...]; -- Andrew (irc:RhodiumToad)