Re:
On 2022-06-17 10:38:57 +0800, Abdul Qoyyuum wrote: > Best if you just maintain the logs into a log file instead of a database > table. That depends on what you want to use those logs for. If you just want to look at them when something goes wrong, I agree: Logs files are simple, fast to write, take little space (and usually compress well), can be trimmed with a text editor and attached to a ticket or sent by mail ... But if you want to process them further (generate statistics or other reports, anonymize them, combine them with data from other sources, ...), it is useful to store them in some structured format (aka a "database"). That doesn't necessarily mean a relational database, it could be a nosql database (for example, ElasticSearch is very popular for storing logs), but if you are already using PostgreSQL in your project, using it for logs suggests itself. hp -- _ | Peter J. Holzer| Story must make more sense than reality. |_|_) || | | | h...@hjp.at |-- Charles Stross, "Creative writing __/ | http://www.hjp.at/ | challenge!" signature.asc Description: PGP signature
Re: '{"x": 42, "y": null}'::jsonb != '{"x": 42}'::jsonb ... Really?
> On 18 Jun 2022, at 2:14, Bryn Llewellyn wrote: > > I implemented two complementary functions: > > —"no_null_keys()" checks that a "jsonb" value has no occurrences of « "some > key": null » > > —"strip_null_keys()" removes « "some key": null » occurrences from a "jsonb" > value > > The code checks with "no_null_keys()" that, as expected, no ingested JSON > document has an occurrence of « "some key": null ». > > And it uses "strip_null_keys()" on the output of "to_jsonb()" — and, as > appropriate, any other built-in JSON function that produces a "jsonb" value. > > It was straightforward to implement these two functions by using REGEXP > built-in functionality on the canonically formatted "text" value produced by > the "jsonb::text" typecast. In my experience, using regular expressions applied to document formats tends to get you false positives. I’d be worried about books with titles similar to 'How we wrote a regular expression to detect occurrences of "some key": null in our JSON documents', for example. For stripping those null occurrences, you are aware of the json_strip_nulls(json) and jsonb_strip_nulls(jsonb) functions, right? For detecting them on a recent PG, the @? operator or json_path_exists(json, jsonpath) functions would probably do the trick. I am not too familiar with JSONPATH expressions, but I expect (it passed some preliminary testing) this would detect your nulls just fine, while taking JSON semantics into account: jsonb_path_exists(document, '$.** ? (@ == null)'::jsonpath) For PG-specifics on JSONPATH, see section 9.16.2 on: https://www.postgresql.org/docs/current/functions-json.html#FUNCTIONS-SQLJSON-OP-TABLE A recursive query is another possible solution. It would probably perform far worse, but I find them more rewarding to write. Some people prefer Sodoku. Regards, Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll find there is no forest.
Re: ISBN (was: '{"x": 42, "y": null}'::jsonb != '{"x": 42}'::jsonb ... Really?)
> hjp-pg...@hjp.at wrote: > >> Bryn wrote: >> >> "isbn" — string >> values must be unique across the entire set of documents (in other words, it >> defines the unique business key); values must have this pattern: >> >> « ^[0-9]{3}-[0-9]{1}-[0-9]{2}-[0-9]{6}-[0-9]{1}$ » > > Completely off-topic, but this regexp doesn't describe ISBNs. In ISBNs the > three middle subfields are all variable length. The first is a language code > (there are more than 10 languages in the world), the second identifies the > publisher (there are more than 100 publishers) and the third the book. For > example, "Mastering PostgreSQL 9.6" has the ISBN 978-1-78355-535-2. Yes, I know. Sorry. I should have said that I simply wanted to illustrate a proof of concept for the notion, viable (only?) when you have incoming JSON documents with a well-defined schema, that this is idempotent: JSON → relational → JSON And it's of note that PostgreSQL has had what you need to do the xform, in each direction, for a long time. I simply typed up my "corpus" by hand. It didn’t matter that the ISBN format was a fairly tale. A book does have an ISBN and (as far as my P.o.C. is concerned) it's the unique business key. I should have cut out the REGEXP in my email and said something like "must conform to the specified format". But I was careless in my proof reading. Thanks for not mentioning that books have publishers—so that "authors" isn't the only key that implies a Codd-and-Date many-to-many!
Re: '{"x": 42, "y": null}'::jsonb != '{"x": 42}'::jsonb ... Really?
> haram...@gmail.com wrote > >> b...@yugabyte.com wrote: >> >> I implemented two complementary functions: >> >> —"no_null_keys()" checks that a "jsonb" value has no occurrences of « "some >> key": null » >> >> —"strip_null_keys()" removes « "some key": null » occurrences from a "jsonb" >> value >> >> The code checks with "no_null_keys()" that, as expected, no ingested JSON >> document has an occurrence of « "some key": null ». >> >> And it uses "strip_null_keys()" on the output of "to_jsonb()" — and, as >> appropriate, any other built-in JSON function that produces a "jsonb" value. >> >> It was straightforward to implement these two functions by using REGEXP >> built-in functionality on the canonically formatted "text" value produced by >> the "jsonb::text" typecast. > > In my experience, using regular expressions applied to document formats tends > to get you false positives. I’d be worried about books with titles similar to: > > « > How we wrote a regular expression to detect occurrences of "some key": null > in our JSON documents > » > > For stripping those null occurrences, you are aware of the > json_strip_nulls(json) and jsonb_strip_nulls(jsonb) functions, right? > > For detecting them on a recent PG, the @? operator or json_path_exists(json, > jsonpath) functions would probably do the trick. > I am not too familiar with JSONPATH expressions, but I expect (it passed some > preliminary testing) this would detect your nulls just fine, while taking > JSON semantics into account: > > jsonb_path_exists(document, '$.** ? (@ == null)'::jsonpath) > > For PG-specifics on JSONPATH, see section 9.16.2: > https://www.postgresql.org/docs/current/functions-json.html#FUNCTIONS-SQLJSON-OP-TABLE > > A recursive query is another possible solution. It would probably perform far > worse... Thank you so much, Alban, for taking an interest in my « "some key": null » saga. The "G", "E", "L", and "O" keys on my keyboard were broken. They're fixed now, and so "postgres strip keys with null value from jsonb object" got me to this (but, strangely, not to the PG docs): docs.yugabyte.com/preview/api/ysql/datatypes/type_json/functions-operators/jsonb-strip-nulls/ This is embarrassing. I wrote that doc, along with the rest of the content in the "JSON data types and functionality" section just before COVID hit us. Apparently, life under lockdown has damaged my corpus callosum. I typed up my proof-of-concept code and emails with one half of my brain—and it was the other half that wrote that documentation. So thanks for the memory jog. My excuse is that (as was the case with my malformed ISBN that Peter Holzer pointed out in a separate thread) I wanted just to show myself, as a proof-of-concept, that stripping nulls was feasible—so I gave it no more thought once I'd done that. But, I suppose, that's not excuse... Anyway, my "strip_null_keys()" is already on the scrapheap. And the body of my "no_null_keys()" reduced to a single line: create function no_null_keys(j in jsonb) returns boolean immutable language sql as $body$ select j = jsonb_strip_nulls(j); $body$; You might argue that I don't need to bother with the encapsulation. But it makes testing easier—and I'm trusting that inlining works as advertised. Your point about false positives is well taken. So, just for sport: create type t1 as (k int, v text); create type t2 as (a int, b int, c t1, d t1, e text[]); create function j() returns jsonb language plpgsql as $body$ declare tconstant textnot null := 'How we wrote a regular expression to detect occurrences of « "some key": null » in our JSON documents!'; c1 constant t1 not null := (17, t); c2 constant t1 not null := (29, null); arr constant text[] not null := array['x', null::text, t]; rconstant t2 not null := (42, null, c1, c2, arr); begin return to_jsonb(r); end; $body$; select jsonb_pretty(j()); The output includes two occurrences of this: "How we wrote a regular expression to detect occurrences of « \"some key\": null » in our JSON documents!" I believe that the "jsonb" to "text" conversion never produces an isolated double-quote within the representation of an object key's value. I checked that my "strip_null_keys()" handled your example before consigning it to the scrapheap—and it didn't let me down. But it would be foolish to argue that there isn't some way to provoke a false positive.