re-novice coming back to pgsql: porting an SQLite update statement to postgres
The last time I used PostgreSQL for anything was about 15 years ago and I have only limited SQL background, so please consider this a novice question. I have an embedded SQLite database that I would like to port to PostgreSQL. I have done the majority of this porting, but am stuck on a final compound statement. This is for a hobby project. For background I'll give the SQLite schema and the statement in question, and also the PostgreSQL schema. The SQLite code here is also available to view in context at the links at the bottom of the post in case this is helpful.

SQLite schema:

    create table if not exists buckets (
        rowid    INTEGER PRIMARY KEY AUTOINCREMENT,
        id       TEXT UNIQUE NOT NULL,
        name     TEXT,
        type     TEXT NOT NULL,
        client   TEXT NOT NULL,
        hostname TEXT NOT NULL,
        created  TEXT NOT NULL, -- unix micro
        datastr  TEXT NOT NULL  -- JSON text
    );

    create table if not exists events (
        id        INTEGER PRIMARY KEY AUTOINCREMENT,
        bucketrow INTEGER NOT NULL,
        starttime INTEGER NOT NULL, -- unix micro
        endtime   INTEGER NOT NULL, -- unix micro
        datastr   TEXT NOT NULL,    -- JSON text
        FOREIGN KEY (bucketrow) REFERENCES buckets(rowid)
    );
    create index if not exists event_index_id ON events(id);
    create index if not exists event_index_starttime ON events(bucketrow, starttime);
    create index if not exists event_index_endtime ON events(bucketrow, endtime);

PostgreSQL schema:

    create table if not exists buckets (
        rowid    SERIAL PRIMARY KEY,
        id       TEXT UNIQUE NOT NULL,
        name     TEXT,
        type     TEXT NOT NULL,
        client   TEXT NOT NULL,
        hostname TEXT NOT NULL,
        created  TIMESTAMP WITH TIME ZONE NOT NULL,
        datastr  JSONB NOT NULL
    );

    create table if not exists events (
        id        SERIAL PRIMARY KEY,
        bucketrow INTEGER NOT NULL,
        starttime TIMESTAMP WITH TIME ZONE NOT NULL,
        endtime   TIMESTAMP WITH TIME ZONE NOT NULL,
        datastr   JSONB NOT NULL,
        FOREIGN KEY (bucketrow) REFERENCES buckets(rowid)
    );
    create index if not exists event_index_id ON events(id);
    create index if not exists event_index_starttime ON events(bucketrow, starttime);
    create index if not exists event_index_endtime ON events(bucketrow, endtime);

The part where I am having difficulty knowing where to start deals with the JSON data in the events table. This contains largely unstructured data, but can have an array of amendment notes added using the SQL statements below. These notes are essentially append-only alterations to the unstructured data, spanning a period of the unstructured data within the time bounds of the event table row.

My question is: where would be the best place for me to look to learn how to implement a port of this SQLite, and what would broadly be the most sensible approach to take (to narrow down what I need to read through in learning)?

Apologies for the long post, and thank you for any help.

Dan

SQLite statement:

    begin transaction;

    -- ensure we have an amend array.
    update events set datastr = json_insert(datastr, '$.amend', json('[]'))
    where
        datetime(starttime, 'subsec') < datetime(?5, 'subsec') and
        datetime(endtime, 'subsec') > datetime(?4, 'subsec') and
        bucketrow = (
            select rowid from buckets where id = ?1
        );

    update events set datastr = json_insert(datastr, '$.amend[#]', json_object(
        'time', ?2,
        'msg', ?3,
        'replace', (
            -- trim amendments down to original event bounds.
            select json_group_array(json_replace(value,
                '$.start', case
                    when datetime(starttime, 'subsec') > datetime(json_extract(value, '$.start'), 'subsec')
                    then starttime
                    else json_extract(value, '$.start')
                    end,
                '$.end', case
                    when datetime(endtime, 'subsec') < datetime(json_extract(value, '$.end'), 'subsec')
                    then endtime
                    else json_extract(value, '$.end')
                    end
            ))
            from json_each(?6)
            where
                datetime(json_extract(value, '$.start'), 'subsec') < datetime(endtime, 'subsec') and
                datetime(json_extract(value, '$.end'), 'subsec') > datetime(starttime, 'subsec')
        )))
    where
        datetime(starttime, 'subsec') < datetime(?5, 'subsec') and
        datetime(endtime, 'subsec') > datetime(?4, 'subsec') and
        bucketrow = (
            select rowid from buckets where id = ?1
        );

    commit;
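An aside on the schema port itself: the SQLite times are unix-micro integers, while the corresponding PostgreSQL columns are TIMESTAMP WITH TIME ZONE, so a data migration needs a conversion step somewhere. A sketch of one way to do it, assuming the SQLite rows have first been loaded into a staging table (sqlite_events_staging is hypothetical):

```sql
-- to_timestamp() takes (fractional) seconds since the epoch, so the
-- unix-micro values are divided by one million on the way across.
insert into events (bucketrow, starttime, endtime, datastr)
select bucketrow,
       to_timestamp(starttime / 1e6),
       to_timestamp(endtime / 1e6),
       datastr::jsonb   -- SQLite JSON text parsed into jsonb
from sqlite_events_staging;
```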
Re: re-novice coming back to pgsql: porting an SQLite update statement to postgres
On 7/23/24 13:11, Vincent Veyron wrote:
> On Mon, 15 Jul 2024 20:31:13 +
>
> This is the go-to page for anything SQL:
> https://www.postgresql.org/docs/current/sql-commands.html
>
> For DateTime types:
> https://www.postgresql.org/docs/current/datatype-datetime.html
>
> For JSON types:
> https://www.postgresql.org/docs/current/datatype-json.html

Thanks, I will work through those.

On Tue, 2024-07-23 at 23:52 +0200, Dominique Devienne wrote:
> On Tue, Jul 23, 2024 at 10:35 PM Adrian Klaver wrote:
> > Just know that SQLite does not enforce types [...]
>
> That's true, and applies to the OP's schema.

Thank you both. Yes, I was aware of this weirdness of the schema (I inherited it) and was shocked that it worked when I realised. I'll be happier when types are properly enforced, but I don't think I can retrospectively enforce that on the SQLite implementation I have.
Re: re-novice coming back to pgsql: porting an SQLite update statement to postgres
On Wed, 2024-07-24 at 00:23 +, Dan Kortschak wrote:
> On 7/23/24 13:11, Vincent Veyron wrote:
> > This is the go-to page for anything SQL:
> > https://www.postgresql.org/docs/current/sql-commands.html
> > [...]
>
> Thanks, I will work through those.

I've had a chance to attack this. The first part of the problem was that I could not figure out how to get the multiple statement transaction that I was using in SQLite to work with PostgreSQL. The solution was to use the host language's Postgres binding transaction functions and send the statements separately.

The first part, to ensure the JSON array exists, is solved with

    update events set datastr = jsonb_set(datastr, '{amend}', '[]')
    where
        starttime < $3 and
        endtime > $2 and
        not datastr::jsonb ? 'amend' and
        bucketrow = (
            select rowid from buckets where id = $1
        );

I'm still having difficulties with the second part, which is to update the contents of the amend array in the JSON. So far I'm able to append the relevant details to the amend array, but I'm unable to correctly select the correct elements from the $6 argument, which is in the form [{"start":,"end":,"data":}, ...]. The first update statement gives me broadly what I want, but includes elements of the array that it shouldn't.

    update events set datastr = jsonb_set(
        datastr,
        '{amend}',
        datastr->'amend' || jsonb_build_object(
            'time', $2::TEXT,
            'msg', $3::TEXT,
            'replace', (
                -- This select is for comparison with the code below.
                select * from jsonb($6::TEXT)
            )
        )
    )
    where
        starttime < $5 and
        endtime > $4 and
        bucketrow = (
            select rowid from buckets where id = $1
        );

If I filter on the start and end time, I end up with no element coming through at all and the "replace" field ends up null.
    update events set datastr = jsonb_set(
        datastr,
        '{amend}',
        datastr->'amend' || jsonb_build_object(
            'time', $2::TEXT,
            'msg', $3::TEXT,
            'replace', (
                select * from jsonb($6::TEXT) as replacement
                where
                    (replacement->>'start')::TIMESTAMP WITH TIME ZONE < endtime and
                    (replacement->>'end')::TIMESTAMP WITH TIME ZONE > starttime
            )
        )
    )
    where
        starttime < $5 and
        endtime > $4 and
        bucketrow = (
            select rowid from buckets where id = $1
        );

Can anyone suggest what I might be missing? One thing that occurs to me is that due to the host language the timezone in starttime and endtime is the local timezone, while the timezone in the elements of the $6 argument is UTC. I've tried forcing the timezones to match and this does not appear to be the issue. This can be seen in the case of output from the first, non-filtering update statement above:

    {
        ...
        "start": "2023-06-13T05:24:50+09:30",
        "end": "2023-06-13T05:24:55+09:30",
        "data": {
            ...
            "amend": [
                {
                    "endtime": "2023-06-13T05:24:55+09:30",
                    "msg": "testing",
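As an aside on the first statement upthread (the one ensuring the amend array exists): that separate step can be folded into the append by coalescing a missing 'amend' key to an empty array. A sketch, untested, keeping the same placeholder numbering and leaving out the 'replace' trimming:

```sql
-- coalesce() substitutes '[]' when the 'amend' key is absent, and
-- jsonb-array || jsonb-object appends the object to the array, so a
-- single statement both creates and extends the array.
update events
set datastr = jsonb_set(
        datastr,
        '{amend}',
        coalesce(datastr->'amend', '[]'::jsonb)
            || jsonb_build_object('time', $2::text, 'msg', $3::text))
where starttime < $5
  and endtime > $4
  and bucketrow = (select rowid from buckets where id = $1);
```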
Re: re-novice coming back to pgsql: porting an SQLite update statement to postgres
On Sat, 2024-09-14 at 12:05 +0200, Alban Hertroys wrote:
>
> That's because the replacement data is an array of objects, not a
> single object.
>
> You need to iterate through the array elements to build your
> replacement data, something like what I do here with a select
> (because that's way easier to play around with):
>
> with dollar6 as (
>     select jsonb($$[
>         {
>             "data": {"foo": 1, "bar": 2},
>             "end": "2023-06-12T19:54:51Z",
>             "start": "2023-06-12T19:54:39Z"
>         }
>     ]$$::text) replacement
> )
> select *
> from dollar6
> cross join lateral jsonb_array_elements(replacement) r
> where (r->>'start')::timestamptz <= current_timestamp;

Thanks
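To get from the row set that select produces back to a single jsonb value suitable for the 'replace' field, the filtered elements can be re-aggregated with jsonb_agg. A sketch extending the quoted query (the current_timestamp cutoff is arbitrary, as above; the coalesce guards the case where the filter removes every element, since jsonb_agg over zero rows yields null):

```sql
with dollar6 as (
    select jsonb($$[
        {
            "data": {"foo": 1, "bar": 2},
            "end": "2023-06-12T19:54:51Z",
            "start": "2023-06-12T19:54:39Z"
        }
    ]$$::text) replacement
)
-- unnest the array, filter, then fold the surviving elements back
-- into one jsonb array value.
select coalesce(jsonb_agg(r), '[]'::jsonb) filtered_replacements
from dollar6
cross join lateral jsonb_array_elements(replacement) r
where (r->>'start')::timestamptz <= current_timestamp;
```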
Re: re-novice coming back to pgsql: porting an SQLite update statement to postgres
I have come to hopefully my last stumbling point. I am unable to see a way to express something like this SQLite syntax:

    select json_group_array(json_replace(value,
        '$.a', case
            when json_extract(value, '$.a') > 2 then 2
            else json_extract(value, '$.a')
            end,
        '$.b', case
            when json_extract(value, '$.b') < -2 then -2
            else json_extract(value, '$.b')
            end
    ))
    from json_each('[{"a":1, "b":-3},{"a":2, "b":-2},{"a":3, "b":-1}]');

(in the repro above, the values are integers, but in the real case, they are timestamps)

I have worked on multiple statements around the theme of

    with t as (
        select jsonb($$[{"a":1, "b":-3},{"a":2, "b":-2},{"a":3, "b":-1}]$$) arr
    )
    select jsonb_array_elements(arr) as arr from t;

The closest that I have come is

    with t as (
        select jsonb($$[{"a":1, "b":-3},{"a":2, "b":-2},{"a":3, "b":-1}]$$) arr
    )
    select jsonb_set(arr, '{a}',
        case
            when (arr->>'a')::INTEGER > 2 then 2
            else (arr->>'a')::INTEGER
            end
    )
    from (
        select jsonb_array_elements(arr) as arr from t
    ) elements;

but this is a million miles from where I want to be (it doesn't work, but I think the shape of the things that it's working with is maybe heading in the right direction). I've read through the docs, but I just don't seem able to get my head around this. Any help would be gratefully appreciated (also some reading direction so that I'm not floundering so much).

thanks
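One way to mirror the SQLite json_group_array(json_replace(...)) fairly directly, offered as a sketch only: unnest with jsonb_array_elements, clamp each field with least/greatest, write the clamped value back with jsonb_set (to_jsonb converts the numeric back into a jsonb value, which the attempt above is missing), and re-aggregate with jsonb_agg:

```sql
with t as (
    select jsonb($$[{"a":1, "b":-3},{"a":2, "b":-2},{"a":3, "b":-1}]$$) arr
)
select jsonb_agg(
           jsonb_set(
               jsonb_set(elem, '{a}', to_jsonb(least((elem->>'a')::integer, 2))),
               '{b}', to_jsonb(greatest((elem->>'b')::integer, -2)))
       ) clamped
from t, jsonb_array_elements(arr) elem;
-- expected shape (element order following the input array):
-- [{"a": 1, "b": -2}, {"a": 2, "b": -2}, {"a": 2, "b": -1}]
```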
Re: re-novice coming back to pgsql: porting an SQLite update statement to postgres
Thank you both. This has been extremely helpful. I still have more work to do but this has made it possible to start playing with something, and reading about it when it doesn't work.

On Sun, 2024-09-15 at 10:13 -0700, Willow Chargin wrote:
> On Sun, Sep 15, 2024 at 4:23 AM Alban Hertroys wrote:
> >
> > The drawback is that you have to specify all fields and types, but
> > you don't need to cast the values all the time either.
>
> Here is a variant of Alban's first method that does not require
> specifying all fields and types, and so works with heterogeneous
> values:
>
> WITH t AS (
>     SELECT jsonb($$[
>         {"a": 1, "b": -3, "c": 1},
>         {"a": 2, "b": -2, "c": 2},
>         {"a": 3, "b": -1, "c": 3},
>         {"a": 3, "b": -3, "c": 4}
>     ]$$) arr
> )
> SELECT
>     jsonb_agg(new_element ORDER BY idx) new_arr
> FROM t, LATERAL (
>     SELECT idx, jsonb_object_agg(key, CASE
>         WHEN key = 'a'
>             THEN least(old_value::numeric, 2)::text::jsonb
>         WHEN key = 'b'
>             THEN greatest(old_value::numeric, -2)::text::jsonb
>         ELSE old_value
>     END)
>     FROM
>         jsonb_array_elements(arr)
>             WITH ORDINALITY old_elements(old_element, idx),
>         jsonb_each(old_element) each(key, old_value)
>     GROUP BY idx
> ) new_elements(idx, new_element)
>
> I also took the liberties of using `least` / `greatest` to simplify
> the clamping operations, and using `WITH ORDINALITY` / `ORDER BY` on
> the array scan and re-aggregation to make the element ordering
> explicit rather than relying on the query engine to not re-order the
> rows.
>
> https://www.postgresql.org/docs/16/functions-conditional.html#FUNCTIONS-GREATEST-LEAST
> https://www.postgresql.org/docs/16/queries-table-expressions.html#QUERIES-TABLEFUNCTIONS

I found that I needed to use old_value::text::numeric to get Willow's code to work, but I imagine this is due to the ancientness of the PostgreSQL I am using.

thanks
Dan
Re: re-novice coming back to pgsql: porting an SQLite update statement to postgres
This is the toy with the shape of data that will be seen in the application. The final trick was to use to_jsonb to allow the timestamptz to be put back into the jsonb.

    WITH replace AS (
        SELECT jsonb($$[
            {"start": "2023-06-12T19:54:39.248859996+10:00", "end": "2023-06-12T19:54:59.248859996+10:00", "data": {"item":1}},
            {"start": "2023-06-12T19:54:50.248859996+10:00", "end": "2023-06-12T19:59:39.248859996+10:00", "data": {"item":2}},
            {"start": "2023-06-12T19:56:39.248859996+10:00", "end": "2023-06-12T19:57:39.248859996+10:00", "data": {"item":3}},
            {"start": "2023-06-12T19:54:39.248859996+10:00", "end": "2023-06-12T20:54:49.248859996+10:00", "data": {"item":4}},
            {"start": "2024-06-12T19:54:39.248859996+10:00", "end": "2024-06-12T19:59:39.248859996+10:00", "data": {"item":5}}
        ]$$) replacements
    )
    SELECT
        jsonb_agg(new ORDER BY idx) trimmed_replacements
    FROM replace, LATERAL (
        SELECT idx, jsonb_object_agg(key, CASE
            WHEN key = 'start'
                THEN to_jsonb(greatest(old::text::TIMESTAMPTZ, '2023-06-12T19:54:50+10:00'::TIMESTAMPTZ))
            WHEN key = 'end'
                THEN to_jsonb(least(old::text::TIMESTAMPTZ, '2023-06-12T19:58:00+10:00'::TIMESTAMPTZ))
            ELSE old
        END)
        FROM
            jsonb_array_elements(replacements)
                WITH ORDINALITY rs(r, idx),
            jsonb_each(r) each(key, old)
        WHERE (r->>'start')::TIMESTAMPTZ < '2023-06-12T19:58:00+10:00'::TIMESTAMPTZ
            and (r->>'end')::TIMESTAMPTZ > '2023-06-12T19:54:50+10:00'::TIMESTAMPTZ
        GROUP BY idx
    ) news(idx, new);
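For completeness, here is a rough idea of how that trimmed_replacements aggregate might slot back into the update from earlier in the thread. This is only a sketch, not a tested implementation: the $1..$6 placeholders follow the numbering used upthread, the literal cutoffs from the toy are replaced by the event row's own starttime/endtime, and the per-element jsonb_object_agg is moved into a lateral subquery so no outer GROUP BY is needed:

```sql
update events
set datastr = jsonb_set(
        datastr,
        '{amend}',
        datastr->'amend' || jsonb_build_object(
            'time', $2::text,
            'msg',  $3::text,
            'replace', (
                -- trim each replacement to the event's own time bounds,
                -- then fold the survivors back into one jsonb array.
                select coalesce(jsonb_agg(new order by idx), '[]'::jsonb)
                from jsonb_array_elements(jsonb($6::text))
                         with ordinality rs(r, idx),
                     lateral (
                         select jsonb_object_agg(key, case
                             when key = 'start'
                                 then to_jsonb(greatest(old::text::timestamptz, starttime))
                             when key = 'end'
                                 then to_jsonb(least(old::text::timestamptz, endtime))
                             else old
                         end) new
                         from jsonb_each(r) each(key, old)
                     ) news
                where (r->>'start')::timestamptz < endtime
                  and (r->>'end')::timestamptz > starttime
            )))
where starttime < $5
  and endtime > $4
  and bucketrow = (select rowid from buckets where id = $1);
```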