re-novice coming back to pgsql: porting an SQLite update statement to postgres

2024-07-15 Thread Dan Kortschak
The last time I used PostgreSQL for anything was about 15 years ago and
I have only limited SQL background, so please consider this a novice
question.

I have an embedded SQLite database that I would like to port to
PostgreSQL. I have done the majority of the porting, but am stuck on a
final compound statement. This is for a hobby project.

For background I'll give the SQLite schema and the statement in
question, and also the PostgreSQL schema. The SQLite code here is also
available to view in context at the links at the bottom of the post,
in case this is helpful.

SQLite schema:

create table if not exists buckets (
    rowid    INTEGER PRIMARY KEY AUTOINCREMENT,
    id       TEXT UNIQUE NOT NULL,
    name     TEXT,
    type     TEXT NOT NULL,
    client   TEXT NOT NULL,
    hostname TEXT NOT NULL,
    created  TEXT NOT NULL, -- unix micro
    datastr  TEXT NOT NULL  -- JSON text
);
create table if not exists events (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    bucketrow INTEGER NOT NULL,
    starttime INTEGER NOT NULL, -- unix micro
    endtime   INTEGER NOT NULL, -- unix micro
    datastr   TEXT NOT NULL,    -- JSON text
    FOREIGN KEY (bucketrow) REFERENCES buckets(rowid)
);
create index if not exists event_index_id ON events(id);
create index if not exists event_index_starttime ON events(bucketrow, starttime);
create index if not exists event_index_endtime ON events(bucketrow, endtime);


PostgreSQL schema:

create table if not exists buckets (
    rowid    SERIAL PRIMARY KEY,
    id       TEXT UNIQUE NOT NULL,
    name     TEXT,
    type     TEXT NOT NULL,
    client   TEXT NOT NULL,
    hostname TEXT NOT NULL,
    created  TIMESTAMP WITH TIME ZONE NOT NULL,
    datastr  JSONB NOT NULL
);
create table if not exists events (
    id        SERIAL PRIMARY KEY,
    bucketrow INTEGER NOT NULL,
    starttime TIMESTAMP WITH TIME ZONE NOT NULL,
    endtime   TIMESTAMP WITH TIME ZONE NOT NULL,
    datastr   JSONB NOT NULL,
    FOREIGN KEY (bucketrow) REFERENCES buckets(rowid)
);
create index if not exists event_index_id ON events(id);
create index if not exists event_index_starttime ON events(bucketrow, starttime);
create index if not exists event_index_endtime ON events(bucketrow, endtime);


The part where I am having difficulty knowing where to start deals with
the JSON data in the events table. This contains largely unstructured
data, but can have an array of amendment notes added using the SQL
statements below. These notes are essentially append-only alterations
to the unstructured data, each spanning a period of the unstructured
data within the time bounds of the events table row.

My question is: where would be the best place for me to look to learn
how to implement a port of this SQLite statement? And what would broadly
be the most sensible approach to take (to narrow down what I need to
read through in learning)?

Apologies for the long post, and thank you for any help.
Dan


SQLite statement:

begin transaction;
-- ensure we have an amend array.
update events set datastr = json_insert(datastr, '$.amend', json('[]'))
where
datetime(starttime, 'subsec') < datetime(?5, 'subsec') and
datetime(endtime, 'subsec') > datetime(?4, 'subsec') and
bucketrow = (
select rowid from buckets where id = ?1
);
update events set datastr = json_insert(datastr, '$.amend[#]',
    json_object('time', ?2, 'msg', ?3, 'replace', (
        -- trim amendments down to original event bounds.
        select json_group_array(json_replace(value,
            '$.start', case
                when datetime(starttime, 'subsec') > datetime(json_extract(value, '$.start'), 'subsec') then
                    starttime
                else
                    json_extract(value, '$.start')
                end,
            '$.end', case
                when datetime(endtime, 'subsec') < datetime(json_extract(value, '$.end'), 'subsec') then
                    endtime
                else
                    json_extract(value, '$.end')
                end
        ))
        from
            json_each(?6)
        where
            datetime(json_extract(value, '$.start'), 'subsec') < datetime(endtime, 'subsec') and
            datetime(json_extract(value, '$.end'), 'subsec') > datetime(starttime, 'subsec')
    )))
where
    datetime(starttime, 'subsec') < datetime(?5, 'subsec') and
    datetime(endtime, 'subsec') > datetime(?4, 'subsec') and
    bucketrow = (
        select rowid from buckets where id = ?1
    );
commit;

Re: re-novice coming back to pgsql: porting an SQLite update statement to postgres

2024-07-23 Thread Dan Kortschak
On 7/23/24 13:11, Vincent Veyron wrote:
> On Mon, 15 Jul 2024 20:31:13 +
>
> This is the goto page for anything SQL :
> https://www.postgresql.org/docs/current/sql-commands.html
>
> For DateTime types :
> https://www.postgresql.org/docs/current/datatype-datetime.html
>
> For JSON types :
> https://www.postgresql.org/docs/current/datatype-json.html

Thanks, I will work through those.


On Tue, 2024-07-23 at 23:52 +0200, Dominique Devienne wrote:
> On Tue, Jul 23, 2024 at 10:35 PM Adrian Klaver
>  wrote:
> > Just know that SQLite does not enforce types [...]
> 
> That's true, and applies to the OP's schema.

Thank you both. Yes, I was aware of this weirdness of the schema (I
inherited it) and was shocked that it worked when I realised. I'll be
happier when types are properly enforced, but I don't think I can
retrospectively enforce that on the SQLite implementation I have.





Re: re-novice coming back to pgsql: porting an SQLite update statement to postgres

2024-09-14 Thread Dan Kortschak
On Wed, 2024-07-24 at 00:23 +, Dan Kortschak wrote:
> On 7/23/24 13:11, Vincent Veyron wrote:
> > On Mon, 15 Jul 2024 20:31:13 +
> > 
> > This is the goto page for anything SQL :
> > https://www.postgresql.org/docs/current/sql-commands.html
> > 
> > For DateTime types :
> > https://www.postgresql.org/docs/current/datatype-datetime.html
> > 
> > For JSON types :
> > https://www.postgresql.org/docs/current/datatype-json.html
> 
> Thanks, I will work through those.

I've had a chance to attack this. The first part of the problem was
that I could not figure out how to get the multiple-statement
transaction that I was using in SQLite to work with PostgreSQL. The
solution was to use the host language's Postgres binding's transaction
functions and send the statements separately.

The first part, to ensure the JSON array exists, is solved with

update
events
set 
datastr = jsonb_set(datastr, '{amend}', '[]')
where
starttime < $3 and
endtime > $2 and
not datastr::jsonb ? 'amend' and
bucketrow = (
select rowid from buckets where id = $1
);
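An aside, not from the thread: the separate existence guard can probably be folded away, since jsonb_set creates the key when it is missing. A sketch, untested against the schema above:

```sql
-- Sketch (untested): coalesce supplies the empty array when 'amend' is
-- absent and leaves an existing array untouched, so the
-- `not datastr ? 'amend'` guard is no longer needed.
update events
set datastr = jsonb_set(datastr, '{amend}', coalesce(datastr->'amend', '[]'::jsonb))
where
    starttime < $3 and
    endtime > $2 and
    bucketrow = (
        select rowid from buckets where id = $1
    );
```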

I'm still having difficulties with the second part which is to update
the contents of the amend array in the JSON.

So far I'm able to append the relevant details to the amend array, but
I'm unable to correctly select the correct elements from the $6
argument, which is in the form
[{"start": <timestamp>, "end": <timestamp>, "data": <object>}, ...]. The first
update statement gives me broadly what I want, but includes elements of
the array that it shouldn't.

update
    events
set
    datastr = jsonb_set(
        datastr,
        '{amend}',
        datastr->'amend' || jsonb_build_object(
            'time', $2::TEXT,
            'msg', $3::TEXT,
            'replace', (
                -- This select is for comparison with the code below.
                select * from jsonb($6::TEXT)
            )
        )
    )
where
    starttime < $5 and
    endtime > $4 and
    bucketrow = (
        select rowid from buckets where id = $1
    );

If I filter on the start and end time, I end up with no element coming
through at all and the "replace" field ends up null.

update
    events
set
    datastr = jsonb_set(
        datastr,
        '{amend}',
        datastr->'amend' || jsonb_build_object(
            'time', $2::TEXT,
            'msg', $3::TEXT,
            'replace', (
                select *
                from
                    jsonb($6::TEXT) as replacement
                where
                    (replacement->>'start')::TIMESTAMP WITH TIME ZONE < endtime and
                    (replacement->>'end')::TIMESTAMP WITH TIME ZONE > starttime
            )
        )
    )
where
    starttime < $5 and
    endtime > $4 and
    bucketrow = (
        select rowid from buckets where id = $1
    );

Can anyone suggest what I might be missing? One thing that occurs to me
is that, due to the host language, the timezone in starttime and endtime
is the local timezone, while the timezones in the elements of the $6
argument are UTC. I've tried forcing the timezones to match and this
does not appear to be the issue.

This can be seen in the case of output from the first, non-filtering
update statement above:

{
    ...
    "start": "2023-06-13T05:24:50+09:30",
    "end": "2023-06-13T05:24:55+09:30",
    "data": {
        ...
        "amend": [
            {
                "endtime": "2023-06-13T05:24:55+09:30",
                "msg": "testing",

Re: re-novice coming back to pgsql: porting an SQLite update statement to postgres

2024-09-14 Thread Dan Kortschak
On Sat, 2024-09-14 at 12:05 +0200, Alban Hertroys wrote:
> 
> That’s because the replacement data is an array of objects, not a
> single object.
> 
> You need to iterate through the array elements to build your
> replacement data, something like what I do here with a select
> (because that’s way easier to play around with):
> 
> with dollar6 as (
>     select jsonb($$[
>         {
>             "data": {
>                 "foo": 1, "bar": 2
>             },
>             "end": "2023-06-12T19:54:51Z",
>             "start": "2023-06-12T19:54:39Z"
>         }
>     ]$$::text) replacement
> )
> select *
> from dollar6
> cross join lateral jsonb_array_elements(replacement) r
> where (r->>'start')::timestamptz <= current_timestamp;
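Folding that element-wise iteration back into the update's 'replace' subquery might look like the following. This is a sketch only, not code posted in the thread, and is untested: it filters the elements of $6 individually and re-aggregates them with jsonb_agg so that "replace" holds an array again (coalesce keeps it an empty array rather than NULL when nothing matches).

```sql
-- Sketch (untested): iterate the replacement array element-wise,
-- filter against the row's bounds, and re-aggregate into an array.
update events
set datastr = jsonb_set(
    datastr,
    '{amend}',
    datastr->'amend' || jsonb_build_object(
        'time', $2::TEXT,
        'msg', $3::TEXT,
        'replace', (
            select coalesce(jsonb_agg(r), '[]'::jsonb)
            from jsonb_array_elements($6::TEXT::jsonb) r
            where
                (r->>'start')::timestamptz < endtime and
                (r->>'end')::timestamptz > starttime
        )
    )
)
where
    starttime < $5 and
    endtime > $4 and
    bucketrow = (
        select rowid from buckets where id = $1
    );
```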


Thanks






Re: re-novice coming back to pgsql: porting an SQLite update statement to postgres

2024-09-15 Thread Dan Kortschak
I have come to hopefully my last stumbling point.

I am unable to see a way to express something like this SQLite syntax

select json_group_array(json_replace(value,
  '$.a', case
when json_extract(value, '$.a') > 2 then
  2
else
  json_extract(value, '$.a')
end,
  '$.b', case
when json_extract(value, '$.b') < -2 then
  -2
else
  json_extract(value, '$.b')
end
))
from
  json_each('[{"a":1, "b":-3},{"a":2, "b":-2},{"a":3, "b":-1}]');

(in the repro above, the values are integers, but in the real case,
they are timestamps)

I have worked on multiple statements around the theme of 

with t as (
  select jsonb($$[{"a":1, "b":-3},{"a":2, "b":-2},{"a":3, "b":-1}]$$) arr
)
select
  jsonb_array_elements(arr) as arr
from
  t;

The closest that I have come is 

with t as (
  select jsonb($$[{"a":1, "b":-3},{"a":2, "b":-2},{"a":3, "b":-1}]$$) arr
)
select jsonb_set(arr, '{a}', case
  when (arr->>'a')::INTEGER > 2 then
2
  else
(arr->>'a')::INTEGER
  end
)
from (
  select
jsonb_array_elements(arr) as arr
  from
t
) elements;

but this is a million miles from where I want to be (it doesn't work,
but I think the shape of the things that it's working with is maybe
heading in the right direction). I've read through the docs, but I just
don't seem able to get my head around this.

Any help would be greatly appreciated (also some reading direction so
that I'm not floundering so much).

thanks







Re: re-novice coming back to pgsql: porting an SQLite update statement to postgres

2024-09-20 Thread Dan Kortschak
Thank you both. This has been extremely helpful. I still have more work
to do but this has made it possible to start playing with something,
and reading about it when it doesn't work.

On Sun, 2024-09-15 at 10:13 -0700, Willow Chargin wrote:
> On Sun, Sep 15, 2024 at 4:23 AM Alban Hertroys 
> wrote:
> > 
> > The drawback is that you have to specify all fields and types, but
> > you don’t need to cast the values all the time either.
> 
> Here is a variant of Alban's first method that does not require
> specifying all fields and types, and so works with heterogeneous
> values:
> 
>     WITH t AS (
>         SELECT jsonb($$[
>             {"a": 1, "b": -3, "c": 1},
>             {"a": 2, "b": -2, "c": 2},
>             {"a": 3, "b": -1, "c": 3},
>             {"a": 3, "b": -3, "c": 4}
>         ]$$) arr
>     )
>     SELECT
>         jsonb_agg(new_element ORDER BY idx) new_arr
>     FROM t, LATERAL (
>         SELECT idx, jsonb_object_agg(key, CASE
>             WHEN key = 'a'
>             THEN least(old_value::numeric, 2)::text::jsonb
>             WHEN key = 'b'
>             THEN greatest(old_value::numeric, -2)::text::jsonb
>             ELSE old_value
>         END)
>         FROM
>             jsonb_array_elements(arr)
>             WITH ORDINALITY old_elements(old_element, idx),
>             jsonb_each(old_element) each(key, old_value)
>         GROUP BY idx
>     ) new_elements(idx, new_element)
> 
> I also took the liberties of using `least` / `greatest` to simplify
> the
> clamping operations, and using `WITH ORDINALITY` / `ORDER BY` on the
> array scan and re-aggregation to make the element ordering explicit
> rather than relying on the query engine to not re-order the rows.
> 
> https://www.postgresql.org/docs/16/functions-conditional.html#FUNCTIONS-GREATEST-LEAST
> https://www.postgresql.org/docs/16/queries-table-expressions.html#QUERIES-TABLEFUNCTIONS

I found that I needed old_value::text::numeric to get Willow's code
to work, but I imagine this is due to the ancientness of the PostgreSQL
I am using.
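For reference (an editorial note, not from the thread): PostgreSQL 11 added direct casts from jsonb scalars to numeric, which would explain why older releases need the detour through text:

```sql
-- PostgreSQL 11 and later: direct cast from a jsonb number scalar.
select '2'::jsonb::numeric;
-- Older releases: round-trip through text first.
select '2'::jsonb::text::numeric;
```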

thanks
Dan






Re: re-novice coming back to pgsql: porting an SQLite update statement to postgres

2024-09-20 Thread Dan Kortschak
This is the toy with the shape of data that will be seen in the
application. The final trick was to use to_jsonb to allow the
timestamptz to be put back into the jsonb.

WITH replace AS (
    SELECT jsonb($$[
        {"start": "2023-06-12T19:54:39.248859996+10:00", "end": "2023-06-12T19:54:59.248859996+10:00", "data": {"item":1}},
        {"start": "2023-06-12T19:54:50.248859996+10:00", "end": "2023-06-12T19:59:39.248859996+10:00", "data": {"item":2}},
        {"start": "2023-06-12T19:56:39.248859996+10:00", "end": "2023-06-12T19:57:39.248859996+10:00", "data": {"item":3}},
        {"start": "2023-06-12T19:54:39.248859996+10:00", "end": "2023-06-12T20:54:49.248859996+10:00", "data": {"item":4}},
        {"start": "2024-06-12T19:54:39.248859996+10:00", "end": "2024-06-12T19:59:39.248859996+10:00", "data": {"item":5}}
    ]$$) replacements
)
SELECT
    jsonb_agg(new ORDER BY idx) trimmed_replacements
FROM
    replace, LATERAL (
        SELECT idx, jsonb_object_agg(key,
            CASE
                WHEN key = 'start'
                THEN to_jsonb(greatest(old::text::TIMESTAMPTZ, '2023-06-12T19:54:50+10:00'::TIMESTAMPTZ))
                WHEN key = 'end'
                THEN to_jsonb(least(old::text::TIMESTAMPTZ, '2023-06-12T19:58:00+10:00'::TIMESTAMPTZ))
                ELSE old
            END
        )
        FROM
            jsonb_array_elements(replacements)
            WITH ORDINALITY rs(r, idx),
            jsonb_each(r) each(key, old)
        WHERE
            (r->>'start')::TIMESTAMPTZ < '2023-06-12T19:58:00+10:00'::TIMESTAMPTZ and
            (r->>'end')::TIMESTAMPTZ > '2023-06-12T19:54:50+10:00'::TIMESTAMPTZ
        GROUP BY idx
    ) news(idx, new);
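For completeness, here is one guess (an untested editorial sketch, not posted in the thread) at how this toy might fold back into the real events update: the literal bounds above become the row's own starttime and endtime, and the trimmed array is appended to the amend array as in the earlier attempts.

```sql
-- Untested sketch: the toy's trimming query embedded in the events
-- update, clamping each replacement element to the row's own bounds.
update events
set datastr = jsonb_set(
    datastr,
    '{amend}',
    datastr->'amend' || jsonb_build_object(
        'time', $2::TEXT,
        'msg', $3::TEXT,
        'replace', (
            SELECT jsonb_agg(new ORDER BY idx)
            FROM (
                SELECT idx, jsonb_object_agg(key,
                    CASE
                        WHEN key = 'start'
                        THEN to_jsonb(greatest(old::text::TIMESTAMPTZ, starttime))
                        WHEN key = 'end'
                        THEN to_jsonb(least(old::text::TIMESTAMPTZ, endtime))
                        ELSE old
                    END)
                FROM
                    jsonb_array_elements($6::TEXT::jsonb)
                    WITH ORDINALITY rs(r, idx),
                    jsonb_each(r) each(key, old)
                WHERE
                    (r->>'start')::TIMESTAMPTZ < endtime and
                    (r->>'end')::TIMESTAMPTZ > starttime
                GROUP BY idx
            ) news(idx, new)
        )
    )
)
where
    starttime < $5 and
    endtime > $4 and
    bucketrow = (
        select rowid from buckets where id = $1
    );
```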