date:20230209

Re: Domain check taking place unnecessarily?

2023-02-09 Thread Mark Hills

On Wed, 8 Feb 2023, Laurenz Albe wrote:

> On Wed, 2023-02-08 at 18:01 +, Mark Hills wrote:
> > I've ruled out waiting on a lock; nothing is reported with 
> > log_lock_waits=on. This is a test database with exclusive access (2.5 
> > million rows):
> > 
> > This is PostgreSQL 14.5 on Alpine Linux. Thanks.
> > 
> > CREATE DOMAIN hash AS text
> >     CHECK (VALUE ~ E'^[a-zA-Z0-9]{8,32}$');
> >  
> > devstats=> ALTER TABLE invite ADD COLUMN test text;
> > ALTER TABLE
> > Time: 8.988 ms
> >  
> > devstats=> ALTER TABLE invite ADD COLUMN test hash;
> > ALTER TABLE
> > Time: 30923.380 ms (00:30.923)
> >  
> > devstats=> ALTER TABLE invite ADD COLUMN test hash DEFAULT NULL;
> > ALTER TABLE
> > Time: 30344.272 ms (00:30.344)
> >  
> > devstats=> ALTER TABLE invite ADD COLUMN test hash DEFAULT 
> > '123abc123'::hash;
> > ALTER TABLE
> > Time: 67439.232 ms (01:07.439)
> 
> It takes 30 seconds to schan the table and determine that all existing 
> rows satisky the constraint.

But there's no existing data (note this is adding column, not constraint)

Existing rows are guaranteed to satisfy the domain check, because the 
domain check is guaranteed to be immutable (per [1] in my original mail)

Of course, if it were a table constraint it may involve multiple columns, 
requiring it to be evaluated per-row.

But the docs make it clear the domain check is expected to be evaluated on 
input, precisely for this purpose.

So I wondered if this was a shortcoming or even a bug.

It seems that adding a column of NULL (or even default) values for a 
domain can (should?) be as quick as a basic data type like text or 
integer...?

-- 
Mark

Re: Domain check taking place unnecessarily?

2023-02-09 Thread Mark Hills

On Wed, 8 Feb 2023, David G. Johnston wrote:

> On Wed, Feb 8, 2023 at 11:01 AM Mark Hills  wrote:
> 
> >
> > CREATE DOMAIN hash AS text
> > CHECK (VALUE ~ E'^[a-zA-Z0-9]{8,32}$');
> >
> > devstats=> ALTER TABLE invite ADD COLUMN test hash;
> > ALTER TABLE
> > Time: 30923.380 ms (00:30.923)
> >
> 
> Necessarily, I presume because if you decided that the check on the domain
> should be "value is not null" (don't do this though...) the column addition
> would have to fail for existing rows (modulo defaults...).

I'm not sure I'm parsing this paragraph correctly, but the existing rows 
don't provide any data to the domain check. Perhaps you could clarify.

Many thanks

-- 
Mark

Re: Domain check taking place unnecessarily?

2023-02-09 Thread Tom Lane

Mark Hills  writes:
> On Wed, 8 Feb 2023, Laurenz Albe wrote:
>> It takes 30 seconds to schan the table and determine that all existing 
>> rows satisky the constraint.

> But there's no existing data (note this is adding column, not constraint)

> Existing rows are guaranteed to satisfy the domain check, because the 
> domain check is guaranteed to be immutable (per [1] in my original mail)

immutable != "will accept null".

There could be some more optimizations here, perhaps, but there aren't.

regards, tom lane

Re: Window Functions & Table Partitions

2023-02-09 Thread Benjamin Tingle

Thanks for the helpful response david! I'll have a shot at getting the
patch to work myself & submitting to pgsql-hackers.

Ben

On Wed, Feb 8, 2023 at 2:36 PM David Rowley  wrote:

> On Thu, 9 Feb 2023 at 10:45, Benjamin Tingle  wrote:
> > Basically- window partition functions don't take advantage of existing
> table partitions. I use window functions as a more powerful GROUP BY clause
> that preserves row-by-row information- super handy for a lot of things.
> >
> > In particular, I want to use window functions on already partitioned
> tables, like the below example:
> >
> > create table abb (a int, b int, g int) partition by hash(b)
> > /* populate table etc... */
> > select a, b, min(a) over (partition by b) as g from abb
> >
> > Ideally with a query plan like this:
> >
> > Window:
> > Append:
> > Sort on table_p0
> > Sort on table_p1
> > Sort on table_p2
>
> There was some effort [1] in version 12 to take advantage of the order
> defined by the partitioning scheme. The release notes [2] mention:
>
> "Avoid sorting when partitions are already being scanned in the necessary
> order"
>
> However, it's not 100% of what you need as there'd have to be a btree
> index on abb(b) for the planner to notice.
>
> Likely this could be made better so that add_paths_to_append_rel()
> added the pathkeys defined by the partitioned table into
> all_child_pathkeys if they didn't exist already. In fact, I've
> attached a very quickly hacked together patch against master to do
> this.  I've given it very little thought and it comes complete with
> failing regression tests.
>
> If you're interested in pursuing this then feel free to take the patch
> to the pgsql-hackers mailing list and propose it. It's unlikely I'll
> get time to do that for a while, but I will keep a branch locally with
> it to remind me in case I do at some point in the future.
>
> David
>
> [1]
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=959d00e9dbe4cfcf4a63bb655ac2c29a5e579246
> [2] https://www.postgresql.org/docs/release/12.0/
>


-- 

Ben(t).

Re: Domain check taking place unnecessarily?

Re: Domain check taking place unnecessarily?

Re: Domain check taking place unnecessarily?

Re: Window Functions & Table Partitions

4 matches

Site Navigation

Mail list logo

Footer information