Re: Partitioning a table by integer value (preferably in place)

2021-08-13 Thread Michael Lewis
On Fri, Aug 13, 2021 at 3:02 AM Vijaykumar Jain < vijaykumarjain.git...@gmail.com> wrote: > ... use binary split for large partitions, to avoid large row movements. > Would you expound on this?

Re: Using a TRIGGER with window functions.

2021-08-16 Thread Michael Lewis
If you want to update many rows after updating N rows, you want an after STATEMENT trigger which is executed after insert, update or delete. You also want to ensure that only that function maintains sort_order field and that you don't update recursively, perhaps by executing that function when NEW.

Re: Regexp_replace bug / does not terminate on long strings

2021-08-19 Thread Michael Lewis
0/10,250/25,250/250,300/20,300/30,300/7,300/300,300/10,300/50,400/20,400/25,450/50,500/500,500/50,500/35,500/25,600/40,1000/20,1000/40,1000/1000,1000/35,1000/50,1000/500 */ If you don't need the order maintained, it becomes a much simpler problem and you can strip off some of this complexity. *

Re: Regexp_replace bug / does not terminate on long strings

2021-08-19 Thread Michael Lewis
Btw- My apologies for top posting. I think my caffeine wore off.

Re: string_agg distinct order by

2021-08-19 Thread Michael Lewis
NSERT INTO person VALUES ('Roe', 'ok'); SELECT * FROM person order by current_mood; SELECT * FROM person order by current_mood desc; Note- using enum may complicate other things in your usage, so I am not suggesting this is ideal, just one option. *Michael Lewis | Database Engineer* *Entrata*

Re: Querying a table with jaccard similarity with 1.6 million records take 12 seconds

2021-09-02 Thread Michael Lewis
This is showing many false positives from the index scan that get removed when the actual values are examined. With such a long search parameter, that does not seem surprising. I would expect a search on "raj nagar ghaziabad 201017" or something like that to yield far fewer results from the index s

Re: Query takes around 15 to 20 min over 20Lakh rows

2021-09-02 Thread Michael Lewis
What is T and how many rows are in there? How many rows in task_history? What indexes exist? Are you confident you want 2 million rows in that result set? What version is this on? What pg_settings have been changed from defaults?

Re: Query takes around 15 to 20 min over 20Lakh rows

2021-09-07 Thread Michael Lewis
. *Michael Lewis | Database Engineer* *Entrata*

Re: Fastest option to transfer db?

2021-09-13 Thread Michael Lewis
What version of Postgres is the source? Can you make use of logical replication?

Re: Azure Postgresql High connection establishment time

2021-09-20 Thread Michael Lewis
This is not a Postgres issue. Please reach out to the Azure team. >

Re: Question about behavior of conditional indexes

2021-09-21 Thread Michael Lewis
You likely need to tune your autovacuum settings such that the index bloat reaches a steady-state and does not continue to increase. When vacuum runs, it will remove dead (and no longer visible to any running transactions) tuples aka row versions from each page (8KB block by default) in the file fo

Re: Timestamp with vs without time zone.

2021-09-21 Thread Michael Lewis
Related to this current discussion and exchange of ideas... is there a best practice for retrieving data in such a way as the rows are localized to a timezone for where/group by purposes. That is, if I have a table which has events, but those events belong to a tenant or some entity that has a loca

Re: Question about behavior of conditional indexes

2021-09-22 Thread Michael Lewis
Just to clarify, I suggested fillfactor likely could *not* help because you are indexing a column that has the value change. The idea with reducing fillfactor is that the rows can be updated with the new versions remaining in the same 8KB block in the file for table/row storage. If the new version

Re: Timestamp with vs without time zone.

2021-09-22 Thread Michael Lewis
On Wed, Sep 22, 2021 at 12:44 AM cen wrote: > On 21. 09. 21 23:34, Michael Lewis wrote: > > Related to this current discussion and exchange of ideas... is there a > best practice for retrieving data in such a way as the rows are localized > to a timezone for where/group by purpos

Re: Faster distinct query?

2021-09-22 Thread Michael Lewis
In the future, please share the plan returned by explain analyze, and some data about how many rows in the involved tables, what type of system you are running it on, any changes from default configs, etc. How does the below work? It should do a very simple index scan only, then aggregate the rela

Re: Faster distinct query?

2021-09-22 Thread Michael Lewis
On Wed, Sep 22, 2021 at 2:48 PM Tom Lane wrote: > "David G. Johnston" writes: > > There is no where clause so I'm doubtful there is much to be gained going > > down this path. The Index-Only scan seems like an optimal way to obtain > > this data and the existing query already does that. > > The

Re: Faster distinct query?

2021-09-23 Thread Michael Lewis
> > It’s only when I add in the AND data.channels=channels.channel that the > query time blows up to 5+ minutes. I personally don’t understand why there > would be such a large difference between the two queries - something wrong > with my indexes? > Did you do the manual vacuum as suggested by To

Re: Using a single sequence for all tables

2021-09-29 Thread Michael Lewis
If your processes somehow allow updates on the wrong table, then fix that. If you run out of space in whatever value range you choose initially, the pain to upgrade to a type that allows larger values would seem to be very large.

Re: DELETE ... USING LATERAL

2021-10-04 Thread Michael Lewis
On Mon, Oct 4, 2021, 10:30 AM Nikhil Benesch wrote: > you can't reuse the FROM table name in the USING clause: > > # delete from int_arrays using int_arrays; > ERROR: table name "int_arrays" specified more than once > Don't you need to use an alias for the table in the using clause?

Re: Query time related to limit clause

2021-10-05 Thread Michael Lewis
What's the definition for abc_serv_nch_q1_2021_expr_idx3? That is a jsonb field I assume? Statistics aren't great on jsonb data, so you may benefit greatly from pulling keys out to be stored as a standard column. I would be curious for more "pure" estimates on each quarterly partition directly for

Re: Avg/max size of these JSON docs in Postgres

2021-10-12 Thread Michael Lewis
On Tue, Oct 12, 2021 at 1:43 AM Simon Riggs wrote: > On Tue, 12 Oct 2021 at 08:14, Ram Pratap Maurya > wrote: > > > Confirm what is Avg/max size of these JSON docs in Postgres. > > JSON and JSONB datatypes can both be max 1GB in size. > That is per row. Just to confirm- The compression associa

Re: Avg/max size of these JSON docs in Postgres

2021-10-12 Thread Michael Lewis
On Tue, Oct 12, 2021 at 12:51 PM Simon Riggs wrote: > On Tue, 12 Oct 2021 at 18:53, Michael Lewis wrote: > > > > On Tue, Oct 12, 2021 at 1:43 AM Simon Riggs < > simon.ri...@enterprisedb.com> wrote: > >> > >> On Tue, 12 Oct 2021 at 08:14, Ram Pratap Ma

Re: Determining if a table really changed in a trigger

2021-10-26 Thread Michael Lewis
Does this perform differently from suppress_redundant_updates_trigger? https://www.postgresql.org/docs/current/functions-trigger.html

Re: How to copy rows into same table efficiently

2021-10-26 Thread Michael Lewis
On Tue, Oct 26, 2021 at 1:07 AM Arun Suresh wrote: > Current approach taken is to build a query like below: > INSERT INTO mytable (col1, col2, col3, col4) SELECT col1, 'XYZ', col3, > col4 FROM mytable WHERE col2 = 'ABCD' > > Is there a better way to do this? > There could be other tables with for

Re: Need recommendation on PARALLEL INDEX SCAN and PARALLEL APPEND

2021-10-26 Thread Michael Lewis
> > Planning Time: 10.351 ms > > Execution Time: 0.283 ms > Nearly all of the time is in planning... What is your default_statistics_target?

Re: Determining if a table really changed in a trigger

2021-10-26 Thread Michael Lewis
If you end up with no rows changing from an insert or delete, something seems awry. Unless you mean 0 rows affected. Do after statement triggers still execute? I suppose they very well might. Would the statement even execute if no rows get updated and that is prevented with before update? I would

Re: Model clause and

2021-10-28 Thread Michael Lewis
On Thu, Oct 28, 2021 at 1:57 PM SQL Padawan wrote: > I presume that VIRTUAL GENERATED columns are on the to-do list? > https://www.postgresql.org/docs/current/ddl-generated-columns.html Is this not what you want?

Re: Model clause and

2021-10-28 Thread Michael Lewis
Mea culpa. "PostgreSQL currently implements only stored generated columns." I should re-read what I share references to rather than trust my memory, particularly while sick. Thanks for kindly correcting.

Re: Debugging features needed

2021-11-05 Thread Michael Lewis
For my purposes, app name isn't long enough so we put a comment at the start of every SQL that has a unique ID generated in the application. This ensures that we can tell one connection apart from another even when both are coming from the same feature/action (applicationName is set to this) even u

Re: Logical Replication - Should Destination Table Columns Be Defined With Default Value

2021-11-09 Thread Michael Lewis
Is there any advantage to not defining the default on the replica? If it is not a static value and the publishing database will trigger row updates, I could see waiting to set the default until after the table re-write is done, but otherwise there doesn't seem to be any benefit to skipping column d

Re: Are all unlogged tables in any case truncated after a server-using

2021-11-12 Thread Michael Lewis
Why keep them as unlogged tables? If data is static, can you spare the disk space to gradually copy data from existing unlogged table to new copy that is logged, and then have brief exclusive lock to drop unlogged and rename new one?

Re: Are all unlogged tables in any case truncated after a server-crash?

2021-11-12 Thread Michael Lewis
Curious... why keep the table as unlogged if it is static? If you can spare the disk space, perhaps just create a regular table with same definition, gradually copy the data to spread the impact on WAL, and when complete, just drop the old table and rename the new one.

Row estimate on empty tables

2021-11-16 Thread Michael Lewis
ref: https://www.postgresql.org/message-id/9772.1510348781%40sss.pgh.pa.us This functionality hasn't changed in the years since this post, right? I'm dealing with temp tables specifically, but seem to be getting bit by this behavior occasionally so I would like to confirm. *Mic

Re: Execute command in PL/pgSQL function not executing

2021-11-19 Thread Michael Lewis
You could also just return the name of the temp table, or return nothing and directly use that name manually outside the function to do select * from pg_temp.my_data

Re: [EXTERNAL] Re: Inserts and bad performance

2021-11-24 Thread Michael Lewis
example and run it in a transaction with explain (analyze, buffers, verbose) and then rollback? *Michael Lewis | Database Engineer* *Entrata*

Re: Max connections reached without max connections reached

2021-12-01 Thread Michael Lewis
On Wed, Dec 1, 2021 at 3:52 PM Rob Sargent wrote: > Also be on the look-out for begin trans; begin trans; etc I read Kumar's > report as nested transactions. If your gang is doing a transaction per > row, they need a kick in the rear. Anyone not closing those needs a pink > slip. > I have see

Re: Max connections reached without max connections reached

2021-12-01 Thread Michael Lewis
On Wed, Dec 1, 2021 at 8:54 PM Rob Sargent wrote: > On 12/1/21 7:08 PM, Michael Lewis wrote: > > On Wed, Dec 1, 2021 at 3:52 PM Rob Sargent wrote: > >> Also be on the look-out for begin trans; begin trans; etc I read >> Kumar's report as nested transacti

Re: Max connections reached without max connections reached

2021-12-01 Thread Michael Lewis
On Wed, Dec 1, 2021 at 9:06 PM Dilip Kumar wrote: > IMHO, it is good to LOG such information if we are not already logging > this anywhere. > +1 I expect my 'vote' counts for naught, but I fully expect seeing these show up in the logs would have helped me much more quickly have insight into wha

Re: INSERT ... ON CONFLICT doesn't work

2021-12-01 Thread Michael Lewis
Perhaps I missed something, but why all this effort to reference the column and not just reference IX_Lockers_Uuid for the on conflict clause?

Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT

2021-12-06 Thread Michael Lewis
What is your default_statistics_target and how accurate is that estimate of 5668 rows? What is random_page_cost set to by the way? More importantly, what is the better plan that you'd like the planner to use with your existing indexes? It would seem logical to me to scan for the matching shipment_

Re: Postgresql + containerization possible use case

2021-12-10 Thread Michael Lewis
On Fri, Dec 10, 2021, 2:24 AM Achilleas Mantzios < ach...@matrix.gatewaynet.com> wrote: > Hi > we are running some 140 remote servers (in the 7 seas via satellite > connections) How are they used? What is in Postgres? Should that all have the exact same read only data at all times? >

Re: Postgresql + containerization possible use case

2021-12-11 Thread Michael Lewis
Interesting. I'm not sure that moving an image for a VM with Postgres is compatible with the goal of minimizing time/data but if upgrades are yearly or something, perhaps that is reasonable. It has been 9+ years since that post as well. But the full data will need to be included in the image if the

Re: Packages, inner subprograms, and parameterizable anonymous blocks for PL/pgSQL

2021-12-21 Thread Michael Lewis
On Tue, Dec 21, 2021 at 11:50 AM Pavel Stehule wrote: > I wrote about it. Did you read this article? > > https://okbob.blogspot.com/2018/02/schema-variables.html > > The goals of this project: > >- fast non transactional non persistent (session based) storage, > > Would there be statistics th

Default values in functions

2021-12-29 Thread Michael Lewis
ATE + interval '2 hour', false ), ( null::int[] , CURRENT_DATE + interval '3 hour', true ), ( null , CURRENT_DATE + interval '4 hour', true ) )AS sub ( pArrayToCheck, pTimeToDisplay, expected_result ); *Michael Lewis | Database Engineer* *Entrata*

Re: Default values in functions

2021-12-29 Thread Michael Lewis
On Wed, Dec 29, 2021, 4:34 PM David G. Johnston wrote: > That isn’t how it works. Absence is what is important. Null is not > absence. As you showed, if you want nulls to be converted to defaults you > can use coalesce. > Thanks sir. It seems unfortunate that there is not a way to indicate ab

Re: Default values in functions

2021-12-29 Thread Michael Lewis
On Wed, Dec 29, 2021, 4:31 PM Tom Lane wrote: > You didn't say what icount() is, but if it's the one from > contrib/intarray, it's STRICT meaning it'll return NULL, > not zero, for a null array input. Thanks for that. Very good to know. (Or IOW, null::int[] is not at all the same thing as arra

Re: Default values in functions

2021-12-29 Thread Michael Lewis
On Wed, Dec 29, 2021, 5:31 PM Tom Lane wrote: > The way to do that is to use named parameters and the associated > call syntax, ie something like > > select myfunc(param1 => 42, param3 => 99); > Thanks very much. I have not seen that before. >

Re: Query much slower from php than psql or dbeaver

2022-01-20 Thread Michael Lewis
When dealing with foreign tables, I believe planning is not the same because of access to statistics (maybe has improved since 9.6 though). I just wonder... Would it be a viable option to create a materialized view using the FDW but then use the PHP script against the local tables only? Materialize

Re: tstzrange on large table gives poor estimate of expected rows

2022-01-24 Thread Michael Lewis
If interval_end_date is always 1 day ahead, why store it at all? Dependencies on a custom stats object wouldn't do anything I don't think because they are offset. They are 100% correlated, but not in a way that any of the existing stat types capture as far as I can figure.

Re: Undetected Deadlock

2022-01-26 Thread Michael Lewis
I must be missing something. You mentioned dropping a partition, so is there an actual need for the delete? Could you detach concurrently and then drop the table or delete rows if needed? https://www.postgresql.org/docs/14/sql-altertable.html#SQL-ALTERTABLE-DETACH-PARTITION

Re: Undetected Deadlock

2022-01-26 Thread Michael Lewis
There may be a bug so perhaps still pursue reproducing the issue, but I would expect drop concurrently to resolve your issue with the two processes conflicting. Also, perhaps trying"insert, on conflict do update" could be more efficient than the copy but obviously there are too many unknowns and va

Re: pg_try_advisory_lock is waiting?

2022-01-28 Thread Michael Lewis
pg_try_advisory_lock and other advisory lock functions take a lock on a key. The keys you have tried to lock are 0 and 1. Why? What problem are you trying to solve by changing from "for update" lock to advisory lock anyway?

Re: pg_try_advisory_lock is waiting?

2022-01-28 Thread Michael Lewis
On Fri, Jan 28, 2022 at 5:34 PM Mladen Gogala wrote: > pg_try_advisory_lock returned TRUE even without "FOR UPDATE" clause in the > subquery. Shouldn't it return false because it can't lock the row until the > uncommitted update finishes? > > > > The rows bein

Re: increasing effective_cache_size slows down join queries by a factor of 4000x

2022-02-02 Thread Michael Lewis
What does the row estimate look like on the scan of data table with that statistic in place? Anytime the stats give a mis-estimate this far off, I wouldn't expect that plans would be optimal except by luck. Index Scan using data_pkey on data t (cost=0.57..21427806.53 *rows=58785023* width=131) (ac

Re: pg_cron for vacuum - dynamic table set

2022-02-03 Thread Michael Lewis
Can't you use a do script to construct and execute the statement? >

Re: increasing effective_cache_size slows down join queries by a factor of 4000x

2022-02-03 Thread Michael Lewis
I would guess that autovacuum is analyzing the table and causing the stats to change which is resulting in different estimates that result in different plans. Unless you can get the estimate much more accurate, you won't get far with expecting a stable plan that performs well. How is data_class_pk

Re: increasing effective_cache_size slows down join queries by a factor of 4000x

2022-02-14 Thread Michael Lewis
I think that theoretically if the planner sees a condition like a.column1 > constant_value, and it also has a condition like a.column1 = b.column2 then it could autogenerate the b.column2 > constant_value condition. And of course > could be <, <=, >= and <> But I could be wrong, particularly with

Re: Rows From but with Subqueries (or a cleaner non-array-using alternative)?

2022-02-14 Thread Michael Lewis
In pseudo code, group_index is defined as: case when LAG(v) OVER (ORDER BY i) = v then lag(i) ELSE i END, right? If you have that in the first cte instead of the start/end business, then you can just select vals, group number, and row_num over that new grouping, right? Something like this? WITH

Re: PostgreSQL's Max Identifier Length in Amazon RDS

2022-02-18 Thread Michael Lewis
I believe that would not be possible. One quick read- https://til.hashrocket.com/posts/8f87c65a0a-postgresqls-max-identifier-length-is-63-bytes With it being RDS, changing source and recompiling is not an option.

Re: PostgreSQL's Max Identifier Length in Amazon RDS

2022-02-18 Thread Michael Lewis
Also- https://www.postgresql.org/docs/10/runtime-config-preset.html "The following “parameters” are read-only, and are determined when PostgreSQL is compiled or when it is installed." >

<    1   2   3   4