Design for dashboard query

2024-06-15 Thread sud
Hello All,

It's Postgres version 15.4. We have a requirement to display aggregated
information for all users on a UI screen. Essentially, the query would scan
the full table data (billions of rows spanning many months), join it with
other master tables, aggregate the results, and then display them based on
an input "user id" filter.

In such a scenario we are thinking of using a materialized view on top of
the base tables that stores the base information and is refreshed
periodically, so the data can be served for the input user id. However,
Postgres does not support incremental refresh of materialized views, and a
full refresh can take a long time. So, do we have any other option
available? Additionally, the refresh should not impact or block online
users who are querying the same materialized view while it is happening.
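
For illustration, a minimal sketch of the kind of materialized view being
described; all table and column names here are hypothetical:

CREATE MATERIALIZED VIEW user_dashboard_mv AS
SELECT t.user_id,
       u.user_name,
       date_trunc('month', t.txn_date) AS txn_month,
       count(*)                        AS txn_count,
       sum(t.amount)                   AS total_amount
FROM   transactions t
JOIN   users u ON u.user_id = t.user_id
GROUP  BY t.user_id, u.user_name, date_trunc('month', t.txn_date);

-- The dashboard then filters the precomputed rows by the input user id.
SELECT * FROM user_dashboard_mv WHERE user_id = 12345;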


pgstattuple - can it/does it provide stats by partition?

2024-06-15 Thread Dragam
Hi all,

As the title says - does (and/or can) pgstattuple provide a breakdown by 
partition?

If any further info is needed, please let me know.

TIA and rgs,

Dragam

Sent with the secure email service [Proton Mail.](https://proton.me/)

Re: Design for dashboard query

2024-06-15 Thread Sushrut Shivaswamy
Have you tried creating an index on the user ID column?
Scanning the entire table to apply granular filters on a few attributes
seems unnecessary.

Materialised views make sense if you want to aggregate some columns and
query a subset of the data, but I would recommend trying indexes first.
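
For example, a minimal sketch of that suggestion, assuming a hypothetical
transactions table with a user_id filter column:

-- Index the filter column so per-user lookups can avoid a full scan.
-- (CONCURRENTLY avoids blocking writes, but cannot be used directly on a
-- partitioned parent table.)
CREATE INDEX CONCURRENTLY idx_transactions_user_id ON transactions (user_id);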

Finally, shameless plug, but consider using the pg_analytica extension,
which enables fast analytic queries on tables and is ideal for analytics
use cases like dashboards.
https://github.com/sushrut141/pg_analytica
I’m the author of the extension and am looking for initial users to try it
out.

Thanks,
Sushrut


On Sat, 15 Jun 2024 at 6:54 PM, sud  wrote:

> Hello All,
>
> It's Postgres version 15.4. We have a requirement to display aggregated
> information for all users on a UI screen. Essentially, the query would scan
> the full table data (billions of rows spanning many months), join it with
> other master tables, aggregate the results, and then display them based on
> an input "user id" filter.
>
> In such a scenario we are thinking of using a materialized view on top of
> the base tables that stores the base information and is refreshed
> periodically, so the data can be served for the input user id. However,
> Postgres does not support incremental refresh of materialized views, and a
> full refresh can take a long time. So, do we have any other option
> available? Additionally, the refresh should not impact or block online
> users who are querying the same materialized view while it is happening.
>
>
>


Re: pgstattuple - can it/does it provide stats by partition?

2024-06-15 Thread Ron Johnson
On Sat, Jun 15, 2024 at 9:47 AM Dragam  wrote:

> Hi all,
>
> As the title says - does (and/or can) pgstattuple provide a breakdown by
> partition?
> 
>

Partitions are tables, created via CREATE TABLE.  I know from experience
that it works on inheritance tables; what happens when you try
pgstattuple() on a small test partition?
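
For what it's worth, since partitions are ordinary tables, something along
these lines should give a per-partition breakdown (a sketch; the parent
table name is hypothetical, and pgstattuple must be installed first):

CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- One pgstattuple() row per partition of a hypothetical partitioned table.
SELECT c.relname AS partition_name, s.*
FROM   pg_inherits i
JOIN   pg_class c ON c.oid = i.inhrelid
CROSS  JOIN LATERAL pgstattuple(c.oid::regclass) s
WHERE  i.inhparent = 'my_partitioned_table'::regclass;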


Re: Design for dashboard query

2024-06-15 Thread yudhi s
>
>
> On Sat, 15 Jun 2024 at 6:54 PM, sud  wrote:
>
>> Hello All,
>>
>> It's Postgres version 15.4. We have a requirement to display aggregated
>> information for all users on a UI screen. Essentially, the query would scan
>> the full table data (billions of rows spanning many months), join it with
>> other master tables, aggregate the results, and then display them based on
>> an input "user id" filter.
>>
>> In such a scenario we are thinking of using a materialized view on top of
>> the base tables that stores the base information and is refreshed
>> periodically, so the data can be served for the input user id. However,
>> Postgres does not support incremental refresh of materialized views, and a
>> full refresh can take a long time. So, do we have any other option
>> available? Additionally, the refresh should not impact or block online
>> users who are querying the same materialized view while it is happening.
>>
>>
>>
Yes, incremental refresh is not available by default, but you can do the
refresh using the CONCURRENTLY keyword, which I believe runs online and
won't block your ongoing queries, if any.
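
A minimal sketch of that, using a hypothetical materialized view name; note
that REFRESH ... CONCURRENTLY requires a unique index on the materialized
view:

-- A unique index is a prerequisite for the concurrent refresh.
CREATE UNIQUE INDEX ON user_dashboard_mv (user_id, txn_month);

-- Rebuild the view's contents without blocking readers of the old data.
REFRESH MATERIALIZED VIEW CONCURRENTLY user_dashboard_mv;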

There are also other approaches, like creating another materialized view
(say mv_new) with exactly the same definition, refreshing it at whatever
interval you want, and then swapping it with the original using rename
commands:


-- Swap the freshly refreshed copy in under the original name.
ALTER MATERIALIZED VIEW mv RENAME TO mv_old;
ALTER MATERIALIZED VIEW mv_new RENAME TO mv;
-- Drop the stale copy once nothing references it any more.
DROP MATERIALIZED VIEW mv_old;

Or

have a log table (say transaction_log) created and populated via triggers
with each delta insert/update/delete on the transaction table. Then schedule
a cron job that periodically folds the rows from the transaction_log table
into a regular summary table (a plain materialized view cannot be modified
with DML, so the target needs to be an ordinary table). This achieves your
incremental refresh.
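
A rough sketch of that trigger-based delta capture, with hypothetical table
and column names (the periodic flush into a regular summary table would then
be run by cron or pg_cron):

-- Delta log populated by triggers on the transaction table.
CREATE TABLE transaction_log (
    log_id    bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    op        text        NOT NULL,   -- 'INSERT', 'UPDATE' or 'DELETE'
    user_id   bigint      NOT NULL,
    amount    numeric,
    logged_at timestamptz NOT NULL DEFAULT now()
);

CREATE FUNCTION log_transaction_delta() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        INSERT INTO transaction_log (op, user_id, amount)
        VALUES (TG_OP, OLD.user_id, OLD.amount);
    ELSE
        INSERT INTO transaction_log (op, user_id, amount)
        VALUES (TG_OP, NEW.user_id, NEW.amount);
    END IF;
    RETURN NULL;   -- AFTER row trigger; the return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_log_transaction_delta
AFTER INSERT OR UPDATE OR DELETE ON transactions
FOR EACH ROW EXECUTE FUNCTION log_transaction_delta();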


Is a VACUUM or ANALYZE necessary after logical replication?

2024-06-15 Thread Koen De Groote
I've gone over all of
https://www.postgresql.org/docs/current/logical-replication.html and the
only mentions of the word "index" I could find were in relation to replica
identity and in examples of table definitions showing primary key indexes.

Nothing else is said about indexes. Maybe for good reason; maybe they are
fully functional immediately after replication?

So the main question: once a table is fully replicated, do I need to
VACUUM (ANALYZE) that table, or are the indexes on that table already
functional?

Regards,
Koen De Groote


Re: Is a VACUUM or ANALYZE necessary after logical replication?

2024-06-15 Thread Adrian Klaver

On 6/15/24 15:55, Koen De Groote wrote:
I've gone over all of
https://www.postgresql.org/docs/current/logical-replication.html and the
only mentions of the word "index" I could find were in relation to replica
identity and in examples of table definitions showing primary key indexes.

Nothing else is said about indexes. Maybe for good reason; maybe they are
fully functional immediately after replication?

So the main question: once a table is fully replicated, do I need to
VACUUM (ANALYZE) that table, or are the indexes on that table already
functional?


VACUUM/ANALYZE is not about making the indexes functional. VACUUM marks the
space occupied by dead tuples in the table and its associated indexes as
available for reuse. ANALYZE updates table statistics to help the planner
decide which query plan to use. On a fresh table VACUUM will not be of much
value; ANALYZE, though, will help by creating up-to-date table statistics.
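
In practice that boils down to something like this on the subscriber once
the initial sync has finished (the table name is hypothetical):

-- Refresh planner statistics for the newly replicated table.
ANALYZE my_replicated_table;

-- Confirm when statistics were last gathered.
SELECT relname, last_analyze, last_autoanalyze
FROM   pg_stat_user_tables
WHERE  relname = 'my_replicated_table';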






--
Adrian Klaver
adrian.kla...@aklaver.com





Re: DROP COLLATION vs pg_collation question

2024-06-15 Thread Laurenz Albe
On Fri, 2024-06-14 at 22:08 +0200, Karsten Hilbert wrote:
> Are collations per-database or per-cluster objects ?

Each database has its own "pg_collation" catalog table.

So they are local to the database, but the collations themselves
are defined by an external library, so the implementation is shared.

> I am asking because I seem to be unable to
> 
> 3) update collation version information in pg_collations for
>    collations intended for an encoding different from the
>    database encoding (ALTER COLLATION ... REFRESH VERSION fails)
> 
> which in effect would mean that -- upon change of collation
> versions in the underlying operating system (ICU update, libc
> update) -- one would have to live with outdated version
> information in pg_collations short of dump/sed/restore or
> some such ?

That should not happen.  What error do you get when you run

  ALTER COLLATION ... REFRESH VERSION

Does the following give you the same error?

  ALTER DATABASE ... REFRESH COLLATION VERSION
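
As a point of reference, a query along these lines (a sketch) shows which
libc/ICU collations have a recorded version that differs from what the
provider library currently reports:

SELECT collname,
       collversion                       AS recorded_version,
       pg_collation_actual_version(oid)  AS actual_version
FROM   pg_collation
WHERE  collprovider IN ('c', 'i')   -- libc and ICU collations
  AND  collversion IS DISTINCT FROM pg_collation_actual_version(oid);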

Yours,
Laurenz Albe




Re: Is a VACUUM or ANALYZE necessary after logical replication?

2024-06-15 Thread Achilleas Mantzios

On 16/6/24 02:13, Adrian Klaver wrote:

On 6/15/24 15:55, Koen De Groote wrote:
I've gone over all of
https://www.postgresql.org/docs/current/logical-replication.html and the
only mentions of the word "index" I could find were in relation to replica
identity and in examples of table definitions showing primary key indexes.

Nothing else is said about indexes. Maybe for good reason; maybe they are
fully functional immediately after replication?

So the main question: once a table is fully replicated, do I need to
VACUUM (ANALYZE) that table, or are the indexes on that table already
functional?


VACUUM/ANALYZE is not about making the indexes functional. VACUUM marks the
space occupied by dead tuples in the table and its associated indexes as
available for reuse. ANALYZE updates table statistics to help the planner
decide which query plan to use. On a fresh table VACUUM will not be of much
value; ANALYZE, though, will help by creating up-to-date table statistics.


Hi Adrian. However, in cases where the replication is problematic due to,
e.g.:

- a wrong encoding on the new system, e.g. going from SQL_ASCII to UTF-8

- ALWAYS triggers written without full schema qualification, or other
problems throwing ERRORs

- server restarts during the sync phase

- etc.

those will produce rollbacks, and hence bloat, right from the start. In
those cases either VACUUM FULL will be needed or, even better, correct the
errors at their source and repeat the whole process. Normally the new DB
(subscriber) should be a little smaller than the original (publisher).
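
For a rough indication of that kind of bloat on the subscriber, one can look
at the dead-tuple counters (a sketch; pgstattuple gives exact numbers):

SELECT relname, n_live_tup, n_dead_tup
FROM   pg_stat_user_tables
ORDER  BY n_dead_tup DESC
LIMIT  20;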


As you said, in any case ANALYZE will always be needed afterwards, as well
as taking care of sequences. IMHO it would be nice if the sequence part were
handled in a more elegant manner, e.g. by an option in pg_dump to dump only
sequences; for "nearly zero" downtime upgrades, this step should happen
rapidly at the switchover.
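
Until then, one workaround (a sketch; run on the publisher and replay the
generated statements on the subscriber at switchover) is:

SELECT format('SELECT setval(%L, %s, true);',
              quote_ident(schemaname) || '.' || quote_ident(sequencename),
              last_value)
FROM   pg_sequences
WHERE  last_value IS NOT NULL;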









--
Achilleas Mantzios
 IT DEV - HEAD
 IT DEPT
 Dynacom Tankers Mgmt (as agents only)