Columnar Format Export in Postgres

2024-06-08 Thread Sushrut Shivaswamy
Hey Postgres team,

I have been working on adding support for columnar format export to
Postgres to speed up analytics queries.
I've created an extension that achieves this functionality here
.

I"m looking to improve the performance of this extension to enable drop-in
analytics support for Postgres. Some immediate improvements I have in mind
are:
 - Reduce memory consumption when exporting table data to columnar format
 - Create a native planner / execution hook that can read columnar data.

It would be very helpful if you could take a look and suggest improvements
to the extension.
Hopefully, this extension can be shipped by default with postgres at some
point in the future.

Thanks,
Sushrut


Re: Design for dashboard query

2024-06-15 Thread Sushrut Shivaswamy
Have you tried creating an index on the user ID column?
Scanning the entire table to apply granular filters on a few attributes
seems unnecessary.

Materialised views make sense if you want to aggregate some columns and
query a subset of the data but would recommend trying indexes first.

Finally, shameless plug but consider using the pg_analytica extension that
enables fast analytic queries on the tables which is ideal for analytics
use cases like dashboards.
https://github.com/sushrut141/pg_analytica
I’m the author of the extension and am looking for initial users to try it
out.

Thanks,
Sushrut


On Sat, 15 Jun 2024 at 6:54 PM, sud  wrote:

> Hello All,
>
> Its postgres version 15.4. We are having a requirement in which aggregated
> information for all the users has to be displayed on the UI screen. It
> should show that information on the screen. So basically, it would be
> scanning the full table data which is billions of rows across many months
> and then join with other master tables and aggregate those and then display
> the results based on the input "user id" filter.
>
> In such a scenario we are thinking of using a materialized view on top of
> the base tables which will store the base information and refresh those
> periodically to show the data based on the input user id. However i am
> seeing , postgres not supporting incremental refresh of materialized view
> and full refresh can take longer. So , do we have any other option
> available? Additionally , It should not impact or block the online users
> querying the same materialized view when the refresh is happening.
>
>
>