Hi everyone.
We are currently in the process of upgrading from PostgreSQL 12.15 with TimescaleDB
2.11 to PostgreSQL 17.2 with TimescaleDB 2.17.2.
Our pre-prod environment has already been upgraded, and we noticed a severe
performance regression compared to the not-yet-upgraded production environment.
Running the exact same query on pre-prod, the execution time increased from
~250 ms to over 377,000 ms.
Changing random_page_cost from 1.1 to 4 does not help, but raising
work_mem from 64 MB to 256 MB does; any value below 256 MB does not help.
The prod server is configured with work_mem = 50 MB and executes the query
in 13 ms.
The hypertable has been created with received_time as the time dimension.
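For context, the hypertable setup corresponds roughly to the following (a
sketch with the anonymized names; the real DDL has more options, e.g. the
chunk interval):

    -- Sketch only: received_time is the time dimension of the hypertable
    SELECT create_hypertable('xx.xx_tick_data_entity', 'received_time');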
In general, we see that if we narrow the filter window on
tick.received_time to ~14 days, the query is performant on pre-prod
(presumably because it then only touches uncompressed chunks, given the
pre-prod compress_after of 14 days). On prod, neither increasing nor
decreasing this window significantly changes the execution time.
My first assumption is that accessing the compressed chunks is what eats up
all the time (which we could also see in the EXPLAIN output below). I tend
to point at the compression methods, which, IIRC, changed significantly
from 12 to 17.
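For what it's worth, which chunks of the queried range are compressed can be
checked via the information views, e.g. (sketch, using the anonymized table name):

    -- Sketch: list compression state of the chunks covering the queried time range
    SELECT chunk_name, range_start, range_end, is_compressed
    FROM timescaledb_information.chunks
    WHERE hypertable_name = 'xx_tick_data_entity'
      AND range_end > '2025-01-01 14:31:36'
      AND range_start <= '2025-01-29 14:31:36'
    ORDER BY range_start;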
Maybe someone could have a look into this and point me to the right spot
for further investigation, or even toward solving the issue?
Best,
Tobias
The query is as follows (I had to anonymize certain things, hence the xx
placeholders):
select ins.xx_id,
tick.exchtim,
tick.received_time,
tick.ask,
tick.ask_spread,
tick.ask_zsprd,
tick.bid,
tick.bid_spread,
tick.bid_zsprd,
tick.oas,
ins.currency
from xx.xx_tick_data_entity tick,
xx.xx_reference_data_entity ins
where ins.xx_id = 'XS2991917530' and
tick.xx_id='XS2991917530' and
tick.received_time <= '2025-01-29T14:31:36' and
tick.received_time > '2025-01-01T14:31:36'
order by tick.received_time desc limit 1;
The following policies are set on the hypertable (equivalent SQL calls are sketched after the list):
pre-prod:
- policy_compression => compress_after 14 days
- policy_retention => drop_after 90 days
prod:
- policy_compression => compress_after 90 days
- policy_retention => drop_after 10 years
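In SQL terms these correspond roughly to the following policy calls (a
sketch; only the intervals differ between the environments):

    -- pre-prod (sketch of the equivalent policy calls):
    SELECT add_compression_policy('xx.xx_tick_data_entity', compress_after => INTERVAL '14 days');
    SELECT add_retention_policy('xx.xx_tick_data_entity', drop_after => INTERVAL '90 days');
    -- prod:
    SELECT add_compression_policy('xx.xx_tick_data_entity', compress_after => INTERVAL '90 days');
    SELECT add_retention_policy('xx.xx_tick_data_entity', drop_after => INTERVAL '10 years');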
Config for the pre-prod env:
Server:
- VM running RHEL 9.5
- 8 cores
- 32 GB RAM
- 3 TB LVM volume with XFS (mounted as PGDATA), attached via 4x100 GBit/s
NFS to the VMware server
postgresql.conf:
- max_connections = 100
- shared_buffers = 8 GB
- work_mem = 64 MB
- maintenance_work_mem = 2 GB
- effective_cache_size = 22 GB
- archive_mode = on
- random_page_cost = 1.1
- effective_io_concurrency = 200
- default_statistics_target = 200
- max_worker_processes = 16
- max_parallel_workers = 16
- max_parallel_workers_per_gather = 4
- timescaledb.max_background_workers = 25
Config for the prod env:
Server:
- VM running RHEL 9.4
- 16 cores
- 128 GB RAM
- 9 TB LVM volume with XFS (mounted as PGDATA), attached via 4x100 GBit/s
NFS to the VMware server
postgresql.conf:
- max_connections = 250
- shared_buffers = 24 GB
- work_mem = 50 MB
- maintenance_work_mem = 2 GB
- effective_cache_size = 68 GB
- archive_mode = on
- random_page_cost = 1.1
- effective_io_concurrency = 200
- default_statistics_target = 200
- max_worker_processes = 32
- max_parallel_workers = 32
- max_parallel_workers_per_gather = 8
- timescaledb.max_background_workers = 16
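For reference, these values can be cross-checked on a running instance with
something like:

    -- Sketch: read the effective values of the settings listed above
    SELECT name, setting, unit
    FROM pg_settings
    WHERE name IN ('shared_buffers', 'work_mem', 'effective_cache_size',
                   'random_page_cost', 'max_parallel_workers_per_gather',
                   'timescaledb.max_background_workers');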
Explain (analyze, buffers) on the pre-prod server with work_mem = 64 MB:
Limit  (cost=0.85..3.41 rows=1 width=89) (actual time=377174.493..377174.498 rows=0 loops=1)
  Buffers: shared hit=12 read=12328808
  ->  Nested Loop  (cost=0.85..13555863.74 rows=5293428 width=89) (actual time=377174.492..377174.497 rows=0 loops=1)
        Buffers: shared hit=12 read=12328808
        ->  Custom Scan (ChunkAppend) on xx_tick_data_entity tick  (cost=0.56..13489693.39 rows=5293428 width=72) (actual time=377174.491..377174.495 rows=0 loops=1)
              Order: tick.received_time DESC
              Buffers: shared hit=12 read=12328808
              ->  Index Scan Backward using "101_28_xx_tick_data_entity_pkey" on _hyper_2_101_chunk tick_1  (cost=0.56..3159.73 rows=2789 width=72) (actual time=0.009..0.009 rows=0 loops=1)
                    Index Cond: (((xx_id)::text = 'XS2991917530'::text) AND (received_time <= '2025-01-29 14:31:36'::timestamp without time zone) AND (received_time > '2025-01-01 14:31:36'::timestamp without time zone))
                    Buffers: shared hit=4
              ->  Index Scan Backward using "89_25_xx_tick_data_entity_pkey" on _hyper_2_89_chunk tick_2  (cost=0.56..1856.80 rows=1639 width=72) (actual time=0.016..0.016 rows=0 loops=1)
                    Index Cond: (((xx_id)::text = 'XS2991917530'::text) AND (received_time <= '2025-01