REINDEXdb performance degrading gradually PG13.4

2022-05-31 Thread Praneel Devisetty
> Hi,
>
> We are trying to reindex 600k tables in a single database  of size 2.7TB
> using reindexdb utility in a shell script
> reindexdb -v -d $dbname -h $hostname -U tkcsowner --concurrently -j
> $parallel -S $schema
>
> our config is as below
>   name  | setting
> +-
>  auto_explain.log_buffers   | off
>  autovacuum_work_mem| 524288
>  dbms_pipe.total_message_buffer | 30
>  dynamic_shared_memory_type | posix
>  hash_mem_multiplier| 1
>  logical_decoding_work_mem  | 65536
>  maintenance_work_mem   | 2097152
>  shared_buffers | 4194304
>  shared_memory_type | mmap
>  temp_buffers   | 1024
>  wal_buffers| 2048
>  work_mem   | 16384
>
> Memory:
>  free -h
>   totalusedfree  shared  buff/cache
>  available
> Mem:   125G 38G1.1G 93M 85G
>  86G
> Swap:   74G188M 74G
>
>  nproc
> 16
>
> Initially it was processing 1000 tables per minute. Performance is
> gradually dropping and now after 24 hr it was processing 90 tables per
> minute.
>
> we see stats collector in top -c continuously active
>   PID USER  PR  NIVIRTRESSHR S  %CPU %MEM TIME+ COMMAND
>
>  3730 **  20   0  520928 233844   1244 R  61.8  0.2 650:31.36
> postgres: stats collector
>
>
> postgres=# SELECT date_trunc('second', current_timestamp -
> pg_postmaster_start_time()) as uptime;
>  uptime
> 
>  1 day 04:07:18
>
> top - 13:08:22 up 1 day,  5:45,  2 users,  load average: 1.65, 1.65, 1.56
> Tasks: 303 total,   3 running, 300 sleeping,   0 stopped,   0 zombie
> %Cpu(s):  9.6 us,  3.4 sy,  0.0 ni, 86.8 id,  0.1 wa,  0.0 hi,  0.0 si,
> 0.0 st
> KiB Mem : 13185940+total,   992560 free, 40571300 used, 90295552 buff/cache
> KiB Swap: 78643200 total, 78450376 free,   192820 used. 90327376 avail Mem
>
> iostat -mxy 5
> Linux 3.10.0-1160.53.1.el7.x86_64
> (***) 05/31/2022  _x86_64_
> (16 CPU)
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>8.220.003.230.060.00   88.49
>
> Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s avgrq-sz
> avgqu-sz   await r_await w_await  svctm  %util
> sda   0.00 0.000.000.60 0.00 0.00
> 16.00 0.002.670.002.67   3.33   0.20
> sdb   0.00 0.000.000.00 0.00 0.00
>  0.00 0.000.000.000.00   0.00   0.00
> sdc   0.00 0.000.00   26.80 0.00 0.16
> 11.94 0.010.370.000.37   0.69   1.86
> sde   0.00 0.003.80   26.80 0.04 0.43
> 31.27 0.030.960.631.01   0.40   1.22
>
> DB version
> PostgreSQL 13.4
>
> Os
> bash-4.2$ cat /etc/redhat-release
> CentOS Linux release 7.9.2009 (Core)
>
>  What could be the possible bottleneck ?
>
> Best Regards
> Praneel
>
>
>


REINDEXdb performance degrading gradually PG13.4

2022-05-31 Thread David G. Johnston
On Tuesday, May 31, 2022, Praneel Devisetty 
wrote:

>
> Initially it was processing 1000 tables per minute. Performance is
>> gradually dropping and now after 24 hr it was processing 90 tables per
>> minute.
>>
>
That seems like a fairly problematic metric given the general vast
disparities in size tables have.

Building indexes is so IO heavy that the non-IO bottlenecks that exists
likely have minimal impact on the overall times this rebuild everything
will take.  That said, I’ve never done anything at this scale before.  I
wouldn’t be too surprised if per-session cache effects are coming into play
given the number of objects involved and the assumption that each session
used for parallelism is persistent.  I’m not sure how the parallelism works
for managing the work queue though as it isn’t documented and I haven’t
inspected the source code.


Logical reads

2022-05-31 Thread Goti
Hi listers,

Is there any sql query which we can use to find the logical reads performed
by particular sql statement in postgres ??

Thanks,
Goti
-- 
Thanks,

Goti


Re: REINDEXdb performance degrading gradually PG13.4

2022-05-31 Thread Michael Paquier
On Tue, May 31, 2022 at 08:42:06AM -0700, David G. Johnston wrote:
> Building indexes is so IO heavy that the non-IO bottlenecks that exists
> likely have minimal impact on the overall times this rebuild everything
> will take.  That said, I’ve never done anything at this scale before.  I
> wouldn’t be too surprised if per-session cache effects are coming into play
> given the number of objects involved and the assumption that each session
> used for parallelism is persistent.  I’m not sure how the parallelism works
> for managing the work queue though as it isn’t documented and I haven’t
> inspected the source code.

get_parallel_object_list() in reindexdb.c would give the idea, where
the list of tables to rebuild are ordered based on an "ORDER BY
c.relpages DESC", then the table queue is processed with its own
command, moving on to the next item once we are done with an item in
the list.
--
Michael


signature.asc
Description: PGP signature