Re: Increased iowait and blk_read_time with higher shared_buffers

2022-12-14 Thread Samed YILDIRIM
Hi Jordan,

A bigger shared_buffers setting does not necessarily mean better
performance.

Regarding the correlation between iowait and the size of shared_buffers:
if you don't also increase the system's memory, this is an expected result
in my opinion. PostgreSQL reserves a bigger portion of the system memory,
so the OS cache shrinks correspondingly. A smaller OS cache can easily
result in more disk access, higher I/O demand, and higher iowait.

As you can see in the graphs, when you increase the size of shared_buffers
you see more block hits and fewer block reads. "hits" refers to blocks that
are already in shared_buffers; "reads" refers to blocks that are not in
shared_buffers and are *"read from disk"*. But the *"read from disk"*
counters you see in PostgreSQL's statistics catalogs don't mean all of
those blocks were actually read from the disk. PostgreSQL requests blocks
that are not already in shared_buffers from the kernel, and if a requested
block is in the OS cache, the kernel serves it directly from memory; no
disk access happens at all. On the operating system you observe that as
lower disk I/O and lower iowait.
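
For reference, one way to see that hit/read split from the statistics
catalogs (a sketch; 'yourdb' is a placeholder database name):

    SELECT datname,
           blks_hit,    -- blocks found in shared_buffers
           blks_read,   -- blocks requested from the kernel
                        -- (served from OS cache *or* disk)
           round(blks_hit * 100.0
                 / nullif(blks_hit + blks_read, 0), 2) AS hit_pct
    FROM pg_stat_database
    WHERE datname = 'yourdb';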

When you increase the size of shared_buffers without increasing the
system's memory, and especially if you also decrease effective_cache_size
to match, the planner estimates that a requested block is less likely to be
found in memory than under the previous configuration. It therefore creates
execution plans that use indexes less. Less index usage means more
sequential scans, and more sequential scans mean more disk reads. The OS
cache is already smaller, so the system has to carry out even more disk
accesses.
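
As an illustration of that planner effect (table and column names here are
made up, and SET only affects the current session):

    SET effective_cache_size = '16GB';
    EXPLAIN SELECT * FROM events WHERE account_id = 42;

    SET effective_cache_size = '2GB';
    EXPLAIN SELECT * FROM events WHERE account_id = 42;
    -- With the smaller value the planner assumes fewer pages are cached,
    -- costs index scans higher, and is more likely to pick a sequential
    -- scan.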

As you can see, they are all connected. Setting shared_buffers higher than
a threshold, which varies from database to database, actually decreases
your performance.

To conclude, your results are expected.

A useful resource to read:
https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

> ... given the way PostgreSQL also relies on the operating system cache,
> it's unlikely you'll find using more than 40% of RAM to work better than a
> smaller amount.

Best regards.
Samed YILDIRIM


On Tue, 13 Dec 2022 at 02:29, Jordan Hurwich wrote:

> Hi everyone,
> I'm writing to ask about a correlation I was surprised to observe on our
> PSQL machines (particularly read-only standbys) where increasing
> "shared_buffers" appears to result in
> increased pg_stat_database.blk_read_time and CPU iowait, which in turn
> seems to correlate with reduced throughput for our query-heavy services -
> details below.
>
> Is this expected, or are there configuration changes we might make to
> improve the performance at higher "shared_buffers" values?
>
> Thanks, let me know if I can provide any more info,
> Jordan
>
>   - Tests and results - public Datadog dashboard, screenshot attached:
>   - Our beta system ("endor") was run with three different
>   configurations over the ~30hrs from Dec 11 17:00 to Dec 13 0:00 (UTC)
>   - The only change between these deployments was the
>   "shared_buffers" parameter for all PSQL instances (machine and
>   configuration details below).
>  - "shared_buffers" = "4000MB" - from Dec 10 19:00 to Dec 11
>  20:00 UTC
>  - "shared_buffers" = "8000MB" - from Dec 11 21:00 to Dec 12
>  13:30 UTC
>  - "shared_buffers" = "14000MB" - from Dec 12, 14:30 to Dec 13,
>  0:00 UTC
>   - The Datadog dashboard shows our results, including CPU divided by
>   usage and the cache hit vs disk read ratio including blk_read_time
>   (additional metrics were enabled at about Dec 11, 10am PST)
>  - Our most query-heavy service is our "Trends worker", for which
>  the average worker duration is shown in the top-left graph
> - We expect the workload to be relatively constant throughout
> this period, particularly focusing on the standby instances
> (PSQL2 and PSQL3) where all read-only queries should be sent.
>  - We see the lowest duration, i.e. best performance, most
>  consistently with the lowest setting for shared_buffers, "4000MB"
>  - As we increase shared_buffers we see increased iowait on the
>  standby instances (PSQL2 and PSQL3) and increased blk_read_time
>  (per pg_stat_database), in the bottom-most graphs as "blks_read_time".
> - Even though we also see a higher ratio of cache hits on
> those instances. Our graphs show the per-second change
> in pg_stat_database.blks_read and blks_hit (as "all_hit/s" and
> "all_read/s") and pg_statio_user_tables.heap_blks_read, heap_blks_hit,
> idx_blks_read, and idx_blks_hit

Re: Increased iowait and blk_read_time with higher shared_buffers

2022-12-14 Thread Jordan Hurwich
Thanks for your thoughtful response, Samed.

I'm familiar with the article you linked to, and part of my surprise is
that with these 32GB RAM machines we're seeing better performance at 12.5%
(4GB) than at the commonly recommended 25% (8GB) of system memory for
shared_buffers. Your notes about disk read stats from Postgres potentially
representing blocks read from the OS cache make sense. I had just imagined
that Postgres would manage memory dedicated to it via shared_buffers better
than the OS would (obviously with some point of diminishing returns), and
I'm still hoping there's some Postgres configuration change we can make
that enables better performance through improved utilization of
shared_buffers at the commonly recommended 25% of system memory.

You mentioned effective_cache_size, which we currently have set to 16GB
(50% of system memory). Is it worth experimenting with that value? If so,
would you recommend trying to reduce it or increase it? Are there other
settings we might consider to improve the utilization of shared_buffers at
higher values like 8GB (25% of system memory)?

Re: Increased iowait and blk_read_time with higher shared_buffers

2022-12-14 Thread Tom Lane
Jordan Hurwich  writes:
> I'm familiar with the article you linked to, and part of my surprise is
> that with these 32GB RAM machines we're seeing better performance at 12.5%
> (4GB) than at the commonly recommended 25% (8GB) of system memory for
> shared_buffers. Your notes about disk read stats from Postgres potentially
> representing blocks read from the OS cache make sense. I had just imagined
> that Postgres would manage memory dedicated to it via shared_buffers better
> than the OS would (obviously with some point of diminishing returns), and
> I'm still hoping there's some Postgres configuration change we can make
> that enables better performance through improved utilization of
> shared_buffers at the commonly recommended 25% of system memory.

Keep in mind that 25% was never some kind of golden number.  It is
a rough rule of thumb that was invented for far smaller machines than
what you're talking about here.

regards, tom lane




Re: Increased iowait and blk_read_time with higher shared_buffers

2022-12-14 Thread Jordan Hurwich
Thanks Tom, that makes a lot of sense. Given we're seeing low iowait and
blk_read_time at 4GB shared_buffers, sounds like we should just declare
victory here and be happy with that setting?



Re: Increased iowait and blk_read_time with higher shared_buffers

2022-12-14 Thread Samed YILDIRIM
Hello Jordan,

You don't have to set 25% to get the best performance. You need to test
different values for your database. If I were you, I would:

   - try enabling huge pages. You will probably see better performance
   with bigger shared_buffers once huge pages are configured. ->
   https://www.postgresql.org/docs/14/kernel-resources.html#LINUX-HUGE-PAGES
   - set effective_io_concurrency to 200. But you need to test to find
   the best value; it depends heavily on your disks'
   characteristics/configuration.
   - set random_page_cost to 2 and try decreasing it gradually toward 1.2.
   - set effective_cache_size to 24GB.
   - run pg_test_timing on the server to see the cost of asking the
   system for the time, since track_io_timing is enabled in your
   configuration file. If it turns out to be expensive, I would disable
   track_io_timing.
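
Putting those together, a consolidated sketch (values are starting points
to benchmark on a dedicated 32GB server, not drop-in recommendations):

    ALTER SYSTEM SET huge_pages = 'try';              -- requires a restart
    ALTER SYSTEM SET effective_io_concurrency = 200;  -- tune per your disks
    ALTER SYSTEM SET random_page_cost = 2.0;          -- step toward 1.2
    ALTER SYSTEM SET effective_cache_size = '24GB';
    SELECT pg_reload_conf();  -- reloads everything except huge_pages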


Note that I have assumed these servers are dedicated to PostgreSQL and
that no other service is running on them.

Best regards.
Samed YILDIRIM



Re: Increased iowait and blk_read_time with higher shared_buffers

2022-12-14 Thread Jordan Hurwich
Awesome, this is really helpful, Samed. I'll start experimenting with these
settings next. Really appreciate your guidance.


creating hash indexes

2022-12-14 Thread Rick Otten
I inherited a database with several single-digit-billion row tables. Those
tables have a varchar(36) column populated with uuids (all connected to
each other via FKs), each currently supported by a btree index.

After the recent conversations about hash indexes I thought I'd do some
comparisons to see if using a hash index could help, and perhaps
deprioritize my burning desire to change the data type. We never look up
uuids with inequalities, after all. Indeed, in my test environments the
hash index was half the size of the btree index, and select performance
was slightly faster than with btree lookups. varchar(36) with a hash index
was roughly comparable to using a uuid data type (btree or hash index).
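
Roughly the kind of comparison described above, for anyone curious (a
sketch; table and column names are made up):

    -- hash indexes support only equality lookups, which is all we do:
    CREATE INDEX things_ref_btree ON things USING btree (ref_uuid);
    CREATE INDEX things_ref_hash  ON things USING hash  (ref_uuid);

    SELECT *
    FROM things
    WHERE ref_uuid = '0f8fad5b-d9cb-469f-a165-70867728950e';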

I was pretty excited until I tried to create the index on a table with the
data (instead of creating it ahead of time and then loading up the test
data).

Working in PG 14.5, on a tiny 9M row table, in an idle database, I found:
- creating the btree index on the varchar(36) column to consistently take
7 *seconds*
- creating the hash index on the varchar(36) to consistently take 1 *hour*

I was surprised at how dramatically slower it was. I tried this on both
partitioned and non-partitioned tables (with the same data set), and in
both cases the timings came out similar.

I also tried creating a hash index on a varchar(100) column, also with 9M
rows. I gave up when it had not completed after several hours. (It wasn't
locked, just slow.)

While I was experimenting with the different index types, I did some
insert tests. After putting the hash index on the column, the inserts were
significantly slower. The btree index was *6-7x* slower than no index, and
the hash index was *100x* slower than no index.
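
For anyone who wants to reproduce that kind of insert comparison, a
minimal sketch (hypothetical table names; gen_random_uuid() is in core as
of PostgreSQL 13, and \timing is a psql meta-command):

    \timing on
    -- three identically-shaped tables, differing only in index type:
    INSERT INTO t_noindex SELECT gen_random_uuid()::text
      FROM generate_series(1, 100000);
    INSERT INTO t_btree   SELECT gen_random_uuid()::text
      FROM generate_series(1, 100000);
    INSERT INTO t_hash    SELECT gen_random_uuid()::text
      FROM generate_series(1, 100000);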

Assuming I can live with the slower inserts, is there any parameter in
particular I can tweak that would make the time it takes to create the hash
index closer to the btree index creation time?  In particular if I wanted
to try this on a several billion row table in a busy database?

---

FWIW, from my tests on my laptop, on a 250M row table last weekend, after
100K selects:

MEAN (ms)  | btree    | hash
---------- | -------- | --------
varchar    | 28.14916 | 27.03769
uuid       | 27.04855 | 27.64424

and the sizes

SIZE    | btree   | hash
------- | ------- | -------
varchar | 12 GB   | 6212 MB
uuid    | 6595 MB | 6212 MB

- As long as the index fits in memory, a varchar btree isn't really that
much slower in PostgreSQL 14 (the way it was a few years ago), so we'll
probably just live with that for the foreseeable future given the
complexity of changing things at the moment.

--
Rick


Re: creating hash indexes

2022-12-14 Thread Peter Geoghegan
On Wed, Dec 14, 2022 at 12:03 PM Rick Otten  wrote:
> Assuming I can live with the slower inserts, is there any parameter in 
> particular I can tweak that would make the time it takes to create the hash 
> index closer to the btree index creation time?  In particular if I wanted to 
> try this on a several billion row table in a busy database?

No. B-Tree index builds are parallelized, and are far better optimized
in general.

> - As long as the index fits in memory, a varchar btree isn't really that
> much slower in PostgreSQL 14 (the way it was a few years ago), so we'll
> probably just live with that for the foreseeable future given the
> complexity of changing things at the moment.

The other things to consider are 1) the index size after retail
inserts, and 2) the index size following some number of updates and
deletes.

Even if you just had plain inserts for your production workload, the
picture will not match your test case (which I gather just looked at
the index size after a CREATE INDEX ran). I think that B-Tree indexes
will still come out ahead if you take this growth into account, and by
quite a bit, but probably not due to any effect that your existing test case
exercises.

B-Tree indexes are good at accommodating unpredictable growth, without
ever getting terrible performance on any metric of interest. So it's
not just that they tend to have better performance on average than
hash indexes (though they do); it's that they have much more
*predictable* performance characteristics as conditions change.

--
Peter Geoghegan