High COMMIT times
Good morning, This week we've noticed that we're starting to see spikes where COMMITs are taking much longer than usual. Sometimes, quite a few seconds to finish. After a few minutes they disappear but then return seemingly at random. This becomes visible to the app and end user as a big stall in activity. The checkpoints are still running for their full 5 min checkpoint_timeout duration (logs all say "checkpoint starting: time" and I'm not seeing any warnings about them occurring too frequently. This is PostgreSQL 12.4 on Ubuntu 18.04, all running in MS Azure (*not* managed by them). # select version(); version - PostgreSQL 12.4 (Ubuntu 12.4-1.pgdg18.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit I have the stats_temp_directory in a tmpfs mount. I *do* have pg_wal on the same premium SSD storage volume as the data directory. Normally I would know to separate these but I was told with the cloud storage that it's all virtualized anyway, plus storage IOPS are determined by disk size so having a smaller volume just for pg_wal would hurt me in this case. The kind folks in the PG community Slack suggested just having one large premium cloud storage mount for the data directory and leave pg_wal inside because this virtualization removes any guarantee of true separation. I'm wondering if others have experience running self-managed PG in a cloud setting (especially if in MS Azure) and what they might have seen/done in cases like this. Thanks, Don. -- Don Seiler www.seiler.us
Re: High COMMIT times
> I have the stats_temp_directory in a tmpfs mount. I *do* have pg_wal on > the same premium SSD storage volume as the data directory. Normally I would > know to separate these but I was told with the cloud storage that it's all > virtualized anyway, plus storage IOPS are determined by disk size so having > a smaller volume just for pg_wal would hurt me in this case. The kind folks > in the PG community Slack suggested just having one large premium cloud > storage mount for > the data directory and leave pg_wal inside because this virtualization > removes any guarantee of true separation. > It is true that the IO is virtualized but that does not mean that separate volumes won't help. In cloud storage you are granted specific IOPS/MB/s per volume. Separating pg_wal to a new volume mount will take pressure off of page writes and allow the wal to write within its own prioritization. JD
Re: High COMMIT times
On Wed, Jan 6, 2021 at 10:51 AM Joshua Drake wrote: > I have the stats_temp_directory in a tmpfs mount. I *do* have pg_wal on >> the same premium SSD storage volume as the data directory. Normally I would >> know to separate these but I was told with the cloud storage that it's all >> virtualized anyway, plus storage IOPS are determined by disk size so having >> a smaller volume just for pg_wal would hurt me in this case. The kind folks >> in the PG community Slack suggested just having one large premium cloud >> storage mount for >> > the data directory and leave pg_wal inside because this virtualization >> removes any guarantee of true separation. >> > > It is true that the IO is virtualized but that does not mean that separate > volumes won't help. In cloud storage you are granted specific IOPS/MB/s per > volume. Separating pg_wal to a new volume mount will take pressure off of > page writes and allow the wal to write within its own prioritization. > Looking at the Azure portal metric, we are nowhere close to the advertised maximum IOPS or MB/s throughput (under half of the maximum IOPS and under a quarter of the MB/s maximum). So there must be some other bottleneck in play. The IOPS limit on this VM size is even higher so that shouldn't be it. If I were to size a separate volume for just WAL, I would think 64GB but obviously the Azure storage IOPS are much less. On this particular DB host, we're currently on a 2.0T P40 disk that is supposed to give 7500 IOPS and 250MB/s [1] (but again, Azure's own usage graphs show us nowhere near those limits). A smaller volume like 64GB would be provisioned at 240 IOPS in this example. Doesn't seem like a lot. Really until you get a terabyte it seems like a real drop-off as far as Azure storage goes. I'd be interested to hear what others might have configured on their write-heavy cloud databases. [1] https://azure.microsoft.com/en-us/pricing/details/managed-disks/ Don. -- Don Seiler www.seiler.us
Re: High COMMIT times
> > Looking at the Azure portal metric, we are nowhere close to the advertised > maximum IOPS or MB/s throughput (under half of the maximum IOPS and under a > quarter of the MB/s maximum). So there must be some other bottleneck in > play. The IOPS limit on this VM size is even higher so that shouldn't be it. > > If I were to size a separate volume for just WAL, I would think 64GB but > obviously the Azure storage IOPS are much less. On this particular DB host, > we're currently on a 2.0T P40 disk that is supposed to give 7500 IOPS and > 250MB/s [1] (but again, Azure's own usage graphs show us nowhere near those > limits). A smaller volume like 64GB would be provisioned at 240 IOPS in > this example. Doesn't seem like a lot. Really until you get a terabyte it > seems like a real drop-off as far as Azure storage goes. > > Based on those metrics, I would start looking at other things. For example, I once (it was years ago) experienced commit delays because the kernel cache on Linux was getting over run. Do you have any metrics on the system as a whole? Perhaps sar running every few minutes will help you identify a correlation? JD > I'd be interested to hear what others might have configured on their > write-heavy cloud databases. > > [1] https://azure.microsoft.com/en-us/pricing/details/managed-disks/ > > Don. > > -- > Don Seiler > www.seiler.us >
Re: High COMMIT times
On Wed, Jan 06, 2021 at 12:06:27PM -0600, Don Seiler wrote: > On Wed, Jan 6, 2021 at 10:51 AM Joshua Drake wrote: > > Looking at the Azure portal metric, we are nowhere close to the advertised > maximum IOPS or MB/s throughput (under half of the maximum IOPS and under a > quarter of the MB/s maximum). So there must be some other bottleneck in > play. The IOPS limit on this VM size is even higher so that shouldn't be it. > Hi Don, I may just be re-stating common knowledge, but the available IOPS would be constrained by how tightly coupled the storage is to the CPU. Even a small increase can limit the maximum IOPS unless you can issue multiple relatively independent queries at one. I know no details of how Azure implements their storage tiers. Regards, Ken
Re: High COMMIT times
Azure VMs do have their own IOPS limits that increase with increasing VM "size". In this current case our VM size puts that VM IOPS limit well above anything the disks are rated at, so it shouldn't be a bottleneck. On Wed, Jan 6, 2021 at 1:15 PM Kenneth Marshall wrote: > On Wed, Jan 06, 2021 at 12:06:27PM -0600, Don Seiler wrote: > > On Wed, Jan 6, 2021 at 10:51 AM Joshua Drake > wrote: > > > > Looking at the Azure portal metric, we are nowhere close to the > advertised > > maximum IOPS or MB/s throughput (under half of the maximum IOPS and > under a > > quarter of the MB/s maximum). So there must be some other bottleneck in > > play. The IOPS limit on this VM size is even higher so that shouldn't be > it. > > > > Hi Don, > > I may just be re-stating common knowledge, but the available IOPS would > be constrained by how tightly coupled the storage is to the CPU. Even a > small increase can limit the maximum IOPS unless you can issue multiple > relatively independent queries at one. I know no details of how Azure > implements their storage tiers. > > Regards, > Ken > -- Don Seiler www.seiler.us
Re: High COMMIT times
On Wed, 2021-01-06 at 10:19 -0600, Don Seiler wrote: > This week we've noticed that we're starting to see spikes where COMMITs are > taking much longer than usual. > Sometimes, quite a few seconds to finish. > > This is PostgreSQL 12.4 on Ubuntu 18.04, all running in MS Azure (*not* > managed by them). Unless you are using WITH HOLD cursors on large result sets, this is very likely I/O overload. Use tools like "sar", "vmstat" and "iostat" to monitor your I/O load. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
