Fsync IO issue

2023-05-04 Thread ProfiVPS Support

Hi there,

 I've been struggling with very high write load on a server.

 We are collecting around 400k values every 5 minutes into a hypertable. 
(We use the timescaledb extension; I also shared this on the Timescale 
forum, but then I realised the issue is PostgreSQL-related.)


 When running iostat I see a constant 7-10 MB/s of writes by postgres, 
and this just doesn't add up for me; I'm fully stuck on it. Even with 
the row overhead it should be around 20 MB per 5 minutes! Even with 
indexes, a constant 7-10 MB/s of writes is inexplicable to me.


  The writes may trigger an update in another table, but not all of 
them do (I use a time filter). Let's say 70% do (which I doubt). 
There we update two timestamps and two ints. This still doesn't add up 
for me. Even if we are talking about 50 MB of records, that should be 
0.16 MB/s at most!
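Spelling out that arithmetic (a quick sketch; the 20 MB and 50 MB figures are the estimates from above, and 8.5 MB/s is just the midpoint of the observed 7-10 MB/s):

```python
# Back-of-envelope check: expected steady-state write rate vs. iostat.
interval_s = 5 * 60                      # one collection interval: 5 minutes

insert_rate = 20 / interval_s            # ~20 MB of rows per interval
update_rate = 50 / interval_s            # generous 50 MB of updated records

expected = insert_rate + update_rate
observed = 8.5                           # midpoint of the observed 7-10 MB/s

print(f"expected: {expected:.2f} MB/s")  # ~0.23 MB/s
print(f"observed: {observed:.1f} MB/s -> ~{observed / expected:.0f}x more")
```

So even with generous assumptions, the observed rate is dozens of times the useful data volume.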


 So I dug in and found it was the WAL, of course, what else.

 I tweaked the config all around and read forums and docs, to no avail. 
The only thing that made the numbers realistic was disabling fsync 
(which I know I must not do, but I did it for the experiment). That 
eased the write load to 0.6 MB/s.


 I also found that a 16 MB WAL segment had 80+ MB written into it 
before being closed. So what's happening here? Does fsync cause the 
whole file to be written out again and again?


 I checked with pg_dump; the content is as expected.

 We are talking about some insane overhead here: two orders of 
magnitude more is being written to the WAL than the actual useful data.


All help is greatly appreciated.

Thanks!

András

 ---
Cheap virtual servers:
http://www.ProfiVPS.hu

Support: [email protected]

Re: Fsync IO issue

2023-05-04 Thread ProfiVPS Support
Oh, sorry, we are using PostgreSQL 13.10 (Debian 13.10-1.pgdg100+1) 
with TimescaleDB 2.5.1 on Debian 10.


On 2023-05-04 19:31, ProfiVPS Support wrote:



Re: Fsync IO issue

2023-05-04 Thread Andres Freund
Hi,

On 2023-05-04 19:31:45 +0200, ProfiVPS Support wrote:
>  We are collecting around 400k values each 5 minutes into a hypertable. (We
> use timescaledb extension, I also shared this on timescale forum but then I
> realised the issue is postgresql related.)

I don't know how timescale does its storage - how did you conclude this is
about postgres, not about timescale? Obviously WAL write patterns depend on
the way records are inserted and flushed.


>  I also found that the 16MB WAL segment got 80+ MB written into it before
> being closed. So what's happening here? Does fsync cause the whole file to
> be written out again and again?

One possible reason for this is that you are committing small transactions
very frequently. When a transaction commits, the commit record needs to be
flushed to disk. If the transactions are small, the next commit might reside
on the same page - which then needs to be written out again. That can of
course increase the write rate considerably.

Your workload does not sound like it actually needs to commit in tiny
transactions? Some larger batching / using longer lived transactions might
help a lot.
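A toy model of that page-rewrite effect (my own sketch, not anything from PostgreSQL's code; the 100-byte record size and the row counts are made-up numbers, only the 8 kB WAL page size is real):

```python
# Toy model: each commit forces the WAL to be flushed up to the page that
# holds its commit record. Full 8 kB pages are written out once; the partial
# page at the flush point gets rewritten by the next commit landing on it.
PAGE = 8192  # WAL page size in PostgreSQL

def physical_writes(n_records, record_size, records_per_commit):
    written = 0
    pos = 0            # logical WAL insert position in bytes
    durable = -1       # index of the last page we never have to rewrite
    for start in range(0, n_records, records_per_commit):
        n = min(records_per_commit, n_records - start)
        pos += n * record_size
        tail = pos // PAGE            # page containing the flush point
        written += (tail - durable) * PAGE
        durable = tail - 1            # the partial tail page stays "hot"
    return written

logical = 100_000 * 100                            # 10 MB of WAL records
per_row = physical_writes(100_000, 100, 1)         # commit every row
batched = physical_writes(100_000, 100, 1_000)     # commit every 1000 rows

print(f"per-row commits: ~{per_row / logical:.0f}x the logical volume")  # ~83x
print(f"batched commits: ~{batched / logical:.1f}x")                     # ~1.1x
```

In this model per-row commits rewrite the same 8 kB page roughly 80 times before it fills, which is the same order of amplification reported in the thread.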

Another possibility is that timescale flushes WAL too frequently for some
reason...

Greetings,

Andres Freund




Re: Fsync IO issue

2023-05-04 Thread ProfiVPS Support

Hi,

 thank you for your response :)

 Yes, that's exactly what's happening, and I understand the issue with 
fsync in these cases. But I see no workaround for this, as the data is 
ingested one by one (sent by collectd) and a DB function handles each 
value (it has to do a lookup and set state in a different table based 
on the incoming value).


 I feel like ANYTHING would be better than this. Even risking losing 
_some_ of the latest data in case of a server crash (if it crashes we 
lose data anyway until restart; of course we could have HA, I know, and 
we will when there's a need).


 Around 100 times the write volume for the WAL compared to the useful 
data! That's insane. This is actually endangering the whole project 
we've been working on for the last 1.5 years, and I'm facing it after 
100k devices have been added for a client. So I'm between a rock and a 
hard place :(


 Yeah, I think this is called "experience", but I must be honest, I was 
not expecting this at all :(


 However, collectd's plugin does have an option to increase the commit 
interval, but that kept the records locked and caused strange issues, 
which is why I disabled it. I have now tried adding that setting back, 
and it does ease the situation somewhat, with write spikes on commit.
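For reference, I believe the knob in question lives in collectd's postgresql plugin configuration and looks roughly like this (from memory, so treat the block layout as an assumption and check collectd.conf(5); the database name is a placeholder):

```
<Plugin postgresql>
  <Database "metrics">        # placeholder database name
    # ... Host, User, Password, Writer, etc. ...
    CommitInterval 30         # batch ~30 s of updates into one transaction
  </Database>
</Plugin>
```

Batching at this layer attacks the same problem as larger transactions on the server side: fewer commit records, fewer redundant WAL page flushes.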


 All in all, thank you for your help. Honestly, after today's journey I 
suspected this was the issue, but didn't want to believe it.


Thanks,

András

On 2023-05-04 21:21, Andres Freund wrote:



Re: Fsync IO issue

2023-05-04 Thread Thomas Munro
On Fri, May 5, 2023 at 8:37 AM ProfiVPS Support  wrote:
>  I feel like ANYTHING would be better than this. Even risking losing _some_ 
> of the latest data in case of a server crash (if it crashes we lose data 
> anyways until restart, ofc we could have HA I know and we will when there'll 
> be a need) .

Try synchronous_commit=off:

https://www.postgresql.org/docs/current/wal-async-commit.html
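For completeness, a minimal sketch of how that might be applied (the role name is a placeholder; synchronous_commit is settable per session and per role, so it can be scoped to just the ingestion path):

```sql
-- cluster-wide, no restart needed (takes effect after a config reload):
ALTER SYSTEM SET synchronous_commit = off;

-- or scope it to the ingesting role only, keeping full durability elsewhere:
ALTER ROLE collectd_writer SET synchronous_commit = off;

-- or just for the current session:
SET synchronous_commit = off;
```

Unlike fsync=off, this only risks losing the last few hundred milliseconds of committed transactions after a crash; it cannot corrupt the cluster.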