Sudden insert performance degradation

2020-07-13 Thread Henrique Montenegro
Hello list,

I am having issues with insert performance in Postgres and would like to
ask for help figuring out the problem, as I have run out of ideas.

I have a process that generates a CSV file with 1 million records in it
every 5 minutes, and each file is about 240MB. I need this data to be
inserted into a table in Postgres. A peculiarity is that the files may
contain duplicates: a row in the first file might also be present in the
second or the third, and so on. I don't care about the duplicates, so I
have a unique constraint on my table to discard them.

The data in the CSV is pretty simple:

```
user_id,name,url
```

The table is defined like this:

```
create unlogged table users_no_dups (
    created_ts timestamp without time zone,
    user_id bigint not null,
    name text,
    url text,
    unique(user_id)
);
```

The table is created as `unlogged` to improve performance. I am aware that
this can cause data loss.

My process for inserting data into the table is as follows:

* Create an intermediary table `users` as follows:

```
create unlogged table users (
    created_ts timestamp without time zone default current_timestamp,
    user_id bigint,
    name text,
    url text
) with (autovacuum_enabled = false, toast.autovacuum_enabled = false);
```

* Use `COPY` to load the data from the CSV file into the intermediary table:

```
copy users(user_id, name, url)
from 'myfile.csv'
with (format csv, header true, delimiter ',', quote '"', escape '\\');
```

* Insert the data from the `users` table into the `users_no_dups` table

```
insert into users_no_dups (
    created_ts,
    user_id,
    name,
    url
) (
    select
        created_ts,
        user_id,
        name,
        url
    from
        users
) on conflict do nothing;
```

* Drop the `users` table

* Repeat the whole thing for the next file.
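
Putting the steps together, one iteration of the loop boils down to a
script along these lines (a consolidated sketch of the steps above;
`myfile.csv` stands for whichever file is currently being processed):

```
-- One iteration of the load loop (sketch).
create unlogged table users (
    created_ts timestamp without time zone default current_timestamp,
    user_id bigint,
    name text,
    url text
) with (autovacuum_enabled = false, toast.autovacuum_enabled = false);

copy users(user_id, name, url)
from 'myfile.csv'
with (format csv, header true, delimiter ',', quote '"', escape '\\');

insert into users_no_dups (created_ts, user_id, name, url)
select created_ts, user_id, name, url
from users
on conflict do nothing;

drop table users;
```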


Running the above loop worked fine for about 12 hours. Each file took about
30 seconds to process: about 4 seconds to create the `users` table and load
the CSV data into it, and anywhere between 20 and 30 seconds to insert the
data from `users` into `users_no_dups`.

All of a sudden, inserting from `users` into `users_no_dups` started taking
20+ minutes.

I recreated the table with a `fillfactor` of `30` and tried again, and
things ran well again, with the same ~30 seconds per file. After about
another 12 hours, things got really slow again.

Recreating the table now isn't providing any improvement. I tried
recreating it with a `fillfactor` of `10`, but it was taking too long and
too much space (the table was 300GB with the fillfactor set to 30; with it
set to 10 it went up to almost 1TB).

Watching iotop, the `INSERT` statement's `WRITE` speed is now always
between 20 and 100 KB/s. When I first started inserting, the `WRITE` speed
was always above 100 MB/s.

If I copy the `users_no_dups` table to another table (say
`users_no_dups_2`, with the same structure), the `WRITE` speed also goes to
100 MB/s or more until it gets to the last 2 GB of data being copied. Then
the speed drops back to 20-100 KB/s and stays there (again, according to
`iotop`).

I have the following custom configuration in my Postgres installation, set
in an attempt to improve performance:

```
ssl = off
shared_buffers = 8GB
work_mem = 12GB
maintenance_work_mem = 12GB
max_stack_depth = 4MB
synchronous_commit = off
wal_writer_flush_after = 128MB
max_wal_size = 32GB
min_wal_size = 80MB
effective_cache_size = 96GB
```

Information about the machine:

```
Processor: 2x Intel(R) Xeon(R) CPU E5-2630L v2 @ 2.40GHz (6 cores each, 12 threads)
RAM: 256GB


Disk1: 2TB SSD SATA-3 Samsung Evo 860
Disk2: 6TB HDD SATA-3 Seagate Exos Enterprise 7200RPM
Disk3: 8TB HDD SATA-3 Seagate Exos Enterprise 7200RPM

Disk1 and Disk2 are configured as a single logical volume.

Table `users_no_dups` is in a tablespace on `Disk3`. The default tablespace
is on the logical volume composed of `Disk1` and `Disk2`.

OS: Ubuntu Linux 19.10
Postgres version: PostgreSQL 11.7 (Ubuntu 11.7-0ubuntu0.19.10.1) on
x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1
20191008, 64-bit
```

Any ideas why I am seeing this decrease in insert performance, or any
suggestions on how I can try to figure this out?

Sorry for the wall of text. Just trying to give as much info as I have.

Henrique


Re: Sudden insert performance degradation

2020-07-13 Thread Sebastian Dressler
Hi Henrique,

On 13. Jul 2020, at 16:23, Henrique Montenegro wrote:

[...]

* Insert the data from the `users` table into the `users_no_dups` table

```
insert into users_no_dups (
created_ts,
user_id,
name,
url
) (
select
created_ts,
user_id,
name,
url
from
users
) on conflict do nothing
```

How do you check constraints here? Is this enforced with a UK/PK?

Running the above loop worked fine for about 12 hours. Each file was taking
about 30 seconds to be processed. About 4 seconds to create the `users` table
and have the CSV data loaded into it and anything between 20 and 30 seconds to
insert the data from `users` into `users_no_dups`.

Do you see anything suspicious in the logs, i.e. something in the realm of
running out of transaction IDs?

[...]

Recreating the table now isn't really providing any improvements. I tried
recreating it with a `fillfactor` of `10`, but it was taking too long and too
much space (the table had 300GB with the fillfactor set to 30; with it set to
10 it went up to almost 1TB).

To me it sounds like the UK/PK index has grown to the point where writing
to it is the bottleneck. A possible solution could be to start partitioning
the table.
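
For example, since the unique constraint is on `user_id`, a hash-partitioned
layout might look roughly like this (just a sketch: the partition names are
placeholders, the number of partitions is arbitrary, and I have left out the
unlogged/fillfactor settings):

```
-- Sketch: hash partitioning on user_id; 16 partitions chosen arbitrarily.
create table users_no_dups (
    created_ts timestamp without time zone,
    user_id bigint not null,
    name text,
    url text,
    unique (user_id)
) partition by hash (user_id);

create table users_no_dups_p0
    partition of users_no_dups for values with (modulus 16, remainder 0);
create table users_no_dups_p1
    partition of users_no_dups for values with (modulus 16, remainder 1);
-- ...and so on, up to remainder 15. Each partition gets its own,
-- much smaller unique index, which is cheaper to keep in cache.
```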

[...]
```
ssl = off
shared_buffers = 8GB
work_mem = 12GB
maintenance_work_mem = 12GB
max_stack_depth = 4MB
synchronous_commit = off
wal_writer_flush_after = 128MB
max_wal_size = 32GB
min_wal_size = 80MB
effective_cache_size = 96GB
```

Another suggestion would be to increase the min_wal_size here, but since you 
use UNLOGGED tables it does not matter much.


Information about the machine:

```
Processor: 2x Intel(R) Xeon(R) CPU E5-2630L v2 @ 2.40GHz (6 cores each, 12 
threads)
RAM: 256GB


Disk1: 2TB SSD SATA-3 Samsung Evo 860
Disk2: 6TB HDD SATA-3 Seagate Exos Enterprise 7200RPM
Disk3: 8TB HDD SATA-3 Seagate Exos Enterprise 7200RPM

Disk1 and Disk2 are configured as a single logical volume.

Just curious: does that mean you are mixing SSD and HDD in one volume?

Cheers,
Sebastian


--

Sebastian Dressler, Solution Architect
+49 30 994 0496 72 | [email protected]

Swarm64 AS
Parkveien 41 B | 0258 Oslo | Norway
Registered at Brønnøysundregistrene in Norway under Org.-Number 911 662 787
CEO/Geschäftsführer (Daglig Leder): Thomas Richter; Chairman/Vorsitzender 
(Styrets Leder): Dr. Sverre Munck

Swarm64 AS Zweigstelle Hive
Ullsteinstr. 120 | 12109 Berlin | Germany
Registered at Amtsgericht Charlottenburg - HRB 154382 B



Re: Sudden insert performance degradation

2020-07-13 Thread Henrique Montenegro
On Mon, Jul 13, 2020 at 12:28 PM Michael Lewis  wrote:

> Is this an insert-only table, and perhaps not being picked up by
> autovacuum? If so, try a manual "vacuum analyze" before/after each batch
> run. You don't mention updates, but you have also been adjusting
> fillfactor, so I am not sure.
>

It is mostly an insert-only table. The only queries I need to run on it
aggregate the count of IDs inserted per hour.
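
For reference, that aggregation is essentially just this kind of query (a
sketch, assuming `created_ts` reflects when the row was inserted):

```
select date_trunc('hour', created_ts) as insert_hour,
       count(*) as ids_inserted
from users_no_dups
group by 1
order by 1;
```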

I vacuumed the table; it didn't help. I tried both `vacuum (analyze)` and
`vacuum (full)`; both took a long time and brought no improvement.

I adjusted the `fillfactor` because the documentation didn't make it clear
whether "updates to the table" meant updating the values of existing rows
or updating the table itself (which, in my understanding, would include
adding new data to it). I just started experimenting with the `fillfactor`
to see if it would give me any improvement. It seems it did: the first time
I created the table I didn't change the fillfactor and hit the performance
issue after 12 hours; I then recreated the table with a fillfactor of 30
and it was good again for about another 12 hours. It could be a
coincidence, though. I tried to recreate the table with fillfactor 10, but
it was taking too long to load the data (after 12+ hours it still wasn't
done and the WRITE speed on iotop was around 20 KB/s, so I ended up
canceling it).

As of now, the table has about 280 million records in it.

Henrique


Re: Sudden insert performance degradation

2020-07-13 Thread Henrique Montenegro
On Mon, Jul 13, 2020 at 11:20 AM Sebastian Dressler wrote:

> Hi Henrique,
>
> On 13. Jul 2020, at 16:23, Henrique Montenegro  wrote:
>
> [...]
>
> * Insert the data from the `users` table into the `users_no_dups` table
>
> ```
> insert into users_no_dups (
> created_ts,
> user_id,
> name,
> url
> ) (
> select
> created_ts,
> user_id,
> name,
> url
> from
> users
> ) on conflict do nothing
> ```
>
>
> How do you check constraints here? Is this enforced with a UK/PK?
>

The unique key is supposed to do the constraint enforcement here. The
`users` table will contain duplicate data and holds at most 1 million
records. I then insert it into the `users_no_dups` table with
`on conflict do nothing` to ignore and discard the duplicates.
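
For clarity, this is the same statement with the conflict target spelled
out against that unique constraint (with `user_id` being the only unique
constraint on the table, the behavior should be the same as the bare
`on conflict do nothing` form):

```
insert into users_no_dups (created_ts, user_id, name, url)
select created_ts, user_id, name, url
from users
on conflict (user_id) do nothing;
```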



> Running the above loop worked fine for about 12 hours. Each file was taking
> about 30 seconds to be processed. About 4 seconds to create the `users`
> table
> and have the CSV data loaded into it and anything between 20 and 30
> seconds to
> insert the data from `users` into `users_no_dups`.
>
>
> Do you see anything suspicious in the logs, i.e. something in the realms
> of running out of transaction IDs?
>

I set the log level to debug1. I haven't seen anything that caught my
attention, but I am not really sure what to look for, so perhaps I missed
it. Any suggestions on what to look for, or any specific log configuration
to use?


>
> [...]
>
>
> Recreating the table now isn't really providing any improvements. I tried
> recreating it with a `fillfactor` of `10`, but it was taking too long and
> too
> much space (the table had 300GB with the fillfactor set to 30; with it set
> to
> 10 it went up to almost 1TB).
>
>
> To me it sounds like the UK/PK is getting too much to write. A possible
> solution could be to start partitioning the table.
>

I thought about partitioning it, but I can't figure out on what. The
`user_id` column is a number that is somewhat random, so I don't know what
kind of ranges I would use for it. I will look at the values again and see
if there is something I could use as a range. Any other suggestions?


>
> [...]
> ```
> ssl = off
> shared_buffers = 8GB
> work_mem = 12GB
> maintenance_work_mem = 12GB
> max_stack_depth = 4MB
> synchronous_commit = off
> wal_writer_flush_after = 128MB
> max_wal_size = 32GB
> min_wal_size = 80MB
> effective_cache_size = 96GB
> ```
>
>
> Another suggestion would be to increase the min_wal_size here, but since
> you use UNLOGGED tables it does not matter much.
>
>
> Information about the machine:
>
> ```
> Processor: 2x Intel(R) Xeon(R) CPU E5-2630L v2 @ 2.40GHz (6 cores each, 12
> threads)
> RAM: 256GB
>
>
> Disk1: 2TB SSD SATA-3 Samsung Evo 860
> Disk2: 6TB HDD SATA-3 Seagate Exos Enterprise 7200RPM
> Disk3: 8TB HDD SATA-3 Seagate Exos Enterprise 7200RPM
>
> Disk1 and Disk2 are configured as a single logical volume.
>
>
> Just curious: does that mean you mix up SSD + HDD?
>

Yeah, I did that. Probably not very smart of me; I plan on undoing it soon.
I assume that is not what is causing my issue, since the tablespace where
the table is stored is on `Disk3`, which is not part of the logical volume.


>
> Cheers,
> Sebastian
>
>
> --
>
> Sebastian Dressler, Solution Architect
> +49 30 994 0496 72 | [email protected]
>
> Swarm64 AS
> Parkveien 41 B | 0258 Oslo | Norway
> Registered at Brønnøysundregistrene in Norway under Org.-Number 911 662 787
> CEO/Geschäftsführer (Daglig Leder): Thomas Richter; Chairman/Vorsitzender
> (Styrets Leder): Dr. Sverre Munck
>
> Swarm64 AS Zweigstelle Hive
> Ullsteinstr. 120 | 12109 Berlin | Germany
> Registered at Amtsgericht Charlottenburg - HRB 154382 B
>
>
Thanks!

Henrique


Re: Sudden insert performance degradation

2020-07-13 Thread Sebastian Dressler
Hi Henrique,

On 13. Jul 2020, at 18:42, Henrique Montenegro wrote:

On Mon, Jul 13, 2020 at 11:20 AM Sebastian Dressler wrote:


Running the above loop worked fine for about 12 hours. Each file was taking
about 30 seconds to be processed. About 4 seconds to create the `users` table
and have the CSV data loaded into it and anything between 20 and 30 seconds to
insert the data from `users` into `users_no_dups`.

Do you see anything suspicious in the logs, i.e. something in the realms of 
running out of transaction IDs?

I set the log to debug1. I haven't seen anything that called my attention, but 
I am not really sure what to look for, so perhaps I missed it. Any suggestions 
on what to look for or any specific log configuration to do?

Not necessarily; if you had run out of tx IDs you would notice that
clearly, I guess. I also think that this is not the issue.
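
That said, a few targeted log settings are usually more telling than
DEBUG1; something along these lines could be a starting point (the values
are illustrative, not recommendations):

```
log_checkpoints = on
log_autovacuum_min_duration = 0    # log every autovacuum run
log_min_duration_statement = 10000 # log statements taking longer than 10s
log_lock_waits = on
log_temp_files = 0                 # log every temp file created
```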



[...]

Recreating the table now isn't really providing any improvements. I tried
recreating it with a `fillfactor` of `10`, but it was taking too long and too
much space (the table had 300GB with the fillfactor set to 30; with it set to
10 it went up to almost 1TB).

To me it sounds like the UK/PK is getting too much to write. A possible 
solution could be to start partitioning the table.

I thought about partitioning it, but I can't figure out on what. The `user_id` 
column is a number that is somewhat random so I don't know what kinds of range 
I would use for it. I will try to look at the values again and see if there is 
something that I could perhaps use as a range. Any other suggestions?

Depending on granularity, maybe partition on `created_ts`?
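
Roughly along these lines (a sketch with daily partitions; note that on a
partitioned table the unique constraint has to include the partition key,
so the dedup key effectively becomes `(user_id, created_ts)`, which changes
the semantics slightly):

```
create table users_no_dups (
    created_ts timestamp without time zone not null,
    user_id bigint not null,
    name text,
    url text,
    unique (user_id, created_ts)
) partition by range (created_ts);

create table users_no_dups_2020_07_13
    partition of users_no_dups
    for values from ('2020-07-13') to ('2020-07-14');
-- ...one partition per day, or whatever granularity fits the data.
```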

Cheers,
Sebastian

--

Sebastian Dressler, Solution Architect
+49 30 994 0496 72 | [email protected]

Swarm64 AS
Parkveien 41 B | 0258 Oslo | Norway
Registered at Brønnøysundregistrene in Norway under Org.-Number 911 662 787
CEO/Geschäftsführer (Daglig Leder): Thomas Richter; Chairman/Vorsitzender 
(Styrets Leder): Dr. Sverre Munck

Swarm64 AS Zweigstelle Hive
Ullsteinstr. 120 | 12109 Berlin | Germany
Registered at Amtsgericht Charlottenburg - HRB 154382 B



Re: Sudden insert performance degradation

2020-07-13 Thread Michael Lewis
Is this an insert-only table, and perhaps not being picked up by
autovacuum? If so, try a manual "vacuum analyze" before/after each batch
run. You don't mention updates, but you have also been adjusting
fillfactor, so I am not sure.


Re: Sudden insert performance degradation

2020-07-13 Thread Henrique Montenegro
On Mon, Jul 13, 2020 at 12:50 PM Sebastian Dressler wrote:

> Hi Henrique,
>
> On 13. Jul 2020, at 18:42, Henrique Montenegro  wrote:
>
> On Mon, Jul 13, 2020 at 11:20 AM Sebastian Dressler 
> wrote:
>
>
>> Running the above loop worked fine for about 12 hours. Each file was
>> taking
>> about 30 seconds to be processed. About 4 seconds to create the `users`
>> table
>> and have the CSV data loaded into it and anything between 20 and 30
>> seconds to
>> insert the data from `users` into `users_no_dups`.
>>
>>
>> Do you see anything suspicious in the logs, i.e. something in the realms
>> of running out of transaction IDs?
>>
>
> I set the log to debug1. I haven't seen anything that called my attention,
> but I am not really sure what to look for, so perhaps I missed it. Any
> suggestions on what to look for or any specific log configuration to do?
>
>
> Not necessarily, if you'd run out of tx IDs you would notice that cleary,
> I guess. I also think that this is not the issue.
>
>
>
>>
>> [...]
>>
>>
>> Recreating the table now isn't really providing any improvements. I tried
>> recreating it with a `fillfactor` of `10`, but it was taking too long and
>> too
>> much space (the table had 300GB with the fillfactor set to 30; with it
>> set to
>> 10 it went up to almost 1TB).
>>
>>
>> To me it sounds like the UK/PK is getting too much to write. A possible
>> solution could be to start partitioning the table.
>>
>
> I thought about partitioning it, but I can't figure out on what. The
> `user_id` column is a number that is somewhat random so I don't know what
> kinds of range I would use for it. I will try to look at the values again
> and see if there is something that I could perhaps use as a range. Any
> other suggestions?
>
>
> Depending on granularity, maybe partition on `created_ts`?
>

I could give that a try. The reason I hadn't tried it yet was that I
thought that, since the UK is on the `user_id` column, it wouldn't give me
any benefit, but I can't really justify why I was thinking that. I would
assume the constraint is validated against the index and not the whole
table, so this might work. I will give it a try.

Thanks!

Henrique


>
> Cheers,
> Sebastian
>
> --
>
> Sebastian Dressler, Solution Architect
> +49 30 994 0496 72 | [email protected]
>
> Swarm64 AS
> Parkveien 41 B | 0258 Oslo | Norway
> Registered at Brønnøysundregistrene in Norway under Org.-Number 911 662 787
> CEO/Geschäftsführer (Daglig Leder): Thomas Richter; Chairman/Vorsitzender
> (Styrets Leder): Dr. Sverre Munck
>
> Swarm64 AS Zweigstelle Hive
> Ullsteinstr. 120 | 12109 Berlin | Germany
> Registered at Amtsgericht Charlottenburg - HRB 154382 B
>
>


Re: Sudden insert performance degradation

2020-07-13 Thread Jeff Janes
On Mon, Jul 13, 2020 at 10:23 AM Henrique Montenegro wrote:

insert into users_no_dups (
> created_ts,
> user_id,
> name,
> url
> ) (
> select
> created_ts,
> user_id,
> name,
> url
> from
> users
> ) on conflict do nothing
>

Once the size of the only index exceeds shared_buffers by a bit (the amount
of "a bit" depends on your RAM, kernel version, settings
for dirty_background_ratio, dirty_expire_centisecs, and probably other
things, and is not easy to predict) the performance falls off a cliff when
inserting values in a random order.  Every insert dirties a random index
leaf page, which quickly gets evicted from shared_buffers to make room for
other random leaf pages to be read in, and then turns into flush calls when
the kernel freaks out about the amount and age of dirty pages held in
memory.

What happens if you add an "ORDER BY user_id" to your above select?
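
I.e., something along the lines of (a sketch of the modified statement):

```
insert into users_no_dups (created_ts, user_id, name, url)
(
    select created_ts, user_id, name, url
    from users
    order by user_id  -- insert in index order to localize index page writes
)
on conflict do nothing;
```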


> shared_buffers = 8GB
> RAM: 256GB
>

Or, crank up shared_buffers by a lot.  Like, beyond the size of the growing
index, or up to 240GB if the index ever becomes larger than that.  And make
the time between checkpoints longer.  If the dirty buffers are retained in
shared_buffers longer, the chances of them getting dirtied repeatedly
between writes are much higher than if you just toss them to the kernel and
hope for the best.
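
Concretely, that could translate to something like the following (the
values are illustrative, not tuned recommendations; with 256GB of RAM there
is room to go far beyond the current 8GB):

```
shared_buffers = 160GB             # large enough to keep the unique index resident
checkpoint_timeout = 30min         # lengthen the time between checkpoints
checkpoint_completion_target = 0.9
max_wal_size = 64GB
```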

Cheers,

Jeff


Re: Sudden insert performance degradation

2020-07-13 Thread Henrique Montenegro
On Mon, Jul 13, 2020 at 8:05 PM Jeff Janes  wrote:

> On Mon, Jul 13, 2020 at 10:23 AM Henrique Montenegro 
> wrote:
>
> insert into users_no_dups (
>> created_ts,
>> user_id,
>> name,
>> url
>> ) (
>> select
>> created_ts,
>> user_id,
>> name,
>> url
>> from
>> users
>> ) on conflict do nothing
>>
>
> Once the size of the only index exceeds shared_buffers by a bit (the
> amount of "a bit" depends on your RAM, kernel version, settings
> for dirty_background_ratio, dirty_expire_centisecs, and probably other
> things, and is not easy to predict) the performance falls off a cliff when
> inserting values in a random order.  Every insert dirties a random index
> leaf page, which quickly gets evicted from shared_buffers to make room for
> other random leaf pages to be read in, and then turns into flush calls when
> the kernel freaks out about the amount and age of dirty pages held in
> memory.
>

That is interesting to know. I will do some research on those things.


> What happens if you add an "ORDER BY user_id" to your above select?
>

I don't know. I will give it a try right now.

>
>
>> shared_buffers = 8GB
>> RAM: 256GB
>>
>
> Or, crank up shared_buffers by a lot.  Like, beyond the size of the
> growing index, or up to 240GB if the index ever becomes larger than that.
> And make the time between checkpoints longer.  If the dirty buffers are
> retained in shared_buffers longer, chances of them getting dirtied
> repeatedly between writes is much higher than if you just toss them to the
> kernel and hope for the best.
>
>
I cranked it up to 160GB to see how it goes.

Cheers,
>
> Jeff
>

I created the partitions as well, as mentioned before. I was able to
partition the table based on `user_id` (I found some logic to it). I was
transferring the data from the original table (about 280 million records;
320GB) to the new partitioned table, and things were going well, with write
speeds between 30MB/s and 50MB/s. After reading 270GB of the 320GB (in 4
and a half hours) and writing it to the new partitioned table, the write
speed went down to 7KB/s. It is so frustrating.

I will keep the partitions and try your suggestions to see how it goes.

I apologize for the long time between replies; it is just that testing this
stuff takes 4+ hours per run.

If there are any other suggestions of things for me to look at in the
meantime, please keep them coming.

Thanks!

Henrique