Re: OOM killer while pg_restore

2022-03-05 Thread Marc Rechté

Le 03/03/2022 à 19:43, Tom Lane a écrit :

=?UTF-8?Q?Marc_Recht=c3=a9?=  writes:

Le 03/03/2022 à 16:31, Tom Lane a écrit :

Does memory consumption hold steady if you drop the FK constraints?

Actually the number of rows is 232735712.
Accordingly the RAM consumption would be x12 x3 = 7.8 GiB.
This is close to the 8,1g I reported earlier (actually it was closer to
7.8 GB, due to GiB vs. GB confusion).
So there is no memory leak.
It took 16 hours on my box to reach that RAM consumption, and then the
COPY failed when checking the first FK (as the referenced table was empty).

I'm guessing it was swapping like mad :-(

We've long recommended dropping FK constraints during bulk data loads,
and then re-establishing them later.  That's a lot cheaper than retail
validity checks, even without the memory-consumption angle.  Ideally
that sort of behavior would be automated, but nobody's gotten that
done yet.  (pg_restore does do it like that during a full restore,
but not for a data-only restore, so I guess you were doing the latter.)

regards, tom lane



Did the test without the 3 FK, but with PK and index:

I took 9.5 hours and consumed 1GB of RAM (vs. 16 hours and 8 GB).

Thanks you for the explanations.

I  assume there is currently no GUC to limit RAM consumption of a backend ?

Marc






Re: OOM killer while pg_restore

2022-03-05 Thread Ranier Vilela
Em qui., 3 de mar. de 2022 às 15:32, Marc Rechté  escreveu:

> Le 03/03/2022 à 16:31, Tom Lane a écrit :
> > =?UTF-8?Q?Marc_Recht=c3=a9?=  writes:
> >> We have a pg_restore which fails due to RAM over-consumption of the
> >> corresponding PG backend, which ends-up with OOM killer.
> >> The table has one PK, one index, and 3 FK constraints, active while
> >> restoring.
> >> The dump contains over 200M rows for that table and is in custom format,
> >> which corresponds to 37 GB of total relation size in the original DB.
> > The FKs would result in queueing row trigger events, which would occupy
> > some memory.  But those should only need ~12 bytes per FK per row,
> > which works out to less than 10GB for this number of rows, so it may
> > be that you've hit something else that we would consider a leak.
> >
> > Does memory consumption hold steady if you drop the FK constraints?
> >
> > If not, as others have noted, we'd need more info to investigate
> > this.  The leak is probably independent of the specific data in
> > the table, so maybe you could make a small self-contained example
> > using a script to generate dummy data.
> >
> >   regards, tom lane
> >
> >
> Actually the number of rows is 232735712.
>
> Accordingly the RAM consumption would be x12 x3 = 7.8 GiB.
>
> This is close to the 8,1g I reported earlier (actually it was closer to
> 7.8 GB, due to GiB vs. GB confusion).
>
> So there is no memory leak.
>
> It took 16 hours on my box to reach that RAM consumption, and then the
> COPY failed when checking the first FK (as the referenced table was empty).
>
> I dropped the FK, index, and 3 FK constraints and started over the
> pg_restore:
>
> 11 minutes to load the table (I did not have time to note RAM consumption)
>
> I then created the PK and index:
>
> 24 minutes
>
> For FK, I don't know because the referenced table are empty (but I'll be
> able to test next week, if deemed necessary).
>
> 16 hours vs. 35 minutes to reach the same state.
>
Maybe it's out of reach, but one way to help Postgres developers fix this
is to provide Flame Graphs [1] based on these slow operations.
For confidentiality and privacy reasons, the data is out of reach.

My 2c here.

regards,
Ranier Vilela
[1] https://www.brendangregg.com/flamegraphs.html