Performance issues on FK Triggers after replacing a primary column
Hi there, i recently stumbled upon a performance issue which i can't really understand. The issue occured when i (roughly) did the following without a commit in between: * Replace the PK column of a table A which has a referencing table B - I have removed the FK from the referencing tables B and have recreated them afterwards * Now following i am working in one of the referencing tables B, updating columns. This takes an extremely large amount of time. This means, e.g. updating 1000 rows would now need 35-40 seconds. * The "explain" tells, that the Foreign Key trigger in B referencing A causes this mishap. * Re-creating the Index in B for the column referencing A does not cause any performance gain. * If i again remove the FK to A from B this again shrinks back to some milliseconds. The question is, what does cause the FK trigger to be less performant than recreating the FK constraint? If executed on 100k or even 1m rows the operation takes hours or even days. Thank you very much. Sincerely, Per Kaminsky
Re: Performance issues on FK Triggers after replacing a primary column
The table structure looks (roughly) like this: * Table "Base": (id, created, deleted, origin, ...) ~3m rows * Table "A": (id as FK on "Base", ...) ~400k rows * Table "B": (id, ref_a as FK on "A", type, ...) ~2m rows Swapping the PK of "A" happens as following, the FK is dropped during the process since otherwise the performance issues also happen here when updating the PK. The update calls do normally utilize a file based import into a temporary table from which i do the actual update: ALTER TABLE "B" DROP CONSTRAINT "B_to_A_fkey"; ALTER TABLE "A" ADD COLUMN id_temp BIGINT; // fill id_temp with new IDs UPDATE "B" SET ref_a = "A".id_temp WHERE "B".ref_a= "A".id; UPDATE "A" SET id = id_temp; ALTER TABLE "B" ADD CONSTRAINT "B_to_A_fkey" FOREIGN KEY (ref_a) REFERENCES A(id); And then the new occuring step, in the same transaction, which then also has shown the performance issues described if i would not remove the FK temporarily: ALTER TABLE "B" DROP CONSTRAINT "B_to_A_fkey"; UPDATE "B" SET type = 2 WHERE type ISNULL; ALTER TABLE "B" ADD CONSTRAINT "B_to_A_fkey" FOREIGN KEY (ref_a) REFERENCES A(id); ____ From: Adrian Klaver Sent: Sunday, March 27, 2022 23:22 To: Per Kaminsky ; pgsql-gene...@postgresql.org Subject: Re: Performance issues on FK Triggers after replacing a primary column On 3/27/22 09:30, Per Kaminsky wrote: > Hi there, > > i recently stumbled upon a performance issue which i can't > really understand. > The issue occured when i (roughly) did the following without a commit in > between: > > * Replace the PK column of a table A which has a referencing table B - > I have removed the FK from the referencing tables B and have > recreated them afterwards > * Now following i am working in one of the referencing tables B, > updating columns. This takes an extremely large amount of time. This > means, e.g. updating 1000 rows would now need 35-40 seconds. > * The "explain" tells, that the Foreign Key trigger in B referencing A > causes this mishap. Post the query and the explain. Also have you run vacuum and/or analyze on the tables involved? > * Re-creating the Index in B for the column referencing A does not > cause any performance gain. > * If i again remove the FK to A from B this again shrinks back to some > milliseconds. > > The question is, what does cause the FK trigger to be less performant > than recreating the FK constraint? If executed on 100k or even 1m rows > the operation takes hours or even days. > > Thank you very much. > Sincerely, Per Kaminsky > -- Adrian Klaver adrian.kla...@aklaver.com
Re: Performance issues on FK Triggers after replacing a primary column
Sorry, i forgot to add the following: Explain / Analyze for the last "update type on B" call, normally there the table has million of rows but i removed most of them since otherwise it would not finish sometime soon: ('Update on B (cost=0.00..71.50 rows=1000 width=244) (actual time=18.015..18.015 rows=0 loops=1)',) (' Buffers: shared hit=26141 read=21 dirtied=77 written=54',) (' -> Seq Scan on B (cost=0.00..71.50 rows=1000 width=244) (actual time=0.102..0.536 rows=1000 loops=1)',) ('Filter: ((type IS NULL) AND (firmid = 1))',) ('Buffers: shared hit=59',) ('Planning Time: 0.430 ms',) # ('Trigger for constraint B_firmid_fkey: time=25.592 calls=1000',) # ('Trigger for constraint B_type_fkey: time=14.899 calls=1000',) # ('Trigger for constraint B_userid_fkey: time=16.146 calls=1000',) # ('Trigger for constraint B_version_fkey: time=11.285 calls=1000',) # ('Trigger for constraint B_common_fkey: time=7.913 calls=1000',) # ('Trigger for constraint B_shiftinstanceid_fkey: time=8543.369 calls=1000',) ('Trigger for constraint B_to_A_fkey: time=27246.413 calls=1000',) ('Execution Time: 35884.978 ms',) The tables have Index on each other. The vacuum can not be called, since all those table modifications are part of one big transaction to be able to make a rollback on any problem without causing an abnormal data state regarding the program. From: Per Kaminsky Sent: Monday, March 28, 2022 08:53 To: Adrian Klaver ; pgsql-gene...@postgresql.org Subject: Re: Performance issues on FK Triggers after replacing a primary column The table structure looks (roughly) like this: * Table "Base": (id, created, deleted, origin, ...) ~3m rows * Table "A": (id as FK on "Base", ...) ~400k rows * Table "B": (id, ref_a as FK on "A", type, ...) ~2m rows Swapping the PK of "A" happens as following, the FK is dropped during the process since otherwise the performance issues also happen here when updating the PK. The update calls do normally utilize a file based import into a temporary table from which i do the actual update: ALTER TABLE "B" DROP CONSTRAINT "B_to_A_fkey"; ALTER TABLE "A" ADD COLUMN id_temp BIGINT; // fill id_temp with new IDs UPDATE "B" SET ref_a = "A".id_temp WHERE "B".ref_a= "A".id; UPDATE "A" SET id = id_temp; ALTER TABLE "B" ADD CONSTRAINT "B_to_A_fkey" FOREIGN KEY (ref_a) REFERENCES A(id); And then the new occuring step, in the same transaction, which then also has shown the performance issues described if i would not remove the FK temporarily: ALTER TABLE "B" DROP CONSTRAINT "B_to_A_fkey"; UPDATE "B" SET type = 2 WHERE type ISNULL; ALTER TABLE "B" ADD CONSTRAINT "B_to_A_fkey" FOREIGN KEY (ref_a) REFERENCES A(id); From: Adrian Klaver Sent: Sunday, March 27, 2022 23:22 To: Per Kaminsky ; pgsql-gene...@postgresql.org Subject: Re: Performance issues on FK Triggers after replacing a primary column On 3/27/22 09:30, Per Kaminsky wrote: > Hi there, > > i recently stumbled upon a performance issue which i can't > really understand. > The issue occured when i (roughly) did the following without a commit in > between: > > * Replace the PK column of a table A which has a referencing table B - > I have removed the FK from the referencing tables B and have > recreated them afterwards > * Now following i am working in one of the referencing tables B, > updating columns. This takes an extremely large amount of time. This > means, e.g. updating 1000 rows would now need 35-40 seconds. > * The "explain" tells, that the Foreign Key trigger in B referencing A > causes this mishap. Post the query and the explain. Also have you run vacuum and/or analyze on the tables involved? > * Re-creating the Index in B for the column referencing A does not > cause any performance gain. > * If i again remove the FK to A from B this again shrinks back to some > milliseconds. > > The question is, what does cause the FK trigger to be less performant > than recreating the FK constraint? If executed on 100k or even 1m rows > the operation takes hours or even days. > > Thank you very much. > Sincerely, Per Kaminsky > -- Adrian Klaver adrian.kla...@aklaver.com
Re: Performance issues on FK Triggers after replacing a primary column
The tables have Index to each other on each foreign key. The index itself was not touched though, and a remove/recreate did not help. Could it be possible, that when the PK and FK values are replaced the Index is not (immediately) updated and thus cannot be used? The temporary table is not shown. It is created to insert the new values from file, then used to update the correct table with the new values, and then removed, it has no connection (FK or something else) to any other table. From: Adrian Klaver Sent: Monday, March 28, 2022 17:41 To: Per Kaminsky ; pgsql-gene...@postgresql.org Subject: Re: Performance issues on FK Triggers after replacing a primary column On 3/28/22 00:22, Per Kaminsky wrote: > Sorry, i forgot to add the following: > > Explain / Analyze for the last "update type on B" call, normally there > the table has million of rows but i removed most of them since otherwise > it would not finish sometime soon: > > ('Update on B (cost=0.00..71.50 rows=1000 width=244) (actual > time=18.015..18.015 rows=0 loops=1)',) > (' Buffers: shared hit=26141 read=21 dirtied=77 written=54',) > (' -> Seq Scan on B (cost=0.00..71.50 rows=1000 width=244) (actual > time=0.102..0.536 rows=1000 loops=1)',) > ('Filter: ((type IS NULL) AND (firmid = 1))',) > ('Buffers: shared hit=59',) > ('Planning Time: 0.430 ms',) > # ('Trigger for constraint B_firmid_fkey: time=25.592 calls=1000',) > # ('Trigger for constraint B_type_fkey: time=14.899 calls=1000',) > # ('Trigger for constraint B_userid_fkey: time=16.146 calls=1000',) > # ('Trigger for constraint B_version_fkey: time=11.285 calls=1000',) > # ('Trigger for constraint B_common_fkey: time=7.913 calls=1000',) > # ('Trigger for constraint B_shiftinstanceid_fkey: time=8543.369 > calls=1000',) > ('Trigger for constraint B_to_A_fkey: time=27246.413 calls=1000',) > ('Execution Time: 35884.978 ms',) > > The tables have Index on each other. The vacuum can not be called, since > all those table modifications are part of one big transaction to be able Analyze can be run by itself in the transaction. As Tom said I am not seeing any information about indexes on the tables(s). Also, which one of the tables you showed is the temporary one or was that not shown? > to make a rollback on any problem without causing an abnormal data state > regarding the program. > -- Adrian Klaver adrian.kla...@aklaver.com
Re: Performance issues on FK Triggers after replacing a primary column
Do you mean a simple "ANALYZE VERBOSE A"? Or something different? Following the thought that maybe the index got stale, i just tried to add a "REINDEX TABLE B". This did not help as well, which might be the case, if an index (re)build is always deferred until the end of the transaction (which i don't know if that is the case). From: Adrian Klaver Sent: Monday, March 28, 2022 17:59 To: Per Kaminsky ; pgsql-gene...@postgresql.org ; Tom Lane Subject: Re: Performance issues on FK Triggers after replacing a primary column On 3/28/22 08:47, Per Kaminsky wrote: > The tables have Index to each other on each foreign key. The index > itself was not touched though, and a remove/recreate did not help. Could > it be possible, that when the PK and FK values are replaced the Index is > not (immediately) updated and thus cannot be used? Have you tried an ANALYZE on "A" AND "B" after?: UPDATE "A" SET id = id_temp; As to the index not immediately updating, I don't know. > > The temporary table is not shown. It is created to insert the new values > from file, then used to update the correct table with the new values, > and then removed, it has no connection (FK or something else) to any > other table. > So that is the '// fill id_temp with new IDs' part? -- Adrian Klaver adrian.kla...@aklaver.com
Re: Performance issues on FK Triggers after replacing a primary column
The usage of "on update cascade" certainly sounds plausible. Since the update does only happen rarely if there is a special need to completely change the backend structure it is not added to the FK at the moment. Just ran the whole thing again with the "ANALYZE A; ANALYZE B" but with no gain unfortunately. Just looking at the output i am also quite puzzled why an update to a field in B which has no connection itself to A would trigger all the foreign keys, except if this is a deferred call from updating A only now triggered by touching data in B. From: Adrian Klaver Sent: Monday, March 28, 2022 18:49 To: Per Kaminsky ; pgsql-gene...@postgresql.org Subject: Re: Performance issues on FK Triggers after replacing a primary column On 3/27/22 23:53, Per Kaminsky wrote: > The table structure looks (roughly) like this: > > * Table "Base": (id, created, deleted, origin, ...) ~3m rows > * Table "A": (id as FK on "Base", ...) ~400k rows > * Table "B": (id, ref_a as FK on "A", type, ...) ~2m rows > > Swapping the PK of "A" happens as following, the FK is dropped during > the process since otherwise the performance issues also happen here when > updating the PK. The update calls do normally utilize a file based > import into a temporary table from which i do the actual update: > > ALTER TABLE "B" DROP CONSTRAINT "B_to_A_fkey"; > ALTER TABLE "A" ADD COLUMN id_temp BIGINT; > // fill id_temp with new IDs > UPDATE "B" SET ref_a = "A".id_temp WHERE "B".ref_a= "A".id; > UPDATE "A" SET id = id_temp; > ALTER TABLE "B" ADD CONSTRAINT "B_to_A_fkey" FOREIGN KEY (ref_a) > REFERENCES A(id); Now that the morning coffee has taken effect, I'm wondering why the above is necessary at all? If "B_to_A_fkey" where to have ON UPDATE CASCADE then you could eliminate the dropping/adding back of the FK. The process would be: ALTER TABLE "A" ADD COLUMN id_temp BIGINT; // fill id_temp with new IDs UPDATE "A" SET id = id_temp; UPDATE "B" SET type = 2 WHERE type ISNULL; It might even be possible to further simplify depending on what '// fill id_temp with new IDs' actually does? > > And then the new occuring step, in the same transaction, which then also > has shown the performance issues described if i would not remove the FK > temporarily: > > ALTER TABLE "B" DROP CONSTRAINT "B_to_A_fkey"; > UPDATE "B" SET type = 2 WHERE type ISNULL; > ALTER TABLE "B" ADD CONSTRAINT "B_to_A_fkey" FOREIGN KEY (ref_a) > REFERENCES A(id); > > > > ** > > -- Adrian Klaver adrian.kla...@aklaver.com