On Fri, Sep 18, 2020 at 6:29 AM Bob Negri <[email protected]> wrote:
> So 4 hrs might not be big enough? (Looks like the person who did the
> update did not check the documents.) Trying the new settings now.

Yeah, I don't think 4 hrs is long enough, and it would cause the behavior
you're seeing. In that case, when gc fires after 4 hours there will be
incoming commands trying to recreate the table that gc is trying to drop.
Then every time the gc_interval passes, the same thing will happen again,
causing the errors you're seeing. If you set the *report_ttl* and
*resource_events_ttl* settings to over 24 hours, this issue should go away.
You'll still see some of the *"resource_events_20200917z" does not exist*
errors, but these should only happen once a day, when a new partition needs
to be created.
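For example, something like this in the [database] section of database.ini
(a rough sketch assuming the standard conf.d layout of an open source
install; note the hyphenated spelling in the config file, and the values
here are only examples, anything comfortably over a day works), followed by
a PuppetDB restart to pick up the change:

    # /etc/puppetlabs/puppetdb/conf.d/database.ini
    [database]
    # keep both TTLs well above the one-day partition window so gc never
    # tries to drop the partition that incoming commands are writing to
    report-ttl = 14d
    resource-events-ttl = 14d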
I think our documentation for those two settings might be lacking with
regard to this issue. We added partitioning for reports and resource_events
recently and didn't mention the case you're describing in the docs. We will
make an update to the docs soon, using this ticket: PDB-4899
<https://tickets.puppetlabs.com/browse/PDB-4899> to track the work. Thanks
for bringing this up!

> On Thursday, September 17, 2020 at 6:27:29 PM UTC-5 [email protected] wrote:
>
>> On Thu, Sep 17, 2020 at 8:12 AM Bob Negri <[email protected]> wrote:
>>
>>> We recently updated our PuppetDB servers to PuppetDB 6.12.0 and
>>> PostgreSQL 12. Started getting these errors:
>>>
>>> ERROR: relation "resource_events_20200917z" does not exist at character 13
>>> ERROR: relation "resource_events_20200917z" already exists
>>> ERROR: deadlock detected
>>> ERROR: could not serialize access due to concurrent delete
>>>
>>> Not sure if it is a PuppetDB setting or a PostgreSQL issue. Has anyone
>>> else seen this?
>>>
>>> Here is more detail:
>>>
>>> 2020-09-17 14:32:49.515 UTC [3941] ERROR: relation "resource_events_20200917z" does not exist at character 13
>>> 2020-09-17 14:32:49.515 UTC [3941] QUERY: INSERT INTO resource_events_20200917Z SELECT ($1).*
>>> 2020-09-17 14:32:49.515 UTC [3941] CONTEXT: PL/pgSQL function resource_events_insert_trigger() line 8 at EXECUTE
>>> 2020-09-17 14:32:49.515 UTC [3941] STATEMENT: INSERT INTO resource_events ( new_value, property, name, file, report_id, event_hash, old_value, containing_class, certname_id, line, resource_type, status, resource_title, timestamp, containment_path, message ) VALUES ( $1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16 ) RETURNING *
>>> 2020-09-17 14:32:49.538 UTC [3937] ERROR: relation "resource_events_20200917z" already exists
>>> 2020-09-17 14:32:49.538 UTC [3937] STATEMENT: CREATE TABLE IF NOT EXISTS resource_events_20200917Z (CHECK ( "timestamp" >= TIMESTAMP WITH TIME ZONE '2020-09-17T00:00:00Z' AND "timestamp" < TIMESTAMP WITH TIME ZONE '2020-09-18T00:00:00Z' )) INHERITS (resource_events)
>>> 2020-09-17 14:32:49.538 UTC [3945] ERROR: relation "resource_events_20200917z" already exists
>>> 2020-09-17 14:32:49.538 UTC [3945] STATEMENT: CREATE TABLE IF NOT EXISTS resource_events_20200917Z (CHECK ( "timestamp" >= TIMESTAMP WITH TIME ZONE '2020-09-17T00:00:00Z' AND "timestamp" < TIMESTAMP WITH TIME ZONE '2020-09-18T00:00:00Z' )) INHERITS (resource_events)
>>> 2020-09-17 14:32:49.538 UTC [3941] ERROR: relation "resource_events_20200917z" already exists
>>> 2020-09-17 14:32:49.538 UTC [3941] STATEMENT: CREATE TABLE IF NOT EXISTS resource_events_20200917Z (CHECK ( "timestamp" >= TIMESTAMP WITH TIME ZONE '2020-09-17T00:00:00Z' AND "timestamp" < TIMESTAMP WITH TIME ZONE '2020-09-18T00:00:00Z' )) INHERITS (resource_events)
>>> 2020-09-17 14:33:27.917 UTC [2875] ERROR: deadlock detected
>>> 2020-09-17 14:33:27.917 UTC [2875] DETAIL: Process 2875 waits for AccessExclusiveLock on relation 7883116 of database 16385; blocked by process 3945.
>>>   Process 3945 waits for RowExclusiveLock on relation 7883178 of database 16385; blocked by process 2875.
>>>   Process 2875: drop table if exists reports_20200917z cascade
>>>   Process 3945: INSERT INTO resource_events ( new_value, property, name, file, report_id, event_hash, old_value, containing_class, certname_id, line, resource_type, status, resource_title, timestamp, containment_path, message ) VALUES ( $1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16 ) RETURNING *
>>>
>> From the logs above it looks like the gc process in PuppetDB, represented
>> by the Postgres process 2875, is trying to drop the reports_20200917z
>> partition while other transactions are trying to insert into
>> resource_events partitions for the same day. PuppetDB handles partition
>> creation by first trying to insert a row and then creating the partition
>> for a given day if one doesn't yet exist.
>>
>> I'm wondering if your install has the *report_ttl* or *resource_events_ttl*
>> described in the PuppetDB config docs
>> <https://puppet.com/docs/puppetdb/latest/configure.html#resource-events-ttl>
>> set to a value that's less than one day. If that were the case, it's
>> possible that the gc process is trying to drop partitions for the same day
>> that incoming commands are trying to create them for, causing conflicts.
>> Normally you could expect to see a couple of the *"resource_events_20200917z"
>> does not exist* errors per day, around the same time, for both the
>> *reports_<date>* and *resource_events_<date>* partitions. If your ttl
>> settings aren't set to less than a day and you're seeing these errors more
>> regularly than daily, please let us know.
>>
>>> 2020-09-17 14:34:47.339 UTC [2875] ERROR: could not serialize access due to concurrent delete
>>> 2020-09-17 14:34:47.339 UTC [2875] STATEMENT: delete from fact_paths fp where not exists (select 1 from tmp_live_paths where tmp_live_paths.path = fp.path)
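One other thing that might help while you're watching the new settings take
effect: you can list the daily partitions that currently exist on the
Postgres side with something like the query below (just a diagnostic sketch,
run via psql against the puppetdb database; the pattern assumes the
<table>_<yyyymmdd>z naming you can see in the errors above). That makes it
easy to confirm whether gc and the incoming commands are fighting over the
same day's tables.

    -- list the daily reports/resource_events partitions currently present
    SELECT tablename
      FROM pg_tables
     WHERE tablename::text ~ '^(reports|resource_events)_[0-9]{8}z$'
     ORDER BY tablename;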
--
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/puppet-users/CADNLmivbiSLBonqQHeUeby04ae6aOkXCgst2GgEyUaihSHOWWQ%40mail.gmail.com.
