On Fri, Sep 18, 2020 at 6:29 AM Bob Negri <[email protected]> wrote:

> So 4 hours might not be long enough? (Looks like the person who did the
> update didn't check the documentation.) Trying the new settings now.
>

Yeah, I don't think 4 hours is long enough, and that would cause the behavior
you're seeing. In that case, when gc fires after 4 hours, there will be
incoming commands trying to recreate the table that gc is trying to drop.
Then every time the gc_interval passes, the same conflict recurs, producing
the errors you're seeing. If you set the *report_ttl* and *resource_events_ttl*
settings to over 24 hours, this issue should go away. You'll still see some
of the *"resource_events_20200917z" does not exist* errors, but those
should only happen once a day, when a new partition needs to be created.
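For reference, those two settings go in the [database] section of PuppetDB's
database.ini (the docs spell them with hyphens). The values below are only an
illustrative sketch, not a tuned recommendation:

```ini
# database.ini -- illustrative values, not a recommendation.
# Anything of 24h or more avoids gc dropping a partition for the
# same day that incoming commands are still inserting into.
[database]
report-ttl = 14d
resource-events-ttl = 2d
```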

I think our documentation for those two settings is lacking with regard to
this issue. We recently added partitioning for reports and
resource_events and didn't mention the case you're describing in
the docs. We'll update the docs soon, using this ticket:
PDB-4899 <https://tickets.puppetlabs.com/browse/PDB-4899> to track the
work. Thanks for bringing this up!
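For anyone else who hits this: here's a minimal sketch of the race visible in
the log excerpt below (this is the logged statement, not PuppetDB's actual
source). Two connections that both find the day's partition missing will each
run the same CREATE TABLE, and the loser still logs "already exists", because
IF NOT EXISTS does not make the check-and-create atomic across concurrent
transactions in Postgres:

```sql
-- Per-day child partition, as in the logged statements. If two sessions
-- execute this at the same instant, one can still fail with
-- "relation ... already exists": IF NOT EXISTS consults the catalog,
-- but the existence check and the create are separate steps under
-- concurrency.
CREATE TABLE IF NOT EXISTS resource_events_20200917Z
  (CHECK ( "timestamp" >= TIMESTAMP WITH TIME ZONE '2020-09-17T00:00:00Z'
       AND "timestamp" <  TIMESTAMP WITH TIME ZONE '2020-09-18T00:00:00Z' ))
  INHERITS (resource_events);

-- A general Postgres technique (not necessarily what PuppetDB does) for
-- serializing creation is a transaction-scoped advisory lock keyed on
-- the partition name:
-- SELECT pg_advisory_xact_lock(hashtext('resource_events_20200917Z'));
```

These "already exists" errors are noisy but harmless: the partition ends up
existing either way.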



> On Thursday, September 17, 2020 at 6:27:29 PM UTC-5 [email protected]
> wrote:
>
>> On Thu, Sep 17, 2020 at 8:12 AM Bob Negri <[email protected]> wrote:
>>
>>> We recently updated our PuppetDB servers to PuppetDB 6.12.0 and
>>> PostgreSQL 12, and started getting these errors:
>>>
>>> ERROR:  relation "resource_events_20200917z" does not exist at character
>>> 13
>>> ERROR:  relation "resource_events_20200917z" already exists
>>> ERROR:  deadlock detected
>>> ERROR:  could not serialize access due to concurrent delete
>>>
>>> Not sure if this is a PuppetDB setting or a PostgreSQL issue. Has anyone
>>> else seen this?
>>>
>>> Here is more detail:
>>>
>>> 2020-09-17 14:32:49.515 UTC [3941] ERROR:  relation
>>> "resource_events_20200917z" does not exist at character 13
>>> 2020-09-17 14:32:49.515 UTC [3941] QUERY:  INSERT INTO
>>> resource_events_20200917Z SELECT ($1).*
>>> 2020-09-17 14:32:49.515 UTC [3941] CONTEXT:  PL/pgSQL function
>>> resource_events_insert_trigger() line 8 at EXECUTE
>>> 2020-09-17 14:32:49.515 UTC [3941] STATEMENT:  INSERT INTO
>>> resource_events ( new_value, property, name, file, report_id, event_hash,
>>> old_value, containing_class, certname_id, line, resource_type, status,
>>> resource_title, timestamp, containment_path, message ) VALUES ( $1, $2, $3,
>>> $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16 )
>>>         RETURNING *
>>> 2020-09-17 14:32:49.538 UTC [3937] ERROR:  relation
>>> "resource_events_20200917z" already exists
>>> 2020-09-17 14:32:49.538 UTC [3937] STATEMENT:  CREATE TABLE IF NOT
>>> EXISTS resource_events_20200917Z (CHECK ( "timestamp" >= TIMESTAMP WITH
>>> TIME ZONE '2020-09-17T00:00:00Z' AND "timestamp" < TIMESTAMP WITH TIME ZONE
>>> '2020-09-18T00:00:00Z' )) INHERITS (resource_events)
>>> 2020-09-17 14:32:49.538 UTC [3945] ERROR:  relation
>>> "resource_events_20200917z" already exists
>>> 2020-09-17 14:32:49.538 UTC [3945] STATEMENT:  CREATE TABLE IF NOT
>>> EXISTS resource_events_20200917Z (CHECK ( "timestamp" >= TIMESTAMP WITH
>>> TIME ZONE '2020-09-17T00:00:00Z' AND "timestamp" < TIMESTAMP WITH TIME ZONE
>>> '2020-09-18T00:00:00Z' )) INHERITS (resource_events)
>>> 2020-09-17 14:32:49.538 UTC [3941] ERROR:  relation
>>> "resource_events_20200917z" already exists
>>> 2020-09-17 14:32:49.538 UTC [3941] STATEMENT:  CREATE TABLE IF NOT
>>> EXISTS resource_events_20200917Z (CHECK ( "timestamp" >= TIMESTAMP WITH
>>> TIME ZONE '2020-09-17T00:00:00Z' AND "timestamp" < TIMESTAMP WITH TIME ZONE
>>> '2020-09-18T00:00:00Z' )) INHERITS (resource_events)
>>> 2020-09-17 14:33:27.917 UTC [2875] ERROR:  deadlock detected
>>> 2020-09-17 14:33:27.917 UTC [2875] DETAIL:  Process 2875 waits for
>>> AccessExclusiveLock on relation 7883116 of database 16385; blocked by
>>> process 3945.
>>>         Process 3945 waits for RowExclusiveLock on relation 7883178 of
>>> database 16385; blocked by process 2875.
>>>         Process 2875: drop table if exists reports_20200917z cascade
>>>         Process 3945: INSERT INTO resource_events ( new_value, property,
>>> name, file, report_id, event_hash, old_value, containing_class,
>>> certname_id, line, resource_type, status, resource_title, timestamp,
>>> containment_path, message ) VALUES ( $1, $2, $3, $4, $5, $6, $7, $8, $9,
>>> $10, $11, $12, $13, $14, $15, $16 )
>>>         RETURNING *
>>>
>> From the logs above, it looks like the gc process in PuppetDB, represented
>> by Postgres process 2875, is trying to drop the reports_20200917z
>> partition while other transactions are trying to insert into
>> resource_events partitions for the same day. PuppetDB handles
>> partition creation by first trying to insert a row and then creating the
>> partition for that day if one doesn't yet exist.
>>
>> I'm wondering if your install has the *report_ttl* or
>> *resource_events_ttl* settings described in the PuppetDB config docs
>> <https://puppet.com/docs/puppetdb/latest/configure.html#resource-events-ttl>
>> set to a value that's less than one day. If so, it's possible that the
>> gc process is trying to drop partitions for the same day that incoming
>> commands are trying to create them for, causing conflicts. Normally you
>> would expect to see a couple of the *"resource_events_20200917z" does not
>> exist* errors per day, around the same time, for both the
>> *reports_<date>* and *resource_events_<date>* partitions. If your TTL
>> settings aren't set to less than a day and you're still seeing these
>> errors more often than daily, please let us know.
>>
>>
>>>
>>> 2020-09-17 14:34:47.339 UTC [2875] ERROR:  could not serialize access
>>> due to concurrent delete
>>> 2020-09-17 14:34:47.339 UTC [2875] STATEMENT:  delete from fact_paths
>>> fp  where not exists (select 1 from tmp_live_paths
>>> where tmp_live_paths.path = fp.path)
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Puppet Users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/puppet-users/6f23bccd-22cd-48dd-acd8-e8e0247440fdn%40googlegroups.com
>>> <https://groups.google.com/d/msgid/puppet-users/6f23bccd-22cd-48dd-acd8-e8e0247440fdn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>
