Re: dsa_allocate() failure
On Wed, Nov 21, 2018 at 03:26:42PM +0100, Jakub Glapa wrote:
> Looks like my email didn't match the right thread:
> https://www.postgresql.org/message-id/flat/CAMAYy4%2Bw3NTBM5JLWFi8twhWK4%3Dk_5L4nV5%2BbYDSPu8r4b97Zg%40mail.gmail.com
> Any chance to get some feedback on this?

In the related thread, it looks like Thomas backpatched a fix to v10, so I
guess this should be resolved in 10.6, which was released a couple of weeks
ago.

https://www.postgresql.org/message-id/CAEepm%3D0QxoUSkFqYbvmxi2eNvvU6BkqH6fTOu4oOzc1MRAT4Dw%40mail.gmail.com

Could you upgrade and check?

38763d67784c6563d08dbea5c9f913fa174779b8 in master

|commit ba20d392584cdecc2808fe936448d127f43f2c07
|Author: Thomas Munro
|Date:   Thu Sep 20 15:52:39 2018 +1200
|
|    Fix segment_bins corruption in dsa.c.

Justin
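(A quick way to confirm, after the upgrade, that the server is actually running
the patched minor release, for example:

-- Shows the running server version, e.g. 10.6
SHOW server_version;
)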
Performance impact of updating target columns with unchanged values ON CONFLICT
Given a table, `github_repos`, with a multi-column unique index on `org_id`
and `github_id` columns, is there any performance difference (or other
issues to be aware of) between the two bulk upsert operations below? The
difference is that in the first query, the `org_id` and `github_id` columns
are included in the UPDATE, whereas in the second query they are not. Since
the UPDATE runs ON CONFLICT, the updated values of `org_id` and `github_id`
will be the same as the old values, but those columns are included in the
UPDATE because the underlying library I am using is designed that way. I'm
wondering if it's safe to use as-is or whether I should explicitly exclude
those columns in the UPDATE.
Query #1:
INSERT INTO "github_repos" ("org_id","github_id","name")
VALUES (1,1,'foo')
ON CONFLICT (org_id, github_id)
DO UPDATE SET
"org_id"=EXCLUDED."org_id","github_id"=EXCLUDED."github_id","name"=EXCLUDED."name"
RETURNING "id"
Query #2:
INSERT INTO "github_repos" ("org_id","github_id","name")
VALUES (1,1,'foo')
ON CONFLICT (org_id, github_id)
DO UPDATE SET "name"=EXCLUDED."name"
RETURNING "id"
`github_repos` table:
   Column   |       Type        | Collation | Nullable | Default
------------+-------------------+-----------+----------+---------
 id         | bigint            |           | not null |
 org_id     | bigint            |           | not null |
 github_id  | bigint            |           | not null |
 name       | character varying |           | not null |
Indexes:
"github_repos_pkey" PRIMARY KEY, btree (id)
"unique_repos" UNIQUE, btree (org_id, github_id)
Re: Performance impact of updating target columns with unchanged values ON CONFLICT
In other words, is Postgres smart enough to avoid writing to disk the columns
whose values haven't changed, or updating the indexes on those columns?
On Thu, Nov 22, 2018 at 11:32 AM Abi Noda wrote:
> Given a table, `github_repos`, with a multi-column unique index on
> `org_id` and `github_id` columns, is there any performance difference (or
> other issues to be aware of) between the two bulk upsert operations below?
> The difference is that in the first query, the `org_id` and `github_id`
> columns are included in the UPDATE, whereas in the second query they are
> not. Since the UPDATE runs ON CONFLICT, the updated values of `org_id` and
> `github_id` will be the same as the old values, but those columns are
> included in the UPDATE because the underlying library I am using is
> designed that way. I'm wondering if it's safe to use as-is or whether I
> should explicitly exclude those columns in the UPDATE.
>
> Query #1:
>
> INSERT INTO "github_repos" ("org_id","github_id","name")
> VALUES (1,1,'foo')
> ON CONFLICT (org_id, github_id)
> DO UPDATE SET
> "org_id"=EXCLUDED."org_id","github_id"=EXCLUDED."github_id","name"=EXCLUDED."name"
> RETURNING "id"
>
> Query #2:
>
> INSERT INTO "github_repos" ("org_id","github_id","name")
> VALUES (1,1,'foo')
> ON CONFLICT (org_id, github_id)
> DO UPDATE SET "name"=EXCLUDED."name"
> RETURNING "id"
>
> `github_repos` table:
>
>    Column   |       Type        | Collation | Nullable | Default
> ------------+-------------------+-----------+----------+---------
>  id         | bigint            |           | not null |
>  org_id     | bigint            |           | not null |
>  github_id  | bigint            |           | not null |
>  name       | character varying |           | not null |
>
> Indexes:
> "github_repos_pkey" PRIMARY KEY, btree (id)
> "unique_repos" UNIQUE, btree (org_id, github_id)
>
Re: Performance impact of updating target columns with unchanged values ON CONFLICT
On Thu, Nov 22, 2018 at 01:31:10PM -0800, Abi Noda wrote:
> In other words, is Postgres smart enough to avoid writing to disk the columns
> whose values haven't changed, or updating the indexes on those columns?

You're asking about what's referred to as heap-only tuples (HOT):

https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/access/heap/README.HOT;hb=HEAD
https://wiki.postgresql.org/wiki/Index-only_scans#Interaction_with_HOT

Note, if you're doing a lot of updates, you should consider setting a lower
fillfactor for the table, since HOT is only possible if the new tuple (row
version) is on the same page as the old tuple.

|With HOT, a new tuple placed on the same page and with all indexed columns the
|same as its parent row version does not get new index entries.

And check pg_stat_user_tables to verify that's working as intended.

Justin
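
(A minimal sketch of both suggestions above; the fillfactor value is
illustrative, not a recommendation:

-- Leave free space on each page so updated row versions can stay on the
-- same page, which is what makes HOT updates possible.
ALTER TABLE github_repos SET (fillfactor = 80);

-- Compare total updates against HOT updates to verify it's working.
SELECT relname, n_tup_upd, n_tup_hot_upd
FROM pg_stat_user_tables
WHERE relname = 'github_repos';
)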
