Re: bigint out of range

2019-05-18 Thread Peter J. Holzer
On 2019-05-16 08:48:51 -0700, David G. Johnston wrote:
> On Thu, May 16, 2019 at 8:31 AM Daulat Ram  wrote:
> 
> 
> url_hash    | bigint  |   | not null |
> 
> 
> Change the type of url_hash; make it text instead of bigint.

Or numeric(38, 0). I think it isn't a coincidence that he's trying to store
a 38-digit number in it.

> As a bonus:
> 
> Use text instead of arbitrary varchar(4000) fields and add, e.g.,
> check (length(url) < 4000) or something better.
> 
> Needing 38 digit integers for "id" fields seems odd.

I guess the schema was copied from Oracle. In Oracle, all numbers are
really 38-digit decimal floating-point numbers, and the limit for
varchar2 is 4000.
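
For illustration, a minimal sketch of the suggested changes (the table and
constraint names here are assumptions, not the original poster's actual
schema):

    -- keep the 38-digit hash as an exact decimal:
    ALTER TABLE urls ALTER COLUMN url_hash TYPE numeric(38,0);
    -- or treat the hash as an opaque string instead:
    ALTER TABLE urls ALTER COLUMN url_hash TYPE text USING url_hash::text;

    -- and for the varchar(4000) fields:
    ALTER TABLE urls ALTER COLUMN url TYPE text;
    ALTER TABLE urls ADD CONSTRAINT url_length_check CHECK (length(url) < 4000);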

hp

-- 
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | h...@hjp.at        | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson




Strange performance degradation in sql function (PG11.1)

2019-05-18 Thread Alastair McKinley
Hi all,

I recently experienced a performance degradation in an operational system that
I can't explain.  I had a function wrapper for an aggregate query that was
performing well, using the expected indexes, with the approximate structure
shown below.

create or replace function example_function(param1 int, param2 int) returns 
setof custom_type as
$$
select * from big_table where col1 = param1 and col2 = param2;
$$ language sql;

After creating two new indexes on this table to support a different use case
during a migration, the performance of this unchanged function dropped by
several orders of magnitude.  Running the query inside the function manually on
the console, however, worked as expected, and the query plan did not appear to
have changed.  On a hunch I changed the structure of the function to the
structure below, and immediately the query performance returned to the expected
baseline.

create or replace function example_function(param1 int, param2 int) returns 
setof custom_type as
$$
BEGIN
return query execute format($query$
select * from big_table where col1 = %1$L and col2 = %2$L
$query$,param1,param2);
END;
$$ language plpgsql;

The source data itself did not change during the time when I noticed this 
issue.  Can anyone explain or guess what could have caused this degradation?  
The only other maintenance that I attempted was 'DISCARD PLANS;' which did not 
help.

Best regards,

Alastair


Re: Strange performance degradation in sql function (PG11.1)

2019-05-18 Thread Adrian Klaver

On 5/18/19 4:17 AM, Alastair McKinley wrote:

Hi all,

I recently experienced a performance degradation in an operational
system that I can't explain.  I had a function wrapper for an aggregate
query that was performing well, using the expected indexes, with the
approximate structure shown below.


create or replace function example_function(param1 int, param2 int) 
returns setof custom_type as

$$
     select * from big_table where col1 = param1 and col2 = param2;
$$ language sql;

After creating two new indexes on this table to support a different use
case during a migration, the performance of this unchanged function dropped


Postgres version?

Was the migration from one Postgres version to another?

Did you run ANALYZE after migration?

More below.


by several orders of magnitude.  Running the query inside the function
manually on the console, however, worked as expected, and the query plan
did not appear to have changed.  On a hunch I changed the structure of
the function to the structure below, and immediately the query
performance returned to the expected baseline.


Can you provide the EXPLAIN ANALYZE for each case? If you are worried
about the information revealed, maybe use the anonymization available here:


https://explain.depesz.com/

Using EXECUTE will override the plan caching in plpgsql.



create or replace function example_function(param1 int, param2 int) 
returns setof custom_type as

$$
BEGIN
     return query execute format($query$
         select * from big_table where col1 = %1$L and col2 = %2$L
     $query$,param1,param2);
END;
$$ language plpgsql;

The source data itself did not change during the time when I noticed 
this issue.  Can anyone explain or guess what could have caused this 
degradation?  The only other maintenance that I attempted was 'DISCARD 
PLANS;' which did not help.


Best regards,

Alastair



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Loading table with indexed jsonb field is stalling

2019-05-18 Thread Adrian Klaver

On 5/17/19 12:10 PM, Will Hartung wrote:
I am trying to load data into a table with a jsonb field that is 
indexed as gin (field jsonb_path_ops).




In addition to what Tim asked:

Is the jsonb field the only field in the table?

Can we see the table schema?




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: bigint out of range

2019-05-18 Thread David G. Johnston
On Saturday, May 18, 2019, Peter J. Holzer  wrote:

> On 2019-05-16 08:48:51 -0700, David G. Johnston wrote:
> > On Thu, May 16, 2019 at 8:31 AM Daulat Ram 
> wrote:
> >
> >
> > url_hash    | bigint  |   | not null |
> >
> >
> > Change the type of url_hash; make it text instead of bigint.
>
> Or numeric(38, 0). I think it isn't coincidence that he tries to store
> a 38-digit number in it.
>

You don’t perform math on a hash, thus it’s not a number, no matter that it
may contain only digits.

David J.


Re: Loading table with indexed jsonb field is stalling

2019-05-18 Thread Tom Lane
Will Hartung  writes:
> I am trying to load data into a table with a jsonb field that is indexed
> as gin (field jsonb_path_ops).
> ...
> The current file is "stuck", pushing past 20hrs so far.

In addition to the questions about what PG version you're using, is
the backend process that's doing the load actually consuming CPU time,
or is it just sitting?

If the latter, I wonder whether you're seeing the deadlock against
VACUUM that was fixed a few months ago.  A similarly-stuck vacuum
or autovacuum process would be pretty conclusive ...
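
One quick way to look (a sketch; assumes PG 9.6 or later, where
pg_stat_activity exposes wait events):

    -- a stuck loader and a stuck (auto)vacuum would both show a wait event
    -- rather than accumulating CPU time
    SELECT pid, state, wait_event_type, wait_event, left(query, 60) AS query
    FROM pg_stat_activity
    WHERE state <> 'idle';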

Author: Alexander Korotkov 
Branch: master [fd83c83d0] 2018-12-13 06:55:34 +0300
Branch: REL_11_STABLE Release: REL_11_2 [9aa94d853] 2018-12-13 06:15:23 +0300
Branch: REL_10_STABLE Release: REL_10_7 [2e3bd064e] 2018-12-13 06:22:39 +0300

Fix deadlock in GIN vacuum introduced by 218f51584d5

Before 218f51584d5, if a posting tree page was about to be deleted, the whole
posting tree was locked by LockBufferForCleanup() on the root, preventing all
concurrent inserts.  218f51584d5 reduced the locking to the subtree containing
the page to be deleted.  However, due to concurrent parent splits, an inserter
doesn't always hold pins on all the pages constituting the path from root to
the target leaf page.  That could cause a deadlock between the GIN vacuum
process and a GIN inserter, and we didn't find a non-invasive way to fix this.

This commit reverts the VACUUM behavior to locking the whole posting tree
before deleting any page.  However, we keep another useful change from
218f51584d5: the tree is locked only if there are pages to be deleted.

regards, tom lane




Re: bigint out of range

2019-05-18 Thread Peter J. Holzer
On 2019-05-18 10:49:53 -0700, David G. Johnston wrote:
> On Saturday, May 18, 2019, Peter J. Holzer  wrote:
> 
> On 2019-05-16 08:48:51 -0700, David G. Johnston wrote:
> > On Thu, May 16, 2019 at 8:31 AM Daulat Ram 
> wrote:
> >
> >
> >     url_hash    | bigint  |   | not null |
> >
> >
> > Change the type of url_hash; make it text instead of bigint.
> 
> Or numeric(38, 0). I think it isn't coincidence that he tries to store
> a 38-digit number in it.
> 
> 
> You don’t perform math on a hash

That's not generally true. Hashes are used for further computation, for
example in hash tables or in cryptography.

> thus it’s not a number

This is just silly. All hash functions I have ever encountered compute a
single fixed-size integer from a stream of integers. The result may be
larger than a machine word, in which case the representation in C (or a
similar low-level language) may be an array of words (or bytes), but
it's still an integer.
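
For example, a common PostgreSQL idiom treats the first 64 bits of an md5
digest as a bigint (a sketch; the example URL is made up, and this is not
necessarily the hash the original poster used):

    SELECT ('x' || substr(md5('https://example.com/'), 1, 16))::bit(64)::bigint;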

hp

-- 
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | h...@hjp.at        | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson




Re: bigint out of range

2019-05-18 Thread Ron

On 5/18/19 2:27 PM, Peter J. Holzer wrote:

On 2019-05-18 10:49:53 -0700, David G. Johnston wrote:

On Saturday, May 18, 2019, Peter J. Holzer  wrote:

 On 2019-05-16 08:48:51 -0700, David G. Johnston wrote:
 > On Thu, May 16, 2019 at 8:31 AM Daulat Ram 
 wrote:
 >
 >
 >     url_hash    | bigint  |   | not null |
 >
 >
 > Change the type of url_hash; make it text instead of bigint.

 Or numeric(38, 0). I think it isn't coincidence that he tries to store
 a 38-digit number in it.


You don’t perform math on a hash

That's not generally true. Hashes are used for further computation for
example in hash tables or in cryptography.


How is it "using math" to use a hash key in a hash lookup table?

--
Angular momentum makes the world go 'round.




Re: bigint out of range

2019-05-18 Thread Peter J. Holzer
On 2019-05-18 15:19:22 -0500, Ron wrote:
> On 5/18/19 2:27 PM, Peter J. Holzer wrote:
> > On 2019-05-18 10:49:53 -0700, David G. Johnston wrote:
> > > You don’t perform math on a hash
> > That's not generally true. Hashes are used for further computation for
> > example in hash tables or in cryptography.
> 
> How is it "using math" to use a hash key in a hash lookup table?

hash modulo table size.

hp

-- 
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | h...@hjp.at        | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson




Re: bigint out of range

2019-05-18 Thread Ron

On 5/18/19 3:49 PM, Peter J. Holzer wrote:

On 2019-05-18 15:19:22 -0500, Ron wrote:

On 5/18/19 2:27 PM, Peter J. Holzer wrote:

On 2019-05-18 10:49:53 -0700, David G. Johnston wrote:

You don’t perform math on a hash

That's not generally true. Hashes are used for further computation for
example in hash tables or in cryptography.

How is it "using math" to use a hash key in a hash lookup table?

hash modulo table size.


I've seen that used when the *tablespace* is pre-allocated, and you hash 
modulo the tablespace *page number*.  (Yes, performance tanks when you start 
filling up pages.)  How do you hash on the (ever growing) table size?


--
Angular momentum makes the world go 'round.


Re: PostgreSQL on Amazon RDS

2019-05-18 Thread Alex Aquino
Jeremy Schneider - Thanks for that psqlrc file.  Pretty informative. :-)

On Wed, May 8, 2019 at 11:55 AM Jeremy Schneider 
wrote:

> On 5/6/19 23:27, Rashmi V Bharadwaj wrote:
> > Is there a SQL query or a database parameter setting that I can use from
> > an external application to determine if the PostgreSQL database is on
> > cloud (like on Amazon RDS or IBM cloud) or on a non-cloud on-prem
> > environment?
>
> Here's my psqlrc file which has pretty solid logic for detecting between
> Community PostgreSQL, RDS PostgreSQL and Aurora PostgreSQL.  Note that
> it just assumes "community/oss" as a default if it doesn't detect the
> other two.  Should be easy enough to add detection of other hosted
> environments into the query with the "priority" column (as long as
> there's a reliable way to detect).
>
> https://gist.github.com/ardentperf/52bd418e44b1be26d7b63af21331cece
>
> This psqlrc also detects read-write versus read-only (e.g. hot standby),
> and the usual username/database/superuser info and builds everything
> into the prompt in a way that suits me.
>
> -Jeremy
>
> --
> http://about.me/jeremy_schneider
>
>
>
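
The core of the detection logic can be sketched in a few lines (assumptions:
Aurora exposes an aurora_version() function and RDS creates an rds_superuser
role; the gist linked above is more thorough):

    SELECT CASE
             WHEN EXISTS (SELECT 1 FROM pg_proc
                          WHERE proname = 'aurora_version') THEN 'Aurora PostgreSQL'
             WHEN EXISTS (SELECT 1 FROM pg_roles
                          WHERE rolname = 'rds_superuser')  THEN 'RDS PostgreSQL'
             ELSE 'Community PostgreSQL'
           END AS flavor;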


Re: bigint out of range

2019-05-18 Thread Peter J. Holzer
On 2019-05-18 17:14:59 -0500, Ron wrote:
> On 5/18/19 3:49 PM, Peter J. Holzer wrote:
> 
> On 2019-05-18 15:19:22 -0500, Ron wrote:
> 
> On 5/18/19 2:27 PM, Peter J. Holzer wrote:
> 
> On 2019-05-18 10:49:53 -0700, David G. Johnston wrote:
> 
> You don’t perform math on a hash
> 
> That's not generally true. Hashes are used for further 
> computation for
> example in hash tables or in cryptography.
> 
> How is it "using math" to use a hash key in a hash lookup table?
> 
> hash modulo table size.
> 
> 
> I've seen that used when the tablespace is pre-allocated, and you hash modulo
> the tablespace page number.  (Yes, performance tanks when you start filling up
> pages.)  How do you hash on the (ever growing) table size?

The hash function returns a number in a range much larger than the
possible number of buckets. 64 bits is a good choice today. 

To determine the bucket you need to reduce this number to something in
the range [0, nr_buckets). This is where modulo comes in:

i = h % nr_buckets

If the table fills up, you increase nr_buckets, reallocate, and
rehash all entries.

(If nr_buckets is a power of two, the modulo operation can be
implemented efficiently with a bitwise AND.)
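
A sketch of both forms in SQL, using PostgreSQL's built-in (internal,
undocumented, 32-bit) hashtext() just for illustration:

    -- map a key into one of 1024 buckets
    SELECT abs(hashtext('some key')) % 1024 AS bucket_modulo,
           hashtext('some key') & 1023      AS bucket_bitwise;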

hp

-- 
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | h...@hjp.at        | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson




Re: bigint out of range

2019-05-18 Thread Ron

On 5/18/19 5:39 PM, Peter J. Holzer wrote:

On 2019-05-18 17:14:59 -0500, Ron wrote:

On 5/18/19 3:49 PM, Peter J. Holzer wrote:

 On 2019-05-18 15:19:22 -0500, Ron wrote:

 On 5/18/19 2:27 PM, Peter J. Holzer wrote:

 On 2019-05-18 10:49:53 -0700, David G. Johnston wrote:

 You don’t perform math on a hash

 That's not generally true. Hashes are used for further computation 
for
 example in hash tables or in cryptography.

 How is it "using math" to use a hash key in a hash lookup table?

 hash modulo table size.


I've seen that used when the tablespace is pre-allocated, and you hash modulo
the tablespace page number.  (Yes, performance tanks when you start filling up
pages.)  How do you hash on the (ever growing) table size?

The hash function returns a number in a range much larger than the
possible number of buckets. 64 bits is a good choice today.

To determine the bucket you need to reduce this number to something in
the range [0, nr_buckets). This is where modulo comes in:

i = h % nr_buckets

If the table fills up, you increase nr_buckets, reallocate and
rehash all entries.


Ouch.  Response time on a big table would take a serious hit if that rehash 
happened in the middle of the day on a big OLTP system.  Even worse if it 
were a 24x365 system, because you couldn't schedule an enlargement/rehash 
during a down period.



--
Angular momentum makes the world go 'round.




Re: Strange performance degradation in sql function (PG11.1)

2019-05-18 Thread Andrew Gierth
> "Alastair" == Alastair McKinley  writes:

 Alastair> Hi all,

 Alastair> I recently experienced a performance degradation in an
 Alastair> operational system that I can't explain. I had a function
 Alastair> wrapper for an aggregate query that was performing well, using
 Alastair> the expected indexes, with the approximate structure shown
 Alastair> below.

 Alastair> create or replace function example_function(param1 int, param2 int) 
returns setof custom_type as
 Alastair> $$
 Alastair> select * from big_table where col1 = param1 and col2 = param2;
 Alastair> $$ language sql;

This function isn't inlinable because it's missing a STABLE qualifier; that's
a pretty big issue.

Without inlining, the function will be run only with generic plans,
which means that the decision about index usage will be made without
knowledge of the parameter values.
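
For reference, the wrapper from the original post with the qualifier added (a
sketch; STABLE is only correct if the function neither modifies data nor
depends on anything volatile):

    create or replace function example_function(param1 int, param2 int)
    returns setof custom_type as
    $$
        select * from big_table where col1 = param1 and col2 = param2;
    $$ language sql stable;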

Was your actual function inlinable? See
https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions

You can get the query plan of a non-inlined function using the
auto_explain module (with its log_nested_statements option). The query
plan of non-inlined function calls is not otherwise shown by EXPLAIN.
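
A minimal per-session setup (a sketch; LOAD requires superuser, and the same
settings can instead go into postgresql.conf via session_preload_libraries):

    LOAD 'auto_explain';
    SET auto_explain.log_min_duration = 0;        -- log a plan for every statement
    SET auto_explain.log_nested_statements = on;  -- include statements inside functions
    SELECT * FROM example_function(1, 2);         -- plan appears in the server log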

 Alastair> After creating two new indexes on this table to support a
 Alastair> different use case during a migration, the performance of this
 Alastair> unchanged function dropped by several orders of magnitude.
 Alastair> Running the query inside the function manually on the console,
 Alastair> however, worked as expected, and the query plan did
 Alastair> not appear to have changed.

But when you run it manually, you'll get a custom plan, based on the
parameter values.

 Alastair> On a hunch I changed the structure of the function to the
 Alastair> structure below and immediately the query performance
 Alastair> returned to the expected baseline.

 Alastair> create or replace function example_function(param1 int, param2 int) 
returns setof custom_type as
 Alastair> $$
 Alastair> BEGIN
 Alastair> return query execute format($query$
 Alastair> select * from big_table where col1 = %1$L and col2 = %2$L
 Alastair> $query$,param1,param2);
 Alastair> END;
 Alastair> $$ language plpgsql;

Using EXECUTE in plpgsql will get you a custom plan every time (though
you really should have used EXECUTE USING rather than interpolating the
parameters into the query string).
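
That is, roughly (a sketch based on the function above):

    return query execute
        'select * from big_table where col1 = $1 and col2 = $2'
        using param1, param2;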

I suggest looking into the inlining question first.

-- 
Andrew (irc:RhodiumToad)