Re: jsonb Indexing

2021-09-20 Thread ramikvl

Hello Julien,

On 9/17/21 4:00 PM, Julien Rouhaud wrote:

Hi,

On Fri, Sep 17, 2021 at 9:55 PM  wrote:

I was wondering what I'm doing wrong. These are the steps I've tried:

CREATE TABLE api (
  jdoc jsonb
);

INSERT INTO api (jdoc)
  VALUES ('{
  "guid": "9c36adc1-7fb5-4d5b-83b4-90356a46061a",
  "name": "Angela Barton",
  "is_active": true,
  "company": "Magnafone",
  "address": "178 Howard Place, Gulf, Washington, 702",
  "registered": "2009-11-07T08:53:22 +08:00",
  "latitude": 19.793713,
  "longitude": 86.513373,
  "tags": [
  "enim",
  "aliquip",
  "qui"
  ]
}');

CREATE INDEX idxgintags ON api USING GIN ((jdoc->'tags'));

EXPLAIN ANALYZE SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc ->
'tags' ? 'qui';

And the result is

Seq Scan on api  (cost=0.00..1.02 rows=1 width=64) (actual time=0.019..0.021 rows=1 loops=1)
  Filter: ((jdoc -> 'tags'::text) ? 'qui'::text)
Planning Time: 0.115 ms
Execution Time: 0.047 ms

Do you know why an Index Scan on idxgintags is not used?

Yes, because doing an index scan on a table containing a single row is
an order of magnitude less efficient than simply doing a sequential
scan.  You should try to simulate something close to your production
data to see something interesting.


Thank you for the tip. I've tried to generate more data. I now have 2000
rows in the table, but the query still uses a sequential scan.


Seq Scan on api  (cost=0.00..131.00 rows=2000 width=64) (actual time=0.005..0.959 rows=2000 loops=1)
  Filter: ((jdoc -> 'tags'::text) ? 'qui'::text)
Planning Time: 0.064 ms
Execution Time: 1.027 ms

Any thoughts?
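One quick way to check whether the planner can use the index at all is to
disable sequential scans for the session and re-run the query (a diagnostic
sketch only, using the table and index from above):

-- Diagnostic only: never leave this set in production sessions.
SET enable_seqscan = off;

EXPLAIN ANALYZE
SELECT jdoc->'guid', jdoc->'name'
FROM api
WHERE jdoc -> 'tags' ? 'qui';

RESET enable_seqscan;

If the plan switches to a Bitmap Index Scan on idxgintags, the index is
usable and the planner is simply judging the seq scan to be cheaper.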





Re: Proposed French Translation of Code of Conduct Policy

2021-09-20 Thread Anthony Nowocien
Hi,
not much to add besides thanks to both Alice and Guillaume.
Best regards,
Anthony

On Fri, Sep 17, 2021 at 6:57 AM Stacey Haysler 
wrote:

> The PostgreSQL Community Code of Conduct Committee has received a draft of
> the French translation of the Code of Conduct Policy updated August 18,
> 2020 for review.
>
> The English version of the Policy is at:
> https://www.postgresql.org/about/policies/coc/
>
> The translation was created by: Alice Armand
>
> The translation was reviewed by: Guillaume LeLarge
>
> The proposed translation is attached as both a plain text and a PDF file
> to this message.
>
> If you have any comments or suggestions for the proposed translation,
> please bring them to our attention no later than 5:00 PM Pacific Time on
> Friday, September 23, 2021.
>
> Thank you.
>
> Regards,
> Stacey S. Haysler
> Chair, PostgreSQL Community Code of Conduct Committee
>


Re: jsonb Indexing

2021-09-20 Thread Ilya Anfimov
On Mon, Sep 20, 2021 at 12:52:54PM +0200, rami...@gmail.com wrote:
> Hello Julien,
> 
> On 9/17/21 4:00 PM, Julien Rouhaud wrote:
> > Hi,
> > 
> > On Fri, Sep 17, 2021 at 9:55 PM  wrote:
> > > I was wondering what I'm doing wrong. These are the steps I've tried:
> > > 
> > > CREATE TABLE api (
> > >   jdoc jsonb
> > > );
> > > 
> > > INSERT INTO api (jdoc)
> > >   VALUES ('{
> > >   "guid": "9c36adc1-7fb5-4d5b-83b4-90356a46061a",
> > >   "name": "Angela Barton",
> > >   "is_active": true,
> > >   "company": "Magnafone",
> > >   "address": "178 Howard Place, Gulf, Washington, 702",
> > >   "registered": "2009-11-07T08:53:22 +08:00",
> > >   "latitude": 19.793713,
> > >   "longitude": 86.513373,
> > >   "tags": [
> > >   "enim",
> > >   "aliquip",
> > >   "qui"
> > >   ]
> > > }');
> > > 
> > > CREATE INDEX idxgintags ON api USING GIN ((jdoc->'tags'));
> > > 
> > > EXPLAIN ANALYZE SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc ->
> > > 'tags' ? 'qui';
> > > 
> > > And the result is
> > > 
> > > Seq Scan on api  (cost=0.00..1.02 rows=1 width=64) (actual time=0.019..0.021 rows=1 loops=1)
> > > Filter: ((jdoc -> 'tags'::text) ? 'qui'::text)
> > > Planning Time: 0.115 ms
> > > Execution Time: 0.047 ms
> > > 
> > > Do you know why an Index Scan on idxgintags is not used?
> > Yes, because doing an index scan on a table containing a single row is
> > an order of magnitude less efficient than simply doing a sequential
> > scan.  You should try to simulate something close to your production
> > data to see something interesting.
> 
> Thank you for the tip. I've tried to generate more data. I now have 2000 rows
> in the table, but the query still uses a sequential scan.
> 
> Seq Scan on api  (cost=0.00..131.00 rows=2000 width=64) (actual
> time=0.005..0.959 rows=2000 loops=1)
>   Filter: ((jdoc -> 'tags'::text) ? 'qui'::text)
> Planning Time: 0.064 ms
> Execution Time: 1.027 ms
> 
> Any thoughts?

 The planner expects the index selectivity to be around 1 (i.e. all the rows
selected). And by the way, it was right: all the rows were selected.

 So trying to select something via the index would just waste time compared
to a seq scan.
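To see the index win, the data has to be selective: only a small fraction of
rows may contain the tag being searched for. A minimal sketch (assuming the
api table and idxgintags index from above; the tag values and row counts are
made up for illustration):

-- gen_random_uuid() is built in from PostgreSQL 13 on;
-- earlier versions need: CREATE EXTENSION pgcrypto;
INSERT INTO api (jdoc)
SELECT jsonb_build_object(
         'guid', gen_random_uuid()::text,
         'name', 'user_' || g,
         'tags', CASE WHEN g % 100 = 0          -- ~1% of rows get 'qui'
                      THEN '["enim", "qui"]'::jsonb
                      ELSE '["aliquip", "enim"]'::jsonb
                 END)
FROM generate_series(1, 100000) AS g;

ANALYZE api;

EXPLAIN ANALYZE
SELECT jdoc->'guid', jdoc->'name'
FROM api
WHERE jdoc -> 'tags' ? 'qui';

With only ~1% of rows matching, the planner should now prefer a Bitmap Index
Scan on idxgintags over a Seq Scan.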





Re: jsonb Indexing

2021-09-20 Thread ramikvl



On 9/20/21 12:52 PM, rami...@gmail.com wrote:

Hello Julien,

On 9/17/21 4:00 PM, Julien Rouhaud wrote:

Hi,

On Fri, Sep 17, 2021 at 9:55 PM  wrote:

I was wondering what I'm doing wrong. These are the steps I've tried:

CREATE TABLE api (
  jdoc jsonb
);

INSERT INTO api (jdoc)
  VALUES ('{
  "guid": "9c36adc1-7fb5-4d5b-83b4-90356a46061a",
  "name": "Angela Barton",
  "is_active": true,
  "company": "Magnafone",
  "address": "178 Howard Place, Gulf, Washington, 702",
  "registered": "2009-11-07T08:53:22 +08:00",
  "latitude": 19.793713,
  "longitude": 86.513373,
  "tags": [
  "enim",
  "aliquip",
  "qui"
  ]
}');

CREATE INDEX idxgintags ON api USING GIN ((jdoc->'tags'));

EXPLAIN ANALYZE SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc ->
'tags' ? 'qui';

And the result is

Seq Scan on api  (cost=0.00..1.02 rows=1 width=64) (actual time=0.019..0.021 rows=1 loops=1)
  Filter: ((jdoc -> 'tags'::text) ? 'qui'::text)
Planning Time: 0.115 ms
Execution Time: 0.047 ms

Do you know why an Index Scan on idxgintags is not used?

Yes, because doing an index scan on a table containing a single row is
an order of magnitude less efficient than simply doing a sequential
scan.  You should try to simulate something close to your production
data to see something interesting.


Thank you for the tip. I've tried to generate more data. I now have 2000
rows in the table, but the query still uses a sequential scan.


Seq Scan on api  (cost=0.00..131.00 rows=2000 width=64) (actual time=0.005..0.959 rows=2000 loops=1)
  Filter: ((jdoc -> 'tags'::text) ? 'qui'::text)
Planning Time: 0.064 ms
Execution Time: 1.027 ms

Any thoughts?


Strangely enough, when I re-created the index it works now. I probably made
a mistake earlier.


Thank you.
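For future reference, when an index unexpectedly goes unused it is worth
confirming that it exists as intended. A quick check against the system
catalogs (a sketch using the table name from this thread):

-- List the indexes defined on the table.
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'api';

-- A failed CREATE INDEX CONCURRENTLY leaves an INVALID index behind,
-- which the planner will never use; pg_index exposes that flag.
SELECT c.relname AS index_name, i.indisvalid
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
WHERE i.indrelid = 'api'::regclass;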





Re: Possibilities for optimizing inserts across oracle_fdw foreign data wrapper

2021-09-20 Thread Laurenz Albe
On Sun, 2021-09-19 at 10:28 +, Niels Jespersen wrote:
> We are often using the oracle_fdw to transfer data between Postgres (version 
> 11+) and Oracle (version 18+). It works great.
>  
> However I have a task at hand that requires inserting a few billion rows in 
> an Oracle table from a Postgres query.
>  
> insert into t_ora (a,b,c)   
> select a,b,c from t_pg;
>  
> This is driven from a plpgsql stored procedure, if that matters.
>  
> I want to optimize the running time of this, but I am unsure which
> possibilities, if any, there actually are.
>  
> Reducing the number of network roundtrips is usually a good way to increase 
> throughput. But, how do I do that?
>  
> If I could make the Oracle insert a direct load, that would usually also
> increase throughput. But is that possible here? There are no constraints
> defined on the destination tables.

The cause of the bad performance for bulk data modifications is that the FDW API
is built that way: each row INSERTed means a round trip between PostgreSQL and Oracle.

That could be improved by collecting rows and inserting them in bulk on the Oracle
side, but I don't feel like implementing that and complicating the code.

From my point of view, oracle_fdw is good for reading, but not for bulk writes.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Possibilities for optimizing inserts across oracle_fdw foreign data wrapper

2021-09-20 Thread Magnus Hagander
On Mon, Sep 20, 2021 at 3:18 PM Laurenz Albe  wrote:
>
> On Sun, 2021-09-19 at 10:28 +, Niels Jespersen wrote:
> > We are often using the oracle_fdw to transfer data between Postgres 
> > (version 11+) and Oracle (version 18+). It works great.
> >
> > However I have a task at hand that requires inserting a few billion rows in 
> > an Oracle table from a Postgres query.
> >
> > insert into t_ora (a,b,c)
> > select a,b,c from t_pg;
> >
> > This is driven from a plpgsql stored procedure, if that matters.
> >
> > I want to optimize the running time of this, but I am unsure which
> > possibilities, if any, there actually are.
> >
> > Reducing the number of network roundtrips is usually a good way to increase 
> > throughput. But, how do I do that?
> >
> > If I could make the Oracle insert a direct load, that would usually also
> > increase throughput. But is that possible here? There are no constraints
> > defined on the destination tables.
>
> The cause of the bad performance for bulk data modifications is that the FDW API
> is built that way: each row INSERTed means a round trip between PostgreSQL and Oracle.
>

Just as a note, with PostgreSQL 14 the FDW APIs allow batch insert. It
should be possible to update oracle_fdw to take advantage of that as
well, right?
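For comparison, postgres_fdw already exposes that batching API in PostgreSQL 14
through a batch_size option. A sketch of how it looks there (the server and
table names are placeholders; oracle_fdw would need analogous support first):

-- postgres_fdw, PostgreSQL 14+: send rows in batches of 1000
-- instead of one round trip per row.
ALTER SERVER my_pg_server OPTIONS (ADD batch_size '1000');

-- The option can also be set per foreign table:
ALTER FOREIGN TABLE t_remote OPTIONS (ADD batch_size '1000');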

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Azure Postgresql High connection establishment time

2021-09-20 Thread Abhay Gupta
Hi

We are writing a Python (3.9) Azure function hosted on Linux which is
connecting to Azure PostgreSQL 11. We are using psycopg2 to connect the
Azure function to PostgreSQL.

We are noticing that connection establishment takes around 200-300 ms.
Below is the statement:

conn = psycopg2.connect(conn_string)

conn_string includes host, user, password, dbname, sslmode=require

The connection establishment time is very high, as our functions are written
to be very high performance.

Can you please let us know if there is something we should be doing to
reduce the connection establishment time?

Thanks


Re: Azure Postgresql High connection establishment time

2021-09-20 Thread Michael Lewis
This is not a Postgres issue. Please reach out to the Azure team.



Re: Azure Postgresql High connection establishment time

2021-09-20 Thread Tom Lane
Abhay Gupta  writes:
> we are noticing the connection establishment is taking around 200-300 ms.

Not sure whether that's out of line or not, since you've provided
no background data (e.g. is the connection across a network link?
how busy is the server? what encryption and authentication are
you using?).  However ...

> The connection establishment time is very high as our functions are written
> to be very high performance.

... if you are looking for good performance, the last thing you
should be doing is making a connection per query.  Postgres backend
processes are pretty heavyweight things.  Even after the connection
is complete, there's overhead involved in populating caches and so
forth.  You'd be well-served to use a connection pooler and/or try
to keep an application's connection open once made.

regards, tom lane




Re: Azure Postgresql High connection establishment time

2021-09-20 Thread Abhay Gupta
Hello Tom

Thanks for taking the time to look into this for us. Apologies for not
providing the required data up front.

Yes, Azure PostgreSQL has a private endpoint attached to it, so it is going
through a private link. The server has 2 vCores and 5 GB. Since we are
currently in the development phase there is no load on the server at all.
No encryption, and in the connection string we are passing the user and
password. It is SSL-enabled. To rule out an Azure issue, we installed
PostgreSQL 11 on our on-prem server, and there we also see a connection
latency of about 150-200 ms.

We are only opening one connection in our application, i.e. we close the
connection only after all the work for that user session is done.

Any guidance on what we could check would be very helpful.

Please let me know if I still missed something, as we have just started our
journey with PostgreSQL.

Thanks

On Mon, Sep 20, 2021 at 1:16 PM Tom Lane  wrote:

> Abhay Gupta  writes:
> > we are noticing the connection establishment is taking around 200-300 ms.
>
> Not sure whether that's out of line or not, since you've provided
> no background data (e.g. is the connection across a network link?
> how busy is the server? what encryption and authentication are
> you using?).  However ...
>
> > The connection establishment time is very high as our functions are
> written
> > to be very high performance.
>
> ... if you are looking for good performance, the last thing you
> should be doing is making a connection per query.  Postgres backend
> processes are pretty heavyweight things.  Even after the connection
> is complete, there's overhead involved in populating caches and so
> forth.  You'd be well-served to use a connection pooler and/or try
> to keep an application's connection open once made.
>
> regards, tom lane
>


Re: Azure Postgresql High connection establishment time

2021-09-20 Thread Adrian Klaver

On 9/20/21 10:37 AM, Abhay Gupta wrote:

Hello Tom

Thanks for taking the time to look into this for us. Apologies for not
providing the required data up front.

Yes, Azure PostgreSQL has a private endpoint attached to it, so it is
going through a private link. The server has 2 vCores and 5 GB. Since we
are currently in the development phase there is no load on the server at
all. No encryption, and in the connection string we are passing the user
and password. It is SSL-enabled. To rule out an Azure issue, we installed
PostgreSQL 11 on our on-prem server, and there we also see a connection
latency of about 150-200 ms.

We are only opening one connection in our application, i.e. we close the
connection only after all the work for that user session is done.

Any guidance on what we could check would be very helpful.


Per the upstream comment, take a look at connection pooling.

Two solutions that come to mind:

Pgpool-II

https://www.pgpool.net/mediawiki/index.php/Main_Page

PgBouncer

http://www.pgbouncer.org/
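As a rough illustration, a minimal PgBouncer setup looks like this (a sketch
only; the database name, auth file path, and ports are placeholders):

; pgbouncer.ini
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
default_pool_size = 20

The application then connects to port 6432 instead of 5432; PgBouncer hands
out already-established server connections, so the per-request connection
cost largely disappears.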




Thanks





--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Azure Postgresql High connection establishment time

2021-09-20 Thread Peter J. Holzer
On 2021-09-20 13:37:52 -0400, Abhay Gupta wrote:
> Yes, Azure postgresql has a private endpoint attached to it so it is going
> through a private link. Also the server has 2 vcore and 5gb. Since currently 
> we
> are in development phase there is no load on the server at all. No encryption
> and in the connection string we are passing the user and password. It is Ssl
> enabled. To negate it is Azure issue, we installed postgresql 11 on our on 
> prem
> server and there also we are seeing a connection latency of about 150-200 ms.

How are you authenticating? Are you using password hashes stored in the
database (MD5, or preferably SCRAM-SHA-256), or are you authenticating
against an external source (e.g. Active Directory)? If the latter, what
and how? Can you measure how long the authentication takes?

I have seen LDAP authentication against a local AD instance take over a
hundred milliseconds, so I wouldn't be surprised if that was the culprit.
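One low-effort way to see where the time goes (a sketch; assumes sufficient
privileges on the server) is to turn on connection logging:

-- Log each connection attempt; requires superuser.
ALTER SYSTEM SET log_connections = on;
SELECT pg_reload_conf();

With a timestamp in log_line_prefix, the gap between the "connection
received" and "connection authorized" log lines is roughly the
authentication cost.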

hp

PS: a local connection (ident) takes about 3.5 ms on my (not very fast)
laptop.

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"




Timestamp with vs without time zone.

2021-09-20 Thread Tim Uckun
I am hoping to get some clarification on timestamp with time zone.

My understanding is that timestamp with time zone stores data in UTC
but displays it in your time zone.  Does this also work on queries? If
I query between noon and 2:00 PM on some date in time zone XYZ does pg
translate the query to UTC before sending it to the server?

To provide context I have the following situation.

I have a data file to import. All the dates are in the time zone
Pacific/Auckland. My app reads the data, does some processing and
cleanup, and then saves it to the database.

The language I am using creates the time data type with the right time
zone. The processing is being done on a server which is on UTC, and the
database server is also on UTC. I am pretty sure the ORM isn't appending
"at time zone Pacific/Auckland" to the data when it writes it to the
database.

So does the database know the timestamp is in Auckland time when the
client is connecting from a server on UTC?
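For what it's worth, the behaviour is easy to check in psql (a sketch; the
table and the timestamps are invented for illustration):

-- timestamptz stores an absolute instant (internally UTC) and renders it
-- in the session's TimeZone setting on output.
CREATE TABLE events (ts timestamptz);

SET TIME ZONE 'Pacific/Auckland';
INSERT INTO events VALUES ('2021-09-20 13:00:00');  -- read as Auckland time

SET TIME ZONE 'UTC';
SELECT ts FROM events;   -- displayed as 2021-09-20 01:00:00+00

-- Literals with an explicit zone match the same instants regardless of
-- the session's TimeZone setting:
SELECT *
FROM events
WHERE ts BETWEEN '2021-09-20 12:00 Pacific/Auckland'
             AND '2021-09-20 14:00 Pacific/Auckland';

So the comparison happens on absolute instants; what matters is only which
time zone the input literals are interpreted in (the session TimeZone,
unless the literal carries an explicit zone).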