Re: Weird seqscan node plan

2019-11-26 Thread Andrei Zhidenkov
How many tables do you have in your query? If too many, in your case “Genetic 
Query Optimization” might be used 
(https://www.postgresql.org/docs/10/geqo-pg-intro.html).

> On 26. Nov 2019, at 03:19, Игорь Выскорко  wrote:
> 
> Hi all!
> I'm confused by planner choice: seqscan but not index scan when index scan 
> cost is much cheaper.
> 1st plan: https://explain.depesz.com/s/Cti#l8
> 2nd plan (with "set enable_seqscan = off"): 
> https://explain.depesz.com/s/qn0I#l8
> 
> Look at 8th row: this nested loop decided to join the second node by using 
> seqscan (1st plan) when index is available (2nd plan). Index scan is much 
> cheaper (0.430 over 257.760). 
> 
> What am I missing?
> 
> And thanks for any reply!
> 
> 



Re: Weird seqscan node plan

2019-11-26 Thread Игорь Выскорко
26.11.2019, 16:02, "Andrei Zhidenkov" :

> How many tables do you have in your query? If too many, in your case “Genetic
> Query Optimization” might be used
> (https://www.postgresql.org/docs/10/geqo-pg-intro.html).
>
>> On 26. Nov 2019, at 03:19, Игорь Выскорко wrote:
>>
>> Hi all!
>> I'm confused by planner choice: seqscan but not index scan when index scan
>> cost is much cheaper.
>> 1st plan: https://explain.depesz.com/s/Cti#l8
>> 2nd plan (with "set enable_seqscan = off"): https://explain.depesz.com/s/qn0I#l8
>>
>> Look at 8th row: this nested loop decided to join the second node by using
>> seqscan (1st plan) when index is available (2nd plan). Index scan is much
>> cheaper (0.430 over 257.760).
>>
>> What am I missing?
>>
>> And thanks for any reply!

Hm... about 12 tables.
I tried to disable geqo (set geqo = off;) and the plan didn't change. But thanks for your try )

Version I'm using (if it matters):
select version();
PostgreSQL 9.6.1 on x86_64-pc-linux-gnu, compiled by gcc (Debian 4.9.2-10) 4.9.2, 64-bit

pgmodeler an server V12

2019-11-26 Thread stan
Looks like pgmodeler still requires use of the Beta version to work with a
V11 server. I am planning on moving to V12 in the near future. Will this same
Beta version work with V12?

-- 
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
-- Benjamin Franklin




Postgres Full Text Search Jsonb Array column does not search for first row

2019-11-26 Thread Dmytro Zhluktenko
Hey,

Pgsql is unable to perform indexed full text search onto a jsonb column containing
an array when looking for the first row in the table. Any ideas why this is happening?

The table is nothing more than just Id and Foo, which is a jsonb column. The case
is that I have an empty database with a predefined gin index repro_fts_idx on the
make_tsvector function. make_tsvector creates a tsvector from the given jsonb column.

When I add a new item into the table, I expect it to appear in the make_tsvector
function in the form of a tsvector. It's there. Also, I expect that if I run a full
text search query onto it, it would appear in the search results. However, this is
not the case, because it returns empty specifically for the first row. It simply
does not take it into account. If I add one more row which is completely the same,
the system is able to find it with the same query.

Here is a small repro case:

-- drop table cp."Repro" cascade
CREATE TABLE cp."Repro" (
    "Id" serial NOT NULL,
    "Foo" jsonb NULL
);

CREATE OR REPLACE FUNCTION cp.make_tsvector(in_t cp."Repro")
 RETURNS tsvector
 LANGUAGE plpgsql
 IMMUTABLE
AS $function$
begin
    return to_tsvector(jsonb_agg(x.prop))
    from (SELECT CONCAT(
              jsonb_array_elements(in_t."Foo") ->> 'Name', ' ',
              jsonb_array_elements(in_t."Foo") ->> 'Address'
          ) as prop
          from cp."Repro" f) as x;
END;
$function$;

CREATE INDEX repro_fts_idx ON cp."Repro"
    USING gin (cp.make_tsvector(cp."Repro".*))
    WITH (fastupdate=off, gin_pending_list_limit='64');

INSERT INTO cp."Repro"("Foo")
VALUES ('[{"Name": "Sup", "Address": "Adress", "IsCurrent": true}]');

-- just in case it's the indexing issue
-- REINDEX INDEX cp.repro_fts_idx;

select * from cp."Repro";
select cp.make_tsvector(x) from cp."Repro" x;
select * from ts_stat('select cp.make_tsvector(x) from cp."Repro" x');

-- explain analyze
SELECT * FROM "cp"."Repro" x where cp.make_tsvector(x) @@ 'sup:*'::tsquery;

INSERT INTO cp."Repro"("Foo")
VALUES ('[{"Name": "Sup", "Address": "Adress", "IsCurrent": true}]');

-- explain analyze
SELECT * FROM "cp"."Repro" x where cp.make_tsvector(x) @@ 'sup:*'::tsquery;

BR, Dmytro.




RE: Weird seqscan node plan

2019-11-26 Thread Igor Neyman
From: Игорь Выскорко [mailto:vyskorko.i...@yandex.ru]
Sent: Tuesday, November 26, 2019 4:13 AM
To: Andrei Zhidenkov 
Cc: pgsql-general@lists.postgresql.org
Subject: Re: Weird seqscan node plan

26.11.2019, 16:02, "Andrei Zhidenkov" <andrei.zhiden...@n26.com>:
How many tables do you have in your query? If too many, in your case “Genetic 
Query Optimization” might be used 
(https://www.postgresql.org/docs/10/geqo-pg-intro.html).

On 26. Nov 2019, at 03:19, Игорь Выскорко <vyskorko.i...@yandex.ru> wrote:

Hi all!
I'm confused by planner choice: seqscan but not index scan when index scan cost 
is much cheaper.
1st plan: https://explain.depesz.com/s/Cti#l8
2nd plan (with "set enable_seqscan = off"): https://explain.depesz.com/s/qn0I#l8

Look at 8th row: this nested loop decided to join the second node by using 
seqscan (1st plan) when index is available (2nd plan). Index scan is much 
cheaper (0.430 over 257.760).

What am I missing?

And thanks for any reply!


Hm... about 12 tables.
I tried to disable geqo (set geqo = off;) and the plan didn't change. But thanks for 
your try )

Version I'm using (if it matters):
select version();
PostgreSQL 9.6.1 on x86_64-pc-linux-gnu, compiled by gcc (Debian 4.9.2-10) 
4.9.2, 64-bit

Try increasing the following parameters to 14 (or even 16, if you are not sure 
about the number of tables involved):

geqo_threshold = 14
from_collapse_limit = 14
join_collapse_limit = 14

“about 12” is too close to the default limit, 12.
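
For example, these can be tried per session before re-running the query (a sketch; the values are illustrative, not recommendations):

```sql
-- Raise the limits past the number of tables in the query, so the
-- exhaustive planner (rather than GEQO) searches the full join space.
SET geqo_threshold = 14;
SET from_collapse_limit = 14;
SET join_collapse_limit = 14;

-- Verify the session values before re-running EXPLAIN on the query.
SHOW geqo_threshold;
SHOW join_collapse_limit;
```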

Regards,
Igor Neyman



Re: I think that my data is saved correctly, but when printing again, other data appears

2019-11-26 Thread Joe Conway
On 11/25/19 4:38 PM, Yessica Brinkmann wrote:
> Well, as I was told that I should save the
> CurrentMemoryContext before starting the SPI, Let's say it would be
> before doing SPI_connect (). Is this correct? And then I must use
> MemoryContextStrdup. As you told me the MemoryContextStrdup It is
> used to create a copy of a string in a specific memory context. Well,
> where in the source code should I use MemoryContextStrdup? After 
> doing the SPI_connect () or where? I would use it from 
> MemoryContextStrdup to copy the data variable as I understand it, But
> in what context would I have to create the copy? In a new context or
> what would the theme be like? Should I use AllocSetContextCreate to
> create the new context or what would the theme be like? And if I have
> to create the new context with AllocSetContextCreate, where in the
> source code will I have to create it? After doing SPI_connect () or
> where? The truth is that I also read the source code of 
> https://github.com/cohenjo/pg_idx_advisor but I don't see that The 
> MemoryContextStrdup that they told me to use has been used there. 
> Sorry for the inconvenience and see the same thing again. But as I 
> indicated, I also read the explanations they told me to read (which
> were explanations about the memory contexts in Postgresql mainly) but
> there were no examples of source code. And since there is nothing on
> the Internet of examples that will help me to use, I am asking
> again. And the truth is that I didn't find examples of this in the
> Postgres source code, just definitions, That is the source code
> where MemoryContextStrdup is defined. It may be very easy for you and
> you will see it very clearly, but for me it really is not, and there
> are no examples of use on the Internet. I really searched a lot and
> found nothing. I would greatly appreciate a help please.
Sorry but I am not going to try to address that wall of text ;-)
But here is some general information about how that stuff works:
-
1. The most common pattern is something like this:

   MemoryContext oldcontext;

   oldcontext = MemoryContextSwitchTo(target_context);  /* the context to allocate in */

   /* do stuff that allocates memory
    * using PostgreSQL allocation functions
    * e.g. palloc, pstrdup, other exported
    * backend functions, etc
    */

   MemoryContextSwitchTo(oldcontext);

2. MemoryContextStrdup() is similar to the above, except in that case
   you do not need MemoryContextSwitchTo(). It directly allocates into
   the specified memory context without all the switching back and
   forth. If you are simply copying one string and need it in a context
   other than what is current, it is more convenient. But either method
   could be used.

3. When you run SPI_connect() the memory context is switched
   transparently for you to a special SPI memory context. When you run
   SPI_finish() the original memory context (the one in effect before
   SPI_connect) is restored.

4. Depending on what you are trying to do, use method #1 or method #2 if
   needed, including while doing SPI related things (in between
   SPI_connect and SPI_finish)

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: I think that my data is saved correctly, but when printing again, other data appears

2019-11-26 Thread Yessica Brinkmann
I understand. Thank you very much for clearing things up.
It helps me a lot, especially point 3.
"3. When you run SPI_connect () the memory context is switched
transparently for you to a special SPI memory context. When you run
SPI_finish () the original memory context (the one in effect before
SPI_connect) is restored. "
Best regards,
Yessica Brinkmann


Re: Postgres Full Text Search Jsonb Array column does not search for first row

2019-11-26 Thread Laurenz Albe
On Tue, 2019-11-26 at 13:37 +0200, Dmytro Zhluktenko wrote:
> Pgsql is unable to perform indexed full text search onto jsonb column 
> containing an array when looking for the first row in the table.
> 
> Any ideas why this is happening?
>  
> CREATE OR REPLACE FUNCTION cp.make_tsvector(in_t cp."Repro")
>  RETURNS tsvector
>  LANGUAGE plpgsql
>  IMMUTABLE
> 
> [...] 
>  
> CREATE INDEX repro_fts_idx ON cp."Repro" USING gin 
> (cp.make_tsvector(cp."Repro".*)) WITH (fastupdate=off, 
> gin_pending_list_limit='64');
>  
> [...]
>
> -- explain analyze
> SELECT *
>  FROM "cp"."Repro" x where cp.make_tsvector(x) @@ 'sup:*'::tsquery

One possibility is that there are just too few rows in the table.

SET enable_seqscan = off;

and then try again.

If that is not the problem, please provide EXPLAIN (ANALYZE, BUFFERS) output for
the query.
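
A sketch of that diagnostic session (table and query taken from the original report):

```sql
-- Disable seq scans for this session only, to see whether the
-- index can be used at all.
SET enable_seqscan = off;

-- Then capture the actual execution details for the failing query.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM "cp"."Repro" x
WHERE cp.make_tsvector(x) @@ 'sup:*'::tsquery;
```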

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: PostGreSQL Replication and question on maintenance

2019-11-26 Thread github kran
Thanks Jeff for your reply. We are not using replication here but are copying the
data with a few custom SQL scripts. Our PostgreSQL engine is managed by AWS, and
the instance was restarted automatically based on the AWS design of the PostgreSQL
engine. The replica is restarted automatically when its data sync falls behind the
master.

My intention in bringing this to the forum is to see if this is expected behavior
for the use case I described. I love this community and the advice it has for
various problems. I still need to find out how I can do this using replication.
Please point me to a few resources if there is a better solution for this problem.

Thanks
githubkran



On Sat, Nov 16, 2019 at 11:13 AM Jeff Janes  wrote:

> On Thu, Nov 14, 2019 at 12:23 PM github kran  wrote:
>
>>
>>>
>>> *Problem what we have right now. *
>>>
>>> When the migration activity runs(weekly) from past 2 times , we saw the
>>> cluster read replica instance has restarted as it fallen behind the
>>> master(writer instance).
>>>
>>
> I can't figure out what your setup is here.  You must be using logical
> replication (not physical) or you wouldn't be able to write to the replica
> at all.  But if you are using logical replication, why do you also need
> these weekly jobs?  Why isn't logical replication taking care of it?
>
>
>
>
>> Everything
>>>
>>> after that worked seamlessly but we want to avoid the replica getting
>>> restarted. To avoid from restart we started doing smaller binary files and
>>> copy those files to the cluster-2
>>>
>>
> Who restarted it?  I am not aware of any case where the replica responds
> to falling behind by restarting itself.  With physical replication, it can
> start cancelling queries, but you don't seem to be using physical
> replication.
>
> Cheers,
>
> Jeff
>
>>


Re: Weird seqscan node plan

2019-11-26 Thread Игорь Выскорко



>
> Try increasing the following parameters to 14 (or even 16, if you are not 
> sure about number of tables involved):
>
> geqo_threshold = 14
>
> from_collapse_limit = 14
>
> join_collapse_limit = 14
>
> “about 12” is too close to default limit, 12.
>
> Regards,
>
> Igor Neyman

Hi Igor,

Is "set geqo = off;" not enough to fully disable geqo? 
I know what geqo is and I know about the limit at which it kicks in. Moreover, I 
actually tried to set these parameters to 100 and it didn't help.

> Hi Игорь
> 
> I suggest running Explain and Analyze to see what the actual query results vs 
> the planner are , 
> 
> Post the SQL code   

Hi Justin,
let me show 2 variants of "explain analyze" which differ only in the actual rows 
returned by the inner node (9th row):

1st case: https://explain.depesz.com/s/lA4f
45358 rows are actually returned, and postgres decided to join each row of the 
45358-row set with each row in yankee_foxtrot using a seq scan:
  Seq Scan on yankee_foxtrot foxtrot_bravo (cost=0.000..267.670 rows=7,467 
width=13) (actual time=0.003..1.090 rows=7,467 loops=45,358)
and then filter it:
  Rows Removed by Join Filter: 338685224
It was an awful choice =)

2nd case: https://explain.depesz.com/s/zkKY
4130 rows returned, and now an index only scan is in action.


Why the planner is mistaken in estimating the number of rows at this step (every 
time it expects only 1 row) I can understand: the inner nodes do some joins 
(inner and outer, with filtration) and it's hard to predict the result.
But what I can't understand is why it picks a seq scan when it is always slower 
than the index. I forgot to mention that the join condition is on a unique key. 
So, when the planner expects only 1 row, it must join only one row from the 
second table!

> 
> Also pull this part of the query out and run it by itself to see if the 
> "Planner" changes  how  it joins these two tables.  
> 
> May need to increase the statistics collected 
> https://www.postgresql.org/docs/12/planner-stats.html
> 
> also read this 
> https://www.postgresql.org/docs/12/explicit-joins.html

In this step of planning, statistics can't help (let me know if I'm wrong). 
Using stats, the planner knows everything about the second table (yankee_foxtrot), 
but it can only guess approximate stats for the first set (and it is actually 
wrong about them).
And yes, I know about the possibility of explicit joining.
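
For reference, one way to experiment with explicit join order is to pin the textual order of the joins (a sketch; the query itself is illustrative, only the table name comes from the plans above):

```sql
-- With join_collapse_limit = 1 the planner keeps JOINs in the order
-- they are written, which makes it easier to isolate why a particular
-- join method (seq scan vs. index scan) is being chosen.
SET join_collapse_limit = 1;

-- EXPLAIN ANALYZE
-- SELECT ...
-- FROM (...) inner_set i
-- JOIN yankee_foxtrot yf ON yf.some_unique_key = i.some_key;
```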

> forgot to state 
> 
> Generally, it's index scan -> bitmap index scan -> sequential scan, the more 
> rows  expected to be retrieved to number of rows in the table the more likely 
> the planner will go to sequential scan 

The key word here is "expected", I assume, and according to the planner's 
expectation it must be only 1 row...


So, to conclude: I know how to make my query faster (how to exclude this "seq 
scan"), but what I really want to understand is: WHY is the seq scan in my plan? 
Why does the planner think it's the best choice?