Re: rw_redis_fdw: SQL Errors when statement is within a function

2018-10-26 Thread GPT
I have searched in
https://github.com/nahanni/rw_redis_fdw/blob/master/redis_fdw.c for
PREPARE and EXECUTE keywords. There are not any of them, except in
comments.

So the developer doesn't use any PREPARE or EXECUTE statements.

So this change occurs internally. If I am correct, then PostgreSQL fails
to respond properly.

Tia

On 10/25/18, Christoph Moench-Tegeder  wrote:
> ## GPT (gptmailingli...@gmail.com):
>
>> So, this kind of switch after a few goes is a normal behavior or
>> something unexpected which will change in future?
>
> It's expected, and even documented (when you look at the user-level
> interface):
> https://www.postgresql.org/docs/current/static/sql-prepare.html
>   Prepared statements can use generic plans rather than re-planning
>   with each set of supplied EXECUTE values. This occurs immediately
>   for prepared statements with no parameters; otherwise it occurs
>   only after five or more executions produce plans whose estimated
>   cost average (including planning overhead) is more expensive than
>   the generic plan cost estimate.
>
> Regards,
> Christoph
>
> --
> Spare Space.
>
>



Re: Postgresql issue: FATAL: dsa_allocate could not find 7 free pages

2018-10-26 Thread Alexandre Assouad
Thanks for your response !
I’ll try next week to build a test environment and I’ll send you the results.
Does it make any difference to set up a VM vs a dedicated machine?
Thanks for your help !

> On 26 Oct 2018 at 00:58, Thomas Munro  wrote:
> 
> On Fri, Oct 26, 2018 at 2:21 AM Alexandre Assouad
>  wrote:
>> FATAL:  dsa_allocate could not find 7 free pages
> 
>> Some users have faced this issue which seems to be a bug in postgresql query 
>> planner which should be solved :
>> https://www.postgresql.org/message-id/CAEepm%3D1k7sYJbxoOSJcS-4ti2MHOnBXBfLf%3D-gtuFLTXPqvTDg%40mail.gmail.com
> 
> Hello Alexandre,
> 
> Thanks for the report.  Yes, that bug does seem like a plausible
> explanation for that error.  The fix was in commit 8ffc3be1, which
> will be included in 10.6 (target release date:  November 8th).  It's
> also in 11.0, out now.
> 
> If you are able to reproduce the problem easily on a copy of your
> database, and you have the time/inclination to investigate, is there
> any chance you could test the query on a local build of REL_10_STABLE
> (the branch that will produce 10.6 soon), instructions below, or the
> released v11.0 (if Timescale is available for that, it doesn't look
> like it)?  If not, no worries.
> 
>> But I’m still facing this issue.
>> I’m using postgresql 10.5 on ubuntu 18.04
>> With timescaledb extension (which could be involved in this bug but I 
>> couldn’t find any related issue on their side)
> 
> It's interesting that reports came from users of Citus and Timescale.
> There doesn't seem to be any reason to think it's caused by anything
> these extension are doing, other than just having a lot of data, big
> queries and the right access pattern to hit that bug.
> 
> === How to set up a throw-away REL_10_STABLE cluster:
> 
> On an Ubuntu developer machine, check out, build, install into
> temporary directory:
> sudo apt-get install git make gcc flex bison libz-dev libreadline-dev
> git clone https://github.com/postgres/postgres.git
> cd postgres
> git checkout REL_10_STABLE
> ./configure --prefix=$HOME/tmp_install
> make -s -j8 && make -s install
> 
> Initialise and start a database cluster:
> ~/tmp_install/bin/initdb -D ~/tmp_pgdata
> ~/tmp_install/bin/postgres -D ~/tmp_pgdata
> ... now postgres is running in the foreground, until you hit ^C
> ... do whatever you need to do to install Timescale extension, schema,
> data, reproduce problem
> 
> To check that you can reproduce the problem in 10.5 with a server
> built that way, stop that and:
> git checkout REL_10_5
> make -s clean && make -s -j8 && make -s install
> ~/tmp_install/bin/postgres -D ~/tmp_pgdata
> 
> To install Timescale it's probably the instructions from
> https://github.com/timescale/timescaledb, using ./bootstrap
> -DPG_CONFIG=~/tmp_install/bin/pg_config but I haven't tried that
> myself.
> 
> (You don't have to run initdb again or reload data when switching
> between tags/branches in the 10.x series).
> 
> -- 
> Thomas Munro
> http://www.enterprisedb.com




Re: Should pg 11 use a lot more memory building an spgist index?

2018-10-26 Thread Tom Lane
Bruno Wolff III  writes:
> I have something that seems to produce it on rhel7. Fedora isn't working 
> well either, but the difference may be due to postgresql.conf being 
> different or some difference in the Fedora build.

Hmm, in my hands this produces the same size leak (~28GB) in either v10
or v11.  In HEAD, somebody's made it even worse (~43GB).  So this is
certainly pretty broken, but I'm not sure why it seems worse to you in
v11 than before.

The core of the problem I see is that check_exclusion_or_unique_constraint
does index_beginscan/index_rescan/index_endscan in a long-lived context,
while spgendscan seems to have employed dice while deciding which of
spgbeginscan's allocations it would bother to pfree.  This is ancient,
though the specific case you have here can only be tested back to v10
because the inet SPGIST opclass wasn't there before.
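
For concreteness, a minimal sketch (not taken from the report) of the kind of
schema that reaches this code path, assuming the v10 inet SP-GiST opclass and
the && overlap operator:

CREATE TABLE networks (
    net inet,
    -- the exclusion constraint is backed by an SP-GiST index; every inserted
    -- row goes through check_exclusion_or_unique_constraint, i.e. one
    -- index_beginscan/index_rescan/index_endscan cycle per row
    EXCLUDE USING spgist (net WITH &&)
);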

A quick review of the other index AM endscan methods seems to indicate
that they all try to clean up their mess.  So maybe we should just make
spgendscan do likewise.  Alternatively, we could decide that requiring
endscan methods to free storage is not very safe, in which case it would
be incumbent on check_exclusion_or_unique_constraint to make a temporary
context to run the scan in.  But if we're going to do the latter, I think
we oughta go full in and remove the retail pfree's from all the *endscan
functions.  We'd also have to review other callers of
index_beginscan/index_endscan to see which ones might also need their own
temp contexts.  So that would surely end up being more invasive than
just adding some pfree's to spgendscan would be.  Maybe in the long run
it'd be worth it, but probably not in the short run, or for back-patching.

Thoughts?

regards, tom lane



Re: Postgresql issue: FATAL: dsa_allocate could not find 7 free pages

2018-10-26 Thread Thomas Munro
On Fri, Oct 26, 2018 at 9:17 PM Alexandre Assouad
 wrote:
> I’ll try next week to build a test environment and I’ll send you the results.
> Does it make any difference to set up a VM vs a dedicated machine?

Thanks.  Probably not, the important thing is probably the same data,
settings (work_mem etc) and query.  Maybe start with a build of
REL_10_5 and try to reproduce the problem, and if you can get that to
happen reliably, then switch to a build of REL_10_STABLE to confirm
that the problem goes away?

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Should pg 11 use a lot more memory building an spgist index?

2018-10-26 Thread Amit Langote
On 2018/10/26 18:16, Tom Lane wrote:
> The core of the problem I see is that check_exclusion_or_unique_constraint
> does index_beginscan/index_rescan/index_endscan in a long-lived context,
> while spgendscan seems to have employed dice while deciding which of
> spgbeginscan's allocations it would bother to pfree.  This is ancient,
> though the specific case you have here can only be tested back to v10
> because the inet SPGIST opclass wasn't there before.
> 
> A quick review of the other index AM endscan methods seems to indicate
> that they all try to clean up their mess.  So maybe we should just make
> spgendscan do likewise.  Alternatively, we could decide that requiring
> endscan methods to free storage is not very safe, in which case it would
> be incumbent on check_exclusion_or_unique_constraint to make a temporary
> context to run the scan in.  But if we're going to do the latter, I think
> we oughta go full in and remove the retail pfree's from all the *endscan
> functions.  We'd also have to review other callers of
> index_beginscan/index_endscan to see which ones might also need their own
> temp contexts.  So that would surely end up being more invasive than
> just adding some pfree's to spgendscan would be.  Maybe in the long run
> it'd be worth it, but probably not in the short run, or for back-patching.

FWIW, I would prefer the latter.  Not that people write new AMs on a
regular basis because we gave them an easier interface via CREATE ACCESS
METHOD, but it still seems better for the core code to deal with memory
allocation/freeing to avoid running into issues like this.

Thanks,
Amit




Re: Should pg 11 use a lot more memory building an spgist index?

2018-10-26 Thread Tom Lane
Amit Langote  writes:
> On 2018/10/26 18:16, Tom Lane wrote:
>> A quick review of the other index AM endscan methods seems to indicate
>> that they all try to clean up their mess.  So maybe we should just make
>> spgendscan do likewise.  Alternatively, we could decide that requiring
>> endscan methods to free storage is not very safe, in which case it would
>> be incumbent on check_exclusion_or_unique_constraint to make a temporary
>> context to run the scan in.  But if we're going to do the latter, I think
>> we oughta go full in and remove the retail pfree's from all the *endscan
>> functions.  We'd also have to review other callers of
>> index_beginscan/index_endscan to see which ones might also need their own
>> temp contexts.  So that would surely end up being more invasive than
>> just adding some pfree's to spgendscan would be.  Maybe in the long run
>> it'd be worth it, but probably not in the short run, or for back-patching.

> FWIW, I would prefer the latter.  Not that people write new AMs on a
> regular basis because we gave them an easier interface via CREATE ACCESS
> METHOD, but it still seems better for the core code to deal with memory
> allocation/freeing to avoid running into issues like this.

After a quick look around, I think that making systable_begin/endscan
do this is a nonstarter; there are just too many call sites that would
be affected.  Now, you could imagine specifying that indexes on system
catalogs (in practice, only btree) have to clean up at endscan time
but other index types don't, so that only operations that might be
scanning user indexes need to have suitable wrapping contexts.  Not sure
there's a lot of benefit to that, though.

regards, tom lane



Re: Should pg 11 use a lot more memory building an spgist index?

2018-10-26 Thread Amit Langote
On 2018/10/26 18:59, Tom Lane wrote:
> Amit Langote  writes:
>> On 2018/10/26 18:16, Tom Lane wrote:
>>> A quick review of the other index AM endscan methods seems to indicate
>>> that they all try to clean up their mess.  So maybe we should just make
>>> spgendscan do likewise.  Alternatively, we could decide that requiring
>>> endscan methods to free storage is not very safe, in which case it would
>>> be incumbent on check_exclusion_or_unique_constraint to make a temporary
>>> context to run the scan in.  But if we're going to do the latter, I think
>>> we oughta go full in and remove the retail pfree's from all the *endscan
>>> functions.  We'd also have to review other callers of
>>> index_beginscan/index_endscan to see which ones might also need their own
>>> temp contexts.  So that would surely end up being more invasive than
>>> just adding some pfree's to spgendscan would be.  Maybe in the long run
>>> it'd be worth it, but probably not in the short run, or for back-patching.
> 
>> FWIW, I would prefer the latter.  Not that people write new AMs on a
>> regular basis because we gave them an easier interface via CREATE ACCESS
>> METHOD, but it still seems better for the core code to deal with memory
>> allocation/freeing to avoid running into issues like this.
> 
> After a quick look around, I think that making systable_begin/endscan
> do this is a nonstarter; there are just too many call sites that would
> be affected.  Now, you could imagine specifying that indexes on system
> catalogs (in practice, only btree) have to clean up at endscan time
> but other index types don't, so that only operations that might be
> scanning user indexes need to have suitable wrapping contexts.  Not sure
> there's a lot of benefit to that, though.

By "core code", I didn't mean systable_being/endscan, but rather
check_exclusion_or_unique_constraint() or its core-side caller(s).  But
maybe I misunderstood something about your diagnosis upthread where you said:

"The core of the problem I see is that check_exclusion_or_unique_constraint
does index_beginscan/index_rescan/index_endscan in a long-lived context,"

Thanks,
Amit




Re: Should pg 11 use a lot more memory building an spgist index?

2018-10-26 Thread Tom Lane
Amit Langote  writes:
> On 2018/10/26 18:59, Tom Lane wrote:
>> After a quick look around, I think that making systable_begin/endscan
>> do this is a nonstarter; there are just too many call sites that would
>> be affected.  Now, you could imagine specifying that indexes on system
>> catalogs (in practice, only btree) have to clean up at endscan time
>> but other index types don't, so that only operations that might be
>> scanning user indexes need to have suitable wrapping contexts.  Not sure
>> there's a lot of benefit to that, though.

> By "core code", I didn't mean systable_being/endscan, but rather
> check_exclusion_or_unique_constraint() or its core-side caller(s).

Well, we'd need to consider any call path leading to index_endscan.
One of those is systable_endscan and its multitude of callers.  It seems
unlikely that you could just switch context in systable_beginscan
without breaking many of the callers.

If we forbade leaks in system catalog index AMs, then the number of
places that would need work would be manageable (about 3, it looked
like).  But then it seems more like a wart than a real API improvement.

regards, tom lane



Re: Strange behavior of the random() function

2018-10-26 Thread Олег Самойлов


> On 26 Sep 2018 at 6:35, Tom Lane  wrote:
> 
> r.zhar...@postgrespro.ru writes:
>> Can anybody explain me that strange behavior?
> 
> It's a squishiness in the SQL language, if you ask me.  Consider this
> simplified query:
> 
> select random() from generate_series(1, 3) order by random();
> 
> Would you expect the output of this query to appear ordered, or not?
> There's an argument to be made that the two instances of random() ought
> to be executed independently for each row, so that the output would
> (probably) seem unordered.  But practical usage generally expects that
> we unify the two textually-identical expressions, so that the behavior
> is the same as
> 
> select random() from generate_series(1, 3) order by 1;
> 
>>  select random() as "rnd", random() as "rnd1", random() as "rnd2"
>>  from generate_series( 1, 3 )
>>  order by random();
>> All values in any row are the same.
> 
> Here, we've unified *all* the appearances of the ORDER BY expression in
> the SELECT-list.  Maybe that's wrong, but it's hard to make a principled
> argument why we shouldn't do it.

Because random() is a volatile function, and only stable expressions can be
«unified» in such a way. That a «volatile» function behaves like a «stable»
function is obviously a bug.


Re: Should pg 11 use a lot more memory building an spgist index?

2018-10-26 Thread Bruno Wolff III

On Fri, Oct 26, 2018 at 10:16:09 +0100,
 Tom Lane  wrote:

> Bruno Wolff III  writes:
>> I have something that seems to produce it on rhel7. Fedora isn't working
>> well either, but the difference may be due to postgresql.conf being
>> different or some difference in the Fedora build.
>
> Hmm, in my hands this produces the same size leak (~28GB) in either v10
> or v11.  In HEAD, somebody's made it even worse (~43GB).  So this is
> certainly pretty broken, but I'm not sure why it seems worse to you in
> v11 than before.


As a short term work around, could I create the index first and use 
insert statements, each in their own transaction, to get the table loaded 
with the index?


Is the issue on Fedora taking very long to build a normal spgist index for 
network addresses worth pursuing separately, or is it likely to be the same 
underlying cause? I don't really need to get this working there, as that was 
just to help with testing.


I could also try adjusting memory limits temporarily. If the leak is 28GB 
on a 32GB system I might be able to get the index built if less memory 
is tied up in other things. My workstation also has 32GB and the exclude 
index did build there with the postgresql.conf having lower memory limits.




ERROR: type "myrowtype" does not exist

2018-10-26 Thread Arup Rakshit
I was reading about the json_populate_record function in the official doc
https://www.postgresql.org/docs/10/static/functions-json.html#FUNCTIONS-JSON-PROCESSING-TABLE.
The examples mentioned in the doc are giving me errors:

psql (11.0, server 10.5)
Type "help" for help.

aruprakshit=# select * from json_populate_record(null::extype, '{"x":"foo", 
"y":43, "q": true}');
ERROR:  type "extype" does not exist
LINE 1: select * from json_populate_record(null::extype, '{"x":"foo"...
 ^
aruprakshit=# select * from json_populate_record(null::myrowtype, '{"a": 1, 
"b": ["2", "a b"], "c": {"d": 4, "e": "a b c"}}');
ERROR:  type "myrowtype" does not exist
LINE 1: select * from json_populate_record(null::myrowtype, '{"a": 1…

What do null::extype and null::myrowtype mean here?


Thanks,

Arup Rakshit
a...@zeit.io





Shell Command within function

2018-10-26 Thread Mike Martin
Is this possible?
I have a script which imports csvlogs into a table, and it would be useful
to truncate the log files after import

thanks

Mike


Re: Shell Command within function

2018-10-26 Thread Andreas Kretschmer
On 26 October 2018 13:22:19 WEST, Mike Martin  wrote:
>Is this possible?
>I have a script which imports csvlogs into a table, and it would be
>useful
>to truncate the log files after import
>

You can use an untrusted language (pl/perlu, pl/sh, ...).
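
For illustration, a minimal sketch of such a function in pl/perlu (assuming the
extension is installed and the server process can write to the file; the
function name and path handling are made up):

CREATE EXTENSION IF NOT EXISTS plperlu;

CREATE OR REPLACE FUNCTION truncate_file(path text) RETURNS void
LANGUAGE plperlu AS $$
    my ($path) = @_;
    # opening the file for write and closing it truncates it to zero length
    open(my $fh, '>', $path) or elog(ERROR, "could not open $path: $!");
    close($fh);
$$;

But see the caveat in the next message about doing this before commit.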



Regards, Andreas


-- 
2ndQuadrant - The PostgreSQL Support Company



Re: Shell Command within function

2018-10-26 Thread Andrew Gierth
> "Mike" == Mike Martin  writes:

 Mike> Is this possible?

In an untrusted pl language, yes, but it's a bad idea because:

 Mike> I have a script which imports csvlogs into a table, and it would
 Mike> be useful to truncate the log files after import

If you had an error after truncating the file but before committing the
transaction, then the imported data would be gone, but the file would be
gone too, so you'd have lost it.

Things like this need to be handled _after_ commit, not before, so a
function is the wrong place for it.

-- 
Andrew (irc:RhodiumToad)



Re: Should pg 11 use a lot more memory building an spgist index?

2018-10-26 Thread Tom Lane
Bruno Wolff III  writes:
>   Tom Lane  wrote:
>> Hmm, in my hands this produces the same size leak (~28GB) in either v10
>> or v11.  In HEAD, somebody's made it even worse (~43GB).  So this is
>> certainly pretty broken, but I'm not sure why it seems worse to you in
>> v11 than before.

> As a short term work around, could I create the index first and use 
> insert statements, each in their own transaction, to get the table loaded 
> with the index?

Yes; it might also be that you don't even need to break it up into
separate statements.

> Is the issue on Fedora taking very long to build a normal spgist index for 
> network addresses worth pursuing separately, or is it likely to be the same 
> underlying cause?

This issue only applies if it was an exclusion constraint.  If you saw
slowness or bloat with a plain index, that would be worth investigating
separately.

regards, tom lane



Different memory allocation strategy in Postgres 11?

2018-10-26 Thread Thomas Kellerer
I have a Postgres instance running on my Windows laptop for testing purposes. 

I typically configure "shared_buffers = 4096MB" on my 16GB system as sometimes 
when testing, it pays off to have a bigger cache. 

With Postgres 10 and earlier, the Postgres process(es) would only allocate that 
memory from the operating system when needed. 
So right after startup, it would only consume several hundred MB, not the 
entire 4GB.

However with Postgres 11 I noticed that it immediately grabs the complete 
memory configured for shared_buffers during startup.

It's not really a big deal, but I wonder if that is an intentional change or a 
result from something else? 

Regards
Thomas




Re: ERROR: type "myrowtype" does not exist

2018-10-26 Thread Adrian Klaver

On 10/26/18 4:54 AM, Arup Rakshit wrote:
> I was reading the json_populate_record function from the official doc
> https://www.postgresql.org/docs/10/static/functions-json.html#FUNCTIONS-JSON-PROCESSING-TABLE.
> The doc mentioned examples are giving me errors
>
> psql (11.0, server 10.5)
> Type "help" for help.
>
> aruprakshit=# select * from json_populate_record(null::extype,
> '{"x":"foo", "y":43, "q": true}');
> ERROR:  type "extype" does not exist
> LINE 1: select * from json_populate_record(null::extype, '{"x":"foo"...
>
> aruprakshit=# select * from json_populate_record(null::myrowtype, '{"a":
> 1, "b": ["2", "a b"], "c": {"d": 4, "e": "a b c"}}');
> ERROR:  type "myrowtype" does not exist
> LINE 1: select * from json_populate_record(null::myrowtype, '{"a": 1…
>
> What does null::extype, null::myrowtype mean here?


See this thread:

https://www.postgresql.org/message-id/2a8ae0b7-1fd4-869a-a25e-d598968898c9%40aklaver.com

In meantime:

CREATE TABLE public.test_table (
id integer,
username character varying,
fullname character varying
);


test=> \d test_table
            Table "public.test_table"
  Column  |       Type        | Collation | Nullable | Default
----------+-------------------+-----------+----------+---------
 id       | integer           |           |          |
 username | character varying |           |          |
 fullname | character varying |           |          |


test=> select * from json_populate_record(null::test_table, '{"id": 1,
"username": "adrian", "fullname": "adrian klaver"}');

 id | username |   fullname
----+----------+---------------
  1 | adrian   | adrian klaver
(1 row)


So myrowtype is just a placeholder in the docs for a rowtype that you 
need to supply.










--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Should pg 11 use a lot more memory building an spgist index?

2018-10-26 Thread Alvaro Herrera
On 2018-Oct-26, Tom Lane wrote:

> After a quick look around, I think that making systable_begin/endscan
> do this is a nonstarter; there are just too many call sites that would
> be affected.  Now, you could imagine specifying that indexes on system
> catalogs (in practice, only btree) have to clean up at endscan time
> but other index types don't, so that only operations that might be
> scanning user indexes need to have suitable wrapping contexts.  Not sure
> there's a lot of benefit to that, though.

How about modifying SysScanDescData to have a memory context member,
which is created by systable_beginscan and destroyed by endscan?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Should pg 11 use a lot more memory building an spgist index?

2018-10-26 Thread Tom Lane
Alvaro Herrera  writes:
> On 2018-Oct-26, Tom Lane wrote:
>> After a quick look around, I think that making systable_begin/endscan
>> do this is a nonstarter; there are just too many call sites that would
>> be affected.  Now, you could imagine specifying that indexes on system
>> catalogs (in practice, only btree) have to clean up at endscan time
>> but other index types don't, so that only operations that might be
>> scanning user indexes need to have suitable wrapping contexts.  Not sure
>> there's a lot of benefit to that, though.

> How about modifying SysScanDescData to have a memory context member,
> which is created by systable_beginscan and destroyed by endscan?

I think it would still have problems, in that it would affect in which
context index_getnext delivers its output.  We could reasonably make
these sorts of changes in places where the entire index_beginscan/
index_getnext/index_endscan call structure is in one place, but that
doesn't apply for the systable functions.

Also, going in that direction would imply an additional memory context
switch / switch-back per tuple processed (around the index_getnext call),
which would create questions about whether it has a negative performance
impact.

regards, tom lane



Re: Should pg 11 use a lot more memory building an spgist index?

2018-10-26 Thread Bruno Wolff III

On Fri, Oct 26, 2018 at 13:44:07 +0100,
 Tom Lane  wrote:

> Bruno Wolff III  writes:
>> As a short term work around, could I create the index first and use
>> insert statements, each in their own transaction, to get the table loaded
>> with the index?
>
> Yes; it might also be that you don't even need to break it up into
> separate statements.

It was time to refresh the geolite data anyway so I tried this. I needed
to turn memory_overcommit back on (0) to avoid an error, but the load went OK
without the oom killer doing anything. So things are fully working again.

Thanks for your help.

>> Is the issue on Fedora taking very long to build a normal spgist index for
>> network addresses worth pursuing separately, or is it likely to be the same
>> underlying cause?
>
> This issue only applies if it was an exclusion constraint.  If you saw
> slowness or bloat with a plain index, that would be worth investigating
> separately.


I'll start a separate thread if I get something to reasonably ask about.
The current dataset is probably a lot larger than needed to demonstrate the
issue. The difference might be due to configuration or how Fedora built it.
And I'll want to compare back to version 10. In the end I'll probably ask
why it is slower in one case as opposed to the other and it might not even
be a real bug.




Re: Different memory allocation strategy in Postgres 11?

2018-10-26 Thread Jeff Janes
On Fri, Oct 26, 2018 at 9:12 AM Thomas Kellerer  wrote:

> I have a Postgres instance running on my Windows laptop for testing
> purposes.
>
> I typically configure "shared_buffers = 4096MB" on my 16GB system as
> sometimes when testing, it pays off to have a bigger cache.
>
> With Postgres 10 and earlier, the Postgres process(es) would only allocate
> that memory from the operating system when needed.
> So right after startup, it would only consume several hundred MB, not the
> entire 4GB
>
> However with Postgres 11 I noticed that it immediately grabs the complete
> memory configured for shared_buffers during startup.
>
> It's not really a big deal, but I wonder if that is an intentional change
> or a result from something else?
>

Do you have pg_prewarm in shared_preload_libraries?

Cheers,

Jeff


Re: GIN Index for low cardinality

2018-10-26 Thread Jeff Janes
On Wed, Oct 17, 2018 at 6:47 AM Ravi Krishna  wrote:

> In
> https://www.cybertec-postgresql.com/en/ideas-for-scaling-postgresql-to-multi-terabyte-and-beyond/
>
> it is mentioned:
>
> "GIN, the most know non-default index type perhaps, has been actually
> around for ages (full-text search) and in short is perfect for indexing
> columns where there are lot of repeating values – think all kinds of
> statuses or good old Mr/Mrs/Miss. GIN only stores every unique column value
> only once as for the default B-tree you’ll have e.g. 1 millon leaf nodes
> with the integer “1” in it."
>
>
> Does it mean that GIN is a very good choice for low cardinality columns.
>

For extremely low cardinality, like Mr. Mrs. Miss., I doubt any index would
be very useful.  For less extreme cases, like say one million different
values present around 50 times each, yes, it can be useful to keep the
index size down.  It will not support needing to deliver rows in sorted
order, for example to fulfill an ORDER BY or a merge join.  Think carefully
about what setting you want for fast_update, and, if set to on, then what
value to use for gin_pending_list_limit.
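
For reference, a sketch of what such an index looks like (hypothetical table
and column; note that GIN on a plain scalar column needs the btree_gin
extension):

CREATE EXTENSION IF NOT EXISTS btree_gin;

-- fastupdate = off avoids the pending list entirely; alternatively leave it
-- on and size gin_pending_list_limit (in kB) to taste
CREATE INDEX orders_status_gin ON orders USING gin (status)
    WITH (fastupdate = off);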

Cheers,

Jeff


Re: GIN Index for low cardinality

2018-10-26 Thread Jeff Janes
On Thu, Oct 25, 2018 at 9:36 AM Олег Самойлов  wrote:

>
> On 17 Oct 2018 at 13:46, Ravi Krishna  wrote:
>
> In
> https://www.cybertec-postgresql.com/en/ideas-for-scaling-postgresql-to-multi-terabyte-and-beyond/
>
> it is mentioned:
>
> "GIN, the most know non-default index type perhaps, has been actually
> around for ages (full-text search) and in short is perfect for indexing
> columns where there are lot of repeating values – think all kinds of
> statuses or good old Mr/Mrs/Miss. GIN only stores every unique column value
> only once as for the default B-tree you’ll have e.g. 1 millon leaf nodes
> with the integer “1” in it."
>
>
> Does it mean that GIN is a very good choice for low cardinality columns.
>
>
> Not necessarily. There are other indexes which also don't keep the column
> value in every leaf. Hash, for instance.
>

For smallish values (which low cardinality columns tend to be) the
per-tuple overhead and the pointer itself is probably much larger than the
value, so hash won't save you much if any space.  The GIN index removes not
just the value, but the per-tuple overhead.  And also compresses the point
list to further save space.

Here is a real-world example from one of my databases where each value is
about 17 characters long, and is present about 20 times:

gin: 411 MB
btree: 2167 MB
hash: 2159 MB

Cheers,

Jeff


Re: GIN Index for low cardinality

2018-10-26 Thread Ozz Nixon
Jeff,
   Great info! Regarding your Mr/Mrs/Miss example: is there a good rule of
thumb that if the data is under "x" KB an index is overhead rather than a
help? I am not worried about space, more interested in performance.


Re: Different memory allocation strategy in Postgres 11?

2018-10-26 Thread Thomas Kellerer

Jeff Janes wrote on 26.10.2018 at 17:42:
>> I typically configure "shared_buffers = 4096MB" on my 16GB system as
>> sometimes when testing, it pays off to have a bigger cache.
>>
>> With Postgres 10 and earlier, the Postgres process(es) would only allocate
>> that memory from the operating system when needed.
>> So right after startup, it would only consume several hundred MB, not the
>> entire 4GB
>>
>> However with Postgres 11 I noticed that it immediately grabs the complete
>> memory configured for shared_buffers during startup.
>>
>> It's not really a big deal, but I wonder if that is an intentional change
>> or a result from something else?
>
> Do you have pg_prewarm in shared_preload_libraries?

No. The only shared libraries are those for pg_stat_statements





RE: GIN Index for low cardinality

2018-10-26 Thread Igor Neyman


From: Ozz Nixon 
Sent: Friday, October 26, 2018 12:50 PM
To: jeff.ja...@gmail.com
Cc: spl...@ya.ru; srkrish...@aol.com; pgsql-general@lists.postgresql.org
Subject: Re: GIN Index for low cardinality

Jeff,
   Great info! Your example on Mr. Mrs. Miss, etc. is there a good rule of 
thumb that if the data is under "x"KB an index is overhead not help? I am not 
worried about space, more interested in performance.

I think partial/conditional indexes (… WHERE greeting = ‘…’) may help.
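
For example (a sketch with a hypothetical people table; a partial index only
contains rows matching its predicate, so it stays small and is picked up by
queries that repeat the same condition):

CREATE INDEX people_dr_idx ON people (last_name) WHERE greeting = 'Dr';

-- a query like this can use the partial index:
-- SELECT * FROM people WHERE greeting = 'Dr' AND last_name = 'Smith';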

Regards,
Igor Neyman


How to get partition info for a partition table?

2018-10-26 Thread Yuxia Qiu
HI Dear PostgreSQL,

 I am pretty new for this DB.
 I have created two partition tables as bellow:
CREATE TABLE measurement_year_month (
 logdate date not null,
 peaktemp int,
 unitsales int
) PARTITION BY RANGE (EXTRACT(YEAR FROM logdate), EXTRACT(MONTH FROM
 logdate));

create table partitionTest(
  firstname  varchar[20],
  lastName   varchar[20],
  id int
)
partition by range (id, firstname);

My question is: how do I retrieve information like below?

Table name               Partition type   Partition information
measurement_year_month   Range            (EXTRACT(YEAR FROM logdate), EXTRACT(MONTH FROM logdate))
partitionTest            Range            (id, firstname)

Thanks in advance.

Regards,
Yuxia


Re: Different memory allocation strategy in Postgres 11?

2018-10-26 Thread Thomas Munro
On Sat, Oct 27, 2018 at 6:10 AM Thomas Kellerer  wrote:
> Jeff Janes schrieb am 26.10.2018 um 17:42:
> > I typically configure "shared_buffers = 4096MB" on my 16GB system as 
> > sometimes when testing, it pays off to have a bigger cache.
> >
> > With Postgres 10 and earlier, the Postgres process(es) would only 
> > allocate that memory from the operating system when needed.
> > So right after startup, it would only consume several hundred MB, not 
> > the entire 4GB
> >
> > However with Postgres 11 I noticed that it immediately grabs the 
> > complete memory configured for shared_buffers during startup.
> >
> > It's not really a big deal, but I wonder if that is an intentional 
> > change or a result from something else?
> >
> >
> > Do you have pg_prewarm in shared_preload_libraries?
>
> No. The only shared libraries are those for pg_stat_statemens

Does your user have "Lock Pages in Memory" privilege?  One thing that
is new in 11 is huge AKA large page support, and the default is
huge_pages=try.  Not a Windows person myself but I believe that should
succeed if you have that privilege and enough contiguous chunks of
physical memory are available.  If you set huge_pages=off does it
revert to the old behaviour?
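
A sketch of how to test that (huge_pages only takes effect after a server
restart):

ALTER SYSTEM SET huge_pages = off;
-- then restart the service and watch the startup memory footprint again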

-- 
Thomas Munro
http://www.enterprisedb.com



Re: rw_redis_fdw: SQL Errors when statement is within a function

2018-10-26 Thread Christoph Moench-Tegeder
## GPT (gptmailingli...@gmail.com):

> I have searched in
> https://github.com/nahanni/rw_redis_fdw/blob/master/redis_fdw.c for
> PREPARE and EXECUTE keywords. There are not any of them, except in
> comments.

Of course not - the FDW does not execute SQL on the PostgreSQL side,
but sends commands to redis.
Still, the FDW operations are part of PostgreSQL's query plan; they
have to be - everything PostgreSQL does is organized in plans, and the
operations done on the Foreign Server (redis in this case) have to
have their interfaces in the plan.

And the important thing is: there is no guarantee that the same SQL
statement will always execute with the same plan: One reason would be
changing table statistics, another is when PostgreSQL switches to
the generic plan for a prepared statement. Your case looks like the
latter, especially the observation "After that (6th time)" in
https://github.com/nahanni/rw_redis_fdw/issues/13#issuecomment-428670890
hints to that.
So, where does that prepared statement come from? You don't really
describe your environment... It's unlikely that you're calling PREPARE
yourself - but some drivers are notorious for that (Perl DBI's
$dbh->prepare() or JDBC's PreparedStatement come to mind), and
even PL/pgSQL uses prepared statements internally:
https://www.postgresql.org/docs/11/static/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING

So: plans are not stable between query executions, and you may have
prepared statements without knowing that.
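
The switch is easy to observe with an explicit prepared statement (a sketch
against a hypothetical table t):

PREPARE q(int) AS SELECT * FROM t WHERE id = $1;
EXPLAIN EXECUTE q(1);   -- early executions use a custom plan, literal folded in
EXECUTE q(1); EXECUTE q(1); EXECUTE q(1); EXECUTE q(1);
EXPLAIN EXECUTE q(1);   -- from the sixth execution it may switch to the generic
                        -- plan, recognizable by "$1" in place of the constant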

Regards,
Christoph

-- 
Spare Space.



Re: GIN Index for low cardinality

2018-10-26 Thread Jeff Janes
On Fri, Oct 26, 2018 at 12:27 PM Jeff Janes  wrote:

> Here is a real-world example from one of my databases where each value is
> about 17 characters long, and is present about 20 times:
>
> gin: 411 MB
> btree: 2167 MB
> hash: 2159 MB
>

For what it is worth, that was 9.6 with freshly rebuilt indexes.

On 11.0 with freshly rebuilt indices, the hash index does get a bit smaller:

gin: 411 MB
btree: 2167 MB
hash: 1864 MB

Cheers,

Jeff


Question about partition table

2018-10-26 Thread Yuxia Qiu
HI Dear PostgreSQL team,

   I have created a partition table as below:
   CREATE TABLE measurement_year_month (
 logdate date not null,
 peaktemp int,
 unitsales int
) PARTITION BY RANGE (EXTRACT(YEAR FROM logdate), EXTRACT(MONTH FROM
 logdate));

so the content of the partexprs column for this table in
pg_partitioned_table will be:

({FUNCEXPR :funcid 1384 :funcresulttype 701 :funcretset false :funcvariadic
false :funcformat 0 :funccollid 0 :inputcollid 100 :args ({CONST :consttype
25 :consttypmod -1 :constcollid 100 :constlen -1 :constbyval false
:constisnull false :location 122 :constvalue 8 [ 32 0 0 0 121 101 97 114 ]}
{VAR :varno 1 :varattno 1 :vartype 1082 :vartypmod -1 :varcollid 0
:varlevelsup 0 :varnoold 1 :varoattno 1 :location 132}) :location 114}
{FUNCEXPR :funcid 1384 :funcresulttype 701 :funcretset false :funcvariadic
false :funcformat 0 :funccollid 0 :inputcollid 100 :args ({CONST :consttype
25 :consttypmod -1 :constcollid 100 :constlen -1 :constbyval false
:constisnull false :location 150 :constvalue 9 [ 36 0 0 0 109 111 110 116
104 ]} {VAR :varno 1 :varattno 1 :vartype 1082 :vartypmod -1 :varcollid 0
:varlevelsup 0 :varnoold 1 :varoattno 1 :location 162}) :location 142})

My question is: can I get a string value from this column, and how?
In the end I want to have the below result:

Table name               Partition information
measurement_year_month   (EXTRACT(YEAR FROM logdate), EXTRACT(MONTH FROM logdate))
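
A sketch of one way to get that (not from the thread), using the catalog
helper pg_get_partkeydef(), which renders the partition strategy and key
expressions as text (PostgreSQL 10 and later):

SELECT c.relname                 AS table_name,
       pg_get_partkeydef(c.oid)  AS partition_information
FROM pg_class c
WHERE c.relkind = 'p';   -- 'p' = partitioned table

For measurement_year_month this returns something like
RANGE (EXTRACT(year FROM logdate), EXTRACT(month FROM logdate)), so the
strategy and the expressions come back in one string.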


Your help is highly appreciated.

Thanks,
Yuxia