jsonb: unwrapping text

2021-10-27 Thread tomas
Hi,

I'm trying to extract text from a jsonb 'string'. Simply casting
gives me the string wrapped in quotes:

  foo=# select '"foo"'::jsonb::text;
   text  
  -------
   "foo"
  (1 row)

This, of course, makes kind of sense, since it /is/ the JSON's
textual representation.

What is the canonical way to get at the string's content itself?

I've found out that one can treat a string singleton as if it
were an array:

  foo=# select '"foo"'::jsonb ->> 0;
   ?column? 
  ----------
   foo
  (1 row)

which conveniently returns the right type. My question: can I rely
on that, or am I missing a much more obvious option?

(I try to stay compatible down to PostgreSQL 11, or even 9 when possible.
Conservative customers and all that).

Thanks for any insights, cheers
-- tomás




Re: jsonb: unwrapping text

2021-10-28 Thread tomas
On Wed, Oct 27, 2021 at 04:18:20PM -0700, David G. Johnston wrote:
> On Wed, Oct 27, 2021 at 11:58 AM  wrote:
> 
> >
> > I've found out that one can treat a string singleton as if it
> > were an array:
> >
> >   foo=# select '"foo"'::jsonb ->> 0;
> >?column?
> >   --
> >foo
> >   (1 row)
> >
> > which conveniently returns the right type. My question: can I rely
> > on that, or am I missing a much more obvious option?
> >
> >
> Not sure if this exact behavior is trustworthy - but you are on the right
> path. Place the value into either a json array or json object and then use
> the text versions of the accessor methods to get the json value to pass
> through the decoding routine.

Thanks a bunch :)

I know that, behind the scenes, jsonb scalars (didn't check that
for json) are actually represented as one-element arrays, but was unsure
how much this can be relied on as "official interface" :-)

This leaves us with

  foo=# select jsonb_build_array('"foo"'::jsonb)->>0;
   ?column? 
  ----------
   foo
  (1 row)

...which feels somewhat roundabout, but hey, it actually works.

What also seems to work is #>> with an empty path specifier, i.e.

  select '"foo"'::jsonb #>> '{}';

...but all of them feel somewhat hacky. I'll post a request with the
form linked in [1], let's see :-)

Thanks again for your assessment, cheers
 - t




PQexecParams, placeholders and variable lists of params

2021-11-23 Thread tomas
Hi,

PQexecParams expects a query string with "$1", "$2"... placeholders,
which refer to the corresponding entries in the param list. This keeps
SQL injection at bay.

Is there a way to express "variable length" lists? IOW, if I want to do
a query like

  "SELECT * FROM customers WHERE id IN ($1, $2) AND name like $3;"

is there a way to do that without knowing beforehand how many values go
into the IN list?

It would be very welcome for you to rub my nose against the place in The
Fine Manual where I could have found that :-)

Thanks & cheers
 - tomás




Re: Regex for Word space Word space Word ....

2021-11-23 Thread tomas
On Tue, Nov 23, 2021 at 09:58:00AM +, Shaozhong SHI wrote:
> Is there any regex for Word space Word space Word and more?

It isn't very clear what you want to achieve. From the other mails in
this thread I understand that your words start with an uppercase char
and continue with lowercase chars. Is that right?

You want exactly one space between words, or more than one?

What is this "...and more"? Arbitrary repetitions?

Which kind of regular expressions do you want to use? POSIX?

If all the answers to the above are "yes", you might try something like

  "(?:[[:upper:]][[:lower:]]*[[:space:]]+)*[[:upper:]][[:lower:]]*"

(Caveat: untested). This would match a single word or more than one
word, separated by one or more spaces, where a word starts with one
upper-case character and continues with zero or more lowercases.
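
For instance (again untested, and "places"/"placename" are just example
names), anchored so that the whole value has to match:

  SELECT *
    FROM places
   WHERE placename ~ '^(?:[[:upper:]][[:lower:]]*[[:space:]]+)*[[:upper:]][[:lower:]]*$';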

HTH
 - t




Re: PQexecParams, placeholders and variable lists of params

2021-11-23 Thread tomas
On Tue, Nov 23, 2021 at 06:39:27PM +0500, Дмитрий Иванов wrote:
> Hi
> A function cannot have an undefined signature, but can accept an array of
> arguments:

I see. So you propose passing an array as a single param to
PQexecParams, in PostgreSQL's syntax for arrays, e.g. "{42, 45, 50}".

Makes sense. The problem is that, again, the application would be
responsible for making sure the individual values don't contain nasty
stuff (for example, if they are strings) before consolidating them into
one PostgreSQL array literal.

I was hoping to get away "on the cheap" on this, letting PostgreSQL take
care of the injection avoidance ;-)

I'm converging on building the query dynamically, but still with
placeholders. I /know/ how many values are coming, and how many
placeholders have been used so far in the query, so it'd be fairly easy
to just insert "$m"..."$n" as needed.

Thanks a lot, Dmitri

Cheers
 - t




Re: PQexecParams, placeholders and variable lists of params

2021-11-23 Thread tomas
On Tue, Nov 23, 2021 at 10:43:03AM -0500, Tom Lane wrote:
> "David G. Johnston"  writes:
> > On Tue, Nov 23, 2021 at 7:21 AM  wrote:
> >> Makes sense. Problem is, that, again, the application would be
> >> responsible of making sure the individual values don't contain nasty
> >> stuff (for example, if they are strings) before consolidating them to
> >> one PostgreSQL array literal.
> 
> > So long as you actually pass the literal value via a parameter the worst
> > problem you can have is a syntax error in converting the literal into
> > whatever type is being cast to.
> 
> PG's array quoting rules are odd enough that I can sympathize with not
> wanting to deal with them.  (Although, if you only have to build an
> array and not parse one, taking the always-quote-even-if-not-necessary
> approach makes it easier.)
> 
> I don't see many other alternatives though.  *Somehow* you have to
> separate one value from the next.  If you don't want to pass 'em as
> distinct parameters, then you have to obey some kind of composite-value
> syntax.

Yes, that is my conclusion, too. Tentatively, I'll go with dynamically
building the query string, but with "$n" placeholders -- counting args
as I go -- and passing the args to PQexecParams.

This seems to afford injection protection in exchange for minimal fuss.

Thank you all for your input!

Cheers
 - t




Re: PQexecParams, placeholders and variable lists of params

2021-11-23 Thread tomas
On Tue, Nov 23, 2021 at 05:14:44PM +0100, Daniel Frey wrote:
> > On 23. Nov 2021, at 16:43, Tom Lane  wrote:
> > 
> > PG's array quoting rules are odd enough that I can sympathize with not
> > wanting to deal with them.  (Although, if you only have to build an
> > array and not parse one, taking the always-quote-even-if-not-necessary
> > approach makes it easier.)
> > 
> > I don't see many other alternatives though.  *Somehow* you have to
> > separate one value from the next.  If you don't want to pass 'em as
> > distinct parameters, then you have to obey some kind of composite-value
> > syntax.
> 
> Would it be possible to extend PQexecParams() et.al. like this:
> 
> You currently have paramValues[], paramLengths[], and paramFormats[] (plus 
> other parameters that I'll ignore here).
> 
> The format may currently be 0 or 1 (TEXT or BINARY). What if we allow 2 for 
> ARRAY? The corresponding  length then specifies how many parameters following 
> are part of the array. The value should point to a structure, that contains 
> pointers to the values, lengths, and formats of the elements. This also 
> allows nested arrays.

That sounds attractive; I think for my particular case it'd be
overengineering, though...

> If the client library knows that the server is too old to understand it, it 
> may temporarily assemble a string for those (correctly escaped) values and 
> replace the entries in the original values/lengths/formats arrays temporarily 
> before passing it to the old PQexecParams() implementation.
> 
> If the server is new enough the protocol itself can be extended to send the 
> array more efficiently instead of quoting and copying data around.
> 
> This would also hide the quoting rules for arrays nicely, as it doesn't 
> require additional methods for escaping. (Currently, escaping for arrays is 
> different from other escaping methods, it needs to be done manually and, 
> frankly, it's a PITA).

...but in the general case it sounds useful, yes :)

Cheers
 - t




Re: dirty_ratio & dirty_background_ratio settings with huge memory

2018-03-06 Thread Tomas Vondra


On 03/06/2018 01:16 PM, Laurenz Albe wrote:
> pinker wrote:
>> I've got cutomer with really huge RAM, now it's:
>>  total   used   free sharedbuffers cached
>> Mem:   31021113052596  49515 2088019922961185
>> -/+ buffers/cache:  904183011693
>> Swap: 8191  1   8190
>>
>> (free -m)
>>
>> and before it was twice more (6TB).
>>
>> and trying to figure out how to set dirty_ratio & dirty_background_ratio
>> parameters. Even for normal sized server those default sizes are to high,
>> but now would be ridiculously huge, respectively 1,24TB and 300GB. 
>> I'm thinking about 1 percent for dirty_ratio and not using
>> dirty_background_ratio because it's not possible to set it below 1% but to
>> set dirty_background_bytes instead to about 1-2GB.
>> Maybe someone has got other experience with RAM of this size and those
>> settings?
> 
> Yes, you should set vm.dirty_background_bytes and vm.dirty_bytes
> and not use the *_ratio settings.
> 
> 2 GB for vm.dirty_background_bytes and 1 GB for vm.dirty_bytes sounds fine.
> 

It should be the other way around: dirty_background_bytes < dirty_bytes

regards
-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Postgresql 10.3 , query never completes if LIMIT clause is specified and paralle processing is on

2018-03-21 Thread Tomas Vondra


On 03/21/2018 05:09 PM, Alessandro Aste wrote:
> Hi there, we are using postgresql 10.3 and we're facing an issue with a
> query. The query (full query below)  completes only  when: 
> 
> 1 - LIMIT 10 is removed
> or
> 2 -  show max_parallel_workers_per_gather  is set to 0, so parallel
> processing is disabled.
> 
> With  max_parallel_workers_per_gather   set to the default value (8) I'm
> not even able to get the query plan.
> 
> Notes: 
> 
>   * We're experiencing the issue in any server of ours but I've
> reproduced the issue in a fresh restored database with full
> vacuum/reindex of the tables.
>   * We didn't touch any parameter concering the parallel processing,
> we're running the defaults: 
> 
> 
> cmdstaging=# show max_parallel_workers_per_gather ;
>  max_parallel_workers_per_gather
> -
>  8
> (1 row)
> 
> cmdstaging=# show max_worker_processes ;
>  max_worker_processes
> --
>  8
> (1 row)
> 
> cmdstaging=# show max_parallel_workers;
>  max_parallel_workers
> --
>  8
> (1 row)
> 
> 
> 
> 
> The query completes only omitting the LIMIT clause or when I disable
> parallel processing: 
> 
>  id   | vendor_id | gaa |   pop_name    | pop_status | pop_location |
> pop_provider_id | pop_provider |     pop_street      | pop_city | pop
> _postal_code | pop_state | pop_country | pop_country_id
> +---+-+---++--+-+--+-+--+
> -+---+-+
>  684807 |     12346 |     | GTT/POP/LON1T | Active     | LON1T        | 
>          12288 | Telehouse UK | 14 Coriander Avenue | London   | E14
>  2AA         |           | GB          |            219
> (1 row)
> 
> Time: 4374.759 ms (00:04.375)
> cmdstaging=# show max_parallel_workers_per_gather ;
>  max_parallel_workers_per_gather
> -
>  0
> (1 row)
> 
> Time: 0.097 ms
> 
> 
> Otherwise it just keep running for forever. 
> 

When you say "running forever" is it actually using CPU, or does it get
stuck on something?

> 
> This is the full query: 
> 
> 
> SELECT * FROM (
> SELECT
> seg.circuit_id AS id,
> vendor_gtt_pop.vendor_id,
> CASE WHEN vendor_gtt_pop.push_to_gaa = 1 THEN 'Y' END as gaa,
> pop.gii_circuitid AS pop_name,
> cst.label AS pop_status,
> seg.a_company_name AS pop_location,
> seg.vendor_id AS pop_provider_id,
> seg.vendor_name AS pop_provider,
> cs.address1 AS pop_street,
> cs.city AS pop_city,
> cs.postal_code AS pop_postal_code,
> cs.state AS pop_state,
> csc.code AS pop_country,
> cs.country_id AS pop_country_id
> FROM (
> SELECT c.gii_circuitid, max(so.id) AS service_order_id
> FROM service_order so
> join circuit c on c.product_id=so.product_id
> join master_service_order mso on mso.id=so.master_service_order_id
> WHERE NOT (so.ordertype_id = 2 AND so.status <> 999) AND
> NOT (so.ordertype_id = 3 AND so.status <> 999) AND
> c.status >= 20 AND
> c.status not in (160,999) AND
> mso.client_id=11615 AND
> c.service_description=28 AND
> c.status!=160
> GROUP BY c.gii_circuitid
> ) pop
> JOIN service_order so ON so.id = pop.service_order_id
> left JOIN client_site cs on cs.id=so.a_site_id
> left JOIN country csc on csc.id=cs.country_id
> JOIN circuit c ON so.product_id=c.product_id
> JOIN circuit_status cst ON cst.id=c.status
> JOIN (
> SELECT c.id AS circuit_id, sg.id AS segment_id, c.pop_support_vendor_id AS vendor_id,
> v.name AS vendor_name, sg.a_company_name
> FROM segment sg
> JOIN circuit_layout cl ON cl.segment_id = sg.id AND cl.ordinal = 1
> JOIN circuit c ON c.id = cl.circuit_id
> JOIN vendor v ON v.id = c.pop_support_vendor_id
> ) seg ON seg.circuit_id = c.id
> JOIN vendor_gtt_pop on vendor_gtt_pop.gtt_pop_id = seg.circuit_id
> ) foo where vendor_id = 12346 AND (1=1) ORDER BY pop_name ASC,id LIMIT 10
> 
> 
> 
> Execution plan with  max_parallel_workers_per_gather =0 , 
> max_parallel_workers_per_gather =8 and no LIMIT clause : 
> 
> 
We really need to see the execution plan that causes issues, i.e.
max_parallel_workers_per_gather=8 with LIMIT clause. Plain explain
(without analyze), at least.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Postgresql 10.3 , query never completes if LIMIT clause is specified and paralle processing is on

2018-03-21 Thread Tomas Vondra

On 03/21/2018 08:44 PM, Alessandro Aste wrote:
> Thanks for your reply Tomas.  The query just got stuck for forever.  I
> observed no CPU spikes, it is currently running and I see 89 of the CPU
> idle. I'm using a relative powerfull machine (120 GB ram, 56 CPU).  
> 

That doesn't really answer the question, I'm afraid. I suppose "89 of
CPU" means that 89% idle in total, but 11% with 56 CPUs still means
about 6 cores 100% busy. But maybe you meant something else?

Is there something else running on the machine? If you look at "top", are
the processes (the one you're connected to and the parallel workers)
doing something on the CPU?

> 
> Plain analyze as requested. : 
> 

I don't see anything obviously broken with the query plan, and it's
difficult to compare with the other plans because they are quite different.

But there's one part of the plan interesting:

 Limit  (cost=253523.56..253523.57 rows=1 width=176)
   ->  Sort  (cost=253523.56..253523.57 rows=1 width=176)
 Sort Key: c_2.gii_circuitid, c_1.id
 ->  Nested Loop  (cost=33190.89..253523.55 rows=1 width=176)
   Join Filter: (c_1.id = c.id)
   ->  Nested Loop  (cost=31724.87..31736.29 rows=1 ...)
 ...
   ->  Gather  (cost=1466.02..221787.23 rows=3 width=75)
 Workers Planned: 5
 ->  Hash Join  (cost=466.02..220786.93 rows=1 ...)
 ...

That is, there's a Gather on the inner side of a Nested Loop. I wonder
if that might cause issues in case of under-estimate (in which case we'd
be restarting the Gather many times) ...


BTW one of the plans you sent earlier is incomplete, because it ends
like this:

->  Nested Loop  (cost=42469.41..42480.82 rows=1 width=85) (...)
Join Filter: (c.status = cst.id)
Time: 3016.688 ms (00:03.017)

That is, it's missing the part below the join.


That being said, I'm not sure what's the issue here. Can you prepare a
self-contained test case that we might use to reproduce the issue? For
example by dumping the relevant part of the tables?


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Postgresql 10.3 , query never completes if LIMIT clause is specified and paralle processing is on

2018-03-22 Thread Tomas Vondra


On 03/22/2018 11:33 AM, Alessandro Aste wrote:
> Tomas, thank you. This machine is a bare metal server running only a
> staging postgresql 10.3 instance. Nobody is using it beside me. 
> 
> I'm attaching 4 files.  
> 
> every_30_seconds_top_stats_during_query.txt    - this is a caputure of
> the top command every 30 seconds(more or less) for 10+ minutes while I'm
> running the query. Let me know if this helps to answere your question.
> EXPLAIN_WITH_LIMIT_AND_max_parallel_workers_per_gather_SET_TO_0.txt   - 
> query plan  with full query and  max_parallel_workers_per_gather  force
> to 0. Full output.
> EXPLAIN_WITH_LIMIT_AND_DEFAULT_PARALLEL_PROCESSING_SETTINGS.txt  - 
>  query plan with full query and  default parellel processing settings. 
> Full output.
> EXPLAIN_WITHOUT_LIMIT_AND_DEFAULT_PARALLEL_PROCESSING_SETTINGS.tx  -
> query plan of the query omitting the LIMIT clause and default parellel
> processing settings. Full output.
> 

OK. Looking at the top output, I see this:

   PID USER   VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
104880 postgres  30.8g 1.9g 1.9g R 92.0  1.5  15:15.60 postmaster
111732 postgres  30.8g 476m 473m R 88.2  0.4   0:00.47 postmaster
111730 postgres  30.8g 473m 470m R 86.4  0.4   0:00.46 postmaster
111731 postgres  30.8g 476m 473m R 86.4  0.4   0:00.46 postmaster
111733 postgres  30.8g 476m 473m R 86.4  0.4   0:00.46 postmaster
111734 postgres  30.8g 476m 473m R 86.4  0.4   0:00.46 postmaster
111728 root  15824 1912  828 R  3.8  0.0   0:00.04 top

That means it certainly is not stuck, it's simply doing a lot of work on
CPU. The question is why and what it's doing ...

Can you collect some CPU profiles using perf? There's a howto here:

https://wiki.postgresql.org/wiki/Profiling_with_perf

But in short - install perf, install debuginfo packages for postgres,
and then do

perf record -a -g -s sleep 60

while running the query. Once the profile data is collected, do

perf report > report.txt

and share the report.txt with us (well, if it's very large you may need
to only cut the first 1000 lines or so).

That should tell us in which functions most of the time is spent. That
will give us some hints, hopefully.

> 
> For what concerns the  self-contained test case - I'll do my best to
> prepare it.
> 

Yeah, that would be helpful.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Postgresql 10.3 , query never completes if LIMIT clause is specified and paralle processing is on

2018-03-22 Thread Tomas Vondra
On 03/22/2018 11:29 PM, Alessandro Aste wrote:
> Thanks Tomas. We're currently building postgres from source. In order to
> enable symbols, you want me to re-configure postres with  --enable-debug
> then run perf?
> 

Yes.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Postgresql 10.3 , query never completes if LIMIT clause is specified and paralle processing is on

2018-03-28 Thread Tomas Vondra
Hi,

the perf profiles look really weird - clearly, there's a lot of lock
contention, because the top item is this

 13.49%  13.40%  postmaster   postgres  [.]   LWLockAcquire
 |

 ---LWLockAcquire



That's a sign of lock contention - not sure which one, though. And then
there's heap_hot_search_buffer

12.65% 7.92%  postmaster   postgres [.] heap_hot_search_buffer
 |

 ---heap_hot_search_buffer


So either there's a table with many HOT-updated rows (repeatedly), or we
search the HOT chains very frequently for some reason. Considering it
only affects the non-LIMIT case, I'd guess the latter.

There also seem to be quite a few page faults, for some reason.

Trying to reproduce this without the actual data is rather futile. We
need some sort of reproducer (sample data to test the query on).

regards

On 03/28/2018 10:30 AM, Alessandro Aste wrote:
> Hello,  any news ?
> 
> Thank you,
> 
> Alessandro.
> 
> On Fri, Mar 23, 2018 at 8:22 PM, Alessandro Aste
> <alessandro.a...@gmail.com> wrote:
> 
> PS , in the meanwhile I discovered a 2nd workaround(beside disabling
> parallel processing) . I added offset  0 to the subquery , and,
> according to the documentation, “OFFSET 0 is the same as omitting
> the OFFSET clause”
> - https://www.postgresql.org/docs/current/static/queries-limit.html
> 
> cmd3dev=# show max_parallel_workers_per_gather ;
> 
> *max_parallel_workers_per_gather*
> 
> *-*
> 
> *8*
> 
> (1 row)
> 
>  
> 
> cmd3dev=# \timing
> 
> Timing is on.
> 
> cmd3dev=#  SELECT * FROM (SELECT seg.circuit_id AS id,
> vendor_gtt_pop.vendor_id, CASE WHEN vendor_gtt_pop.push_to_gaa = 1
> THEN 'Y' END as gaa, pop.gii_circuitid AS pop_name, cst.label AS
> pop_status, seg.a_company_name AS pop_location, seg.vendor_id AS
> pop_provider_id, seg.vendor_name AS pop_provider, cs.address1 AS
> pop_street, cs.city AS pop_city, cs.postal_code AS pop_postal_code,
> cs.state AS pop_state, csc.code AS pop_country, cs.country_id AS
> pop_country_id FROM ( SELECT c.gii_circuitid, max(so.id) AS
> service_order_id FROM service_order so join circuit c on
> c.product_id=so.product_id join master_service_order mso on
> mso.id=so.master_service_order_id WHERE NOT (so.ordertype_id = 2 AND
> so.status <> 999) AND NOT (so.ordertype_id = 3 AND so.status <> 999)
> AND c.status >= 20 AND c.status not in (160,999) AND
> mso.client_id=11615 AND c.service_description=28 AND c.status!=160
> GROUP BY c.gii_circuitid ) pop JOIN service_order so ON so.id =
> pop.service_order_id left JOIN client_site cs on cs.id=so.a_site_id
> left JOIN country csc on csc.id=cs.country_id JOIN circuit c ON
> so.product_id=c.product_id JOIN circuit_status cst ON cst.id=c.status
> JOIN ( SELECT c.id AS circuit_id, sg.id AS segment_id,
> c.pop_support_vendor_id AS vendor_id, v.name AS vendor_name,
> sg.a_company_name FROM segment sg JOIN circuit_layout cl ON
> cl.segment_id = sg.id AND cl.ordinal = 1 JOIN circuit c ON c.id =
> cl.circuit_id JOIN vendor v ON v.id = c.pop_support_vendor_id ) seg
> ON seg.circuit_id = c.id JOIN vendor_gtt_pop on
> vendor_gtt_pop.gtt_pop_id = seg.circuit_id offset 0) foo where
> vendor_id = 12346 AND (1=1) ORDER BY pop_name ASC,id desc LIMIT 10;
> 
>    id   | vendor_id | gaa |   pop_name    | pop_status |
> pop_location | pop_provider_id | pop_provider | pop_street 
> | pop_city | pop
> 
> _postal_code | pop_state | pop_country | pop_country_id
> 
> 
> +---+-+---++--+-+--+-+--+
> 
> -+---+-+
> 
> 684807 | 12346 | Y   | GTT/POP/LON1T | Active     | LON1T   
> |   12288 | Telehouse UK | 14 Coriander Avenue | London   | E14
> 
> 2AA |   | GB  |    219
> 
> (1 row)
> 
>  
> 
> *Time: 2245.073 ms (00:02.245)*
> 
> 
> 
> 
> On Fri, Mar 23, 2018 at 9:31 AM, Alessandro Aste
> mailto:alessandro.a...@gmail.com>> wrote:
>

Re: Seems like there is an issue with reltuples showing twice the number of rows

2018-04-03 Thread Tomas Vondra
On 04/03/2018 11:14 AM, Ranjith Ramachandra wrote:
> I am relying on reltuples on my web app to get fast row counts.
> 
> This was recommended by this article to get fast approx row
> counts: https://wiki.postgresql.org/wiki/Count_estimate
> 
> 
> However for some table I am getting twice as many values when I try to
> do this. I did some more research and came up with this query.
> 
> select reltuples, n_live_tup, n_dead_tup
>               from pg_stat_user_tables join pg_class using (relname)
>              where relname =
> 'main_csv_0f247511f5c247c8813ba3cec90c6ae1_yf34fbb38d';
> 
> 
> it returns
> 
> 
>  reltuples  | n_live_tup | n_dead_tup
> ------------+------------+------------
> 2.7209e+06 |    1360448 |    1360448
> 
> 
> If I run analyze main_csv_0f247511f5c247c8813ba3cec90c6ae1_yf34fbb38d
> 
> and I run the same query again,
> 
>   reltuples  | n_live_tup | n_dead_tup
> -------------+------------+------------
>  1.36045e+06 |    1360448 |    1360448
> 
> But after some time the value goes back to being double the value. This
> is causing me a lot of problems since this inaccuracy does not make any
> sense to me.
> 

There was a difference between VACUUM and ANALYZE in handling recently
dead rows (essentially deleted rows that can't be removed yet), causing
similar changes to reltuples. Essentially if you do VACUUM and ANALYZE,
it may set reltuples to rather different estimates. That is fixed now
and should be in the next minor release.

Hard to say if this is a case of that, but judging by the number of dead
tuples chances are it is.

It's probably better to use n_live_tup instead, though. I'd say that's
closer to the "live tuples" definition.
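
For example (using the table name from your query):

  SELECT n_live_tup
    FROM pg_stat_user_tables
   WHERE relname = 'main_csv_0f247511f5c247c8813ba3cec90c6ae1_yf34fbb38d';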

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: ERROR: found multixact from before relminmxid

2018-04-06 Thread Tomas Vondra


On 04/06/2018 02:09 AM, Alexandre Arruda wrote:
> Hi,
> 
> Some time ago, I had this errors frequently showed in logs after some
> autovacuum in some tables(pg 9.6). VACUUM FULL or CLUSTER in this tables
> show the same and not complete the tasks (showed by some table bloat
> select).
> Then, I did a full dump/restore into a new version (10.2) and everything
> is ok for a couple of months. Now, I have errors like this again:
> 
> db1=# cluster pc44t;
> 
> ERROR:  found multixact 134100944 from before relminmxid 192042633
> 
>  
> 
> Like before, the only way to make the errors to desapear is by
> dump/reload the whole table.
> 
> 
> Thanks for any help.
> 

That's going to be hard, unless you still have the cluster around.

This surely seems like some sort of data corruption issue, but without
being able to inspect the data pages it's nearly impossible to determine
what went wrong.

We'd also need more information about what happened to the hardware and
cluster before the issues started to appear - crashes, hardware issues.
And so on.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: ERROR: found multixact from before relminmxid

2018-04-06 Thread Tomas Vondra


On 04/06/2018 04:29 PM, Alexandre Arruda wrote:
> 2018-04-06 9:39 GMT-03:00 Tomas Vondra <tomas.von...@2ndquadrant.com>:
> 
> 
> 
> On 04/06/2018 02:09 AM, Alexandre Arruda wrote:
> > Hi,
> >
> > Some time ago, I had this errors frequently showed in logs after some
> > autovacuum in some tables(pg 9.6). VACUUM FULL or CLUSTER in this tables
> > show the same and not complete the tasks (showed by some table bloat
> > select).
> > Then, I did a full dump/restore into a new version (10.2) and everything
> > is ok for a couple of months. Now, I have errors like this again:
> >
> > db1=# cluster pc44t;
> >
> > ERROR:  found multixact 134100944 from before relminmxid 192042633
> >
> >  
> >
> > Like before, the only way to make the errors to desapear is by
> > dump/reload the whole table.
> >
> >
> > Thanks for any help.
> >
> 
> That's going to be hard, unless you still have the cluster around.
> 
> This surely seems like some sort of data corruption issue, but without
> being able to inspect the data pages it's nearly impossible to determine
> what went wrong.
> 
> We'd also need more information about what happened to the hardware and
> cluster before the issues started to appear - crashes, hardware issues.
> And so on.
> 
> regards
> 
> --
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> 
> 
> Hi Tomas,
> The old cluster are gone, unfortunatly.
> 
> This server is a 128GB ECC RAM with a dual redundant hba fiber channel
> connect to a sotorage with Raid 6 and I don't have (apparently) any
> errors reported.
> Yesterday I did a test with one table:  some sum aggragates, count(*),
> etc, then dump/reload and repeat the tests the results (of querys) are
> the same, regarding the vacuum problem
> thats disapeared.
> 

I'm not sure I understand correctly. So you can reproduce the issue? If
yes, how can you share the scripts/data you use (and how large is it)?
If we could reproduce it locally, it would make the investigation much
easier.

BTW you mentioned you're using PostgreSQL 9.6 - which minor version,
specifically?


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: ERROR: found multixact from before relminmxid

2018-04-09 Thread Tomas Vondra


On 04/09/2018 01:49 PM, Alexandre Arruda wrote:
> 
> 
> 2018-04-06 13:11 GMT-03:00 Tomas Vondra <tomas.von...@2ndquadrant.com>:
> 
> 
> 
> On 04/06/2018 04:29 PM, Alexandre Arruda wrote:
> > 2018-04-06 9:39 GMT-03:00 Tomas Vondra <tomas.von...@2ndquadrant.com>:
> >
> >
> >
> >     On 04/06/2018 02:09 AM, Alexandre Arruda wrote:
> >     > Hi,
> >     >
> >     > Some time ago, I had this errors frequently showed in logs
> after some
> >     > autovacuum in some tables(pg 9.6). VACUUM FULL or CLUSTER in
> this tables
> >     > show the same and not complete the tasks (showed by some
> table bloat
> >     > select).
> >     > Then, I did a full dump/restore into a new version (10.2)
> and everything
> >     > is ok for a couple of months. Now, I have errors like this
> again:
> >     >
> >     > db1=# cluster pc44t;
> >     >
> >     > ERROR:  found multixact 134100944 from before relminmxid
> 192042633
> >     >
> >     >  
> >     >
> >     > Like before, the only way to make the errors to desapear is by
> >     > dump/reload the whole table.
> >     >
> >     >
> >     > Thanks for any help.
> >     >
> >
> >     That's going to be hard, unless you still have the cluster around.
> >
> >     This surely seems like some sort of data corruption issue, but
> without
> >     being able to inspect the data pages it's nearly impossible to
> determine
> >     what went wrong.
> >
> >     We'd also need more information about what happened to the
>     hardware and
> >     cluster before the issues started to appear - crashes,
> hardware issues.
> >     And so on.
> >
> >     regards
> >
> >     --
> >     Tomas Vondra                  http://www.2ndQuadrant.com
> >     PostgreSQL Development, 24x7 Support, Remote DBA, Training &
> Services
> >
> >
> > Hi Tomas,
> > The old cluster are gone, unfortunatly.
> >
> > This server is a 128GB ECC RAM with a dual redundant hba fiber channel
> > connect to a sotorage with Raid 6 and I don't have (apparently) any
> > errors reported.
> > Yesterday I did a test with one table:  some sum aggragates, count(*),
> > etc, then dump/reload and repeat the tests the results (of querys) are
> > the same, regarding the vacuum problem
> > thats disapeared.
> >
> 
> I'm not sure I understand correctly. So you can reproduce the issue? If
> yes, how can you share the scripts/data you use (and how large is it)?
> If we could reproduce it locally, it would make the investigation much
> easier.
> 
> BTW you mentioned you're using PostgreSQL 9.6 - which minor version,
> specifically?
> 
> 
> regards
> 
> 
> Hi Tomas,
> 
> No, I can't reproduce. What I did is a simple way to "validate" the
> current table data to see if a dump/reload
> preserve them. Old postgresql was 9.6.5. The problem returns now in new
> 10.3 installation.
> 

I'm confused. First you say you can't reproduce the issue, then you say
it came back on a new installation. So which is it?

> There is a way to correct this tables without a dump/reload ?
> 
> I'm thinking to reinstall cluster doing a initdb --data-checksums,
> but I'm affraid about a severe performance impact.
>  

The performance impact should be non-measurable for most workloads. It
might be a couple of percent on certain workloads/hardware, but that's
probably not your case.

The bigger question is whether this can actually detect the issue. If
it's due to a storage issue, then perhaps yes. But if you only see
multixact issues consistently and nothing else, it might easily be a
PostgreSQL bug (in which case the checksums will be correct).
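
(Side note: whether a cluster was initialized with checksums can be
checked at any time with a simple

  SHOW data_checksums;

so you can verify the setting after the reinitialization.)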


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Shared buffers increased but cache hit ratio is still 85%

2018-07-18 Thread Tomas Vondra
On 07/18/2018 10:43 AM, Andreas Kretschmer wrote:
> 
> 
> Am 18.07.2018 um 10:26 schrieb Hans Schou:
>> Am I doing something wrong or should some history be cleared?
> 
> Reset the stats for that database. You can check the date of last reset
> with:
> 
> select stats_reset from pg_stat_database where datname = 'database_name';
> 
> and reset it with:
> 
> pg_stat_reset()
> 
> Reset all statistics counters for the current database to zero
> (requires superuser privileges by default, but EXECUTE for this function
> can be granted to others.)
> 

It might be better to note current values of the counters somewhere, and
compute a delta later (and use that to compute the cache hit ratio). The
issue is that pg_stat_reset() throws away all sorts of interesting and
important stats, including those driving autovacuum/autoanalyze.
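
For example, capture the raw counters now and then again later:

  SELECT now() AS captured_at, blks_hit, blks_read
    FROM pg_stat_database
   WHERE datname = current_database();

and compute the hit ratio from the differences between the two samples,
rather than from the totals accumulated since the last reset.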


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: is there any adverse effect on DB if I set autovacuum scale factor to zero?

2018-08-13 Thread Tomas Vondra




On 08/13/2018 11:07 AM, Raghavendra Rao J S V wrote:

Hi All,

We are using postgres 9.2 on the CentOS operating system. We have
around 1300+ tables. We have the following autovacuum settings enabled.
Still, a few of the tables (84 tables) which are always busy are not
vacuumed. Dead tuples in those tables are more than 5000. Due to that,
the tables are bloating and we observed performance degradation in a
few areas.




You don't say how large the tables are, so it's impossible to say 
whether 5000 dead tuples is excessive or not. IMHO it's a negligible 
amount and should not lead to excessive bloat or issues.


A certain amount of wasted space is expected - it's a trade-off between 
immediate and delayed cleanup. If you delay the cleanup a bit, it's 
going to be more efficient overall.


It's also unclear why the tables are not vacuumed - it may easily be due 
to all the autovacuum workers being constantly busy, unable to cleanup 
all tables in a timely manner. In that case lowering the threshold is 
not going to help, on the contrary.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: is there any adverse effect on DB if I set autovacuum scale factor to zero?

2018-08-13 Thread Tomas Vondra




On 08/13/2018 03:41 PM, Raghavendra Rao J S V wrote:


Hi Tomas,

Thank you very much for your response.

As we know, a table becomes a candidate for the autovacuum process based
on the formula below:

  autovacuum VACUUM threshold for a table =
    autovacuum_vacuum_scale_factor * number of tuples + autovacuum_vacuum_threshold




Current settings in my database are as follows:

  autovacuum_vacuum_scale_factor = 0.1
  autovacuum_vacuum_threshold = 40



Due to the above formula, dead tuples accumulate based on the number of
live tuples, as shown by the query below.




select relname,n_live_tup,n_dead_tup,(n_live_tup*.1+40) 
expected_to_autovacuum,* from pg_stat_user_tables

where  n_dead_tup>800
order by n_live_tup desc
limit 100;




In order to avoid the dead tuple accumulation I would like to change the
autovacuum settings in "postgresql.conf" as below:

  autovacuum_vacuum_scale_factor = 0.01
  autovacuum_vacuum_threshold = 100


OK, so the tables apparently have enough dead tuples to trigger vacuum. 
That means the autovacuum throughput is insufficient to do all the 
cleanup. If you lower the scale factor, the amount of cleanup will 
*increase* (more tables being eligible for cleanup), making it less 
likely autovacuum can keep up.


You need to increase the throughput, by increasing vacuum_cost_limit or 
something like that.
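
For example, something like this in postgresql.conf (illustrative values
only - the right numbers depend on your hardware and workload):

  autovacuum_vacuum_cost_limit = 1000   # default -1, i.e. fall back to vacuum_cost_limit (200)
  autovacuum_vacuum_cost_delay = 10ms   # default 20ms

followed by a config reload.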



Kindly share your views. Does it cause any adverse effect on the DB?


Well, it forces the database to do more stuff / more often, so it may 
have an adverse impact, of course. It's hard to say if it's going to be 
a win overall, because we don't know how serious the bloat is.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Fwd: is there any adverse effect on DB if I set autovacuum scale factor to zero?

2018-08-13 Thread Tomas Vondra

On 08/13/2018 04:24 PM, Ashu Pachauri wrote:

+ pgsql-general

Thanks and Regards,
Ashu Pachauri


-- Forwarded message -
From: Ashu Pachauri <ashu210...@gmail.com>
Date: Mon, Aug 13, 2018 at 7:53 PM
Subject: Re: is there any adverse effect on DB if I set autovacuum scale 
factor to zero?

To: <raghavendra...@gmail.com>


The way I see autovacuum_vacuum_scale_factor is not in terms of an 
absolute number but as the percentage of any table that can consist of 
updated / deleted tuples to make it eligible for vacuuming. A factor of 
0.1 ensures that your tables would be eligible for vacuuming if more 
than 10% of the tuples are deleted/updated.
1. If you think that 10% is too high for you in terms of storage cost, 
you can decrease the number or set it to zero. But, I would advise to 
increase the value of autovacuum_vacuum_threshold to something 
reasonable if you do that, otherwise you pay the CPU cost of frequent 
vacuuming across all tables.
2. However, if your issue is not the fixed 10% overhead but the lack of 
throughput i.e. you see the number of deleted/updated tuples keeps 
increasing in an unbounded fashion, the right way to deal with it is a) 
having a higher value of autovacuum_max_workers b) a lower value for 
autovacuum_naptime.




Increasing autovacuum_max_workers is unlikely to solve the issue with 
throughput, because all the workers are throttled together - there's a 
limit on the amount of work that can be done per second. Increasing the 
number of workers is akin to allowing more cars on a highway, but also 
lowering the speed limit.


You need to increase the limit on amount of work, and lowering naptime 
is one way to do that.



regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Code of Conduct

2018-09-18 Thread Tomas Vondra

On 09/18/2018 01:47 PM, James Keener wrote:

 > following a long consultation process

It's not a consultation if any dissenting voice is simply ignored.
Don't sugar-coat or politicize it like this -- it was rammed down
everyone's throats. That is core's right, but don't act as everyone's
opinions and concerns were taken into consideration.


I respectfully disagree.

I'm not sure which dissenting voices you think were ignored, but from 
what I've observed in the various CoC threads the core team took the 
time to respond to all comments. That does not necessarily mean the 
resulting CoC makes everyone happy, but unfortunately that's not quite 
possible. And it does not mean it was not an honest consultation.


IMO the core team did a good job in listening to comments, tweaking the 
wording and/or explaining the reasoning. Kudos to them.



There are a good number of folks who are concerned that this CoC is
overreaching and is ripe for abuse. Those concerns were always
simply, plainly, and purposely ignored.
No, they were not. There were multiple long discussions about exactly 
these dangers. You may dislike the outcome, but it was not ignored.


 > Please take time to read and understand the CoC, which is intended to 
ensure that PostgreSQL remains an open and enjoyable project for anyone 
to join and participate in.


I sincerely hope so, and that it doesn't become a tool to enforce social 
ideology like in other groups I've been part of. Especially since this 
is the main place to come to get help for PostgreSQL and not a social club.




Ultimately, it's a matter of trust that the CoC committee and core team 
apply the CoC in a careful and cautious way. Based on my personal 
experience with most of the people involved in both groups I'm not 
worried about this part.



regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Help to understand Actual Rows vs Plan Rows from the query planner output

2018-09-23 Thread Tomas Vondra
          Plan Rows: 50                                              
>                                   +
>              Plan Width: 0                                              
>                                   +
>              Actual Startup Time: 0.048                                 
>                                   +
>              Actual Total Time: 0.048                                   
>                                   +
>              Actual Rows: 1                                             
>                                   +
>              Actual Loops: 1                                            
>                                   +
>              Index Cond: "(lower((city)::text) = 'melanyfort'::text)"   
>                                   +
>            - Node Type: "Bitmap Index Scan"                             
>                                   +
>              Parent Relationship: "Member"                              
>                                   +
>              Parallel Aware: false                                      
>                                   +
>              Index Name: "lower_state_users_idx"                        
>                                   +
>              Startup Cost: 0.00                                         
>                                   +
>              Total Cost: 5.87                                           
>                                   +
>              Plan Rows: 211                                             
>                                   +
>              Plan Width: 0                                              
>                                   +
>              Actual Startup Time: 0.102                                 
>                                   +
>              Actual Total Time: 0.102                                   
>                                   +
>              Actual Rows: 211                                           
>                                   +
>              Actual Loops: 1                                            
>                                   +
>              Index Cond: "(lower((state)::text) = 'ohio'::text)"        
>                                   +
>    Planning Time: 0.260                                                 
>                                   +
>    Triggers:                                                            
>                                   +
>    Execution Time: 0.249
> (1 row)
> 
> aruprakshit=# 
> 
> 
> --
> 
> In the first node type of "Bitmap Index Scan” on “users_lower_idx”, I
> see the plan rows are 50, but actual rows it got 1. In the second node
> type of "Bitmap Index Scan” on “ lower_state_users_idx”, I see the plan
> rows are 211, and actual rows 211. Both are same. Based on what
> conditions planner estimated the planned and actual rows count? 
>
> 
> In node type “BitmapAnd”, I see again the actual rows 1, then why on the
> final plan i.e. Node Type: "Bitmap Heap Scan” again planner estimated
> rows 1? How does it counts these? What does the Loops count says us
> about the query?  
>

It's not very clear to me whether you're asking how the planner computes
estimates in general, or how it computed these particular estimates (or
what issues you see there).

Perhaps this would give you at least some answers:

https://www.postgresql.org/docs/11/static/row-estimation-examples.html
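
For example, assuming the table behind those indexes is called "users",
you can compare the estimated and actual row counts for one of the
conditions in isolation:

  EXPLAIN (ANALYZE, BUFFERS)
  SELECT * FROM users WHERE lower(city) = 'melanyfort';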

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Why the index is not used ?

2018-10-07 Thread Tomas Vondra
Hi,

On 10/07/2018 08:32 PM, ROS Didier wrote:
> Hi Francisco
> 
>   Thank you for your remark. 
> You're right, but it's the only procedure I found to make search on
> encrypted fields with good response times (using index) !
> 

Unfortunately, that kinda invalidates the whole purpose of in-database
encryption - you'll have encrypted on-disk data in one place, and then
plaintext right next to it. If you're dealing with credit card numbers,
then you presumably care about PCI DSS, and this is likely a direct
violation of that.

> Regarding access to the file system, our servers are in protected
network areas. few people can connect to it.
> 

Then why do you need encryption at all? If you assume access to the
filesystem / storage is protected, why do you bother with encryption?
What is your threat model?

> it's not the best solution, but we have data encryption needs and
> good performance needs too. I do not know how to do it except the
> specified procedure..
>
> if anyone has any proposals to put this in place, I'm interested.
> 

One thing you could do is hashing the value and then searching by the
hash. So aside from having the encrypted column you'll also have a short
hash, and you may use it in the query *together* with the original
condition. It does not need to be unique (in fact it should not be to
make it impossible to reverse the hash), but it needs to have enough
distinct values to make the index efficient. Say, 10k values should be
enough, because that means 0.01% selectivity.

So the function might look like this, for example:

  CREATE FUNCTION cchash(text) RETURNS int AS $$
    SELECT abs(hashtext($1)) % 10000;
  $$ LANGUAGE sql;

and then be used like this:

  CREATE INDEX idx_cartedecredit_cc02 ON cartedecredit(cchash(cc));

and in the query

  SELECT pgp_sym_decrypt(cc, 'motdepasse') FROM cartedecredit
   WHERE pgp_sym_decrypt(cc, 'motdepasse')='test value 32'
 AND cchash(cc) = cchash('test value 32');

Obviously, this does not really solve the issues with having to pass the
password to the query, making it visible in pg_stat_activity, various
logs etc.

Which is why people generally use FDE for the whole disk, which is
transparent and provides the same level of protection.


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Why the index is not used ?

2018-10-08 Thread Tomas Vondra
Hi,

On 10/08/2018 04:10 PM, ROS Didier wrote:
> Hi Tomas
>    
>     Thank you for your answer and recommendation which is very
> interesting. I'm going to study the PCI DSS document right now.
> 
>   * Here is my answer to your question:
> 
> >> What is your threat model?
> 
> We want to prevent access to sensitive data for everyone except those
> who have the encryption key.
> In case of theft of files, backups or dumps, we do not want anyone to
> access sensitive data.

The thing is - encryption is not a panacea. The interesting question is
whether this improves security compared to simply using FDE and regular
access rights (which are grantable at the column level).

Using those two pieces properly may very well be a better defense than a
poorly designed encryption scheme - and based on this discussion, it
does not seem very polished / resilient.


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: COPY command returns "ERROR: invalid XML content"

2019-10-07 Thread Tomas Vondra

On Sun, Oct 06, 2019 at 08:45:40PM -0700, Konstantin Izmailov wrote:

Hi,
I'm using libpq (v10) to import lots of xml files into a PG10 table. I
noticed if number of records imported exceeds 2100 then the following error
is returned:
ERROR:  invalid XML content
DETAIL:  line 1: Couldn't find end of Start Tag
timeBasedFileNamingAndTriggerin line 1
logFile.%d{-MM-dd}.%i.html

My guess is this is an issue/limitation in libxml2, which we use to
parse and process XML. Which libxml2 version do you have installed? Can
you share an example of an XML document that reproduces the issue?

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Install postgres on rhel 7

2019-10-07 Thread Tomas Vondra

On Mon, Oct 07, 2019 at 01:00:32AM -0400, Mageshwaran Janarthanam wrote:

Hi Team,

I have question on the postgres install process. Tried to read the archives
but couldn't find it.

I wanted to install postgres server via rpm and wanted to store the
binaries under non default location like /pgbin.



The rpm packages (at least those available at yum.postgresql.org) are
not relocatable:

 rpm -qpi postgresql12-12.0-1PGDG.rhel7.x86_64.rpm  | grep Relocations
 Relocations : (not relocatable)

So unfortunately, you'll have to build your own packages, or something
like that.

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: PG 12 not yet for mac

2019-10-07 Thread Tomas Vondra

On Mon, Oct 07, 2019 at 11:11:43AM -0400, Ravi Krishna wrote:

https://postgresapp.com/downloads.html

The link which says PG 12 is actually PG 11.



Not sure if the link is correct or not (it seems to point to
Postgres-2.3-12.dmg, which seems like it might be v12). But more
importantly, that's not a project/page managed by the PostgreSQL
community, you need to report the issues to the authors (most likely
through github issues).

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: pgcrypto question

2019-10-07 Thread Tomas Vondra

On Mon, Oct 07, 2019 at 12:05:16PM -0400, Erik Aronesty wrote:

Currently, it is my understanding that the pgcrypto library requires
the user to send a password or private key up to the server for
decryption.



Correct. In the naive case the key is included in each SQL query, which
does have various issues. Bruce Momjian has a nice extension that allows
you to fix that by loading the key into backend memory:

 http://momjian.us/download/pgcryptokey/
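
For completeness, the naive usage looks roughly like this (just a sketch;
"t" and "secret" are made-up names, and the passphrase is part of the SQL
text, so it shows up in pg_stat_activity, logs etc.):

  CREATE EXTENSION IF NOT EXISTS pgcrypto;
  CREATE TABLE t (secret bytea);   -- made-up table, just for illustration

  INSERT INTO t (secret) VALUES (pgp_sym_encrypt('some data', 'passphrase'));
  SELECT pgp_sym_decrypt(secret, 'passphrase') FROM t;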


Is there a notion of a client-side encrypt/decrypt plugin when doing a
postgres query?

For example, a user could query postgres, get back data of type
"encrypted", and a "libpq" plugin could decode/decrypt those columns
that are of data type "encrypted" in a manner transparent to the
user of the client

Obviously I could write this by intercepting the relevant libpq calls
using LD_PRELOAD or Microsoft's "Detours" ... but it there a better
way to do that?



AFAIK that's usually done at the application level, i.e. the application
is sending/receiving encrypted data, and the database simply sees bytea
columns. I'm not aware of a driver doing that transparently, but it
seems like an interesting idea - I wonder if it could be done e.g. in
psycopg as an extension, or something like that.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: pgcrypto question

2019-10-07 Thread Tomas Vondra

On Mon, Oct 07, 2019 at 02:51:30PM -0400, Erik Aronesty wrote:

Good idea for "psycopg".  It would be easy for a POC, but I think the
only meaningful layer to operate at would be a libpq drop-in
replacement that intercepts PQgetvalue, PQprepare, PQexecParams,
PQexecPrepared ... etc.   That way odbc, python, node, etc would "just
work" as long as you used LD_PRELOAD appropriately.



It's not clear to me how that would know which columns are encrypted,
with what key, etc. Because those encrypted columns are essentially just
regular bytea columns, so there's no easy way to distinguish them.

I'm no psycopg2 expert, but it does have some infrastructure for casting
PostgreSQL types to Python types, and I guess that could be used for the
encryption.

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: temporary files

2019-10-07 Thread Tomas Vondra

On Mon, Oct 07, 2019 at 01:52:41PM -0700, dangal wrote:

Hello everyone, I have a question to see if you can help me, I have set
work_mem in 100 MB but I have the following in the pgbadger

Queries generating the most temporary files (N)

Count  Total size  Min size   Max size   Avg size
   58    3.24 GiB   57.15 MiB  57.15 MiB  57.15 MiB

How could it be that if you require 57 MB, 100 MB will not reach ?
thanks a lot



The on-disk and in-memory representations are not the same, and the
on-disk one is often significantly more efficient. Data that needs a
57MB temporary file may need ~150MB of memory to perform an in-memory
sort, for example.

Yet another reason may be that e.g. a hash join splits the memory into
batches, and each one has to fit into work_mem. And whenever we hit the
limit, we double the number of batches, i.e. we cut the batch size in
half. Thus the size is somewhere between 50 and 100MB, with ~75MB on
average. Furthermore, we also estimate the batch size before running the
query, so we may start with unnecessarily many batches, pushing the
average batch size down even more.


FWIW it's somewhat suspicious you have 58 temp files with almost no
variability in size. That seems as if a single query was executed
repeatedly. Maybe try looking into what query that is.
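
The per-statement detail comes from log_temp_files, so if it is not set
already, something like

  SET log_temp_files = 0;   -- log every temp file; the value is a size threshold in kB

(or the equivalent in postgresql.conf) will log each temporary file
together with the statement that created it.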


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Table locking during backup

2019-10-07 Thread Tomas Vondra

On Mon, Oct 07, 2019 at 11:00:08PM +0200, Artur Zając wrote:

Hi,



I cannot reproduce some scenario  I found in my PostgreSQL logs.



I have two connections/processes:



Process 24755 is standard pg_backup connection with:



.

BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ, READ ONLY

.

LOCK TABLE gm.tableabc;

.

COPY FROM gm.tableabc



Process 5969 is normal connection where I have:



LOG: proces 5969 still waiting for AccessExclusiveLock on relations
562888531 of database 16429 after 1000.066 ms

DETAILT: Process holding the lock: 24755. Wait queue: 5969



Query that is waiting (from proces 5969) is:



CREATE TEMP TABLE IF NOT EXISTS tableabc

(

Id BIGINT DEFAULT gm.top()

) INHERITS (gm.tableabc);



I cannot reproduce what pg_dump is doing that causes waiting on proces 5969
(which lock is acquired and why). When pg_dump ends lock is released and
proces 5969 continues.



I know that doing in two parallel connections:



BEGIN;



CREATE TEMP TABLE IF NOT EXISTS tableabc

(

Id BIGINT DEFAULT gm.top()

) INHERITS (gm.tableabc);



causes  waiting state on second connection until the first finished, but
pg_dump connection is read only.



Not sure why it would matter that the pg_dump connection is read-only;
this is about locking, because pg_dump needs to ensure the schema does
not change while it's running.

pg_dump does not do

 LOCK TABLE gm.tableabc;

but

 LOCK TABLE gm.tableabc IN ACCESS SHARE MODE;

This lock should be visible in the pg_locks system view, and it does
conflict with the ACCESS EXCLUSIVE mode requested by the second query.
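
While the dump is running you can inspect that, e.g. (24755 being the
pg_dump backend from your log):

  SELECT locktype, relation::regclass, mode, granted
    FROM pg_locks
   WHERE pid = 24755;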




Could you suggest me which part of pg_dump (which query)  might cause that
behaviour.



It's this line:

https://github.com/postgres/postgres/blob/master/src/bin/pg_dump/pg_dump.c#L6676


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: timescaleDB & WAL replication

2019-10-15 Thread Tomas Vondra

On Tue, Oct 15, 2019 at 09:28:27AM +0200, basti wrote:

Hello,
I have a Master / Slave Postgres setup with WAL Replication. Now I want
to add timescaleDB.

I found this todo: https://docs.timescale.com/latest/tutorials/replication

As I understand that in the right way I just need to add the timescaleDB
extention on the master side?



AFAIK timescale is "just" a regular extension, in which case yes, all
you need to do is install it on the master side.
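
I.e. just run this on the master (assuming the timescaledb library is
installed and listed in shared_preload_libraries on both nodes - WAL
replication copies the catalog entries and data, not the binaries):

  CREATE EXTENSION IF NOT EXISTS timescaledb;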

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: jsonb_set() strictness considered harmful to data

2019-10-19 Thread Tomas Vondra

On Fri, Oct 18, 2019 at 09:14:09PM -0500, Ariadne Conill wrote:

Hello,

On Fri, Oct 18, 2019 at 6:52 PM Stephen Frost  wrote:


Greetings,

* Ariadne Conill (aria...@dereferenced.org) wrote:
> On Fri, Oct 18, 2019 at 6:01 PM Adrian Klaver  
wrote:
> > https://www.postgresql.org/docs/11/functions-json.html
> > " The field/element/path extraction operators return NULL, rather than
> > failing, if the JSON input does not have the right structure to match
> > the request; for example if no such element exists"
>
> It is known that the extraction operators return NULL.  The problem
> here is jsonb_set() returning NULL when it encounters SQL NULL.
>
> > Just trying to figure why one is worse then the other.
>
> Any time a user loses data, it is worse.  The preference for not
> having data loss is why Pleroma uses PostgreSQL as it's database of
> choice, as PostgreSQL has traditionally valued durability.  If we
> should not use PostgreSQL, just say so.

Your contention that the documented, clear, and easily addressed
behavior of a particular strict function equates to "the database system
loses data and isn't durable" is really hurting your arguments here, not
helping it.

The argument about how it's unintuitive and can cause application
developers to misuse the function (which is clearly an application bug,
but perhaps an understandable one if the function interface isn't
intuitive or is confusing) is a reasonable one and might be convincing
enough to result in a change here.

I'd suggest sticking to the latter argument when making this case.

> > > I believe that anything that can be catastrophically broken by users
> > > not following upgrade instructions precisely is a serious problem, and
> > > can lead to serious problems.  I am sure that this is not the only
> > > project using JSONB which have had users destroy their own data in
> > > such a completely preventable fashion.

Let's be clear here that the issue with the upgrade instructions was
that the user didn't follow your *application's* upgrade instructions,
and your later code wasn't written to use the function, as documented,
properly- this isn't a case of PG destroying your data.  It's fine to
contend that the interface sucks and that we should change it, but the
argument that PG is eating data because the application sent a query to
the database telling it, based on our documentation, to eat the data,
isn't appropriate.  Again, let's have a reasonable discussion here about
if it makes sense to make a change here because the interface isn't
intuitive and doesn't match what other systems do (I'm guessing it isn't
in the SQL standard either, so we unfortunately can't look to that for
help; though I'd hardly be surprised if they supported what PG does
today anyway).


Okay, I will admit that saying PG is eating data is perhaps
hyperbolic,


My experience is that using such hyperbole is pretty detrimental, even
when one is trying to make a pretty sensible case. The problem is that
people often respond in a similarly hyperbolic claims, particularly when
you hit a nerve. And that's exactly what happened here, becase we're
*extremely* sensitive about data corruption issues, so when you claim
PostgreSQL is "eating data" people are likely to jump on you, beating
you with the documentation stick. It's unfortunate, but it's also
entirely predictable.


but I will also say that the behaviour of jsonb_set()
under this type of edge case is unintuitive and frequently results in
unintended data loss.  So, while PostgreSQL is not actually eating the
data, it is putting the user in a position where they may suffer data
loss if they are not extremely careful.

Here is how other implementations handle this case:

MySQL/MariaDB:

select json_set('{"a":1,"b":2,"c":3}', '$.a', NULL) results in:
  {"a":null,"b":2,"c":3}

Microsoft SQL Server:

select json_modify('{"a":1,"b":2,"c":3}', '$.a', NULL) results in:
  {"b":2,"c":3}

Both of these outcomes make sense, given the nature of JSON objects.
I am actually more in favor of what MSSQL does however, I think that
makes the most sense of all.



I do mostly agree with this. The json[b]_set behavior seems rather
surprising, and I think I've seen a couple of cases running into exactly
this issue. I've solved that with a simple CASE, but maybe changing the
behavior would be better. That's unlikely to be back-patchable, though,
so maybe a better option is to create a non-strict wrapper. But that
does not work when the user is unaware of the behavior :-(
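
For the record, the CASE workaround I mean is roughly this (table, column
and path are made up; new_value stands for whatever expression supplies
the value):

   UPDATE t
      SET data = CASE WHEN new_value IS NULL THEN data
                      ELSE jsonb_set(data, '{some,key}', new_value)
                 END;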

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Replication of Replication

2019-10-19 Thread Tomas Vondra

On Fri, Oct 18, 2019 at 05:54:34PM -0400, Edilmar Alves wrote:
I have 3 servers running CentOS+PG 11 
(postgresql11-server-11.5-1PGDG.rhel7.x86_64):

- s1: main db + publication
- s2: subscription of the main db
all works fine until here...
Now, I tried to config this:
- s2: publication of the same db that was replicated from s1
- s3: subscription of the db from s2
When I did this, replication s1=>s2 stopped, and replication s2=>s3 
never worked.

Is this not possible?


You'll have to share more details - error messages from the server log,
how you check that the replication stopped, etc.

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: releasing space

2019-10-19 Thread Tomas Vondra

On Thu, Oct 17, 2019 at 05:20:09PM +, Julie Nishimura wrote:

Hello everybody, We are running PostgreSQL 9.6.2 cluster master ->
standby (streaming replication). 22 tb of space (constantly struggling
with the space, pruning the old data, but not fast enough). The biggest
db takes 16 tb. So, we've copied it to another server, and now we would
like to delete it from our original source, to free up the space. What
would be the right approach for this?  Just issue drop database command
(16tb). How long it might take? Should we do it gradually (drop biggest
tables first)? Any suggestions? Caveats?



Generally speaking, DROP DATABASE simply recursively drops all the
various objects - indexes, tables, etc. It's mostly just deleting the
files, which should not be very expensive (we certainly don't need to
delete all the data or anything), but there's a certain amount of I/O
involved. But whether that's an issue does depend on the OS / filesystem /
hardware.

So if you want to be on the safe side, you can drop the objects one by
one, with a bit of delay between them, to throttle the I/O a bit.
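
In practice that can be a trivial generated psql script - thanks to
autocommit each DROP runs in its own transaction, so the files get
unlinked gradually (object names are placeholders, of course):

   DROP TABLE big_schema.big_table_1;
   SELECT pg_sleep(10);
   DROP TABLE big_schema.big_table_2;
   SELECT pg_sleep(10);
   -- ... and so on; once the database is mostly empty, run
   -- DROP DATABASE bigdb; while connected to a different database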

FWIW the latest minor release for 9.6 is 9.6.15, you're 13 minor
versions (~30 months) of fixes behind. You might want to consider
upgrading ...


--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Has there been any discussion of custom dictionaries being defined in the database?

2019-10-19 Thread Tomas Vondra

On Thu, Oct 17, 2019 at 11:52:39AM +0200, Tom Lane wrote:

Morris de Oryx  writes:

Given that Amazon is bragging this week about turning off Oracle, it seems
like they could kick some resources towards contributing something to the
Postgres project. With that in mind, is the idea of defining dictionaries
within a table somehow meritless, or unexpectedly difficult?


Well, it'd just be totally different.  I don't think anybody cares to
provide two separate definitions of common dictionaries (which'd have to
somehow be kept in sync).

As for why we did it with external text files in the first place ---
for at least some of the dictionary types, the point is that you can
drop in data files that are available from upstream sources, without any
modification.  Getting the same info into a table would require some
nonzero amount of data transformation.



IMHO being able to load dictionaries from a table would be quite
useful, and not just because of RDS. For example, it's not entirely true
we're just using the upstream dictionaries verbatim - it's quite common
to add new words, particularly in specialized fields. That's way easier
when you can do that through a table and not through a file.


Having said that ... in the end a dictionary is really just a set of
functions implementing the dictionary API; where they get their data
from is their business.  So in theory you could roll your own
dictionary that gets its data out of a table.  But the dictionary API
would be pretty hard to implement except in C, and I bet RDS doesn't
let you install your own C functions either :-(



Not sure. Of course, if we expect the dictionary to work just like the
ispell one, with preprocessing the dictionary into shmem, then that
requires C. I don't think that's entirely necessary, though - we could
use the table directly. Yes, that would be slower, but maybe it'd be
sufficient.

But I think the idea is ultimately that we'd implement a new dict type
in core, and people would just specify which table to load data from.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: jsonb_set() strictness considered harmful to data

2019-10-19 Thread Tomas Vondra

On Sat, Oct 19, 2019 at 11:26:50AM -0400, Andrew Dunstan wrote:


...

The hyperbole here is misplaced. There is a difference between a bug and
a POLA violation. This might be the latter, but it isn't the former. So
please tone it down a bit. It's not the function that's unsafe, but the
ill-informed use of it.


We invented jsonb_set() (credit to Dmitry Dolgov). And we've had it
since 9.5. That's five releases ago.  So it's a bit late to be coming to
us telling us it's not safe (according to your preconceptions of what it
should be doing).


We could change it prospectively (i.e. from release 13 on) if we choose.
But absent an actual bug (i.e. acting contrary to documented behaviour)
we do not normally backpatch such changes, especially when there is a
simple workaround for the perceived problem. And it's that policy that
is in large measure responsible for Postgres' deserved reputation for
stability.



Yeah.



Incidentally, why is your function written in plpgsql? Wouldn't a simple
SQL wrapper be better?


   create or replace function safe_jsonb_set
       (target jsonb, path text[], new_value jsonb,
        create_missing boolean default true)
   returns jsonb as
   $func$
       select case when new_value is null then target
                   else jsonb_set(target, path, new_value, create_missing)
              end
   $func$ language sql;


And if we were to change it I'm not at all sure that we should do it the
way that's suggested here, which strikes me as no more intuitive than
the current behaviour. Rather I think we should possibly fill in a json
null in the indicated place.



Not sure, but that seems rather confusing to me, because it's mixing SQL
NULL and JSON null, i.e. it's not clear to me why

   jsonb_set(..., "...", NULL)

should do the same thing as

   jsonb_set(..., "...", 'null'::jsonb)

I'm not entirely surprised it's what MySQL does ;-) but I'd say treating
it as a deletion of the key (just like MSSQL) is somewhat more sensible.
But I admit it's quite subjective.

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: jsonb_set() strictness considered harmful to data

2019-10-19 Thread Tomas Vondra

On Sat, Oct 19, 2019 at 11:21:26AM -0400, Stephen Frost wrote:

Greetings,

* Dmitry Dolgov (9erthali...@gmail.com) wrote:

If we want to change it, the question is where to stop? Essentially we have:

update table set data = some_func(data, some_args_with_null);

where some_func happened to be jsonb_set, but could be any strict function.


I don't think it makes any sense to try and extrapolate this out to
other strict functions.  Functions should be strict when it makes sense
for them to be- in this case, it sounds like it doesn't really make
sense for jsonb_set to be strict, and that's where we stop it.



Yeah. I think the issue here is (partially) that other databases adopted
similar functions after us, but decided to use a different behavior. It
might be more natural for the users, but that does not mean we should
change the other strict functions.

Plus I'm not sure if SQL standard says anything about strict functions
(I found nothing, but I looked only very quickly), but I'm pretty sure
we can't change how basic operators behave, and we translate them to
function calls (e.g. 1+2 is int4pl(1,2)).


I wonder if in this case it makes sense to think about an alternative? For
example, there is generic type subscripting patch, that allows to update a
jsonb in the following way:

update table set jsonb_data[key] = 'value';

It doesn't look like a function, so it's not a big deal if it will handle NULL
values differently. And at the same time one can argue, that people, who are
not aware about this caveat with jsonb_set and NULL values, will most likely
use it due to a bit simpler syntax (more similar to some popular programming
languages).


This seems like an entirely independent thing ...



Right. Useful, but entirely separate feature.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: jsonb_set() strictness considered harmful to data

2019-10-19 Thread Tomas Vondra

On Sat, Oct 19, 2019 at 12:47:39PM -0400, Andrew Dunstan wrote:


On 10/19/19 12:32 PM, David G. Johnston wrote:

On Sat, Oct 19, 2019 at 9:19 AM Tomas Vondra
mailto:tomas.von...@2ndquadrant.com>>
wrote:

>
>We invented jsonb_set() (credit to Dmitry Dolgov). And we've had it
>since 9.5. That's five releases ago.  So it's a bit late to be
coming to
>us telling us it's not safe (according to your preconceptions of
what it
>should be doing).
>


There have been numerous complaints and questions about this behavior
in those five years; and none of the responses to those defenses has
actually made the current behavior sound beneficial but rather have
simply said "this is how it works, deal with it".



I haven't seen a patch, which for most possible solutions should be
fairly simple to code. This is open source. Code speaks louder than
complaints.



IMHO that might be a bit too harsh - I'm not surprised no one sent a
patch when we're repeatedly telling people "you're holding it wrong".
Without a clear consensus what the "correct" behavior is, I wouldn't
send a patch either.





>
>We could change it prospectively (i.e. from release 13 on) if we
choose.
>But absent an actual bug (i.e. acting contrary to documented
behaviour)
>we do not normally backpatch such changes, especially when there is a
>simple workaround for the perceived problem. And it's that policy
that
>is in large measure responsible for Postgres' deserved reputation for
>stability.
>

Yeah.


Agreed, this is v13 material if enough people come on board to support
making a change.




We have changed such things in the past. But maybe a new function might
be a better way to go. I haven't given it enough thought yet.



I think the #1 thing we should certainly do is explaining the behavior
in the docs.






>And if we were to change it I'm not at all sure that we should do
it the
>way that's suggested here, which strikes me as no more intuitive than
>the current behaviour. Rather I think we should possibly fill in
a json
>null in the indicated place.
>

Not sure, but that seems rather confusing to me, because it's
mixing SQL
NULL and JSON null, i.e. it's not clear to me why

[...]

But I admit it's quite subjective.


Providing SQL NULL to this function and asking it to do something with
that is indeed subjective - with no obvious reasonable default, and I
agree that "return a NULL" while possible consistent is probably the
least useful behavior that could have been chosen.  We should never
have allowed an SQL NULL to be an acceptable argument in the first
place, and can reasonably safely and effectively prevent it going
forward.  Then people will have to explicitly code what they want to
do if their data and queries present this invalid unknown data to the
function.




How exactly do we prevent a NULL being passed as an argument? The only
thing we could do would be to raise an exception, I think. That seems
like a fairly ugly thing to do, I'd need a heck of a lot of convincing.



I don't know, but if we don't know what the "right" behavior with NULL
is, is raising an exception really that ugly?


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: releasing space

2019-10-19 Thread Tomas Vondra

On Sat, Oct 19, 2019 at 04:34:32PM -0700, Adrian Klaver wrote:

On 10/19/19 4:17 PM, Julie Nishimura wrote:
Thank you, Thomas. Do you know if it is safe to replicate 9.6.2 
(smaller) db to 9.6.15 (larger capacity) using pg_basebackup? Would 
it be considered as an upgrade?


pg_basebackup backups an entire Postgres cluster which will be many 
databases. So when you say db do mean a Postgres cluster or an 
individual database?




My understanding is Julie wants to create a copy of a 9.6.2 cluster
using pg_basebackup and then run 9.6.15 on it. That's OK, it's
essentially a minor version upgrade.

FWIW Julie, please don't top post - it just makes it harder to follow
the discussion. Also, this seems like a completely separate question,
unrelated to the DROP DATABASE one. It might be better to start a new
thread instead of repurposing an existing one.

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: jsonb_set() strictness considered harmful to data

2019-10-20 Thread Tomas Vondra

On Sun, Oct 20, 2019 at 03:48:05PM -0400, Andrew Dunstan wrote:


On 10/20/19 1:14 PM, David G. Johnston wrote:

On Sun, Oct 20, 2019 at 5:31 AM Andrew Dunstan
mailto:andrew.duns...@2ndquadrant.com>> wrote:

And yet another is to
raise an exception, which is easy to write but really punts the issue
back to the application programmer who will have to decide how to
ensure
they never pass in a NULL parameter.


That's kinda the point - if they never pass NULL they won't encounter
any problems but as soon as the data and their application don't see
eye-to-eye the application developer has to decide what they want to
do about it.  We are in no position to decide for them and making it
obvious they have a decision to make and implement here doesn't seem
like a improper position to take.



The app dev can avoid this problem today by making sure they don't pass
a NULL as the value. Or they can use a wrapper function which does that
for them. So frankly this doesn't seem like much of an advance. And, as
has been noted, it's not consistent with what either MySQL or MSSQL do.
In general I'm not that keen on raising an exception for cases like this.



I think the general premise of this thread is that the application
developer does not realize that may be necessary, because it's a bit
surprising behavior, particularly when having more experience with other
databases that behave differently. It's also pretty easy to not notice
this issue for a long time, resulting in significant data loss.

Let's say you're used to the MSSQL or MySQL behavior, you migrate your
application to PostgreSQL or whatever - how do you find out about this
behavior? Users are likely to visit

   https://www.postgresql.org/docs/12/functions-json.html

but that says nothing about how jsonb_set works with NULL values :-(

You're right raising an exception may not be the "right behavior" for
whatever definition of "right". But I kinda agree with David that it's
somewhat reasonable when we don't know what the "universally correct"
thing is (or when there's no such thing). IMHO that's better than
silently discarding some of the data.

FWIW I think the JSON/JSONB part of our code base is amazing, and the
fact that various other databases adopted something very similar over
the last couple of years just confirms that. And if this is the only
speck of dust in the API, I think that's pretty amazing.

I'm not sure how significant this issue actually is - it's true we got a
couple of complaints over the years (judging by a quick search for
jsonb_set and NULL in the archives), but I'm not sure that's enough to
justify any changes in backbranches. I'd say no, but I have no idea how
many people are affected by this but don't know about it ...

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: CPU SPIKE

2019-10-20 Thread Tomas Vondra

On Mon, Oct 21, 2019 at 10:45:33AM +0530, Pawan Sharma wrote:

Hi All,

Having a really high CPU issue (95-98%) with SELECT statements; the select
queries contain multiple AND operators - will this cause a CPU spike?

apps team is using sub-partition, PG11, CPU:24, Mem: 16GB



Any query execution is using CPU, so yes - it can cause a spike in CPU
usage. The question is whether the query plan is efficient, or whether
it's using the CPU unnecessarily. You need to show us EXPLAIN ANALYZE
and perhaps explain why you think it shouldn't use this much CPU.
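
Ideally something like this, run on the actual problematic query (BUFFERS
is optional but usually helpful):

   EXPLAIN (ANALYZE, BUFFERS)
   SELECT ... ;   -- the query causing the spike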

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: jsonb_set() strictness considered harmful to data

2019-10-20 Thread Tomas Vondra

On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote:


On 10/20/19 4:18 PM, Tomas Vondra wrote:

On Sun, Oct 20, 2019 at 03:48:05PM -0400, Andrew Dunstan wrote:


On 10/20/19 1:14 PM, David G. Johnston wrote:

On Sun, Oct 20, 2019 at 5:31 AM Andrew Dunstan
mailto:andrew.duns...@2ndquadrant.com>> wrote:

    And yet another is to
    raise an exception, which is easy to write but really punts the
issue
    back to the application programmer who will have to decide how to
    ensure
    they never pass in a NULL parameter.


That's kinda the point - if they never pass NULL they won't encounter
any problems but as soon as the data and their application don't see
eye-to-eye the application developer has to decide what they want to
do about it.  We are in no position to decide for them and making it
obvious they have a decision to make and implement here doesn't seem
like a improper position to take.



The app dev can avoid this problem today by making sure they don't pass
a NULL as the value. Or they can use a wrapper function which does that
for them. So frankly this doesn't seem like much of an advance. And, as
has been noted, it's not consistent with what either MySQL or MSSQL do.
In general I'm not that keen on raising an exception for cases like
this.



I think the general premise of this thread is that the application
developer does not realize that may be necessary, because it's a bit
surprising behavior, particularly when having more experience with other
databases that behave differently. It's also pretty easy to not notice
this issue for a long time, resulting in significant data loss.

Let's say you're used to the MSSQL or MySQL behavior, you migrate your
application to PostgreSQL or whatever - how do you find out about this
behavior? Users are likely to visit

   https://www.postgresql.org/docs/12/functions-json.html

but that says nothing about how jsonb_set works with NULL values :-(




We should certainly fix that. I accept some responsibility for the omission.



+1





You're right raising an exception may not be the "right behavior" for
whatever definition of "right". But I kinda agree with David that it's
somewhat reasonable when we don't know what the "universally correct"
thing is (or when there's no such thing). IMHO that's better than
silently discarding some of the data.



I'm not arguing against the idea of improving the situation. But I am
arguing against a minimal fix that will not provide much of value to a
careful app developer. i.e. I want to do more to support app devs.
Ideally they would not need to use wrapper functions. There will be
plenty of situations where it is mighty inconvenient to catch an
exception thrown by jsonb_set(). And catching exceptions can be
expensive. You want to avoid that if possible in your
performance-critical plpgsql code.



True. And AFAIK catching exceptions is not really possible in some code,
e.g. in stored procedures (because we can't do subtransactions, so no
exception blocks).





FWIW I think the JSON/JSONB part of our code base is amazing, and the
fact that various other databases adopted something very similar over
the last couple of years just confirms that. And if this is the only
speck of dust in the API, I think that's pretty amazing.



TY. When I first saw the SQL/JSON spec I thought I should send a request
to the SQL standards committee for a royalty payment, since it looked so
familiar ;-)



;-)





I'm not sure how significant this issue actually is - it's true we got a
couple of complaints over the years (judging by a quick search for
jsonb_set and NULL in the archives), but I'm not sure that's enough to
justify any changes in backbranches. I'd say no, but I have no idea how
many people are affected by this but don't know about it ...




No, no backpatching. As I said upthread, this isn't a bug, but it is
arguably a POLA violation, which is why we should do something for
release 13.



WFM


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: jsonb_set() strictness considered harmful to data

2019-10-21 Thread Tomas Vondra

On Mon, Oct 21, 2019 at 08:06:46AM -0700, Adrian Klaver wrote:

On 10/20/19 11:07 PM, Tomas Vondra wrote:

On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote:




True. And AFAIK catching exceptions is not really possible in some code,
e.g. in stored procedures (because we can't do subtransactions, so no
exception blocks).



Can you explain the above to me as I thought there are exception 
blocks in stored functions and now sub-transactions in stored 
procedures.




Sorry for the confusion - I've not been particularly careful when
writing that response.

Let me illustrate the issue with this example:

   CREATE TABLE t (a int);

   CREATE OR REPLACE PROCEDURE test() LANGUAGE plpgsql AS $$
   DECLARE
  msg TEXT;
   BEGIN
 -- SAVEPOINT s1;
 INSERT INTO t VALUES (1);
 -- COMMIT;
   EXCEPTION
 WHEN others THEN
   msg := SUBSTR(SQLERRM, 1, 100);
   RAISE NOTICE 'error: %', msg;
   END; $$;

   CALL test();

If you uncomment the SAVEPOINT, you get

   NOTICE:  error: unsupported transaction command in PL/pgSQL

because savepoints are not allowed in stored procedures. Fine.

If you uncomment the COMMIT, you get

   NOTICE:  error: cannot commit while a subtransaction is active

which happens because the EXCEPTION block creates a subtransaction, and
we can't commit when it's active.

But we can commit outside the exception block:

   CREATE OR REPLACE PROCEDURE test() LANGUAGE plpgsql AS $$
   DECLARE
  msg TEXT;
   BEGIN
 BEGIN
   INSERT INTO t VALUES (1);
 EXCEPTION
   WHEN others THEN
 msg := SUBSTR(SQLERRM, 1, 100);
 RAISE NOTICE 'error: %', msg;
  END;
  COMMIT;
   END; $$;


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: Barman

2019-10-31 Thread Tomas Vondra

On Thu, Oct 31, 2019 at 05:29:34PM +, Daulat Ram wrote:

Hi All,

We have configured postgres 11.2 in streaming replication (primary &
Standby)  on docker and I am looking to initiate the Postgres backup
using barman. As I know there are few options for taking backup using
barman.

RSYNC backup
Incremental Backups
Streaming Backup with continuous WAL streaming
Centralized and Catalogued Backups

Which is the best option for backup using barman? So that we can keep
the database safe in case of disaster? I feel the Incremental Backups
are most useful to perform the PITR but I want to know the experts
suggestions.



You're mixing a number of topics, here. Firstly, all backups done by
barman are centralized and catalogued, that's pretty much one of the
main purposes of barman.

When it comes to backup methods, there are two basic methods: rsync and
postgres (which means pg_basebackup). This is about creating the initial
base backup. Both methods then can replicate WAL by either streaming or
archive_command.

So first you need to decide whether to use rsync or pg_basebackup,
where rsync allows advanced features like incremental backup, parallel
backup and deduplication.

Then you need to decide whether to use archive_command or streaming
(i.e. pg_receivexlog).

The "right" backup method very much depends on the size of your
database, activity, and so on. By default you should probably go with
the default option, described as "scenario 1" in the barman docs, i.e.
pg_basebackup (backup_method = postgres) and WAL streaming.
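
Roughly, the per-server config for that scenario looks like this (the
values are placeholders - see the barman docs for the exact setup):

   [pg]
   description = "main cluster"
   conninfo = host=pg.example.com user=barman dbname=postgres
   streaming_conninfo = host=pg.example.com user=streaming_barman
   backup_method = postgres
   streaming_archiver = on
   slot_name = barman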

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: How to import Apache parquet files?

2019-11-05 Thread Tomas Vondra

On Tue, Nov 05, 2019 at 04:21:45PM +0100, Softwarelimits wrote:

Hi Imre, thanks for the quick response - yes, I found that, but I was not
sure if it is already production ready - also I would like to use the data
with the timescale extension, that is why I need a full import.



Well, we're not in the position to decide if parquet_fdw is production
ready, that's something you need to ask the author of the extension (and
then also judge yourself).

That being said, I think FDW is probably the best way to do this. It's
explicitly designed to work with foreign data, so using it to access
parquet files seems somewhat natural.

The alternative is probably transforming the data into COPY format, and
then load it into Postgres using COPY (either as a file, or stdin).

Which of these options is the right one depends on your requirements.
FDW is more convenient, but row-based and probably significantly less
efficient than COPY. So if you have a lot of these parquet files, I'd
probably use the COPY. But maybe the ability to query the parquet files
directly (with FDW) is useful for you.
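
The COPY route might look roughly like this - 'parquet-to-csv' is just a
placeholder for whatever converter you use, and FROM PROGRAM requires
superuser or the pg_execute_server_program role:

   COPY measurements FROM PROGRAM 'parquet-to-csv /data/file.parquet'
     WITH (FORMAT csv, HEADER true);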

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: BigSQL pgc alternative

2019-12-21 Thread Tomas Vondra

On Fri, Dec 20, 2019 at 12:15:06PM -0300, Samuel Teixeira Santos wrote:

Hi all.

BigSQL still allows installing Postgres and other resources as a
portable install.

But today, it only offers its tool (pgc) for newer PostgreSQL versions.

I would like to install it as a portable option because it's easier to
configure and use under my own user on my development workstation.

What would you recommend as an alternative to the pgc tool for installing
PostgreSQL 10 and the respective PostGIS version as a portable option?



I'm not familiar with BigSQL, so I'm not sure what exactly you mean when
you say "portable option". Can you explain?

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: jsonb_set() strictness considered harmful to data

2020-01-17 Thread Tomas Vondra

On Wed, Jan 08, 2020 at 05:24:05PM +1030, Andrew Dunstan wrote:

On Wed, Jan 8, 2020 at 7:08 AM Pavel Stehule  wrote:


Hi

po 6. 1. 2020 v 22:34 odesílatel Andrew Dunstan 
 napsal:



Updated version including docco and better error message.

cheers

andrew



I think so my objections are solved. I have small objection

+ errdetail("exception raised due to \"null_value_treatment := 
'raise_exception'\""),
+ errhint("to avoid, either change the null_value_treatment argument or ensure that 
an SQL NULL is not used")));

"null_value_treatment := 'raise_exception'\""

it use proprietary PostgreSQL syntax for named parameters. Better to use 
ANSI/SQL syntax

"null_value_treatment => 'raise_exception'\""

It is fixed in attached patch

source compilation without warnings,
compilation docs without warnings
check-world passed without any problems

I'll mark this patch as ready for committer

Thank you for your work




Thanks for the review. I propose to commit this shortly.



Now that this was committed, I've updated the patch status accordingly.

Thanks!

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: Avoiding out of date statistics / planner

2020-02-12 Thread Tomas Vondra

On Wed, Feb 12, 2020 at 10:23:22AM -0700, Michael Lewis wrote:

It may also be worth noting that it is possible to make autovacuum/analyze
more aggressive, perhaps only on the tables that see large changes in data
that might result in a statistics issue. If you could share a query,
explain analyze output, and pseudo code or at least description of what
sort of bulk operations are being done, then more insight could be offered.


Another thing you can do is deploy auto_explain, and log explain plan
for long-runnning queries. That won't fix the root cause, but it will
help you with confirming the root cause - you'll see the query plan,
which should give you enough context.
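
A minimal setup might be (the threshold is just an example value):

   # postgresql.conf
   shared_preload_libraries = 'auto_explain'   # requires a restart
   auto_explain.log_min_duration = '5s'
   auto_explain.log_analyze = on               # adds some overhead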

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: 1 Status of vertical clustered index - 2 Join using (fk_constraint) suggestion - 3 Status of pgsql's parser autonomization

2020-02-16 Thread Tomas Vondra

On Sun, Feb 16, 2020 at 10:38:29PM +0100, maxzor wrote:

Hello,

1. I was told that M$ SQLServer provides huge performance deltas over 
PostgreSQL when dealing with index-unaligned queries :

create index i on t (a,b, c);
select * from t where b=... and c=...;


Perhaps index-only scans might help here, but that generally does not
work for "SELECT *" queries.

Columnar storage has been tried by various companies, CitusData, 
EnterpriseDB, 2ndQuadrant, Fujitsu Enterprise Postgres. It has been 
discussed quite a lot, last thread that I was able to find being in 
2017, https://www.postgresql.org/message-id/CAJrrPGfaC7WC9NK6PTTy6YN-NN%2BhCy8xOLAh2doYhVg5d6HsAA%40mail.gmail.com 
where Fujitsu's patch made it quite far.

What is the status on such a storage manager extension interface ?



I think you're looking for threads about zheap and (especially)
zedstore. Those are two "storage manager" implementations various people
are currently working on. Neither of those is likely to make it into
pg13, though :-(

2. What do you think of adding a new syntax : 'from t join t2 using 
(fk_constraint)' ? And further graph algorithms to make automatic 
joins ?
Both 'natural join' and 'using (column_name)' are useless when the 
columns are not the same in source and destination.
Plus it is often the case that the fk_constraints are over numerous 
columns, even though this is usually advised against. But when this 
case happens there will be a significant writing speedup.


I'm not really sure what's the point / benefit here. Initially it seemed
you simply propose a syntax saying "do a join using the columns in the
FK constraint" but it's unclear to me how this implies any writing
speedup? 

I have been bothered by this to the point that I developed a 
graphical-query-builder plugin for pgModeler,
https://github.com/maxzor/plugins/tree/master/graphicalquerybuilder#automatic-join-mode 
,

but I believe such a syntax would be much better in the core!



Hm, maybe.

3. What is the status of making the internal parser of PostgreSQL less 
coupled to the core, and easier to cherry-pick from outside?
It would be great to incorporate it into companion projects : 
pgAdmin4, pgModeler, pgFormatter...




I have no idea what you mean by "less coupled" here. What are the
requirements / use cases you're thinking about?


FWIW I think it's pretty bad idea to post questions about three very
different topics into a single pgsql-hackers thread. That'll just lead
to a lot of confusion.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: DB running out of memory issues after upgrade

2020-02-18 Thread Tomas Vondra

On Tue, Feb 18, 2020 at 05:46:28PM +, Nagaraj Raj wrote:

after upgrading Postgres to v9.6.11 from v9.6.9 the DB is running into out of memory issues;
no workload has changed before and after the upgrade.

spec: RAM 16gb,4vCore
Any bug reported like this or suggestions on how to fix this issue? I 
appreciate the response..!! 



This bug report (in fact, we don't know if it's a bug, but OK) is
woefully incomplete :-(

The server log is mostly useless, unfortunately - it just says a bunch
of processes were killed (by OOM killer, most likely) so the server has
to restart. It tells us nothing about why the backends consumed so much
memory etc.

What would help us is knowing how much memory the backend (killed by
OOM) was consuming, which should be in dmesg.
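
Something like this usually shows the anon-rss of the killed backend:

   dmesg -T | grep -iE 'out of memory|killed process'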

And then MemoryContextStats output - you need to connect to a backend
consuming a lot of memory using gdb (before it gets killed) and do

 (gdb) p MemoryContextStats(TopMemoryContext)
 (gdb) q

and show us the output printed into server log. If it's a backend
running a query, it'd help knowing the execution plan.

It would also help knowing the non-default configuration, i.e. stuff
tweaked in postgresql.conf.

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: Check constraints do not seem to be working!!!

2020-11-11 Thread Tomas Vondra


On 11/11/20 10:06 AM, Jitendra Loyal wrote:
> Thanks Nikolay
> 
> I read that but is there a way to meet the above requirement. And I will
> like to add that IS NULL and IS NOT NULL should evaluate to true/false.
> These operators are made for this and should not be returning NULL.
> 

This has nothing to do with IS [NOT] NULL, it's the first part of the
expression (b = TRUE) causing trouble. Essentially, the constraint

(b = true) and (c is not null)

is evaluated in two steps. First we evaluate the two parts individually,
and for (null, true) the results would look like this:

(b = true) => null
(c is not null) => true

and then we combine those results using 'AND'

null AND true => null

which is considered as if the constraint matches. If you want to handle
NULL for the first expression, you may do this, for example:

(b is not null and b = true) and (c is not null)

Or something like that.
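
A quick illustration of the difference, for a row with b = NULL and a
non-NULL c:

   SELECT (NULL::boolean = true)            AS eq_part,      -- NULL
          (NULL::boolean = true) AND true   AS whole_check,  -- NULL, row passes
          (NULL::boolean IS NOT NULL AND NULL::boolean = true)
             AND true                       AS with_guard;   -- false, row rejected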


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Extended statistics for correlated columns, row estimates when values are not in MCVs list

2020-12-07 Thread Tomas Vondra



On 12/4/20 11:53 PM, Michael Lewis wrote:
> On version 12.5, I have a query similar to the below where I am getting
> rather unfortunate row estimates and a sub-optimal index choice as a result.
> 
> SELECT
>   id
> FROM
>   messages
> WHERE
>   client_id = 1234
>   and site_id = 889977
>   and message_type_id in ( 59, 62, 102, 162 )
>   and sent_on > NOW() - INTERVAL '3 days';
> 
> I created extended statistics (all types) on client_id and site_id,
> analyzed the columns, and in pg_stats_ext.dependencies (4 = client_id, 7 =
> site_id) have:
> {"4 => 7": 0.002997, "7 => 4": 0.896230}
> 
> When I check row estimates like below, I get results that indicate the
> columns are expected to be independent still.
> explain SELECT id FROM messages; --889594304
> explain SELECT id FROM messages where client_id = 1234; --133439
> explain SELECT id FROM messages where site_id = 889977; --28800
> explain SELECT id FROM messages where client_id = 1234 and site_id =
> 889977; --4
> 

I was a bit confused about this initially, because why would it still be
estimated as independent clauses, with the extended stats defined? But I
think the explanation is fairly simple - the combination of values is
simply rare enough not to be included in the MCV list, so it probably
gets estimated using the "regular" logic as if independent. But we still
mark the clauses as estimated, so the functional dependencies are not
really considered.

(I wonder if the code in PG14 would work better, but that's of little
use of course.)

> However, I pick a client & site ID pair which show up in the MCVs list,
> then I get the same estimate when querying for that site_id with or without
> including the client_id. That is great.
> 

I think this is consistent with the above explanation - in this case the
MCV actually kicks in, significantly improving the estimate.

> Is it reasonable to expect that if the correlation between two columns is
> rather high, then the optimizer might figure the columns are not
> independent and perhaps would give less weight to the value derived from
> independent column statistics? With table statistics, it is possible to set
> a static value or ratio for something like ndistinct. Any chance for
> something similar on dependency someday?
> 
> Perhaps I am expecting too much or have a poor understanding of what
> extended statistics can or someday might do. I deal with under estimates
> from correlations between client_id and sites or other similar
> dependent objects a fair bit and am hopeful to resolve some of those
> planning problems with extended stats, without the maintenance overhead of
> migrating everything to a single client per database to get more specific
> statistics.
> 

What you might try is defining the statistics with only the functional
dependencies. That should consider the column-level correlation even
when the combination of values is not in the MCV. It might make the
"good" estimate worse, but that's an inherent trade-off.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Potential BRIN Index Corruption

2020-12-09 Thread Tomas Vondra
On 12/9/20 12:07 AM, Huan Ruan wrote:
> HI Alvaro
> 
> Unfortunately those records were written a while ago and we no longer keep
> their WAL logs. Thanks for your help anyway.
> 

Can you estimate when roughly the records were written? E.g. by using a
rough estimate of WAL or XIDs generated per day, or something like that.
Maybe there was some I/O issue in that time period and a couple writes
got lost, or something like that.

Are there any other corrupted indexes on the table?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Is there a way to dump schema to files on disk and keep them in sync

2020-12-13 Thread Tomas Vondra
On 12/13/20 6:34 PM, Adrian Klaver wrote:
> On 12/12/20 6:58 PM, Tim Uckun wrote:
>> I want to dump my postgres schema to disk in neat directories like
>> pgadmin presents. Then I want to be able to edit the files and sync
>> changes to the database and ideally if changes were made in the
>> database to sync them back to the disk.
>>
>> Is there a tool that does this? Is there a tool that will dump the
>> schema into separate directories and files like pgadmin does?
> 
> pgAdmin does not create directories, it just organizes the contents of
> the system catalogs into GUI elements.
> 
> For schema management I would suggest a tool like the one I use
> Sqitch(https://sqitch.org/). It will organize the process of schema
> creation and management.
> 

Yeah, that was my thought too. Or maybe look at the other schema
versioning tools available - we have a list on the wiki:

https://wiki.postgresql.org/wiki/Change_management_tools_and_techniques

I'm sure it's incomplete, but it's helpful nevertheless.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Reg: User creation script/List of privileges

2022-03-30 Thread Tomas Pospisek

On 30.03.22 05:09, Sai Ch wrote:

Hi Experts,

I am looking for a query or possibility to generate all the privileges a 
user/role has.


I need this to create a user/role from one database to another with 
exact same privileges.


Kindly, share the query or way to proceed further.

Thanks & Regards,


Maybe have a look at 
https://github.com/tpo/little_shell_scripts/blob/master/psql_access_priv_decoder
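
If table-level grants are all you need, something along these lines may
already get you most of the way (the role name is a placeholder; this does
not cover databases, schemas, sequences or default privileges):

   SELECT grantee, table_schema, table_name, privilege_type
     FROM information_schema.role_table_grants
    WHERE grantee = 'some_role'
    ORDER BY table_schema, table_name;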





ERROR: new collation (en_US.UTF-8) is incompatible with the collation of the template database (en_US.utf-8)

2022-06-22 Thread Tomas Pospisek

Hi all,

while doing `cat pg_dump.dump | psql` I get the above message. Note that 
`pg_dump.dump` contains:


CREATE DATABASE some_db WITH TEMPLATE = my_own_template ENCODING = 
'UTF8' LC_COLLATE = 'en_US.UTF-8' LC_CTYPE = 'en_US.UTF-8';


What is exactly the problem? I understand that en_US.UTF-8 and 
en_US.utf-8 is not *exactly* the same string.


However I do not understand how the difference came to be. And I do not 
know what the "right" way is and how to proceed from here.


If I `pg_dump --create` some DB on the new server (13.7-1.pgdg18.04+1) I 
get:


CREATE DATABASE ... ENCODING = 'UTF8' LOCALE = 'en_US.utf-8';

When I do the same on the old server (12.8-1.pgdg20.04+1) I get:

CREATE DATABASE ... ENCODING = 'UTF8' LC_COLLATE = 'en_US.UTF-8' 
LC_CTYPE = 'en_US.UTF-8';


Any hints or help?
*t




Re: ERROR: new collation (en_US.UTF-8) is incompatible with the collation of the template database (en_US.utf-8)

2022-06-22 Thread Tomas Pospisek

On 22.06.22 21:25, Adrian Klaver wrote:

On 6/22/22 12:17, Tomas Pospisek wrote:

Hi all,

while doing `cat pg_dump.dump | psql` I get the above message. Note 
that `pg_dump.dump` contains:


 CREATE DATABASE some_db WITH TEMPLATE = my_own_template ENCODING 
= 'UTF8' LC_COLLATE = 'en_US.UTF-8' LC_CTYPE = 'en_US.UTF-8';


What is exactly the problem? I understand that en_US.UTF-8 and 
en_US.utf-8 is not *exactly* the same string.


However I do not understand how the difference came to be. And I do 
not know what the "right" way is and how to proceed from here.


If I `pg_dump --create` some DB on the new server (13.7-1.pgdg18.04+1) 
I get:


 CREATE DATABASE ... ENCODING = 'UTF8' LOCALE = 'en_US.utf-8';

When I do the same on the old server (12.8-1.pgdg20.04+1) I get:

 CREATE DATABASE ... ENCODING = 'UTF8' LC_COLLATE = 'en_US.UTF-8' 
LC_CTYPE = 'en_US.UTF-8';


Any hints or help?


Are dumping/restoring from one version of Postgres to another?


Yes, indeed!

If from older to newer then use the new version of pg_dump(13) to dump 
the older(12) database. Then the 13 version of restore to load the 
version 13 database.


I will. Thanks a lot Adrian!
*t




Re: ERROR: new collation (en_US.UTF-8) is incompatible with the collation of the template database (en_US.utf-8)

2022-06-23 Thread Tomas Pospisek

On 22.06.22 22:18, Tomas Pospisek wrote:

On 22.06.22 21:25, Adrian Klaver wrote:

On 6/22/22 12:17, Tomas Pospisek wrote:

Hi all,

while doing `cat pg_dump.dump | psql` I get the above message. Note 
that `pg_dump.dump` contains:


 CREATE DATABASE some_db WITH TEMPLATE = my_own_template ENCODING 
= 'UTF8' LC_COLLATE = 'en_US.UTF-8' LC_CTYPE = 'en_US.UTF-8';


What is exactly the problem? I understand that en_US.UTF-8 and 
en_US.utf-8 is not *exactly* the same string.


However I do not understand how the difference came to be. And I do 
not know what the "right" way is and how to proceed from here.


If I `pg_dump --create` some DB on the new server 
(13.7-1.pgdg18.04+1) I get:


 CREATE DATABASE ... ENCODING = 'UTF8' LOCALE = 'en_US.utf-8';

When I do the same on the old server (12.8-1.pgdg20.04+1) I get:

 CREATE DATABASE ... ENCODING = 'UTF8' LC_COLLATE = 'en_US.UTF-8' 
LC_CTYPE = 'en_US.UTF-8';


Any hints or help?


Are dumping/restoring from one version of Postgres to another?


Yes, indeed!

If from older to newer then use the new version of pg_dump(13) to dump 
the older(12) database. Then the 13 version of restore to load the 
version 13 database.


I will. Thanks a lot Adrian!


So I used both pg_dump and pg_restore from the newer machine. Result is 
still the same.  So I'll use Tom Lane's suggestion too and fix the 
'UTF-8' spelling in the dump file:


Tom Lane wrote:

> This is probably more about dumping from different operating systems.
> The spelling of the locale name is under the control of the OS,
> and Postgres doesn't know very much about the semantics of it
> (so I think we conservatively assume that any difference in
> spelling is significant).
>
> Best bet might be to edit the dump file to adjust the locale
> spellings to match your new system.
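
For the record, that edit boils down to something like this (double-check
which spelling your new system actually uses before running it):

   sed -i 's/en_US\.UTF-8/en_US.utf-8/g' pg_dump.dump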

Many thanks to both Tom & Adrian!!!
*t




Re: ERROR: new collation (en_US.UTF-8) is incompatible with the collation of the template database (en_US.utf-8)

2022-06-24 Thread Tomas Pospisek

On 23.06.22 20:21, Adrian Klaver wrote:

On 6/23/22 10:11, Adrian Klaver wrote:

On 6/23/22 00:37, Tomas Pospisek wrote:

On 22.06.22 22:18, Tomas Pospisek wrote:

On 22.06.22 21:25, Adrian Klaver wrote:

On 6/22/22 12:17, Tomas Pospisek wrote:




So I used both pg_dump and pg_restore from the newer machine. Result 
is still the same.  So I'll use Tom Lane's suggestion too and fix the 
'UTF-8' spelling in the dump file:


Not sure why that is necessary? It seems this is low-hanging fruit 
that could be dealt with by the equivalent of lower('en_US.UTF-8') = 
lower('en_US.utf-8').


Well that was clear as mud.


:-D

My point was that I don't see why the end 
user should have to do this when it could be handled internally in the 
pg_restore code.


That would indeed be very friendly of pg_restore if it'd take that 
little task off the user :-)


+1 of course :-)

*t




Re: PostgreSQL 14.4 ERROR: out of memory issues

2022-07-14 Thread Tomas Vondra



On 7/14/22 21:25, Aleš Zelený wrote:
> 
> st 13. 7. 2022 v 2:20 odesílatel Michael Paquier  <mailto:mich...@paquier.xyz>> napsal:
> 
> On Mon, Jul 11, 2022 at 10:50:23AM +0200, Aleš Zelený wrote:
> > So far, it has happened three times (during a single week) from
> the 14.3 ->
> > 14.4 upgrade, before 14.4 we haven't suffered from such an issue.
> >
> > Questions:
> > 1)  Can we safely downgrade from 14.4 to 14.3 by shutting down the
> instance
> > and reinstalling 14.3 PG packages (to prove, that the issue
> disappear)?
> > 2) What is the best way to diagnose what is the root cause?
> 
> Hmm.  14.4 has nothing in its release notes that would point to a
> change in the vacuum or autovacuum's code paths:
> https://www.postgresql.org/docs/14/release-14-4.html#id-1.11.6.5.4
> <https://www.postgresql.org/docs/14/release-14-4.html#id-1.11.6.5.4>
> 
> There is nothing specific after a look at the changes as of, and I am
> not grabbing anything that would imply a change in memory context
> handling either:
> `git log --stat REL_14_3..REL_14_4`
> `git diff REL_14_3..REL_14_4 -- *.c`
> 
> Saying that, you should be able to downgrade safely as there are no
> changes in WAL format or such that would break things.  Saying that,
> the corruption issue caused by CONCURRENTLY is something you'd still
> have to face.
> 
> 
> Thanks, good to know that, we can use it for a test case, since we
> already hit the CONCURRENTLY bug on 14.3.
> 
> > 2022-07-02 14:48:07 CEST [3930]: [3-1] user=,db=,host=,app=
> ERROR:  out of
> > memory
> > 2022-07-02 14:48:07 CEST [3930]: [4-1] user=,db=,host=,app=
> DETAIL:  Failed
> > on request of size 152094068 in memory context
> "TopTransactionContext".
> > 2022-07-02 14:48:07 CEST [3930]: [5-1] user=,db=,host=,app= CONTEXT:
> >  automatic vacuum of table "prematch.replication.tab_queue_tmp"
> 
> This is the interesting part.  Do you happen to use logical
> replication in a custom C++ plugin?
> 
> 
> We are using logical replication to other instances (pg_output) and
> decoderbufs
> https://github.com/debezium/postgres-decoderbufs
> <https://github.com/debezium/postgres-decoderbufs> for other applications.
>  

This is probably just a red herring - std::bad_alloc is what the process
that runs into the overcommit limit gets. But the real issue (e.g.
memory leak) is likely somewhere else - different part of the code,
different process ...

> ...
> 
> Checking the RssAnon from proc/pid/status I've found some points where
> RssAnon memory usage grew very steep for a minute, but no "suspicious"
> queries/arguments were found in the instance logfile.
>  
> Any hint, on how to get the root cause would be appreciated since so far
> I've failed to isolate the issue reproducible testcase.
> At least I hope that looking for the RssAnon process memory is an
> appropriate metric, if not, let me know and I'll try to update the
> monitoring to get the root cause.
> 
> I can imagine a workaround with client application regular reconnect...,
> but u to 14.3 it works, so I'd like to fix the issue either on our
> application side or at PG side if it is a PG problem.
> 

I think it'd be interesting to get memory context stats from the
processes consuming a lot of memory. If you know which processes are
suspect (and it seems you know, because if a reconnect helps it's the
backend handling the connection), you can attach a debugger and do

   $ gdb -p $PID
   call MemoryContextStats(TopMemoryContext)

which will log info about memory contexts, just like autovacuum.
Hopefully that tells us memory context is bloated, and that might point
us to particular part of the code.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Expr. extended stats are skipped with equality operator

2022-08-07 Thread Tomas Vondra



On 8/5/22 11:50, Danny Shemesh wrote:
> Hey all !
> 
> I'm on a quest to help the planner (on pg14) use the best of several
> partial, expressional indices we have on some large tables (few TBs in
> size, billions of records).
> 
> As we know, stats for expressions in partial indices aren't gathered by
> default - so I'm tinkering with expressional extended stats to cover for
> those.
> 
> I've tackled two interesting points there:
> 1. Seems like expressional stats involving the equality operator are
> skipped or mismatched (fiddle
> <https://www.db-fiddle.com/f/4jyoMCicNSZpjMt4jFYoz5/5379>)
> Let's take the following naive example:
> /create table t1 (x integer[]);
> insert into t1 select array[1]::integer[] from generate_series(1,
> 10, 1);
> create statistics s1 on (x[1] = 1) from t1;
> analyze t1;
> /
> /explain analyze select * from t1 where x[1] = 1;/
> /> Seq Scan on t1 (cost=0.00..1986.00 rows=500 width=29) (actual
> time=0.009..36.035 rows=10 loops=1)/
> /
> /
> Now, of course one can just create the stat on x[1] directly in this
> case, but I have a more complex use case where an equality operator is
> beneficial; 
> should the above case be supported ? feels like I'm just missing
> something fundamental.
> 

Hmmm. The problem here is that the expression may be interpreted either
as an operator clause (A=B), or as a boolean expression ((A=B)=TRUE). In
principle we might check which option matches extended statistics, but
that's not for free :-(

So what the current code does is trying to match the more specific cases
first, leaving the "bool expression" as a last resort. That seems like a
reasonable trade off to me, but I'll think about this a bit more.

There are probably other ways to match expressions, and we can't
possibly explore all of them. For example you may create statistics on
(x=1 AND y=2) and I doubt we'll match that, because we'll try matching
individual clauses not some arbitrary combinations of clauses. (Maybe we
shouldn't even allow creating such statistics ...)

> 2. Less important, just a minor note - feel free to ignore - although
> the eq. operator above seems to be skipped when matching the ext. stats,
> I can work around this by using a CASE expression (fiddle
> <https://www.db-fiddle.com/f/wJZNH1rNwJSo3D5aByQiWX/1>);
> Building on the above example, we can:
> /create statistics s2 on (case x[1] when 1 then true else false end)
> from t1;/
> /explain analyze select * from t1 where (case x[1] when 1 then true else
> false end/
> />  Seq Scan on t1 (cost=0.00..1986.00 rows=10 width=25) (actual
> time=0.011..33.721 rows=10 loops=1)/
> /

Yes, because this ends up not being matched as an opclause, and therefore
goes all the way to boolvarsel() in clause_selectivity_ext.

> /
> What's a bit problematic here, though, is that if we mix other dependent
> columns to the extended stat, and specifically if we create an mcv, 
> queries involving the CASE expression throw with `error: unknown clause
> type 130`, where clause type == T_CaseExpr.
> 
> The second point for me would be that I've found it a bit non intuitive
> that creating an extended statistic can fail queries at query time; it
> makes sense that the mcv wouldn't work for case expressions, but it
> might've been a bit clearer to:
> 
> a. Fail this at statistic creation time, potentially, or 
> b. Convert the type numeric in the above error to its text
> representation, if we can extract it out at runtime somehow - 
> I couldn't find a mapping of clause type numerics to their names, and as
> the node tags are generated at compile time, it could be build-dependent
> and a bit hard to track down if one doesn't control the build flags
> 

Yeah, this seems like a clear bug - we should not fail queries like
this. It's a sign statext_is_compatible_clause() and the MCV code
disagrees which clauses are compatible.

Can you share an example triggering this?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Expr. extended stats are skipped with equality operator

2022-08-07 Thread Tomas Vondra
On 8/7/22 19:28, Tom Lane wrote:
> Tomas Vondra  writes:
>> Yeah, this seems like a clear bug - we should not fail queries like
>> this. It's a sign statext_is_compatible_clause() and the MCV code
>> disagrees which clauses are compatible.
> 
> Indeed.  I attempted to fix that at e33ae53dd.
> 

Thanks! That probably explains why I've been unable to reproduce it - I
hadn't realized there's a fix already.



-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




can't get psql authentication against Active Directory working

2023-02-18 Thread Tomas Pospisek

Hello all,

so I'm trying to authenticate psql (on Windows) -> postgres (on Linux) 
via Active Directory.


psql (Linux) -> postgres (Linux) with authentication against Active 
Directory does work.


However the same with psql.exe on Windows does not. I get:

D:\>C:\OSGeo4W\bin\psql.exe service=the_db
psql: error: connection to server at "dbserver.example.lan
(192.168.4.104), port 5432 failed: could not initiate GSSAPI
security context: No credentials were supplied, or the credentials
were unavailable or inaccessible: Internal credentials cache error

psql.exe from the OSGeo4W QGIS Installer *does* include GSS support. (I 
have tried with a different psql.exe without GSS support and it would 
tell me that it does not support GSS).


The .pg_service.conf file in the users $HOME directory looks like this:

[the_db]
host=dbserver.example.lan
port=5432
user=u...@example.lan
gssencmode=require

This same pg_service.conf does work for psql (Linux).

Also getting a Kerberos ticket for the service on Windows does work:

D:\> klist get postgres/dbserver.example@example.lan
[...]
This will list the ticket

But when using psql.exe it will not get a ticket for the service nor 
will it apparently use the existing service ticket.


I have tried to trace psql.exe with Windows' Process Monitor and I can't 
see it accessing any keytab file (I'm not sure whether a keytab file 
exists at all under Windows or if psql.exe instead needs to 
access some Windows service). I see that psql.exe will open and close a 
TCP connection to dbserver.example.lan, however as far as I can see that 
connection is completely irrelevant for the acquisition of a Kerberos 
ticket for the service, since that is a business purely between psql.exe 
and Active Directory, or respectively between psql.exe and the 
credentials cache. And there is no other TCP connection being opened to 
anywhere from psql.exe.


What I find suspicious about the error above is "... the credentials 
were unavailable or inaccessible: Internal credentials cache error", 
since that looks like either psql.exe can't access the (inexisting) 
keytab file, or it can't access Window's Kerberos service.


Also, I see that psql.exe is trying to access a ccapiserver.exe which 
does not exist. Should psql.exe be able to access that ccapiserver.exe 
file? That means is the OSGeo4W QGIS installer, that also installs all 
things necessary for psql missing that ccapiserver.exe executable?


* has anybody ever succeeded in authenticating with psql.exe against 
Active Directory?

* can you maybe tell me what's wrong from the error message above?
* how can I proceed from here?

Thanks a lot for any pointers and/or help!!!
*t

PS: Any way to make GSS more talkative? At this moment all that I can 
get as logs is the above "computer says no".





Re: can't get psql authentication against Active Directory working

2023-02-18 Thread Tomas Pospisek

On 18.02.23 17:16, Erik Wienhold wrote:

On 18/02/2023 15:02 CET Tomas Pospisek  wrote:

so I'm trying to authenticate psql (on Windows) -> postgres (on Linux)
via Active Directory.

psql (Linux) -> postgres (Linux) with authentication against Active
Directory does work.

However the same with psql.exe on Windows does not. I get:

  D:\>C:\OSGeo4W\bin\psql.exe service=the_db
  psql: error: connection to server at "dbserver.example.lan
  (192.168.4.104), port 5432 failed: could not initiate GSSAPI
  security context: No credentials were supplied, or the credentials
  were unavailable or inaccessible: Internal credentials cache error

psql.exe from the OSGeo4W QGIS Installer *does* include GSS support. (I
have tried with a different psql.exe without GSS support and it would
tell me that it does not support GSS).

The .pg_service.conf file in the users $HOME directory looks like this:

  [the_db]
  host=dbserver.example.lan
  port=5432
  user=u...@example.lan
  gssencmode=require

This same pg_service.conf does work for psql (Linux).


On Windows the service file is not read from $home/.pg_service.conf but
$env:appdata/postgresql/.pg_service.conf (or 
%appdata%/postgresql/.pg_service.conf
when using cmd.exe.)


Thank you Erik,

the location of the .pg_service.conf file is not the problem, as I am 
seeing that psql.exe is reacting to its contents, that is, authenticating 
differently depending on whether gssencmode is set or not (maybe I noted 
its location wrongly, since I've written the post from my notes, as I do 
not have access to the machines in question all the time).


Do you have authentication from psql.exe against Active Directory working?
*t




Re: can't get psql authentication against Active Directory working

2023-02-21 Thread Tomas Pospisek

Hi Stephen,

first: thanks a lot for replying!

On 20.02.23 15:17, Stephen Frost wrote:


* Tomas Pospisek (t...@sourcepole.ch) wrote:

so I'm trying to authenticate psql (on Windows) -> postgres (on Linux) via
Active Directory.


Looks like you're trying to do a bit more than that as you're using
GSSAPI (not SSPI, which would be the more typical method on Windows..)
and you're asking for an encrypted connection.  Is there a reason you're
going down this particular route..?


The reason I'm using the GSSAPI is that it's the only documented way to do 
authentication from psql.exe against Active Directory that I was able to 
find. Even now I am unable to find *any* documentation on how to go 
about doing auth from psql.exe to AD with SSPI. Would you have any 
pointers to documentation or a howto (or a blog post or a stackoverflow 
answer or...).


The reason I am asking for encrypted communications is that I 
apparently misinterpreted what `gssencmode` does: I thought that it would 
force the use of GSS, but apparently it forces connection encryption?


How do I force psql.exe to use GSS and/or SSPI then (instead of 
username/password authentication that it seems to be doing by default).


Maybe the reason I'm using GSS is because what I really want to achieve 
is to authenticate from QGIS with AD. QGIS (from osgeo4w) is shipping a 
libpq that is linked against the kerberos library, so that's what I 
have. I have no idea how libpq needs to be compiled so as to be able to 
do SSPI?


Also, just in case I do find out how to get my hands on (or build) a 
psql/libpq that *can* do SSPI - would I be able to swap out the 
existing libpq for the libpq with SSPI support, and would the dependent 
executables still be supposed to work?


Or am I completely misunderstanding everything?


psql (Linux) -> postgres (Linux) with authentication against Active
Directory does work.


That's good.  One thing to know about Linux vs. Windows when it comes to
Kerberos (and GSSAPI/SSPI) is that the default service name is different
between them.  See the libpq parameter krbsrvname documentation here:

https://www.postgresql.org/docs/current/libpq-connect.html


I quote that document:

"krbsrvname

Kerberos service name to use when authenticating with GSSAPI. This 
must match the service name specified in the server configuration for 
Kerberos authentication to succeed. (See also Section 21.6.) The default 
value is normally postgres, but that can be changed when building 
PostgreSQL via the --with-krb-srvnam option of configure. In most 
environments, this parameter never needs to be changed. Some Kerberos 
implementations might require a different service name, such as 
Microsoft Active Directory which requires the service name to be in 
upper case (POSTGRES).

"

I'm using the postgres server from apt.postgresql.org. I do not know 
whether it has been built with `--with-krb-srvnam` but as far as I can 
find out it was not.


Could you please give me a working example? Assuming I need to set 
`krbsrvname=POSTGRES` then where would I have to set that? In 
pg_service.conf? Or in krb5.conf? Or somewhere else?
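
(My guess - since krbsrvname is listed as a libpq connection parameter -
would be the service file, i.e. something along these lines; just a
sketch, not a tested config:

  [the_db]
  host=dbserver.example.lan
  port=5432
  user=u...@example.lan
  gssencmode=require
  krbsrvname=POSTGRES

but I'd appreciate confirmation.)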



However the same with psql.exe on Windows does not. I get:

 D:\>C:\OSGeo4W\bin\psql.exe service=the_db
 psql: error: connection to server at "dbserver.example.lan
 (192.168.4.104), port 5432 failed: could not initiate GSSAPI
 security context: No credentials were supplied, or the credentials
 were unavailable or inaccessible: Internal credentials cache error


That's a bit of an odd message to get if krbsrvname is this issue
though.


psql.exe from the OSGeo4W QGIS Installer *does* include GSS support. (I have
tried with a different psql.exe without GSS support and it would tell me
that it does not support GSS).


Providing the specific messages would possibly be helpful here..


The message I get when libpq is not compiled against GSS is:

gssencmode value "require" invalid when GSSAPI support is not 
compiled in



The .pg_service.conf file in the users $HOME directory looks like this:

 [the_db]
 host=dbserver.example.lan
 port=5432
 user=u...@example.lan
 gssencmode=require

This same pg_service.conf does work for psql (Linux).

Also getting a Kerberos ticket for the service on Windows does work:

 D:\> klist get postgres/dbserver.example@example.lan
 [...]
 This will list the ticket


Is this using MIT klist or Windows klist though?


It's using Windows klist.


But when using psql.exe it will not get a ticket for the service nor will it
apparently use the existing service ticket.


As mentioned above, GSSAPI and SSPI aren't the same thing and what I
suspect you're seeing here is that the Windows klist is using SSPI but
the psql you have is built with GSSAPI.  There used to be a Kerberos For
Windows utility that would make the bridge b

Re: can't get psql authentication against Active Directory working

2023-02-24 Thread Tomas Pospisek

Hi Stephen,

again thanks a lot for trying to help me!

On 21.02.23 16:29, Stephen Frost wrote:


* Tomas Pospisek (t...@sourcepole.ch) wrote:

On 20.02.23 15:17, Stephen Frost wrote:

* Tomas Pospisek (t...@sourcepole.ch) wrote:

so I'm trying to authenticate psql (on Windows) -> postgres (on Linux) via
Active Directory.


Looks like you're trying to do a bit more than that as you're using
GSSAPI (not SSPI, which would be the more typical method on Windows..)
and you're asking for an encrypted connection.  Is there a reason you're
going down this particular route..?


The reason I'm using the GSSAPI is it's the only documented way to do
authentication from psql.exe against Active Directory that was able to find.


I mean ... https://www.postgresql.org/docs/current/sspi-auth.html


Where should the settings that are discussed in that document be 
configured? In pg_hba.conf?


That document says:

"SSPI authentication only works when both server and client are running 
Windows, or, on non-Windows platforms, when GSSAPI is available."


I interpret that phrase like this:

* there's a case where both server and client are running Windows 
(doesn't apply to me)
* there's a case where both are running non-Windows (doesn't apply to me 
either - server is Linux, client is Windows)


So that's an extremely terse docu that is not clear at all to me.

I'd suggest to change that phrase into:

"SSPI authentication works when both server and client are running 
Windows. When server or client are on non-Windows platforms then those 
need to use GSSAPI."


- assuming that my interpretation of that super terse docu is correct of 
course.



Even now I am unable to find *any* documentation on how to go about doing
auth from psql.exe to AD with SSPI. Would you have any pointers to
documentation or a howto (or a blog post or a stackoverflow answer or...).


Sure, here's a blog post that I wrote about doing exactly that:

https://www.crunchydata.com/blog/windows-active-directory-postgresql-gssapi-kerberos-authentication


Note that that document doesn't mention SSPI a single time.

The ultra terse Postgres docu is using terms such as "SAM", "negotiate 
mode", "UPN", without a reference to their definition, which leaves to 
Windows noobs like me (and equaly Windows non-noobs) without a clue. 
Searching "kerberos SAM" or "SSPI negotiate mode" on duckduckgo doesn't 
seem to lead to useful hits. So the documentation seems to be geared to 
experts of both Kerberos and Active Directory and Windows technology 
alike which as far as I can tell is a very rare specimen.



The reason I am apparently asking for encrypted communications is that I
apparently misinterpreted what `gssencmode`: I thought that it would force
the use of GSS but apparently it's forcing a connection encryption?


gssencmode indeed requires a GSS encrypted connection.  Unfortunately,
there isn't a way to force GSS or SSPI be used for authentication today
(outside of GSS with encryption) from the client side, though that's
something that's been discussed and I think most people feel would be
good to have.  That said, if the server is configured in pg_hba for gss
(or sspi), then the client will try to authenticate that way.


As far as I can see it is *not possible* to configure the server for 
SSPI ("or sspi")? It's only possible to configure it for GSS *and* SSPI 
at the same time (?) assuming the server is running on a non-Windows 
platform?



How do I force psql.exe to use GSS and/or SSPI then (instead of
username/password authentication that it seems to be doing by default).


The server's hba needs to be configured for gss (if the server is Linux)
or sspi (if the server is on Windows) and then the client will be
informed that's what the server wants when it connects and it'll try to
authenticate that way.  It's not possible currently to force the client
to only try GSS or SSPI, what it will try is mostly driven by the server
side saying what the server wants the client to authenticate with.


Maybe the reason I'm using GSS is because what I really want to achieve is
to authenticate from QGIS with AD. QGIS (from osgeo4w) is shipping a libpq
that is linked against the kerberos library, so that's what I have. I have
no idea how libpq needs to be compiled so as to be able to do SSPI?


PostgreSQL, when built on Windows (including libpq), is always built
with SSPI support.  That said, if it's also built with GSSAPI, then SSPI
is preferred but you can set 'gsslib' to 'gssapi' to prefer gssapi be
used instead.  Now, if you're setting gssencmode to prefer or require
and the library is built with GSSAPI then we're going to try GSSAPI
first to try to set up the encrypted connection.


Also, just in case I would find ou

Re: can't get psql authentication against Active Directory working

2023-03-10 Thread Tomas Pospisek

Hi Stephen,

On 25.02.23 00:52, Stephen Frost wrote:


* Tomas Pospisek (t...@sourcepole.ch) wrote:

>>

On 21.02.23 16:29, Stephen Frost wrote:

* Tomas Pospisek (t...@sourcepole.ch) wrote:

On 20.02.23 15:17, Stephen Frost wrote:

* Tomas Pospisek (t...@sourcepole.ch) wrote:

>>

That [SSPI] document says:

"SSPI authentication only works when both server and client are running
Windows, or, on non-Windows platforms, when GSSAPI is available."

I interpret that phrase like this:

* there's a case where both server and client are running Windows (doesn't
apply to me)
* there's a case where both are running non-Windows (doesn't apply to me
either - server is Linux, client is Windows)

So that's an extremely terse docu that is not clear at all to me.

I'd suggest to change that phrase into:

"SSPI authentication works when both server and client are running Windows.
When server or client are on non-Windows platforms then those need to use
GSSAPI."

- assuming that my interpretation of that super terse docu is correct of
course.


No, that's not correct.  The blog post that I referenced is actually
using SSPI on the client and GSS on the server and it works because
they're compatible with each other on the wire.  Perhaps we could
improve the documentation you reference above but it's not actually
wrong as-is.  Perhaps this would be clearer:

SSPI and GSS are wireline compatible and can be mixed and matched
between clients and servers (where support for GSS is built into the
client library or the server), provided that there is an encryption
method which both will accept.

The encryption method bit is more of a historical artifact at this
point as modern systems have compatible AES-based encryption methods,
but that wasn't always the case.  It's also possible that it'll become
an issue in the future but at least for the moment most installations
have an AES-based compatible encryption method.


I have posted a suggestion for an improvement (via the form) to the 
pgsql-docs mailing list. Lets see what comes out of that.


I like the idea to document the *why* behind SSPI and GSS, however I 
don't understand enough about it to propose a documentation improvement.



Even now I am unable to find *any* documentation on how to go about doing
auth from psql.exe to AD with SSPI. Would you have any pointers to
documentation or a howto (or a blog post or a stackoverflow answer or...).


Sure, here's a blog post that I wrote about doing exactly that:

https://www.crunchydata.com/blog/windows-active-directory-postgresql-gssapi-kerberos-authentication


Note that that document doesn't mention SSPI a single time.


It doesn't actually need to because SSPI is the default and it all just
works..


I can confirm the "just works" part since my setup now indeed *does* work.

However the existing documentation on GSS/Kerberos/SSPI/Active directory 
is extremely terse and sparse. So being more explicit and being clearer 
would improve the situation I think. Are you able and interested in 
changing the blog article?


I'd suggest to add a paragraph like this:

Using a Postgresql client on Windows

If you use a Postgresql client on Windows then that client should
be able to automatically use Windows' SSPI API, which will do the
authentication via Active Directory without any further
configuration necessary.


I suppose I could have added in somewhere "SSPI is used on the
client for this", perhaps, but the blog post was more focused on "this
is what you do to make it work" and you don't really need to know that
the API that the client is using in this case happens to be called SSPI.


The ultra terse Postgres docu is using terms such as "SAM", "negotiate
mode", "UPN", without a reference to their definition, which leaves to
Windows noobs like me (and equaly Windows non-noobs) without a clue.


I'm certainly all for improving the documentation.  Not sure that just
spelling those out would really be enough and I'm rather confident that
trying to explain all of how Kerberos and AD work wouldn't really be
maintainable in our documentation.  Still, specific suggestion on how we
can improve would be great.


+1 - see my post to the pgsql-docs mailing list for a starter.


Searching "kerberos SAM" or "SSPI negotiate mode" on duckduckgo doesn't seem
to lead to useful hits. So the documentation seems to be geared to experts
of both Kerberos and Active Directory and Windows technology alike which as
far as I can tell is a very rare specimen.


Active Directory experts certainly should be generally familiar with
Kerberos (and the terms you reference above are actually all AD ones,
really..), so I'd argue that the documentation is mainly focused on
helping those individuals.  I do generally 

Re: Getting error while upgrading postgres from version 12 to 13

2023-03-18 Thread Tomas Pospisek

On 18.03.23 22:24, shashidhar Reddy wrote:
And adding to this, the following is also showing in the error; we tried 
changing the file permissions but still no luck:
FATAL:  could not load private key file 
"/etc/ssl/private/ssl-cert-snakeoil.key": key values mismatch


This sounds like the private SSL key you have configured is not the 
private key for the public key you have configured - they do not match.

*t





Re: Getting error while upgrading postgres from version 12 to 13

2023-03-18 Thread Tomas Pospisek
On Sun, Mar 19, 2023 at 2:09 AM shashidhar Reddy 
mailto:shashidharreddy...@gmail.com>> wrote:


Hello,

I am in a process of upgrading postgresql from version 12 to 13
using pg_upgrdrade but I am getting error stating

  FATAL:  database files are incompatible with server
  DETAIL:  The data directory was initialized by PostgreSQL version 13
, which is not compatible with this version 12.14 (Ubuntu
12.14-1.pgdg18.04+1)

Please help me on this issue.


You have configured the postgres v12 server to use a data directory that 
contains data from a postgres v13 server. Evidently that cannot work...

*t





Re: Properly handle OOM death?

2023-03-18 Thread Tomas Pospisek

On 13.03.23 21:25, Joe Conway wrote:

Hmm, well big +1 for having swap turned on, but I recommend setting 
"vm.overcommit_memory=2" even so.


I've snipped out the context here, since my advice is very unspecific: 
do use swap only as a safety net. Once your system starts swapping 
performance goes down the toilet.

*t





Re: backend crash on DELETE, reproducible locally

2018-11-06 Thread Tomas Vondra
On 11/6/18 10:54 PM, Andres Freund wrote:
> On 2018-11-06 16:47:20 -0500, Tom Lane wrote:
>> =?UTF-8?Q?Ond=c5=99ej_Bouda?=  writes:
>>>> Ondřej, as a short-term workaround you could prevent the crash
>>>> by setting that index's recheck_on_update property to false.
>>
>>> Thanks for the tip. I am unsuccessful using it, though:
>>> # ALTER INDEX public.schedulecard_overlap_idx SET (recheck_on_update = 
>>> FALSE);
>>> ERROR:  unrecognized parameter "recheck_on_update"
>>
>> Oh, for crying out loud.  That's yet a different bug.
>> I'm not sure that it's the fault of the recheck_on_update
>> feature proper though; it might be a pre-existing bug in
>> the reloptions code.  Looks like somebody forgot to list
>> RELOPT_KIND_GIST in RELOPT_KIND_INDEX, but is that the
>> fault of commit c203d6cf8 or was it busted before?
> 
> Looks new:
> +   RELOPT_KIND_INDEX = 
> RELOPT_KIND_BTREE|RELOPT_KIND_HASH|RELOPT_KIND_GIN|RELOPT_KIND_SPGIST,
> 
> there aren't any other "for all indexes" type options, so the whole
> category didn't exist before.
> 
> It also strikes me as a really bad idea, even if RELOPT_KIND_GIST
> wouldn't have been omitted: It breaks index am extensibility.
> 

Does it? The RELOPT_KIND_* stuff is hard-coded in reloptions.h anyway,
so I'm not sure how this particular thing makes it less extensible?

That being said, we also have RELOPT_KIND_BRIN, and that seems to be
missing from RELOPT_KIND_INDEX too (and AFAICS the optimization works
for all index types).

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: integrate Postgres Users Authentication with our own LDAP Server

2019-05-09 Thread Tomas Vondra

On Thu, May 09, 2019 at 07:11:24AM +, M Tarkeshwar Rao wrote:

We  want to setup ldap authentication in pg_hba.conf, for Postgresql
users(other than  postgres super user).

We are getting issue with special characters by following steps given in
postgres documentation.  It is not accepting any special characters as
special characters are mandatory in our use case.

Can you please help us or have you any steps by which we can configure
any postgres with LDAP?


Please don't cross-post - this is a fairly generic question, it has
nothing to do with performance or development, so the right thing is to
send it to pgsql-general. Likewise, it makes little sense to send
questions to the "owner". I've removed the other lists from CC.

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: Regarding SSL Enablement in PostgreSQL Database on different port

2023-05-02 Thread Tomas Pospisek

On 02.05.23 12:13, Magnus Hagander wrote:

On Tue, May 2, 2023 at 11:43 AM sujay kadam  wrote:


Hi PostgreSQL Team,


I want to enable SSL in PostgreSQL Database on a new port.

I don’t want the default port that is 5432 SSL enabled, but I want to configure 
another port to enable SSL on it.

As per my requirement, I cannot use the same port for normal connection and SSL 
connection.

Hence, we require a new port to be SSL enabled.


Please guide us with proper information and links to achieve the above task.


That is now how SSL in PostgreSQL works. It will always run on the
same port, and PostgreSQL will only listen on one port.

You can probably do some hacky solution to it by running something
like pgbouncer on a different port and enable SSL only in that one.
But it will be a hack. I would recommend instead reviewing your
requirements and see if you can make them work with how PostgreSQL is
designed.


Oh, I think your idea to use pgbouncer to take care of the SSL 
termination is elegant. I don't think I'd characterize it as a hack 
if properly set up. Why do you consider it a hack?
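
For illustration, a properly set up instance would need little more than
something like this (just a sketch - option names as I remember them
from pgbouncer's docs, paths and names are placeholders):

  [databases]
  mydb = host=127.0.0.1 port=5432 dbname=mydb

  [pgbouncer]
  listen_addr = *
  listen_port = 6432
  auth_type = md5
  auth_file = /etc/pgbouncer/userlist.txt
  client_tls_sslmode = require
  client_tls_cert_file = /etc/ssl/certs/pgbouncer.crt
  client_tls_key_file = /etc/ssl/private/pgbouncer.key

i.e. clients get SSL on port 6432 while the plain 5432 stays as it is.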

*t





Re: Question regarding specifics of GIN and pg_trgm performance and potential use of show_trgm to improve it

2023-05-25 Thread Tomas Vondra
2273.256 ms
> (13 rows)
> 
> 
> Thank you for any sort of insight into this.

I don't think indexing the number of trigrams like this can help, and
I'm not sure how to improve this (at least for the built-in GIN). It
seem similarity searches are bound to be proportional to the most
frequent trigram in the query.

I wonder if the "newer" GIN variants like RUM [1] could improve this,
but I don't think it has trgm opclasses.


regards

[1] https://github.com/postgrespro/rum

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: pb with join plan

2023-06-21 Thread Tomas Vondra
On 6/21/23 00:26, Marc Millas wrote:
> 
> 
> On Tue, Jun 20, 2023 at 11:19 PM David Rowley  <mailto:dgrowle...@gmail.com>> wrote:
> 
> On Wed, 21 Jun 2023 at 08:34, Marc Millas  <mailto:marc.mil...@mokadb.com>> wrote:
> >
> > On Tue, Jun 20, 2023 at 10:14 PM David Rowley
> mailto:dgrowle...@gmail.com>> wrote:
> >>
> >> On Wed, 21 Jun 2023 at 07:42, Marc Millas  <mailto:marc.mil...@mokadb.com>> wrote:
> >> > But if I do the same with clause one OR clause 2, I have to 
> kill the request after an hour, seeing the filesystem showing more
> than 140 Mb of increased usage.
> >>
> >>
> >> > link to the anonymized plan of the req with one clause :
> https://explain.depesz.com/s/TWp4
> 
> link to the plan with the second
> clause alone: https://explain.depesz.com/s/byW5
> link to the plan with both clauses ORed (the one not
> finishing) https://explain.depesz.com/s/jHO2
> 
> 
> 
> It's quite difficult to know what the problem is you want to fix here.
> Your initial post indicated it was the query with the OR condition
> that was causing you the problems, but the plan you've posted has no
> OR condition?!
> 
> You're more likely to get help here if you take time to properly
> explain the situation and post the information that's actually
> relevant to the problem you're having, or state the problem more
> clearly, as there's a mismatch somewhere.
> 
> It might also be worth having a look at
> https://wiki.postgresql.org/wiki/Slow_Query_Questions . EXPLAIN is not
> going to tell us what part of the query is slow. I'll let the wiki
> page guide you into what to do instead.
> 
>  
> I know that page. obviously, as I have to kill the request, I cannot
> provide a explain analyze... 
> 

It's a bit weird the "victor" table is joined seemingly without any join
conditions, leading to a cross join (which massively inflates the cost
for joins above it). Maybe the anonymized plan mangles it somehow.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: move databases from a MySQL server to Postgresql.

2023-06-23 Thread Tomas Vondra
On 6/23/23 13:45, Alfredo Alcala wrote:
> Hello
> 
> I need to move some databases from a MySQL server to Postgresql.
> 
> Can someone tell me the migration procedure, tools and recommendations?
> 

I'm not an expert on this, but migrations tend to be somewhat
application-specific. I'd suggest you take a look at this wiki page:

https://wiki.postgresql.org/wiki/Converting_from_other_Databases_to_PostgreSQL#MySQL

and maybe try some of the tools mentioned there (pgloader, mysql2pgsql,
and so on).
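
For example, pgloader can usually migrate a whole database with a
one-liner along these lines (a sketch - the connection strings are
placeholders, check the pgloader docs for the details):

  pgloader mysql://user:password@mysql-host/sourcedb \
           postgresql://user:password@pg-host/targetdb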

You'll have to give it a try on your databases, and then ask questions
about practical issues you run into.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [Beginner Question]What is query jumbling?

2023-07-17 Thread Tomas Vondra



On 7/17/23 05:23, Wen Yi wrote:
> Hi community,
> I am reading the 'Generate code for query jumbling through
> gen_node_support.pl'
> (https://www.postgresql.org/message-id/E1pMk51-000puf-55%40gemulon.postgresql.org)
>  to understand the principle of the postgres's node, but I really don't know 
> the meaning of the word 'query jumbling'.
> 
> Can someone provide me some advice?
> Thanks in advance!
> 

Stuff like this is usually explained somewhere in the code base, so "git
grep" is your friend. In this case it's probably good to read the
comment in src/backend/nodes/queryjumblefuncs.c which says:

 * Normalization is implemented by fingerprinting queries, selectively
 * serializing those fields of each query tree's nodes that are judged
 * to be essential to the query. This is referred to as a query jumble.

So "jumbling" is a process to generate a "normalized" query.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Query take a long time and use no index

2023-07-17 Thread Tomas Vondra
arallel Seq Scan on data  (cost=0.00..188457.44
> rows=7452844 width=8) (actual time=0.137..2121.695 rows=5962263 loops=3)
>  Planning Time: 0.247 ms
>  JIT:
>    Functions: 11
>    Options: Inlining false, Optimization false, Expressions true,
> Deforming true
>    Timing: Generation 2.665 ms, Inlining 0.000 ms, Optimization 1.612
> ms, Emission 48.580 ms, Total 52.857 ms
>  Execution Time: 3525.656 ms
> (12 rows)
> 
> Why are the indexes not used ?

It's a common misconception that indexes necessarily improve performance
and that slow queries are slow because of missing indexes. In practice
indexes are most useful for selective queries, i.e. queries that scan
only a fraction of the table - say, a couple percent.

But this query scans ~17M rows from the table, and my guess is that's a
significant part of the table. In which case the index is useless.

Also, notice the seqscan took ~4s. Even if that drops to ~1s with an
index, the overall query will still take 12s.

> Is it a Problem of the DATE_TRUNC ?

My guess is it's more about the DISTINCT, which forces a sort before the
aggregate function. Which also means the query can't be parallel. You
can try simplifying the query step by step - remove the DISTINCT first,
then the DATE_TRUNC.

> How can I optimize?
> 

It's always going to be slow with the COUNT(DISTINCT), I'm afraid.

Not sure how much you can modify the query / database, and how accurate
results you need. If you're OK with estimates, you can try the postgres-hll
extension [1], which estimates count(distinct). For exact results, I
wrote the count_distinct extension [2], which uses a hashtable. Might be worth a
try, I guess.

Another option is to precalculate stuff, and do queries on that. That
is, you're not going to add data with old timestamps, so you can
pre-aggregate that.
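
A minimal sketch of what I mean (assuming a table "data" with a
timestamp column "ts" and the distinct-counted column being "user_id" -
adjust to your actual schema):

  CREATE MATERIALIZED VIEW daily_values AS
  SELECT date_trunc('day', ts) AS day, user_id
    FROM data
   GROUP BY 1, 2;

  -- the COUNT(DISTINCT) then runs on a much smaller relation
  SELECT day, count(DISTINCT user_id)
    FROM daily_values
   GROUP BY day;

plus a periodic REFRESH MATERIALIZED VIEW (or an incrementally
maintained table) for the recent data.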


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Query take a long time and use no index

2023-07-17 Thread Tomas Vondra
On 7/17/23 13:20, Tomas Vondra wrote:
> ...
> 
> It's always going to be slow with the COUNT(DISTINCT), I'm afraid.
> 
> Not sure how much you can modify the query / database, and how accurate
> results you need. If you're OK with estimates, you can try the postgres-hll
> extension [1], which estimates count(distinct). For exact results, I
> wrote the count_distinct extension [2], which uses a hashtable. Might be worth a
> try, I guess.
> 

Damn, I forgot to add the links:

[1] https://github.com/citusdata/postgresql-hll

[2] https://github.com/tvondra/count_distinct

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: How to grant read only functions execute permission to read only user

2023-07-17 Thread Tomas Vondra



On 7/17/23 16:11, Yi Sun wrote:
> Hello guys,
> 
> Our read only user is okay to be granted read only permission of tables.
> 
> How to grant read only functions execute permission to read only user,
> is there a simple way to do it please? If not, how to get the list of
> read only functions please? Then can grant one by one based on the list,
> thanks

What is read-only function? I don't think Postgres has anything like
that. Functions inherit the privileges of the user that executes them by
default. So if the user is read-only (i.e. has just SELECT privilege),
then the function can't do any writes either.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: error: connection to server on socket...

2023-08-01 Thread Tomas Pospisek

On 01.08.23 12:22, Amn Ojee Uw wrote:

Using the following command on my Debian 12 machine:

sudo apt-get install postgresql postgresql-15 postgresql-client-common 
postgresql-common postgresql-contrib postgresql-doc phppgadmin

and following the instruction on this 
 web page I have 
installed PostgreSQL-15.

The installation went smooth, until the following command was issued :

sudo -u postgres psql

... I get an error message that reads:

could not change directory to "/home/my_account": Permission denied
psql: error: connection to server on socket 
"/var/run/postgresql/.s.PGSQL.5432" failed: No such file or directory
    Is the server running locally and accepting connections on that socket?

What can I do to resolve this issue?


When executing the `sudo` command above:

* as which user are you doing that (you can see that via `id`)?
* in which directory are you executing it  (you can see that via `pwd`)?
* what is the home directory of the postgres user (you can see that via 
`grep postgres /etc/passwd`)
* what does `ls -ld /var/run/postgresql/; ls -l 
/var/run/postgresql/.s.PGSQL.5432` show?


You also wrote:

> I tried this, but to no avail : `sudo systemctl start postgresql`

- what does `journalctl -xeu postgresql` say?
- what does `tail -n 20 
/var/log/postgresql/postgresql-*-qgisclouddb.log` show?


*t





Re: PORT 5432 ERROR

2023-08-01 Thread Tomas Pospisek

On 01.08.23 01:18, David Beck wrote:
I am receiving an error when I enter the server port number to listen 
to—installed on all available versions, both 32-bit and 64-bit when 
applicable, even the latest version, 15.3 x64 for Windows. I downloaded 
the latest version as an example, and as I was told, I used the 5432 
port as the default port needed. I get a pop-up afterward saying, 
"Warning: The specified port number is not available. Please enter a 
different port number," than roadblocks at entering the default port 
number. I found port 5434 available though it locks at a loading screen 
"loading server Postgre version 7.4. Version 10 works on port 5434 
though the server list tree spins eternally.


Any solution or assistance would be appreciated.


How/where do you enter the server port?
*t





Re: Intermittent Issue with WAL Segment Removal in Logical Replication

2023-12-29 Thread Tomas Vondra
On 12/27/23 16:31, Kaushik Iska wrote:
> Hi all,
> 
> I'm including additional details, as I am able to reproduce this issue a
> little more reliably.
> 
> Postgres Version: POSTGRES_14_9.R20230830.01_07
> Vendor: Google Cloud SQL
> Logical Replication Protocol version 1
> 

I don't know much about Google Cloud SQL internals. Is it relatively
close to Postgres (as e.g. RDS) or are the internals very different /
modified for cloud environments?

> Here are the logs of attempt succeeding right after it fails:
> 
> 2023-12-27 01:12:40.581 UTC [59790]: [6-1] db=postgres,user=postgres
> STATEMENT:  START_REPLICATION SLOT peerflow_slot_wal_testing_2 LOGICAL
> 6/5AE67D79 (proto_version '1', publication_names
> 'peerflow_pub_wal_testing_2') <- FAILS
> 2023-12-27 01:12:41.087 UTC [59790]: [7-1] db=postgres,user=postgres
> ERROR:  requested WAL segment 000100060059 has already been
> removed
> 2023-12-27 01:12:44.581 UTC [59794]: [3-1] db=postgres,user=postgres
> STATEMENT:  START_REPLICATION SLOT peerflow_slot_wal_testing_2 LOGICAL
> 6/5AE67D79 (proto_version '1', publication_names
> 'peerflow_pub_wal_testing_2')  <- SUCCEEDS
> 2023-12-27 01:12:44.582 UTC [59794]: [4-1] db=postgres,user=postgres
> LOG:  logical decoding found consistent point at 6/5A31F050
> 
> Happy to include any additional details of my setup.
> 

I personally don't see how could this fail and then succeed, unless
Google does something smart with the WAL segments under the hood. Surely
we try to open the same WAL segment (given the LSN is the same), so how
could it not exist and then exist?

As Ron already suggested, it might be useful to see information for the
replication slot peerflow_slot_wal_testing_2 (especially the restart_lsn
value). Also, maybe show the contents of pg_wal (especially for the
segment referenced in the error message).
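
Something like this should be enough (a sketch):

  SELECT slot_name, active, restart_lsn, confirmed_flush_lsn
    FROM pg_replication_slots
   WHERE slot_name = 'peerflow_slot_wal_testing_2';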

Can you reproduce this outside Google cloud environment?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Intermittent Issue with WAL Segment Removal in Logical Replication

2023-12-29 Thread Tomas Vondra
On 12/29/23 22:28, Kaushik Iska wrote:
> I am unfortunately not really familiar with Google Cloud SQL internals
> as well. But we have seen this happen on Amazon RDS as well.
> 

Do you have a reproducer for regular Postgres?

> Could it be possible that we are requesting a future WAL segment, say
> WAL upto X is written and we are asking for X + 1? It could be that the
> error message is misleading.
> 

I don't think that should be possible. The LSN in the START_REPLICATION
comes from the replica, where it's tracked as the last LSN received from
the upstream. So that shouldn't be in the future. And it doesn't seem
to be suspiciously close to a segment boundary either.

In fact, the LSN in the message is 6/5AE67D79, but the "failed" segment
is 000100060059, which is the *preceding* one. So it can't
be in the future.

> I do not have the information from pg_replication_slots as I have
> terminated the test. I am fairly certain that I can reproduce this
> again. I will gather both the restart_lsn and contents of pg_wal for the
> failed segment. Is there any other information that would help debug
> this further?
> 

Hard to say. The best thing would be to have a reproducer script, ofc.
If that's not possible, the information already requested seems like a
good start.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: How do you optimize the disk IO when you cannot assume a file will start at a boundary then?

2024-02-22 Thread Tomas Vondra
On 2/22/24 02:22, Siddharth Jain wrote:
> Hi All,
> 
> I understand the storage layer in databases goes to great lengths to ensure:
> - a row does not cross a block boundary
> - read/writes/allocation happen in units of blocks
> etc. The motivation is that at the OS level, it reads and writes pages
> (blocks), not individual bytes. I am only concerned about SSDs but I think
> the principle applies to HDD as well.
> 
> but how can we do all this when we are not even guaranteed that the
> beginning of a file will be aligned with a block boundary? refer this
> <https://stackoverflow.com/questions/8018449/is-it-guaranteed-that-the-beginning-of-a-file-is-aligned-with-pagesize-of-file-s>
> .
> 
> Further, I don't see any APIs exposing I/O operations in terms of blocks.
> All File I/O APIs I see expose a file as a randomly accessible contiguous
> byte buffer. Would it not have been easier if there were APIs that exposed
> I/O operations in terms of blocks?
> 
> can someone explain this to me?
> 

The short answer is that this is well outside our control. We do the
best we can - split our data files into "our" 8kB pages - and hope that
the OS / filesystem will do the right thing to map this to blocks at the
storage level.

The filesystems do the same thing, to some extent - they align stuff
with respect to the beginning of the partition, but if the partition
itself is not properly aligned, that won't really work.

As for the APIs, we work with what we have in POSIX - I don't think
there are any APIs working with blocks, and it's not clear to me how
it would fundamentally differ from the APIs we have now. Moreover, it's
not really clear which "block" would matter. The postgres 8kB
page? The filesystem page? The storage block/sector size?

FWIW I think for SSDs this matters way more than for HDD, because SSDs
have to erase the space before a rewrite, which makes it much more
expensive. But that's not just about the alignment, but about the page
size (with smaller pages being better).


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Performance degradation after upgrading from 9.5 to 14

2024-04-17 Thread Tomas Vondra
On 4/17/24 19:13, Johnathan Tiamoh wrote:
> Hello,
> 
> 
> I performed an  upgrade from postgresql-9.5 to postgresql-14 and the
> performance has degraded drastically.
> 
> Please, is they any advice on getting performance back ?
> 

There's very little practical advice we can provide based on this
report, because it's missing any useful details. There's a number of
things that might have caused this, but we'd have to speculate.

For example:

1) How did you upgrade? pg_dump or pg_upgrade?

2) Did you run ANALYZE to collect statistics after the upgrade?

3) Did you transfer the configuration, or did you just create a new
cluster with the default values?

4) What exactly is slower? Queries? Inserts?

5) Can you quantify the impact? Is it 2x slower? 100x slower?


regards


-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Unexpected data when subscribing to logical replication slot

2024-05-08 Thread Tomas Vondra
Hi,

On 5/8/24 11:17, Daniel McKenzie wrote:
> We have a .NET application which subscribes to a logical replication slot
> using wal2json. The purpose of the application is to publish events to AWS
> SQS. We are required to first "enrich" these events by querying the
> database.
> 
> We have found that these queries will often find old data (i.e. the data
> that existed prior to the update) which is unexpected.
> 
> For example, when I use a psql terminal to update a user's last name from
> "Jones" to "Smith" then I would expect the enrichment query to find "Smith"
> but it will sometimes still find "Jones". It finds the old data perhaps 1
> in 50 times.
> 
> To reproduce this I use a psql terminal to execute an update statement
> which changes a user's last name with \watch 1.5 and monitor our
> application logs for cases where the wal2json output and the enrichment
> query output have different last names.
> 

Where/how does the enrichment query run? What does the whole process look
like? I guess an application is receiving decoded changes as JSON, and
then querying the database?

> We have compared transaction ids by adding include-xids to pg_recvlogical
> and adding txid_current() to to the enrich query and the txid_current() is
> always the xid + 1.
> 
> We have found two things that appear to resolve the problem -
> 
>- Using a more powerful EC2 instance. We can reproduce the issue with a
>r7a.medium instance but not with a r7a.large EC2 instance.
>- Changing the Postgres synchronous_commit parameter from "on" to "off".
>We cannot reproduce the issue with synchronous_commit set to "off".
> 
> We need help to understand this unexpected behaviour.
> 

Would be good to have some sort of reproducer - ideally a script that
sets up an instance + replication, and demonstrates the issue. Or at
least a sufficiently detailed steps to reproduce it without having to
guess what exactly you did.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




apt-archive.postgresql.org a bit broken?

2024-07-05 Thread Tomas Pospisek

Hello,

I get this when I try to access the archive of focal packages:

```
# apt-get update
[...]
Err:6 https://apt-archive.postgresql.org/pub/repos/apt focal-pgdg 
InRelease

  403  Forbidden [IP: 2600:9000:2761:4600:b:5965:3380:93a1 443]
[...]
E: Failed to fetch 
https://apt-archive.postgresql.org/pub/repos/apt/dists/focal-pgdg/InRelease 
 403  Forbidden [IP: 2600:9000:2761:4600:b:5965:3380:93a1 443]
E: The repository 'https://apt-archive.postgresql.org/pub/repos/apt 
focal-pgdg InRelease' is not signed.
N: Updating from such a repository can't be done securely, and is 
therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user 
configuration details.

```

And so I can't access old, archived focal packages. I guess this isn't 
wanted behavior?


In case I should direct this problem report to some different place I'd 
be glad if you would tell me.


Thank you,
*t




Re: apt-archive.postgresql.org a bit broken?

2024-07-05 Thread Tomas Pospisek

On 05.07.24 18:27, Magnus Hagander wrote:



On Fri, Jul 5, 2024 at 6:07 PM Tomas Pospisek <mailto:t...@sourcepole.ch>> wrote:


Hello,

I get this when I try to access the archive of focal packages:

```
# apt-get update
[...]
Err:6 https://apt-archive.postgresql.org/pub/repos/apt focal-pgdg
InRelease
    403  Forbidden [IP: 2600:9000:2761:4600:b:5965:3380:93a1 443]
[...]
E: Failed to fetch
https://apt-archive.postgresql.org/pub/repos/apt/dists/focal-pgdg/InRelease
   403  Forbidden [IP: 2600:9000:2761:4600:b:5965:3380:93a1 443]
E: The repository 'https://apt-archive.postgresql.org/pub/repos/apt
focal-pgdg InRelease' is not signed.
N: Updating from such a repository can't be done securely, and is
therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user
configuration details.
```

And so I can't access old, archived focal packages. I guess this isn't
wanted behavior?

In case I should direct this problem report to some different place I'd
be glad if you would tell me.


The dist name should be "focal-pgdg-archive" not "focal-pgdg".

You are right, thank you!
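
For reference (in case someone else stumbles over this), the working
entry then looks something like this - a sketch, the component name is
assumed to be the usual "main":

  deb https://apt-archive.postgresql.org/pub/repos/apt focal-pgdg-archive main
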
*t




Re: Alignment check

2024-07-05 Thread Tomas Pospisek

On 27.06.24 18:07, Marthin Laubscher wrote:

I don’t intend dissing or plugging anyone’s efforts or start a flame 
war, but I’d like to get a sense of how the PostgreSQL community feels 
about:

a) YugabyteDB, and
b) PostgreSQL on Kubernetes.

For my application I’m deeply vested in Kubernetes as a pathway to being 
cloud-agnostic and I have looked at YugabyteDB because it matches my 
application’s (distributed) architecture more closely.
But not all the ways to run PostgreSQL on Kubernetes are created equal, 
and YugabyteDB is really far behind on versions and do not support 
extensions in a way that’s useful to me.


Having no experience with it I can't comment on YugabyteDB. With respect 
to PostgreSQL on Kubernetes there are various solutions on how to run 
it, some of which are quite mature - that is, they have been around for 
quite some time, are used heavily and have a healthy maintenance 
community (see f.ex. postgres-operator [1]).


One IMHO problematic aspect of running postgres in one or more pods is 
that running a postgres cluster is already demanding as is. When a 
postgres cluster goes awry then there will be work awaiting you to get 
it all back up and running without messing up user data... using a 
solution like postgres-operator puts an additional layer and wrapper 
around postgres, so if things do not run well there are even more 
systems you have to handle. It might or might not help that you have 
additional layers doing stuff to the postgres service:


- pro: the additional software layers can contain more operational
  knowledge than you have and handle and fix operations better than you
  know how to do
- contra: or the additional software layers can hide, obscure, obstruct
  the lower layers and interfere with you trying to debug and fix stuff.

Backups and a tested procedure to get things back up and running from 
scratch can be useful then.


All that said, from operational experience: postgres by itself is very 
robust in taking care of preserving your data, so despite everything 
written above, usually you have to be messing up things **really hard** 
to make postgres lose data.


*t

[1] https://github.com/zalando/postgres-operator/

PS: Thanks to all of you that are taking care, that postgres is caring 
so well about the user's data!





Re: page is not marked all-visible but visibility map bit is set in relation "pg_statistic"

2024-07-11 Thread Tomas Vondra
On 7/11/24 19:21, Gus Spier wrote:
> AWS RDS Postgres Aurora version 14.4
> 

I believe RDS and Aurora are two separate products, so I'm a bit
confused by this ...

> Error log shows: page is not marked all-visible but visibility map bit is
> set in relation "pg_statistic"
> 
> To me, that sounds ominous. But, there does not appear to be any great
> performance hit. The applications are chugging along nicely. The end-users
> have not yet gathered at my door, waving torches and pitch-forks.
> 
> What is the correct course of action in this  case?
> 

I'd say you need to report this to AWS support - we have no idea what
changes they made in Aurora, and AFAIK they made a lot of changes in
this area.

That is not to say we don't have any bugs in this area - see for example
the discussion in [1], but we are not in a position to investigate issues
on a proprietary product.

regards

[1]
https://www.postgresql.org/message-id/CAH2-WznuNGSzF8v6OsgjaC5aYsb3cZ6HW6MLm30X0d65cmSH6A%40mail.gmail.com

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: PG Dump on 11 - Restore on 16 - is possible?

2024-08-13 Thread Tomas Vondra
On 8/13/24 11:47, Durumdara wrote:
> Hello!
> 
> We have to upgrade our server. The PG upgrade is impossible because of
> too much data.

I'm not sure I understand. What exactly is the problem with the amount
of data? How much data are you dealing with, and is the problem the time
it takes to copy the data, or do you have a limited amount of disk
space, or what is the issue?

I'd say pg_upgrade (in link mode) is probably orders of magnitude faster
for upgrading large clusters - probably less than a minute even for huge
clusters (multiple TB of data). And it doesn't need more disk space, it
does not copy most of the data. Typically, you create a replica on the
new system (or clone the fs volume in some way), and then switch over
and do the pg_upgrade.
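
Roughly something like this (a sketch with Debian-style paths, adjust to
your layout; always run it with --check first):

  pg_upgrade \
    --old-bindir  /usr/lib/postgresql/11/bin \
    --new-bindir  /usr/lib/postgresql/16/bin \
    --old-datadir /var/lib/postgresql/11/main \
    --new-datadir /var/lib/postgresql/16/main \
    --link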

pg_dump is pretty expensive - it exports and imports the database, has
to rebuild indexes, ... I'd be surprised if this wasn't more laborious
and time consuming than the pg_upgrade way.


> So we will create a new server with PG 16, and we have to migrate the
> databases one by one (from a PG 11 server).
> 
> Is it possible?
> 
> Does PG16 read PG11 dumps and restore them correctly on the PG16 server?
> 

Yes. The docs actually say this:

  Because pg_dump is used to transfer data to newer versions of
  PostgreSQL, the output of pg_dump can be expected to load into
  PostgreSQL server versions newer than pg_dump's version.

> Or is it unsupported now?
> 

It's true PG11 is EOL already, but AFAIK the dump/restore should work,
we try to support even unsupported versions - I don't recall if the
policy is ~10 years, but the docs say:

  pg_dump can also dump from PostgreSQL servers older than its own
  version. (Currently, servers back to version 9.2 are supported.)

I'd probably use pg_dump from the target version (PG16), but it should
not matter.
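
I.e., per database, something along these lines (a sketch - host and
database names are placeholders):

  pg_dump -Fc -h old-pg11-host -d mydb -f mydb.dump
  pg_restore -h new-pg16-host -d mydb -j 4 mydb.dump

(plus pg_dumpall --globals-only for roles and tablespaces).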


regards

-- 
Tomas Vondra




Re: About replication minimal disk space usage

2024-08-24 Thread Tomas Vondra
On 8/24/24 14:18, Manan Kansara wrote:
> Hello All,
> I have my self hosted postgres server on aws with 16gb disk space
> attached to it for ml stuff and analysis stuff we are using vertex ai so
> i have setup live replication of postgres using data stream service to
> BigQuery table.  We use BigQuery table as data warehouse because we have
> so many different data source so our data analysis and ml can
> happened at one place.
> but problem is there When i am starting replication in there pg_wal take
> whole space about 15.8gb in some days of starting replication 
> 
> *_Question_ *:  how can i setup something like that that optimally use
> disk space so old pg_wal data that are not usable can we delete  i think
> i should create one cron job which taken care whole that things but i
> don't know any approach can you please guide
> In future if as data grew i will attached more disk space to that
> instance but i want to make optimal setup so my whole disk is not in
> full usage any time and my server crash again.
> 

Why don't you just give it more disk space? I'm not a fan of blindly
throwing hardware at an issue, but 16GB is tiny these days, especially
if it's shared by both data and WAL, and the time you spend optimizing
this is likely more expensive than any savings.

If you really want to keep this on 16GB, I think we'll need more details
about what exactly you see on the instance / how it runs out of disk
space. AFAIK datastream relies on logical replication, and there's a
couple ways how that may consume disk space.

For example, the datastream replication may pause for a while, in which
case the replication slot will block removal of still-needed WAL, and if
the pause is long enough, that may be an issue. Of course, we have no
idea how much data you're dealing with (clearly not much, if it fits
onto 16GB of disk space with everything else).
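
You can check how much WAL each slot is holding back with something like
this (a sketch, assuming a reasonably recent release):

  SELECT slot_name, active, restart_lsn,
         pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(),
                                        restart_lsn)) AS retained_wal
    FROM pg_replication_slots;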

Another option is that you have a huge transaction (inserting and/or
modifying a lot of data at once), and the logical decoding ends up
spilling the decoded transaction to disk.

If you want a better answer, I think you'll have to provide a lot more
details. For example, which PostgreSQL version are you using, and how is
it configured? What config parameters have non-default values?


regards

-- 
Tomas Vondra




Re: ERROR: could not open relation with OID XXXX

2024-08-25 Thread Tomas Vondra
On 8/25/24 15:42, Marcelo Zabani wrote:
> Hi all,
> 
> I can reproduce the error in the subject from time to time when querying
> catalog tables while DDL is happening concurrently. Here's a bash script
> that reproduces it (not always, you might have to run it many times
> until you see ERROR:  could not open relation with OID ):
> 
> #!/usr/bin/env bash
> psql -c "create table test(x serial primary key); select oid, relname
> from pg_class where relname='test'"
> # The next two queries will run concurrently
> psql -c "select oid, relname, pg_sleep(3), pg_get_indexdef(oid) from
> pg_class join pg_index on indexrelid=pg_class.oid WHERE
> relname='test_pkey';" 2>&1 1>/tmp/pgbug.log &
> sleep 1
> psql -c "drop table test"
> cat /tmp/pgbug.log
> wait
> 
> I am confused as to how this is possible. I assume if the row with the
> test_pkey index exists in the pg_index catalog table, that the snapshot
> of the catalog tables contains the test table itself and is generally
> consistent, so querying the catalog should not run into such errors.
> 

I think you're assuming the whole query runs with a single snapshot, and
AFAIK that's not quite accurate - we do some special stuff for catalogs,
for example. There's also the additional complexity of maintaining a
cache on catalogs, invalidating it, etc.

I don't have a great simple "this happens because X" explanation, but a
lot of this relies on proper locking - in particular, that we lock all
the objects before execution, which also invalidates all the caches etc.

But that can't happen here, because we only realize we need to access
the OID very late in the execution, when we get to pg_get_indexdef.

> I've seen this happen in Production without pg_sleep in the mix, too. I
> added pg_sleep to the example above only because it makes the error
> easier to reproduce.
> 

It's a race condition, essentially. The sleep just makes it easier to
hit, but it can happen without it.

> Is there something I can do to avoid this? Is my understanding of how
> the catalog tables work wrong?
> 

I believe you could actually lock the pg_class rows for update. Just add
FOR UPDATE at the end of the query.


regards

-- 
Tomas Vondra




Re: ERROR: could not open relation with OID XXXX

2024-08-25 Thread Tomas Vondra



On 8/25/24 17:36, Marcelo Zabani wrote:
>> we do some special stuff for catalogs
> 
> That is good to know, thanks!
> 
>> I believe you could actually lock the pg_class rows for update. Just
> add FOR UPDATE at the end of the query.
> 
> Thanks, but I tried that and got "ERROR:  permission denied for table
> pg_class", even if I try it only for tables the user owns.
> 

Yeah, I tried that with a superuser; I didn't realize it'd fail for
regular users who don't have rights to modify pg_class directly.

> At least considering the use-case of avoiding this error due to
> temporary tables/indexes (which are a part of normal application
> execution), I was thinking of using materialized CTEs that filters those
> out, and only after that using other functions that for example take
> OIDs and return definitions. Other kinds of DDL that create
> non-temporary tables can be "blamed" on developers in my case.
> Do you think using those materialized CTEs could help? And do you think
> this can be considered a bug that I should report or is it just too
> edge-casey to consider?

I have no idea what you mean - what would be filtered out, etc. But you
can give it a try - you have a reproducer, so testing if it fails should
not be that difficult.

Other than that, I think you can do two things:

1) Make the application simply retry if it sees this error.

2) Lock the objects using LOCK TABLE before running the query (rough
sketch below).
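
For option 2, a rough sketch using the table from your reproducer (any
lock on the table blocks a concurrent DROP TABLE, which needs ACCESS
EXCLUSIVE):

  BEGIN;
  -- blocks a concurrent DROP TABLE for the duration of the transaction
  LOCK TABLE test IN ACCESS SHARE MODE;

  SELECT oid, relname, pg_get_indexdef(oid)
    FROM pg_class
    JOIN pg_index ON indexrelid = pg_class.oid
   WHERE relname = 'test_pkey';

  COMMIT;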


regards

-- 
Tomas Vondra




Re: BRIN index on timestamptz

2021-04-23 Thread Tomas Vondra




On 4/23/21 10:31 AM, Mohan Radhakrishnan wrote:

Hi,
          I am planning to use a BRIN index, as I search based on timestamptz
fields. There are millions of records. I refer to
https://www.percona.com/blog/2019/07/16/brin-index-for-postgresql-dont-forget-the-benefits


I execute this on the AWS RDS instance. Is there something in the plan I
should pay attention to? I notice the Execution Time.


Thanks,
Mohan

INSERT INTO testtab (id, date, level, msg) SELECT g, CURRENT_TIMESTAMP + 
( g || 'minute' ) :: interval, random() * 6, md5(g::text) FROM 
generate_series(1,800) as g;


Two things:

a) You need to do ANALYZE, otherwise there are no statistics the 
optimizer could use (which is why the row estimates in the plans are 
entirely bogus).


b) BRIN indexes don't work on random data, because the whole idea is
about eliminating large blocks of data (e.g. 1MB). With random data
that's not going to happen, because each such range will match anything,
which is why the seqscan is a bit faster than using the BRIN index.
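
To illustrate, a rough sketch of how I'd test this, assuming the BRIN
index is on the "date" column from your INSERT (which is monotonically
increasing and therefore well correlated with the physical row order):

  ANALYZE testtab;

  -- BRIN pays off when the indexed column is correlated with the
  -- physical order of the rows, as "date" is here
  CREATE INDEX testtab_date_brin ON testtab USING brin (date);

  EXPLAIN (ANALYZE, BUFFERS)
  SELECT count(*) FROM testtab
   WHERE date BETWEEN now() + interval '100000 minutes'
                  AND now() + interval '101000 minutes';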



regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




very long secondary->primary switch time

2021-04-27 Thread Tomas Pospisek

Hello all,

I maintain a postgresql cluster that does failover via patroni. The 
problem is that after a failover happens it takes the secondary too long 
(that is about 35min) to come up and answer queries. The log of the 
secondary looks like this:



04:00:29.777 [9679] LOG:  received promote request
04:00:29.780 [9693] FATAL:  terminating walreceiver process due to 
administrator command
04:00:29.780 [9679] LOG:  invalid record length at 320/B95A1EE0: wanted 
24, got 0

04:00:29.783 [9679] LOG:  redo done at 320/B95A1EA8
04:00:29.783 [9679] LOG:  last completed transaction was at log time 
2021-03-03 03:57:46.466342+01


04:35:00.982 [9679] LOG:  selected new timeline ID: 15
04:35:01.404 [9679] LOG:  archive recovery complete
04:35:02.337 [9662] LOG:  database system is ready to accept connections

The cluster is "fairly large" with thousands of DBs (sic!) and ~1TB of data.

I would like to shorten the failover/startup time drastically. Why does 
it take the secondary that much time to switch to the primary state? 
There are no logs between 04:00 and 04:35. What is postgresql doing 
during those 35min?


I am *guessing* that postgresql *might* be doing some consistency check 
or replaying the WAL (max_wal_size: 16 GB, wal_keep_segments: 100). I am 
also *guessing* that the startup time *might* have to do with the size of
the data (~1TB) and/or with the number of DBs (thousands). If that were
the case, then splitting the cluster into multiple clusters should
allow for faster startup times?


I have tried to duckduck why the secondary takes that much time to 
switch to primary mode, but have failed to find information that would 
enlighten me. So any pointers to information, hints or help are very 
welcome.


Thanks & greets,
*t




Re: very long secondary->primary switch time

2021-04-27 Thread Tomas Pospisek

Hi Tom, hi list participants,

thanks a lot for replying Tom.

On 27.04.21 22:16, Tom Lane wrote:

Tomas Pospisek  writes:

I maintain a postgresql cluster that does failover via patroni. The
problem is that after a failover happens it takes the secondary too long
(that is about 35min) to come up and answer queries. The log of the
secondary looks like this:



04:00:29.777 [9679] LOG:  received promote request
04:00:29.780 [9693] FATAL:  terminating walreceiver process due to
administrator command
04:00:29.780 [9679] LOG:  invalid record length at 320/B95A1EE0: wanted
24, got 0
04:00:29.783 [9679] LOG:  redo done at 320/B95A1EA8
04:00:29.783 [9679] LOG:  last completed transaction was at log time
2021-03-03 03:57:46.466342+01



04:35:00.982 [9679] LOG:  selected new timeline ID: 15
04:35:01.404 [9679] LOG:  archive recovery complete
04:35:02.337 [9662] LOG:  database system is ready to accept connections



The cluster is "fairly large" with thousands of DBs (sic!) and ~1TB of data.


Hm.  WAL replay is already done at the "redo done" entry.  There is a
checkpoint after that, I believe, and there may be some effort to search
for dead files as well.  Still, if your I/O subsystem is better than
a wet noodle, 35 minutes is a long time to finish that.

One thing I'm not sure about is whether we try to do the checkpoint
at maximum speed.  If you have set GUC options to throttle checkpoint
I/O hard, that could perhaps explain this.


I didn't do much customization of the PostgreSQL settings and am mostly
running what was provided upstream. These are my explicit
settings (everything else is at the defaults):


allow_system_table_mods: 'off'
autovacuum: 'off'
checkpoint_completion_target: '0.7'
effective_cache_size: 16 GB
maintenance_work_mem: 1 GB
max_connections: 200
max_wal_size: 16 GB
shared_buffers: 8 GB
shared_preload_libraries: pg_stat_statements
track_activities: 'on'
track_activity_query_size: 32 kB
track_functions: all
wal_keep_segments: 100
work_mem: 64 MB


You could possibly learn more by strace'ing the startup process to
see what it's doing.


Will do, thanks, however I'm dreading the next failover downtime :-(
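
In case it helps anyone else, a rough sketch of what I plan to run
(assumptions: Linux, strace installed, and the startup process being
identifiable via its process title):

  # find the PID of the startup/recovery process
  ps ax | grep '[p]ostgres: startup'

  # attach to it during the next promotion and log syscalls with timestamps
  strace -tt -T -p <startup_pid> -o /tmp/startup.strace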


Also, what PG version is that exactly?


12.6-1.pgdg20.04+1

Thankful for any help or pointers regarding the long promotion time,
*t



