Plan regression from 12 to 13

2022-02-23 Thread Radu Radutiu
Hi,
I have a query (generated by RequestTracker) that has regressed after
upgrading from postgresql 12 to 13. The strange thing is that decreasing
the effective_cache_size causes postgresql 13 to generate an acceptable
plan.
My setup is: 1 VM, 4 cores, 16 GB RAM (server has around 10 GB for
filesystem cache)
shared_buffers 2.4 GB

With effective_cache_size 8G, I have the following plan:
https://explain.depesz.com/s/lIcV (the run time is after several tries,
with most of the data in memory; the initial run reading from disk is
around 10 times slower)
If I decrease the effective_cache_size to '2GB' or less, I get a much
better plan: https://explain.depesz.com/s/Z6C7W
The old VM running postgresql 12 (with a somewhat smaller database) always
generate a good plan,regardless of the effective_cache_size :
https://explain.depesz.com/s/iawn

Are there any defaults that have changed from 12 to 13 that would explain
the different plans?

Thanks,
Radu


tsvector string representation and parsing

2022-02-23 Thread Johannes Graën



Hi,

I am constructing a tsvector representation of some text manually 
because I need to provide explicit positional parameters. In some cases, 
the conversion fails and I get a "syntax error in tsvector".


This is a minimal example that goes wrong (but shouldn't IMHO:


SELECT format('%L:1', '\:')::tsvector


My expectation here is that %L would transform '\:' into a correctly 
formatted character string, including single quotes around the string. 
The documentation reads:



L quotes the argument value as an SQL literal



SELECT $$'\:':1$$


returns «'\:':1», but


SELECT $$'\:':1$$::tsvector


returns «':':1», so something is happening here inside the single-quoted 
lexemes to the backslash. My understanding is that everything inside 
single quotes is taken as a lexeme. From the documentation:



To represent lexemes containing whitespace or punctuation, surround them with 
quotes


So having everything surrounded by single quotes (followed by a 
positional argument after the colon), «$$'\:':1$$::tsvector» should 
actually return a vector with a single lexeme «\:» at position 1.


Am I missing something?

Regards
  Johannes




Re: tsvector string representation and parsing

2022-02-23 Thread Tom Lane
=?UTF-8?Q?Johannes_Gra=c3=abn?=  writes:
> This is a minimal example that goes wrong (but shouldn't IMHO:

>> SELECT format('%L:1', '\:')::tsvector

format(%L) is designed to produce a SQL literal, which does not
have the same requirements as a tsvector element ... yeah, they're
close, but not close enough.  In this particular example,
what you get is

=# SELECT format('%L:1', '\:');
  format  
--
 E'\\:':1
(1 row)

because format() adds an E prefix for the avoidance of doubt about
what to do with the backslashes.  tsvector doesn't like that.

I don't think we have any prefab function that does what you're
looking for here, and TBH I'm not sure I see the point of it.
Pretty much any tsvector you'd be dealing with in practice is
going to have come from one of the to_tsvector family of
functions, and those tend to drop punctuation.

> My understanding is that everything inside 
> single quotes is taken as a lexeme. From the documentation:

>> To represent lexemes containing whitespace or punctuation, surround them 
>> with quotes

See also the next bit about having to double quotes and backslashes
within those quotes.  So what you'd actually need is

=# select $$'\\:':1$$::tsvector;
 tsvector 
--
 '\\:':1
(1 row)

If you write just one backslash, it has the effect of quoting
the next character, which in this case doesn't need quoting.

regards, tom lane




Re: tsvector string representation and parsing

2022-02-23 Thread Johannes Graën



Thanks, Tom.

On 23/02/2022 23.30, Tom Lane wrote:

=?UTF-8?Q?Johannes_Gra=c3=abn?=  writes:

This is a minimal example that goes wrong (but shouldn't IMHO:



SELECT format('%L:1', '\:')::tsvector


format(%L) is designed to produce a SQL literal, which does not
have the same requirements as a tsvector element ... yeah, they're
close, but not close enough.  In this particular example,
what you get is

=# SELECT format('%L:1', '\:');
   format
--
  E'\\:':1
(1 row)

because format() adds an E prefix for the avoidance of doubt about
what to do with the backslashes.  tsvector doesn't like that.


I see.



I don't think we have any prefab function that does what you're
looking for here, and TBH I'm not sure I see the point of it.
Pretty much any tsvector you'd be dealing with in practice is
going to have come from one of the to_tsvector family of
functions, and those tend to drop punctuation.


Applied to normal texts in a standard language, I believe that's true. 
I'd like to use FTS in a setting where I need to control the positional 
attributes as I'm specifying more than one lexeme at the same position 
(which works great btw). That's why I can't use to_tsvector() but need 
to cast it from a string to tsvector.




My understanding is that everything inside
single quotes is taken as a lexeme. From the documentation:



To represent lexemes containing whitespace or punctuation, surround them with 
quotes


See also the next bit about having to double quotes and backslashes
within those quotes.  So what you'd actually need is

=# select $$'\\:':1$$::tsvector;
  tsvector
--
  '\\:':1
(1 row)

If you write just one backslash, it has the effect of quoting
the next character, which in this case doesn't need quoting.


Before using format(), I tried just generating those strings by doubling 
any single quote or backslash and enclosing the whole string in single 
quotes, but that didn't seem a safe way, though it works in principle:



SELECT format($$'%s':%s$$, replace(replace(s, $$'$$, $$''$$), '\', '\\'), 
i)::tsvector
FROM (SELECT $$\:$$ s, 1 i) x;


Would that be the way to go if to_tsvector is not an option?

Regards
  Johannes




Re: pg_upgrade from Postgresql-12 to Postgresql-13 fails with "Creating dump of database schemas postgres *failure*"

2022-02-23 Thread Peter J. Holzer
On 2022-02-19 14:54:41 +0800, Yoong S. Chow wrote:
> Could you please clarify what core files are? I am truly a newbie to
> postgresql. 

Core files are a feature of Linux and other Unix-like operating systems.
When a process crashes, the OS records its state in a file named "core"
(sometimes withe process id attached). You can then use a debugger to
inspect this file and investigate the cause of the crash.

hp


-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature