CTE optimization fence

2018-06-26 Thread Guy Burgess

Hello,
I am running into performance issues with large CTE "WITH" queries (just 
for selecting, not updating).  I was surprised to find that the queries 
run much faster if I convert the CTEs to subqueries. From googling, I 
see that this is due to CTE acting as an optimization fence in PG. 
Unfortunately due to the application I'm dealing with, converting all 
CTE queries to subquery model is not feasible. Plus, the readability of 
CTE is a big bonus.


I see there was some discussion last year about removing the CTE 
optimization fence (e.g. 
http://www.postgresql-archive.org/CTE-inlining-td5958992.html) but can't 
find anything more recent. Does anyone know if this is still under 
consideration? For what it's worth, I would love some way to make CTEs 
inlined/optimized.


Thank you very much to the developers for a truly amazing database system.

Thanks
Guy




Transition tables for column-specific UPDATE triggers

2019-10-08 Thread Guy Burgess

Hello,

The manual says: 
https://www.postgresql.org/docs/current/sql-createtrigger.html


   A column list cannot be specified when requesting transition relations.

And (I think the same point):

   The standard allows transition tables to be used with
   column-specific |UPDATE| triggers, but then the set of rows that
   should be visible in the transition tables depends on the trigger's
   column list. This is not currently implemented by PostgreSQL.

Are there any plans to allow transition tables to be used with 
column-specific UPDATE triggers?  Or, is there another way for a trigger 
function to see the rows changed by a column-specific UPDATE trigger?


Thanks



Feature suggestion: auto-prefixing SELECT query column names with table/alias names

2020-06-21 Thread Guy Burgess

Hello,

I've seen some situations where it would be very handy if PostgreSQL 
could automatically prefix column names in a SELECT query with their 
table names (or its alias).   So for two tables, a & b, each with 
columns "id" and "title":


    SELECT * FROM a , b;

instead of returning ambiguously-named columns:

    id, title, id, title

would return:

    a.id, a.title, b.id, b.title

Of course, this can be achieved by avoiding the (often frowned-upon) 
SELECT * syntax in the first place and using explicit column names, but 
that makes otherwise short queries much longer and harder to maintain.  
Also this doesn't only apply to SELECT * queries: if tables a & b had 
more columns, but I wanted just the "title" column of each, to avoid 
ambiguous column names or having to use column position indexes, I have 
to give explicit aliases to those columns (which could be anything, but 
in practice is often the table prefix as already written in the select 
(SELECT a.title AS "a.title". b.title as "b.title" etc).


Could there be an (admittedly non-standard) option to automatically 
prefix table/alias names to columns? From searching, it seems a number 
of people would find this useful[1].  Perhaps something like:   SELECT 
#* FROM a, b    -->   a.id, a.title, b.id, d.title.  This would allow 
use per-table:   SELECT a.id, #b.*   -->   id, b.id, b.title.   Or per 
column  E.g:   SELECT a.id, #a.title, #b.title   -->   id, a.title. 
b.title.   Or alternatively it could be specified in the FROM clause as 
to which tables have their columns prefixed.


I have no idea if this is viable/sensible/anathema, especially given the 
understandable reluctance to stray from the SQL standard, but thought 
I'd ask, especially as it seems like something others would find useful too.


Thanks,

Guy

[1]

https://stackoverflow.com/questions/329931/sql-select-join-is-it-possible-to-prefix-all-columns-as-prefix

https://stackoverflow.com/questions/13153344/in-a-join-how-to-prefix-all-column-names-with-the-table-it-came-from

https://stackoverflow.com/questions/53947657/how-to-display-table-name-as-prefix-to-column-name-in-sql-result

https://stackoverflow.com/questions/33364392/sql-join-prefix-fields-with-table-name





Re: Feature suggestion: auto-prefixing SELECT query column names with table/alias names

2020-06-22 Thread Guy Burgess

On 22/06/2020 3:25 pm, Laurenz Albe wrote:

Then there is the case of "a JOIN b USING (somecol)".
Here, "somecol" will appear in the output only once.  How should it be
labeled?  If you say "not at all", then what do you want to happen for

   SELECT * FROM a JOIN b USING (id) JOIN c USING (x)

where all three tables have a column "id"?
Thanks, I've learned a tip about USING :)  In such case, could it not 
simply return all columns prefixed with their table/alias names, as that 
is what the table-prefix option would mean? To pretend-modify the 
documentation: "Also, |USING| implies that only one of each pair of 
equivalent columns will be included in the join output, not both; 
_unless table-prefixing is enabled for one or both of columns_."  In 
other words, if table-prefixing is enabled for a column in a USING 
clause, it behaves like a regular LEFT JOIN.  So SELECT #* FROM a JOIN b 
USING (id) would give:


    a.id, a.title, b.id, b.title


But the real objection I have is that the problem can be easily avoided
by spelling out the SELECT list and using aliases.  Either you are talking
about ad-hoc queries, or you want to use "SELECT *" in your code.

- For ad-hoc queries I don't see the problem so much.  Here, the tables
   will be returned in the order you specify them in the query.
   So if you have "b JOIN a", the result columns will always be
   first all columns from "b", then all columns from "a".
   So you can easily figure out which column belongs to which table.
True, but it would be nice to have an easy 'foolproof' way to see the 
table name a column belongs to (especially for wide tables).

- For queries in your code, using "SELECT *" is a very bad idea.
   There is a real risk of your code breaking if the database changes.
   In code, you are probably not ready to deal with a changing number
   of columns.


Yes. Though in the case of explicit SELECT lists, duplicate column names 
usually requires an alias to be given (unless using column index), and 
this is often just the table prefix anyway, so it would be nice to be 
able to automatically prefix the table name in the query.  It would be 
rather like the USING clause itself - just a nice convenience/shorthand 
to have.


Thanks for the comments!



PostgreSQL occasionally unable to rename WAL files (NTFS)

2021-02-10 Thread Guy Burgess

Hello,

Running 13.1 on Windows Server 2019, I am getting the following log 
entries occasionally:


    2021-02-11 12:34:10.149 NZDT [6072] LOG:  could not rename file 
"pg_wal/0001009900D3": Permission denied
    2021-02-11 12:40:31.377 NZDT [6072] LOG:  could not rename file 
"pg_wal/0001009900D3": Permission denied
    2021-02-11 12:46:06.294 NZDT [6072] LOG:  could not rename file 
"pg_wal/0001009900D3": Permission denied
    2021-02-11 12:46:16.502 NZDT [6072] LOG:  could not rename file 
"pg_wal/0001009900DA": Permission denied
    2021-02-11 12:50:20.917 NZDT [6072] LOG:  could not rename file 
"pg_wal/0001009900D3": Permission denied
    2021-02-11 12:50:31.098 NZDT [6072] LOG:  could not rename file 
"pg_wal/0001009900DA": Permission denied


What appears to be happening is the affected WAL files (which is usually 
only 2 or 3 WAL files at a time) are somehow "losing" their NTFS 
permissions, so the PG process can't rename them - though of course the 
PG process created them. Even running icacls as admin gives "Access is 
denied" on those files. A further oddity is the affected files do end up 
disappearing after a while.


The NTFS permissions on the pg_wal directory are correct, and most WAL 
files are unaffected. Chkdsk reports no problems, and the database is 
working fine otherwise. Have tried disabling antivirus software in case 
that was doing something but no difference.


I found another recent report of similar behaviour here: 
https://stackoverflow.com/questions/65405479/postgresql-13-log-could-not-rename-file-pg-wal-0001000100c6


WAL config as follows:

   wal_level = replica
   fsync = on
   synchronous_commit = on
   wal_sync_method = fsync
   full_page_writes = on
   wal_compression = off
   wal_log_hints = off
   wal_init_zero = on
   wal_recycle = on
   wal_buffers = -1
   wal_writer_delay = 200ms
   wal_writer_flush_after = 1MB
   wal_skip_threshold = 2MB
   commit_delay = 0
   commit_siblings = 5
   checkpoint_timeout = 5min
   max_wal_size = 2GB
   min_wal_size = 256MB
   checkpoint_completion_target = 0.7
   checkpoint_flush_after = 0
   checkpoint_warning = 30s
   archive_mode = off

I'm thinking of disabling wal_recycle as a first step to see if that 
makes any difference, but thought I'd seek some comments first.


Not sure how much of a problem this is - the database is running fine 
otherwise - but any thoughts would be appreciated.


Thanks & regards,

Guy



Re: PostgreSQL occasionally unable to rename WAL files (NTFS)

2021-02-12 Thread Guy Burgess

On 12/02/2021 12:31 am, Lionel Bouton wrote:

I haven't dealt with a Windows environment for quite some time, but from
what I remember an antivirus installs a driver intercepting file
accesses and these drivers are still active even if you disable the
antivirus (I suppose they just call a noop instead of content analysis
code) and can still interfere with your system. For example some years
ago I've seen what looked like a race condition involving rename for
MySQL on Windows that could not be fixed by disabling the antivirus but
could by uninstalling it completely.

You might want to uninstall the antivirus temporarily to check this.


Thanks Lionel for this suggestion. Its the built-in Windows Defender AV 
which I believe can't be uninstalled, so am limited to 'disabling' it 
(plus it has exclusions for the PG directories & processes already). 
Using Procmon I can't see any AV file activity when it is disabled. Will 
keep monitoring it though.


Kind regards

Guy





Re: PostgreSQL occasionally unable to rename WAL files (NTFS)

2021-02-12 Thread Guy Burgess

On 12/02/2021 4:33 am, Thorsten Schöning wrote:

If you see that somewhat frequently, use Process Monitor and Process
Explorer to see who accesses those files how. ProcExp easily allows
you to find all open handles per file. If it's not AV, it might be
something like Windows Search Indexer as well, if that is enabled by
default in Server 2019 at all.

Thanks Thorsten, that's a great idea - I'm using Procmon and Process 
Explorer and able to monitor activity on the WAL files, so hopefully 
that sheds some light. The Indexer service is disabled, so can rule that 
out.


Kind regards

Guy





Re: PostgreSQL occasionally unable to rename WAL files (NTFS)

2021-02-15 Thread Guy Burgess

On 12/02/2021 4:33 am, Thorsten Schöning wrote:

The behaviour you describe happens exactly when two processes e.g.
concurrently hold HANDLEs on the same file and one of those deletes
the file then. Windows keeps file names until all open HANDLEs are
closed and depending on how those HANDLEs have been opened by the
first app, concurrent deletion is perferctly fine for Windows.

Though, a such deleted file can't be opened easily anymore and looks
like it has lost permissions only. But that's not the case, it's
deleted already. It might be that this happens for Postgres to itself
somehow when some other app has an open HANDLE. I don't think that
some other app is deleting that file by purpose instead, reading it
for some reason seems more likely to me.


Using Process Monitor, Thorsten's explanation above appears to correctly 
diagnose what is happening. ProcMon data shows postgres.exe performing 
"CreateFile" operations on the affected WAL files, with the result 
status "DELETE PENDING". Which according to 
https://stackoverflow.com/a/29892104 means:


   "Windows allows a process to delete a file, even though it is still
   opened by another process (e.g. Windows indexing service or
   Antivirus). It gets internally marked as "delete pending". The file
   does not actually get removed from the file system, it is still
   there after the File.Delete call. Anybody that tries to open the
   file after that gets an access denied error. The file doesn't
   actually get removed until the last handle to the file object gets
   closed"

which is the same behaviour Thorsten describes above (great info, thanks 
Thorsten).


The mystery now is that the only process logged as touching the affected 
WAL files is postgres.exe (of which there are many separate processes). 
Could it be that one of the postgres.exe instances is holding the 
affected WAL files in use after another postgres.exe instance has 
flagged the file as deleted? (or to put it the other way, a postgres.exe 
instance is flagging the file as deleted while another instance still 
has an open handle to the file)? If it is some other process such as the 
indexer (disabled) or AV (excluded from pgdata) is obtaining a handle on 
the WAL files, it isn't being logged by ProcMon.


Kind regards,

Guy



Re: PostgreSQL occasionally unable to rename WAL files (NTFS)

2021-02-16 Thread Guy Burgess

On 16/02/2021 12:23 am, Thorsten Schöning wrote:

The mystery now is that the only process logged as touching the
affected WAL files is postgres.exe (of which there are many separate
processes). Could it be that one of the postgres.exe instances is
holding the affected WAL files in use after another postgres.exe
instance has flagged the file as deleted?[...]

I suggest checking your WAL-related and archive/backup settings for
Postgres again. There's e.g. "archive_command" optionally copying WALs
to some other place and postgres.exe would wait until that process has
finished, maybe locking the file to copy itself as well. Or
"archive_timeout" interfering with some other operations or alike.


Thanks Thorsten.  The WAL archive settings are out-of-the-box defaults, 
i.e. disabled: archive_mode = off; archive_command = ''; archive_timeout 
= 0.


I'm not sure there is anything else I can check at this time. The good 
thing is it doesn't seem to cause any problem other than logging "could 
not rename file" warnings, so I might have to park this for now. If I 
find anything else that might offer a new lead I will report back.


Kind regards,

Guy





Old .deleted file in pg_wal

2021-02-26 Thread Guy Burgess

Hello,

There is a file in my pg_wal directory called 
"0001003B00D0.deleted", last modified in December 2020, and 
has survived PG & server restarts.  I've been having some errors with 
WAL files 
(https://www.postgresql.org/message-id/flat/095ccf8d-7f58-d928-427c-b17ace23cae6%40burgess.co.nz) 
and would like to tidy this file if safe to do so.


I think this is where the file gets created: 
https://github.com/postgres/postgres/blob/ca3b37487be333a1d241dab1bbdd17a211a88f43/src/backend/access/transam/xlogarchive.c#L406


Based on that I'm pretty sure it is OK to delete a ".deleted" WAL file, 
but thought I would check in case anyone suggests otherwise.


Thanks

Guy





Re: SV: Log files polluted with permission denied error messages after every 10 seconds

2021-03-17 Thread Guy Burgess
FWIW, this looks the same issue I am getting (reported last month: 
https://www.postgresql.org/message-id/f444a84e-2d29-55f9-51a6-a5dcea3bc253%40burgess.co.nz)


I get the same Process Monitor output, including BUFFER OVERFLOW 
entries.  No sign of any process other than postgres.exe touching the 
WAL files.


Regards,

Guy


On 18/03/2021 2:26 am, Andrus wrote:


Hi!



I think what it would be most helpful to run "process monitor", so that you get
a log of the system calls and errors; perhaps that shows some details that
we cannot get from the error message.

Process monitor shows huge number of DELETE PENDING results from 
CreateFile operation:


15:22:35,1708187    postgres.exe    11800    CreateFile C:\Program 
Files\PostgreSQL\13\data\pg_wal\00010011003B    DELETE 
PENDING    Desired Access: Read Attributes, Delete, Synchronize, 
Disposition: Open, Options: Synchronous IO Non-Alert, Open Reparse 
Point, Attributes: n/a, ShareMode: Read, Write, Delete, 
AllocationSize: n/a


15:22:35,2713856    postgres.exe    11800    CreateFile C:\Program 
Files\PostgreSQL\13\data\pg_wal\00010011003B DELETE 
PENDING    Desired Access: Read Attributes, Delete, Synchronize, 
Disposition: Open, Options: Synchronous IO Non-Alert, Open Reparse 
Point, Attributes: n/a, ShareMode: Read, Write, Delete, 
AllocationSize: n/a


15:22:35,3719483    postgres.exe    11800    CreateFile C:\Program 
Files\PostgreSQL\13\data\pg_wal\00010011003B DELETE 
PENDING    Desired Access: Read Attributes, Delete, Synchronize, 
Disposition: Open, Options: Synchronous IO Non-Alert, Open Reparse 
Point, Attributes: n/a, ShareMode: Read, Write, Delete, 
AllocationSize: n/a


15:22:35,4723624    postgres.exe    11800    CreateFile C:\Program 
Files\PostgreSQL\13\data\pg_wal\00010011003B DELETE 
PENDING    Desired Access: Read Attributes, Delete, Synchronize, 
Disposition: Open, Options: Synchronous IO Non-Alert, Open Reparse 
Point, Attributes: n/a, ShareMode: Read, Write, Delete, 
AllocationSize: n/a


Andrus.







Re: SV: Log files polluted with permission denied error messages after every 10 seconds

2021-03-17 Thread Guy Burgess
Thanks - yes pleas send me a private link and I will try to reproduce it 
in a test environment and then test the patch.


Regards,

Guy


On 18/03/2021 1:08 pm, Michael Paquier wrote:

On Thu, Mar 18, 2021 at 12:25:45PM +1300, Guy Burgess wrote:

FWIW, this looks the same issue I am getting (reported last month:
https://www.postgresql.org/message-id/f444a84e-2d29-55f9-51a6-a5dcea3bc253%40burgess.co.nz)

Yep.


I get the same Process Monitor output, including BUFFER OVERFLOW entries.
No sign of any process other than postgres.exe touching the WAL files.

Guy, do you have an environment where this is still happening and
where you could test a potential fix?  We are not sure yet what's
causing that, but one code path has changed in this area, involving
CreateHardLinkA()+_unlink() instead of a single rename when attempting
to recycle a segment.  And I am just in a mood to build things by
myself and send some links to people to be able to download and test
that, so one more is fine..
--
Michael