CTE optimization fence
Hello, I am running into performance issues with large CTE "WITH" queries (just for selecting, not updating). I was surprised to find that the queries run much faster if I convert the CTEs to subqueries. From googling, I see that this is due to CTE acting as an optimization fence in PG. Unfortunately due to the application I'm dealing with, converting all CTE queries to subquery model is not feasible. Plus, the readability of CTE is a big bonus. I see there was some discussion last year about removing the CTE optimization fence (e.g. http://www.postgresql-archive.org/CTE-inlining-td5958992.html) but can't find anything more recent. Does anyone know if this is still under consideration? For what it's worth, I would love some way to make CTEs inlined/optimized. Thank you very much to the developers for a truly amazing database system. Thanks Guy
Transition tables for column-specific UPDATE triggers
Hello, The manual says: https://www.postgresql.org/docs/current/sql-createtrigger.html A column list cannot be specified when requesting transition relations. And (I think the same point): The standard allows transition tables to be used with column-specific |UPDATE| triggers, but then the set of rows that should be visible in the transition tables depends on the trigger's column list. This is not currently implemented by PostgreSQL. Are there any plans to allow transition tables to be used with column-specific UPDATE triggers? Or, is there another way for a trigger function to see the rows changed by a column-specific UPDATE trigger? Thanks
Feature suggestion: auto-prefixing SELECT query column names with table/alias names
Hello, I've seen some situations where it would be very handy if PostgreSQL could automatically prefix column names in a SELECT query with their table names (or its alias). So for two tables, a & b, each with columns "id" and "title": SELECT * FROM a , b; instead of returning ambiguously-named columns: id, title, id, title would return: a.id, a.title, b.id, b.title Of course, this can be achieved by avoiding the (often frowned-upon) SELECT * syntax in the first place and using explicit column names, but that makes otherwise short queries much longer and harder to maintain. Also this doesn't only apply to SELECT * queries: if tables a & b had more columns, but I wanted just the "title" column of each, to avoid ambiguous column names or having to use column position indexes, I have to give explicit aliases to those columns (which could be anything, but in practice is often the table prefix as already written in the select (SELECT a.title AS "a.title". b.title as "b.title" etc). Could there be an (admittedly non-standard) option to automatically prefix table/alias names to columns? From searching, it seems a number of people would find this useful[1]. Perhaps something like: SELECT #* FROM a, b --> a.id, a.title, b.id, d.title. This would allow use per-table: SELECT a.id, #b.* --> id, b.id, b.title. Or per column E.g: SELECT a.id, #a.title, #b.title --> id, a.title. b.title. Or alternatively it could be specified in the FROM clause as to which tables have their columns prefixed. I have no idea if this is viable/sensible/anathema, especially given the understandable reluctance to stray from the SQL standard, but thought I'd ask, especially as it seems like something others would find useful too. Thanks, Guy [1] https://stackoverflow.com/questions/329931/sql-select-join-is-it-possible-to-prefix-all-columns-as-prefix https://stackoverflow.com/questions/13153344/in-a-join-how-to-prefix-all-column-names-with-the-table-it-came-from https://stackoverflow.com/questions/53947657/how-to-display-table-name-as-prefix-to-column-name-in-sql-result https://stackoverflow.com/questions/33364392/sql-join-prefix-fields-with-table-name
Re: Feature suggestion: auto-prefixing SELECT query column names with table/alias names
On 22/06/2020 3:25 pm, Laurenz Albe wrote: Then there is the case of "a JOIN b USING (somecol)". Here, "somecol" will appear in the output only once. How should it be labeled? If you say "not at all", then what do you want to happen for SELECT * FROM a JOIN b USING (id) JOIN c USING (x) where all three tables have a column "id"? Thanks, I've learned a tip about USING :) In such case, could it not simply return all columns prefixed with their table/alias names, as that is what the table-prefix option would mean? To pretend-modify the documentation: "Also, |USING| implies that only one of each pair of equivalent columns will be included in the join output, not both; _unless table-prefixing is enabled for one or both of columns_." In other words, if table-prefixing is enabled for a column in a USING clause, it behaves like a regular LEFT JOIN. So SELECT #* FROM a JOIN b USING (id) would give: a.id, a.title, b.id, b.title But the real objection I have is that the problem can be easily avoided by spelling out the SELECT list and using aliases. Either you are talking about ad-hoc queries, or you want to use "SELECT *" in your code. - For ad-hoc queries I don't see the problem so much. Here, the tables will be returned in the order you specify them in the query. So if you have "b JOIN a", the result columns will always be first all columns from "b", then all columns from "a". So you can easily figure out which column belongs to which table. True, but it would be nice to have an easy 'foolproof' way to see the table name a column belongs to (especially for wide tables). - For queries in your code, using "SELECT *" is a very bad idea. There is a real risk of your code breaking if the database changes. In code, you are probably not ready to deal with a changing number of columns. Yes. Though in the case of explicit SELECT lists, duplicate column names usually requires an alias to be given (unless using column index), and this is often just the table prefix anyway, so it would be nice to be able to automatically prefix the table name in the query. It would be rather like the USING clause itself - just a nice convenience/shorthand to have. Thanks for the comments!
PostgreSQL occasionally unable to rename WAL files (NTFS)
Hello, Running 13.1 on Windows Server 2019, I am getting the following log entries occasionally: 2021-02-11 12:34:10.149 NZDT [6072] LOG: could not rename file "pg_wal/0001009900D3": Permission denied 2021-02-11 12:40:31.377 NZDT [6072] LOG: could not rename file "pg_wal/0001009900D3": Permission denied 2021-02-11 12:46:06.294 NZDT [6072] LOG: could not rename file "pg_wal/0001009900D3": Permission denied 2021-02-11 12:46:16.502 NZDT [6072] LOG: could not rename file "pg_wal/0001009900DA": Permission denied 2021-02-11 12:50:20.917 NZDT [6072] LOG: could not rename file "pg_wal/0001009900D3": Permission denied 2021-02-11 12:50:31.098 NZDT [6072] LOG: could not rename file "pg_wal/0001009900DA": Permission denied What appears to be happening is the affected WAL files (which is usually only 2 or 3 WAL files at a time) are somehow "losing" their NTFS permissions, so the PG process can't rename them - though of course the PG process created them. Even running icacls as admin gives "Access is denied" on those files. A further oddity is the affected files do end up disappearing after a while. The NTFS permissions on the pg_wal directory are correct, and most WAL files are unaffected. Chkdsk reports no problems, and the database is working fine otherwise. Have tried disabling antivirus software in case that was doing something but no difference. I found another recent report of similar behaviour here: https://stackoverflow.com/questions/65405479/postgresql-13-log-could-not-rename-file-pg-wal-0001000100c6 WAL config as follows: wal_level = replica fsync = on synchronous_commit = on wal_sync_method = fsync full_page_writes = on wal_compression = off wal_log_hints = off wal_init_zero = on wal_recycle = on wal_buffers = -1 wal_writer_delay = 200ms wal_writer_flush_after = 1MB wal_skip_threshold = 2MB commit_delay = 0 commit_siblings = 5 checkpoint_timeout = 5min max_wal_size = 2GB min_wal_size = 256MB checkpoint_completion_target = 0.7 checkpoint_flush_after = 0 checkpoint_warning = 30s archive_mode = off I'm thinking of disabling wal_recycle as a first step to see if that makes any difference, but thought I'd seek some comments first. Not sure how much of a problem this is - the database is running fine otherwise - but any thoughts would be appreciated. Thanks & regards, Guy
Re: PostgreSQL occasionally unable to rename WAL files (NTFS)
On 12/02/2021 12:31 am, Lionel Bouton wrote: I haven't dealt with a Windows environment for quite some time, but from what I remember an antivirus installs a driver intercepting file accesses and these drivers are still active even if you disable the antivirus (I suppose they just call a noop instead of content analysis code) and can still interfere with your system. For example some years ago I've seen what looked like a race condition involving rename for MySQL on Windows that could not be fixed by disabling the antivirus but could by uninstalling it completely. You might want to uninstall the antivirus temporarily to check this. Thanks Lionel for this suggestion. Its the built-in Windows Defender AV which I believe can't be uninstalled, so am limited to 'disabling' it (plus it has exclusions for the PG directories & processes already). Using Procmon I can't see any AV file activity when it is disabled. Will keep monitoring it though. Kind regards Guy
Re: PostgreSQL occasionally unable to rename WAL files (NTFS)
On 12/02/2021 4:33 am, Thorsten Schöning wrote: If you see that somewhat frequently, use Process Monitor and Process Explorer to see who accesses those files how. ProcExp easily allows you to find all open handles per file. If it's not AV, it might be something like Windows Search Indexer as well, if that is enabled by default in Server 2019 at all. Thanks Thorsten, that's a great idea - I'm using Procmon and Process Explorer and able to monitor activity on the WAL files, so hopefully that sheds some light. The Indexer service is disabled, so can rule that out. Kind regards Guy
Re: PostgreSQL occasionally unable to rename WAL files (NTFS)
On 12/02/2021 4:33 am, Thorsten Schöning wrote: The behaviour you describe happens exactly when two processes e.g. concurrently hold HANDLEs on the same file and one of those deletes the file then. Windows keeps file names until all open HANDLEs are closed and depending on how those HANDLEs have been opened by the first app, concurrent deletion is perferctly fine for Windows. Though, a such deleted file can't be opened easily anymore and looks like it has lost permissions only. But that's not the case, it's deleted already. It might be that this happens for Postgres to itself somehow when some other app has an open HANDLE. I don't think that some other app is deleting that file by purpose instead, reading it for some reason seems more likely to me. Using Process Monitor, Thorsten's explanation above appears to correctly diagnose what is happening. ProcMon data shows postgres.exe performing "CreateFile" operations on the affected WAL files, with the result status "DELETE PENDING". Which according to https://stackoverflow.com/a/29892104 means: "Windows allows a process to delete a file, even though it is still opened by another process (e.g. Windows indexing service or Antivirus). It gets internally marked as "delete pending". The file does not actually get removed from the file system, it is still there after the File.Delete call. Anybody that tries to open the file after that gets an access denied error. The file doesn't actually get removed until the last handle to the file object gets closed" which is the same behaviour Thorsten describes above (great info, thanks Thorsten). The mystery now is that the only process logged as touching the affected WAL files is postgres.exe (of which there are many separate processes). Could it be that one of the postgres.exe instances is holding the affected WAL files in use after another postgres.exe instance has flagged the file as deleted? (or to put it the other way, a postgres.exe instance is flagging the file as deleted while another instance still has an open handle to the file)? If it is some other process such as the indexer (disabled) or AV (excluded from pgdata) is obtaining a handle on the WAL files, it isn't being logged by ProcMon. Kind regards, Guy
Re: PostgreSQL occasionally unable to rename WAL files (NTFS)
On 16/02/2021 12:23 am, Thorsten Schöning wrote: The mystery now is that the only process logged as touching the affected WAL files is postgres.exe (of which there are many separate processes). Could it be that one of the postgres.exe instances is holding the affected WAL files in use after another postgres.exe instance has flagged the file as deleted?[...] I suggest checking your WAL-related and archive/backup settings for Postgres again. There's e.g. "archive_command" optionally copying WALs to some other place and postgres.exe would wait until that process has finished, maybe locking the file to copy itself as well. Or "archive_timeout" interfering with some other operations or alike. Thanks Thorsten. The WAL archive settings are out-of-the-box defaults, i.e. disabled: archive_mode = off; archive_command = ''; archive_timeout = 0. I'm not sure there is anything else I can check at this time. The good thing is it doesn't seem to cause any problem other than logging "could not rename file" warnings, so I might have to park this for now. If I find anything else that might offer a new lead I will report back. Kind regards, Guy
Old .deleted file in pg_wal
Hello, There is a file in my pg_wal directory called "0001003B00D0.deleted", last modified in December 2020, and has survived PG & server restarts. I've been having some errors with WAL files (https://www.postgresql.org/message-id/flat/095ccf8d-7f58-d928-427c-b17ace23cae6%40burgess.co.nz) and would like to tidy this file if safe to do so. I think this is where the file gets created: https://github.com/postgres/postgres/blob/ca3b37487be333a1d241dab1bbdd17a211a88f43/src/backend/access/transam/xlogarchive.c#L406 Based on that I'm pretty sure it is OK to delete a ".deleted" WAL file, but thought I would check in case anyone suggests otherwise. Thanks Guy
Re: SV: Log files polluted with permission denied error messages after every 10 seconds
FWIW, this looks the same issue I am getting (reported last month: https://www.postgresql.org/message-id/f444a84e-2d29-55f9-51a6-a5dcea3bc253%40burgess.co.nz) I get the same Process Monitor output, including BUFFER OVERFLOW entries. No sign of any process other than postgres.exe touching the WAL files. Regards, Guy On 18/03/2021 2:26 am, Andrus wrote: Hi! I think what it would be most helpful to run "process monitor", so that you get a log of the system calls and errors; perhaps that shows some details that we cannot get from the error message. Process monitor shows huge number of DELETE PENDING results from CreateFile operation: 15:22:35,1708187 postgres.exe 11800 CreateFile C:\Program Files\PostgreSQL\13\data\pg_wal\00010011003B DELETE PENDING Desired Access: Read Attributes, Delete, Synchronize, Disposition: Open, Options: Synchronous IO Non-Alert, Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a 15:22:35,2713856 postgres.exe 11800 CreateFile C:\Program Files\PostgreSQL\13\data\pg_wal\00010011003B DELETE PENDING Desired Access: Read Attributes, Delete, Synchronize, Disposition: Open, Options: Synchronous IO Non-Alert, Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a 15:22:35,3719483 postgres.exe 11800 CreateFile C:\Program Files\PostgreSQL\13\data\pg_wal\00010011003B DELETE PENDING Desired Access: Read Attributes, Delete, Synchronize, Disposition: Open, Options: Synchronous IO Non-Alert, Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a 15:22:35,4723624 postgres.exe 11800 CreateFile C:\Program Files\PostgreSQL\13\data\pg_wal\00010011003B DELETE PENDING Desired Access: Read Attributes, Delete, Synchronize, Disposition: Open, Options: Synchronous IO Non-Alert, Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a Andrus.
Re: SV: Log files polluted with permission denied error messages after every 10 seconds
Thanks - yes pleas send me a private link and I will try to reproduce it in a test environment and then test the patch. Regards, Guy On 18/03/2021 1:08 pm, Michael Paquier wrote: On Thu, Mar 18, 2021 at 12:25:45PM +1300, Guy Burgess wrote: FWIW, this looks the same issue I am getting (reported last month: https://www.postgresql.org/message-id/f444a84e-2d29-55f9-51a6-a5dcea3bc253%40burgess.co.nz) Yep. I get the same Process Monitor output, including BUFFER OVERFLOW entries. No sign of any process other than postgres.exe touching the WAL files. Guy, do you have an environment where this is still happening and where you could test a potential fix? We are not sure yet what's causing that, but one code path has changed in this area, involving CreateHardLinkA()+_unlink() instead of a single rename when attempting to recycle a segment. And I am just in a mood to build things by myself and send some links to people to be able to download and test that, so one more is fine.. -- Michael