Log files polluted with permission denied error messages after every 10 seconds
Hi! PostgreSQL 13.1 is installed on a Windows server. There are about 100 users.

Every day, the log files contain a huge number of permission denied error messages like:

2021-03-04 01:38:49.818 EET [4580] LOG: could not rename file "pg_wal/00010005001B": Permission denied
2021-03-04 01:48:42.725 EET [4580] LOG: could not rename file "pg_wal/00010005001B": Permission denied
2021-03-04 01:53:22.427 EET [4580] LOG: could not rename file "pg_wal/00010005001B": Permission denied
2021-03-04 01:58:19.623 EET [4580] LOG: could not rename file "pg_wal/00010005001B": Permission denied
2021-03-04 02:03:24.080 EET [4580] LOG: could not rename file "pg_wal/00010005001B": Permission denied
2021-03-04 02:43:17.983 EET [7764] LOG: invalid length of startup packet
2021-03-04 02:43:21.634 EET [4580] LOG: could not rename file "pg_wal/00010005001B": Permission denied
2021-03-04 02:48:25.048 EET [4580] LOG: could not rename file "pg_wal/00010005001B": Permission denied
2021-03-04 02:53:19.229 EET [4580] LOG: could not rename file "pg_wal/00010005001B": Permission denied
2021-03-04 02:58:19.176 EET [4580] LOG: could not rename file "pg_wal/00010005001B": Permission denied
2021-03-04 03:03:19.136 EET [4580] LOG: could not rename file "pg_wal/00010005001B": Permission denied
2021-03-04 03:29:33.926 EET [2460] FATAL: expected SASL response, got message type 0
2021-03-04 03:36:04.995 EET [4816] LOG: could not rename temporary statistics file "pg_stat_tmp/global.tmp" to "pg_stat_tmp/global.stat": Permission denied
2021-03-04 03:36:09.666 EET [10340] LOG: using stale statistics instead of current ones because stats collector is not responding
2021-03-04 03:53:20.036 EET [4580] LOG: could not rename file "pg_wal/00010005001B": Permission denied
2021-03-04 04:03:19.001 EET [4580] LOG: could not rename file "pg_wal/00010005001B": Permission denied

In the middle of the day, when usage is intensive, these appear exactly every 10 seconds:

2021-03-05 12:08:40.001 EET [4580] LOG: could not rename file "pg_wal/000100050078": Permission denied
2021-03-05 12:08:50.059 EET [4580] LOG: could not rename file "pg_wal/00010005007D": Permission denied
2021-03-05 12:09:00.115 EET [4580] LOG: could not rename file "pg_wal/000100050082": Permission denied
2021-03-05 12:09:10.171 EET [4580] LOG: could not rename file "pg_wal/000100050088": Permission denied
2021-03-05 12:09:20.224 EET [4580] LOG: could not rename file "pg_wal/0001000500A4": Permission denied
2021-03-05 12:09:30.281 EET [4580] LOG: could not rename file "pg_wal/0001000500A8": Permission denied
2021-03-05 12:09:40.339 EET [4580] LOG: could not rename file "pg_wal/0001000500B0": Permission denied
2021-03-05 12:09:50.397 EET [4580] LOG: could not rename file "pg_wal/0001000500E7": Permission denied
2021-03-05 12:10:00.456 EET [4580] LOG: could not rename file "pg_wal/0001000500E8": Permission denied
2021-03-05 12:10:10.514 EET [4580] LOG: could not rename file "pg_wal/0001000500E9": Permission denied
2021-03-05 12:10:20.570 EET [4580] LOG: could not rename file "pg_wal/0001000500EB": Permission denied
2021-03-05 12:10:30.626 EET [4580] LOG: could not rename file "pg_wal/0001000500EC": Permission denied

How can I fix this? The database appears to be working normally, and users haven't reported any issues.

Andrus.
Re: Log files polluted with permission denied error messages after every 10 seconds
On Fri, 2021-03-05 at 12:13 +0200, Andrus wrote:
> PostgreSQL 13.1 is installed on a Windows server. There are about 100 users.
>
> Every day, the log files contain a huge number of permission denied error
> messages like:
>
> 2021-03-04 01:38:49.818 EET [4580] LOG: could not rename file
> "pg_wal/00010005001B": Permission denied

Windows? Don't let anti-virus software mess with the data directory.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
Re: Postgres Analog of Oracle APPEND hint
On 2021-02-25 14:52:20 -0500, Rumpi Gravenstein wrote:
> My use case involves complicated joins on source tables in one schema
> loading a target table in the same or a different schema.

So source(s) and target are in the same database? That wasn't clear from
your earlier mails.

If you are doing "complicated joins on source tables", that's probably
where the bottleneck will be, so you shouldn't worry about the insert
speed unless (or until) you notice that the bottleneck is writing the
data, not reading it.

hp

--
Peter J. Holzer | h...@hjp.at | http://www.hjp.at/
"Story must make more sense than reality." -- Charles Stross, "Creative
writing challenge!"
Re: Log files polluted with permission denied error messages after every 10 seconds
Hi!

> Windows? Don't let anti-virus software mess with the data directory.

The default Windows Defender is active. I excluded the data and pg_wal
folders and the postgres process, then turned real-time protection off.

The problem persists. A new entry is written every 10 seconds.

pg_wal also contains files with a .deleted extension, like

0001000500B2.deleted

Andrus.
Re: Postgres Analog of Oracle APPEND hint
On Thu, Feb 25, 2021 at 10:26 AM Rumpi Gravenstein wrote:
> Unfortunately, I am not looking to load from an external source. My
> process is moving data from source PostgreSQL tables to target
> PostgreSQL tables.

INSERT INTO ... SELECT ... is one of the fastest ways possible to move
data around.

merlin
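As a hedged illustration of Merlin's point (SQLite stands in for PostgreSQL here, and the table names are invented): the whole transfer runs as one statement inside the database engine, rather than fetching rows into the client and re-inserting them one by one.

```python
import sqlite3

# Invented schema for demonstration only; the SQL shape is the same in
# PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source (id INTEGER, val TEXT)")
conn.execute("CREATE TABLE target (id INTEGER, val TEXT)")
conn.executemany("INSERT INTO source VALUES (?, ?)",
                 [(i, f"row-{i}") for i in range(1000)])

# The transfer happens entirely inside the engine -- no client round trips.
conn.execute("INSERT INTO target SELECT id, val FROM source WHERE id % 2 = 0")
conn.commit()
print(conn.execute("SELECT count(*) FROM target").fetchone()[0])  # 500
```

The same single-statement approach also lets the planner optimize the "complicated joins" feeding the insert as one unit.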
Unkillable processes creating millions of tiny temp files
Greetings! We are running postgres 11.9 (were running 11.7 prior to a recent restart) on a large db (10s of TB) with 5 or 6 tablespaces and 1000s of tables/indexes. Within the past few days we have started to see a few queries running for over 8 hours which we then attempt to terminate, but which will not terminate. These queries are also generating hundreds of thousands of tiny/empty temp files. In fact, before the restart there were over 23 million files in pg_tmp, which were removed. We have also verified that no server settings have changed for at least a couple of weeks, well before this issue started happening in the past few days.

Looking back a while at the same query a couple of weeks ago, we see this (large file sizes):

2021-02-18 12:01:59.195 GMT,"foo_user","mydb",81780,"0.0.0.0:49926",602e5716.13f74,3,"SELECT",2021-02-18 12:01:26 GMT,164/1009590,0,LOG,0,"temporary file: path ""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp81780.0"", size 21299836",,"
2021-02-18 12:01:59.446 GMT,"foo_user","mydb",81780,"0.0.0.0:49926",602e5716.13f74,4,"SELECT",2021-02-18 12:01:26 GMT,164/1009590,0,LOG,0,"temporary file: path ""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp81780.5"", size 4138272",,"
2021-02-18 12:01:59.496 GMT,"foo_user","mydb",81780,"0.0.0.0:49926",602e5716.13f74,5,"SELECT",2021-02-18 12:01:26 GMT,164/1009590,0,LOG,0,"temporary file: path ""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp81780.1"", size 10706416",,"
2021-02-18 12:01:59.747 GMT,"foo_user","mydb",81780,"0.0.0.0:49926",602e5716.13f74,6,"SELECT",2021-02-18 12:01:26 GMT,164/1009590,0,LOG,0,"temporary file: path ""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp81780.3"", size 4150920",,"
2021-02-18 12:01:59.797 GMT,"foo_user","mydb",81780,"0.0.0.0:49926",602e5716.13f74,7,"SELECT",2021-02-18 12:01:26 GMT,164/1009590,0,LOG,0,"temporary file: path ""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp81780.2"", size 10660908",,"
2021-02-18 12:02:00.050 GMT,"foo_user","mydb",81780,"0.0.0.0:49926",602e5716.13f74,8,"SELECT",2021-02-18 12:01:26 GMT,164/1009590,0,LOG,0,"temporary file: path ""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp81780.4"", size 4132224",,"

But here is what we see presently (tiny files):

2021-03-05 21:30:52.712 GMT,"foo_user","mydb",67713,"0.0.0.0:48482",60424a1d.10881,73562,"SELECT",2021-03-05 15:11:25 GMT,80/2860,0,LOG,0,"temporary file: path ""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp67713.177717"", size 24",,"
2021-03-05 21:30:52.735 GMT,"foo_user","mydb",67713,"0.0.0.0:48482",60424a1d.10881,73563,"SELECT",2021-03-05 15:11:25 GMT,80/2860,0,LOG,0,"temporary file: path ""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp67713.990067"", size 92",,"
2021-03-05 21:30:52.950 GMT,"foo_user","mydb",67713,"0.0.0.0:48482",60424a1d.10881,73564,"SELECT",2021-03-05 15:11:25 GMT,80/2860,0,LOG,0,"temporary file: path ""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp67713.490"", size 24",,"
2021-03-05 21:30:53.072 GMT,"foo_user","mydb",67713,"0.0.0.0:48482",60424a1d.10881,73565,"SELECT",2021-03-05 15:11:25 GMT,80/2860,0,LOG,0,"temporary file: path ""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp67713.800016"", size 140",,"
2021-03-05 21:30:53.522 GMT,"foo_user","mydb",67713,"0.0.0.0:48482",60424a1d.10881,73566,"SELECT",2021-03-05 15:11:25 GMT,80/2860,0,LOG,0,"temporary file: path ""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp67713.500266"", size 48",,"

After the restart of the db and the upgrade to 11.9, we see the same problem. There are 2 separate queries that have had this behavior, with seemingly unrelated objects.
Here is the plan for one of them:

QUERY PLAN
--
Aggregate (cost=2654520.75..2654520.76 rows=1 width=336)
  -> Hash Left Join (cost=1087907.49..2019675.12 rows=14107680 width=69)
       Hash Cond: (t1.team_stadium_id = ra.team_stadium_id)
       -> Hash Left Join (cost=1079740.68..1955870.37 rows=14107680 width=69)
            Hash Cond: (t1.team_stadium_id = frg.team_stadium_id)
            -> Hash Right Join (cost=1073614.70..1894916.60 rows=14107680 width=69)
                 Hash Cond: (sm_1.id = t1.id)
                 -> Hash Join (cost=47373.90..695007.85 rows=170011 width=4)
                      Hash Cond: (sm_1.id = l.id)
                      -> Index Scan using index_foo_fact_on_id_first_win on foo_fact sm_1 (cost=0.42..647177.59 rows=174010 width=12)
                           Filter: (CASE WHEN (basketball_renewal_date <= now()) THEN true ELSE false END AND (NOT CASE WHEN (basketball_renewal_date <= (now() - '5 days'::interval)) THEN false ELSE true END))
                      -> Hash (cost=34408.
Re: Unkillable processes creating millions of tiny temp files
Jeremy Finzel writes:
> We are running postgres 11.9 (were running 11.7 prior to recent restart)
> on a large db (10s of TB) with 5 or 6 tablespaces and 1000s of
> tables/indexes. Within the past few days we have started to see a few
> queries running for over 8 hours which we then attempt to terminate, but
> will not terminate. These queries are also generating hundreds of
> thousands of tiny/empty temp files. In fact, before the restart there
> were over 23 million files in pg_tmp which were removed. We also have
> verified no server settings have changed for at least a couple weeks,
> well before this issue started happening only in the past few days.

Hm. For the query plan you show, I think the only plausible explanation
for using temp files is that a hash join's hash table is exceeding
work_mem, so it's spilling batches of tuples to disk. With some bad luck
those could be small, not large. But I don't see anything in our commit
logs between 11.7 and 11.9 that looks like it would have affected any of
that behavior. (There were some changes to code governing temp-file
tablespace selection, but that could only affect where the files get put,
not how big they are.) So I doubt that this can be blamed on the update,
especially since, if I read you correctly, it didn't start happening
immediately after the update.

I'm wondering about changes in table statistics possibly causing a
poorly-chosen change in the hashing parameters. What have you got
work_mem set to? Can you comment on whether the estimated rowcounts
shown in the query plan are accurate? Does manually ANALYZE'ing the
tables used in the query change anything?

The "unkillable" aspect is odd, but I wonder if that's just a red
herring. A query that's generated lots of temp files will try to clean
them up at termination, so maybe the backend is just sitting there
removing temp files before it'll give control back.

regards, tom lane
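The spilling behavior Tom describes can be sketched as a toy grace hash join — this is not PostgreSQL's implementation, just the general idea under invented names: each input is partitioned into batch files by hashing the join key, and with many batches relative to the data, most batch files end up tiny or empty.

```python
import pickle
import tempfile

def partitioned_hash_join(build_rows, probe_rows, nbatches, key=lambda r: r[0]):
    """Toy grace hash join: each side is partitioned into nbatches temp
    files by key hash, then matching batch pairs are joined in memory.
    With a large nbatches and skewed keys, many batch files stay tiny or
    empty -- the effect suspected in this thread."""
    def spill(rows):
        files = [tempfile.TemporaryFile() for _ in range(nbatches)]
        for r in rows:
            pickle.dump(r, files[hash(key(r)) % nbatches])
        return files

    def load(f):
        f.seek(0)
        out = []
        while True:
            try:
                out.append(pickle.load(f))
            except EOFError:
                return out

    build_files, probe_files = spill(build_rows), spill(probe_rows)
    result = []
    for bf, pf in zip(build_files, probe_files):
        table = {}
        for r in load(bf):                      # build the in-memory hash table
            table.setdefault(key(r), []).append(r)
        for r in load(pf):                      # probe it with the other batch
            for m in table.get(key(r), []):
                result.append((m, r))
    return result

pairs = partitioned_hash_join([(1, "a"), (2, "b"), (3, "c")],
                              [(2, "x"), (3, "y"), (4, "z")],
                              nbatches=8)
print(sorted(pairs))  # [((2, 'b'), (2, 'x')), ((3, 'c'), (3, 'y'))]
```

In the real executor the batch count is chosen from the row estimates and work_mem, which is why a poorly-chosen hashing parameter (Tom's suspicion) can explode the file count.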
Re: Unkillable processes creating millions of tiny temp files
Hi Tom, thanks for the quick response; a few remarks below. I work at the
same site that Jeremy does, and we're both looking at this today.

Tom Lane writes:
> Jeremy Finzel writes:
>> We are running postgres 11.9 (were running 11.7 prior to recent
>> restart) on a large db (10s of TB) with 5 or 6 tablespaces and 1000s
>> of tables/indexes.

Actually it's v11.11 now, but in any case, the issue was evident before
and after the minor version update.

>> Within the past few days we have started to see a few queries running
>> for over 8 hours which we then attempt to terminate, but will not
>> terminate. These queries are also generating hundreds of thousands of
>> tiny/empty temp files. In fact, before the restart there were over 23
>> million files in pg_tmp which were removed. We also have verified no
>> server settings have changed for at least a couple weeks, well before
>> this issue started happening only in the past few days.
>
> Hm. For the query plan you show, I think the only plausible explanation
> for using temp files is that a hash join's hash table is exceeding
> work_mem, so it's spilling batches of tuples to disk. With some bad
> luck those could be small, not large. But I don't see anything in our
> commit logs between 11.7 and 11.9 that looks like it would have
> affected any of that behavior. (There were some changes to code
> governing temp-file tablespace selection, but that could only affect
> where the files get put, not how big they are.) So I doubt that this
> can be blamed on the update, especially since, if I read you correctly,
> it didn't start happening immediately after the update.

It started happening a couple of days ago for no obvious reason, so bad
stats are one of my theories, and we're running a defensive ANALYZE
through the entire system to rule it out.

> I'm wondering about changes in table statistics possibly causing a
> poorly-chosen change in the hashing parameters. What have you got
> work_mem set to?

work_mem is 1GB. We are not aware of the client overriding this in their
code, but it's possible.

> Can you comment on whether the estimated rowcounts shown in the query
> plan are accurate? Does manually ANALYZE'ing the tables used in the
> query change anything?

These other points are still under investigation.

> The "unkillable" aspect is odd, but I wonder if that's just a red
> herring. A query that's generated lots of temp files will try to clean
> them up at termination, so maybe the backend is just sitting there
> removing temp files before it'll give control back.

I believe this is confirmed. I see that the backends, after being
sig-term'd, are now cycling through unlinks, as seen by strace -p $pid.
None too quickly, I might add, and as mentioned earlier, the number of
these files is in the millions, so it's hard to predict when cleanup will
finish.

As we did one night prior, a hard shutdown got the rogue jobs closed a
lot more quickly, and we noticed that on the recovery restart, Pg took
about 15 minutes to clear >23M files from the temp area. I assume we'd
experience the same again if we need to do it. I'm going to leave it
alone a while longer before taking action.

Thanks again

> regards, tom lane

--
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consult...@comcast.net
Re: Unkillable processes creating millions of tiny temp files
Jerry Sievers writes:
> Tom Lane writes:
>> The "unkillable" aspect is odd, but I wonder if that's just a red
>> herring. A query that's generated lots of temp files will try to
>> clean them up at termination, so maybe the backend is just sitting
>> there removing temp files before it'll give control back.

> I believe this is confirmed. I see that the backend after being
> sig-term'd are now cycling through unlinks as seen by strace -p $pid.
> None too quickly I might add and as mentioned earlier, the number of
> these files is in the millions so it's hard to predict when cleanup
> will finish.

Hm. I don't recall exactly what sort of data structure we use to track
open temp files, but it's certainly plausible that it's not too efficient
with millions of temp files :-(

> As we did one night prior, a hard shutdown got the rogue jobs closed a
> lot more quickly and we noticed that on the recovery restart, Pg took
> about 15 minutes to clear >23M files from the temp area. I assume we'd
> experience the same again if we need to do it.

Not sure how fast that is either. If you need to do it again, you could
try manually rm'ing everything under the pgsql_tmp directory before
letting the postmaster start.

regards, tom lane
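That manual cleanup can be sketched as follows, exercised here in a scratch directory rather than a real data directory (the layout mimics the pg_tblspc paths from the logs, but the paths, PID, and file counts are invented; on a real system the server must be stopped first).

```python
import os
import tempfile

# Scratch stand-in for $PGDATA with two pgsql_tmp locations, like the
# tablespace paths in the log excerpts.
pgdata = tempfile.mkdtemp()
tmpdirs = [
    os.path.join(pgdata, "base", "pgsql_tmp"),
    os.path.join(pgdata, "pg_tblspc", "16428", "PG_11_201809051", "pgsql_tmp"),
]
for d in tmpdirs:
    os.makedirs(d)
    for i in range(3):                       # simulate leftover temp files
        open(os.path.join(d, f"pgsql_tmp999.{i}"), "w").close()

# Remove only files matching the pgsql_tmp naming pattern, leaving the
# directories themselves in place for the restarted server.
removed = 0
for d in tmpdirs:
    for name in os.listdir(d):
        if name.startswith("pgsql_tmp"):
            os.unlink(os.path.join(d, name))
            removed += 1
print(removed)  # 6
```

With tens of millions of files, even this loop is dominated by per-file unlink cost, which is consistent with the ~15-minute recovery cleanup mentioned above.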
Re: libpq pipelineing
Hello Samuel,

On 2020-Jun-27, Samuel Williams wrote:
> I found some discussion in the past relating to batch processing which
> appears to support some kind of pipelining:
>
> https://2ndquadrant.github.io/postgres/libpq-batch-mode.html

I just noticed this old thread of yours. I've been working on getting
the work you linked to polished a little bit and completed for
PostgreSQL 14. If you'd like to give it a try, your input would be very
useful to me. You can find the patch here (it applies on the current
master branch):

https://postgr.es/m/20210306003559.GA1375@alvherre.pgsql

(If you need help building, please ping me in private email.)

The patched libpq version is compatible with older servers.

Thanks

--
Álvaro Herrera
Valdivia, Chile
Re: Log files polluted with permission denied error messages after every 10 seconds
On Fri, Mar 05, 2021 at 07:36:37PM +0200, Andrus wrote:
> Then turned real-time protection off.
>
> The problem persists. A new entry is written every 10 seconds.

On which files are those complaints? It seems to me that you may have
more going on in this system that interacts with your data folder than
you think.

> pg_wal also contains files with a .deleted extension, like
>
> 0001000500B2.deleted

These are generated on Windows when removing a past WAL segment. The
process involves a rename followed by durable_unlink(), which would
generate some LOG entries if the internal unlink() failed (see
RemoveXlogFile() in xlog.c).
--
Michael
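A toy re-enactment of the two-step removal Michael describes (this is not PostgreSQL's actual C code, and the segment name is invented): the segment is first renamed to a .deleted name, then unlinked. A failure at the unlink step — on Windows, typically another process such as an antivirus scanner holding the file open — is what leaves .deleted files behind and a LOG line in the server log.

```python
import os
import tempfile

# Scratch stand-in for a pg_wal directory with one hypothetical segment.
wal_dir = tempfile.mkdtemp()
segment = os.path.join(wal_dir, "000000010000000500000001")
open(segment, "w").close()

# Step 1: rename the segment out of the way.
renamed = segment + ".deleted"
os.rename(segment, renamed)

# Step 2: unlink it. If this fails (e.g. the file is still held open by
# another process on Windows), the .deleted file lingers and the server
# logs the failure -- matching what Andrus sees in pg_wal.
try:
    os.unlink(renamed)
except OSError:
    pass

print(os.listdir(wal_dir))  # [] when the unlink succeeds
```

The rename-first approach means a segment whose deletion fails at least stops being a candidate WAL file name, at the cost of the leftover .deleted entries observed in this thread.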