Log files polluted with permission denied error messages after every 10 seconds

2021-03-05 Thread Andrus

Hi!

Postgres 13.1 is installed on a Windows server. There are about 100 users.

Log files contain a huge number of "permission denied" error messages 
every day, like:


2021-03-04 01:38:49.818 EET [4580] LOG:  could not rename file 
"pg_wal/00010005001B": Permission denied
2021-03-04 01:48:42.725 EET [4580] LOG:  could not rename file 
"pg_wal/00010005001B": Permission denied
2021-03-04 01:53:22.427 EET [4580] LOG:  could not rename file 
"pg_wal/00010005001B": Permission denied
2021-03-04 01:58:19.623 EET [4580] LOG:  could not rename file 
"pg_wal/00010005001B": Permission denied
2021-03-04 02:03:24.080 EET [4580] LOG:  could not rename file 
"pg_wal/00010005001B": Permission denied

2021-03-04 02:43:17.983 EET [7764] LOG:  invalid length of startup packet
2021-03-04 02:43:21.634 EET [4580] LOG:  could not rename file 
"pg_wal/00010005001B": Permission denied
2021-03-04 02:48:25.048 EET [4580] LOG:  could not rename file 
"pg_wal/00010005001B": Permission denied
2021-03-04 02:53:19.229 EET [4580] LOG:  could not rename file 
"pg_wal/00010005001B": Permission denied
2021-03-04 02:58:19.176 EET [4580] LOG:  could not rename file 
"pg_wal/00010005001B": Permission denied
2021-03-04 03:03:19.136 EET [4580] LOG:  could not rename file 
"pg_wal/00010005001B": Permission denied
2021-03-04 03:29:33.926 EET [2460] FATAL:  expected SASL response, got 
message type 0
2021-03-04 03:36:04.995 EET [4816] LOG:  could not rename temporary 
statistics file "pg_stat_tmp/global.tmp" to "pg_stat_tmp/global.stat": 
Permission denied
2021-03-04 03:36:09.666 EET [10340] LOG:  using stale statistics instead 
of current ones because stats collector is not responding
2021-03-04 03:53:20.036 EET [4580] LOG:  could not rename file 
"pg_wal/00010005001B": Permission denied
2021-03-04 04:03:19.001 EET [4580] LOG:  could not rename file 
"pg_wal/00010005001B": Permission denied


In the middle of the day, when usage is intensive, these appear exactly 
every 10 seconds:


2021-03-05 12:08:40.001 EET [4580] LOG:  could not rename file 
"pg_wal/000100050078": Permission denied
2021-03-05 12:08:50.059 EET [4580] LOG:  could not rename file 
"pg_wal/00010005007D": Permission denied
2021-03-05 12:09:00.115 EET [4580] LOG:  could not rename file 
"pg_wal/000100050082": Permission denied
2021-03-05 12:09:10.171 EET [4580] LOG:  could not rename file 
"pg_wal/000100050088": Permission denied
2021-03-05 12:09:20.224 EET [4580] LOG:  could not rename file 
"pg_wal/0001000500A4": Permission denied
2021-03-05 12:09:30.281 EET [4580] LOG:  could not rename file 
"pg_wal/0001000500A8": Permission denied
2021-03-05 12:09:40.339 EET [4580] LOG:  could not rename file 
"pg_wal/0001000500B0": Permission denied
2021-03-05 12:09:50.397 EET [4580] LOG:  could not rename file 
"pg_wal/0001000500E7": Permission denied
2021-03-05 12:10:00.456 EET [4580] LOG:  could not rename file 
"pg_wal/0001000500E8": Permission denied
2021-03-05 12:10:10.514 EET [4580] LOG:  could not rename file 
"pg_wal/0001000500E9": Permission denied
2021-03-05 12:10:20.570 EET [4580] LOG:  could not rename file 
"pg_wal/0001000500EB": Permission denied
2021-03-05 12:10:30.626 EET [4580] LOG:  could not rename file 
"pg_wal/0001000500EC": Permission denied


How can I fix this?

It looks like the database is working normally; users haven't reported 
any issues.


Andrus.



Re: Log files polluted with permission denied error messages after every 10 seconds

2021-03-05 Thread Laurenz Albe
On Fri, 2021-03-05 at 12:13 +0200, Andrus wrote:
> Postgres 13.1 is installed on a Windows server. There are about 100 users.
> 
> Log files contain a huge number of "permission denied" error messages
> every day, like:
> 
> 2021-03-04 01:38:49.818 EET [4580] LOG:  could not rename file 
> "pg_wal/00010005001B": Permission denied

Windows?

Don't let anti-virus software mess with the data directory.
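
In case it helps, exclusions can be added from an elevated PowerShell
prompt. This is only a sketch: the path assumes a default PostgreSQL 13
install location and must be adjusted to the actual data directory.

```powershell
# Hypothetical paths -- verify against the real installation before running.
Add-MpPreference -ExclusionPath "C:\Program Files\PostgreSQL\13\data"
Add-MpPreference -ExclusionProcess "postgres.exe"
```

Note that third-party anti-virus products need their own, product-specific
exclusions; Defender settings do not cover them.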

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Postgres Analog of Oracle APPEND hint

2021-03-05 Thread Peter J. Holzer
On 2021-02-25 14:52:20 -0500, Rumpi Gravenstein wrote:
> My use case involves complicated joins on source tables in one schema
> loading a target table in the same or a different schema.

So source(s) and target are in the same database? That wasn't clear from
your earlier mails.

If you are doing "complicated joins on source tables" that's probably
where the bottleneck will be, so you shouldn't worry about the insert
speed unless (or until) you notice that the bottleneck is writing the
data, not reading it.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"




Re: Log files polluted with permission denied error messages after every 10 seconds

2021-03-05 Thread Andrus

Hi!

> Windows? Don't let anti-virus software mess with the data directory.

The default Windows Defender is active. I excluded the data and pg_wal 
folders and the postgres process:



Then turned real-time protection off:

The problem persists. A new entry is written every 10 seconds.

pg_wal also contains files with a .deleted extension, like

0001000500B2.deleted

Andrus.




Re: Postgres Analog of Oracle APPEND hint

2021-03-05 Thread Merlin Moncure
On Thu, Feb 25, 2021 at 10:26 AM Rumpi Gravenstein  wrote:
>
> Unfortunately, I am not looking to load from an external source.  My process 
> is moving data from source PostgreSQL tables to target PostgreSQL tables.


INSERT INTO ...SELECT ... is one of the fastest ways possible to move
data around.
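
As a self-contained illustration of that pattern (this uses Python's
bundled sqlite3 driver purely so the sketch runs anywhere; with
PostgreSQL the SQL statement is the same, and the table names here are
invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE source_t (id INTEGER, val TEXT)")
cur.execute("CREATE TABLE target_t (id INTEGER, val TEXT)")
cur.executemany("INSERT INTO source_t VALUES (?, ?)",
                [(i, "row-%d" % i) for i in range(1000)])

# One set-oriented statement does the move entirely inside the database:
# no per-row round trips, which is why it is usually the fastest option.
cur.execute("INSERT INTO target_t SELECT id, val FROM source_t WHERE id % 2 = 0")
conn.commit()
print(cur.execute("SELECT count(*) FROM target_t").fetchone()[0])  # prints 500
```

With PostgreSQL specifically, wrapping the statement in a single
transaction (or creating the target with CREATE TABLE ... AS) avoids
per-statement overhead as well.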

merlin




Unkillable processes creating millions of tiny temp files

2021-03-05 Thread Jeremy Finzel
Greetings!

We are running postgres 11.9 (were running 11.7 prior to recent restart) on
a large db (10s of TB) with 5 or 6 tablespaces and 1000s of tables/indexes.

Within the past few days we have started to see a few queries running for
over 8 hours which we then attempt to terminate, but will not terminate.
These queries are also generating hundreds of thousands of tiny/empty temp
files.  In fact, before the restart there were over 23 million files in
pg_tmp which were removed.  We also have verified no server settings have
changed for at least a couple weeks, well before this issue started
happening only in the past few days.

Looking back awhile at the same query a couple weeks ago, we see this
(large file sizes):
2021-02-18 12:01:59.195
GMT,"foo_user","mydb",81780,"0.0.0.0:49926",602e5716.13f74,3,"SELECT",2021-02-18
12:01:26 GMT,164/1009590,0,LOG,0,"temporary file: path
""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp81780.0"", size
21299836",,"
2021-02-18 12:01:59.446
GMT,"foo_user","mydb",81780,"0.0.0.0:49926",602e5716.13f74,4,"SELECT",2021-02-18
12:01:26 GMT,164/1009590,0,LOG,0,"temporary file: path
""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp81780.5"", size
4138272",,"
2021-02-18 12:01:59.496
GMT,"foo_user","mydb",81780,"0.0.0.0:49926",602e5716.13f74,5,"SELECT",2021-02-18
12:01:26 GMT,164/1009590,0,LOG,0,"temporary file: path
""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp81780.1"", size
10706416",,"
2021-02-18 12:01:59.747
GMT,"foo_user","mydb",81780,"0.0.0.0:49926",602e5716.13f74,6,"SELECT",2021-02-18
12:01:26 GMT,164/1009590,0,LOG,0,"temporary file: path
""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp81780.3"", size
4150920",,"
2021-02-18 12:01:59.797
GMT,"foo_user","mydb",81780,"0.0.0.0:49926",602e5716.13f74,7,"SELECT",2021-02-18
12:01:26 GMT,164/1009590,0,LOG,0,"temporary file: path
""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp81780.2"", size
10660908",,"
2021-02-18 12:02:00.050
GMT,"foo_user","mydb",81780,"0.0.0.0:49926",602e5716.13f74,8,"SELECT",2021-02-18
12:01:26 GMT,164/1009590,0,LOG,0,"temporary file: path
""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp81780.4"", size
4132224",,"

But here is what we see presently (tiny files):
2021-03-05 21:30:52.712
GMT,"foo_user","mydb",67713,"0.0.0.0:48482",60424a1d.10881,73562,"SELECT",2021-03-05
15:11:25 GMT,80/2860,0,LOG,0,"temporary file: path
""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp67713.177717"", size
24",,"
2021-03-05 21:30:52.735
GMT,"foo_user","mydb",67713,"0.0.0.0:48482",60424a1d.10881,73563,"SELECT",2021-03-05
15:11:25 GMT,80/2860,0,LOG,0,"temporary file: path
""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp67713.990067"", size
92",,"
2021-03-05 21:30:52.950
GMT,"foo_user","mydb",67713,"0.0.0.0:48482",60424a1d.10881,73564,"SELECT",2021-03-05
15:11:25 GMT,80/2860,0,LOG,0,"temporary file: path
""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp67713.490"", size
24",,"
2021-03-05 21:30:53.072
GMT,"foo_user","mydb",67713,"0.0.0.0:48482",60424a1d.10881,73565,"SELECT",2021-03-05
15:11:25 GMT,80/2860,0,LOG,0,"temporary file: path
""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp67713.800016"", size
140",,"
2021-03-05 21:30:53.522
GMT,"foo_user","mydb",67713,"0.0.0.0:48482",60424a1d.10881,73566,"SELECT",2021-03-05
15:11:25 GMT,80/2860,0,LOG,0,"temporary file: path
""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp67713.500266"", size
48",,"

After the restart of the db and upgrade to 11.9, we see the same problem.

There are 2 separate queries that have had this behavior with seemingly
unrelated objects.  Here is the plan for one of them:


QUERY PLAN
--
 Aggregate  (cost=2654520.75..2654520.76 rows=1 width=336)
   ->  Hash Left Join  (cost=1087907.49..2019675.12 rows=14107680 width=69)
 Hash Cond: (t1.team_stadium_id = ra.team_stadium_id)
 ->  Hash Left Join  (cost=1079740.68..1955870.37 rows=14107680
width=69)
   Hash Cond: (t1.team_stadium_id = frg.team_stadium_id)
   ->  Hash Right Join  (cost=1073614.70..1894916.60
rows=14107680 width=69)
 Hash Cond: (sm_1.id = t1.id)
 ->  Hash Join  (cost=47373.90..695007.85 rows=170011
width=4)
   Hash Cond: (sm_1.id = l.id)
   ->  Index Scan using
index_foo_fact_on_id_first_win on foo_fact sm_1  (cost=0.42..647177.59
rows=174010 width=12)
 Filter: (CASE WHEN
(basketball_renewal_date <= now()) THEN true ELSE false END AND (NOT CASE
WHEN (basketball_renewal_date <= (now() - '5 days'::interval)) THEN false
ELSE true END))
   ->  Hash  (cost=34408.

Re: Unkillable processes creating millions of tiny temp files

2021-03-05 Thread Tom Lane
Jeremy Finzel  writes:
> We are running postgres 11.9 (were running 11.7 prior to recent restart) on
> a large db (10s of TB) with 5 or 6 tablespaces and 1000s of tables/indexes.

> Within the past few days we have started to see a few queries running for
> over 8 hours which we then attempt to terminate, but will not terminate.
> These queries are also generating hundreds of thousands of tiny/empty temp
> files.  In fact, before the restart there were over 23 million files in
> pg_tmp which were removed.  We also have verified no server settings have
> changed for at least a couple weeks, well before this issue started
> happening only in the past few days.

Hm.  For the query plan you show, I think the only plausible explanation
for using temp files is that a hash join's hash table is exceeding
work_mem so it's spilling batches of tuples to disk.  With some bad
luck those could be small not large.  But I don't see anything in
our commit logs between 11.7 and 11.9 that looks like it would have
affected any of that behavior.  (There were some changes to code
governing temp-file tablespace selection, but that could only affect
where the files get put not how big they are.)  So I doubt that this
can be blamed on the update, especially since if I read you correctly
it didn't start happening immediately after the update.
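
A toy sketch of the spill behavior described above (all numbers are
invented; a real hash join sizes batches from work_mem in bytes and can
repartition, so this only mimics the shape of the problem, namely that
a large batch count with little data per batch yields many tiny files):

```python
import os
import tempfile

work_mem_tuples = 100                 # pretend work_mem holds 100 tuples
tuples = list(range(10_000))
nbatches = max(1, len(tuples) // work_mem_tuples)   # 100 batch files

spill_dir = tempfile.mkdtemp(prefix="pgsql_tmp_demo")
files = [open(os.path.join(spill_dir, "batch.%d" % i), "w")
         for i in range(nbatches)]
for t in tuples:
    # Each tuple is routed to its batch file by hash -- the "spill".
    files[hash(t) % nbatches].write("%d\n" % t)
for f in files:
    f.close()

sizes = [os.path.getsize(os.path.join(spill_dir, "batch.%d" % i))
         for i in range(nbatches)]
print(len(sizes), min(sizes) > 0)     # prints: 100 True
```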

I'm wondering about changes in table statistics possibly causing a
poorly-chosen change in the hashing parameters.  What have you got
work_mem set to?  Can you comment on whether the estimated rowcounts
shown in the query plan are accurate?  Does manually ANALYZE'ing the
tables used in the query change anything?

The "unkillable" aspect is odd, but I wonder if that's just a red
herring.  A query that's generated lots of temp files will try to
clean them up at termination, so maybe the backend is just sitting
there removing temp files before it'll give control back.

regards, tom lane




Re: Unkillable processes creating millions of tiny temp files

2021-03-05 Thread Jerry Sievers
Hi Tom, thx for the quick response and a few remarks below...

I work at the same site that Jeremy does and we're both looking at this
today.


Tom Lane  writes:

> Jeremy Finzel  writes:
>
>> We are running postgres 11.9 (were running 11.7 prior to recent restart) on
>> a large db (10s of TB) with 5 or 6 tablespaces and 1000s of tables/indexes.

Actually it's v11.11 now, but in any case, the issue was evident before
and after the minor version update.

>
>> Within the past few days we have started to see a few queries running for
>> over 8 hours which we then attempt to terminate, but will not terminate.
>> These queries are also generating hundreds of thousands of tiny/empty temp
>> files.  In fact, before the restart there were over 23 million files in
>> pg_tmp which were removed.  We also have verified no server settings have
>> changed for at least a couple weeks, well before this issue started
>> happening only in the past few days.
>
> Hm.  For the query plan you show, I think the only plausible explanation
> for using temp files is that a hash join's hash table is exceeding
> work_mem so it's spilling batches of tuples to disk.  With some bad
> luck those could be small not large.  But I don't see anything in
> our commit logs between 11.7 and 11.9 that looks like it would have
> affected any of that behavior.  (There were some changes to code
> governing temp-file tablespace selection, but that could only affect
> where the files get put not how big they are.)  So I doubt that this
> can be blamed on the update, especially since if I read you correctly
> it didn't start happening immediately after the update.

It started happening a couple of days ago for no obvious reason, so bad
stats are one of my theories, and we're running a defensive ANALYZE
through the entire system to rule it out.


> I'm wondering about changes in table statistics possibly causing a
> poorly-chosen change in the hashing parameters.  What have you got
> work_mem set to?  Can you comment on whether the estimated rowcounts

work_mem is 1GB.  We are not aware of the client overriding this in
their code, but it's possible.

> shown in the query plan are accurate?  Does manually ANALYZE'ing the
> tables used in the query change anything?

These other points are still under investigation.

>
> The "unkillable" aspect is odd, but I wonder if that's just a red
> herring.  A query that's generated lots of temp files will try to
> clean them up at termination, so maybe the backend is just sitting
> there removing temp files before it'll give control back.

I believe this is confirmed.  I see that the backends, after being
SIGTERM'd, are now cycling through unlinks, as seen by strace -p $pid.

None too quickly, I might add; as mentioned earlier, the number of
these files is in the millions, so it's hard to predict when cleanup
will finish.

As we did one night prior, a hard shutdown got the rogue jobs closed a
lot more quickly and we noticed that on the recovery restart, Pg took
about 15 minutes to clear >23M files from the temp area.  I assume we'd
experience the same again if we need to do it.

I'm going to leave it alone a while longer before taking action.

Thanks again


>   regards, tom lane




-- 
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consult...@comcast.net




Re: Unkillable processes creating millions of tiny temp files

2021-03-05 Thread Tom Lane
Jerry Sievers  writes:
> Tom Lane  writes:
>> The "unkillable" aspect is odd, but I wonder if that's just a red
>> herring.  A query that's generated lots of temp files will try to
>> clean them up at termination, so maybe the backend is just sitting
>> there removing temp files before it'll give control back.

> I believe this is confirmed.  I see that the backends, after being
> SIGTERM'd, are now cycling through unlinks, as seen by strace -p $pid.

> None too quickly, I might add; as mentioned earlier, the number of
> these files is in the millions, so it's hard to predict when cleanup
> will finish.

Hm.  I don't recall exactly what sort of data structure we use to track
open temp files, but it's certainly plausible that it's not too efficient
with millions of temp files :-(

> As we did one night prior, a hard shutdown got the rogue jobs closed a
> lot more quickly and we noticed that on the recovery restart, Pg took
> about 15 minutes to clear >23M files from the temp area.  I assume we'd
> experience the same again if we need to do it.

Not sure how fast that is either.  If you need to do it again, you could
try manually rm'ing everything under the pgsql_tmp directory before
letting the postmaster start.
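
For what it's worth, a sketch of that manual cleanup in Python (the demo
below runs against a scratch directory; the idea is to point clear_dir at
the real pgsql_tmp directory only while the server is stopped, and the
path to that directory is something you'd verify for your tablespaces):

```python
import os
import tempfile

def clear_dir(path: str) -> int:
    """Unlink every regular file directly under path; return the count."""
    removed = 0
    with os.scandir(path) as it:   # streams entries; no giant list in memory
        for entry in it:
            if entry.is_file(follow_symlinks=False):
                os.unlink(entry.path)
                removed += 1
    return removed

# Demo against a scratch directory standing in for .../pgsql_tmp.
demo = tempfile.mkdtemp(prefix="pgsql_tmp_demo")
for i in range(1000):
    open(os.path.join(demo, "pgsql_tmp999.%d" % i), "w").close()
n = clear_dir(demo)
print(n)  # prints 1000
```

On Linux, plain "rm -rf pgsql_tmp/*" can fail on argument-list limits
with millions of files, which is why a streaming approach (or
"find pgsql_tmp -type f -delete") tends to behave better at this scale.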

regards, tom lane




Re: libpq pipelineing

2021-03-05 Thread Alvaro Herrera
Hello Samuel,

On 2020-Jun-27, Samuel Williams wrote:

> I found some discussion in the past relating to batch processing which
> appears to support some kind of pipelining:
> 
> https://2ndquadrant.github.io/postgres/libpq-batch-mode.html

I just noticed this old thread of yours.  I've been working on getting
the work you linked to polished up a bit and completed for
PostgreSQL 14.  If you'd like to give it a try, your input would be
very useful to me.  You can find the patch here (applies on the current
master branch): https://postgr.es/m/20210306003559.GA1375@alvherre.pgsql
(If you need help building, please ping me on private email).

The patched libpq version is compatible with older servers.

Thanks

-- 
Álvaro Herrera   Valdivia, Chile




Re: Log files polluted with permission denied error messages after every 10 seconds

2021-03-05 Thread Michael Paquier
On Fri, Mar 05, 2021 at 07:36:37PM +0200, Andrus wrote:
> Then turned real-time protection off:
> 
> The problem persists. A new entry is written every 10 seconds.

On which files are those complaints?  It seems to me that you may have
more going on in this system that interacts with your data folder than
you think.

> pg_wal also contains files with a .deleted extension, like
> 
> 0001000500B2.deleted

These are generated on Windows when removing a past WAL segment, where
the process involves a rename followed by durable_unlink() that would
generate some LOG entries in the logs if the internal unlink() failed
(see RemoveXlogFile() in xlog.c). 
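
A minimal sketch of that rename-then-unlink pattern (illustrative only:
the segment name is invented and this is not PostgreSQL's actual code,
which also takes care to make the unlink durable):

```python
import os
import tempfile

d = tempfile.mkdtemp()
seg = os.path.join(d, "000000010000000500000023")  # hypothetical WAL segment
open(seg, "w").close()

# Step 1: the rename frees the segment's name immediately, even if some
# other process still holds the old file open (the Windows failure mode).
renamed = seg + ".deleted"
os.rename(seg, renamed)

# Step 2: try to unlink. If this fails (e.g. an open handle on Windows),
# the ".deleted" file is left behind -- exactly what shows up in the
# reported pg_wal directory.
try:
    os.unlink(renamed)
except OSError:
    pass

print(os.path.exists(seg), os.path.exists(renamed))
```

On a healthy system the unlink succeeds and both names are gone; lingering
.deleted files suggest something (anti-virus, backup agent, indexer) is
holding handles on WAL segments.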
--
Michael

