pg_restore fails due to foreign key violation

2018-12-10 Thread Olga Vingurt
Hi,

We are using PostgreSQL 9.5.10 and pg_dump/pg_restore to export and import the
database.

We encountered an issue (which is not easily reproducible) when running
pg_restore:
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 3624; 2606 37504 FK
CONSTRAINT  postgres
pg_restore: [archiver (db)] could not execute query: ERROR:  insert or
update on table "" violates foreign key constraint "fk_name"
DETAIL:  Key ()=(fbc3dd78-643f-419e-9f0f-72d81ef22cf3) is not
present in table "".
Command was: ALTER TABLE ONLY  ADD CONSTRAINT 
FOREIGN KEY

The following command was used to backup the database:
postgresql-9.5.10-2-windows-x64-binaries\pgsql\bin\pg_dump.exe -Fc
--verbose --no-password -U postgres -d postgres -f db.dump -E utf8

The following command was used to restore the database:
postgresql-9.5.10-2-windows-x64-binaries\pgsql\bin\pg_restore.exe --verbose
--no-password --single-transaction --no-owner --no-privileges -U postgres
-d postgres db.dump

After playing with the dump and importing the schema first and the data next
without the triggers, we indeed see that data is missing in the table, i.e.
the dump is not consistent.
We don't stop the application which uses the database during the dump, but
according to the documentation the dump should still be consistent.

How is it possible that pg_dump created a dump which is not consistent? Did
it happen because we haven't stopped the application?
Is there any way to create dumps without stopping the application?

Thanks,
Olga


Re: pg_restore fails due to foreign key violation

2018-12-10 Thread Andreas Kretschmer



On 10.12.18 at 11:15, Olga Vingurt wrote:
> After playing with the dump and importing schema first and data next
> without the triggers we indeed see that data is missing in the table
> i.e. dump is not consistent.
> We don't stop the application which uses database during the dump but
> according to the documentation the dump still should be consistent.

Please check if the data are consistent on the source database system.

> How is it possible that pg_dump created dump which is not consistent?
> Did it happen because we haven't stopped the application?

No, but maybe you have corrupted indexes on the source system.



Regards, Andreas

--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com




Re: Temp tables

2018-12-10 Thread Laurenz Albe
David G. Johnston wrote:
> >  and what happens to this data after completion of the transaction ?
> > 
> 
> Your choice. See “on conflict” clause of create temp table command.

You mean the ON COMMIT clause.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: pg_restore fails due to foreign key violation

2018-12-10 Thread Martín Marqués
Hi,

On Mon, Dec 10, 2018 at 7:21, Andreas Kretschmer (<
andr...@a-kretschmer.de>) wrote:

>
> Am 10.12.18 um 11:15 schrieb Olga Vingurt:
> > After playing with the dump and importing schema first and data next
> > without the triggers we indeed see that data is missing in the table
> > i.e. dump is not consistent.
> > We don't stop the application which uses database during the dump but
> > according to the documentation the dump still should be consistent.
>
>
> please check if the data are consistent on the source database system.
>


I would start by running amcheck to see if there is index corruption
somewhere, as that can lead to data corruption (in particular if the
corrupted index is a unique or primary key index).
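
For example, a rough sketch (not tailored to this system): amcheck ships as a
contrib extension from PostgreSQL 10 onward, so on 9.5 a separately packaged
backport would be needed, but the check itself is just a query over the btree
indexes:

CREATE EXTENSION amcheck;

SELECT bt_index_check(c.oid), c.relname
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_am am ON am.oid = c.relam
WHERE am.amname = 'btree' AND i.indisvalid;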

Regards,

-- 
Martín Marqués
It’s not that I have something to hide,
it’s that I have nothing I want you to see


Re: Tables(s) that feed pg_controldata

2018-12-10 Thread Tom Lane
Ian Barwick  writes:
> On 12/10/2018 11:50 AM, Michael Paquier wrote:
>> Perhaps.  Even with that, writing an extension for 9.2 would require
>> copy-pasting a lot of code from pg_controldata.c and taking as well the
>> code which has been introduced for the various control functions.
>> Dirty, still doable. 

> I've got some code which does just that to read pg_control as far
> back as 9.3, and it is indeed not pretty ;).

If I had to do this pre-9.6, I'd write a function in an untrusted PL
that invoked pg_controldata and parsed its output.  Ugly, but short
and easy to port across versions.
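
A minimal sketch of that idea, assuming plperlu is available and pg_controldata
is on the server's PATH (both assumptions), and skipping the parsing step:

CREATE EXTENSION plperlu;

CREATE FUNCTION controldata_text() RETURNS text AS $$
    -- hypothetical helper: shell out to pg_controldata for this cluster's data directory
    my $dd = spi_exec_query(q{SELECT current_setting('data_directory') AS dd})->{rows}[0]{dd};
    return `pg_controldata "$dd"`;
$$ LANGUAGE plperlu;

SELECT controldata_text();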

regards, tom lane



What is the tuplestore?

2018-12-10 Thread Ron

Hi,

v9.6.6


2018-12-07 06:21:55.504 EST 10.140.181.89(35868) CDS CDSLBXW 13748 SELECT 
PostgreSQL JDBC Driver 53100 ERROR:  could not write to tuplestore temporary 
file: No space left on device


I see this in the pg_log file, but #1 can't figure out what "tuplestore" is 
(Google doesn't help), and #2 there's lots of space on all my file systems.  
(data/base, where pgsql_tmp lives, has 96GB free.)


Thanks

--
Angular momentum makes the world go 'round.



Re: pg_restore fails due to foreign key violation

2018-12-10 Thread Olga Vingurt
The data indeed wasn't consistent on the source system and the foreign key
index was corrupted.
After manually cleaning the irrelevant records and running REINDEX on the
table, pg_dump and pg_restore worked as expected.

The only question left is how we got into the corrupted data state.
In the event logs (PostgreSQL is running on Windows Server) we found an error
which looks relevant:

ERROR:  could not truncate file "base/12373/17254" to 19 blocks: Permission
denied
CONTEXT:  automatic vacuum of table "postgres.public."


Re: pg_restore fails due to foreign key violation

2018-12-10 Thread Tom Lane
Olga Vingurt  writes:
> The only question left is how we got into the corrupted data state.
> In the event logs (PostgreSQL is running on Windows Server) we found an error
> which looks relevant:

> ERROR:  could not truncate file "base/12373/17254" to 19 blocks: Permission 
> denied
> CONTEXT:  automatic vacuum of table "postgres.public."

Hm.  In theory, that truncation failure in itself shouldn't have caused a
problem --- autovacuum is just trying to remove some empty pages, and if
they don't get removed, they'd still be empty.  However, there's a problem
if the pages are empty because we just deleted some recently-dead tuples,
because the state of the pages on-disk might be different from what it
is in-memory.  In that case the truncation failure effectively results in
those pages reverting to some prior state, possibly bringing dead tuples
"back to life".  We've been poking at that problem off and on for awhile;
the most recent thread is here:

https://www.postgresql.org/message-id/flat/5BBC590AE8DF4ED1A170E4D48F1B53AC%40tunaPC

Assuming that that is the issue (which is unproven, but seems probable)
then your report makes the third report we've seen of this failure mode
in about a decade --- and it's been there, in principle, much longer than
that.  Nobody's thought of a fix that isn't either very unpleasant for
performance (eg, "uselessly" writing out pages we're about to truncate
away) or bad for robustness (eg, PANIC'ing on truncate failure).  So
while it's certainly bad news when it happens, I'm not sure that any
fix is going to get committed soon.

In the short term, what you need to do is figure out what caused the
permission failure.  The general belief among pgsql-hackers is that
shoddy antivirus products tend to cause this, but I don't know details.

regards, tom lane



RE: syntax error with alter type

2018-12-10 Thread Kevin Brannen
Tom Lane wrote:

> Kevin Brannen  writes:
> > So, can you (or anyone) help me understand what "alter type  drop 
> > attribute" is meant to do? I don't see "attribute" on the page you 
> > reference.

> IIRC, that drops a column from a composite type; it's more or less a variant 
> spelling of ALTER TABLE DROP COLUMN.

I'm going to guess this is part of the "SQL Standard", but I'd like to comment 
that this seems really ... weird. All languages have their warts or things they 
could have done better in hindsight, but this strikes me as very weird.

That being said, I can sort of see why you say that. Looking at the examples, I 
can see:

ALTER TYPE compfoo ADD ATTRIBUTE f3 int;

which looks a lot like adding a column to a table. It would have been nice to 
know what "compfoo" is as that would have cleared up some confusion. I guess 
I'll make a doc change request.
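
For reference, given the explanation above, "compfoo" in that docs example is a
composite type (not a table or an ENUM); a self-contained sketch of the sequence
would be something like:

CREATE TYPE compfoo AS (f1 int, f2 text);   -- a composite type
ALTER TYPE compfoo ADD ATTRIBUTE f3 int;    -- the docs example
ALTER TYPE compfoo DROP ATTRIBUTE f3;       -- and the inverse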

Of course, the real answer is that someone here in the past should have made a 
small table of reference values instead of creating an ENUM; then I wouldn't 
be facing this. Again, hindsight... sigh.

Thanks,
Kevin



pg_stat_replication view

2018-12-10 Thread Jeff Janes
I'm not sure if this is a documentation issue, or something else.

The description of the pg_stat_replication.state column gives:

* catchup: This WAL sender's connected standby is catching up with the
primary.

* streaming: This WAL sender is streaming changes after its connected
standby server has caught up with the primary.

What does this mean?  Is the standby "caught up" when it replays the LSN
which was current on the master as-of the time that the standby initiated
this connection?  Or is it caught up when the master receives at least one
notification that a certain LSN was replayed on the replica, and verifies
that no new WAL has been generated after that certain LSN was generated?
Neither of those things?

If a replica has caught up and then fallen behind again, is that different
from a user/dba perspective than if it never caught up in the first place?

Also, the docs say "Lag times work automatically for physical replication.
Logical decoding plugins may optionally emit tracking messages; if they do
not, the tracking mechanism will simply display NULL lag."  Does the
logical decoding plugin associated with built-in PUBLICATION/SUBSCRIPTION
mechanism introduced in v10 emit tracking messages?
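
For reference, the state and lag columns under discussion (v10 and later) can be
inspected with something like:

SELECT application_name, state,
       sent_lsn, replay_lsn,
       write_lag, flush_lag, replay_lag
FROM pg_stat_replication;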

Cheers,

Jeff


Memory exhaustion due to temporary tables?

2018-12-10 Thread Thomas Carroll
We have a situation where a long-persistent Postgres connection consumes more 
and more memory.  If ignored, we eventually get “Cannot allocate memory” errors 
in the Postgres log.  If still ignored, the box will eventually crash.  This 
takes about 3 weeks to happen.  It issues a call to a single function about 
once every 15 seconds.  What can I do to prevent this outcome?

What the function does: The purpose of the function is to refresh a cache in an 
application, not make meaningful updates. It does write to the database, but 
only to a temporary table.  If I remove the temporary table (just as an 
experiment; it makes the function useless), the memory consumption does not 
occur.

 There are no transactions left hanging open, no locks holding resources for 
long periods of time.  The temporary table is about half a meg in size, about 
5500 rows.

The memory usage is identified by examining and totaling the lines beginning 
with “Private” in the /proc/1234/smaps file, where 1234 is the process ID for 
the connection.  The memory consumption starts out at under 20 meg, but swells 
to hundreds of megabytes over the three weeks.  I have been able to reproduce 
the issue on my own Linux workstation with an accelerated schedule.

Other loads: None, this is a dedicated Postgres server

Postgres version: 10.5.  work_mem setting: 4MB, shared_buffers setting: 800 MB, 
connections typically around 30-40.

Linux kernel version: 3.10 and CentOS 7.  Also kernel 4.19 and OpenSUSE 
Tumbleweed when I recreate the issue on my workstation.

Server: An AWS EC2 instance: t2.medium.  In other words, 2 CPUs, 4 GB of 
memory.  Not big, but we do have a bunch of them.

Workaround: We monitor the process and bounce it periodically.  I don't love 
this approach.  We could rewrite the function to avoid the temporary table.  It 
would be my shame as a DBA to ask a developer to do that :).

Thanks for any insight!

Tom

Re: Memory exhaustion due to temporary tables?

2018-12-10 Thread Laurenz Albe
Thomas Carroll wrote:
> We have a situation where a long-persistent Postgres connection consumes more
> and more memory.  If ignored, we eventually get “Cannot allocate memory”
> errors in the Postgres log.  If still ignored, the box will eventually crash.
> This takes about 3 weeks to happen.  It issues a call to a single function
> about once every 15 seconds.  What can I do to prevent this outcome?

There should be a memory context dump in the log file.
What does it look like?

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: Memory exhaustion due to temporary tables?

2018-12-10 Thread Tom Lane
Thomas Carroll  writes:
> Postgres version: 10.5.  work_mem setting: 4MB, shared_buffers setting: 800 
> MB, connections typically around 30-40.

I imagine you checked this already, but ... what is temp_buffers set to?
That constrains the maximum memory used for temporary-table buffers in
each process, and an unreasonable setting for it could lead to the
described behavior.

Another thing to keep in mind with long-lived "temporary" tables is
that autovacuum can't do anything with them; so it's incumbent on your
application to periodically VACUUM and/or ANALYZE them as needed.
Otherwise such tables will bloat, which could contribute to excessive
use of temporary-table buffers.
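
For illustration (the table name below is a placeholder), checking the setting
and maintaining a long-lived temp table by hand from within its own session
might look like:

SHOW temp_buffers;

-- run inside the session that owns the temp table
VACUUM ANALYZE my_temp_table;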

regards, tom lane



pg_dump: ERROR: array size exceeds the maximum allowed (268435455)

2018-12-10 Thread Laurent FAILLIE
Hello,
I'm trying to save a database from a dying disk on a Gentoo/Linux box.
Unfortunately, when I'm issuing a
$ pg_dumpall --clean
I got 

pg_dump: [archiver (db)] query failed: ERROR:  array size exceeds the maximum 
allowed (268435455)
pg_dump: [archiver (db)] query was: SELECT p.tableoid, 
p.oid, p.proname AS aggname, p.pronamespace AS aggnamespace, p.pronargs, 
p.proargtypes, (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = p.proowner) 
AS rolname, (SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, 
row_n FROM 
pg_catalog.unnest(coalesce(p.proacl,pg_catalog.acldefault('f',p.proowner))) 
WITH ORDINALITY AS perm(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM 
pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('f',p.proowner)))
 AS init(init_acl) WHERE acl = init_acl)) as foo) AS aggacl, (SELECT 
pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM 
pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('f',p.proowner)))
 WITH ORDINALITY AS initp(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM 
pg_catalog.unnest(coalesce(p.proacl,pg_catalog.acldefault('f',p.proowner))) AS 
permp(orig_acl) WHERE acl = orig_acl)) as foo) AS raggacl, NULL AS initaggacl, 
NULL AS initraggacl FROM pg_proc p LEFT JOIN pg_init_privs pip ON (p.oid = 
pip.objoid AND pip.classoid = 'pg_proc'::regclass AND pip.objsubid = 0) WHERE 
p.proisagg AND (p.pronamespace != (SELECT oid FROM pg_namespace WHERE nspname = 
'pg_catalog') OR p.proacl IS DISTINCT FROM pip.initprivs)
pg_dumpall: pg_dump failed on database "postgres", exiting

All references I found about this error are about user-made requests. Any clue
as to how I can correct this issue? (As said, my disk is dying due to its
controller failure, so obviously time is running out :).) This database contains
my smart home figures, so nothing critical, and I have a weeks-old backup.
This box is running
pg_dump (PostgreSQL) 10.3
but I was on my way to upgrading to 10.6 when I discovered this issue.
Best regards,
Laurent


Re: What is the tuplestore?

2018-12-10 Thread Rene Romero Benavides
What if this error message pertains to something happening on the
application side?

On Mon, Dec 10, 2018 at 09:56, Ron wrote:

> Hi,
>
> v9.6.6
>
>
> 2018-12-07 06:21:55.504 EST 10.140.181.89(35868) CDS CDSLBXW 13748 SELECT
> PostgreSQL JDBC Driver 53100 ERROR:  could not write to tuplestore
> temporary
> file: No space left on device
>
> I see this in the pg_log file, but #1 can't figure out what "tuplestore"
> is
> (Google doesn't help), and #2 there's lots of space on all my file
> systems.
> data/base, where pgsql_tmp lives, has 96GB free.)
>
> Thanks
>
> --
> Angular momentum makes the world go 'round.
>
>

-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: pg_dump: ERROR: array size exceeds the maximum allowed (268435455)

2018-12-10 Thread Ron

On 12/10/2018 03:15 PM, Laurent FAILLIE wrote:

Hello,

I'm trying to save a database from a dying disk on a Gentoo/Linux box.


[snip]

This box is running
pg_dump (PostgreSQL) 10.3
but I was on way to upgrade to 10.6 when I discovered this issue.


Since 10.3 and 10.6 are the same major version, a binary backup would work 
just as well.



--
Angular momentum makes the world go 'round.


Re: What is the tuplestore?

2018-12-10 Thread Ron
There's certainly a problem with the application, but the error is in the 
pg_log, not the application log.


On 12/10/2018 03:21 PM, Rene Romero Benavides wrote:
What if this error message pertains to something happening on the 
application side?


Am Mo., 10. Dez. 2018 um 09:56 Uhr schrieb Ron >:


Hi,

v9.6.6


2018-12-07 06:21:55.504 EST 10.140.181.89(35868) CDS CDSLBXW 13748 SELECT
PostgreSQL JDBC Driver 53100 ERROR:  could not write to tuplestore
temporary
file: No space left on device

I see this in the pg_log file, but #1 can't figure out what
"tuplestore" is
(Google doesn't help), and #2 there's lots of space on all my file
systems.
data/base, where pgsql_tmp lives, has 96GB free.)

Thanks



--
Angular momentum makes the world go 'round.


Re: pg_dump: ERROR: array size exceeds the maximum allowed (268435455)

2018-12-10 Thread Tom Lane
Laurent FAILLIE  writes:
> I'm trying to save a database from a dying disk on a Gentoo/Linux box.
> Unfortunately, when I'm issuing a
> $ pg_dumpall --clean
> I got 

> pg_dump: [archiver (db)] query failed: ERROR:  array size exceeds the
> maximum allowed (268435455)
> pg_dump: [archiver (db)] query was: SELECT p.tableoid, 
> p.oid, p.proname AS aggname, p.pronamespace AS aggnamespace, p.pronargs, 
> p.proargtypes, (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = 
> p.proowner) AS rolname, (SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM 
> (SELECT acl, row_n FROM 
> pg_catalog.unnest(coalesce(p.proacl,pg_catalog.acldefault('f',p.proowner))) 
> WITH ORDINALITY AS perm(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM 
> pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('f',p.proowner)))
>  AS init(init_acl) WHERE acl = init_acl)) as foo) AS aggacl, (SELECT 
> pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM 
> pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('f',p.proowner)))
>  WITH ORDINALITY AS initp(acl,row_n) WHERE NOT EXISTS ( SELECT 1 FROM 
> pg_catalog.unnest(coalesce(p.proacl,pg_catalog.acldefault('f',p.proowner))) 
> AS permp(orig_acl) WHERE acl = orig_acl)) as foo) AS raggacl, NULL AS 
> initaggacl, NULL AS initraggacl FROM pg_proc p LEFT JOIN pg_init_privs pip ON 
> (p.oid = pip.objoid AND pip.classoid = 'pg_proc'::regclass AND pip.objsubid = 
> 0) WHERE p.proisagg AND (p.pronamespace != (SELECT oid FROM pg_namespace 
> WHERE nspname = 'pg_catalog') OR p.proacl IS DISTINCT FROM pip.initprivs)
> pg_dumpall: pg_dump failed on database "postgres", exiting

This looks like a manifestation of data corruption in the system
catalogs, specifically a broken proacl field in a pg_proc row for
an aggregate function.  Now typically, all those fields would just
be null, so you might be able to recover by looking for such rows
and setting their proacl fields back to null.  At worst, if you'd
actually granted or revoked any privileges on aggregates, you'd
lose that ... but in a data recovery situation, that's probably
not your worst problem.  Try

select proname, proacl from pg_proc where proisagg and proacl is not null;

and see what you get.

It's likely that the affected row(s) have more problems than this,
in which case deleting them altogether might be your best bet.
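
If the SELECT does return rows and they prove to be nothing but garbage, a
sketch of the reset described above (superuser only, and only after saving
whatever copy of the cluster you still can) would be:

UPDATE pg_proc SET proacl = NULL
WHERE proisagg AND proacl IS NOT NULL;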

regards, tom lane



Re: Memory exhaustion due to temporary tables?

2018-12-10 Thread Thomas Carroll
On Monday, December 10, 2018, 3:45:33 PM EST, Tom Lane wrote:

> I imagine you checked this already, but ... what is temp_buffers set to?

Thanks for your reply!  temp_buffers is the default 8MB, and I should have 
included that in my first email.

> Another thing to keep in mind with long-lived "temporary" tables...

WRT temp tables and autovacuum: I realize I need to add an important detail 
here: The table is created by the function using:
CREATE TEMPORARY TABLE table_name ON COMMIT DROP AS SELECT...

So my perhaps-wrong expectation is that all remnants of the old temporary table 
are discarded from the previous invocation, so there is no need to do any 
vacuuming.
Thanks again,
Tom

Re: Memory exhaustion due to temporary tables?

2018-12-10 Thread Tom Lane
Thomas Carroll  writes:
>On Monday, December 10, 2018, 3:45:33 PM EST, Tom Lane 
>  wrote:  
>> I imagine you checked this already, but ... what is temp_buffers set to?

> Thanks for your reply!  temp_buffers is the default 8MB, and I should have 
> included that in my first email.

Hm.  Well, the temporary-buffer arena definitely won't grow any larger
than that, so the problem is somewhere else.

> WRT temp tables and autovacuum: I realize I need to add an important detail 
> here: The table is created by the function using:
> CREATE TEMPORARY TABLE table_name ON COMMIT DROP AS SELECT...
> So my perhaps-wrong expectation is that all remnants of the old temporary 
> table are discarded from the previous invocation, so there is no need to do 
> any vacuuming.

I see.  The contents of the temp tables certainly go away at commit,
then, but the system catalog entries describing a temp table are just
as real as those for a regular table.  So if you're creating and
dropping temp tables quickly, there's a potential for bloat in the
system catalogs (particularly pg_attribute), which autovacuum might
or might not keep up with at default settings.  Still, I'd only expect
that to lead to disk space growth not memory consumption.
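
For example, a rough way to see whether that kind of catalog churn is piling up:

SELECT relname, n_dead_tup, last_autovacuum,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_stat_sys_tables
WHERE relname IN ('pg_attribute', 'pg_class', 'pg_type');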

Is the error message spelling really exactly "Cannot allocate memory"?
Because that string appears nowhere in the Postgres backend sources,
and I don't think it's strerror's canonical phrasing for ENOMEM either.
So I'm wondering just where it's coming from.

Also, as mentioned upthread, it'd be interesting to see if there's
a memory context dump showing up in your server log.  It'd look
something roughly like this:

TopMemoryContext: 67440 total in 5 blocks; 14016 free (27 chunks); 53424 used
  TopTransactionContext: 32768 total in 3 blocks; 14992 free (21 chunks); 17776 
used
Combo CIDs: 8192 total in 1 blocks; 1544 free (0 chunks); 6648 used
  LocalBufferContext: 8397248 total in 8 blocks; 7936 free (0 chunks); 8389312 
used
  Local Buffer Lookup Table: 32768 total in 3 blocks; 6368 free (7 chunks); 
26400 used
  ... lots more ...
Grand total: 9603680 bytes in 238 blocks; 283976 free (240 chunks); 9319704 used

regards, tom lane



Re: What is the tuplestore?

2018-12-10 Thread Rene Romero Benavides
Maybe the temp space got released right after the failure?
https://grokbase.com/t/postgresql/pgsql-general/02ag7k8gcr/tuplestore-write-failed
do you have space usage charts for that partition? doesn't it show a spike
during that time?

On Mon, Dec 10, 2018 at 15:54, Ron wrote:

> There's certainly a problem with the application, but the error is in the
> pg_log, not the application log.
>
> On 12/10/2018 03:21 PM, Rene Romero Benavides wrote:
>
> What if this error message pertains to something happening on the
> application side?
>
> Am Mo., 10. Dez. 2018 um 09:56 Uhr schrieb Ron :
>
>> Hi,
>>
>> v9.6.6
>>
>>
>> 2018-12-07 06:21:55.504 EST 10.140.181.89(35868) CDS CDSLBXW 13748 SELECT
>> PostgreSQL JDBC Driver 53100 ERROR:  could not write to tuplestore
>> temporary
>> file: No space left on device
>>
>> I see this in the pg_log file, but #1 can't figure out what "tuplestore"
>> is
>> (Google doesn't help), and #2 there's lots of space on all my file
>> systems.
>> data/base, where pgsql_tmp lives, has 96GB free.)
>>
>> Thanks
>>
>>
> --
> Angular momentum makes the world go 'round.
>


-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: What is the tuplestore?

2018-12-10 Thread Ron
Which file system (specifically, which directory)?  Is it 
data/base/pgsql_tmp?  There's 96GB free, which is 74% of the volume.


On 12/10/2018 04:50 PM, Rene Romero Benavides wrote:

Maybe the temp space got released right after the failure?
https://grokbase.com/t/postgresql/pgsql-general/02ag7k8gcr/tuplestore-write-failed
do you have space usage charts for that partition? doesn't it show a spike 
during that time?


Am Mo., 10. Dez. 2018 um 15:54 Uhr schrieb Ron >:


There's certainly a problem with the application, but the error is in
the pg_log, not the application log.

On 12/10/2018 03:21 PM, Rene Romero Benavides wrote:

What if this error message pertains to something happening on the
application side?

Am Mo., 10. Dez. 2018 um 09:56 Uhr schrieb Ron
mailto:ronljohnso...@gmail.com>>:

Hi,

v9.6.6


2018-12-07 06:21:55.504 EST 10.140.181.89(35868) CDS CDSLBXW
13748 SELECT
PostgreSQL JDBC Driver 53100 ERROR:  could not write to
tuplestore temporary
file: No space left on device

I see this in the pg_log file, but #1 can't figure out what
"tuplestore" is
(Google doesn't help), and #2 there's lots of space on all my
file systems.
data/base, where pgsql_tmp lives, has 96GB free.)

Thanks





--
Angular momentum makes the world go 'round.


Re: Memory exhaustion due to temporary tables?

2018-12-10 Thread Thomas Carroll
 

On Monday, December 10, 2018, 5:50:22 PM EST, Tom Lane wrote:
> Is the error message spelling really exactly "Cannot allocate memory"?
Yes - but I think the message is forwarded from Linux.  Here is an example:
2018-12-08 00:00:00.070 EST,,,32506,,5bc71a25.7efa,25643,,2018-10-17 07:16:53 
EDT,,0,LOG,0,"could not fork autovacuum worker process: Cannot allocate 
memory",""
Should have been more thoughtful in what I chose to mention.  Here is one that 
may be more familiar, and directly related to the issue I am encountering:

2018-12-08 00:00:03.642 
EST,"schema_name","db_name",24949,"10.12.113.56:50138",5bfed8cd.6175,17637,"SELECT",2018-11-28
 13:05:01 EST,24/223502,0,ERROR,53200,"out of memory","Failed on request of 
size 24.""SQL statement ""CREATE TEMPORARY TABLE table_name ON COMMIT DROP 
AS

> Also, as mentioned upthread, it'd be interesting to see if there's
> a memory context dump showing up in your server log.
There are many memory context dumps, and they are long (well over 100 lines 
apiece).  To me these are unfamiliar and hard to read, but I tried to pick one 
that referenced all the same tables used in the function.  I put one such dump 
(188 lines) here:
https://drive.google.com/file/d/1QI4nffdZByIehb_-GULOagI_dKpKElg_/view

A hopefully-helpful short excerpt from the top:
TopMemoryContext: 174216 total in 7 blocks; 12520 free (27 chunks); 161696 used
  pgstat TabStatusArray lookup hash table: 8192 total in 1 blocks; 632 free (0 
chunks); 7560 used
  TopTransactionContext: 24576 total in 2 blocks; 7288 free (3 chunks); 17288 
used
    Combo CIDs: 8192 total in 1 blocks; 1800 free (0 chunks); 6392 used
    ExecutorState: 8192 total in 1 blocks; 7400 free (0 chunks); 792 used
  ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
    SPI Exec: 2088960 total in 8 blocks; 865976 free (7 chunks); 1222984 used
  ExecutorState: 262216 total in 6 blocks; 94344 free (11 chunks); 167872 
used
    HashTableContext: 8192 total in 1 blocks; 7584 free (0 chunks); 608 used
  HashBatchContext: 197104 total in 6 blocks; 0 free (0 chunks); 197104 
used
    HashTableContext: 8192 total in 1 blocks; 7968 free (0 chunks); 224 used
  HashBatchContext: 49240 total in 2 blocks; 8136 free (0 chunks); 
41104 used
    HashTableContext: 8192 total in 1 blocks; 7584 free (0 chunks); 608 used
  HashBatchContext: 197104 total in 6 blocks; 0 free (0 chunks); 197104 
used


Anything I can do to zero in on helpful information, I'm all ears.

Thanks so much!
Tom

Re: What is the tuplestore?

2018-12-10 Thread Rene Romero Benavides
Yes, pgsql_tmp, you ought to be looking for a sudden and drastic jump in
space utilization around the time of the error message. You're not
concerned with the current space utilization, but with the one around that
time, because it probably got freed right after the error was raised.
How many times has this happened? What kind of queries were running at
that time? Can you identify something that could have required lots of
temporary space?

On Mon, Dec 10, 2018 at 17:33, Ron wrote:

> Which file system (specifically, which directory)?  Is it
> data/base/pgsql_tmp?  There's 96GB free, which is 74% of the volume.
>
> On 12/10/2018 04:50 PM, Rene Romero Benavides wrote:
>
> Maybe the temp space got released right after the failure?
>
> https://grokbase.com/t/postgresql/pgsql-general/02ag7k8gcr/tuplestore-write-failed
> do you have space usage charts for that partition? doesn't it show a spike
> during that time?
>
> Am Mo., 10. Dez. 2018 um 15:54 Uhr schrieb Ron :
>
>> There's certainly a problem with the application, but the error is in the
>> pg_log, not the application log.
>>
>> On 12/10/2018 03:21 PM, Rene Romero Benavides wrote:
>>
>> What if this error message pertains to something happening on the
>> application side?
>>
>> Am Mo., 10. Dez. 2018 um 09:56 Uhr schrieb Ron :
>>
>>> Hi,
>>>
>>> v9.6.6
>>>
>>>
>>> 2018-12-07 06:21:55.504 EST 10.140.181.89(35868) CDS CDSLBXW 13748
>>> SELECT
>>> PostgreSQL JDBC Driver 53100 ERROR:  could not write to tuplestore
>>> temporary
>>> file: No space left on device
>>>
>>> I see this in the pg_log file, but #1 can't figure out what "tuplestore"
>>> is
>>> (Google doesn't help), and #2 there's lots of space on all my file
>>> systems.
>>> data/base, where pgsql_tmp lives, has 96GB free.)
>>>
>>> Thanks
>>>
>>>
>>
> --
> Angular momentum makes the world go 'round.
>


-- 
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


Re: What is the tuplestore?

2018-12-10 Thread Thomas Munro
On Tue, Dec 11, 2018 at 2:56 AM Ron  wrote:
> 2018-12-07 06:21:55.504 EST 10.140.181.89(35868) CDS CDSLBXW 13748 SELECT
> PostgreSQL JDBC Driver 53100 ERROR:  could not write to tuplestore temporary
> file: No space left on device
>
> I see this in the pg_log file, but #1 can't figure out what "tuplestore" is
> (Google doesn't help), and #2 there's lots of space on all my file systems.
> data/base, where pgsql_tmp lives, has 96GB free.)

Maybe the setting log_temp_files could help you identify the query
that is doing it?

https://www.postgresql.org/docs/current/runtime-config-logging.html#GUC-LOG-TEMP-FILES
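
For example, one way to enable it cluster-wide without a restart (0 logs every
temporary file; a positive value is a size threshold in kB):

ALTER SYSTEM SET log_temp_files = 0;
SELECT pg_reload_conf();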

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Memory exhaustion due to temporary tables?

2018-12-10 Thread Tom Lane
Thomas Carroll  writes:
>   On Monday, December 10, 2018, 5:50:22 PM EST, Tom Lane  
> wrote:  
>> Also, as mentioned upthread, it'd be interesting to see if there's
>> a memory context dump showing up in your server log.

> There are many memory context dumps, and they are long (well over 100 lines 
> apiece).  To me these are unfamiliar and hard to read, but I tried to pick 
> one that referenced all the same tables used in the function.  I put one such 
> dump (188 lines) here:
> https://drive.google.com/file/d/1QI4nffdZByIehb_-GULOagI_dKpKElg_/view

Thanks.  Here's the smoking gun:

  CacheMemoryContext: 221241400 total in 38 blocks; 152192 free (90 chunks); 
221089208 used

This looks like something is leaking memory directly in CacheMemoryContext
(not in any of its child contexts).  Can you confirm that all the memory
context dumps in your logs are showing similarly very-large total space
values for CacheMemoryContext?

This doesn't seem to be related to temp table usage, at least not
directly.  (As a cross-check, I created and dropped 1 temp tables in a
session, and the process's VM footprint didn't budge.)  My guess is that
something else you are doing is tickling the leak.

You said you'd been able to reproduce this problem outside production.
Any chance you could boil that down to a sharable test case?  I'd bet
long odds that it's not related to your data particularly, but rather
to your schema or something your function is doing, so you probably
could reproduce it with some dummy data (and, perhaps, not even very
much of that; repeating some DDL action many times is probably what
is making the leak grow to something noticeable).

If I had to guess at this point, I'd wonder if it's something to do
with partitioned tables.  That code is still relatively wet behind
the ears, and memory leaks are the kind of bug I'd be expecting to
still be shaking out of it now.  But you might not be using partitioned
tables at all.  Anyway, anything you're doing with new-in-v10 features,
particularly schema-related features, would be a good bet for the
source of the problem.

regards, tom lane



Re: pg_stat_replication view

2018-12-10 Thread Michael Paquier
On Mon, Dec 10, 2018 at 02:24:43PM -0500, Jeff Janes wrote:
> What does this mean?  Is the standby "caught up" when it replays the LSN
> which was current on the master as-of the time that the standby initiated
> this connection?  Or is it caught up when the master receives at least one
> notification that a certain LSN was replayed on the replica, and verifies
> that no new WAL has been generated after that certain LSN was generated?
> Neither of those things?

The WAL sender would switch from catchup to streaming mode when it sees
that there is no more data to send to the standby.  Please look for the
call of WalSndSetState(WALSNDSTATE_STREAMING) in walsender.c.

> If a replica has caught up and then fallen behind again, is that different
> from a user/dba perspective than if it never caught up in the first
> place?

Not really, because it means that it has been able to catch up with the
latest LSN of the primary at least once.  Perhaps you have suggestions
to improve the documentation?
--
Michael




Re: What is the tuplestore?

2018-12-10 Thread Ron
I extracted the queries from pg_log and sent them to the customer team.  To 
fill up 96GB of disk space, it seems the customer selected a huge date range.


On 12/10/2018 06:07 PM, Rene Romero Benavides wrote:
Yes, pgsql_tmp, you ought to be looking for a sudden and drastic jump in 
space utilization around the time of the error message. You're not 
concerned with the current space utilization, but with the one around that 
time, because, it probably got freed right after the error was raised.
How many times has this happened ? what kind of queries were running at 
that time? can you identify something that could have required lots of 
temporary space?


Am Mo., 10. Dez. 2018 um 17:33 Uhr schrieb Ron >:


Which file system (specifically, which directory)?  Is it
data/base/pgsql_tmp?  There's 96GB free, which is 74% of the volume.

On 12/10/2018 04:50 PM, Rene Romero Benavides wrote:

Maybe the temp space got released right after the failure?

https://grokbase.com/t/postgresql/pgsql-general/02ag7k8gcr/tuplestore-write-failed
do you have space usage charts for that partition? doesn't it show a
spike during that time?

Am Mo., 10. Dez. 2018 um 15:54 Uhr schrieb Ron
mailto:ronljohnso...@gmail.com>>:

There's certainly a problem with the application, but the error
is in the pg_log, not the application log.

On 12/10/2018 03:21 PM, Rene Romero Benavides wrote:

What if this error message pertains to something happening on
the application side?

Am Mo., 10. Dez. 2018 um 09:56 Uhr schrieb Ron
mailto:ronljohnso...@gmail.com>>:

Hi,

v9.6.6


2018-12-07 06:21:55.504 EST 10.140.181.89(35868) CDS CDSLBXW
13748 SELECT
PostgreSQL JDBC Driver 53100 ERROR:  could not write to
tuplestore temporary
file: No space left on device

I see this in the pg_log file, but #1 can't figure out what
"tuplestore" is
(Google doesn't help), and #2 there's lots of space on all
my file systems.
data/base, where pgsql_tmp lives, has 96GB free.)

Thanks





-- 
Angular momentum makes the world go 'round.




--
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/



--
Angular momentum makes the world go 'round.


Re: What is the tuplestore?

2018-12-10 Thread Ron

On 12/10/2018 06:24 PM, Thomas Munro wrote:

On Tue, Dec 11, 2018 at 2:56 AM Ron  wrote:

2018-12-07 06:21:55.504 EST 10.140.181.89(35868) CDS CDSLBXW 13748 SELECT
PostgreSQL JDBC Driver 53100 ERROR:  could not write to tuplestore temporary
file: No space left on device

I see this in the pg_log file, but #1 can't figure out what "tuplestore" is
(Google doesn't help), and #2 there's lots of space on all my file systems.
data/base, where pgsql_tmp lives, has 96GB free.)

Maybe the setting log_temp_files could help you identify the query
that is doing it?

https://www.postgresql.org/docs/current/runtime-config-logging.html#GUC-LOG-TEMP-FILES


Great idea; I've made the change.  What do I grep for in the pg_log file?


--
Angular momentum makes the world go 'round.



Re: What is the tuplestore?

2018-12-10 Thread Thomas Munro
On Tue, Dec 11, 2018 at 3:58 PM Ron  wrote:
> On 12/10/2018 06:24 PM, Thomas Munro wrote:
> > On Tue, Dec 11, 2018 at 2:56 AM Ron  wrote:
> >> 2018-12-07 06:21:55.504 EST 10.140.181.89(35868) CDS CDSLBXW 13748 SELECT
> >> PostgreSQL JDBC Driver 53100 ERROR:  could not write to tuplestore 
> >> temporary
> >> file: No space left on device
> >>
> >> I see this in the pg_log file, but #1 can't figure out what "tuplestore" is
> >> (Google doesn't help), and #2 there's lots of space on all my file systems.
> >> data/base, where pgsql_tmp lives, has 96GB free.)
> > Maybe the setting log_temp_files could help you identify the query
> > that is doing it?
> >
> > https://www.postgresql.org/docs/current/runtime-config-logging.html#GUC-LOG-TEMP-FILES
>
> Great idea; I've made the change.  What do I grep for in the pg_log file?

It should say something like:

LOG: temporary file: path "base/pgsql_tmp/...", size ...
STATEMENT: ...

-- 
Thomas Munro
http://www.enterprisedb.com



Re: What is the tuplestore?

2018-12-10 Thread Hans Schou
When one gets a "No space left on device" error and there is plenty of free
space, it is sometimes caused by a lack of inodes.

Try running the command:
  df --inodes


On Mon, Dec 10, 2018 at 4:56 PM Ron  wrote:

> Hi,
>
> v9.6.6
>
>
> 2018-12-07 06:21:55.504 EST 10.140.181.89(35868) CDS CDSLBXW 13748 SELECT
> PostgreSQL JDBC Driver 53100 ERROR:  could not write to tuplestore
> temporary
> file: No space left on device
>
> I see this in the pg_log file, but #1 can't figure out what "tuplestore"
> is
> (Google doesn't help), and #2 there's lots of space on all my file
> systems.
> data/base, where pgsql_tmp lives, has 96GB free.)
>
> Thanks
>
> --
> Angular momentum makes the world go 'round.
>
>


new stored procedure with OUT parameters

2018-12-10 Thread Anton Shen
Hi all,

I was playing around with the stored procedure support in v11 and found
that pure OUT parameters are not supported. Is there any reason we only
support INOUT but not OUT parameters?

psql (11.0 (Homebrew petere/postgresql))
dev=# CREATE PROCEDURE test_sp(a OUT int) LANGUAGE plpgsql AS $$
dev$# BEGIN
dev$# a = 5;
dev$# END; $$;
ERROR:  procedures cannot have OUT arguments
HINT:  INOUT arguments are permitted.
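
For what it's worth, the INOUT form that the HINT points at does work; a
minimal sketch:

CREATE PROCEDURE test_sp(INOUT a int) LANGUAGE plpgsql AS $$
BEGIN
  a := 5;
END; $$;

CALL test_sp(NULL);   -- returns a = 5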

Thanks,
Anton