PGPASSWORD in crypted form, for example BlowFish or SHA-256

2019-09-19 Thread Matthias Apitz


Hello,

Our software, a huge ILS, is running on Linux with DBS Sybase. To
connect to the Sybase server (over the network, even on localhost),
credentials must be known: a user (say 'sisis') and its password.

For Sybase we have them stored on the disk of the system in a file
syb.npw as:

$ cat /opt/lib/sisis/etc/syb/syb.npw
sisis:e53902b9923ab2fb
sa:64406def48efca8c

for the user 'sisis' and the administrator 'sa'. Our software ships, as a
shared library, a blob which knows how to decrypt the password hash shown
above ('e53902b9923ab2fb') into clear text, which is then used in the
ESQL/C or Java layer to connect to the Sybase server.

For PostgreSQL the password must be typed in (for psql) or can be
provided in the environment variable PGPASSWORD=blabla

Is there an API in PG to use ciphered passwords, together with a shared
library blob to decrypt them? If not, we will use the same mechanism as we
use for Sybase. Or is there any other idea for keeping the credentials
from being discoverable? This was a request from our customers some years ago.

matthias


-- 
Matthias Apitz, ✉ g...@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub

As long as there is the will to fight, there will be hope of winning.




How to safely remove a corrupted cluster?

2019-09-19 Thread Marco Ippolito
If the saying "to learn you have to make mistakes" is true, then I'm "learning"...:

I removed a cluster's folder without using pg_dropcluster... so now I have
a corrupted cluster. How can I safely remove it?

(base) marco@pc:~$ pg_lsclusters
Ver Cluster Port Status Owner    Data directory                 Log file
11  fabmnet 5433 down            /var/lib/postgresql/11/fabmnet /var/log/postgresql/postgresql-11-fabmnet.log
11  main    5432 online postgres /var/lib/postgresql/11/main    /var/log/postgresql/postgresql-11-main.log

(base) marco@pc:~$ pg_dropcluster 11 fabmnet
Warning: corrupted cluster: data directory does not exist
Warning: systemd was not informed about the removed cluster yet. Operations
like "service postgresql start" might fail. To fix, run:
  sudo systemctl daemon-reload
(base) marco@pc:~$ sudo systemctl daemon-reload
[sudo] password for marco:
(base) marco@pc:~$ pg_dropcluster 11 fabmnet
Warning: corrupted cluster: data directory does not exist
Warning: systemd was not informed about the removed cluster yet. Operations
like "service postgresql start" might fail. To fix, run:
  sudo systemctl daemon-reload

Marco


n_live_tup count increase after vacuum

2019-09-19 Thread Jason Ralph
Hello Lists,
DB1=# select version();
-[ RECORD 1 ]--------------------------------------------------------------------------------------
version | PostgreSQL 11.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23), 64-bit

I recently upgraded a neglected DB in our fleet that contains a huge table
(1.4B rows).  I ran a vacuum against the huge table; as expected it took a long
time, but it did finish.
I think I understand most of what the output is saying, but one thing is
confusing: the number of "live" tuples went up (?). My guess would have been
that the dead tuples would clear out but the live count would remain the same.
Maybe I'm not understanding what "live" means; were some "dead" tuples added to "live"?

BEFORE:
DB1=# SELECT schemaname, relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_all_tables where relname =  'logs'
ORDER BY n_dead_tup
/ (n_live_tup
   * current_setting('autovacuum_vacuum_scale_factor')::float8
  + current_setting('autovacuum_vacuum_threshold')::float8)
 DESC;
 schemaname | relname | n_live_tup | n_dead_tup | last_autovacuum
------------+---------+------------+------------+-----------------
 public     | logs    | 1337537055 |    4293327 |
(1 row)

AFTER:
DB1=# SELECT schemaname, relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_all_tables where relname =  'logs'
ORDER BY n_dead_tup
/ (n_live_tup
   * current_setting('autovacuum_vacuum_scale_factor')::float8
  + current_setting('autovacuum_vacuum_threshold')::float8)
 DESC;
 schemaname | relname | n_live_tup | n_dead_tup | last_autovacuum
------------+---------+------------+------------+-----------------
 public     | logs    | 1447462480 |          0 |
(1 row)


VERBOSE RUN:
DB1=# vacuum (verbose) logs;
INFO:  vacuuming "public.logs"
INFO:  scanned index "logs_sid2" to remove 4467859 row versions
DETAIL:  CPU: user: 1524.01 s, system: 710.15 s, elapsed: 6757.19 s
INFO:  scanned index "logs_date2" to remove 4467859 row versions
DETAIL:  CPU: user: 228.94 s, system: 155.04 s, elapsed: 1659.30 s
INFO:  scanned index "logs_cobrid2" to remove 4467859 row versions
DETAIL:  CPU: user: 3924.05 s, system: 121.69 s, elapsed: 5925.26 s
INFO:  scanned index "logs_logid2" to remove 4467859 row versions
DETAIL:  CPU: user: 393.55 s, system: 0.00 s, elapsed: 2997.50 s
INFO:  "logs": removed 4467859 row versions in 2377192 pages
DETAIL:  CPU: user: 515.37 s, system: 374.36 s, elapsed: 8654.63 s
INFO:  index "logs_sid2" now contains 1336565510 row versions in 15756256 pages
DETAIL:  2699891 index row versions were removed.
5922239 index pages have been deleted, 5634540 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "logs_date2" now contains 1336565510 row versions in 10599372 pages
DETAIL:  4467601 index row versions were removed.
5931749 index pages have been deleted, 5255678 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "logs_cobrid2" now contains 1336565510 row versions in 10597509 
pages
DETAIL:  4467858 index row versions were removed.
5929869 index pages have been deleted, 5283219 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "logs_logid2" now contains 1336565510 row versions in 9944393 pages
DETAIL:  4467858 index row versions were removed.
6194584 index pages have been deleted, 5632471 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  "logs": found 0 removable, 316509436 nonremovable row versions in 
18550227 out of 120104257 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 557115107
There were 213700141 unused item pointers.
Skipped 0 pages due to buffer pins, 10848129 frozen pages.
0 pages are entirely empty.
CPU: user: 6730.15 s, system: 1926.22 s, elapsed: 29265.93 s.
INFO:  vacuuming "pg_toast.pg_toast_3318856"
INFO:  index "pg_toast_3318856_index" now contains 101 row versions in 4 pages
DETAIL:  0 index row versions were removed.
1 index pages have been deleted, 1 are currently reusable.
CPU: user: 0.01 s, system: 0.01 s, elapsed: 0.07 s.
INFO:  "pg_toast_3318856": found 0 removable, 92 nonremovable row versions in 
18 out of 148 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 557115699
There were 6 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.04 s, system: 0.02 s, elapsed: 0.21 s.
VACUUM

Re: PGPASSWORD in crypted form, for example BlowFish or SHA-256

2019-09-19 Thread rob stone
Hello,

On Thu, 2019-09-19 at 12:30 +0200, Matthias Apitz wrote:
> Hello,
> 
> Our software, a huge ILS, is running on Linux with DBS Sybase. To
> connect to the Sybase server (over the network, even on localhost),
> credentials must be known: a user (say 'sisis') and its password.
> 
> For Sybase we have them stored on the disk of the system in a file
> syb.npw as:
> 
> $ cat /opt/lib/sisis/etc/syb/syb.npw
> sisis:e53902b9923ab2fb
> sa:64406def48efca8c
> 
> for the user 'sisis' and the administrator 'sa'. Our software has as
> shared library a blob which knows how to decrypt the password hash
> above
> shown as 'e53902b9923ab2fb' into clear text which is then used in the
> ESQL/C or Java layer to connect to the Sybase server.
> 
> For PostgreSQL the password must be typed in (for pgsql) or can be
> provided in an environment variable PGPASSWORD=blabla
> 
> Is there somehow an API in PG to use ciphered passwords and provide
> as a
> shared library the blob to decrypt it? If not, we will use the
> mechanism same as
> we use for Sybase. Or any other idea to not make detectable the
> credentials? This was a request of our customers some years ago.
> 
>   matthias
> 
> 


https://www.postgresql.org/docs/11/auth-password.html

Sections 20.5 and 20.6 may give you more information.

HTH,
Robert






Re: PostgreSQL License

2019-09-19 Thread Steve Litt
On Wed, 18 Sep 2019 17:20:14 -0500
Ron  wrote:

> Charging for *installing* PostgreSQL is not the same as charging for
> PostgreSQL.
> 
> Bottom line: you charge for *services* *you provide*, not for software
> that other people provide.

That makes a lot of sense. A head gasket costs about $25.00.
*Installing* the head gasket costs well over a thousand. As long as the
customer has the option to install it himself, you're selling your
services.

If you've modified PostgreSQL, as long as you offer the source code to
the customer and say "good luck installing it by yourself",  you can
charge to install it without breaking even the GNU GPL.

My understanding is the PostgreSQL license is more like the MIT
license, which actually allows one to modify the code and claim it as
proprietary.

SteveT

Steve Litt
Author: The Key to Everyday Excellence
http://www.troubleshooters.com/key
Twitter: http://www.twitter.com/stevelitt





Re: How to safely remove a corrupted cluster?

2019-09-19 Thread Marco Ippolito
Solved by repeating the execution of "pg_dropcluster --stop 11 fabmnet" as the
postgres user:

(base) postgres@pc:~$ pg_dropcluster --stop 11 fabmnet
Error: specified cluster does not exist

Marco

On Thu, 19 Sep 2019 at 13:02 Marco Ippolito <
ippolito.ma...@gmail.com> wrote:

> If the saying "to learn you have to make mistakes", I'm "learning"...:
>
> I removed a cluster's folder without using pg_dropcluster... so now I
> have a corrupted cluster. How can I safely remove it?
>
> (base) marco@pc:~$ pg_lsclusters
> Ver Cluster Port Status Owner    Data directory                 Log file
> 11  fabmnet 5433 down            /var/lib/postgresql/11/fabmnet /var/log/postgresql/postgresql-11-fabmnet.log
> 11  main    5432 online postgres /var/lib/postgresql/11/main    /var/log/postgresql/postgresql-11-main.log
>
> (base) marco@pc:~$ pg_dropcluster 11 fabmnet
> Warning: corrupted cluster: data directory does not exist
> Warning: systemd was not informed about the removed cluster yet.
> Operations like "service postgresql start" might fail. To fix, run:
>   sudo systemctl daemon-reload
> (base) marco@pc:~$ sudo systemctl daemon-reload
> [sudo] password for marco:
> (base) marco@pc:~$ pg_dropcluster 11 fabmnet
> Warning: corrupted cluster: data directory does not exist
> Warning: systemd was not informed about the removed cluster yet.
> Operations like "service postgresql start" might fail. To fix, run:
>   sudo systemctl daemon-reload
>
> Marco
>


Re: PGPASSWORD in crypted form, for example BlowFish or SHA-256

2019-09-19 Thread Matthias Apitz
On Thursday, September 19, 2019 at 10:31:01 PM +1000, rob stone wrote:

> Hello,
> 
> On Thu, 2019-09-19 at 12:30 +0200, Matthias Apitz wrote:
> > Hello,
> > 
> > Our software, a huge ILS, is running on Linux with DBS Sybase. To
> > connect to the Sybase server (over the network, even on localhost),
> > credentials must be known: a user (say 'sisis') and its password.
> > 
> > For Sybase we have them stored on the disk of the system in a file
> > syb.npw as:
> > 
> > $ cat /opt/lib/sisis/etc/syb/syb.npw
> > sisis:e53902b9923ab2fb
> > sa:64406def48efca8c
> > 
> > for the user 'sisis' and the administrator 'sa'. Our software has as
> > shared library a blob which knows how to decrypt the password hash
> > above
> > shown as 'e53902b9923ab2fb' into clear text which is then used in the
> > ESQL/C or Java layer to connect to the Sybase server.
> > 
> > For PostgreSQL the password must be typed in (for pgsql) or can be
> > provided in an environment variable PGPASSWORD=blabla
> > 
> > Is there somehow an API in PG to use ciphered passwords and provide
> > as a
> > shared library the blob to decrypt it? If not, we will use the
> > mechanism same as
> > we use for Sybase. Or any other idea to not make detectable the
> > credentials? This was a request of our customers some years ago.
> > 
> 
> 
> https://www.postgresql.org/docs/11/auth-password.html
> 
> Chapters 20.5 and 20.6 may give you more information.

The form in which the password hash is stored in the PG server or exchanged over
the network is not my question. The question is more: when the Linux
server starts, and with it the (ESQL/C written) application servers, they
need the password to connect, and at that moment it is not provided
by some keyboard or human being. It must be stored on the
server and be available in clear for the server, but not for other eyes on
the server, i.e. the place of storage must be ciphered.

matthias

-- 
Matthias Apitz, ✉ g...@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub
May, 9: Спаси́бо освободители! Thank you very much, Russian liberators!




Re: PGPASSWORD in crypted form, for example BlowFish or SHA-256

2019-09-19 Thread Tom Lane
Matthias Apitz  writes:
> Is there somehow an API in PG to use ciphered passwords and provide as a
> shared library the blob to decrypt it?

No.  Consider a non-password auth mechanism, for instance SSL
certificates.  You might find that an SSL certificate file
stored where libpq will find it is already about as secure as
what you're doing now.  If you want to jump through extra
hoops for more security, I think you can use ssh-agent to
hold the keys.
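
As a purely illustrative sketch of that approach (the host, database, user and
file paths below are invented, not taken from this thread), certificate
authentication would look roughly like this:

# pg_hba.conf on the server: ask for a client certificate instead of a password
hostssl  sisisdb  sisis  0.0.0.0/0  cert

# client side: libpq reads the certificate and key named in the connection string
# (or from ~/.postgresql/postgresql.crt and ~/.postgresql/postgresql.key by default)
psql "host=dbhost dbname=sisisdb user=sisis sslmode=verify-full sslcert=/opt/lib/sisis/etc/pg/client.crt sslkey=/opt/lib/sisis/etc/pg/client.key"

No password is stored anywhere; protecting the key file then becomes the
remaining problem.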

regards, tom lane




Re: n_live_tup count increase after vacuum

2019-09-19 Thread Tom Lane
Jason Ralph  writes:
> I recently upgraded a neglected DB in our fleet that contains a huge table 
> (1.4B) rows.  I ran a vacuum against the huge table, as expected it took a 
> long time, but it did finish.
> I think I understand most of what the output is saying... one confusing 
> thing
> The number of "live " tuples went up..(?)

If you're looking at the pg_stat counter, that's only an approximation.
It's probably a better approximation now than it was before.

regards, tom lane




Re: pgbackrest - question about restoring cluster to a new cluster on same server

2019-09-19 Thread Stephen Frost
Greetings,

* Ron (ronljohnso...@gmail.com) wrote:
> On 9/18/19 8:58 PM, David Steele wrote:
> >On 9/18/19 9:40 PM, Ron wrote:
> >>I'm concerned with one pgbackrest process stepping over another one and
> >>the restore (or the "pg_ctl start" recovery phase) accidentally
> >>corrupting the production database by writing WAL files to the original
> >>cluster.
> >This is not an issue unless you seriously game the system.  When a
> >cluster is promoted it selects a new timeline and all WAL will be
> >archived to the repo on that new timeline.  It's possible to promote a
> >cluster without a timeline switch by tricking it but this is obviously a
> >bad idea.
> 
> What's a timeline switchover?

Put simply, it's a branch off of the current WAL stream on to a new WAL
stream and it happens whenever there's a promotion.

Forgive the ASCII art, but-

A---> timeline 1, where things start
 \-> a promotion happened at time A, new WAL is on timeline 2

Consider an async replication scenario, where the network on the primary
is lost but it keeps writing out WAL and accepting new commits, but at a
time "A" we give up on it and promote the replica, so the replica
switches to timeline 2 and starts accepting writes.  Now we are in a
situation where two systems are generating WAL (the network partitioned
old primary, and the replica-now-primary).  Having the promotion switch
to a timeline makes it clear where that promotion happened and where the
replica-now-primary's WAL stream started.

This is actually what pg_rewind is based around too: to re-master the
old primary, it'll find that split point A and "rewind" (well, not
really, because it just grabs the pages, but whatever) the old primary
back to A, and then the old primary can follow the new primary on
timeline 2.
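
For illustration only (the data directory and connection info below are
hypothetical, not from this thread), that re-mastering step looks roughly like:

# run on the stopped old primary, pointing it at the promoted replica
pg_rewind --target-pgdata=/var/lib/postgresql/11/main \
    --source-server="host=new-primary port=5432 user=postgres dbname=postgres"
# then configure the old primary as a standby of the new one and start it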

> >So, if you promote the new cluster and forget to disable archive_command
> >there will be no conflict because the clusters will be generating WAL on
> >separate timelines.
> 
> No cluster promotion even contemplated.

Ah, but you are talking about a cluster promotion, though you don't
realize it.  Any time there is an "at some point, I want to stop replaying
WAL and start accepting new changes", there's a timeline switch and
notionally a promotion.

> The point of the exercise would be to create an older copy of the cluster --
> while the production cluster is still running, while production jobs are
> still pumping data into the production database -- from before the time of
> the data loss, and query it in an attempt to recover the records which were
> deleted.

Sure, that's all entirely possible and shouldn't be an issue.  When you
go through the restore process and specify a point where you want the
restore to stop, so that you can connect and pull down the table,
when PG reaches that point it'll promote and do a timeline switch.

Now, if you don't actually want that restore to promote and come up as a
system that you can write to, you could instead say 'pause', and then
connect to the database and grab whatever data you needed.  That should
also avoid the concern around archive command, provided you never
actually let that system finish recovery and instead just shut it down
while it's still read-only.
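
As a rough, hypothetical sketch (the stanza name, path and timestamp are
invented; check the option names against your pgbackrest version), the pause
variant would look something like:

pgbackrest --stanza=prod --pg1-path=/var/lib/postgresql/11/restore_copy \
    --delta --type=time "--target=2019-09-18 23:00:00" \
    --target-action=pause restore

# start the restored copy on a spare port; it replays WAL up to the target,
# pauses there read-only, and the deleted rows can be pulled out with psql/pg_dump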

If you want to play around with this stuff and see what happens with a
promote, or try doing a pause instead, you might be interested in:

https://learn.crunchydata.com/

and specifically the pgbackrest one:

https://learn.crunchydata.com/pg-administration/courses/basic-postgresql-for-dbas/pgbackrest/

Basically, it's kind of like a blog post where you can play around on a
scratch system that's built into the page and click through the steps to
see what happens, and change things around if you want.

Thanks,

Stephen




Re: PGPASSWORD in crypted form, for example BlowFish or SHA-256

2019-09-19 Thread Adrian Klaver

On 9/19/19 3:30 AM, Matthias Apitz wrote:


Hello,

Our software, a huge ILS, is running on Linux with DBS Sybase. To
connect to the Sybase server (over the network, even on localhost),
credentials must be known: a user (say 'sisis') and its password.

For Sybase we have them stored on the disk of the system in a file
syb.npw as:

$ cat /opt/lib/sisis/etc/syb/syb.npw
sisis:e53902b9923ab2fb
sa:64406def48efca8c

for the user 'sisis' and the administrator 'sa'. Our software has as
shared library a blob which knows how to decrypt the password hash above
shown as 'e53902b9923ab2fb' into clear text which is then used in the
ESQL/C or Java layer to connect to the Sybase server.

For PostgreSQL the password must be typed in (for pgsql) or can be
provided in an environment variable PGPASSWORD=blabla

Is there somehow an API in PG to use ciphered passwords and provide as a
shared library the blob to decrypt it? If not, we will use the mechanism same as


There is not and I am not sure that would be much use even if it did 
exist. You would be right back at someone being able to grab the 
credentials from a file and feeding them to the database for access.


The system you currently have at least seems to limit access to a 
specific program external to Postgres.



we use for Sybase. Or any other idea to not make detectable the
credentials? This was a request of our customers some years ago.

matthias





--
Adrian Klaver
adrian.kla...@aklaver.com




Re: PGPASSWORD in crypted form, for example BlowFish or SHA-256

2019-09-19 Thread domenico febbo
Hi,
maybe you want to use the pgcrypto [1] encrypt/decrypt functions with a "secret"
key stored outside the database.

See  F.25.4. Raw Encryption Functions

[1] https://www.postgresql.org/docs/11/pgcrypto.html
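
A minimal sketch of that idea (the clear password and the external key below
are placeholders, and this assumes the pgcrypto extension is installed):

CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- encrypt the clear password once, with a key that is kept outside the database,
-- and store the resulting hex string on the application server
SELECT encode(encrypt('s3cret-db-password', 'key-kept-elsewhere', 'bf'), 'hex');

-- round trip, showing that the same external key recovers the clear text
SELECT convert_from(
         decrypt(encrypt('s3cret-db-password', 'key-kept-elsewhere', 'bf'),
                 'key-kept-elsewhere', 'bf'),
         'UTF8');

The key itself still has to live somewhere the application can read, which is
the same trade-off discussed earlier in the thread.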

Regards,

Il giorno gio 19 set 2019 alle ore 16:19 Adrian Klaver <
adrian.kla...@aklaver.com> ha scritto:

> On 9/19/19 3:30 AM, Matthias Apitz wrote:
> >
> > Hello,
> >
> > Our software, a huge ILS, is running on Linux with DBS Sybase. To
> > connect to the Sybase server (over the network, even on localhost),
> > credentials must be known: a user (say 'sisis') and its password.
> >
> > For Sybase we have them stored on the disk of the system in a file
> > syb.npw as:
> >
> > $ cat /opt/lib/sisis/etc/syb/syb.npw
> > sisis:e53902b9923ab2fb
> > sa:64406def48efca8c
> >
> > for the user 'sisis' and the administrator 'sa'. Our software has as
> > shared library a blob which knows how to decrypt the password hash above
> > shown as 'e53902b9923ab2fb' into clear text which is then used in the
> > ESQL/C or Java layer to connect to the Sybase server.
> >
> > For PostgreSQL the password must be typed in (for pgsql) or can be
> > provided in an environment variable PGPASSWORD=blabla
> >
> > Is there somehow an API in PG to use ciphered passwords and provide as a
> > shared library the blob to decrypt it? If not, we will use the mechanism
> same as
>
> There is not and I am not sure that would be much use even if it did
> exist. You would be right back at someone being able to grab the
> credentials from a file and feeding them to the database for access.
>
> The system you currently have at least seems to limit access to a
> specific program external to Postgres.
>
> > we use for Sybase. Or any other idea to not make detectable the
> > credentials? This was a request of our customers some years ago.
> >
> >   matthias
> >
> >
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>
>


Re: pgbackrest - question about restoring cluster to a new cluster on same server

2019-09-19 Thread Ron

On 9/19/19 9:17 AM, Stephen Frost wrote:
[snip]

Ah, but you are talking about a cluster promotion, though you don't
realize it.  Any time there is a "at some point, I was to stop replaying
WAL and start accepting new changes", there's a timeline switch and
notionally a promotion.


The point of the exercise would be to create an older copy of the cluster --
while the production cluster is still running, while production jobs are
still pumping data into the production database -- from before the time of
the data loss, and query it in an attempt to recover the records which were
deleted.

Sure, that's all entirely possible and shouldn't be an issue.  When you
go through the restore process and specify a point where you want the
restore to stop, so that you can connect and pull the down the table,
when PG reaches that point it'll promote and do a timeline switch.

Now, if you don't actually want that restore to promote and come up as a
system that you can write to, you could instead say 'pause', and then
connect to the database and grab whatever data you needed.  That should
also avoid the concern around archive command, provided you never
actually let that system finish recovery and instead just shut it down
while it's still read-only.

If you want to play around with this stuff and see what happens with a
promote, or try doing a pause instead, you might be interested in:

https://learn.crunchydata.com/

and specifically the pgbackrest one:

https://learn.crunchydata.com/pg-administration/courses/basic-postgresql-for-dbas/pgbackrest/

Basically, it's kind of like a blog post where you can play around on a
scratch system that's built into the page and click through the steps to
see what happens, and change things around if you want.


I've been a DBA for 20+ years, and restored a **lot** of **copies** of 
production databases.  PostgreSQL has some seriously different concepts. 
With every other system, it's: restore full backup to new location, restore 
differential backup, apply some roll-forward logs and you're done.  No 
pausing, promoting, etc.


--
Angular momentum makes the world go 'round.




Re: pgbackrest - question about restoring cluster to a new cluster on same server

2019-09-19 Thread Stephen Frost
Greetings,

* Ron (ronljohnso...@gmail.com) wrote:
> I've been a DBA for 20+ years, and restored a **lot** of **copies** of
> production databases.  PostgreSQL has some seriously different concepts.
> With every other system, it's: restore full backup to new location, restore
> differential backup, apply some roll-forward logs and you're done.  No
> pausing, promoting, etc.

Yup, I agree entirely, PostgreSQL is different.

Thanks,

Stephen




Re: PostgreSQL License

2019-09-19 Thread Steve Atkins



On 19/09/2019 13:48, Steve Litt wrote:

My understanding is the PostgreSQL license is more like the MIT
license, which actually allows one to modify the code and claim it as
proprietary.

You could do that, yes. :)

https://wiki.postgresql.org/wiki/PostgreSQL_derived_databases

Cheers,
  Steve




is it safe to drop 25 tb schema with cascade option?

2019-09-19 Thread Julie Nishimura
Hello, we've recently inherited a large Greenplum system (master with standby and
8 segment nodes), which is running an old version of GP:

20190919:15:22:01:122002 gpstate:hitw-mck-gp1-mdw-1:gpadmin-[INFO]:-   
Greenplum initsystem version   = 4.3.4.0 build 1
20190919:15:22:01:122002 gpstate:hitw-mck-gp1-mdw-1:gpadmin-[INFO]:-   
Greenplum current version  = PostgreSQL 8.2.15 (Greenplum
 Database 4.3.8.1 build 1) on x86_64-unknown-linux-gnu, compiled by GCC gcc 
(GCC) 4.4.2 compiled on Apr 20 2016 08:08:56
20190919:15:22:01:122002 gpstate:hitw-mck-gp1-mdw-1:gpadmin-[INFO]:-   Postgres 
version   = 8.2.15

If I scan logs, for the last 6 months I see the following warning after every 
transaction:
04 UTC,0,con9059926,cmd1,seg-1sx1,"WARNING","01000","database 
""my_db_name"" must be vacuumed within 1607900488 transactions",,"To avoid a 
database shutdown, execute a full-database VACUUM in ""my_db_name"".""set 
client_encoding to 'SQL_ASCII'",0,,"varsup.c",109,

The database "my_db_name" is 32 TB. According to the crontab logs, we run 
VACUUM on pg_catalog every day (while the system is online). Should I try to 
run VACUUM FULL on pg_catalog first, or I need to run VACUUM on the entire 
"my_db_name"? I am not sure what I should try first.

For the full picture: the largest schema in "my_db_name" was "temp", which was
consuming about 25 TB. So what we did was rename this schema to "temp_orig" and
create a brand new schema "temp" (to make dropping objects from temp_orig easier
and more isolated). However, I was hesitant to drop a schema that big in one
transaction, so I started dropping tables from "temp_orig" instead; there are
millions of objects in that schema, so the number of "drop table" transactions
is very high. How safe is it to run "DROP SCHEMA temp_orig CASCADE" if the
schema is almost 25 TB?

We are running out of space very quickly; we have only 5% left on the device.

Last time, when we dropped millions of objects from that old schema, we were
able to free up some space, but this time around, even though I am running a
lot of "drop table" statements, the space only temporarily goes down (according
to df -h) and then fills up again, even faster than I am freeing it. Which makes
me believe the system catalog is bloated now.

Any advice is appreciated.

Thanks a lot!



Re: is it safe to drop 25 tb schema with cascade option?

2019-09-19 Thread Adrian Klaver

On 9/19/19 12:06 PM, Julie Nishimura wrote:
Hello, we've recently inherited large Greenplum system (master with 
standby and 8 segment nodes), which is running old version of GP:


20190919:15:22:01:122002 gpstate:hitw-mck-gp1-mdw-1:gpadmin-[INFO]:-   
Greenplum initsystem version   = 4.3.4.0 build 1
20190919:15:22:01:122002 gpstate:hitw-mck-gp1-mdw-1:gpadmin-[INFO]:-   
Greenplum current version  = PostgreSQL 8.2.15 (Greenplum
  Database 4.3.8.1 build 1) on x86_64-unknown-linux-gnu, compiled by GCC 
gcc (GCC) 4.4.2 compiled on Apr 20 2016 08:08:56
20190919:15:22:01:122002 gpstate:hitw-mck-gp1-mdw-1:gpadmin-[INFO]:-   
Postgres version   = 8.2.15


If I scan logs, for the last 6 months I see the following warning after 
every transaction:
04 UTC,0,con9059926,cmd1,seg-1sx1,"WARNING","01000","database 
""my_db_name"" must be vacuumed within 1607900488 transactions",,"To 
avoid a database shutdown, execute a full-database VACUUM in 
""my_db_name"".""set client_encoding to 'SQL_ASCII'",0,,"varsup.c",109,


The database "my_db_name" is 32 TB. According to the crontab logs, we 
run VACUUM on pg_catalog every day (while the system is online). Should 
I try to run VACUUM FULL on pg_catalog first, or I need to run VACUUM on 
the entire "my_db_name"? I am not sure what I should try first.


The vacuum warning is about transaction id wrap around:

https://www.postgresql.org/docs/8.2/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND

You will need to vacuum more than just pg_catalog. You will need to do
what the message says, vacuum the entire database.
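
For clarity, the "full-database VACUUM" the hint asks for is just a plain
VACUUM issued with no table name while connected to that database (not
VACUUM FULL), e.g.:

-- connected to my_db_name
VACUUM;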




For the full picture: the largest schema on "my_db_name" was "temp", it 
was consuming about 25 tb. So what we did - we renamed this schema to 
"temp_orig", and created brand new schema "temp" (to make drop objects 
from temp_orig easier and isolated). However, I was hesitating to drop 
the entire schema that big in one transaction, and started dropping 
tables from "temp_orig", however, there are millions of objects in that 
schema, and as a result, number of "drop table" transactions are very 
high. How safe is it to run "DROP SCHEMA temp_orig CASCADE" if the schema
is almost 25 tb?


Not sure.



We are running out of space very quickly. we have only 5% left on a device

Last time when we dropped millions of objects from that old schema, we 
were able to free up some space, but this time around even though I am 
running a lot of "drop tables", the space temporarily goes down 
(according to df -h), then it goes back again, even faster than I am 
freeing it up. Which makes me believe the system catalog is bloated now.


Probably due to all the other operations hitting the database.

Have you tried vacuuming the system catalogs?



Any advice is appreciated.

Thanks a lot!




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: is it safe to drop 25 tb schema with cascade option?

2019-09-19 Thread Julie Nishimura
Adrian,
We do run vacuum w/o FULL every day:
SYSTABLES="' pg_catalog.' || relname || ';' from pg_class a, pg_namespace b
where a.relnamespace=b.oid and b.nspname='pg_catalog' and a.relkind='r'"

But it does not look like it frees up the space...

Or did you mean we need to run VACUUM on 'my_db_name' without parameters, so
that it runs for every table? I am just not sure how long it will take to run
for 39 tb... :(

Thanks


From: Adrian Klaver 
Sent: Thursday, September 19, 2019 2:06 PM
To: Julie Nishimura 
Subject: Re: is it safe to drop 25 tb schema with cascade option?

On 9/19/19 1:30 PM, Julie Nishimura wrote:
> Adrian, thanks for your reply. We do run VACUUM on pg_catalog every day
> (while the system is online). Should I try to run VACUUM FULL on
> pg_catalog? is it the same as you referring system catalogs?

I would avoid VACUUM FULL as it acquires an exclusive lock on the table
and rewrites the table.:

https://www.postgresql.org/docs/8.2/sql-vacuum.html

A VACUUM w/o FULL will make space available for new tuples which is what
you want.

>
> Thank you!
>
> 
> *From:* Adrian Klaver 
> *Sent:* Thursday, September 19, 2019 12:38 PM
> *To:* Julie Nishimura ; t
> pgsql-general@lists.postgresql.org ;
> pgsql-general 
> *Subject:* Re: is it safe to drop 25 tb schema with cascade option?
> On 9/19/19 12:06 PM, Julie Nishimura wrote:
>> Hello, we've recently inherited large Greenplum system (master with
>> standby and 8 segment nodes), which is running old version of GP:
>>
>> 20190919:15:22:01:122002 gpstate:hitw-mck-gp1-mdw-1:gpadmin-[INFO]:-
>> Greenplum initsystem version   = 4.3.4.0 build 1
>> 20190919:15:22:01:122002 gpstate:hitw-mck-gp1-mdw-1:gpadmin-[INFO]:-
>> Greenplum current version  = PostgreSQL 8.2.15 (Greenplum
>>   Database 4.3.8.1 build 1) on x86_64-unknown-linux-gnu, compiled by GCC
>> gcc (GCC) 4.4.2 compiled on Apr 20 2016 08:08:56
>> 20190919:15:22:01:122002 gpstate:hitw-mck-gp1-mdw-1:gpadmin-[INFO]:-
>> Postgres version   = 8.2.15
>>
>> If I scan logs, for the last 6 months I see the following warning after
>> every transaction:
>> 04 UTC,0,con9059926,cmd1,seg-1sx1,"WARNING","01000","database
>> ""my_db_name"" must be vacuumed within 1607900488 transactions",,"To
>> avoid a database shutdown, execute a full-database VACUUM in
>> ""my_db_name"".""set client_encoding to 'SQL_ASCII'",0,,"varsup.c",109,
>>
>> The database "my_db_name" is 32 TB. According to the crontab logs, we
>> run VACUUM on pg_catalog every day (while the system is online). Should
>> I try to run VACUUM FULL on pg_catalog first, or I need to run VACUUM on
>> the entire "my_db_name"? I am not sure what I should try first.
>
> The vacuum warning is about transaction id wrap around:
>
> https://www.postgresql.org/docs/8.2/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND
>
> You will need to vacuum more then just pg_catalog. You will need to do
> what the message says, vacuum the entire database.
>
>>
>> For the full picture: the largest schema on "my_db_name" was "temp", it
>> was consuming about 25 tb. So what we did - we renamed this schema to
>> "temp_orig", and created brand new schema "temp" (to make drop objects
>> from temp_orig easier and isolated). However, I was hesitating to drop
>> the entire schema that big in one transaction, and started dropping
>> tables from "temp_orig", however, there are millions of objects in that
>> schema, and as a result, number of "drop table" transactions are very
>> high. How safe is it to run "DROP SCHEMA temp_orig CASCADE" if the schema
>> is almost 25 tb?
>
> Not sure.
>
>>
>> We are running out of space very quickly. we have only 5% left on a device
>>
>> Last time when we dropped millions of objects from that old schema, we
>> were able to free up some space, but this time around even though I am
>> running a lot of "drop tables", the space temporarily goes down
>> (according to df -h), then it goes back again, even faster than I am
>> freeing it up. Which makes me believe the system catalog is bloated now.
>
> Probably due to all the other operations hitting the database.
>
> Have you tried vacuuming the system catalogs?
>
>>
>> Any advice is appreciated.
>>
>> Thanks a lot!
>>
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com


--
Adrian Klaver
adrian.kla...@aklaver.com


Re: is it safe to drop 25 tb schema with cascade option?

2019-09-19 Thread Adrian Klaver

On 9/19/19 2:24 PM, Julie Nishimura wrote:

Adrian,
We do run vacuum w/o FULL every day:
SYSTABLES="' pg_catalog.' || relname || ';' from pg_class a,
pg_namespace b where a.relnamespace=b.oid and b.nspname='pg_catalog'
and a.relkind='r'"

But it does not look like it frees up the space...


It won't return space to the OS; it just marks it as available for reuse
by Postgres.




Or you meant we need to run vacuum on 'my_db_name' without parameters,  
that it runs for every table? I am just not sure how long it will take 
to run for 39 tb...:(


Not sure. The bottom line is you are running out of transaction ids and
if the txid counter wraps things get ugly. You could try vacuuming
individual non-system tables that have a lot of churn (UPDATEs/DELETEs)
and see if that buys you some ids.




Thanks


*From:* Adrian Klaver 
*Sent:* Thursday, September 19, 2019 2:06 PM
*To:* Julie Nishimura 
*Subject:* Re: is it safe to drop 25 tb schema with cascade option?
On 9/19/19 1:30 PM, Julie Nishimura wrote:
Adrian, thanks for your reply. We do run VACUUM on pg_catalog every day 
(while the system is online). Should I try to run VACUUM FULL on 
pg_catalog? is it the same as you referring system catalogs?


I would avoid VACUUM FULL as it acquires an exclusive lock on the table
and rewrites the table.:

https://www.postgresql.org/docs/8.2/sql-vacuum.html

A VACUUM w/o FULL will make space available for new tuples which is what
you want.



Thank you!


*From:* Adrian Klaver 
*Sent:* Thursday, September 19, 2019 12:38 PM
*To:* Julie Nishimura ; t
pgsql-general@lists.postgresql.org ; 
pgsql-general 

*Subject:* Re: is it safe to drop 25 tb schema with cascade option?
On 9/19/19 12:06 PM, Julie Nishimura wrote:
Hello, we've recently inherited large Greenplum system (master with 
standby and 8 segment nodes), which is running old version of GP:


20190919:15:22:01:122002 gpstate:hitw-mck-gp1-mdw-1:gpadmin-[INFO]:-   
Greenplum initsystem version   = 4.3.4.0 build 1
20190919:15:22:01:122002 gpstate:hitw-mck-gp1-mdw-1:gpadmin-[INFO]:-   
Greenplum current version  = PostgreSQL 8.2.15 (Greenplum
   Database 4.3.8.1 build 1) on x86_64-unknown-linux-gnu, compiled by GCC 
gcc (GCC) 4.4.2 compiled on Apr 20 2016 08:08:56
20190919:15:22:01:122002 gpstate:hitw-mck-gp1-mdw-1:gpadmin-[INFO]:-   
Postgres version   = 8.2.15


If I scan logs, for the last 6 months I see the following warning after 
every transaction:
04 UTC,0,con9059926,cmd1,seg-1sx1,"WARNING","01000","database 
""my_db_name"" must be vacuumed within 1607900488 transactions",,"To 
avoid a database shutdown, execute a full-database VACUUM in 
""my_db_name"".""set client_encoding to 'SQL_ASCII'",0,,"varsup.c",109,


The database "my_db_name" is 32 TB. According to the crontab logs, we 
run VACUUM on pg_catalog every day (while the system is online). Should 
I try to run VACUUM FULL on pg_catalog first, or I need to run VACUUM on 
the entire "my_db_name"? I am not sure what I should try first.


The vacuum warning is about transaction id wrap around:

https://www.postgresql.org/docs/8.2/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND

You will need to vacuum more then just pg_catalog. You will need to do
what the message says, vacuum the entire database.



For the full picture: the largest schema on "my_db_name" was "temp", it 
was consuming about 25 tb. So what we did - we renamed this schema to 
"temp_orig", and created brand new schema "temp" (to make drop objects 
from temp_orig easier and isolated). However, I was hesitating to drop 
the entire schema that big in one transaction, and started dropping 
tables from "temp_orig", however, there are millions of objects in that 
schema, and as a result, number of "drop table" transactions are very 
high. How safe is it to run "DROP SCHEMA temp_orig CASCADE" if the schema
is almost 25 tb?


Not sure.



We are running out of space very quickly. we have only 5% left on a device

Last time when we dropped millions of objects from that old schema, we 
were able to free up some space, but this time around even though I am 
running a lot of "drop tables", the space temporarily goes down 
(according to df -h), then it goes back again, even faster than I am 
freeing it up. Which makes me believe the system catalog is bloated now.


Probably due to all the other operations hitting the database.

Have you tried vacuuming the system catalogs?



Any advice is appreciated.

Thanks a lot!




--
Adrian Klaver
adrian.kla...@aklaver.com



--
Adrian Klaver
adrian.kla...@aklaver.com



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: is it safe to drop 25 tb schema with cascade option?

2019-09-19 Thread Adrian Klaver

On 9/19/19 2:24 PM, Julie Nishimura wrote:

Adrian,
We do run vacuum w/o FULL every day:
SYSTABLES="' pg_catalog.' || relname || ';' from pg_class a,
pg_namespace b where a.relnamespace=b.oid and b.nspname='pg_catalog'
and a.relkind='r'"

But it does not look like it frees up the space...

Or you meant we need to run vacuum on 'my_db_name' without parameters,  
that it runs for every table? I am just not sure how long it will take 
to run for 39 tb...:(


Should have added to previous post that in this link:

https://www.postgresql.org/docs/8.2/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND

there are some queries that will show you the XID status for tables
and the database.
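
For example, the database-level check from that page boils down to something like:

SELECT datname, age(datfrozenxid) FROM pg_database ORDER BY age(datfrozenxid) DESC;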






Thanks




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: is it safe to drop 25 tb schema with cascade option?

2019-09-19 Thread Tom Lane
Adrian Klaver  writes:
> On 9/19/19 2:24 PM, Julie Nishimura wrote:
>> Or you meant we need to run vacuum on 'my_db_name' without parameters,  
>> that it runs for every table? I am just not sure how long it will take 
>> to run for 39 tb...:(

> Not sure. The bottom line is you are running out of transaction ids and 
> if the txid counter wraps things get ugly. You could try vacuuming 
> individual non-system tables that have a lot of churn(UPDATES/DELETES) 
> and see if that buys you some ids.

You're going to have to vacuum *all* of them to get the global freeze
counter to advance, as it's basically just the min of the per-table
pg_class.relfrozenxid values.  It might be worth looking at
age(relfrozenxid) to see which tables are furthest back, as those
are the one(s) holding things up; but if you've not been running
autovacuum then it's likely that they are all old.
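
A minimal version of that per-table check might be:

SELECT relname, age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind = 'r'
ORDER BY age(relfrozenxid) DESC
LIMIT 20;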

Do NOT NOT NOT use VACUUM FULL here.  It will not make things any
better than a plain VACUUM, and it will take a lot longer ... time
you maybe haven't got.

regards, tom lane




Re: is it safe to drop 25 tb schema with cascade option?

2019-09-19 Thread Adrian Klaver

On 9/19/19 2:42 PM, Tom Lane wrote:

Adrian Klaver  writes:

On 9/19/19 2:24 PM, Julie Nishimura wrote:

Or you meant we need to run vacuum on 'my_db_name' without parameters,
that it runs for every table? I am just not sure how long it will take
to run for 39 tb...:(



Not sure. The bottom line is you are running out of transaction ids and
if the txid counter wraps things get ugly. You could try vacuuming
individual non-system tables that have a lot of churn(UPDATES/DELETES)
and see if that buys you some ids.


You're going to have to vacuum *all* of them to get the global freeze
counter to advance, as it's basically just the min of the per-table
pg_class.relfrozenxid values.  It might be worth looking at
age(relfrozenxid) to see which tables are furthest back, as those
are the one(s) holding things up; but if you've not been running
autovacuum then it's likely that they are all old.


So this is the part I need some clarification on. If you start vacuuming 
the tables that are setting the min then that should buy you some time?




Do NOT NOT NOT use VACUUM FULL here.  It will not make things any
better than a plain VACUUM, and it will take a lot longer ... time
you maybe haven't got.

regards, tom lane




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: FW: Re: FW: Re: Shouldn;t this trigger be called?

2019-09-19 Thread Michael Lewis
You can also look at citext type to avoid the casting.

customer_keyinteger DEFAULT
nextval('customer_key_serial') PRIMARY KEY ,
cust_no smallint NOT NULL UNIQUE ,
namevarchar UNIQUE ,

Why do you have a surrogate primary key generated by a sequence when you
have a natural key of either cust_no or name? Why not just declare the
customer number to be the PK? Where does customer number come from anyway?
Using smallint seems potentially short-sighted on potential future growth,
but changing the type later should be minimal work as long as you don't
have this customer_number denormalized many places, or use it as the FKey
after dropping customer_key surrogate key.
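
To illustrate the citext suggestion above, a hypothetical sketch (table and
column names are made up, and the citext extension must be installed):

CREATE EXTENSION IF NOT EXISTS citext;

CREATE TABLE customer (
    cust_no smallint PRIMARY KEY,   -- natural key, as suggested
    name    citext UNIQUE           -- 'Acme' and 'ACME' now collide without any casting
);

INSERT INTO customer VALUES (1, 'Acme');
INSERT INTO customer VALUES (2, 'ACME');  -- fails with a unique_violation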


Re: is it safe to drop 25 tb schema with cascade option?

2019-09-19 Thread Tom Lane
Adrian Klaver  writes:
> So this is the part I need some clarification on. If you start vacuuming 
> the tables that are setting the min then that should buy you some time?

Yes, if there are some small tables that have relfrozenxid even further
back than the big ones, vacuuming them first would be a win.

regards, tom lane




Re: is it safe to drop 25 tb schema with cascade option?

2019-09-19 Thread Ron

On 9/19/19 4:55 PM, Tom Lane wrote:

Adrian Klaver  writes:

So this is the part I need some clarification on. If you start vacuuming
the tables that are setting the min then that should buy you some time?

Yes, if there are some small tables that have relfrozenxid even further
back than the big ones, vacuuming them first would be a win.


With this?

CDSCH02=# select cl.oid,
   COALESCE(ta.schemaname, 'pg_toast'),
   cl.relname,
   cl.relfrozenxid,
   age(cl.relfrozenxid)
from pg_class cl FULL JOIN pg_tables ta
   ON ta.tablename = cl.relname
where not cl.relfrozenxid = xid '0'
order by age(cl.relfrozenxid) desc
;



--
Angular momentum makes the world go 'round.




Re: PGPASSWORD in crypted form, for example BlowFish or SHA-256

2019-09-19 Thread rob stone



On Thu, 2019-09-19 at 15:23 +0200, Matthias Apitz wrote:
> On Thursday, September 19, 2019 at 10:31:01 PM +1000, rob stone
> wrote:
> 
> > 
> > https://www.postgresql.org/docs/11/auth-password.html
> > 
> > Chapters 20.5 and 20.6 may give you more information.
> 
> The form of the password hash store in the PG server or interchange
> over
> the network is not my question. The question is more: When the Linux
> server starts and with this the (ESQL/C written) application servers
> are
> starting, they need the password to connect and this is not provided
> at
> this moment from some keyboard or humanbeing. It must be stored on
> the
> server and available in clear for the server, but not for other eyes
> on
> the server, i.e. the place of the sorage must be ciphered.
> 
>   matthias
> 

Sorry. More caffeine needed.

If you use pg_service.conf you could write a little program to encrypt
the password and store it in this file in its encrypted form.
Then your application obtains the connection credentials from
pg_service.conf, decrypts the password and is then able to form the
connection string to access the required database.
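
A hypothetical sketch of such an entry (the service name and values are
invented; note that libpq itself would not decrypt the password field, your
application would):

# ~/.pg_service.conf (or the file named by PGSERVICEFILE)
[sisis]
host=localhost
port=5432
dbname=sisisdb
user=sisis
password=e53902b9923ab2fb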

HTH,
Robert







Re: problems importing from csv

2019-09-19 Thread Michael Lewis
>I am trying to import some data from spreadsheets. Included in the data
>sets are US monetary values. These appear in the CSV file, like this: $1.00
>The column is defined like this: NUMERIC(5,2) NOT NULL.

1) remove all $ characters from csv before import
OR
2) import into a text field (perhaps in a temp table), remove $ characters,
cast the value and insert it into the numeric field (see the sketch below)
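
A hypothetical sketch of option 2 (the table and column names, and the file
name, are made up):

CREATE TEMP TABLE staging (price_raw text);

\copy staging FROM 'prices.csv' WITH (FORMAT csv)

INSERT INTO target_table (price)
SELECT replace(price_raw, '$', '')::numeric(5,2)
FROM staging;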

By the way, there may be no benefit to specifying a max value of 99,999.99
if you can foresee a potential need for bigger values. For values that fit,
numeric(1000,2) will store numbers in the same bytes as a numeric(5,2)
field will. It just won't throw an error if large values are used in the
future.