Security Vulnerability on PostgreSQL VMs

2020-07-17 Thread Hilbert, Karin
We have PostgreSQL v9.6 & also PostgreSQL v11.8 installed on various Linux VMs 
with Red Hat Enterprise Linux Server release 7.8 (Maipo) OS.  We're also 
running repmgr v5.1.0 & PgBouncer v1.13.

We're getting vulnerability reports from our Security Office for the following 
packages:
 - python-pulp-agent-lib-2.13.4.16-1.el7sat
 - python-gofer-2.12.5-5.el7sat

For some reason these packages aren't being updated to the current versions & 
our Linux Admins haven't been able to resolve the update issue.  It has 
something to do with Red Hat Satellite?  (I'm not a Linux Admin - I don't 
really know what they're talking about.)  Anyway, are these packages anything 
that would be required by PostgreSQL, repmgr or PgBouncer?  They're nothing 
that I installed on the VMs - I assume they were installed along with the OS.  
The Linux Admins' recommendation is to just remove these packages.
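For reference, one way to check whether anything in the database stack depends 
on them (a sketch - the PostgreSQL/repmgr/PgBouncer package names below are 
only examples, substitute whatever is actually installed):

$ rpm -q --whatrequires python-gofer python-pulp-agent-lib
$ rpm -qR postgresql11-server repmgr11 pgbouncer | grep -Ei 'gofer|pulp'

If neither command reports a dependency, removing the two packages shouldn't 
affect PostgreSQL, repmgr or PgBouncer.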

Thanks,

Karin Hilbert



New message in PostgreSQL log regarding socket for statistics collector

2022-09-16 Thread Hilbert, Karin
I'm working on an implementation of PostgreSQL 13.8 in a Linux RHEL8 sandbox 
environment.
The server & PostgreSQL have all been configured.
I started PostgreSQL & got the following messages in the log file:

2022-09-16 02:00:16 EDT [1918984]: [3-1] db=,user= LOG:  starting PostgreSQL 
13.8 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 
8.5.0-10), 64-bit
2022-09-16 02:00:16 EDT [1918984]: [4-1] db=,user= LOG:  listening on IPv4 
address "0.0.0.0", port 5432
2022-09-16 02:00:16 EDT [1918984]: [5-1] db=,user= LOG:  listening on IPv6 
address "::", port 5432
2022-09-16 02:00:16 EDT [1918984]: [6-1] db=,user= LOG:  listening on Unix 
socket "/var/run/postgresql/.s.PGSQL.5432"
2022-09-16 02:00:16 EDT [1918984]: [7-1] db=,user= LOG:  listening on Unix 
socket "/tmp/.s.PGSQL.5432"
2022-09-16 02:00:17 EDT [1918984]: [8-1] db=,user= LOG:  test message did not 
get through on socket for statistics collector
2022-09-16 02:00:17 EDT [1918984]: [9-1] db=,user= LOG:  trying another address 
for the statistics collector
2022-09-16 02:00:17 EDT [1918986]: [1-1] db=,user= LOG:  database system was 
shut down at 2022-09-14 17:26:53 EDT
2022-09-16 02:00:17 EDT [1918984]: [10-1] db=,user= LOG:  database system is 
ready to accept connections

I'm concerned about the "test message did not get through on socket for 
statistics collector" & "trying another address for the statistics collector" 
messages.

I've never seen these before & wasn't able to find anything about them via a 
Google search.  What I did find talked about the unix socket directories.  I 
left that parameter at the default setting:

#unix_socket_directories = '/var/run/postgresql, /tmp'    # comma-separated list of directories

I checked both of those locations & found the following entries:

[postgres@xx ~]$ ls -al /var/run/postgresql/
total 4
drwxr-xr-x  2 postgres postgres  80 Sep 16 02:00 .
drwxr-xr-x 30 root root 900 Aug 31 13:17 ..
srwxrwxrwx  1 postgres postgres   0 Sep 16 02:00 .s.PGSQL.5432
-rw-------  1 postgres postgres  67 Sep 16 02:00 .s.PGSQL.5432.lock

[postgres@xx ~]$ cat /var/run/postgresql/.s.PGSQL.5432.lock
1918984
/var/lib/pgsql/13/data
1663308016
5432
/var/run/postgresql

[postgres@xx ~]$ ls -al /tmp/ | grep .s.PGSQL
srwxrwxrwx   1 postgres postgres  0 Sep 16 02:00 .s.PGSQL.5432
-rw-------   1 postgres postgres 52 Sep 16 02:00 .s.PGSQL.5432.lock

[postgres@xx ~]$ cat /tmp/.s.PGSQL.5432.lock
1918984
/var/lib/pgsql/13/data
1663308016
5432
/tmp

Is there anything that I should be concerned about regarding these messages or 
are they normal?
Also, should I change the unix_socket_directories parameter to just have one 
directory?  In past versions of PostgreSQL, we've just had one directory:

#unix_socket_directories = '/tmp'

Thanks, Karin
Karin Hilbert
Database Administration
Pennsylvania State University


Re: New message in PostgreSQL log regarding socket for statistics collector

2022-09-16 Thread Hilbert, Karin
Christoph,
Thank you for your explanation.
That puts my mind at ease.
Regards, Karin

From: Christoph Moench-Tegeder 
Sent: Friday, September 16, 2022 3:43 PM
To: Hilbert, Karin 
Cc: pgsql-general@lists.postgresql.org 
Subject: Re: New message in PostgreSQL log regarding socket for statistics 
collector

## Hilbert, Karin (i...@psu.edu):

> 2022-09-16 02:00:16 EDT [1918984]: [7-1] db=,user= LOG:  listening on Unix 
> socket "/tmp/.s.PGSQL.5432"
> 2022-09-16 02:00:17 EDT [1918984]: [8-1] db=,user= LOG:  test message did not 
> get through on socket for statistics collector
> 2022-09-16 02:00:17 EDT [1918984]: [9-1] db=,user= LOG:  trying another 
> address for the statistics collector

> I'm concerned about the "test message did not get through on socket
> for statistics collector" & "trying another address for the statistics
> collector" messages.

That's the stats collector socket, which is used to communicate runtime
statistic updates inside the server. As a user, you'll never interact
with this socket directly. It's created on the first working address
which the system returns as "localhost". The server will try all the
"localhost" addresses until the socket can be created and passes the
test message, logging messages similar to your message for each failure.
In your case the test message was not received (or it was received
too late - the server waits only 0.5 seconds for the test message).
In any case, a working stats socket was created on your machine:
without a working stats socket server startup will fail. (You should
be able to see that socket in each PostgreSQL process' file descriptors:
UDP, localhost, high port, "connected" to the very same address/port
tuple).
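For example (a sketch - the data directory path is taken from your lock file
output above, and lsof/ss may need to be run as the postgres user or root):

$ lsof -a -iUDP -p $(head -1 /var/lib/pgsql/13/data/postmaster.pid)
$ ss -u -a -n -p | grep postgres

Either should show a UDP socket bound to a localhost address and connected back
to the very same address/port - that is the working stats collector socket.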
Why did the test message fail on your first socket? Maybe there's
some filtering in place, or your system was very slow and missed
the 500ms timeout, or the address was otherwise unusable (not
available in that namespace, trying to use IPv4/IPv6 when not
configured... I'd have to test which condition results in failure
at this stage).

Regards,
Christoph

--
Spare Space


Database size different on Primary and Standby?

2023-01-18 Thread Hilbert, Karin
I manage some PostgreSQL clusters on Linux.  We have a Primary & two Standby 
servers & for Production, there is also a DR server.  We use repmgr for our HA 
solution & the Standbys are cloned from the Primary using the repmgr standby 
clone command.

My manager asked for a report of all the user databases & their sizes for each 
server in the cluster.
I used the psql "\l+" command & then extracted the database name & the size 
from the output.
I expected the databases to be the same size on the Standbys as on the Primary, 
but I found that some of the databases were smaller on the Standby servers than 
on the Primary.

For example, the output on the Primary for one of the user databases showed as: 
 8997 kB, but on the Standbys, it was 8849 kB.
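For reference, a query like the following should return the same sizes that
\l+ reports (just a sketch):

SELECT datname, pg_size_pretty(pg_database_size(datname)) AS size
FROM pg_database
WHERE NOT datistemplate
ORDER BY datname;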

I even dropped the database on the Primary & then restored it from a backup.  
Then I checked the sizes again & they still showed the difference.

I also found that the template1 database on the Primary was 7821 kB, but on the 
Standbys, it was 7673 kB.
Is this normal?  Why would the sizes be different?

Thanks,
Karin Hilbert
Database Administration
Pennsylvania State University



Manage PostgreSQL Database for GITLAB Application?

2019-01-21 Thread Hilbert, Karin
Does anyone manage a PostgreSQL database for a GITLAB application?

I have PostgreSQL v9.6 installed on my server & we are trying to migrate a 
GITLAB database there.

The developer says that we need to use the public schema instead of the schema 
of the same name as the application user.

The schema dump that he provided for me to restore also revokes all privileges 
from the database owner & instead grants all privileges to PUBLIC.

Has anyone else run across this?  I always thought that granting privileges to 
PUBLIC was a bad security practice?

If anyone can offer any thoughts regarding this, it would be greatly 
appreciated.

Thanks, Karin


Karin Hilbert
Database Specialist
Administrative Information Services
Pennsylvania State University
25 Shields Bldg., University Park, PA  16802
Work - 814-863-3633
Email - i...@psu.edu
IM - i...@chat.psu.edu


Re: Manage PostgreSQL Database for GITLAB Application?

2019-01-21 Thread Hilbert, Karin
Thanks Stephen,


I'm under the gun to get this database restored & then tested with the 
application.

I'll try changing the schema back from public to the original schema (the same 
as the application user account name).  If that doesn't work for the 
application, then I'll try leaving the schema as public.

I'll definitely remove the statements revoking privileges from the dbowner & 
change the grant statements back to the application account instead of PUBLIC.


The only access to the database is from the gitlab application (I guess that's 
what you mean by "I'd definitely have the database be dedicated to gitlab.")


I make the developer have his application connect with the application user 
account for normal operations.  When his application undergoes an upgrade, it 
also needs to be able to update the database.  I always made him connect with 
the dbowner account for this & then switch the connection back to the 
application user account when the upgrade was done.


Thanks for confirming my thoughts about public.  I was starting to second guess 
myself.


May I also ask your thoughts regarding something else for the gitlab database?

We have two instances; one for development & one for production.  When we 
originally created the databases, we had separate names for the database, 
schema & application user:


dbname_dev/dbname_prod

sname/snamep

username/usernamep


The other year, we had to restore the prod database backup to dev & that 
changed the schema name.  I was thinking that it would be better to have the 
same names used for dev & prod so that restores from one environment to another 
would be easier.  (That's a standard that our DBA team employs for our SQL 
Server databases.)  Does it make sense to also employ that standard for 
PostgreSQL databases?  Is there any reason to keep the names different between 
the environments?


Thanks again for your help.

Regards,

Karin

________
From: Stephen Frost 
Sent: Monday, January 21, 2019 1:53:00 PM
To: Hilbert, Karin
Cc: pgsql-general@lists.postgresql.org
Subject: Re: Manage PostgreSQL Database for GITLAB Application?

Greetings,

* Hilbert, Karin (i...@psu.edu) wrote:
> Does anyone manage a PostgreSQL database for a GITLAB application?

Yes.

> I have PostgreSQL v9.6 installed on my server & we are trying to migrate a 
> GITLAB database there.
>
> The developer says that we need to use the public schema instead of the 
> schema of the same name as the application user.

Not sure this is really required, but it also shouldn't hurt anything;
I'd definitely have the database be dedicated to gitlab.

> The schema that he provided me to restore also is revoking all privileges 
> from the database owner & instead granting all privileges to PUBLIC.

That's terrible.

> Has anyone else run across this?  I always thought that granting privileges 
> to PUBLIC is a bad security thing to do?

Yes, that's bad from a security perspective and shouldn't be necessary.
GRANT rights to the user(s) the application logs into; don't just grant
them to PUBLIC, which would allow anyone on the system to have access.

> If anyone can offer any thoughts regarding this, it would be greatly 
> appreciated.

Is this developer the only one who is going to be using this gitlab
instance?  Sounds like maybe they want direct database access, which
would only make sense if they're the one running it and should have full
access; but even then, I'd create a role, grant access to that role,
and then grant them that role, if that's the requirement.  GRANT'ing
things to public isn't a good idea if you're at all concerned about
security.
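Something along these lines, for example (a sketch with made-up names -
gitlabdb and gitlab_app are only placeholders):

CREATE ROLE gitlab_app LOGIN PASSWORD '...';
REVOKE ALL ON DATABASE gitlabdb FROM PUBLIC;
GRANT CONNECT ON DATABASE gitlabdb TO gitlab_app;
GRANT USAGE ON SCHEMA public TO gitlab_app;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO gitlab_app;
GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA public TO gitlab_app;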

Thanks!

Stephen


ALTER DEFAULT PRIVILEGES FOR ROLE

2019-01-30 Thread Hilbert, Karin
After a database was updated by the application, a schema dump showed the 
following default privilege statements:

--
-- Name: DEFAULT PRIVILEGES FOR SEQUENCES; Type: DEFAULT ACL; Schema: public; 
Owner: gitlab_dbo
--

ALTER DEFAULT PRIVILEGES FOR ROLE <dbowner> IN SCHEMA public REVOKE ALL ON 
SEQUENCES FROM <dbowner>;
ALTER DEFAULT PRIVILEGES FOR ROLE <dbowner> IN SCHEMA public GRANT SELECT,USAGE 
ON SEQUENCES TO <grantee>;


--
-- Name: DEFAULT PRIVILEGES FOR TABLES; Type: DEFAULT ACL; Schema: public; 
Owner: 
--

ALTER DEFAULT PRIVILEGES FOR ROLE <dbowner> IN SCHEMA public REVOKE ALL ON 
TABLES FROM <dbowner>;
ALTER DEFAULT PRIVILEGES FOR ROLE <dbowner> IN SCHEMA public GRANT 
SELECT,INSERT,DELETE,UPDATE ON TABLES TO <grantee>;

Why would you want to revoke all privileges from the dbowner?
It actually had granted the privileges to PUBLIC, but I revoked those 
privileges & changed it to the app account.


What is the difference between these two statements?
ALTER DEFAULT PRIVILEGES FOR ROLE <role> IN SCHEMA public GRANT ... TO <grantee>;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ... TO <grantee>;




Karin Hilbert
Database Specialist
Administrative Information Services
Pennsylvania State University
25 Shields Bldg., University Park, PA  16802
Work - 814-863-3633
Email - i...@psu.edu
IM - i...@chat.psu.edu


Drive Architecture for new PostgreSQL Environment

2019-06-10 Thread Hilbert, Karin
Hello,


We're in the process of building a new PostgreSQL environment on Scientific 
Linux release 7.6.

The new environment will have a Primary & 2 Standby servers & have asynchronous 
replication.  It will use repmgr to manage failover/switchover events.


In the past, we've always had separate physical drives for data, pg_xlog & 
backups.

We did this as a precaution against disk failure.  If we lose one, we would 
still have the other two to recover from.

Is that really necessary anymore, with having a repmgr cluster?


My Linux Admin wants to do the following instead:

What I propose is to set this up as a single drive and isolate the three 
directories using the Linux logical volume manager.  As a result, each 
directory would be on a separate filesystem.  This would provide the isolation 
that you require but would give me the ability to modify the sizes of the 
volumes should you run out of space.  Also, since this is a VM and all drives 
are essentially “virtual”, the performance of this different drive structure 
would be essentially identical to one with three separate drives.
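A rough sketch of that layout (the device, volume group and mount point names 
are only examples):

pvcreate /dev/sdb
vgcreate vg_pgsql /dev/sdb
lvcreate -L 200G -n lv_pgdata   vg_pgsql
lvcreate -L  50G -n lv_pgwal    vg_pgsql
lvcreate -L 300G -n lv_backups  vg_pgsql
mkfs.xfs /dev/vg_pgsql/lv_pgdata
mkfs.xfs /dev/vg_pgsql/lv_pgwal
mkfs.xfs /dev/vg_pgsql/lv_backups
# mount them (e.g. at /var/lib/pgsql, /pg_wal, /pgbackups) via /etc/fstab;
# each volume can later be grown online with lvextend -r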


Your thoughts would be appreciated.

Regards,

Karin Hilbert



Enterprise Active Directory Authentication?

2021-03-24 Thread Hilbert, Karin
I manage a PostgreSQL database cluster - PostgreSQL v12.4.
Our database cluster is on a Linux VM, with OS:
  Flavor: redhat_7
  Release: 3.10.0-1160.15.2.el7.x86_64

We have a repmgr cluster of 1 Primary & 2 Standby servers & use another server 
with PgBouncer to direct the connections to the current Primary.  Our firewall 
team writes exceptions for the application connections into the PgBouncer 
server.

We are in the process of adding a new database for a new client.
We normally create our databases & objects with a DB Owner account & the 
application connects with a "Hero" user account that has read/write permissions.

However, the new customer has asked if it would be possible for their employees 
to connect (via a VPN) from multiple locations with existing Enterprise Active 
Directory  (EAD) groups instead of using connections from the application with 
a single "Hero" account.

With the PostgreSQL/PgBouncer combination, we put entries in the pg_hba.conf 
file for PostgreSQL.  A normal entry for an external connection to our 
databases would look like this:

hostssl  dbName  userName  IPAddress/32  md5

& the PgBouncer database.ini & userlist entries, respectively, would be:

dbAlias = host=PostgreSQLServerName dbname=dbName auth_user=HeroAcct
"HeroAcct" "md5"

where
dbAlias stands for the PgBouncer database alias
dbName stands for the PostgreSQL database name
HeroAcct stands for the username used for the application connection to the 
database
PostgreSQLServerName  stands for the current Primary PostgreSQL server
and IPAddress is the IP for the PgBouncer server

Is it possible to allow connections from EAD groups instead of individual 
accounts & if so, can you tell me how the config files would be changed to 
allow it?

Thanks,

Karin Hilbert



How to Resolve Data Being Truncated or Rounded Up During PostgreSQL Migration from v9.6.23 to v12.8?

2021-10-28 Thread Hilbert, Karin
I manage PostgreSQL databases - we currently have clusters on PostgreSQL 
v9.6.23 & PostgreSQL v12.8.
Our database clusters are on Linux VMs, with OS:
  Flavor: redhat_7
  Release: 3.10.0-1160.45.1.el7.x86_64

We have repmgr clusters of 1 Primary & 2 Standby servers & use another server 
with PgBouncer to direct the connections to the current Primary.

I am in the process of migrating the v9.6.23 databases to the v12.8 cluster, 
which already has live databases on it, so I'm doing a pg_dump on the v9.6 
cluster for the individual databases to be migrated & restoring the backups to 
the v12.8 cluster.  I'm currently testing in a sandbox cluster.  The restore 
completes successfully.
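For reference, each database is moved roughly like this (a sketch - host and
database names are placeholders):

$ pg_dump -h old96host -Fc -d appdb -f appdb.dump
$ createdb -h new12host appdb
$ pg_restore -h new12host -d appdb appdb.dump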

After the restore, I compare the rowcounts of the dbs from both versions to 
verify that the data loaded correctly.
I also do a pg_dump of just the data from both clusters & compare them with the 
diff utility.  For one of the databases, I'm discovering some differences in 
the data.  It looks like some data is being truncated:

5,6c5,6
< -- Dumped from database version 9.6.23
< -- Dumped by pg_dump version 9.6.23
---
> -- Dumped from database version 12.8
> -- Dumped by pg_dump version 12.8
34085c34085
<   xxx  P  108  xxx  UP  FI  -xx-xx  53809.6016  53809.6016  52W  0  xxx  0  x  \N
---
>   xxx  P  108  xxx  UP  FI  -xx-xx  53809.6  53809.6  52W  0  xxx  0  x  \N
34088c34088
<   xxx  P  108  xxx  UP  FI  -xx-xx  53809.6016  53809.6016  52W  0  xxx  0  x  \N
---
>   xxx  P  108  xxx  UP  FI  -xx-xx  53809.6  53809.6  52W  0  xxx  0  x  \N
… <= data is truncated in new database
147825,147826c147825,147826
<   \N  \N  \N  46716.8008  \N  \N  \N  \N  \N  \N  \N
<   \N  \N  \N  38729.6016  \N  \N  \N  \N  \N  \N  \N
---
>   \N  \N  \N  46716.8  \N  \N  \N  \N  \N  \N  \N
>   \N  \N  \N  38729.6  \N  \N  \N  \N  \N  \N  \N


When I looked at the table specification, it is the same in both versions & the 
affected columns are specified as datatype real:

   Table "tablex"
  Column  | Type | Modifiers
--+--+---
 id   | integer  | not null
 column2  | character(8) | not null
 column3  | character(3) | not null
 column4  | character(1) |
 column5  | character(4) |
 column6  | character(10)|
 column7  | character(2) |
 column8  | date |
 column9  | real |
 column10 | real |


When I do a select on each database version, the results both display the 
truncated data:

 id | column9 | column10
----+---------+----------
    | 53809.6 |  53809.6
(1 row)

And when I try to export the data from both versions, the data also exports 
with a 1-digit decimal for those columns.
It's only when I do the pg_dump that I can see the extra digits from the 
v9.6.23 tables.

In other tables, I'm seeing differences where only 2 decimal digits show for 
columns whose datatype is real - the values appear to be rounded.  For example:
19.817  \N  \N  3435  \N  1  \N  \N  \N  3435  0  0  3435  \N  \N  \N  0
…
25.878  \N  \N  4484.12988  80  \N  \N  \N  \N  2069.6001  0  0  2069.6001  \N  \N  \N  0
vs.
19.82   \N  \N  3435  \N  1  \N  \N  \N  3435  0  0  3435  \N  \N  \N  0
…
25.87   \N  \N  4484.13  80  \N  \N  \N  \N  2069.6  0  0  2069.6  \N  \N  \N  0


How can I ensure that the data was migrated correctly - that the data hasn't 
been truncated or rounded up in the v12.8 tables?
Any help would be greatly appreciated.

Thanks,

Karin Hilbert


Re: How to Resolve Data Being Truncated or Rounded Up During PostgreSQL Migration from v9.6.23 to v12.8?

2021-10-28 Thread Hilbert, Karin
Thanks for the explanation, Tom.  🙂

From: Tom Lane 
Sent: Thursday, October 28, 2021 11:04 AM
To: Hilbert, Karin 
Cc: pgsql-general@lists.postgresql.org 
Subject: Re: How to Resolve Data Being Truncated or Rounded Up During 
PostgreSQL Migration from v9.6.23 to v12.8?

"Hilbert, Karin"  writes:
> [ PG12 displays float values a tad differently from 9.6 ]

This is not a bug; we just changed the behavior of the
"extra_float_digits" display option, so that it's less likely
to print garbage digits.  A float4 value only has about six
decimal digits of precision to begin with, and those extra
digits you are seeing in the 9.6 dump are basically fictional.

pg_dump does "set extra_float_digits to 3", which used to be
necessary to get reproducible results when dumping from old
servers.  That has this effect on 9.6:

regression=# select '53809.6'::float4;
 float4
---------
 53809.6
(1 row)

regression=# set extra_float_digits to 3;
SET
regression=# select '53809.6'::float4;
   float4
------------
 53809.6016
(1 row)

But it has no effect on new servers, because the six digits are
already enough to recreate the float4 value exactly.  The "016"
added by the old server is basically roundoff noise.
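For instance, the two textual forms in your diff reload to the identical
float4 value - you can check that on either server:

regression=# select '53809.6'::float4 = '53809.6016'::float4;
 ?column?
----------
 t
(1 row)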

The reason for extra_float_digits is that with the old output
algorithm, there were corner cases where we had to print more than
six digits to ensure the value reloads exactly.  The new algorithm
automatically prints the minimum number of digits needed to ensure
exact reload.

All the same comments apply to float8, of course, with a
different number of digits.

regards, tom lane


How to Implement DR for a Production PostgreSQL v12.8 database cluster with repmgr & PgBouncer?

2021-11-12 Thread Hilbert, Karin
I manage PostgreSQL v12.8 database clusters.
Our database clusters are on Linux VMs, with OS:
  Flavor: redhat_7
  Release: 3.10.0-1160.45.1.el7.x86_64

We have repmgr clusters of 1 Primary & 2 Standby servers & use another server 
with PgBouncer to direct the connections to the current Primary.

The PostgreSQL servers have asynchronous replication & use repmgr to handle 
automatic failovers.
Any failovers have always promoted the 1st Standby server in the cluster.
We did have one time where the newly promoted server almost immediately went 
down & the 2nd Standby was promoted.
All three servers in the cluster currently have a priority of 100.

I've been requested to set up DR for the Production cluster.  My Google 
searches haven't found anything describing how DR should be implemented.  I 
have seen that there should always be an odd number of servers in the cluster.

My thoughts are to have a new VM created in our DR datacenter that will replace 
the 2nd Standby server in the cluster.  I'm thinking that the DR server would 
have a lower priority in the repmgr.conf file (50 instead of 100), since we 
would want the local Standby to be promoted first.  We have failover configured 
to be automatic, but it waits 80 seconds before promoting a Standby (to avoid a 
premature failover due to network flapping).
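A sketch of what the DR node's repmgr.conf might contain under that plan (the
host names, paths and reconnect numbers below are only illustrative):

node_id=3
node_name='pgdr1'
conninfo='host=pgdr1.example.edu user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/var/lib/pgsql/12/data'
failover='automatic'
priority=50
promote_command='/path/to/our-promote-script.sh'
follow_command='/usr/pgsql-12/bin/repmgr standby follow -f /etc/repmgr/12/repmgr.conf --upstream-node-id=%n'
# 8 attempts x 10 seconds approximates the 80-second wait before promoting
reconnect_attempts=8
reconnect_interval=10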

Is there any reason to change the failover configuration to manual for DR?  I 
would think in a DR situation we would want it to be automatic, but my 
colleague disagrees.  I'm thinking that if the Primary & the 1st Standby both 
go down, even if it isn't a real "DR" situation, we would still want it to 
promote to the 3rd (DR) server to prevent an outage.  Our failover script 
performs post-promote tasks, including redirecting the PgBouncer server to 
point to the new Primary, so it shouldn't matter whether the new Primary is 
the DR server or a local server.

If we do decide to make failover manual, can the configuration specify 
automatic for the local servers & manual for the DR server, or is it an all or 
nothing type of configuration?

I guess another strategy could be to configure it as a stand-alone server & 
leave our current cluster intact.  Then copy all the backup & WAL files to the 
DR server, but then in a DR situation we would have to perform the restore 
before the databases could be available.

Any advice on a good DR strategy would be appreciated.
Thanks,

Karin Hilbert