Reconstructing transaction content after the fact

2020-09-02 Thread Lionel Bouton
Hi,

I'm a long-time PostgreSQL user (since 7.x), familiar with the basics of
PostgreSQL internals, and I'm currently participating in an audit of a
voting software solution which, for obvious reasons, should avoid storing
any link between the voter register and the actual votes.

The software runs on a modern version of PostgreSQL (9.6 or later) and
uses a transaction to store a vote and the corresponding register entry
in two separate tables.

There's already an obvious angle of attack for correlating a vote and a
register entry: the WAL. That said, the WAL can be configured to be
short-lived to minimize the attack surface. By my understanding, a low
checkpoint_timeout value should allow PostgreSQL to recycle/remove old
WAL files early unless the system is overloaded. This can become a bit
messy, but if the system is tightly controlled I believe the attack
surface can be reduced.
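
For illustration only (parameter values are indicative, not a tested
recommendation), the kind of configuration meant here might look like:

   # postgresql.conf
   checkpoint_timeout = 1min   # checkpoint often so old WAL is recycled early
   max_wal_size = 1GB          # cap the WAL kept between checkpoints
   wal_keep_segments = 0       # keep no extra WAL around for standbys
   archive_mode = off          # archiving would retain the WAL elsewhere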

Another angle I could think of is the transaction ids. If I'm not
mistaken, they are stored on disk inside the files storing table and
index data (my quick read of the documentation leads me to think the
txid is stored in t_xmin in the HeapTupleHeaderData of each row).
These values might be cleaned up once the data is visible to all
currently running transactions, but I think this isn't needed and so
probably isn't done (that information is stored in the visibility map).
So by reading the raw content from disk you should be able to correlate
data in several tables written by a single transaction by comparing the
txid.

Are txids in table file data indeed a means by which you can recover the
data written by a single transaction (assuming the txids don't wrap
around the 32-bit limit during the life of the cluster)?

Are these t_xmin values ever cleaned up (by VACUUM or another mechanism)?
If so, is there a way to configure the approximate time during which
these values remain recoverable?

Is there a way to access these values by connecting to a PostgreSQL
server instead of analyzing in-memory or on-disk data?

Best regards,

-- 
Lionel Bouton
manager of JTEK SARL
https://www.linkedin.com/in/lionelbouton/





Re: Reconstructing transaction content after the fact

2020-09-02 Thread Tom Lane
Lionel Bouton  writes:
> Are txids in table file data indeed a means by which you can recover the
> data written by a single transaction (assuming the txids don't wrap
> around the 32-bit limit during the life of the cluster)?

They are.  You could reduce the connections between different rows by
writing them in different subtransactions of the parent transaction
(use SAVEPOINT/RELEASE SAVEPOINT, or exception blocks in plpgsql).
But unless there are many parallel transactions writing data, somebody
could probably still reconstruct things by assuming that nearby XIDs
represent subtransactions of a single transaction.
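
A minimal sketch of that approach (table and column names are hypothetical):
each INSERT runs in its own subtransaction, so each row ends up with a
distinct XID in its xmin.

   BEGIN;
     SAVEPOINT s1;
       INSERT INTO register (voter_id) VALUES (42);
     RELEASE SAVEPOINT s1;
     SAVEPOINT s2;
       INSERT INTO votes (ballot) VALUES ('candidate A');
     RELEASE SAVEPOINT s2;
   COMMIT;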

> Are these t_xmin values ever cleaned up (by VACUUM or another mechanism)?
> If so, is there a way to configure the approximate time during which
> these values remain recoverable?

See VACUUM FREEZE.  You couldn't hide connections immediately after
insertion, but if the idea is to sanitize every so often, it'd help.

> Is there a way to access these values by connecting to a PostgreSQL
> server instead of analyzing in-memory or on-disk data?

SELECT xmin, ... FROM votes;
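
For illustration (using the hypothetical table names from the question),
the correlation amounts to a join on the system column:

   SELECT r.*, v.*
     FROM register r
     JOIN votes v ON v.xmin = r.xmin;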

regards, tom lane




Tuchanka

2020-09-02 Thread Олег Самойлов
Hi all.

I have developed a test bed for testing highly available clusters based on
Pacemaker and PostgreSQL. The phrase "test bed" was suggested to me by a
dictionary; to a Russian speaker it sounds rather funny, so please tell me
whether it is a suitable term here. The test bed is deployed on VirtualBox
virtual machines (VMs) on a MacBook Pro. In total there will be 12 VMs
occupying 36 GiB of disk. They form 4 highly available clusters (different
variants). The clusters are created automatically and can be tested
automatically: a dedicated script imitates different faults in a loop, waits
for the cluster to recover, fixes the broken node and moves on to the next
test. The project is on GitHub under the MIT license, and I have just
finished translating the README into English.

https://github.com/domclick/tuchanka

This test bed can be used to test HA clusters. The README contains a list of
problems already detected in Pacemaker and PostgreSQL. It can also be used
for presentations, which is why it is designed to run on a single MacBook
Pro; I think showing in real time how HA clusters survive different faults
is much better than screenshots or a video.

The software is somewhat outdated: it works with PostgreSQL 11 and CentOS 7,
and the next step will be upgrading to CentOS 8 and PostgreSQL 12. Please
tell me, is it useful and worth continuing? Where would be the best place to
announce it? Perhaps there is a dedicated mailing list for such things.



Re: Reconstructing transaction content after the fact

2020-09-02 Thread Francisco Olarte
Lionel:

On Wed, Sep 2, 2020 at 5:46 PM Lionel Bouton  wrote:
...
> The software runs on a modern version of PostgreSQL (9.6 or later) and
> uses a transaction to store a vote and the corresponding register entry
> in two separate tables.
...>
> There's already an obvious angle of attack for correlating a vote and a
> register entry: the WAL. That said, the WAL can be configured to be
> short-lived to minimize the attack surface. By my understanding, a low

> Another angle I could think of is the transaction ids. If I'm not
> mistaken, they are stored on disk inside the files storing table and
> index data (my quick read of the documentation leads me to think the txid is
...
> Is there a way to access these values by connecting to a PostgreSQL
> server instead of analyzing in-memory or on-disk data?

If you assume attackers can read your disks, or even query arbitrary
tables, you are going to find plenty of potential attacks. Apart from
the xid, databases tend to lay out insertions sequentially on disk, so
if your main way of filling the tables is inserting a pair of rows, you
have a good chance of correlating them with just two "select * from
voters/entries" queries (or by dumping the raw table files if the
attacker gets file but not SQL access). In fact even paper voting can be
attacked this way: in Spain every voter is written down, in the order in
which they vote, on a sheet, and if you do not take care to shake the
ballot boxes you can get a very decent approximation of the envelope
insertion order (you would need raw access to the boxes, just as you
would need raw SQL or disk access in your case). (Not sure whether
"ballot box" is the correct term.)

For something like that I would try to ensure there is no disk access
and no raw SQL access: give the apps a single point of access to the DB,
mediate every query/operation with a stored procedure/function, and use
accounts with access to only those (even for selects), so you have tight
control and easier auditing.
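
A rough sketch of that setup (all names are illustrative): the application
role can only execute a SECURITY DEFINER function and never touches the
tables directly.

   REVOKE ALL ON votes, register FROM app_role;

   CREATE FUNCTION cast_vote(p_voter integer, p_ballot text)
   RETURNS void
   LANGUAGE sql SECURITY DEFINER
   AS $$
     INSERT INTO register (voter_id) VALUES (p_voter);
     INSERT INTO votes (ballot) VALUES (p_ballot);
   $$;

   REVOKE ALL ON FUNCTION cast_vote(integer, text) FROM PUBLIC;
   GRANT EXECUTE ON FUNCTION cast_vote(integer, text) TO app_role;

Every statement the application issues then goes through such functions,
which also gives you a single place to log and audit.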

Francisco Olarte.




Re: Reconstructing transaction content after the fact

2020-09-02 Thread Alvaro Herrera
On 2020-Sep-02, Tom Lane wrote:

> Lionel Bouton  writes:

> > Are these t_xmin values ever cleaned up (by VACUUM or another mechanism)?
> > If so, is there a way to configure the approximate time during which
> > these values remain recoverable?
> 
> See VACUUM FREEZE.  You couldn't hide connections immediately after
> insertion, but if the idea is to sanitize every so often, it'd help.

Starting with 9.4, which introduced the HEAP_XMIN_FROZEN infomask bit
combination instead of rewriting the xmin value proper, the XIDs are
preserved by freezing, so that won't help.

One option to hide the xids might be to recreate the tables every once
in a while, with something like

BEGIN;
 LOCK TABLE votes;
 DROP TABLE IF EXISTS votes_copy;
 CREATE TABLE votes_copy AS SELECT * FROM votes;
 DROP TABLE votes;
 ALTER TABLE votes_copy RENAME TO votes;
 -- recreate indexes, if any?
COMMIT;

which will make all rows have the same Xmin.  Since the voting process
involves a human act and the tables are not expected to be enormous, it
might not be totally out of the question to do this after every vote, or
in the worst case, once every minute or so.
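
A hedged sketch of how such a periodic rewrite could be packaged (the
function name and the details are illustrative, not from this message):

   CREATE OR REPLACE FUNCTION scrub_vote_xids() RETURNS void
   LANGUAGE plpgsql AS $$
   BEGIN
     LOCK TABLE votes IN ACCESS EXCLUSIVE MODE;
     DROP TABLE IF EXISTS votes_copy;
     CREATE TABLE votes_copy AS SELECT * FROM votes;
     DROP TABLE votes;
     ALTER TABLE votes_copy RENAME TO votes;
     -- CREATE TABLE AS copies neither indexes nor privileges,
     -- so recreate/re-grant them here as needed.
   END;
   $$;

A cron-driven "SELECT scrub_vote_xids();" once a minute would then match
the schedule suggested above.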

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Reconstructing transaction content after the fact

2020-09-02 Thread Tom Lane
Alvaro Herrera  writes:
> On 2020-Sep-02, Tom Lane wrote:
>> See VACUUM FREEZE.  You couldn't hide connections immediately after
>> insertion, but if the idea is to sanitize every so often, it'd help.

> Starting with 9.4, which introduced the HEAP_XMIN_FROZEN infomask bit
> combination instead of rewriting the xmin value proper, the XIDs are
> preserved by freezing, so that won't help.

Ah, right.

> One option to hide the xids might be to recreate the tables every once
> in a while, with something like

Hmm.  Expensive, but if you were willing to make it even more expensive,
you could also defeat the tuple-ordering attacks mentioned upthread:

   CREATE TABLE votes_copy AS SELECT * FROM votes ORDER BY random();


regards, tom lane




Re: SSL between Primary and Seconday PostgreSQL DBs

2020-09-02 Thread Peter Eisentraut

On 2020-08-27 12:57, Susan Joseph wrote:

So has no one done this before?


I'm sure people have done this.  But I suggest that if you ask a
question on this mailing list, you ask something more concrete, like: "I
tried to do this, got stuck here, then tried that and got this error."
People can help with that sort of thing.  What we have here is a complex
security setup and you are asking people to do an open-ended review.  No
one wants to do that.



-Original Message-
From: Susan Joseph 
To: pgsql-gene...@postgresql.org 
Sent: Mon, Aug 24, 2020 10:10 am
Subject: SSL between Primary and Seconday PostgreSQL DBs

I have set up primary and secondary PostgreSQL DBs.  They were set up
with basic replication, then I went back and modified them to use SSL.  I
am just not sure whether I did it correctly.  Everything is working, but I
want to make sure I have the settings right.  I am using PostgreSQL 11.2.


  * I have a PKI that I stood up, so I issued 2 server certificates from
my CA, one for each database.
  * Primary server certificate - Primary Database
  o The FQDN and IP address are set in the SAN field.
  o FQDN is also the CN in the DN
  o Key Usage is set to Digital Signature and Key encipherment
  o EKU is set to Server Authentication and Client Authentication
  * Rep_user certificate - Secondary Database
  o CN is set to the rep_user account name
  o Key Usage is set to digital signature and key encipherment
  o EKU is set to client authentication
  * Each certificate file contains the certificate and the subCA
certificate who issued the certificate and put in a file called
server.crt for the Primary and client.crt for the secondary.
  * The key for each certificate is stored in a separate file
unencrypted (I have questions about this later on) in a file called
server.key and client.key
  * The server.crt, server.key, and root.crt are put onto the primary
database server in the /data/pgsql/data location, the owner and
group of these files is set to postgres
  * The client.crt, client.key, and root.crt are put onto the primary
database server in the /data/pgsql/data location, the owner and
group of these files is set to postgres
  * On the Primary in postgresql.conf I set:
  o ssl=on
  o ssl_ca_file='root.crt'
  o ssl_cert_file='server.crt'
  o ssl_key_file='server.key'
  o ssl_ciphers='HIGH:MEDIUM:+3DES:!aNULL'
  * On the Primary in pg_hba.conf I add a replication line:
  o hostssl    replication    rep_user    cert

  * On the Secondary I set the following information in the
postgresql.conf to:  (DO I NEED TO DO THIS??)
  o ssl=on
  o ssl_ca_file='root.crt'
  o ssl_cert_file='client.crt'
  o ssl_key_file='client.key'
  o ssl_ciphers='HIGH:MEDIUM:+3DES:!aNULL'
  * On the Secondary I edit the recovery.conf file to the following:
  o primary_conninfo = 'user=rep_user passfile=''/data/.pgpass''
host= port=5432 sslmode=verify-ca
sslcert=client.crt sslkey=client.key sslcompression=0
target_session_attrs=any'
  * On the Secondary I edit the pg_hba.conf file and change the rep_user
line to:
  o hostssl  replication rep_user   /32  cert clientcert=1
  * On the Secondary I move the root.crt to /data/pgsql/data/.postgresql
  * Then I restart the databases


My questions are:

  * Do I need to set the information in the Secondary postgresql.conf?
Originally I did not set this and everything worked, but I saw errors
in my log files that said these needed to be set for SSL, so I went
back and set them.  Are there pgsql commands I can run to test that
my SSL is working in both directions?  (A sketch of one such check
follows this list.)
  * Are my pg_hba.conf files set correctly?  Is that how you get SSL
"turned on" for communications between the primary and the rep_user
account?
  * If I leave my key file encrypted then every time my databases have
to be started I have to enter the password.  So you can either leave
the keys unencrypted and set the permissions on the file to 0600,
accessible only by postgres, or you can enter the key password each
time the database is started up.  As someone in the security field I
have a tough time leaving the key unencrypted, but as someone setting
up a production system located on a network that you can't get to
without directly accessing the server, I feel that is enough security
to leave them unencrypted.  Thoughts?
  * Am I missing anything?  There are no videos out there that show how
to stand up a two-way SSL communication channel between the primary
and secondary, or does anyone have one that they can share?
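
For reference, a minimal sketch of such a check (assuming PostgreSQL 9.5
or later, where pg_stat_ssl is available): run this on the primary while
the standby is connected, and the ssl column should read true for the
replication connection.

   SELECT pid, ssl, version, cipher
     FROM pg_stat_ssl
     JOIN pg_stat_replication USING (pid);

On the client side, psql's \conninfo reports whether the current
connection is encrypted.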


Thanks,
   Susan






--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Trigger transaction isolation

2020-09-02 Thread Dirk Lattermann
Thank you, Tom, for this valuable information.

On Tue, 01 Sep 2020 11:02:01 -0400
Tom Lane  wrote:

> visibility rules are the same as for any other function.  So the
> answer to the OP's question depends on the transaction's isolation
> level and (for typical PLs) on whether the function is VOLATILE or
> not.
> 
> serializable -> the transaction's initial snapshot is used throughout

I suppose by writing serializable, you include repeatable read here,
too?

> 
> non serializable, volatile function -> each statement in the function
> takes a fresh snapshot

This is the needed behaviour for the validations I intend to do, then.
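
A minimal sketch of such a validation trigger (all names are hypothetical),
declared VOLATILE so that, at READ COMMITTED, each statement in it takes a
fresh snapshot:

   CREATE FUNCTION check_reservation_limit() RETURNS trigger
   LANGUAGE plpgsql VOLATILE AS $$
   BEGIN
     -- This SELECT takes its own snapshot, so it also sees rows committed
     -- by other sessions after the triggering statement started.
     IF (SELECT count(*) FROM reservations
          WHERE item_id = NEW.item_id) >= 10 THEN
       RAISE EXCEPTION 'too many reservations for item %', NEW.item_id;
     END IF;
     RETURN NEW;
   END;
   $$;

   CREATE TRIGGER reservations_limit
     BEFORE INSERT ON reservations
     FOR EACH ROW EXECUTE PROCEDURE check_reservation_limit();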

> 
> non serializable, non-volatile function -> the function uses a single
> snapshot.  For a non-deferred trigger, I think it shares the snapshot
> used by the triggering query.  Not sure about exactly when the
> snapshot is taken for a deferred trigger.

So this case seems to be trickier than it looks at first glance.
How much the snapshot timing influences the correctness of a trigger
written in this mode depends heavily on the use case, of course.

I suggest adding this to the documentation and will try to use the
commenting function there for it.

Thanks!
Dirk