Re: pg_controldata: could not read file "/opt/postgres/9.2/data//global/pg_control": Success

2018-10-08 Thread Rajni Baliyan
Hi Raghavendra,

Free up some space before restarting.
Old archives/backups/logs could be candidates for cleanup. You do not
need much space to restart the instance.
Once started, log in to the db and execute a checkpoint.
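
For example, once the instance is back up (a minimal sketch):

-- reconnect with psql, then force a checkpoint so dirty buffers are
-- flushed and old WAL segments can be recycled
CHECKPOINT;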

Thanks
Rajni

On Mon, 8 Oct 2018 at 4:40 pm Raghavendra Rao J S V <
raghavendra...@gmail.com> wrote:

> Hi All,
>
> archive_mode is unfortunately turned on in my Postgres 9.2 database.
>
> Because of that, the disk is 100% full. We are facing the problem below when
> we try to start the database.
>
> PANIC: could not read from control file: Success
>
> Please advise me on how to handle this situation.
>
>
>
> The log files contain the information below:
>
> 2018-10-08 05:27:44.262 UTC,,,27688,,5bbaead0.6c28,1,,2018-10-08 05:27:44
> UTC,,0,LOG,0,"database system was interrupted while in recovery at
> 2018-10-03 15:01:03 UTC",,"This probably means that some data is corrupted
> and you will have to use the last backup for recovery.",,,""
> 2018-10-08 05:27:44.262 UTC,,,27688,,5bbaead0.6c28,2,,2018-10-08 05:27:44
> UTC,,0,LOG,0,"database system was not properly shut down; automatic
> recovery in progress",""
> 2018-10-08 05:27:44.265 UTC,,,27688,,5bbaead0.6c28,3,,2018-10-08 05:27:44
> UTC,,0,LOG,0,"redo starts at 93/775816B0",""
> 2018-10-08 05:27:44.514 UTC,,,27688,,5bbaead0.6c28,4,,2018-10-08 05:27:44
> UTC,,0,FATAL,53100,"could not extend file ""base/77017/160045"": wrote only
> 4096 of 8192 bytes at block 278",,"Check free disk space.",,,"xlog redo
> insert(init): rel 1663/77017/160045; tid 278/1"""
> 2018-10-08 05:27:44.517 UTC,,,27686,,5bbaead0.6c26,1,,2018-10-08 05:27:44
> UTC,,0,LOG,0,"startup process (PID 27688) exited with exit code
> 1",""
> 2018-10-08 05:27:44.517 UTC,,,27686,,5bbaead0.6c26,2,,2018-10-08 05:27:44
> UTC,,0,LOG,0,"aborting startup due to startup process
> failure",""
>
> --
> Regards,
> Raghavendra Rao J S V
> Mobile- 8861161425
>
-- 
Thanks
Regards,
Rajni
0410472086


RE: Why the index is not used ?

2018-10-08 Thread ROS Didier
Hi Virendra,

Do you think that encryption outside the database is the best solution?
How do you manage the encryption key?
Can you give me some examples of this kind of solution?

Best Regards
Didier ROS

-Original Message-
From: virendra.ku...@guycarp.com [mailto:virendra.ku...@guycarp.com]
Sent: Sunday, October 07, 2018 8:41 PM
To: ROS Didier ; fola...@peoplecall.com
Cc: pavel.steh...@gmail.com; pgsql-...@lists.postgresql.org;
pgsql-performa...@lists.postgresql.org; pgsql-general@lists.postgresql.org
Subject: RE: Why the index is not used ?

You can consider encryption outside the DB, which is less of a worry for
performance; data at rest will still be encrypted.

Regards,
Virendra
-Original Message-
From: ROS Didier [mailto:didier@edf.fr]
Sent: Sunday, October 07, 2018 2:33 PM
To: fola...@peoplecall.com
Cc: pavel.steh...@gmail.com; pgsql-...@lists.postgresql.org; 
pgsql-performa...@lists.postgresql.org; pgsql-general@lists.postgresql.org
Subject: RE: Why the index is not used ?

Hi Francisco

Thank you for your remark.
You're right, but it's the only procedure I found that allows searching on
encrypted fields with good response times (using an index)!

Regarding access to the file system, our servers are in protected network
areas. Few people can connect to them.

It's not the best solution, but we have data encryption needs and good
performance needs too. I do not know how to meet them except with the
specified procedure.
If anyone has any proposals to put this in place, I'm interested.

Thanks in advance

Best Regards
Didier ROS

-Original Message-
From: fola...@peoplecall.com [mailto:fola...@peoplecall.com]
Sent: Sunday, October 07, 2018 5:58 PM
To: ROS Didier
Cc: pavel.steh...@gmail.com; pgsql-...@lists.postgresql.org;
pgsql-performa...@lists.postgresql.org; pgsql-general@lists.postgresql.org
Subject: Re: Why the index is not used ?

ROS:

On Sun, Oct 7, 2018 at 3:13 PM, ROS Didier  wrote:

> -INSERT INTO cartedecredit(username,cc)  SELECT 'individu ' || x.id, 
> pgp_sym_encrypt('test value ' || x.id, 'motdepasse','compress-algo=2, 
> cipher-algo=aes256') FROM generate_series(1,10) AS x(id);
> -CREATE INDEX idx_cartedecredit_cc02 ON 
> cartedecredit(pgp_sym_decrypt(cc, 'motdepasse','compress-algo=2, 
> cipher-algo=aes256'));

If my French is not too rusty, you are encrypting a credit card and then
storing an UNENCRYPTED copy in the index. So getting it from the server is
trivial for anyone with filesystem access.

Francisco Olarte.




Re: Why the index is not used ?

2018-10-08 Thread Jean-Paul Argudo
Dear Didier,


On Monday, October 8, 2018 at 08:32 +, ROS Didier wrote:
> Hi Virendra
> Do you think that encryption outside the database is the best solution?
> How do you manage the encryption key?
> Can you give me some examples of this kind of solution?
> Best Regards
> Didier ROS

If I understand your need well, you need to store credit card information in
your database.

This is ruled by the Payment Card Industry Data Security Standard (aka PCI DSS).

I attended a really good presentation some years ago by Denish Patel, a
well-known community member.

I saw this talk at pgconf 2015 in Moscow: https://pgconf.ru/media2015c/patel.pdf

I recommend you read it if you have not already. It shows code examples, etc.


My 2 cents...


-- 
Jean-Paul Argudo




pg9.6: no backup history file (*.backup) created on hot standby

2018-10-08 Thread magodo


Hello,

I am using PostgreSQL 9.6 and trying to do a base backup on a hot standby
with archive_mode set to always. However, I found that there is no backup
history file (*.backup) generated in the archive destination directory
after `pg_basebackup`, which is not the case when I do the same thing on
the primary.
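
For reference, the base backup was taken against the standby with something
like the following (hostname and target directory are placeholders):

pg_basebackup -h standby-host -D /backup/base -X stream -P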

Is this expected? If so, in which cases should I do the backup on the primary,
and in which cases should I do it on the standby?

Thank you in advance!

Magodo





Re: pg9.6: no backup history file (*.backup) created on hot standby

2018-10-08 Thread Michael Paquier
On Mon, Oct 08, 2018 at 03:23:47PM +0800, magodo wrote:
> Is this expected?

Yes.

> If so, in which cases should I do the backup on the primary and in which
> cases should I do it on the standby?

This depends on your use case. Sometimes you don't want the production
server (the primary) to use more CPU than necessary, so you can
offload that activity to a standby.
--
Michael




RE: Why the index is not used ?

2018-10-08 Thread ROS Didier
Hi Vlad
Sorry for the delay, but apparently the subject is of interest to many people
in the community. I received a lot of comments and answers.
I wrote my answers in the body of your message below.

Best Regards
Didier

From: greatvo...@gmail.com [mailto:greatvo...@gmail.com]
Sent: Saturday, October 06, 2018 6:51 PM
To: ROS Didier
Cc: pgsql-...@lists.postgresql.org; pgsql-performa...@lists.postgresql.org;
pgsql-general@lists.postgresql.org
Subject: Re: Why the index is not used ?

Hello Didier,

>>
(3), (5) to find the match, you decrypt the whole table; apparently this takes
quite a long time.
An index cannot help here because indexes work on exact matches of type and
value, but you compare a mapped value, not the indexed one. A functional index
would help, but as was said, it goes against the idea of encrypted storage.
<<
I tested the solution of the functional index. It works very well, but the data
is no longer encrypted, so this is not the right solution.
>>
(6) I never used pgp_sym_encrypt(), but I see that in the INSERT INTO you
supplied the additional parameter 'compress-algo=2, cipher-algo=aes256' while
in (6) you did not. Probably this is the reason.

In general, matching an indexed bytea column should use the index; you can
verify this by populating the column unencrypted and using
'test value 32'::bytea for the match.
In your case I believe pgp_sym_encrypt() is not marked as STABLE or IMMUTABLE,
which is why it is evaluated for each row (very inefficient) and cannot use
the index. From the documentation:

"Since an index scan will evaluate the comparison value only once, not once at 
each row, it is not valid to use a VOLATILE function in an index scan 
condition."
https://www.postgresql.org/docs/10/static/xfunc-volatility.html

If you cannot add STABLE/IMMUTABLE to pgp_sym_encrypt() (which apparently
should be there), you can encrypt the searched value as a separate operation
and then search the table using a basic value match.
<<
You're right about the missing parameter 'compress-algo=2,
cipher-algo=aes256'. I agree with you.
(1) I have tested your proposal:
DROP TABLE cartedecredit;
CREATE TABLE cartedecredit(card_id SERIAL PRIMARY KEY, username VARCHAR(100), 
cc bytea);
INSERT INTO cartedecredit(username,cc)  SELECT 'individu ' || x.id, 
decode('test value ' || x.id,'escape') FROM generate_series(1,10) AS x(id);

=> I inserted unencrypted data into the bytea column
postgres=# select * from cartedecredit limit 5 ;
card_id |  username   |  cc
-+-+--
   1 | individu 1  | \x746573742076616c75652031
   2 | individu 2  | \x746573742076616c75652032
   3 | individu 3  | \x746573742076616c75652033
   4 | individu 4  | \x746573742076616c75652034
   5 | individu 5  | \x746573742076616c75652035
CREATE INDEX idx_cartedecredit_cc02 ON cartedecredit(cc);
SELECT encode(cc,'escape') FROM cartedecredit WHERE cc=decode('test value 32','escape');
 QUERY PLAN

Index Only Scan using idx_cartedecredit_cc02 on cartedecredit  (cost=0.42..8.44 
rows=1 width=32) (actual time=0.033..0.034 rows=1 loops=1)
   Index Cond: (cc = '\x746573742076616c7565203332'::bytea)
   Heap Fetches: 1
Planning time: 0.130 ms
Execution time: 0.059 ms
(5 rows)

=> It works, but the data is not encrypted; everyone can have access to the data.
(2) 2nd test :
DROP TABLE cartedecredit;
CREATE TABLE cartedecredit(card_id SERIAL PRIMARY KEY, username VARCHAR(100), 
cc bytea);
INSERT INTO cartedecredit(username,cc)  SELECT 'individu ' || x.id, 
pgp_sym_encrypt('test value ' || x.id, 'motdepasse','compress-algo=2, 
cipher-algo=aes256') FROM generate_series(1,10) AS x(id);
postgres=# select * from cartedecredit limit 5 ;
card_id |  username   | 
  cc
-+-+-
---
   1 | individu 1  | 
\xc30d0409030296304d007bf50ed768d2480153cd4a4e2d240249f94b31ec168391515ea80947f97970f7a4e058bff648f752df194498dd480c3b8a5c0d2942f90c6dde21a6b9bf4e9fd7986c6f986e3783
647e7a6205b48c03
   2 | individu 2  | 
\xc30d0409030257b50bc0e6bcd8d270d248010984b60126af01ba922da27e2e78c33110f223f0210cf34da77243277305254cba374708d447fc7d653dd9e00ff9a96803a2c47ee95269534f2c24fab1c9dc
31f7909ca7adeaf0
   3 | individu 3  | 
\xc30d040903023c5f8cb688c7945275d24801a518d70c6cc2d4a31f99f3738e736c5312f78bb9c3cc187a65d0cf7f893dbc9448825d39b79df5d0460508fc93336c2bec7794893bb08a290afd649ae15fe2
2b0433eff89222f7
   4 | individu 4  | 
\xc30d04090302dcc3bb49a41b297578d2480167f17b09004e7dacc0891fc0cc7276dd551273eec72644520f8d0543abe8e795af7c1b84fc8e5b4adc

RE: Why the index is not used ?

2018-10-08 Thread ROS Didier
Hi Phil

Thank you for this recommendation, but I posted on this public list
only generic examples that have nothing to do with the work done in my company.
These examples serve only to support a discussion about data
encryption and performance.
My answers to your remarks:

>>
Why do you need to search by credit card number?
<<
Again, this is just an example. I just want to find a solution for querying a
column containing encrypted data with good performance.

>>
one option is to use an encryption function that doesn't salt the data
<<
I am interested. Can you give some examples of encryption functions that
don't salt the data?

Best Regards
Didier ROS
-Original Message-
From: phil_tnlcz_endec...@chezphil.org [mailto:phil_tnlcz_endec...@chezphil.org]
Sent: Sunday, October 07, 2018 9:17 PM
To: ROS Didier ; pgsql-general@lists.postgresql.org
Subject: RE: Why the index is not used ?

Hello Didier,

Your email is didier@edf.fr.  Are you working at Electricite de France, and 
storing actual customers' credit card details?  How many millions of them?

Note that this mailing list is public; people looking for targets with poor 
security from which they can harvest credit card numbers might be reading it.
And after you are hacked and all your customers' credit card details are made 
public, someone will find this thread.

> It's not the best solution, but we have data encryption needs and good
> performance needs too. I do not know how to meet them except with the
> specified procedure.

You should probably employ someone who knows what they are doing.

Sorry for being so direct, but really... storing large quantities of credit
card details is the textbook example of something that has to be done
correctly.

> if anyone has any proposals to put this in place, I'm interested.

Why do you need to search by credit card number?

If you really really need to do that, then one option is to use an encryption 
function that doesn't salt the data.  Or you could store part of the number 
(last 4 digits?), or an unsalted hash of the number, unencrypted and indexed, 
and then you need only to sequentially decrypt (using the salted encryption) 
e.g. 1/1 of the card numbers.  But there are complex security issues and 
tradeoffs involved here.  You probably need to comply with regulations (e.g. 
"PCI standards") which will specify what is allowed and what isn't. And if you 
didn't already know that, you shouldn't be doing this.
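
A minimal sketch of the partial-data variant, reusing the table and passphrase
from the examples earlier in this thread (the cc_last4 column name and the
queried value are illustrative):

ALTER TABLE cartedecredit ADD COLUMN cc_last4 text;
UPDATE cartedecredit
   SET cc_last4 = right(pgp_sym_decrypt(cc, 'motdepasse'), 4);
CREATE INDEX idx_cartedecredit_last4 ON cartedecredit(cc_last4);

-- the indexed hint column narrows the scan; only the small matching
-- bucket is decrypted and compared
SELECT *
  FROM cartedecredit
 WHERE cc_last4 = right('test value 32', 4)
   AND pgp_sym_decrypt(cc, 'motdepasse') = 'test value 32';

Note that the hint column itself leaks the last four characters, which is part
of the tradeoff discussed above.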


Good luck, I suppose.

Phil.

P.S. It seems that you were asking about this a year ago, and got the same 
answers...









RE: Why the index is not used ?

2018-10-08 Thread ROS Didier
Hi Vlad
   OK, I take into account your remark about the need to do
searches on encrypted data.
My answers to your remarks :
>>
you can use a deterministic algorithm for it (without salt)
<<
Can you give me one of these deterministic algorithms (without salt)?

Best Regards

Didier
From: greatvo...@gmail.com [mailto:greatvo...@gmail.com]
Sent: Sunday, October 07, 2018 9:33 PM
To: ROS Didier
Cc: fola...@peoplecall.com; pavel.steh...@gmail.com;
pgsql-...@lists.postgresql.org; pgsql-performa...@lists.postgresql.org;
pgsql-general@lists.postgresql.org
Subject: Re: Why the index is not used ?

Additionally, it is not clear why you want to search the table on encrypted
data. Usually you match a user on their unpersonalized data (such as login or
user ID) and then decrypt the personalized data. If you need to store
user-identifying data encrypted as well (e.g. a bank account number), you can
use a deterministic algorithm for it (without salt), because it is guaranteed
to be unique and you don't need different encrypted data for two identical
input strings.

Vlad





RE: Why the index is not used ?

2018-10-08 Thread Phil Endecott

ROS Didier wrote:
> Can you give some examples of these encryption functions
> that don't salt the data?


encrypt(data, 'motdepass', 'aes')
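
For example, a sketch of how this unsalted primitive can be combined with a
plain index (table name, key, and row count are illustrative, in the style of
the thread's earlier examples). Because encrypt() is deterministic, equal
plaintexts yield equal ciphertexts, so an exact match can use the index
without decrypting anything:

CREATE EXTENSION IF NOT EXISTS pgcrypto;

CREATE TABLE cartedecredit2(card_id serial PRIMARY KEY,
                            username varchar(100),
                            cc bytea);

INSERT INTO cartedecredit2(username, cc)
SELECT 'individu ' || x.id,
       encrypt(convert_to('test value ' || x.id, 'UTF8'), 'motdepasse', 'aes')
FROM generate_series(1, 10000) AS x(id);

CREATE INDEX idx_cartedecredit2_cc ON cartedecredit2(cc);

-- encrypt the searched value once; the comparison is then a plain bytea
-- equality that the btree index can serve
SELECT username
FROM cartedecredit2
WHERE cc = encrypt(convert_to('test value 32', 'UTF8'), 'motdepasse', 'aes');

The tradeoff, as noted elsewhere in the thread, is that deterministic
encryption reveals which rows share the same plaintext.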


Regards, Phil.







RE: Why the index is not used ?

2018-10-08 Thread ROS Didier
Hi Tomas

Thank you for your answer and recommendation, which is very interesting.
I'm going to study the PCI DSS document right now.
-   Here is my answer to your question:
>>
What is your threat model?
<<
We want to prevent access to sensitive data for everyone except those who have
the encryption key.
In case of file theft, backup theft, or dump theft, we do not want anyone to
access sensitive data.

-   I have tested the solution you proposed, and it works great.

Best Regards

Didier ROS
-Original Message-
From: tomas.von...@2ndquadrant.com [mailto:tomas.von...@2ndquadrant.com]
Sent: Sunday, October 07, 2018 10:08 PM
To: ROS Didier ; fola...@peoplecall.com
Cc: pavel.steh...@gmail.com; pgsql-...@lists.postgresql.org;
pgsql-performa...@lists.postgresql.org; pgsql-general@lists.postgresql.org
Subject: Re: Why the index is not used ?
Objet : Re: Why the index is not used ?

Hi,

On 10/07/2018 08:32 PM, ROS Didier wrote:
> Hi Francisco
>
>   Thank you for your remark.
> You're right, but it's the only procedure I found that allows searching on
> encrypted fields with good response times (using an index)!
>

Unfortunately, that kinda invalidates the whole purpose of in-database 
encryption - you'll have encrypted on-disk data in one place, and then 
plaintext right next to it. If you're dealing with credit card numbers, then 
you presumably care about PCI DSS, and this is likely a direct violation of 
that.

> Regarding access to the file system, our servers are in protected
> network areas. Few people can connect to them.
>

Then why do you need encryption at all? If you assume access to the filesystem 
/ storage is protected, why do you bother with encryption?
What is your threat model?

> It's not the best solution, but we have data encryption needs and good
> performance needs too. I do not know how to meet them except with the
> specified procedure.
>
> if anyone has any proposals to put this in place, I'm interested.
>

One thing you could do is hashing the value and then searching by the hash. So 
aside from having the encrypted column you'll also have a short hash, and you 
may use it in the query *together* with the original condition. It does not 
need to be unique (in fact it should not be to make it impossible to reverse 
the hash), but it needs to have enough distinct values to make the index 
efficient. Say, 10k values should be enough, because that means 0.01% 
selectivity.

So the function might look like this, for example:

  CREATE FUNCTION cchash(text) RETURNS int AS $$
    SELECT abs(hashtext($1)) % 10000;
  $$ LANGUAGE sql;

and then be used like this:

  CREATE INDEX idx_cartedecredit_cc02 ON cartedecredit(cchash(cc));

and in the query

  SELECT pgp_sym_decrypt(cc, 'motdepasse') FROM cartedecredit
   WHERE pgp_sym_decrypt(cc, 'motdepasse')='test value 32'
 AND cchash(cc) = cchash('test value 32');

Obviously, this does not really solve the issues with having to pass the 
password to the query, making it visible in pg_stat_activity, various logs etc.

Which is why people generally use FDE for the whole disk, which is transparent 
and provides the same level of protection.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services






RE: Why the index is not used ?

2018-10-08 Thread ROS Didier
Hi Paul

   Thank you very much for your feedback, which is very informative.
   I understand that, concerning the encryption of credit card
numbers, it is imperative to respect the PCI DSS document. I am going to study
it.
   However, I would like to say that I chose my example badly by
using a table storing credit card numbers. In fact, my problem is more generic.
I want to implement a solution that encrypts “sensitive” data and can retrieve
it with good performance (by using an index).
I find the solution you propose very interesting, and I am going to test
it.

Best Regards
Didier ROS

From: p...@paulmcgarry.com [mailto:p...@paulmcgarry.com]
Sent: Monday, October 08, 2018 12:11 AM
To: ROS Didier
Cc: fola...@peoplecall.com; pavel.steh...@gmail.com;
pgsql-...@lists.postgresql.org; pgsql-performa...@lists.postgresql.org;
pgsql-general@lists.postgresql.org
Subject: Re: Why the index is not used ?

Hi Didier,

I’m sorry to tell you that you are probably doing something (i.e.
handling/storing credit cards) which means you have to comply with PCI DSS
requirements.

As such you should probably have a QSA (auditor) who you can run any proposed 
solution by (so you know they will be comfortable with it when they do their 
audit).

I think your current solution would be frowned upon because:
- cards are effectively stored in plaintext in the index.
- your encryption/decryption is being done in the database, rather than by
something with that as its sole role.

People have already mentioned the former, so I won’t go into it further.

But for the second part, if someone can do a

Select pgp_sym_decrypt(cc)

then you are one SQL injection away from having your card data stolen. You do
have encryption, but everything is available unencrypted in practice, so the
encryption is more of a tick in a box than an actual defence against bad
things happening. In a properly segmented system even your DBA should not be
able to access decrypted card data.

You probably should look into doing something like:

- store the first 6 and last 4 digits of the card unencrypted.
- store the remaining card digits encrypted
- have the encryption/decryption done by a separate service called by your
application code outside the db.

You haven’t gone into what your requirements around search are (or I missed
them), but while the above won’t give you a fast exact cc lookup, in practice
being able to search using the first 6 and last 4 can get you a small enough
subset that can then be filtered after decrypting the middle.

We are straying a little off PostgreSQL topic here, but if you and/or your
management aren’t already looking at PCI DSS compliance I’d strongly recommend
you do so. It can seem like a pain, but it is much better to take that pain up
front rather than having to reengineer everything later. There are important
security aspects it helps make sure you cover, and maybe some business aspects
too (i.e. possible partners who won’t be able to deal with you without your
compliance sign-off documentation).


The alternative, if storing cc data isn’t a core requirement, is to not store 
the credit card data at all. That is generally the best solution if it meets 
your needs, ie if you just want to accept payments then use a third party who 
is PCI compliant to handle the cc part.

I hope that helps a little.

Paul



Sent from my iPhone

On 8 Oct 2018, at 05:32, ROS Didier <didier@edf.fr> wrote:
Hi Francisco

   Thank you for your remark.
   You're right, but it's the only procedure I found that allows searching on
encrypted fields with good response times (using an index)!

   Regarding access to the file system, our servers are in protected network
areas. Few people can connect to them.

   It's not the best solution, but we have data encryption needs and good
performance needs too. I do not know how to meet them except with the
specified procedure.
   If anyone has any proposals to put this in place, I'm interested.

   Thanks in advance

Best Regards
Didier ROS

-Original Message-
From: fola...@peoplecall.com [mailto:fola...@peoplecall.com]
Sent: Sunday, October 07, 2018 5:58 PM
To: ROS Didier <didier@edf.fr>
Cc: pavel.steh...@gmail.com; pgsql-...@lists.postgresql.org;
pgsql-performa...@lists.postgresql.org; pgsql-general@lists.postgresql.org
Subject: Re: Why the index is not used ?

ROS:

On Sun, Oct 7, 2018 at 3:13 PM, ROS Didier <didier@edf.fr> wrote:


-INSERT INTO cartedecredit(username,cc)  SELECT 'individu ' || 
x.id, pgp_sym_encrypt('test value ' || x.id, 
'motdepasse','compress-algo=2, cipher-algo=aes256') FROM 
generate_series(1,10) AS x(id);
-CREATE INDEX idx_cartedecredit_cc02 ON 
cartedecredit(pgp

RE: Why the index is not used ?

2018-10-08 Thread ROS Didier
Dear Jean-Paul

Thank you very much for this link, which is actually very interesting. I
am going to study it carefully.
But my problem is more generic:
How do you set up encryption of sensitive data and keep good
performance (using an index, for example)?
Apparently it is not as obvious as that.

Best Regards

Didier ROS
-Message d'origine-
De : j...@argudo.org [mailto:j...@argudo.org] 
Envoyé : lundi 8 octobre 2018 10:44
À : pgsql-general@lists.postgresql.org
Objet : Re: Why the index is not used ?

Dear Didier,


On Monday, October 8, 2018 at 08:32 +, ROS Didier wrote:
> Hi Virendra
> Do you think that encryption outside the database is the best solution?
> How do you manage the encryption key?
> Can you give me some examples of this kind of solution?
> Best Regards
> Didier ROS

If I understand your need well, you need to store credit card information in
your database.

This is ruled by the Payment Card Industry Data Security Standard (aka PCI DSS).

I attended a really good presentation some years ago by Denish Patel, a
well-known community member.

I saw this talk at pgconf 2015 in Moscow: https://pgconf.ru/media2015c/patel.pdf

I recommend you read it if you have not already. It shows code examples, etc.


My 2 cents...


-- 
Jean-Paul Argudo









Re: FTS trigger works 1 at a time, but fails with bulk insert script

2018-10-08 Thread Malik Rumi
1. This code is entry_search_vector_trigger(), one of 3 trigger functions
based on the Django model that created the site.
2. So this is the trigger definition (as far as I know) and it is on the
Entry table. There is also a Tag table and the Tags intersection table.
3. Uhh, I'm not sure. I assume this is it: when a new entry is posted,
the function that parses the entry into searchable text and indexes the
words is called. But I can tell you I got this code from this blog post:
blog.lotech.org/postgres-full-text-search-with-django.html. I asked the
author about this issue. He said he wasn't sure without debugging if it was
something he left out or something I did wrong.
4. PostgreSQL 9.4. Yeah, I know, I should upgrade...

“None of you has faith until he loves for his brother or his neighbor what
he loves for himself.”


On Tue, Oct 2, 2018 at 6:27 PM Adrian Klaver 
wrote:

> On 10/2/18 10:34 AM, Malik Rumi wrote:
> > I have set up Postgres FTS on a Django/Python web site, and it works as
> > expected except for this one thing. When I wrote a script to bulk insert
> > legacy docs, the script works fine but the FTS trigger does not fire. I
> > have to go back and open each document one at a time to get them indexed.
> >
> > What am I missing to make this work? Thanks.
>
> Have no idea as there is not enough information.
>
> To begin with:
>
> 1) What is code below?
>
> 2) What is the trigger definition and on what table?
>
> 3) What is the function the trigger is calling?
>
> 4) For good measure what version of Postgres?
>
> >
> >  BEGIN
> >SELECT setweight(to_tsvector(NEW.title), 'A') ||
> >   setweight(to_tsvector(NEW.content), 'B') ||
> >   setweight(to_tsvector(NEW.category), 'D') ||
> >   setweight(to_tsvector(COALESCE(string_agg(tag.tag, ', '), '')), 'C')
> >INTO NEW.search_vector
> >FROM ktab_entry AS entry
> >  LEFT JOIN ktab_entry_tags AS entry_tags ON entry_tags.entry_id = entry.id
> >  LEFT JOIN ktab_tag AS tag ON tag.id = entry_tags.tag_id
> >WHERE entry.id = NEW.id
> >GROUP BY entry.id, category;
> >RETURN NEW;
> >  END;
> >
> >
> > “None of you has faith until he loves for his brother or his neighbor
> > what he loves for himself.”
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: FTS trigger works 1 at a time, but fails with bulk insert script

2018-10-08 Thread Adrian Klaver

On 10/8/18 12:29 PM, Malik Rumi wrote:
1. This code is entry_search_vector_trigger(), one of 3 trigger 
functions based on the Django model that created the site.
2. So this is the trigger definition (as far as I know) and it is on the 
Entry table. There is also a Tag table and the Tags intersection table.
3. Uhh, I'm not sure. I assume this is it: when a new entry is
posted, the function that parses the entry into searchable text and
indexes the words is called. But I can tell you I got this code from
this blog post:
blog.lotech.org/postgres-full-text-search-with-django.html. I
asked the author about this issue. He said he wasn't sure without
debugging if it was something he left out or something I did wrong.

4. PostgreSQL 9.4. Yeah, I know, I should upgrade...


Your function name does not match up with the code on the site, so we 
will need to see the actual trigger/function.


In psql do:

\d entry

to see the trigger definition and then post it here.

Also from that definition you can get the function name.

Again in psql do:

\ef fnc_name

to confirm the function is the one you think it is.

Would also be helpful to see the script you wrote to do the bulk insert.



“None of you has faith until he loves for his brother or his neighbor
what he loves for himself.”





--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Why the index is not used ?

2018-10-08 Thread Tomas Vondra
Hi,

On 10/08/2018 04:10 PM, ROS Didier wrote:
> Hi Tomas
>
>     Thank you for your answer and recommendation, which is very
> interesting. I'm going to study the PCI DSS document right now.
>
>   * Here is my answer to your question:
>
> >>
> What is your threat model?
> <<
> We want to prevent access to sensitive data for everyone except those
> who have the encryption key.
> In case of file theft, backup theft, or dump theft, we do not want
> anyone to access sensitive data.
>

The thing is - encryption is not a panacea. The interesting question is
whether this improves security compared to simply using FDE and regular
access rights (which are grantable at the column level).

Using those two pieces properly may very well be a better defense than a
poorly designed encryption scheme - and based on this discussion, the
proposed scheme does not seem very polished / resilient.


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: FTS trigger works 1 at a time, but fails with bulk insert script

2018-10-08 Thread Malik Rumi
I hope this comes out readable. If not, I can do a separate attachment. I
notice it says 'BEFORE INSERT'. Maybe that should be AFTER?

                                      Table "public.ktab_entry"
    Column     |           Type           | Collation | Nullable |                Default
---------------+--------------------------+-----------+----------+----------------------------------------
 id            | integer                  |           | not null | nextval('ktab_entry_id_seq'::regclass)
 title         | character varying(100)   |           | not null |
 slug          | character varying(100)   |           | not null |
 content       | text                     |           | not null |
 posted_date   | timestamp with time zone |           | not null |
 chron_date    | date                     |           | not null |
 clock         | time without time zone   |           | not null |
 category      | character varying(25)    |           | not null |
 search_vector | tsvector                 |           |          |
 image1        | character varying(100)   |           |          |
 image2        | character varying(100)   |           |          |
 image3        | character varying(100)   |           |          |
Indexes:
    "ktab_entry_pkey" PRIMARY KEY, btree (id)
    "ktab_entry_slug_e1313695_uniq" UNIQUE CONSTRAINT, btree (slug)
    "ktab_entry_title_6950e951_uniq" UNIQUE CONSTRAINT, btree (title)
    "ktab_entry_search__d5071f_gin" gin (search_vector)
    "ktab_entry_slug_e1313695_like" btree (slug varchar_pattern_ops)
    "ktab_entry_title_6950e951_like" btree (title varchar_pattern_ops)
Referenced by:
    TABLE "ktab_entry_tags" CONSTRAINT "ktab_entry_tags_entry_id_294f83f9_fk_ktab_entry_id" FOREIGN KEY (entry_id) REFERENCES ktab_entry(id) DEFERRABLE INITIALLY DEFERRED
Triggers:
    search_vector_update BEFORE INSERT OR UPDATE ON ktab_entry FOR EACH ROW EXECUTE PROCEDURE entry_search_vector_trigger()



CREATE OR REPLACE FUNCTION public.entry_search_vector_trigger()
 RETURNS trigger
 LANGUAGE plpgsql
AS $function$BEGIN
  SELECT setweight(to_tsvector(NEW.title), 'A') ||
         setweight(to_tsvector(NEW.content), 'B') ||
         setweight(to_tsvector(NEW.category), 'D') ||
         setweight(to_tsvector(COALESCE(string_agg(tag.tag, ', '), '')), 'C')
  INTO NEW.search_vector
  FROM ktab_entry AS entry
    LEFT JOIN ktab_entry_tags AS entry_tags ON entry_tags.entry_id = entry.id
    LEFT JOIN ktab_tag AS tag ON tag.id = entry_tags.tag_id
  WHERE entry.id = NEW.id
  GROUP BY entry.id, category;
  RETURN NEW;
END;
$function$

“None of you has faith until he loves for his brother or his neighbor what
he loves for himself.”


On Mon, Oct 8, 2018 at 2:57 PM Adrian Klaver 
wrote:

> On 10/8/18 12:29 PM, Malik Rumi wrote:
> > 1. This code is entry_search_vector_trigger(), one of 3 trigger
> > functions based on the Django model that created the site.
> > 2. So this is the trigger definition (as far as I know) and it is on the
> > Entry table. There is also a Tag table and the Tags intersection table.
> > 3. Uhh, I'm not sure. I assume this is it, that when a new entry is
> > posted, the function that parses the entry into searchable text and
> > indexes the words is called. But I can tell you I got this code from
> > this blog post:
> > blog.lotech.org/postgres-full-text-search-with-django.html. I
> > asked the author about this issue. He said he wasn't sure without
> > debugging if it was something he left out or something I did wrong.
> > 4. PostgreSQL 9.4. Yeah, I know, I should upgrade...
>
> Your function name does not match up with the code on the site, so we
> will need to see the actual trigger/function.
>
> In psql do:
>
> \d entry
>
> to see the trigger definition and then post it here.
>
> Also from that definition you can get the function name.
>
> Again in psql do:
>
> \ef fnc_name
>
> to confirm the function is the one you think it is.
>
> Would also be helpful to see the script you wrote to do the bulk insert.
>
> >
> > */“None of you has faith until he loves for his brother or his neighbor
> > what he loves for himself.”/*
> >
> >
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: FTS trigger works 1 at a time, but fails with bulk insert script

2018-10-08 Thread Adrian Klaver

On 10/8/18 1:25 PM, Malik Rumi wrote:
I hope this comes out readable. If not I can do a separate attachment. I 
notice it says 'BEFORE INSERT'. Maybe that should be after?


No, as the return value would be ignored:

https://www.postgresql.org/docs/10/static/plpgsql-trigger.html

"The return value of a row-level trigger fired AFTER or a 
statement-level trigger fired BEFORE or AFTER is always ignored; it 
might as well be null. However, any of these types of triggers might 
still abort the entire operation by raising an error."


So what is the script you used to do the bulk INSERT?




                                      Table "public.ktab_entry"
    Column     |           Type           | Collation | Nullable |                Default
---------------+--------------------------+-----------+----------+----------------------------------------
 id            | integer                  |           | not null | nextval('ktab_entry_id_seq'::regclass)
 title         | character varying(100)   |           | not null |
 slug          | character varying(100)   |           | not null |
 content       | text                     |           | not null |
 posted_date   | timestamp with time zone |           | not null |
 chron_date    | date                     |           | not null |
 clock         | time without time zone   |           | not null |
 category      | character varying(25)    |           | not null |
 search_vector | tsvector                 |           |          |
 image1        | character varying(100)   |           |          |
 image2        | character varying(100)   |           |          |
 image3        | character varying(100)   |           |          |
Indexes:
    "ktab_entry_pkey" PRIMARY KEY, btree (id)
    "ktab_entry_slug_e1313695_uniq" UNIQUE CONSTRAINT, btree (slug)
    "ktab_entry_title_6950e951_uniq" UNIQUE CONSTRAINT, btree (title)
    "ktab_entry_search__d5071f_gin" gin (search_vector)
    "ktab_entry_slug_e1313695_like" btree (slug varchar_pattern_ops)
    "ktab_entry_title_6950e951_like" btree (title varchar_pattern_ops)
Referenced by:
    TABLE "ktab_entry_tags" CONSTRAINT "ktab_entry_tags_entry_id_294f83f9_fk_ktab_entry_id" FOREIGN KEY (entry_id) REFERENCES ktab_entry(id) DEFERRABLE INITIALLY DEFERRED
Triggers:
    search_vector_update BEFORE INSERT OR UPDATE ON ktab_entry FOR EACH ROW EXECUTE PROCEDURE entry_search_vector_trigger()





CREATE OR REPLACE FUNCTION public.entry_search_vector_trigger()
  RETURNS trigger
  LANGUAGE plpgsql
AS $function$BEGIN
  SELECT setweight(to_tsvector(NEW.title), 'A') ||
         setweight(to_tsvector(NEW.content), 'B') ||
         setweight(to_tsvector(NEW.category), 'D') ||
         setweight(to_tsvector(COALESCE(string_agg(tag.tag, ', '), '')), 'C')
  INTO NEW.search_vector
  FROM ktab_entry AS entry
    LEFT JOIN ktab_entry_tags AS entry_tags ON entry_tags.entry_id = entry.id
    LEFT JOIN ktab_tag AS tag ON tag.id = entry_tags.tag_id
  WHERE entry.id = NEW.id
  GROUP BY entry.id, category;
  RETURN NEW;
END;
$function$

“None of you has faith until he loves for his brother or his neighbor
what he loves for himself.”



On Mon, Oct 8, 2018 at 2:57 PM Adrian Klaver > wrote:


On 10/8/18 12:29 PM, Malik Rumi wrote:
 > 1. This code is entry_search_vector_trigger(), one of 3 trigger
 > functions based on the Django model that created the site.
 > 2. So this is the trigger definition (as far as I know) and it is on the
 > Entry table. There is also a Tag table and the Tags intersection table.
 > 3. Uhh, I'm not sure. I assume this is it: when a new entry is
 > posted, the function that parses the entry into searchable text and
 > indexes the words is called. But I can tell you I got this code from
 > this blog post:
 > blog.lotech.org/postgres-full-text-search-with-django.html. I
 > asked the author about this issue. He said he wasn't sure without
 > debugging if it was something he left out or something I did wrong.
 > 4. PostgreSQL 9.4. Yeah, I know, I should upgrade...

Your function name does not match up with the code on the site, so we
will need to see the actual trigger/function.

In psql do:

\d 

Re: FTS trigger works 1 at a time, but fails with bulk insert script

2018-10-08 Thread Malik Rumi
So what is the script you used to do the bulk INSERT?

There's actually three, but they are all basically the same. The
differences have to do with the source material being inserted:

#!/usr/local/bin/python3.6
# coding: utf-8

from os import environ
environ['DJANGO_SETTINGS_MODULE'] = 'chronicle.settings'
import django
django.setup()
from ktab.models import Entry, Tag
from django.utils.text import slugify
import csv


filename = '/home/malikarumi/Projects/Progress_Logs/edit_blogpost.csv'

with open(filename, 'rt') as text:
    reader = csv.DictReader(text, delimiter=',')
    # next(reader, None)
    for row in reader:
        my_entry = Entry.objects.create(
            title=row['title'], slug=row['slug'], chron_date=row['created'],
            clock=row['clock'], content=row['content'])
        my_entry.tags.add(row['tag'])
“None of you has faith until he loves for his brother or his neighbor what
he loves for himself.”


On Mon, Oct 8, 2018 at 3:32 PM Adrian Klaver 
wrote:

> On 10/8/18 1:25 PM, Malik Rumi wrote:
> > I hope this comes out readable. If not I can do a separate attachment. I
> > notice it says 'BEFORE INSERT'. Maybe that should be after?
>
> No as the return value would be ignored:
>
> https://www.postgresql.org/docs/10/static/plpgsql-trigger.html
>
> "The return value of a row-level trigger fired AFTER or a
> statement-level trigger fired BEFORE or AFTER is always ignored; it
> might as well be null. However, any of these types of triggers might
> still abort the entire operation by raising an error."
>
> So what is the script you used to do the bulk INSERT?
>
>
> >
> >                                       Table "public.ktab_entry"
> >     Column     |           Type           | Collation | Nullable |                Default
> > ---------------+--------------------------+-----------+----------+----------------------------------------
> >  id            | integer                  |           | not null | nextval('ktab_entry_id_seq'::regclass)
> >  title         | character varying(100)   |           | not null |
> >  slug          | character varying(100)   |           | not null |
> >  content       | text                     |           | not null |
> >  posted_date   | timestamp with time zone |           | not null |
> >  chron_date    | date                     |           | not null |
> >  clock         | time without time zone   |           | not null |
> >  category      | character varying(25)    |           | not null |
> >  search_vector | tsvector                 |           |          |
> >  image1        | character varying(100)   |           |          |
> >  image2        | character varying(100)   |           |          |
> >  image3        | character varying(100)   |           |          |
> > Indexes:
> >     "ktab_entry_pkey" PRIMARY KEY, btree (id)
> >     "ktab_entry_slug_e1313695_uniq" UNIQUE CONSTRAINT, btree (slug)
> >     "ktab_entry_title_6950e951_uniq" UNIQUE CONSTRAINT, btree (title)
> >     "ktab_entry_search__d5071f_gin" gin (search_vector)
> >     "ktab_entry_slug_e1313695_like" btree (slug varchar_pattern_ops)
> >     "ktab_entry_title_6950e951_like" btree (title varchar_pattern_ops)
> > Referenced by:
> >     TABLE "ktab_entry_tags" CONSTRAINT "ktab_entry_tags_entry_id_294f83f9_fk_ktab_entry_id" FOREIGN KEY (entry_id) REFERENCES ktab_entry(id) DEFERRABLE INITIALLY DEFERRED
> > Triggers:
> >     search_vector_update BEFORE INSERT OR UPDATE ON ktab_entry FOR EACH ROW EXECUTE PROCEDURE entry_search_vector_trigger()
> >
> > CREATE OR REPLACE FUNCTION public.entry_search_vector_trigger()
> >   RETURNS trigger
> >   LANGUAGE plpgsql
> > AS $function$BEGIN
> >   SELECT setweight(to_tsvector(NEW.title), 'A') ||
> >          setweight(to_tsvector(NEW.content), 'B') ||
> >          setweight(to_tsvector(NEW.category), 'D') ||
> >          setweight(to_tsvector(COALESCE(string_agg(tag.tag, ', '), '')), 'C')
> >   INTO NEW.search_vector
> >   FROM ktab_entry AS entry
> >     LEFT JOIN ktab_entry_tags AS entry_tags ON entry_tags.entry_id = entry.id
> >     LEFT JOIN ktab_tag AS tag ON tag.id = entry_tags.tag_id
> >   WHERE entry.id = NEW.id
> >   GROUP BY entry.id, category;
> >   RETURN NEW;
> > END;
> > $function$
> > 
> > */“None of you has faith until he loves for his brother or his neighbor
> > what he loves for him

Re: Why the index is not used ?

2018-10-08 Thread Paul McGarry
Hi Didier,

Yes, credit cards are a very specific space that probably gets people who are 
familiar with it going a bit. By the time you factor in general security 
practices, specific PCI requirements, your threat model and likely business 
requirements (needing relatively free access to parts of the card number) the 
acceptable solution space narrows considerably.

More generally though I’d recommend reading:

https://paragonie.com/blog/2017/05/building-searchable-encrypted-databases-with-php-and-sql

as (even if you aren’t using PHP) it discusses several strategies and what 
makes them good/bad for different use cases and how to implement them well.

I don’t think I’d consider the main solution discussed there particularly 
applicable to credit card data (mostly because the low entropy of card data 
makes it difficult to handle safely without additional per-row randomness 
added, though as always, consult your QSA) but it is generally interesting.

Paul

Sent from my iPhone

> On 9 Oct 2018, at 01:29, ROS Didier  wrote:
> 
> Hi Paul
>  
> Thank you very much for your feedback, which is very 
> informative.
> I understand that, concerning the encryption of credit card 
> numbers, it is imperative to respect the PCI DSS document. I am going to 
> study it.
> However, I would like to say that I chose my example badly by 
> using a table storing credit card numbers. In fact, my problem is more 
> generic.
> I want to implement a solution that encrypts “sensitive” data and can 
> retrieve it with good performance (by using an index).
> I find the solution you propose very interesting and I am going to 
> test it.
>  
> Best Regards
> Didier ROS
>  
> From: p...@paulmcgarry.com [mailto:p...@paulmcgarry.com] 
> Sent: Monday, October 08, 2018 12:11 AM
> To: ROS Didier 
> Cc: fola...@peoplecall.com; pavel.steh...@gmail.com; 
> pgsql-...@lists.postgresql.org; pgsql-performa...@lists.postgresql.org; 
> pgsql-general@lists.postgresql.org
> Subject: Re: Why the index is not used ?
>  
> Hi Didier,
>  
> I’m sorry to tell you that you are probably doing something (ie 
> handling/storing credit cards) which would mean you have to comply with PCI 
> DSS requirements.
>  
> As such you should probably have a QSA (auditor) who you can run any proposed 
> solution by (so you know they will be comfortable with it when they do their 
> audit).
>  
> I think your current solution would be frowned upon because:
> - cards are effectively stored in plaintext in the index.
> - your encryption/decryption is being done in database, rather than by 
> something with that as its sole role.
>  
> People have already mentioned the former so I won’t go into it further
>  
> But for the second part if someone can do a 
>  
> Select pgp_sym_decrypt(cc)
>  
> then you are one SQL injection away from having your card data stolen. You do 
> have encryption, but everything is available unencrypted in practice, so the 
> encryption is more of a tick in a box than an actual defence against bad 
> things happening. In a properly segmented system even your DBA 
> should not be able to access decrypted card data.
>  
> You probably should look into doing something like:
>  
> - store the first 6 and last 4 digits of the card unencrypted.
> - store the remaining card digits encrypted
> - have the encryption/decryption done by a separate service called by your 
> application code outside the db.
>  
> You haven’t gone into what your requirements re search are (or I missed them) 
> but while the above won’t give you a fast exact cc lookup in practice being 
> able to search using the first 6 and last 4 can get you a small enough subset 
> that can then be filtered after decrypting the middle. 
>  
> We are straying a little off PostgreSQL topic here but if you and/or your 
> management aren’t already looking at PCI DSS compliance I’d strongly 
> recommend you do so. It can seem like a pain but it is much better to take 
> that pain up front rather than having to reengineer everything later. There 
> are important security aspects it helps make sure you cover but maybe some 
> business aspects (ie possible partners who won’t be able to deal with you 
> without your compliance sign off documentation).
>  
>  
> The alternative, if storing cc data isn’t a core requirement, is to not store 
> the credit card data at all. That is generally the best solution if it meets 
> your needs, ie if you just want to accept payments then use a third party who 
> is PCI compliant to handle the cc part.
>  
> I hope that helps a little.
>  
> Paul
>  
>  
>  
> 
> Sent from my iPhone
> 
> On 8 Oct 2018, at 05:32, ROS Didier  wrote:
> 
> Hi Francisco
> 
>Thank you for your remark. 
> You're right, but it's the only procedure I found that allows searching on 
> encrypted fields with good response times (using an index)!
> 
> Regarding access to the file system, our servers are in protected network 
> areas. Few

Re: FTS trigger works 1 at a time, but fails with bulk insert script

2018-10-08 Thread Adrian Klaver

On 10/8/18 1:58 PM, Malik Rumi wrote:

So what is the script you used to do the bulk INSERT?

There's actually three, but they are all basically the same. The 
differences have to do with the source material being inserted:


#!/usr/local/bin/python3.6
# coding: utf-8

from os import environ
environ['DJANGO_SETTINGS_MODULE'] = 'chronicle.settings'
import django
django.setup()
from ktab.models import Entry, Tag


So I am going to assume Entry and Tag map to the tables ktab_entry and 
public.ktab_entry_tags respectively.



from django.utils.text import slugify
import csv


filename = '/home/malikarumi/Projects/Progress_Logs/edit_blogpost.csv'

with open(filename, 'rt') as text:
     reader = csv.DictReader(text, delimiter=',')
     # next(reader, None)
     for row in reader:
         my_entry = Entry.objects.create(
             title=row['title'], slug=row['slug'], 
chron_date=row['created'],

             clock=row['clock'], content=row['content'])
         my_entry.tags.add(row['tag'])


I thought that you needed to pass a model object to add(). If I am
following the above, it is just a CSV field value.


Assuming log_statement = 'mod', I would tail the postgresql.conf file to 
see what is actually hitting the database.




“None of you has faith until he loves for his brother or his neighbor
what he loves for himself.”







--
Adrian Klaver
adrian.kla...@aklaver.com



Re: FTS trigger works 1 at a time, but fails with bulk insert script

2018-10-08 Thread Adrian Klaver

On 10/8/18 3:54 PM, Adrian Klaver wrote:

On 10/8/18 1:58 PM, Malik Rumi wrote:

So what is the script you used to do the bulk INSERT?

There's actually three, but they are all basically the same. The 
differences have to do with the source material being inserted:


#!/usr/local/bin/python3.6
# coding: utf-8

from os import environ
environ['DJANGO_SETTINGS_MODULE'] = 'chronicle.settings'
import django
django.setup()
from ktab.models import Entry, Tag


So I am going to assume Entry and Tag map to the tables ktab_entry and 
public.ktab_entry_tags respectively.



from django.utils.text import slugify
import csv


filename = '/home/malikarumi/Projects/Progress_Logs/edit_blogpost.csv'

with open(filename, 'rt') as text:
     reader = csv.DictReader(text, delimiter=',')
     # next(reader, None)
     for row in reader:
         my_entry = Entry.objects.create(
             title=row['title'], slug=row['slug'], 
chron_date=row['created'],

             clock=row['clock'], content=row['content'])
         my_entry.tags.add(row['tag'])


I thought that you needed to pass a model object to add(). If I am
following the above, it is just a CSV field value.


Assuming log_statement = 'mod', I would tail the postgresql.conf file to 
see what is actually hitting the database.


Aargh, meant tail the Postgres log file.
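
For example, a sketch (the log file path varies by distribution and
configuration):

-- in postgresql.conf: log_statement = 'mod', then reload the config
SELECT pg_reload_conf();
-- and, from a shell, watch what the bulk script actually sends:
--   tail -f /var/lib/pgsql/9.4/data/pg_log/postgresql-Mon.log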





“None of you has faith until he loves for his brother or his
neighbor what he loves for himself.”










--
Adrian Klaver
adrian.kla...@aklaver.com



Re: pg9.6: no backup history file (*.backup) created on hot standby

2018-10-08 Thread Michael Paquier
(Please do not forget to add the community mailing list in CC.)

On Tue, Oct 09, 2018 at 10:33:56AM +0800, magodo wrote:
> Since the backup history file aims to help the administrator identify the
> point from which the WAL archive should be kept and before which the
> archive can be cleaned, it is very helpful in many kinds of use cases.
> 
> Why not also create it when do basebackup on standby?

The backup_label file is equally helpful, and backup history files are
not mandatory for backups.  One of the reasons why we cannot have
those on standbys is that there is no actual way to ensure the
uniqueness of this file, as two backups could be started in parallel
with the same start location and the *same* file name.  If an archive
command is not able to handle the duplicates correctly, you could bloat
pg_wal.  And that's a real problem.
--
Michael




Re: pg9.6: no backup history file (*.backup) created on hot standby

2018-10-08 Thread Michael Paquier
On Tue, Oct 09, 2018 at 11:45:50AM +0800, magodo wrote:
> Yet, I am still not so clear on how the bloat of pg_wal happens. Do
> you mean pg_wal will be filled up by many .backup files?

If your archive_command is, for example, a simple cp (which it should not
be, by the way), and you try to archive the same file twice, then the
archive command will fail continuously and prevent existing WAL
segments from being archived.  Segments are continuously created, and pg_wal
grows in size.
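
For context, a plain cp silently overwrites an existing archive file, while
the archive_command example from the PostgreSQL documentation refuses to and
fails visibly instead; that refusal is exactly the failure mode described
above when the same file name is archived twice (the archive directory is
illustrative):

archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'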
--
Michael




pg_dump: [archiver (db)] query failed: FATAL: semop(id=10649641) failed: Identifier removed

2018-10-08 Thread Raghavendra Rao J S V
I am receiving the error below while taking a backup using pg_dump. Please
help me understand why, and how to resolve it.


pg_dump: [archiver (db)] query failed: ERROR:  could not open relation with
OID 14132724
pg_dump: [archiver (db)] query was: SELECT
pg_catalog.pg_get_viewdef('14132724'::pg_catalog.oid) AS viewdef
pg_dump: [archiver (db)] query failed: FATAL:  semop(id=10649641) failed:
Identifier removed
FATAL:  semop(id=10649641) failed: Identifier removed
FATAL:  semop(id=10649641) failed: Identifier removed
FATAL:  semop(id=10649641) failed: Identifier removed
pg_dump: [archiver (db)] query was: DECLARE bloboid CURSOR FOR SELECT oid
FROM pg_largeobject_metadata
pg_dump: [archiver (db)] query failed: FATAL:  semop(id=12943400) failed:
Identifier removed
FATAL:  semop(id=12943400) failed: Identifier removed
FATAL:  semop(id=12943400) failed: Identifier removed
FATAL:  semop(id=12943400) failed: Identifier removed
pg_dump: [archiver (db)] query was: DECLARE bloboid CURSOR FOR SELECT oid
FROM pg_largeobject_metadata
pg_dump: [archiver (db)] query failed: FATAL:  semop(id=15466542) failed:
Identifier removed
FATAL:  semop(id=15466542) failed: Identifier removed
FATAL:  semop(id=15466542) failed: Identifier removed
FATAL:  semop(id=15466542) failed: Identifier removed
pg_dump: [archiver (db)] query was: DECLARE bloboid CURSOR FOR SELECT oid
FROM pg_largeobject_metadata
pg_dump: [archiver (db)] query failed: FATAL:  semop(id=17432611) failed:
Identifier removed
FATAL:  semop(id=17432611) failed: Identifier removed
FATAL:  semop(id=17432611) failed: Identifier removed
FATAL:  semop(id=17432611) failed: Identifier removed
pg_dump: [archiver (db)] query was: DECLARE bloboid CURSOR FOR SELECT oid
FROM pg_largeobject_metadata
pg_dump: [archiver (db)] query failed: FATAL:  semop(id=19759139) failed:
Identifier removed
FATAL:  semop(id=19759139) failed: Identifier removed
FATAL:  semop(id=19759139) failed: Identifier removed
FATAL:  semop(id=19759139) failed: Identifier removed
pg_dump: [archiver (db)] query was: DECLARE bloboid CURSOR FOR SELECT oid
FROM pg_largeobject_metadata

-- 
Regards,
Raghavendra Rao J S V


Re: pg_dump: [archiver (db)] query failed: FATAL: semop(id=10649641) failed: Identifier removed

2018-10-08 Thread Tom Lane
Raghavendra Rao J S V  writes:
> Receiving below error while taking the backup using pg_dump. Please help me
> why and how to resolve this.

> pg_dump: [archiver (db)] query failed: FATAL:  semop(id=10649641) failed:
> Identifier removed

Are you running on a platform that uses systemd?  If so, see

https://www.postgresql.org/docs/current/static/kernel-resources.html#SYSTEMD-REMOVEIPC

If it's not systemd, I'd still say that something kneecapped your PG
server by removing its semaphores.  Restarting the postmaster will
clear the problem transiently, but do nothing to prevent a recurrence.
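
If systemd is indeed the cause, the mitigation described in the linked docs is
to stop logind from removing IPC objects for non-system users (a sketch; takes
effect after restarting systemd-logind or rebooting):

# /etc/systemd/logind.conf
RemoveIPC=no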

regards, tom lane