Re: Why could different data in a table be processed with different performance?
Hi,

Assuming the DB is quiescent: if you run

select count(*) from articles where article_id between %s and %s

i.e. without reading the JSON, is your buffers hit count increasing?
20 000 8K blocks * 2 is 500MB, which should be in RAM after the first run.

Fast: read=710, I/O Timings: read=852.547 ==> about 1.2 ms per I/O, roughly 800 IO/s.
Some memory, sequential reads or a good RAID layout.

Slow: read=5244, I/O Timings: read=24507.621 ==> about 4.7 ms per I/O, roughly 200 IO/s.
More HD reads? More seeks? Slower HD zones?

Maybe you can play with the PG cache size.

On Sat, Sep 22, 2018 at 12:32 PM Vladimir Ryabtsev wrote:
> > I think reindex will improve the heap access..and maybe the index access
> > too. I don't see why it would be bloated without UPDATE/DELETE, but you
> > could check to see if its size changes significantly after reindex.
>
> I tried REINDEX, and the size of the PK index changed from 2579 to 1548 MB.
> But the test doesn't show any significant improvement over what it was. Maybe
> read speed for the "big" range became just slightly faster on average.
>
> Vlad
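The per-read latency figures in the message above come from dividing the reported I/O time by the number of blocks read. A minimal sketch of that arithmetic, using only the four numbers quoted from the EXPLAIN output (the function name `io_stats` is made up for illustration):

```python
def io_stats(blocks_read, io_time_ms):
    """Return (milliseconds per read, reads per second) for one EXPLAIN run."""
    ms_per_io = io_time_ms / blocks_read
    return ms_per_io, 1000.0 / ms_per_io


# Numbers quoted in the message: "read=" block counts and "I/O Timings: read=" in ms.
fast_ms, fast_iops = io_stats(710, 852.547)
slow_ms, slow_iops = io_stats(5244, 24507.621)

print(f"fast: {fast_ms:.2f} ms/IO, {fast_iops:.0f} IO/s")
print(f"slow: {slow_ms:.2f} ms/IO, {slow_iops:.0f} IO/s")
```

The slow range costs roughly four times as much per read as the fast one, which is what points at the storage layer rather than at the query plan.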
Re: Temporarily very slow planning time after a big delete
On Tue, May 21, 2019 at 8:27 PM Walter Smith wrote:
> On Tue, May 21, 2019 at 11:17 AM Peter Geoghegan wrote:
>
>> On Tue, May 21, 2019 at 11:16 AM Walter Smith wrote:
>> > It occurs to me that is a somewhat unusual index -- it tracks
>> > unprocessed notifications so it gets an insert and delete for every row,
>> > and is normally almost empty.
>>
>> Is it a very low cardinality index? In other words, is the total
>> number of distinct keys rather low? Not just at any given time, but
>> over time?
>
> Very low. Probably less than ten over all time. I suspect the only use of
> the index is to rapidly find the processed=false rows, so the
> notifiable_type value isn't important, really. It would probably work just
> as well on any other column.
>
> -- Walter
Re: Very slow Query compared to Oracle / SQL - Server
Are you sure you're using the same data set? Unless I'm overlooking something obvious, one result has 500 000 rows and the other 7 000.
Why the index is not used ?
Hi
I would like to submit the following problem to the PostgreSQL community. In my
company, we have data encryption needs.
So I decided to use the following procedure :
(1)Creating a table with a bytea type column to store the encrypted data
CREATE TABLE cartedecredit(card_id SERIAL PRIMARY KEY, username VARCHAR(100),
cc bytea);
(2)inserting encrypted data
INSERT INTO cartedecredit(username,cc) SELECT 'individu ' || x.id,
pgp_sym_encrypt('test value ' || x.id, 'motdepasse','compress-algo=2, cipher-algo=aes256')
FROM generate_series(1,10) AS x(id);
(3)Querying the table
SELECT pgp_sym_decrypt(cc, 'motdepasse') FROM cartedecredit WHERE
pgp_sym_decrypt(cc, 'motdepasse')='test value 32';
pgp_sym_decrypt
-
test value 32
(1 row)
Time: 115735.035 ms (01:55.735)
-> the execution time is very long. So, I decide to create an index
(4)Creating an index on encrypted data
CREATE INDEX idx_cartedecredit_cc02 ON cartedecredit(cc);
(5)Querying the table again
SELECT pgp_sym_decrypt(cc, 'motdepasse') FROM cartedecredit WHERE
pgp_sym_decrypt(cc, 'motdepasse')='test value 32';
pgp_sym_decrypt
-
test value 32
(1 row)
Time: 118558.485 ms (01:58.558) -> almost 2 minutes !!
postgres=# explain analyze SELECT pgp_sym_decrypt(cc, 'motdepasse') FROM
cartedecredit WHERE pgp_sym_decrypt(cc, 'motdepasse')='test value 32';
QUERY PLAN
--
Seq Scan on cartedecredit (cost=0.00..3647.25 rows=500 width=32) (actual
time=60711.787..102920.509 rows=1 loops=1)
Filter: (pgp_sym_decrypt(cc, 'motdepasse'::text) = 'test value 32'::text)
Rows Removed by Filter: 9
Planning time: 0.112 ms
Execution time: 102920.585 ms
(5 rows)
==> The index is not used in the execution plan, maybe because of the use of a
function in the WHERE clause. I decided to modify the SQL query.
(6)Querying the table
SELECT pgp_sym_decrypt(cc, 'motdepasse') FROM cartedecredit WHERE
cc=pgp_sym_encrypt('test value 32', 'motdepasse');
pgp_sym_decrypt
-
(0 rows)
Time: 52659.571 ms (00:52.660)
==> The execution time is very long and I get no result (!?)
QUERY PLAN
---
Seq Scan on cartedecredit (cost=0.00..3646.00 rows=1 width=32) (actual
time=61219.989..61219.989 rows=0 loops=1)
Filter: (cc = pgp_sym_encrypt('test value 32'::text, 'motdepasse'::text))
Rows Removed by Filter: 10
Planning time: 0.157 ms
Execution time: 61220.035 ms
(5 rows)
==> My index is not used.
QUESTIONS :
- Why do I get no result?
- Why is the index not used?
Thanks in advance
Best Regards
Didier
Didier ROS
Expertise SGBD
DS IT/IT DMA/Solutions Groupe EDF/Expertise Applicative - SGBD
Ce message et toutes les pièces jointes (ci-après le 'Message') sont établis à
l'intention exclusive des destinataires et les informations qui y figurent sont
strictement confidentielles. Toute utilisation de ce Message non conforme à sa
destination, toute diffusion ou toute publication totale ou partielle, est
interdite sauf autorisation expresse.
Si vous n'êtes pas le destinataire de ce Message, il vous est interdit de le
copier, de le faire suivre, de le divulguer ou d'en utiliser tout ou partie. Si
vous avez reçu ce Message par erreur, merci de le supprimer de votre système,
ainsi que toutes ses copies, et de n'en garder aucune trace sur quelque support
que ce soit. Nous vous remercions également d'en avertir immédiatement
l'expéditeur par retour du message.
Il est impossible de garantir que les communications par messagerie
électronique arrivent en temps utile, sont sécurisées ou dénuées de toute
erreur ou virus.
This message and any attachments (the 'Message') are intended solely for the
addressees. The information contained in this Message is confidential. Any use
of information contained in this Message not in accord with its purpose, any
dissemination or disclosure, either whole or partial, is prohibited except
formal approval.
If you are not the addressee, you may not copy, forward, disclose or use any
part of it. If you have received this message in error, please delete it and
all copies from your system and notify the sender immediately by return message.
E-mail communication cannot be guaranteed to be timely secure, error or
virus-free.
RE: Why the index is not used ?
Hi Pavel
Thank you for your answer. Here is a procedure that works:
-CREATE TABLE cartedecredit(card_id SERIAL PRIMARY KEY, username
VARCHAR(100), cc bytea);
-INSERT INTO cartedecredit(username,cc) SELECT 'individu ' || x.id,
pgp_sym_encrypt('test value ' || x.id, 'motdepasse','compress-algo=2, cipher-algo=aes256')
FROM generate_series(1,10) AS x(id);
-CREATE INDEX idx_cartedecredit_cc02 ON
cartedecredit(pgp_sym_decrypt(cc, 'motdepasse','compress-algo=2, cipher-algo=aes256'));
-SELECT pgp_sym_decrypt(cc, 'motdepasse') FROM cartedecredit WHERE
pgp_sym_decrypt(cc, 'motdepasse','compress-algo=2, cipher-algo=aes256')='test value 32';
pgp_sym_decrypt
-
test value 32
(1 row)
Time: 2.237 ms
- explain analyze SELECT pgp_sym_decrypt(cc, 'motdepasse') FROM cartedecredit WHERE
pgp_sym_decrypt(cc, 'motdepasse','compress-algo=2, cipher-algo=aes256')='test value 32';
QUERY PLAN
---
Index Scan using idx_cartedecredit_cc02 on cartedecredit (cost=0.42..8.44
rows=1 width=32) (actual time=1.545..1.546 rows=1 loops=1)
Index Cond: (pgp_sym_decrypt(cc, 'motdepasse'::text, 'compress-algo=2,
cipher-algo=aes256'::text) = 'test value 32'::text)
Planning time: 0.330 ms
Execution time: 1.580 ms
(4 rows)
OK that works great.
Thank you for the recommendation
Best Regards
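The difference between the plain index on cc and the expression index can be pictured with a toy model. This is only a sketch: a Python dictionary stands in for the B-tree, the 100-row table is an arbitrary size chosen for the example, and `encrypt`/`decrypt` are hypothetical placeholders (they just reverse the bytes), not pgcrypto functions.

```python
# Toy model: an "index" is a mapping from an indexed key to a row id.
# encrypt()/decrypt() are hypothetical stand-ins, NOT real cryptography.

def encrypt(plaintext):
    return plaintext.encode()[::-1]


def decrypt(ciphertext):
    return ciphertext[::-1].decode()


# row id -> stored ciphertext (100 rows is an assumption for the example)
rows = {i: encrypt(f"test value {i}") for i in range(1, 101)}

# Index on the raw column: keys are ciphertexts.
index_on_cc = {cc: row_id for row_id, cc in rows.items()}

# Index on the expression decrypt(cc): keys are plaintexts.
index_on_expr = {decrypt(cc): row_id for row_id, cc in rows.items()}

wanted = "test value 32"

# WHERE decrypt(cc) = wanted cannot be looked up in index_on_cc,
# so every row has to be decrypted and compared (a sequential scan).
scan_hits = [row_id for row_id, cc in rows.items() if decrypt(cc) == wanted]

# With the expression index, the same predicate is a single lookup.
index_hit = index_on_expr.get(wanted)

print(scan_hits, index_hit)
print(wanted in index_on_expr)  # the plaintext itself is a key of the index
```

The last line is the catch: the expression index is fast precisely because it stores the decrypted values as its keys.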
Didier ROS
Expertise SGBD
DS IT/IT DMA/Solutions Groupe EDF/Expertise Applicative - SGBD
Nanterre Picasso - E2 565D (aile nord-est)
32 Avenue Pablo Picasso
92000 Nanterre
From: [email protected] [mailto:[email protected]]
Sent: Saturday, 6 October 2018 12:14
To: ROS Didier
Cc: [email protected]; [email protected];
[email protected]
Subject: Re: Why the index is not used ?
On Sat, 6 Oct 2018 at 11:57, ROS Didier <[email protected]> wrote:
Hi
I would like to submit the following problem to the PostgreSQL community. In my
company, we have data encryption needs.
So I decided to use the following procedure :
(1)Creating a table with a bytea type column to store the encrypted data
CREATE TABLE cartedecredit(card_id SERIAL PRIMARY KEY, username VARCHAR(100),
cc bytea);
(2)inserting encrypted data
INSERT INTO cartedecredit(username,cc) SELECT 'individu ' || x.id,
pgp_sym_encrypt('test value ' || x.id, 'motdepasse','compress-algo=2, cipher-algo=aes256')
FROM generate_series(1,10) AS x(id);
(3)Querying the table
SELECT pgp_sym_decrypt(cc, 'motdepasse') FROM cartedecredit WHERE
pgp_sym_decrypt(cc, 'motdepasse')='test value 32';
pgp_sym_decrypt
-
test value 32
(1 row)
Time: 115735.035 ms (01:55.735)
-> the execution time is very long. So, I decide to create an index
(4)Creating an index on encrypted data
CREATE INDEX idx_cartedecredit_cc02 ON cartedecredit(cc);
This index cannot help, but a functional index can:
CREATE INDEX ON cartedecredit(pgp_sym_decrypt(cc, 'motdepasse'));
Unfortunately, the index file will hold the decrypted values in this case.
(5)Querying the table again
SELECT pgp_sym_decrypt(cc, 'motdepasse') FROM cartedecredit WHERE
pgp_sym_decrypt(cc, 'motdepasse')='test value 32';
pgp_sym_decrypt
-
test value 32
(1 row)
Time: 118558.485 ms (01:58.558) -> almost 2 minutes !!
postgres=# explain analyze SELECT pgp_sym_decrypt(cc, 'motdepasse') FROM
cartedecredit WHERE pgp_sym_decrypt(cc, 'motdepasse')='test value 32';
QUERY PLAN
--
Seq Scan on cartedecredit (cost=0.00..3647.25 rows=500 width=32) (actual
time=60711.787..102920.509 rows=1 loops=1)
Filter: (pgp_sym_decrypt(cc, 'motdepasse'::text) = 'test value 32'::text)
Rows Removed by Filter: 9
Planning time: 0.112 ms
Execution time: 102920.585 ms
(5 rows)
==> the index is not used in the execution plan. maybe because of the use of a
function in the WHERE clause. I decide to modify the SQL q
RE: Why the index is not used ?
Hi Paul
Thanks for the explanation. I think you are right.
I understand why the WHERE clause “cc=pgp_sym_encrypt('test
value 32', 'motdepasse');” does not bring anything back.
Best Regards
Didier ROS
From: [email protected] [mailto:[email protected]]
Sent: Sunday, 7 October 2018 04:21
To: ROS Didier
Cc: [email protected]; [email protected];
[email protected]
Subject: Re: Why the index is not used ?
I haven’t looked up what pgp_sym_encrypt() does, but assuming it does encryption
the way you should for credit card data, it will be using a random salt,
and the same input value won’t encrypt to the same output value. So
WHERE cc=pgp_sym_encrypt('test value 32', 'motdepasse');
wouldn’t work, because the value generated by the function when you are
searching isn’t the same value as the one generated when you stored it.
Paul
On 6 Oct 2018, at 19:57, ROS Didier <[email protected]> wrote:
WHERE cc=pgp_sym_encrypt('test value 32', 'motdepasse');
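Paul's point can be illustrated with a toy randomized cipher. This is a sketch under loud assumptions: `toy_encrypt` and `toy_decrypt` are invented for this example, built from SHA-256 and a random 16-byte nonce, and are neither pgcrypto's implementation nor suitable for real use. They only show that when every encryption call mixes in fresh randomness, encrypting the search value again never reproduces the stored bytes.

```python
import hashlib
import os


def keystream(key, nonce, length):
    """Derive `length` pseudo-random bytes from key and nonce (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def toy_encrypt(plaintext, key):
    nonce = os.urandom(16)  # fresh random value on every call
    data = plaintext.encode()
    body = bytes(a ^ b for a, b in zip(data, keystream(key, nonce, len(data))))
    return nonce + body


def toy_decrypt(ciphertext, key):
    nonce, body = ciphertext[:16], ciphertext[16:]
    data = bytes(a ^ b for a, b in zip(body, keystream(key, nonce, len(body))))
    return data.decode()


key = b"motdepasse"
stored = toy_encrypt("test value 32", key)  # what the INSERT wrote
probe = toy_encrypt("test value 32", key)   # what the WHERE clause computes

print(stored == probe)  # the two ciphertexts differ
print(toy_decrypt(stored, key) == toy_decrypt(probe, key))  # same plaintext
```

An equality comparison on the ciphertext column therefore matches nothing, with or without an index, which would be consistent with the "(0 rows)" result in step (6).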
RE: Why the index is not used ?
Hi Francisco

Thank you for your remark.
You're right, but it's the only procedure I found to search on encrypted
fields with good response times (using an index)!

Regarding access to the file system, our servers are in protected network
areas. Few people can connect to them.

It's not the best solution, but we have data encryption needs and good
performance needs too. I do not know how to do it except with the specified
procedure.

If anyone has any proposals to put this in place, I'm interested.

Thanks in advance

Best Regards
Didier ROS

-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Sunday, 7 October 2018 17:58
To: ROS Didier
Cc: [email protected]; [email protected]; [email protected]; [email protected]
Subject: Re: Why the index is not used ?

ROS:

On Sun, Oct 7, 2018 at 3:13 PM, ROS Didier wrote:
> -INSERT INTO cartedecredit(username,cc) SELECT 'individu ' || x.id,
> pgp_sym_encrypt('test value ' || x.id, 'motdepasse','compress-algo=2, cipher-algo=aes256')
> FROM generate_series(1,10) AS x(id);
> -CREATE INDEX idx_cartedecredit_cc02 ON
> cartedecredit(pgp_sym_decrypt(cc, 'motdepasse','compress-algo=2, cipher-algo=aes256'));

If my french is not too rusty you are encrypting a credit-card, and
then storing an UNENCRYPTED copy in the index. So, getting it from the
server is trivial for anyone with filesystem access.

Francisco Olarte.
RE: Why the index is not used ?
Hi Vlad

Your remark is very interesting. You mean that it is better to run SQL
queries on unpersonalized data, and then retrieve the encrypted data for
those records.
OK, I take this recommendation into account and I will forward it to my
company's projects.

Nevertheless, you say that it is still possible to use indexes on the
encrypted data by using deterministic algorithms.
Can you give me some examples of these algorithms?

Thanks in advance

Best Regards
Didier ROS
Expertise SGBD

From: [email protected] [mailto:[email protected]]
Sent: Sunday, 7 October 2018 21:33
To: ROS Didier
Cc: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]
Subject: Re: Why the index is not used ?

Additionally it is not clear why you want to search in the table on encrypted
data. Usually you match a user with its unpersonalized data (such as login,
user ID) and then decrypt the personalized data. If you need to store user
identifying data encrypted as well (e.g. bank account number), you can use a
deterministic algorithm for it (without salt), because it is guaranteed to be
unique and you don't need to have different encrypted data for two identical
input strings.

Vlad
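One way to picture the "deterministic, no salt" idea is a keyed hash stored next to the randomized ciphertext and used only for equality lookups. This is a swapped-in illustration, not necessarily the algorithm Vlad had in mind: HMAC-SHA256 is a one-way keyed digest rather than reversible encryption, and `lookup_token`, the key value and the 100-row data set are all hypothetical.

```python
import hashlib
import hmac


def lookup_token(value, key):
    """Deterministic keyed digest: same value and key always give the same token."""
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()


# Assumption: this key is managed outside the database.
key = b"hypothetical-index-key"

t1 = lookup_token("test value 32", key)
t2 = lookup_token("test value 32", key)
t3 = lookup_token("test value 33", key)
print(t1 == t2, t1 == t3)

# A dictionary standing in for an index on the token column.
token_index = {lookup_token(f"test value {i}", key): i for i in range(1, 101)}
row_id = token_index.get(lookup_token("test value 32", key))
print(row_id)
```

Because equal inputs give equal tokens, an ordinary index on the token column can serve equality searches. The trade-off is that anyone who can read the column can see which rows share the same value.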
RE: Why the index is not used ?
Hi Virendra

Do you think that encryption outside the database is the best solution?
How do you manage the encryption key?
Can you give me some examples of this kind of solution?

Best Regards
Didier ROS

-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Sunday, 7 October 2018 20:41
To: ROS Didier; [email protected]
Cc: [email protected]; [email protected]; [email protected]; [email protected]
Subject: RE: Why the index is not used ?

You can consider outside-DB encryption, which is less of a worry for
performance, and data at rest will be encrypted.

Regards,
Virendra

This message is intended only for the use of the addressee and may contain
information that is PRIVILEGED AND CONFIDENTIAL. If you are not the intended
recipient, you are hereby notified that any dissemination of this
communication is strictly prohibited. If you have received this communication
in error, please erase all copies of the message and its attachments and
notify the sender immediately. Thank you.
RE: Why the index is not used ?
Hi Vlad

Sorry for this delay, but apparently the subject is of interest to many
people in the community. I received a lot of comments and answers.
I wrote my answers in the body of your message below.

Best Regards
Didier

From: [email protected] [mailto:[email protected]]
Sent: Saturday, 6 October 2018 18:51
To: ROS Didier
Cc: [email protected]; [email protected]; [email protected]
Subject: Re: Why the index is not used ?

Hello Didier,

>> (3), (5) to find the match, you decrypt the whole table, apparently this
takes quite a long time. The index cannot help here because indexes work on an
exact match of type and value, but you compare a mapped value, not the indexed
one. A functional index should help, but as was said, it goes against the idea
of encrypted storage. <<

I tested the solution of the functional index. It works very well, but the
data is no longer encrypted. This is not the right solution.

>> (6) I never used pgp_sym_encrypt() but I see that in INSERT INTO you
supplied the additional parameter 'compress-algo=2, cipher-algo=aes256' while
in (6) you did not. Probably this is the reason.
In general, matching an indexed bytea column should use the index; you can
verify this by populating the column unencrypted and using
'test value 32'::bytea for the match.
In your case I believe pgp_sym_encrypt() is not marked as STABLE or IMMUTABLE,
which is why it will be evaluated for each row (very inefficient) and cannot
use the index. From the documentation:
"Since an index scan will evaluate the comparison value only once, not once at
each row, it is not valid to use a VOLATILE function in an index scan
condition."
https://www.postgresql.org/docs/10/static/xfunc-volatility.html
If you cannot add STABLE/IMMUTABLE to pgp_sym_encrypt() (which apparently
should be there), you can encrypt the searched value as a separate operation
and then search in the table using a basic value match. <<

You're right about the missing parameter 'compress-algo=2, cipher-algo=aes256'.
I agree with you.

(1) I have tested your proposal:

DROP TABLE cartedecredit;
CREATE TABLE cartedecredit(card_id SERIAL PRIMARY KEY, username VARCHAR(100), cc bytea);
INSERT INTO cartedecredit(username,cc) SELECT 'individu ' || x.id,
decode('test value ' || x.id,'escape') FROM generate_series(1,10) AS x(id);

==> I inserted unencrypted data into the bytea column

postgres=# select * from cartedecredit limit 5 ;
 card_id | username   | cc
---------+------------+----------------------------
       1 | individu 1 | \x746573742076616c75652031
       2 | individu 2 | \x746573742076616c75652032
       3 | individu 3 | \x746573742076616c75652033
       4 | individu 4 | \x746573742076616c75652034
       5 | individu 5 | \x746573742076616c75652035

CREATE INDEX idx_cartedecredit_cc02 ON cartedecredit(cc);
SELECT encode(cc,'escape') FROM cartedecredit WHERE cc=decode('test value 32','escape');

QUERY PLAN
Index Only Scan using idx_cartedecredit_cc02 on cartedecredit (cost=0.42..8.44 rows=1 width=32) (actual time=0.033..0.034 rows=1 loops=1)
Index Cond: (cc = '\x746573742076616c7565203332'::bytea)
Heap Fetches: 1
Planning time: 0.130 ms
Execution time: 0.059 ms
(5 rows)

==> It works, but the data is not encrypted. Everyone can have access to the data.

(2) 2nd test:

DROP TABLE cartedecredit;
CREATE TABLE cartedecredit(card_id SERIAL PRIMARY KEY, username VARCHAR(100), cc bytea);
INSERT INTO cartedecredit(username,cc) SELECT 'individu ' || x.id,
pgp_sym_encrypt('test value ' || x.id, 'motdepasse','compress-algo=2, cipher-algo=aes256')
FROM generate_series(1,10) AS x(id);

postgres=# select * from cartedecredit limit 5 ;
 card_id | username   | cc
---------+------------+----
       1 | individu 1 | \xc30d0409030296304d007bf50ed768d2480153cd4a4e2d240249f94b31ec168391515ea80947f97970f7a4e058bff648f752df194498dd480c3b8a5c0d2942f90c6dde21a6b9bf4e9fd7986c6f986e3783647e7a6205b48c03
       2 | individu 2 | \xc30d0409030257b50bc0e6bcd8d270d248010984b60126af01ba922da27e2e78c33110f223f0210cf34da77243277305254cba374708d447fc7d653dd9e00ff9a96803a2c47ee95269534f2c24fab1c9dc31f7909ca7adeaf0
       3 | individu 3 | \xc30d0409030
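The bytea values printed in the first test above are simply the hex encoding of the plaintext, which is why that variant offers no protection. A small sketch that decodes them (the helper name `bytea_hex_to_text` is made up for this example):

```python
def bytea_hex_to_text(literal):
    """Decode a hex-format bytea literal (a backslash-x prefix plus hex digits)."""
    if not literal.startswith("\\x"):
        raise ValueError("expected a hex literal starting with backslash-x")
    return bytes.fromhex(literal[2:]).decode("ascii")


# First row of the unencrypted test table.
print(bytea_hex_to_text(r"\x746573742076616c75652031"))    # test value 1

# The constant shown in the Index Cond line of the plan.
print(bytea_hex_to_text(r"\x746573742076616c7565203332"))  # test value 32
```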
RE: Why the index is not used ?
Hi Vlad

OK, I take into account your remark about the need to search on encrypted
data.
My answer to your remark:

>> you can use a deterministic algorithm for it (without salt) <<

Can you give me one of these deterministic algorithms (without salt)?

Best Regards
Didier
RE: Why the index is not used ?
Hi Tomas Thank you for your answer and recommendation which is very interesting. I'm going to study the PCI DSS document right now. - Here are my answer to your question : >> What is your threat model? << we want to prevent access to sensitive data for everyone except those who have the encryption key. in case of files theft, backups theft, dumps theft, we do not want anyone to access sensitive data. - I have tested the solution you proposed, it works great. Best Regards Didier ROS -Message d'origine- De : [email protected] [mailto:[email protected]] Envoyé : dimanche 7 octobre 2018 22:08 À : ROS Didier ; [email protected] Cc : [email protected]; [email protected]; [email protected]; [email protected] Objet : Re: Why the index is not used ? Hi, On 10/07/2018 08:32 PM, ROS Didier wrote: > Hi Francisco > > Thank you for your remark. > You're right, but it's the only procedure I found to make search on > encrypted fields with good response times (using index) ! > Unfortunately, that kinda invalidates the whole purpose of in-database encryption - you'll have encrypted on-disk data in one place, and then plaintext right next to it. If you're dealing with credit card numbers, then you presumably care about PCI DSS, and this is likely a direct violation of that. > Regarding access to the file system, our servers are in protected network areas. few people can connect to it. > Then why do you need encryption at all? If you assume access to the filesystem / storage is protected, why do you bother with encryption? What is your threat model? > it's not the best solution, but we have data encryption needs and good > performance needs too. I do not know how to do it except the specified > procedure.. > > if anyone has any proposals to put this in place, I'm interested. > One thing you could do is hashing the value and then searching by the hash. 
So aside from having the encrypted column you'll also have a short hash, and you may use it in the query *together* with the original condition. It does not need to be unique (in fact it should not be to make it impossible to reverse the hash), but it needs to have enough distinct values to make the index efficient. Say, 10k values should be enough, because that means 0.01% selectivity. So the function might look like this, for example: CREATE FUNCTION cchash(text) RETURNS int AS $$ SELECT abs(hashtext($1)) % 1; $$ LANGUAGE sql; and then be used like this: CREATE INDEX idx_cartedecredit_cc02 ON cartedecredit(cchash(cc)); and in the query SELECT pgp_sym_decrypt(cc, 'motdepasse') FROM cartedecredit WHERE pgp_sym_decrypt(cc, 'motdepasse')='test value 32' AND cchash(cc) = cchash('test value 32'); Obviously, this does not really solve the issues with having to pass the password to the query, making it visible in pg_stat_activity, various logs etc. Which is why people generally use FDE for the whole disk, which is transparent and provides the same level of protection. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services Ce message et toutes les pièces jointes (ci-après le 'Message') sont établis à l'intention exclusive des destinataires et les informations qui y figurent sont strictement confidentielles. Toute utilisation de ce Message non conforme à sa destination, toute diffusion ou toute publication totale ou partielle, est interdite sauf autorisation expresse. Si vous n'êtes pas le destinataire de ce Message, il vous est interdit de le copier, de le faire suivre, de le divulguer ou d'en utiliser tout ou partie. Si vous avez reçu ce Message par erreur, merci de le supprimer de votre système, ainsi que toutes ses copies, et de n'en garder aucune trace sur quelque support que ce soit. Nous vous remercions également d'en avertir immédiatement l'expéditeur par retour du message. 
Il est impossible de garantir que les communications par messagerie électronique arrivent en temps utile, sont sécurisées ou dénuées de toute erreur ou virus. This message and any attachments (the 'Message') are intended solely for the addressees. The information contained in this Message is confidential. Any use of information contained in this Message not in accord with its purpose, any dissemination or disclosure, either whole or partial, is prohibited except formal approval. If you are not the addressee, you may not copy, forward, disclose or use any part of it. If you have received this message in error, please delete it and all copies from your system and notify the sender immediately by return message. E-mail communication cannot be guaranteed to be timely secure, error or virus-free.
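To make the hash-bucket idea from the message above concrete, here is a minimal Python sketch. It is not how PostgreSQL implements the index, only a model of the lookup. It assumes the bucket is computed from the plaintext at insert time and stored next to the encrypted value. `fake_encrypt` and `fake_decrypt` are placeholders standing in for pgp_sym_encrypt and pgp_sym_decrypt (base64 is not encryption), SHA-256 is swapped in for hashtext() purely for illustration, and `Table`, `find` and `decrypt_calls` are hypothetical names.

```python
import base64
import hashlib
from collections import defaultdict

NUM_BUCKETS = 10_000  # the "10k values" suggested in the message


def cchash(value):
    """Map a plaintext value to one of NUM_BUCKETS buckets (deliberately lossy)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_BUCKETS


# Placeholders for pgp_sym_encrypt / pgp_sym_decrypt. base64 is NOT encryption;
# it only lets the sketch run without third-party packages.
def fake_encrypt(plaintext):
    return base64.b64encode(plaintext.encode("utf-8"))


def fake_decrypt(blob):
    return base64.b64decode(blob).decode("utf-8")


class Table:
    """Toy table: encrypted payloads plus an index on the bucket number."""

    def __init__(self):
        self.rows = []                   # encrypted values, by row position
        self.index = defaultdict(list)   # bucket number -> row positions
        self.decrypt_calls = 0           # how many rows a lookup had to decrypt

    def insert(self, plaintext):
        # Assumption: the bucket is derived from the plaintext before encryption.
        self.index[cchash(plaintext)].append(len(self.rows))
        self.rows.append(fake_encrypt(plaintext))

    def find(self, wanted):
        matches = []
        # Index condition: only rows in the same bucket are candidates.
        for pos in self.index.get(cchash(wanted), []):
            self.decrypt_calls += 1
            # Original condition: decrypt and compare, as in the WHERE clause.
            if fake_decrypt(self.rows[pos]) == wanted:
                matches.append(pos)
        return matches


table = Table()
for i in range(1, 5001):
    table.insert(f"test value {i}")

hits = table.find("test value 32")
print(hits, table.decrypt_calls)
```

With 5000 rows spread over 10000 buckets, the lookup decrypts only the few rows that share the bucket instead of all 5000, which is why the query combines both conditions.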
RE: Why the index is not used ?
Hi Paul,

Thank you very much for your feedback, which is very informative.
I understand that, concerning the encryption of credit card numbers, it is imperative to respect the PCI DSS document. I am going to study it.
However, I would like to say that I chose my example badly by using a table storing credit card numbers. In fact, my problem is more generic.
I want to implement a solution that encrypts “sensitive” data and can retrieve data with good performance (by using an index).
I find the solution you propose very interesting and I am going to test it.

Best Regards
Didier ROS

From: [email protected] [mailto:[email protected]]
Sent: Monday, 8 October 2018 00:11
To: ROS Didier
Cc: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]
Subject: Re: Why the index is not used ?

Hi Didier,

I’m sorry to tell you that you are probably doing something (i.e. handling/storing credit cards) which would mean you have to comply with PCI DSS requirements.

As such you should probably have a QSA (auditor) who you can run any proposed solution by (so you know they will be comfortable with it when they do their audit).

I think your current solution would be frowned upon because:
- cards are effectively stored in plaintext in the index.
- your encryption/decryption is being done in the database, rather than by something with that as its sole role.

People have already mentioned the former so I won’t go into it further.

But for the second part, if someone can do a

    SELECT pgp_sym_decrypt(cc)

then you are one SQL injection away from having your card data stolen. You do have encryption, but in practice everything is available unencrypted, so the encryption is more of a tick in a box than an actual defence against bad things happening. In a properly segmented system even your DBA should not be able to access decrypted card data.
You probably should look into doing something like:

- store the first 6 and last 4 digits of the card unencrypted.
- store the remaining card digits encrypted.
- have the encryption/decryption done by a separate service called by your application code outside the db.

You haven’t gone into what your search requirements are (or I missed them), but while the above won’t give you a fast exact cc lookup, in practice being able to search using the first 6 and last 4 can get you a small enough subset that can then be filtered after decrypting the middle.

We are straying a little off PostgreSQL topic here, but if you and/or your management aren’t already looking at PCI DSS compliance I’d strongly recommend you do so. It can seem like a pain, but it is much better to take that pain up front rather than having to reengineer everything later. There are important security aspects it helps make sure you cover, but maybe also some business aspects (i.e. possible partners who won’t be able to deal with you without your compliance sign-off documentation).

The alternative, if storing cc data isn’t a core requirement, is to not store the credit card data at all. That is generally the best solution if it meets your needs, i.e. if you just want to accept payments then use a third party who is PCI compliant to handle the cc part.

I hope that helps a little.

Paul

Sent from my iPhone

On 8 Oct 2018, at 05:32, ROS Didier <[email protected]> wrote:

Hi Francisco

Thank you for your remark.
You're right, but it's the only procedure I found to make search on encrypted fields with good response times (using index) !

Regarding access to the file system, our servers are in protected network areas. few people can connect to it.

it's not the best solution, but we have data encryption needs and good performance needs too. I do not know how to do it except the specified procedure..

if anyone has any proposals to put this in place, I'm interested.
Thanks in advance
Best Regards
Didier ROS

-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Sunday, 7 October 2018 17:58
To: ROS Didier <[email protected]>
Cc: [email protected]; [email protected]; [email protected]; [email protected]
Subject: Re: Why the index is not used ?

ROS:

On Sun, Oct 7, 2018 at 3:13 PM, ROS Didier <[email protected]> wrote:
-INSERT INTO cartedecredit(username,cc) SELECT 'individu ' || x.id, pgp_sym_encrypt('test value ' || x.id, 'motdepasse','compress-algo=2, ciphe
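Paul's suggestion in this thread (first 6 and last 4 digits in the clear, the middle protected by a separate service, candidates filtered after decryption) can be sketched roughly as follows. This is only an illustration of the lookup flow, not a PCI DSS design. `CardVault`, `protect`, `reveal`, `store_card` and `find_card` are hypothetical names, and the in-memory vault merely stands in for a real encryption service running outside the database.

```python
import secrets


class CardVault:
    """Hypothetical stand-in for the separate encryption service.

    A real service would encrypt and decrypt; this one hands out opaque
    tokens and keeps the secret digits in memory so the sketch is runnable.
    """

    def __init__(self):
        self._secrets = {}

    def protect(self, middle_digits):
        token = secrets.token_hex(8)
        self._secrets[token] = middle_digits
        return token

    def reveal(self, token):
        return self._secrets[token]


def store_card(rows, vault, pan):
    # Only the first 6 and last 4 digits are stored in the clear.
    rows.append({
        "first6": pan[:6],
        "last4": pan[-4:],
        "middle": vault.protect(pan[6:-4]),
    })


def find_card(rows, vault, pan):
    # Step 1: cheap filter on the plaintext parts (this is what an index could serve).
    candidates = [
        i for i, row in enumerate(rows)
        if row["first6"] == pan[:6] and row["last4"] == pan[-4:]
    ]
    # Step 2: ask the service to reveal the middle for the few candidates only.
    return [i for i in candidates if vault.reveal(rows[i]["middle"]) == pan[6:-4]]


vault = CardVault()
rows = []
# Made-up card numbers, for illustration only.
for number in ("4111111111111111", "4111110000001111", "5500000000000004"):
    store_card(rows, vault, number)

found = find_card(rows, vault, "4111110000001111")
print(found)
```

Two of the three sample rows share the same first 6 and last 4 digits, so the plaintext filter narrows the search to two candidates and the service is consulted only for those.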
How to get the content of Bind variables
Hi

In the log file of my PostgreSQL cluster, I find:

>>
Statement: update t_shared_liste_valeurs set deletion_date=$1, deletion_login=$2, modification_date=$3, modification_login=$4, administrable=$5, libelle=$6, niveau=$7 where code=$8
<<

How can I get the content of the bind variables?

Thanks in advance
Best Regards

Didier ROS
Expertise SGBD
EDF - DTEO - DSIT - IT DMA
Département Solutions Groupe
Groupe Performance Applicative
32 avenue Pablo Picasso
92000 NANTERRE
[email protected]
Tel.: +33 6 49 51 11 88
RE: How to get the content of Bind variables
Hi

Thanks for the answer.

I have in my postgresql.conf:
log_min_duration_statement=0
and the content of the bind variables is not shown in the log file.

What can I do to get the content of the bind variables?

Best Regards
Didier ROS

From: [email protected] [mailto:[email protected]]
Sent: Thursday, 28 February 2019 13:37
To: ROS Didier
Cc: [email protected]
Subject: Re: How to get the content of Bind variables

If you set log_min_duration_statement low enough for your particular query, you will see another line below it showing what values are associated with each bind variable, like this:

2019-02-28 00:07:55 CST 2019-02-28 00:02:09 CST ihr2 10.86.42.184(43460) SELECT LOG: duration: 26078.308 ms execute <unnamed>: select pg_advisory_lock($1)
2019-02-28 00:07:55 CST 2019-02-28 00:02:09 CST ihr2 10.86.42.184(43460) SELECT DETAIL: parameters: $1 = '3428922050323511872'

Regards,
Michael Vitale

ROS Didier
Thursday, February 28, 2019 7:21 AM

> Hi
>
> In the log file of my PostgreSQL cluster, I find:
> >>
> Statement: update t_shared_liste_valeurs set deletion_date=$1, deletion_login=$2, modification_date=$3, modification_login=$4, administrable=$5, libelle=$6, niveau=$7 where code=$8
> <<
> How can I get the content of the bind variables?
>
> Thanks in advance
> Best Regards
> Didier ROS
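When the "DETAIL: parameters:" lines described by Michael are present, they can be paired with the preceding "execute" line by a small script. The following Python sketch assumes a plain-text log in which the DETAIL line immediately follows its execute line (lines from concurrent sessions can interleave in a real log, so grouping by PID first would be more robust). `pair_statements` and `parse_params` are hypothetical helper names, and the sample lines are the two from Michael's message.

```python
import re

# "execute <name>: <sql>", optionally preceded by "duration: N ms"
EXECUTE_RE = re.compile(r"LOG:\s+(?:duration: [\d.]+ ms\s+)?execute (\S+): (.*)$")
DETAIL_RE = re.compile(r"DETAIL:\s+parameters: (.*)$")
# Parameters are logged as $1 = 'value', $2 = NULL; quotes inside values are doubled.
PARAM_RE = re.compile(r"\$(\d+) = (NULL|'(?:[^']|'')*')")


def parse_params(text):
    """Turn "$1 = 'a', $2 = NULL" into {1: 'a', 2: None}."""
    params = {}
    for number, raw in PARAM_RE.findall(text):
        params[int(number)] = None if raw == "NULL" else raw[1:-1].replace("''", "'")
    return params


def pair_statements(lines):
    """Attach each 'DETAIL: parameters:' line to the execute line just before it."""
    results = []
    pending = None
    for line in lines:
        executed = EXECUTE_RE.search(line)
        if executed:
            pending = {"name": executed.group(1), "sql": executed.group(2), "params": {}}
            results.append(pending)
            continue
        detail = DETAIL_RE.search(line)
        if detail and pending is not None:
            pending["params"] = parse_params(detail.group(1))
        # Anything other than an execute line ends the pairing window.
        pending = None
    return results


sample = [
    "2019-02-28 00:07:55 CST 2019-02-28 00:02:09 CST ihr2 10.86.42.184(43460) SELECT LOG: duration: 26078.308 ms execute <unnamed>: select pg_advisory_lock($1)",
    "2019-02-28 00:07:55 CST 2019-02-28 00:02:09 CST ihr2 10.86.42.184(43460) SELECT DETAIL: parameters: $1 = '3428922050323511872'",
]
statements = pair_statements(sample)
print(statements)
```

If the log contains execute lines but no parameters are ever attached, that points to the DETAIL lines not being emitted at all, which is the situation discussed in the rest of this thread.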
RE: How to get the content of Bind variables
Hi Laurenz,

Here is a larger part of my log file:

>>
2019-02-27 14:41:28 CET [16239]: [5696-1] [10086] user=pgbd_preint_sg2,db=pgbd_preint_sg2,client=localhost.localdomainLOG: duration: 1.604 ms
2019-02-27 14:41:28 CET [16239]: [5697-1] [10086] user=pgbd_preint_sg2,db=pgbd_preint_sg2,client=localhost.localdomainLOG: duration: 0.084 ms parse <unnamed>: update t_shared_liste_valeurs set deletion_date=$1, deletion_login=$2, modification_date=$3, modification_login=$4, administrable=$5, libelle=$6, niveau=$7 where code=$8
2019-02-27 14:41:28 CET [16239]: [5698-1] [10086] user=pgbd_preint_sg2,db=pgbd_preint_sg2,client=localhost.localdomainLOG: plan:
2019-02-27 14:41:28 CET [16239]: [5699-1] [10086] user=pgbd_preint_sg2,db=pgbd_preint_sg2,client=localhost.localdomainSTATEMENT: update t_shared_liste_valeurs set deletion_date=$1, deletion_login=$2, modification_date=$3, modification_login=$4, administrable=$5, libelle=$6, niveau=$7 where code=$8
2019-02-27 14:41:28 CET [16239]: [5700-1] [10086] user=pgbd_preint_sg2,db=pgbd_preint_sg2,client=localhost.localdomainLOG: duration: 0.288 ms bind <unnamed>: update t_shared_liste_valeurs set deletion_date=$1, deletion_login=$2, modification_date=$3, modification_login=$4, administrable=$5, libelle=$6, niveau=$7 where code=$8
2019-02-27 14:41:28 CET [16239]: [5701-1] [10086] user=pgbd_preint_sg2,db=pgbd_preint_sg2,client=localhost.localdomainLOG: execute <unnamed>: update t_shared_liste_valeurs set deletion_date=$1, deletion_login=$2, modification_date=$3, modification_login=$4, administrable=$5, libelle=$6, niveau=$7 where code=$8
<<

The statement has been executed.
It is the same problem for all the statements: I cannot get the content of the bind variables.

Didier ROS

-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Thursday, 28 February 2019 17:01
To: ROS Didier ; [email protected]
Subject: Re: How to get the content of Bind variables

ROS Didier wrote:
> In the log file of my PostgreSQL cluster, I find :
> >>
> Statement: update t_shared_liste_valeurs set deletion_date=$1,
> deletion_login=$2, modification_date=$3, modification_login=$4,
> administrable=$5, libelle=$6, niveau=$7 where code=$8
> <<
>
> how to get the content of the bind variables ?

Can we see the whole log entry and the following one?

Perhaps there was a syntax error or similar, and the statement was never executed.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
RE: How to get the content of Bind variables
Hi

Here is the information:

postgres=# show log_error_verbosity ;
 log_error_verbosity
---------------------
 default
(1 row)

postgres=# show log_statement ;
 log_statement
---------------
 none
(1 row)

I am now trying to set log_statement as well:
log_statement=all
log_min_duration_statement=250

Didier ROS

-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Thursday, 28 February 2019 17:19
To: ROS Didier
Cc: [email protected]
Subject: Re: How to get the content of Bind variables

On Thu, Feb 28, 2019 at 12:21:56PM +0000, ROS Didier wrote:
> Statement: update t_shared_liste_valeurs set deletion_date=$1,
> deletion_login=$2, modification_date=$3, modification_login=$4,
> administrable=$5, libelle=$6, niveau=$7 where code=$8
>
> how to get the content of the bind variables ?

What is your setting of log_error_verbosity ?
https://www.postgresql.org/docs/current/runtime-config-logging.html#GUC-LOG-ERROR-VERBOSITY

Also, I recommend using CSV logs, since they're easier to import into the DB and then easier to parse.
https://www.postgresql.org/docs/current/runtime-config-logging.html#GUC-LOG-ERROR-VERBOSITY

Also, note that you can either set log_min_duration_statement=0, which logs all statement durations and the associated statements (if they haven't been previously logged). Or you can set log_statement=all, which logs all statements (but the duration is only logged according to log_min_duration_statement).

Justin
RE: How to get the content of Bind variables
Hi Tom,

Thanks a lot for your answer.

*) Here is information about my server:

[postgres@noeyypvd pg_log]$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.3 (Maipo)

postgres=# select version() ;
                                                 version
---------------------------------------------------------------------------------------------------------
 PostgreSQL 10.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16), 64-bit
(1 row)

*) It is very problematic that we cannot get the content of the bind variables. We cannot determine the root query behind the UPDATE statements that crash our production database.
What can explain the lack of information about bind variables?

*) Here are the parameter settings I use:

# postgresql.conf :
include_if_exists = '/appli/postgres/pgbd_prod_pda/10/conf/audit.conf'
log_rotation_size = 0
log_destination=stderr
logging_collector=true
client_min_messages=notice
log_min_messages=ERROR
log_min_error_statement=ERROR
log_min_duration_statement=250
debug_print_parse=off
debug_print_rewritten=off
debug_print_plan=on
debug_pretty_print=on
log_checkpoints=on
log_connections=on
log_disconnections=on
log_duration=on
log_error_verbosity=VERBOSE
log_hostname=on
log_lock_waits=on
deadlock_timeout=1s
log_statement=all
log_temp_files=0
log_autovacuum_min_duration = 0
track_activities=on
track_io_timing=on
track_functions=all
log_line_prefix = '%t [%p]: [%l-1] [%x] user=%u,db=%d,client=%h'
lc_messages ='C'
shared_preload_libraries = 'passwordcheck,pg_stat_statements,pgstattuple'
listen_addresses = '*'
pg_stat_statements.track=all
pg_stat_statements.max = 1000
pg_stat_statements.track_utility=on
pg_stat_statements.save=on

*) Suggestion: it would be nice to have the content of a query's bind variables available in a table of pg_catalog (cf. ORACLE).

Didier ROS

-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Friday, 1 March 2019 17:30
To: ROS Didier
Cc: [email protected]; [email protected]
Subject: Re: How to get the content of Bind variables

ROS Didier writes:
> postgres=# show log_error_verbosity ;
>  log_error_verbosity
> ---------------------
>  default
> (1 row)

So ... how old is this server? AFAIK the above should be enough to ensure you get the DETAIL lines with parameter values. But the ability to log those hasn't been there forever.

regards, tom lane
RE: How to get the content of Bind variables
Hi

The SQL is not executed from a trigger.

Here is an extract of my log file:

>>
2019-03-01 14:53:37 CET [24803]: [129-1] [3686] user=pgbd_preint_sg2,db=pgbd_preint_sg2 LOG: process 24803 still waiting for ShareLock on transaction 3711 after 1000.476 ms
2019-03-01 14:53:37 CET [24803]: [130-1] [3686] user=pgbd_preint_sg2,db=pgbd_preint_sg2 DETAIL: Process holding the lock: 24786. Wait queue: 24803.
2019-03-01 14:53:37 CET [24803]: [131-1] [3686] user=pgbd_preint_sg2,db=pgbd_preint_sg2 CONTEXT: while rechecking updated tuple (3,33) in relation "t_shared_liste_valeurs"
2019-03-01 14:53:37 CET [24803]: [132-1] [3686] user=pgbd_preint_sg2,db=pgbd_preint_sg2 STATEMENT: update t_shared_liste_valeurs set deletion_date=$1, deletion_login=$2, modification_date=$3, modification_login=$4, administrable=$5, libelle=$6, niveau=$7 where code=$8
<<

After a fresh db restart, the result is the same: no content of bind variables in the log file.

Best Regards
Didier ROS

-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Friday, 1 March 2019 21:42
To: [email protected]
Subject: RE: How to get the content of Bind variables

Hi Didier,

I imagine that this is the SQL executed from a trigger.
Could you provide the trigger's PL/pgSQL code, as well as the (anonymized) definitions of the source and target tables?

After a fresh db restart, are those logs the same for the first 6 executions and the following ones?

Regards
PAscal

--
Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html
RE: How to get the content of Bind variables
Hi

I have run grep on the entire log file for pid 24803. See the attached file.

NB: I have no DETAIL section in my entire log file. Is that normal?

Best Regards
Didier ROS

-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Saturday, 2 March 2019 16:57
To: ROS Didier
Cc: [email protected]; [email protected]; [email protected]
Subject: Re: How to get the content of Bind variables

On Fri, Mar 01, 2019 at 06:47:06PM +0000, ROS Didier wrote:
> log_line_prefix = '%t [%p]: [%l-1] [%x] user=%u,db=%d,client=%h'

On Sat, Mar 02, 2019 at 01:14:44PM +0000, ROS Didier wrote:
> 2019-03-01 14:53:37 CET [24803]: [129-1] [3686]
> user=pgbd_preint_sg2,db=pgbd_preint_sg2 LOG: process 24803 still
> waiting for ShareLock on transaction 3711 after 1000.476 ms
> 2019-03-01 14:53:37 CET [24803]: [130-1] [3686]
> user=pgbd_preint_sg2,db=pgbd_preint_sg2 DETAIL: Process holding the lock:
> 24786. Wait queue: 24803.
> 2019-03-01 14:53:37 CET [24803]: [131-1] [3686]
> user=pgbd_preint_sg2,db=pgbd_preint_sg2 CONTEXT: while rechecking updated
> tuple (3,33) in relation "t_shared_liste_valeurs"
> 2019-03-01 14:53:37 CET [24803]: [132-1] [3686]
> user=pgbd_preint_sg2,db=pgbd_preint_sg2 STATEMENT: update
> t_shared_liste_valeurs set deletion_date=$1, deletion_login=$2,
> modification_date=$3, modification_login=$4, administrable=$5,
> libelle=$6, niveau=$7 where code=$8

I just realized that your log is showing "STATEMENT: [...]", which I think means that it's using libpq PQexec (simple query protocol), which means it doesn't use or support bind parameters at all. If it were using PQexecParams (the "extended query" protocol, part of protocol 3.0), it would show "execute <unnamed>: [...]", with any bind params in DETAIL. And if you were using PQexecPrepared, it'd show "execute FOO: [...]" where FOO is the name of the statement "prepared" by PQprepare (plus bind params).

https://www.postgresql.org/docs/current/libpq-exec.html
https://www.postgresql.org/docs/current/protocol.html

What client application is this? It looks like it's going to set deletion_date to the literal string "$1" .. except that it's not quoted, so the statement will just cause an error. Am I wrong?

Could you grep the entire logfile for pid 24803 and post the output on dropbox or pastebin, or show 10 lines of context by email?

I've just used my messages and test cases on this patch as a reference to check that what I wrote above is accurate.
https://www.postgresql.org/message-id/flat/20190210015707.GQ31721%40telsasoft.com#037d17567f4c84a5f436960ef1ed8c49

On Fri, Mar 01, 2019 at 06:47:06PM +0000, ROS Didier wrote:
> *) -> suggestion : It would be nice to have the content of bind
> variable of a query in a table of pg_catalog. (cf ORACLE)

As I mentioned, you can set log_destination=csvlog,stderr and import them with COPY (and add indices and analysis and monitoring..).

It looks like DETAILs are being logged, so that's not the issue, but CSV also has the nice benefit of being easily imported to SQL, where escaping and linebreaks and similar are not confusing the issue, which I think can be the case for text logs.

Justin
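As a rough illustration of why the CSV log format recommended by Justin is easier to process, the sketch below reads csvlog rows with Python's csv module and keeps those whose detail field carries bind parameters. The column positions follow the csvlog column order documented for PostgreSQL 10 (process_id, error_severity, message, detail); treat them as an assumption and check the documentation for your release. `statements_with_parameters` and `make_row` are hypothetical helpers, and the sample rows are made up rather than taken from the poster's log.

```python
import csv
import io

# Zero-based positions in the PostgreSQL 10 csvlog layout (23 columns):
# 3 = process_id, 11 = error_severity, 13 = message, 14 = detail
COL_PID, COL_SEVERITY, COL_MESSAGE, COL_DETAIL = 3, 11, 13, 14
NUM_COLUMNS = 23


def statements_with_parameters(csv_text):
    """Return (pid, message, detail) for rows whose detail holds bind parameters."""
    found = []
    # csv.reader copes with quoted fields that contain commas or line breaks.
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) > COL_DETAIL and row[COL_DETAIL].startswith("parameters:"):
            found.append((row[COL_PID], row[COL_MESSAGE], row[COL_DETAIL]))
    return found


def make_row(pid, message, detail=""):
    """Build a made-up csvlog row; unused columns are left empty."""
    row = [""] * NUM_COLUMNS
    row[COL_PID] = pid
    row[COL_SEVERITY] = "LOG"
    row[COL_MESSAGE] = message
    row[COL_DETAIL] = detail
    return row


buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(make_row(
    "24803",
    "execute <unnamed>: update t set a=$1 where code=$2",
    "parameters: $1 = 'x', $2 = '42'",
))
writer.writerow(make_row("24803", "duration: 0.288 ms"))
sample_csv = buffer.getvalue()

found = statements_with_parameters(sample_csv)
print(found)
```

Because the message and detail arrive as separate fields, there is no need to guess where one log entry ends and the next begins, which is the ambiguity text logs can introduce.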
RE: How to get the content of Bind variables
Hi Sergei,

Thank you for your explanation. I can understand it for the lock wait message, but I have no DETAIL section in my entire log file. Why?
I have plenty of STATEMENT sections ...

Thanks in advance
Best Regards
Didier ROS

-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Saturday, 2 March 2019 17:34
To: ROS Didier ; [email protected]; [email protected]
Subject: Re: How to get the content of Bind variables

Hello

PostgreSQL does not log statement parameters for log_lock_waits messages, because this is not implemented:
https://github.com/postgres/postgres/blob/REL_10_STABLE/src/backend/storage/lmgr/proc.c#L1461
Compare with the errdetail_params routine in this file:
https://github.com/postgres/postgres/blob/REL_10_STABLE/src/backend/tcop/postgres.c#L1847

Currently query parameters can be logged only at the end of successful query execution.

regards, Sergei
