Why the index is not used ?
Hi
I would like to submit the following problem to the PostgreSQL community. In my
company, we have data encryption needs.
So I decided to use the following procedure :
(1)Creating a table with a bytea type column to store the encrypted data
CREATE TABLE cartedecredit(card_id SERIAL PRIMARY KEY, username VARCHAR(100),
cc bytea);
(2)inserting encrypted data
INSERT INTO cartedecredit(username,cc) SELECT 'individu ' || x.id,
pgp_sym_encrypt('test value ' || x.id, 'motdepasse','compress-algo=2,
cipher-algo=aes256') FROM generate_series(1,10) AS x(id);
(3)Querying the table
SELECT pgp_sym_decrypt(cc, 'motdepasse') FROM cartedecredit WHERE
pgp_sym_decrypt(cc, 'motdepasse')='test value 32';
pgp_sym_decrypt
-
test value 32
(1 row)
Time: 115735.035 ms (01:55.735)
-> the execution time is very long. So, I decide to create an index
(4)Creating an index on encrypted data
CREATE INDEX idx_cartedecredit_cc02 ON cartedecredit(cc);
(5)Querying the table again
SELECT pgp_sym_decrypt(cc, 'motdepasse') FROM cartedecredit WHERE
pgp_sym_decrypt(cc, 'motdepasse')='test value 32';
pgp_sym_decrypt
-
test value 32
(1 row)
Time: 118558.485 ms (01:58.558) -> almost 2 minutes !!
postgres=# explain analyze SELECT pgp_sym_decrypt(cc, 'motdepasse') FROM
cartedecredit WHERE pgp_sym_decrypt(cc, 'motdepasse')='test value 32';
QUERY PLAN
--
Seq Scan on cartedecredit (cost=0.00..3647.25 rows=500 width=32) (actual
time=60711.787..102920.509 rows=1 loops=1)
Filter: (pgp_sym_decrypt(cc, 'motdepasse'::text) = 'test value 32'::text)
Rows Removed by Filter: 9
Planning time: 0.112 ms
Execution time: 102920.585 ms
(5 rows)
? the index is not used in the execution plan. maybe because of the use of a
function in the WHERE clause. I decide to modify the SQL query
(6)Querying the table
SELECT pgp_sym_decrypt(cc, 'motdepasse') FROM cartedecredit WHERE
cc=pgp_sym_encrypt('test value 32', 'motdepasse');
pgp_sym_decrypt
-
(0 rows)
Time: 52659.571 ms (00:52.660)
? The execution time is very long and I get no result (!?)
QUERY PLAN
---
Seq Scan on cartedecredit (cost=0.00..3646.00 rows=1 width=32) (actual
time=61219.989..61219.989 rows=0 loops=1)
Filter: (cc = pgp_sym_encrypt('test value 32'::text, 'motdepasse'::text))
Rows Removed by Filter: 10
Planning time: 0.157 ms
Execution time: 61220.035 ms
(5 rows)
? My index is not used.
QUESTIONS :
- why I get no result ?
-why the index is not used?
Thanks in advance
Best Regards
Didier
[cid:[email protected]]
Didier ROS
Expertise SGBD
DS IT/IT DMA/Solutions Groupe EDF/Expertise Applicative - SGBD
Ce message et toutes les pièces jointes (ci-après le 'Message') sont établis à
l'intention exclusive des destinataires et les informations qui y figurent sont
strictement confidentielles. Toute utilisation de ce Message non conforme à sa
destination, toute diffusion ou toute publication totale ou partielle, est
interdite sauf autorisation expresse.
Si vous n'êtes pas le destinataire de ce Message, il vous est interdit de le
copier, de le faire suivre, de le divulguer ou d'en utiliser tout ou partie. Si
vous avez reçu ce Message par erreur, merci de le supprimer de votre système,
ainsi que toutes ses copies, et de n'en garder aucune trace sur quelque support
que ce soit. Nous vous remercions également d'en avertir immédiatement
l'expéditeur par retour du message.
Il est impossible de garantir que les communications par messagerie
électronique arrivent en temps utile, sont sécurisées ou dénuées de toute
erreur ou virus.
This message and any attachments (the 'Message') are intended solely for the
addressees. The information contained in this Message is confidential. Any use
of information contained in this Message not in accord with its purpose, any
dissemination or disclosure, either whole or partial, is prohibited except
formal approval.
If you are not the addressee, you may not copy, forward, disclose or use any
part of it. If you have received this message in error, please delete it and
all copies from your system and notify the sender immediately by return message.
E-mail communication cannot be guaranteed to be timely secure, error or
virus-free.
Re: Why the index is not used ?
so 6. 10. 2018 v 11:57 odesílatel ROS Didier napsal:
> Hi
>
> I would like to submit the following problem to the PostgreSQL community.
> In my company, we have data encryption needs.
> So I decided to use the following procedure :
>
>
>
> (1)Creating a table with a bytea type column to store the encrypted
> data
> CREATE TABLE cartedecredit(card_id SERIAL PRIMARY KEY, username
> VARCHAR(100), cc bytea);
>
>
>
> (2)inserting encrypted data
> INSERT INTO cartedecredit(username,cc) SELECT 'individu ' || x.id,
> pgp_sym_encrypt('test value ' || x.id, 'motdepasse','compress-algo=2,
> cipher-algo=aes256') FROM generate_series(1,10) AS x(id);
>
>
>
> (3)Querying the table
> SELECT pgp_sym_decrypt(cc, 'motdepasse') FROM cartedecredit WHERE
> pgp_sym_decrypt(cc, 'motdepasse')='test value 32';
>
> pgp_sym_decrypt
>
> -
>
> test value 32
>
> (1 row)
>
>
>
> Time: 115735.035 ms (01:55.735)
> -> the execution time is very long. So, I decide to create an index
>
>
>
> (4)Creating an index on encrypted data
> CREATE INDEX idx_cartedecredit_cc02 ON cartedecredit(cc);
>
this index cannot to help.
but functional index can cartedecredit(pgp_sym_decrypt(cc, 'motdepasse').
Unfortunately index file will be decrypted in this case.
CREATE INDEX ON
>
>
> (5)Querying the table again
>
> SELECT pgp_sym_decrypt(cc, 'motdepasse') FROM cartedecredit WHERE
> pgp_sym_decrypt(cc, 'motdepasse')='test value 32';
> pgp_sym_decrypt
>
> -
>
> test value 32
>
> (1 row)
>
>
>
> Time: 118558.485 ms (01:58.558) -> almost 2 minutes !!
> postgres=# explain analyze SELECT pgp_sym_decrypt(cc, 'motdepasse') FROM
> cartedecredit WHERE pgp_sym_decrypt(cc, 'motdepasse')='test value 32';
>
> QUERY PLAN
>
>
> --
>
> Seq Scan on cartedecredit (cost=0.00..3647.25 rows=500 width=32) (actual
> time=60711.787..102920.509 rows=1 loops=1)
>
>Filter: (pgp_sym_decrypt(cc, 'motdepasse'::text) = 'test value
> 32'::text)
>
>Rows Removed by Filter: 9
>
> Planning time: 0.112 ms
>
> Execution time: 102920.585 ms
>
> (5 rows)
>
>
>
> è the index is not used in the execution plan. maybe because of the use
> of a function in the WHERE clause. I decide to modify the SQL query
>
>
>
> (6)Querying the table
> SELECT pgp_sym_decrypt(cc, 'motdepasse') FROM cartedecredit WHERE
> *cc*=pgp_sym_encrypt('test
> value 32', 'motdepasse');
>
it is strange - this should to use index, when there is usual index over cc
column.
What is result of explain analyze when you penalize seq scan by
set enable_seqscan to off
> pgp_sym_decrypt
>
> -
>
> (0 rows)
>
>
>
> Time: 52659.571 ms (00:52.660)
>
> è The execution time is very long and I get no result (!?)
>
> QUERY PLAN
>
>
> ---
>
> Seq Scan on cartedecredit (cost=0.00..3646.00 rows=1 width=32) (actual
> time=61219.989..61219.989 rows=0 loops=1)
>
>Filter: (cc = pgp_sym_encrypt('test value 32'::text,
> 'motdepasse'::text))
>
>Rows Removed by Filter: 10
>
> Planning time: 0.157 ms
>
> Execution time: 61220.035 ms
>
> (5 rows)
>
>
>
> è My index is not used.
>
>
> QUESTIONS :
> - why I get no result ?
>
> -why the index is not used?
>
> Thanks in advance
>
>
>
> Best Regards
> Didier
>
>
>
>
>
> [image: cid:[email protected]]
>
>
> * Didier ROS*
> * Expertise SGBD*
>
>
> *DS IT/IT DMA/Solutions Groupe EDF/Expertise Applicative - SGBD *
>
>
>
>
>
>
> Ce message et toutes les pièces jointes (ci-après le 'Message') sont
> établis à l'intention exclusive des destinataires et les informations qui y
> figurent sont strictement confidentielles. Toute utilisation de ce Message
> non conforme à sa destination, toute diffusion ou toute publication totale
> ou partielle, est interdite sauf autorisation expresse.
>
> Si vous n'êtes pas le destinataire de ce Message, il vous est interdit de
> le copier, de le faire suivre, de le divulguer ou d'en utiliser tout ou
> partie. Si vous avez reçu ce Message par erreur, merci de le supprimer de
> votre système, ainsi que toutes ses copies, et de n'en garder aucune trace
> sur quelque support que ce soit. Nous vous remercions également d'en
> avertir immédiatement l'expéditeur par retour du message.
>
> Il est impossible de garantir que les communications par messagerie
> électronique arrivent en temps utile, sont sécurisées ou dénuées de toute
> erreur ou virus.
>
>
> This message and any attachments (the 'Message') are intended solely for
> the addressees. The information contained in this Message is confidential.
> Any use of information contained in this Message not in accord with its
> pur
Re: Why the index is not used ?
Hello Didier, (3), (5) to find the match, you decrypt the whole table, apparently this take quite a long time. Index cannot help here because indexes work on exact match of type and value, but you compare mapped value, not indexed. Functional index should help, but like it was said, it against the idea of encrypted storage. (6) I never used pgp_sym_encrypt() but I see that in INSERT INTO you supplied additional parameter 'compress-algo=2, cipher-algo=aes256' while in (6) you did not. Probably this is the reason. In general matching indexed bytea column should use index, you can ensure in this populating the column unencrypted and using 'test value 32'::bytea for match. In you case I believe pgp_sym_encrypt() is not marked as STABLE or IMMUTABLE that's why it will be evaluated for each row (very inefficient) and cannot use index. From documentation: "Since an index scan will evaluate the comparison value only once, not once at each row, it is not valid to use a VOLATILE function in an index scan condition." https://www.postgresql.org/docs/10/static/xfunc-volatility.html If you cannot add STABLE/IMMUTABLE to pgp_sym_encrypt() (which apparently should be there), you can encrypt searched value as a separate operation and then search in the table using basic value match. Vlad
Re: Why the index is not used ?
I haven’t looked up what pgp_sym_encrypt() does but assuming it does encryption
the way you should be for credit card data then it will be using a random salt
and the same input value won’t encrypt to the same output value so
WHERE cc=pgp_sym_encrypt('test value 32', 'motdepasse');
wouldn’t work because the value generated by the function when you are
searching on isn’t the same value as when you stored it.
Paul
> On 6 Oct 2018, at 19:57, ROS Didier wrote:
>
> WHERE cc=pgp_sym_encrypt('test value 32', 'motdepasse');
