On 23 Aug 2022, at 21:27, Andrés de la Peña
<adelap...@apache.org> wrote:
As mentioned in the CEP document, dynamic data masking
doesn't try to prevent malicious users with SELECT
permissions from indirectly guessing the real value
behind a masked value. This can easily be done by just
trying values in the WHERE clause of SELECT queries. DDM
would not be a replacement for proper column-level
permissions.
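To make the inference concrete, here is a minimal Python sketch of it. The in-memory table, the `select` helper and the `****` mask are hypothetical stand-ins for illustration only, not Cassandra code or the API proposed in the CEP. It assumes the WHERE filter is evaluated on the real value while the output is masked.

```python
# Hypothetical in-memory stand-in for a table with one masked column.
MASKED_COLUMNS = {"year_of_birth"}

ROWS = [
    {"id": 1, "year_of_birth": 1987},
    {"id": 2, "year_of_birth": 2002},
]


def select(where_column, where_value):
    """Filter on the real value, then mask the output.

    This models what a user with SELECT but without UNMASK would see,
    under the assumption that filtering on masked columns is allowed.
    """
    result = []
    for row in ROWS:
        if row[where_column] == where_value:
            masked = {
                col: ("****" if col in MASKED_COLUMNS else val)
                for col, val in row.items()
            }
            result.append(masked)
    return result


def guess_year(row_id, candidates):
    """Recover a masked value by trying each candidate in the filter."""
    for year in candidates:
        if any(row["id"] == row_id for row in select("year_of_birth", year)):
            return year
    return None


# The output is masked, yet the real value is recovered by probing.
recovered = guess_year(2, range(1900, 2023))
```

The smaller the domain of the masked column, the fewer queries this takes, which is why masking alone cannot stand in for column-level permissions.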
The data served by the database is usually consumed by
applications that present this data to end users. These
end users are not necessarily the users directly
connecting to the database. With DDM, it would be easy
for applications to mask sensitive data that is going to
be consumed by the end users. However, the users
directly connecting to the database should be trusted,
provided that they have the right SELECT permissions.
In other words, DDM doesn't directly protect the data,
but it eases the production of protected data.
That said, we could later go one step further and add a
way to prevent untrusted users from inferring the masked
data. That could be done by adding a new permission
required to use certain columns in WHERE clauses,
different from the current SELECT permission. That would
play especially well with column-level permissions,
which is something that we still have pending.
On Tue, 23 Aug 2022 at 19:13, Aaron Ploetz
<aaronplo...@gmail.com> wrote:
> Applying this should prevent querying on a
> field, else you could leak its contents, surely?
In theory, yes. Although I could see folks doing
something like this:
SELECT COUNT(*) FROM patients
WHERE year_of_birth = 2002
AND date_of_birth >= '2002-04-01'
AND date_of_birth < '2002-11-01';
In this case, the rows containing the masked key
column(s) could be filtered on without revealing the
actual data. But again, that's probably better for
a "phase 2" of the implementation.
> Agreed on not being a queryable field. That
> would also preclude secondary indexing, right?
Yes, that's my thought as well.
On Tue, Aug 23, 2022 at 12:42 PM Derek Chen-Becker
<de...@chen-becker.org> wrote:
Agreed on not being a queryable field. That
would also preclude secondary indexing, right?
On Tue, Aug 23, 2022 at 11:20 AM Benedict
<bened...@apache.org> wrote:
Applying this should prevent querying on a
field, else you could leak its contents,
surely? This pretty much prohibits using it
in a clustering key, and a partition key
with the ordered partitioner - but probably
also a hashed partitioner since we do not
use a cryptographic hash and the hash
function is well defined.
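A minimal Python sketch of why a well-defined, non-cryptographic hash does not hide the key. It assumes the attacker can observe the token of a row and knows the value comes from a small domain. `zlib.crc32` is only a stand-in here (Cassandra's default partitioner uses Murmur3, not CRC32), and the `token` and `recover` helpers are hypothetical names.

```python
import zlib


def token(value: str) -> int:
    # Stand-in for the partitioner hash. Any deterministic, publicly
    # known hash behaves the same way for the purpose of this sketch.
    return zlib.crc32(value.encode("utf-8"))


def recover(observed_token: int, candidates):
    """Return every candidate whose hash matches the observed token."""
    return [c for c in candidates if token(c) == observed_token]


# Small, enumerable domain: dates of birth over twenty years
# (days limited to 1..28 to keep the sketch short).
candidates = [
    f"{year:04d}-{month:02d}-{day:02d}"
    for year in range(1990, 2010)
    for month in range(1, 13)
    for day in range(1, 29)
]

observed = token("2002-04-17")  # all the attacker sees
matches = recover(observed, candidates)
```

With only a few thousand candidates the whole domain is hashed almost instantly, so the token narrows the key down to the real value and, at most, a handful of colliding candidates.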
We probably also need to ensure that any
ALLOW FILTERING queries on such a field are
disabled.
Plausibly the data could be
cryptographically jumbled before using it in
a primary key component (or permitting
filtering), but it is probably easier and
safer to exclude for now…
On 23 Aug 2022, at 18:13, Aaron Ploetz
<aaronplo...@gmail.com> wrote:
Some thoughts on this one:
In a prior job, we'd give app teams access
to a single keyspace, and two roles: a
read-write role and a read-only role. In
some cases, a "privileged" application role
was also requested. Depending on the
requirements, I could see the UNMASK
permission being applied to the RW or
privileged roles. But if there's a problem
on the table and the operators go in to
investigate, they will likely use a
SUPERUSER account, and they'll see that data.
How hard would it be for SUPERUSERs to
*not* automatically get the UNMASK permission?
I'll also echo the concerns around masking
primary key components. It's highly likely
that certain personal data properties would
be used as a partition or clustering key
(e.g., a range query for people born within a
certain timeframe). In addition to the
"breaks existing" concern, I'm curious
about the challenges around getting that to
work with the current primary key
implementation.
Does this first implementation only apply
to payload (non-key) columns? The examples
in the CEP currently do not show primary
key components being masked.
Thanks,
Aaron
On Tue, Aug 23, 2022 at 6:44 AM Henrik Ingo
<henrik.i...@datastax.com> wrote:
On Tue, Aug 23, 2022 at 1:10 PM Andrés
de la Peña <adelap...@apache.org> wrote:
One thought: The way the CEP is
currently written, it is only
possible to mask a column one
way. You can only define one
masking function for a column,
and since you use the original
column name, you could only
return one version of it in the
result set, even if you had a
way to define several functions.
Right, it's a single type of masking
per column, declared in
CREATE/ALTER TABLE statements.
Also, users can manually specify
their own masking function in
SELECT statements if they have
permissions for seeing the clear data.
For those cases where the data is
automatically masked for an
unprivileged user, I don't see the
use of including different types of
masking for the same column in
the same result set. Instead, we
might be interested in having
different types of masking
associated with different roles. We
could do so with dedicated
CREATE/DROP/LIST MASK statements,
instead of using the
CREATE/ALTER/DESCRIBE TABLE
statements. That CREATE MASK
statement would associate a masking
function with a column and a role.
However, I'm not sure we need that
type of granularity instead of the
simplicity of attaching the masking
to the column declaration. wdyt?
My gut feeling likewise is that this
adds complexity but little value.
--
Henrik Ingo