That's 5 votes for A and 2 votes for B so far. Neither of these options opposes the CEP, so I think we can probably start the vote, unless we want to wait longer for the poll.
On Mon, 12 Sept 2022 at 13:51, Benjamin Lerer <ble...@apache.org> wrote:

> A
>
> On Wed, 7 Sept 2022 at 17:02, Jeremiah D Jordan <jeremiah.jor...@gmail.com>
> wrote:
>
>> A
>>
>> On Sep 7, 2022, at 8:58 AM, Benedict <bened...@apache.org> wrote:
>>
>> Well, I am not convinced these changes will materially impact the
>> outcome, but at least we’ll have some extra fun collating the votes.
>>
>> On 7 Sep 2022, at 14:05, Andrés de la Peña <adelap...@apache.org> wrote:
>>
>> The poll makes sense to me. I would slightly change it to:
>>
>> A) We shouldn't prefer either approach, and I agree to the implementor
>> selecting the table schema approach for this CEP
>> B) We should prefer the view approach, but I am not opposed to the
>> implementor selecting the table schema approach for this CEP
>> C) We should NOT implement the table schema approach, and should
>> implement the view approach
>> D) We should NOT implement the table view approach, and should implement
>> the schema approach
>> E) We should NOT implement the table schema approach, and should
>> implement some other scheme (or not implement this feature)
>>
>> Where my vote is for A.
>>
>> On Wed, 7 Sept 2022 at 13:12, Benedict <bened...@apache.org> wrote:
>>
>>> I’m not convinced there’s been adequate resolution over which approach
>>> is adopted. I know you have expressed a preference for the table schema
>>> approach, but the weight of other opinion so far appears to be against this
>>> approach - even if it is broadly adopted by other databases. I will note
>>> that Postgres does not adopt this approach; it has a more sophisticated
>>> security label approach that has not been proposed by anybody so far.
>>>
>>> I think extra weight should be given to the implementer’s preference, so
>>> while I personally do not like the table schema approach, I am happy to
>>> accept this is an industry norm, and leave the decision to you.
>>>
>>> However, we should ensure the community as a whole endorses this. I
>>> think an indicative poll should be undertaken first, e.g.:
>>>
>>> A) We should implement the table schema approach, as proposed
>>> B) We should prefer the view approach, but I am not opposed to the
>>> implementor selecting the table schema approach for this CEP
>>> C) We should NOT implement the table schema approach, and should
>>> implement the view approach
>>> D) We should NOT implement the table schema approach, and should
>>> implement some other scheme (or not implement this feature)
>>>
>>> Where my vote is B
>>>
>>> On 7 Sep 2022, at 12:50, Andrés de la Peña <adelap...@apache.org> wrote:
>>>
>>> If nobody has more concerns regarding the CEP I will start the vote
>>> tomorrow.
>>>
>>> On Wed, 31 Aug 2022 at 13:18, Andrés de la Peña <adelap...@apache.org>
>>> wrote:
>>>
>>>>> Is there enough support here for VIEWS to be the implementation
>>>>> strategy for displaying masking functions?
>>>>
>>>> I'm not sure that views should be "the" strategy for masking functions.
>>>> We have multiple approaches here:
>>>>
>>>> 1) CQL functions only. Users can decide to use the masking functions of
>>>> their own accord. I think most dbs allow this pattern of usage, which is
>>>> quite straightforward. Obviously, it doesn't allow admins to enforce
>>>> that users see only masked data. Nevertheless, it's still useful for trusted
>>>> database users generating masked data that will be consumed by the end
>>>> users of the application.
>>>>
>>>> 2) Masking functions attached to specific columns. This way the same
>>>> queries will see different data (masked or not) depending on the
>>>> permissions of the user running the query. It has the advantage of not
>>>> requiring changes to the queries that users with different permissions run.
>>>> The downside is that users would need to query the schema if they need to
>>>> know whether a column is masked, unless we change the names of the returned
>>>> columns. This is the approach offered by Azure/SQL Server, PostgreSQL, IBM
>>>> Db2, Oracle, MariaDB/MaxScale and Snowflake. All these databases support
>>>> applying the masking function to columns on the base table, and some of
>>>> them also allow applying masking to views.
>>>>
>>>> 3) Masking functions as part of projected views. This way users might
>>>> need to query the view appropriate for their permissions instead of the
>>>> base table. This might mean changing the queries if the masking policy is
>>>> changed by the admin. MySQL recommends this approach in a blog entry,
>>>> although it's not part of its main documentation for data masking, and the
>>>> implementation has security issues. Some of the other databases offering
>>>> approach 2) as their main option also support masking on view columns.
>>>>
>>>> Each approach has its own advantages and limitations, and I don't think
>>>> we necessarily have to choose. The CEP proposes implementing 1) and 2), but
>>>> nothing prevents us from also having 3) if we get to have projected views.
>>>> However, I think that projected views are a new general-purpose feature with
>>>> their own complexities, so they would deserve their own CEP, if someone is
>>>> willing to work on the implementation.
>>>>
>>>> On Wed, 31 Aug 2022 at 12:03, Claude Warren via dev <
>>>> dev@cassandra.apache.org> wrote:
>>>>
>>>>> Is there enough support here for VIEWS to be the implementation
>>>>> strategy for displaying masking functions?
>>>>>
>>>>> It seems to me the view would have to store the query and apply a
>>>>> where clause to it, so the same PK would be in play.
>>>>>
>>>>> It has data leaking properties.
>>>>>
>>>>> It has more use cases as it can be used to
>>>>>
>>>>> - construct views that filter out sensitive columns
>>>>> - apply transforms to convert units of measure
>>>>>
>>>>> Are there more thoughts along this line?
>>>>>
>>>>
>>