I agree with Joey, the kernel should also be able to take advantage of the
crypto acceleration.
I also want to add that, since JDK performance is a concern here, newer Intel
Ice Lake server platforms support VAES and SHA-NI, which further accelerate
AES-GCM perf by 2x and SHA1 perf by ~6x using JDK 11.
Here is some configuration information for the tests I ran:
- JDK version used was JDK 14 (JDK 11 should behave similarly).
- Since the tests were done before 4.0 GA'd, Cassandra version used was
4.0-beta3. Dataset size was ~500G.
- Workloads tested were 100% reads, 100% updates & an 80:20 mix with
cassandra-stress. I have not tested streaming yet.
I would be happy to provide additional data points or make necessary code
changes based on recommendations from folks here.
Thanks,
Shylaja
-----Original Message-----
From: Joshua McKenzie <[email protected]>
Sent: Friday, November 19, 2021 4:53 AM
To: [email protected]
Subject: Re: Resurrection of CASSANDRA-9633 - SSTable encryption
>
> setting performance requirements in this regard is nonsense. As long
> as it's reasonably usable in real world, and Cassandra makes the
> estimated effects on performance available, it will be up to the
> operators to decide whether to turn on the feature
I think Joey's argument, and correct me if I'm wrong, is that implementing a
complex feature in Cassandra that we then have to manage that's essentially
worse in every way compared to a built-in full-disk encryption option via
LUKS+LVM etc is a poor use of our time and energy.
i.e. we'd be better off investing our time into documenting how to do full disk
encryption in a variety of scenarios + explaining why that is our recommended
approach instead of taking the time and energy to design, implement, debug, and
then maintain an inferior solution.
On Fri, Nov 19, 2021 at 7:49 AM Joshua McKenzie <[email protected]>
wrote:
> Are you for real here?
>
> Please keep things cordial. Statements like this don't help move the
> conversation along.
>
>
> On Fri, Nov 19, 2021 at 3:57 AM Stefan Miklosovic <
> [email protected]> wrote:
>
>> On Fri, 19 Nov 2021 at 02:51, Joseph Lynch <[email protected]> wrote:
>> >
>> > On Thu, Nov 18, 2021 at 7:23 PM Kokoori, Shylaja <[email protected]> wrote:
>> >
>> > > To address Joey's concern, the OpenJDK JVM and its derivatives optimize
>> > > Java crypto based on the underlying HW capabilities. For example, if the
>> > > underlying HW supports AES-NI, JVM intrinsics will use those for crypto
>> > > operations. Likewise, the new vector AES available on the latest Intel
>> > > platform is utilized by the JVM while running on that platform to make
>> > > crypto operations faster.
>> > >
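One way to see what a given JVM reports about this is sketched below. It is only a sketch: the class name CryptoCapabilities is made up for illustration, the HotSpotDiagnosticMXBean lookup assumes a HotSpot-based JVM, and a flag value of "true" shows only what the JVM reports, not how fast the crypto path actually is.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import javax.crypto.Cipher;

// Hypothetical helper, not part of Cassandra or of any patch in this thread.
final class CryptoCapabilities {

    /** Name of the JCA provider that serves the given transformation. */
    static String providerFor(String transformation) throws Exception {
        return Cipher.getInstance(transformation).getProvider().getName();
    }

    /** Value of a HotSpot VM flag, or "unavailable" if it cannot be read. */
    static String vmOption(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            // Unknown flag, or a JVM that does not expose this bean.
            return "unavailable";
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("AES/GCM provider: " + providerFor("AES/GCM/NoPadding"));
        // UseAES and UseAESIntrinsics are HotSpot flags controlling AES acceleration.
        System.out.println("UseAES: " + vmOption("UseAES"));
        System.out.println("UseAESIntrinsics: " + vmOption("UseAESIntrinsics"));
    }
}
```

This only reports configuration. Whether the intrinsics deliver the expected speedup still has to be measured.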
>> >
>> > Which JDK version were you running? We have had a number of issues with the
>> > JVM being 2-10x slower than native crypto on Java 8 (especially MD5, SHA1,
>> > and AES-GCM) and to a lesser extent Java 11 (usually ~2x slower). Again I
>> > think we could get the JVM crypto penalty down to ~2x native if we linked
>> > in e.g. ACCP by default [1, 2] but even the very best Java crypto I've seen
>> > (fully utilizing hardware instructions) is still ~2x slower than native
>> > code. The operating system has a number of advantages here in that they
>> > don't pay JVM allocation costs or the JNI barrier (in the case of ACCP) and
>> > the kernel also takes advantage of hardware instructions.
>> >
>> >
>> > > From our internal experiments, we see single digit % regression
>> > > when transparent data encryption is enabled.
>> > >
>> >
>> > Which workloads are you testing and how are you measuring the regression? I
>> > suspect that compaction, repair (validation compaction), streaming, and
>> > quorum reads are probably much slower (probably ~10x slower for the
>> > throughput bound operations and ~2x slower on the read path). As
>> > compaction/repair/streaming usually take up between 10-20% of available CPU
>> > cycles, making them 2x slower might show up as <10% overall utilization
>> > increase when you've really regressed 100% or more on key metrics
>> > (compaction throughput, streaming throughput, memory allocation rate, etc
>> > ...). For example, if compaction was able to achieve 2 MiBps of throughput
>> > before encryption and it was only able to achieve 1 MiBps of throughput
>> > afterwards, that would be a huge real world impact to operators as
>> > compactions now take twice as long.
>> >
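The 2 MiBps to 1 MiBps example can be worked through as below. This is a minimal sketch with a made-up class name (CompactionRegression). It assumes the amount of data to compact stays fixed, and it reads "regressed 100%" as the increase in duration, which is one possible reading.

```java
// Hypothetical helper for the arithmetic only; the figures come from the example above.
final class CompactionRegression {

    /** Percent of throughput lost going from before to after (same unit for both). */
    static double throughputDropPercent(double before, double after) {
        return (before - after) / before * 100.0;
    }

    /** Percent longer a fixed amount of work takes at the lower throughput. */
    static double durationIncreasePercent(double before, double after) {
        return (before / after - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        double before = 2.0; // MiBps before encryption
        double after = 1.0;  // MiBps with encryption
        System.out.printf("throughput drop: %.0f%%%n", throughputDropPercent(before, after));
        System.out.printf("duration increase: %.0f%%%n", durationIncreasePercent(before, after));
    }
}
```

Halving throughput is a 50% drop in MiBps but a 100% increase in how long each compaction takes, which is the number operators feel.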
>> > I think a CEP or details on the ticket that indicate the performance tests
>> > and workloads that will be run might be wise? Perhaps something like
>> > "encryption creates no more than a 1% regression of: compaction throughput
>> > (MiBps), streaming throughput (MiBps), repair validation throughput
>> > (duration of full repair on the entire cluster), read throughput at 10ms
>> > p99 tail at quorum consistency (QPS handled while not exceeding P99 SLO of
>> > 10ms), etc ... while a sustained load is applied to a multi-node cluster"?
>>
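A criterion of this shape could be checked mechanically along the lines below. This is a rough sketch: the class name RegressionGate is hypothetical, the numbers in main are placeholders and not measurements from anyone in this thread, and it covers only "higher is better" throughput metrics, so duration-based metrics such as full repair time would need the comparison inverted.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical gate; not part of Cassandra or of CASSANDRA-9633.
final class RegressionGate {

    /**
     * Returns the names of throughput metrics whose drop from baseline exceeds
     * budgetPercent. Each map value is {baseline, withEncryption}.
     */
    static List<String> violations(Map<String, double[]> metrics, double budgetPercent) {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, double[]> e : metrics.entrySet()) {
            double baseline = e.getValue()[0];
            double encrypted = e.getValue()[1];
            double dropPercent = (baseline - encrypted) / baseline * 100.0;
            if (dropPercent > budgetPercent) {
                failed.add(e.getKey());
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Placeholder values purely for illustration.
        Map<String, double[]> metrics = new LinkedHashMap<>();
        metrics.put("compaction MiBps", new double[] {100.0, 99.5});
        metrics.put("streaming MiBps", new double[] {100.0, 90.0});
        System.out.println("over budget: " + violations(metrics, 1.0));
    }
}
```

The budget is a parameter, so the same check works whether the agreed threshold ends up at 1% or somewhere looser.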
>> Are you for real here? Nobody will ever guarantee you these 1% numbers
>> ... come on. I think we are super paranoid about performance when we
>> are not paranoid enough about security. This is a two way street.
>> People are willing to give up on performance if security is a must.
>> You do not need to use it if you do not want to, it is not like we
>> are going to turn it on and you have to stick with that. Are you just
>> saying that we are going to protect people from using some security
>> features because their db might be slow? What if they just don't care?
>>
>> > Even a microbenchmark that just sees how long it takes to encrypt and
>> > decrypt a 500MiB dataset using the proposed JVM implementation versus
>> > encrypting it with a native implementation might be enough to confirm/deny.
>> > For example, keypipe (C, [3]) achieves around 2.8 GiBps symmetric of
>> > AES-GCM and age (golang, ChaCha20-Poly1305, [4]) achieves about 1.6 GiBps
>> > encryption and 1.0 GiBps decryption; from my past experiences, Java crypto
>> > would achieve maybe 200 MiBps of _non-authenticated_ AES.
>> >
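A JVM-side microbenchmark of this kind might look roughly like the sketch below. It is an illustration under stated assumptions, not the proposed Cassandra implementation: it uses the stock JCA "AES/GCM/NoPadding" cipher, and the class name AesGcmThroughput, the 1 MiB chunk size, the AES-128 key and the counter-based IV scheme are all choices made here for the example. It is also not a rigorous harness; JMH would give more trustworthy numbers.

```java
import java.nio.ByteBuffer;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical benchmark class; names and sizes are assumptions for illustration.
final class AesGcmThroughput {
    static final int CHUNK = 1 << 20; // 1 MiB per encrypt call (arbitrary)
    static final int TAG_BITS = 128;  // GCM authentication tag length

    /** 12-byte IV derived from a counter, so no IV repeats under one key. */
    static byte[] iv(long counter) {
        return ByteBuffer.allocate(12).putLong(4, counter).array();
    }

    /** Encrypts or decrypts one buffer; mode is Cipher.ENCRYPT_MODE or DECRYPT_MODE. */
    static byte[] crypt(int mode, SecretKey key, long counter, byte[] in) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(mode, key, new GCMParameterSpec(TAG_BITS, iv(counter)));
        return cipher.doFinal(in);
    }

    static SecretKey newKey() throws Exception {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        return generator.generateKey();
    }

    /** Encrypts totalMiB chunks of random data and returns the rate in MiB/s. */
    static double encryptMiBps(SecretKey key, int totalMiB) throws Exception {
        byte[] plain = new byte[CHUNK];
        new SecureRandom().nextBytes(plain);
        long bytesOut = 0; // consumed below so the work cannot be optimized away
        long start = System.nanoTime();
        for (int i = 0; i < totalMiB; i++) {
            bytesOut += crypt(Cipher.ENCRYPT_MODE, key, i, plain).length;
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        if (bytesOut == 0) {
            throw new IllegalStateException("no ciphertext produced");
        }
        return totalMiB / seconds;
    }

    public static void main(String[] args) throws Exception {
        int totalMiB = args.length > 0 ? Integer.parseInt(args[0]) : 64;
        // Warm up with a separate key so the measured run never reuses a key/IV pair.
        encryptMiBps(newKey(), 16);
        System.out.printf("AES-GCM encrypt: %.0f MiB/s%n", encryptMiBps(newKey(), totalMiB));
    }
}
```

Passing 500 as the first argument matches the 500MiB dataset size suggested above. Decryption can be timed the same way by calling crypt with Cipher.DECRYPT_MODE on the stored ciphertext and the same counter.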
>> > Cheers,
>> > -Joey
>> >
>> > [1] https://issues.apache.org/jira/browse/CASSANDRA-15294
>> > [2] https://github.com/corretto/amazon-corretto-crypto-provider
>> > [3] https://github.com/hashbrowncipher/keypipe#encryption
>> > [4] https://github.com/FiloSottile/age
>>