$ dd if=/dev/zero bs=4096 count=$((4*1024*1024)) status=none \
    | time /usr/lib/jvm/java-11-openjdk/bin/java -jar aes-bench.jar >/dev/null
36.24s user 5.96s system 100% cpu 41.912 total

$ dd if=/dev/zero bs=4096 count=$((4*1024*1024)) status=none \
    | time openssl enc -aes-256-cbc -e \
        -K "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef" \
        -iv "0123456789abcdef0123456789abcdef" >/dev/null
31.09s user 3.92s system 99% cpu 35.043 total
This is not an accurate test of AES performance, since the Java run also includes JVM start-up time and the key and IV generation. But it gives us a pretty good idea that the total performance regression is definitely far from the 2x to 10x slowdown claimed in some previous emails.
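One variable worth pinning down before comparing such numbers is which JCA provider is actually serving the cipher (the intrinsics discussed later in this thread only apply to the provider the JVM actually picks). A quick check using only stock JDK APIs, as a sketch (the class name is mine):

```java
import java.security.Provider;
import java.security.Security;
import javax.crypto.Cipher;

public class ProviderCheck {
    /** Name of the provider the JCA would pick for this transformation. */
    static String providerFor(String transformation) throws Exception {
        return Cipher.getInstance(transformation).getProvider().getName();
    }

    public static void main(String[] args) throws Exception {
        // On a stock OpenJDK this is typically SunJCE; with something like
        // ACCP installed as the highest-priority provider it would differ.
        System.out.println("AES/CBC/PKCS5Padding -> "
                + providerFor("AES/CBC/PKCS5Padding"));
        // Full provider search order, highest priority first.
        for (Provider p : Security.getProviders()) {
            System.out.println("  " + p.getName());
        }
    }
}
```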
The Java code I used:
package com.example.AesBenchmark;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.security.SecureRandom;
import java.security.Security;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class AesBenchmark {
    static {
        try {
            Security.setProperty("crypto.policy", "unlimited");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    static final int BUF_LEN = 4096;

    public static void main(String[] args) throws Exception {
        // Generate a 256-bit AES key
        KeyGenerator keyGenerator = KeyGenerator.getInstance("AES");
        keyGenerator.init(256);
        SecretKey key = keyGenerator.generateKey();

        // Generate a random 16-byte IV
        byte[] iv = new byte[16];
        SecureRandom random = new SecureRandom();
        random.nextBytes(iv);

        // Initialize the cipher for encryption
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        SecretKeySpec keySpec = new SecretKeySpec(key.getEncoded(), "AES");
        IvParameterSpec ivSpec = new IvParameterSpec(iv);
        cipher.init(Cipher.ENCRYPT_MODE, keySpec, ivSpec);

        // Encrypt stdin to stdout in 4 KiB chunks
        byte[] bufInput = new byte[BUF_LEN];
        FileInputStream fis = new FileInputStream(new File("/dev/stdin"));
        FileOutputStream fos = new FileOutputStream(new File("/dev/stdout"));
        int nBytes;
        while ((nBytes = fis.read(bufInput, 0, BUF_LEN)) != -1) {
            fos.write(cipher.update(bufInput, 0, nBytes));
        }
        fos.write(cipher.doFinal());
    }
}
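A variant that excludes JVM start-up and key generation from the measurement could time only the cipher loop from inside the program. The sketch below (my own, not part of the benchmark above) encrypts an in-memory zero buffer with the same 4 KiB chunk size and reports throughput:

```java
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

public class AesTimedBench {
    /** Encrypt totalBytes of zeros in 4 KiB chunks; return MiB/s. */
    static double encryptZeros(long totalBytes) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(256);
        SecretKey key = kg.generateKey();
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));

        byte[] buf = new byte[4096];
        long start = System.nanoTime(); // key/IV setup is outside the timed region
        for (long done = 0; done < totalBytes; done += buf.length) {
            cipher.update(buf, 0, buf.length); // ciphertext discarded
        }
        cipher.doFinal();
        double secs = (System.nanoTime() - start) / 1e9;
        return (totalBytes / (1024.0 * 1024.0)) / secs;
    }

    public static void main(String[] args) throws Exception {
        // 256 MiB is an arbitrary size chosen here for a quick run
        System.out.printf("%.1f MiB/s%n", encryptZeros(256L * 1024 * 1024));
    }
}
```

This still measures allocation-heavy `update()` calls, which is arguably fair since that is what the streaming benchmark above pays too.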
On 19/11/2021 13:28, Jeff Jirsa wrote:
For better or worse, different threat models mean that it's not strictly better to do FDE, and some use cases definitely want this at the db layer instead of the file system.

On Nov 19, 2021, at 12:54 PM, Joshua McKenzie <[email protected]> wrote:

> setting performance requirements on this regard is a nonsense. As long as
> it's reasonably usable in real world, and Cassandra makes the estimated
> effects on performance available, it will be up to the operators to
> decide whether to turn on the feature

I think Joey's argument, and correct me if I'm wrong, is that implementing a complex feature in Cassandra that we then have to manage, and that is essentially worse in every way than a built-in full-disk encryption option via LUKS+LVM etc., is a poor use of our time and energy. i.e. we'd be better off investing our time into documenting how to do full disk encryption in a variety of scenarios + explaining why that is our recommended approach, instead of taking the time and energy to design, implement, debug, and then maintain an inferior solution.

On Fri, Nov 19, 2021 at 7:49 AM Joshua McKenzie <[email protected]> wrote:

> Are you for real here?

Please keep things cordial. Statements like this don't help move the conversation along.

On Fri, Nov 19, 2021 at 3:57 AM Stefan Miklosovic <[email protected]> wrote:

On Fri, 19 Nov 2021 at 02:51, Joseph Lynch <[email protected]> wrote:

On Thu, Nov 18, 2021 at 7:23 PM Kokoori, Shylaja <[email protected]> wrote:

> To address Joey's concern, the OpenJDK JVM and its derivatives optimize
> Java crypto based on the underlying HW capabilities. For example, if the
> underlying HW supports AES-NI, JVM intrinsics will use those for crypto
> operations. Likewise, the new vector AES available on the latest Intel
> platform is utilized by the JVM while running on that platform to make
> crypto operations faster.

Which JDK version were you running?
We have had a number of issues with the JVM being 2-10x slower than native crypto on Java 8 (especially MD5, SHA1, and AES-GCM) and to a lesser extent Java 11 (usually ~2x slower). Again I think we could get the JVM crypto penalty down to ~2x native if we linked in e.g. ACCP by default [1, 2], but even the very best Java crypto I've seen (fully utilizing hardware instructions) is still ~2x slower than native code. The operating system has a number of advantages here in that it doesn't pay JVM allocation costs or the JNI barrier (in the case of ACCP), and the kernel also takes advantage of hardware instructions.

> From our internal experiments, we see single digit % regression when
> transparent data encryption is enabled.

Which workloads are you testing and how are you measuring the regression? I suspect that compaction, repair (validation compaction), streaming, and quorum reads are probably much slower (probably ~10x slower for the throughput-bound operations and ~2x slower on the read path). As compaction/repair/streaming usually take up between 10-20% of available CPU cycles, making them 2x slower might show up as <10% overall utilization increase when you've really regressed 100% or more on key metrics (compaction throughput, streaming throughput, memory allocation rate, etc...). For example, if compaction was able to achieve 2 MiBps of throughput before encryption and it was only able to achieve 1 MiBps of throughput afterwards, that would be a huge real-world impact to operators, as compactions now take twice as long.

I think a CEP or details on the ticket that indicate the performance tests and workloads that will be run might be wise? Perhaps something like "encryption creates no more than a 1% regression of: compaction throughput (MiBps), streaming throughput (MiBps), repair validation throughput (duration of full repair on the entire cluster), read throughput at 10ms p99 tail at quorum consistency (QPS handled while not exceeding P99 SLO of 10ms), etc ...
while a sustained load is applied to a multi-node cluster"?

Are you for real here? Nobody will ever guarantee you these 1% numbers ... come on. I think we are super paranoid about performance when we are not paranoid enough about security. This is a two-way street. People are willing to give up on performance if security is a must. You do not need to use it if you do not want to; it is not like we are going to turn it on and you have to stick with that. Are you just saying that we are going to protect people from using some security features because their db might be slow? What if they just don't care?

Even a microbenchmark that just sees how long it takes to encrypt and decrypt a 500 MiB dataset using the proposed JVM implementation versus encrypting it with a native implementation might be enough to confirm/deny. For example, keypipe (C, [3]) achieves around 2.8 GiBps symmetric of AES-GCM, and age (golang, ChaCha20-Poly1305, [4]) achieves about 1.6 GiBps encryption and 1.0 GiBps decryption; from my past experience with Java crypto, it would achieve maybe 200 MiBps of _non-authenticated_ AES.

Cheers,
-Joey

[1] https://issues.apache.org/jira/browse/CASSANDRA-15294
[2] https://github.com/corretto/amazon-corretto-crypto-provider
[3] https://github.com/hashbrowncipher/keypipe#encryption
[4] https://github.com/FiloSottile/age

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
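For what it's worth, the JVM side of the 500 MiB microbenchmark Joey suggests could be sketched with plain JDK APIs roughly as below (my own hypothetical harness, authenticated AES-256-GCM; the native baseline such as keypipe, age, or openssl would be timed separately):

```java
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class GcmMicrobench {
    /** Encrypt then decrypt data with AES-256-GCM; return {encMiBps, decMiBps}. */
    static double[] roundTrip(byte[] data) throws Exception {
        SecureRandom rnd = new SecureRandom();
        byte[] keyBytes = new byte[32];
        byte[] iv = new byte[12]; // 96-bit nonce, the conventional GCM size
        rnd.nextBytes(keyBytes);
        rnd.nextBytes(iv);
        SecretKeySpec key = new SecretKeySpec(keyBytes, "AES");

        Cipher enc = Cipher.getInstance("AES/GCM/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        long t0 = System.nanoTime();
        byte[] ct = enc.doFinal(data); // includes the 16-byte auth tag
        long t1 = System.nanoTime();

        Cipher dec = Cipher.getInstance("AES/GCM/NoPadding");
        dec.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] pt = dec.doFinal(ct); // verifies the tag as well
        long t2 = System.nanoTime();

        if (pt.length != data.length) throw new AssertionError("round trip failed");
        double mib = data.length / (1024.0 * 1024.0);
        return new double[] { mib / ((t1 - t0) / 1e9), mib / ((t2 - t1) / 1e9) };
    }

    public static void main(String[] args) throws Exception {
        // 64 MiB here for a quick run; the thread proposes 500 MiB
        double[] r = roundTrip(new byte[64 * 1024 * 1024]);
        System.out.printf("enc %.0f MiB/s, dec %.0f MiB/s%n", r[0], r[1]);
    }
}
```

A single `doFinal` on a large array is deliberately the JVM's best case (no chunking overhead); real workloads would stream in smaller buffers.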
