HDFS*

On Wed, 15 Mar, 2023, 2:56 pm Ishan Chattopadhyaya, <ichattopadhy...@gmail.com> wrote:

> Btw, +1 to the initiative. I've heard of clients using encrypted HDFC for
> these use cases. Direct support at the Lucene/Solr level is much better.
>
> On Wed, 15 Mar, 2023, 2:52 pm Ishan Chattopadhyaya, <ichattopadhy...@gmail.com> wrote:
>
>> Does it need to be a first party project?
>>
>> On Wed, 15 Mar, 2023, 2:46 pm Bruno Roustant, <broust...@apache.org> wrote:
>>
>>> Hi,
>>>
>>> I pushed a PR <https://github.com/apache/solr-sandbox/pull/51> in
>>> solr-sandbox <https://github.com/apache/solr-sandbox> to propose a
>>> Java-level encryption for Solr.
>>> This work is the follow-up of LUCENE-9379
>>> <https://issues.apache.org/jira/projects/LUCENE/issues/LUCENE-9379>.
>>>
>>> To give some details, here is the overview section of the ENCRYPTION.md
>>> <https://github.com/apache/solr-sandbox/blob/e422e3dd4febab54ba9a8d965189b38217552b46/ENCRYPTION.md>
>>> file in this PR:
>>>
>>> This solution provides the encryption of the Lucene index files at the
>>> Java level.
>>> It encrypts all (or some of) the files in a given index with a provided
>>> encryption key.
>>> It stores the id of the encryption key in the commit metadata (and
>>> obviously the key secret is never stored). It is possible to define a
>>> different key per Solr Core.
>>> This module also provides an EncryptionRequestHandler so that a client
>>> can trigger the (re)encryption of a Solr Core index. The (re)encryption
>>> is done concurrently while the Solr Core can continue to serve update
>>> and query requests.
>>>
>>> Comparing with an OS-level encryption:
>>>
>>> - OS-level encryption [1][2] is more performant and better suited to
>>> letting Lucene leverage the OS memory cache. It can manage encryption at
>>> block or filesystem level in the OS. This makes it possible to encrypt
>>> with different keys per directory, making multi-tenant use cases
>>> possible.
>>> If you can use OS-level encryption, prefer it and skip this Java-level
>>> encryption.
>>>
>>> - Java-level encryption can be used when OS-level encryption management
>>> is not possible (e.g. host machine managed by a cloud provider). It has
>>> an impact on performance: expect -20% on most queries, -60% on
>>> multi-term queries.
>>>
>>> [1] https://wiki.archlinux.org/title/Fscrypt
>>> [2] https://www.kernel.org/doc/html/latest/filesystems/fscrypt.html
>>>
>>> - Bruno
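
For readers skimming the overview quoted above, here is a rough, JDK-only sketch of the general idea: encrypt the file bytes with a provided key, and record only the key id (never the secret) in some commit-level metadata. This is not the code from the PR. The cipher choice (AES in CTR mode), the class name, and the `encryptionKeyId` metadata name are all assumptions made for illustration; the thread itself does not specify any of them.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class KeyIdEncryptionSketch {

    // Hypothetical metadata name, not taken from the solr-sandbox module.
    static final String KEY_ID_PARAM = "encryptionKeyId";

    // AES/CTR is an assumption; it keeps the ciphertext the same length as the input.
    private static byte[] aesCtr(int mode, byte[] secret, byte[] iv, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(mode, new SecretKeySpec(secret, "AES"), new IvParameterSpec(iv));
            return cipher.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES/CTR failed", e);
        }
    }

    static byte[] encrypt(byte[] secret, byte[] iv, byte[] plain) {
        return aesCtr(Cipher.ENCRYPT_MODE, secret, iv, plain);
    }

    static byte[] decrypt(byte[] secret, byte[] iv, byte[] encrypted) {
        return aesCtr(Cipher.DECRYPT_MODE, secret, iv, encrypted);
    }

    // Only the key id goes into the metadata; the secret stays with the key supplier.
    static Map<String, String> commitMetadata(String keyId) {
        Map<String, String> metadata = new HashMap<>();
        metadata.put(KEY_ID_PARAM, keyId);
        return metadata;
    }

    public static void main(String[] args) {
        SecureRandom random = new SecureRandom();

        // Stand-in for an external key supplier that maps key ids to secrets.
        Map<String, byte[]> keySupplier = new HashMap<>();
        byte[] secret = new byte[16]; // 128-bit AES key
        random.nextBytes(secret);
        keySupplier.put("key-1", secret);

        // In CTR mode an IV must never be reused with the same key.
        byte[] iv = new byte[16];
        random.nextBytes(iv);

        byte[] plain = "index file bytes".getBytes(StandardCharsets.UTF_8);
        byte[] encrypted = encrypt(keySupplier.get("key-1"), iv, plain);
        Map<String, String> metadata = commitMetadata("key-1");

        // To read the data back, look up the secret by the id stored in the metadata.
        byte[] lookedUp = keySupplier.get(metadata.get(KEY_ID_PARAM));
        byte[] decrypted = decrypt(lookedUp, iv, encrypted);

        System.out.println("metadata = " + metadata);
        System.out.println("round trip ok = " + Arrays.equals(plain, decrypted));
    }
}
```

A per-core key, as described in the overview, would correspond here to each core recording its own key id; re-encryption would amount to decrypting with the old id's secret and encrypting again under a new one. How the actual module handles IVs, file formats, and concurrent re-encryption is described in the ENCRYPTION.md linked in the quoted message, not in this sketch.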