This implements the ChaCha20 permutation as a single C statement, by way
of the comma operator, which the compiler is able to simplify
terrifically.
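
As a rough illustration of the comma-operator approach described above, here is a minimal sketch, not the code from this patch. The macro and function names (QUARTER_ROUND, DOUBLE_ROUND, TWENTY_ROUNDS, chacha20_permute_sketch) are assumptions chosen for readability. The self-check uses the quarter-round test vector from RFC 7539, section 2.1.1.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static inline uint32_t rol32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> (32 - n));
}

/*
 * One ChaCha quarter round on words a, b, c, d of state x, written as a
 * single comma expression. The name is hypothetical.
 */
#define QUARTER_ROUND(x, a, b, c, d) ( \
	x[a] += x[b], x[d] = rol32(x[d] ^ x[a], 16), \
	x[c] += x[d], x[b] = rol32(x[b] ^ x[c], 12), \
	x[a] += x[b], x[d] = rol32(x[d] ^ x[a], 8), \
	x[c] += x[d], x[b] = rol32(x[b] ^ x[c], 7))

/* Four column rounds followed by four diagonal rounds. */
#define DOUBLE_ROUND(x) ( \
	QUARTER_ROUND(x, 0, 4, 8, 12), \
	QUARTER_ROUND(x, 1, 5, 9, 13), \
	QUARTER_ROUND(x, 2, 6, 10, 14), \
	QUARTER_ROUND(x, 3, 7, 11, 15), \
	QUARTER_ROUND(x, 0, 5, 10, 15), \
	QUARTER_ROUND(x, 1, 6, 11, 12), \
	QUARTER_ROUND(x, 2, 7, 8, 13), \
	QUARTER_ROUND(x, 3, 4, 9, 14))

/* Ten double rounds chained with commas: the whole permutation. */
#define TWENTY_ROUNDS(x) ( \
	DOUBLE_ROUND(x), DOUBLE_ROUND(x), DOUBLE_ROUND(x), DOUBLE_ROUND(x), \
	DOUBLE_ROUND(x), DOUBLE_ROUND(x), DOUBLE_ROUND(x), DOUBLE_ROUND(x), \
	DOUBLE_ROUND(x), DOUBLE_ROUND(x))

/* The function body is one expression statement. */
void chacha20_permute_sketch(uint32_t x[16])
{
	TWENTY_ROUNDS(x);
}

/* Check one quarter round against RFC 7539, section 2.1.1. */
void chacha20_quarter_round_selftest(void)
{
	uint32_t x[4] = { 0x11111111, 0x01020304, 0x9b8d6f43, 0x01234567 };

	QUARTER_ROUND(x, 0, 1, 2, 3);
	assert(x[0] == 0xea2a92f4);
	assert(x[1] == 0xcb1cf8ce);
	assert(x[2] == 0x4581472e);
	assert(x[3] == 0x5881c4bb);
}
```

Because the rounds form one expression with no loop counter, the compiler sees a straight-line sequence of additions, XORs, and rotates, which is presumably what makes the simplification mentioned above possible.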
Information: https://cr.yp.to/chacha.html
Signed-off-by: Jason A. Donenfeld
Cc: Samuel Neves
Cc: Jean-Philippe Aumasson
Cc: Andy Lutomirski
Cc:
These x86_64 vectorized implementations come from Andy Polyakov's
implementation, and are included here in raw form without modification,
so that subsequent commits that fix these up for the kernel can see how
it has changed.
While this is CRYPTOGAMS code, the originating code for this happens to
These port and prepare Andy Polyakov's implementations for the kernel,
but don't actually wire up any of the code yet. The wiring will be done
in a subsequent commit, since we'll need to merge these implementations
with another one. We make a few small changes to the assembly:
- Entries and exit
These NEON and non-NEON implementations come from Andy Polyakov's
implementation, and are included here in raw form without modification,
so that subsequent commits that fix these up for the kernel can see how
it has changed.
While this is CRYPTOGAMS code, the originating code for this happens to
This MIPS32r2 implementation comes from René van Dorst and me and
results in a nice speedup on the usual OpenWRT targets.
Signed-off-by: Jason A. Donenfeld
Signed-off-by: René van Dorst
Co-developed-by: René van Dorst
Cc: Ralf Baechle
Cc: Paul Burton
Cc: James Hogan
Cc: linux-m...@linux-mips
This ports SSSE3, AVX-2, AVX-512F, and AVX-512VL implementations for
ChaCha20. The AVX-512F implementation is disabled on Skylake, due to
throttling, and the VL ymm implementation is used instead. These come
from Andy Polyakov's implementation, with the following modifications
from Samuel Neves:
These wire Andy Polyakov's implementations up to the kernel for ARMv7,8
NEON, and introduce Eric Biggers' ultra-fast scalar implementation for
CPUs without NEON or for CPUs with slow NEON (Cortex-A5,7).
This commit does the following:
- Adds the glue code for the assembly implementations.
- Re
These x86_64 vectorized implementations come from Andy Polyakov's
implementation, and are included here in raw form without modification,
so that subsequent commits that fix these up for the kernel can see how
it has changed.
While this is CRYPTOGAMS code, the originating code for this happens to
This ports AVX, AVX-2, and AVX-512F implementations for Poly1305.
The AVX-512F implementation is disabled on Skylake, due to throttling.
These come from Andy Polyakov's implementation, with the following
modifications from Samuel Neves:
- Some cosmetic changes, like renaming labels to .Lname, co
This MIPS64 accelerated implementation comes from Andy Polyakov's
implementation, and is included here in raw form without modification,
so that subsequent commits that fix these up for the kernel can see how
it has changed.
While this is CRYPTOGAMS code, the originating code for this happens to
b
These two C implementations -- a 32x32 one and a 64x64 one, depending on
the platform -- come from Andrew Moon's public domain poly1305-donna
portable code, modified for usage in the kernel and for usage with
accelerated primitives.
Information: https://cr.yp.to/mac.html
Signed-off-by: Jason A. D
The C implementation was originally based on Samuel Neves' public
domain reference implementation but has since been heavily modified
for the kernel. We're able to do compile-time optimizations by moving
some scaffolding around the final function into the header file.
Information: https://blake2.n
This contains two formally verified C implementations of the Curve25519
scalar multiplication function, one for 32-bit systems, and one for
64-bit systems whose compiler supports efficient 128-bit integer types.
Not only are these implementations formally verified, but they are also
the fastest ava
These implementations from Samuel Neves support AVX and AVX-512VL.
Originally this used AVX-512F, but Skylake thermal throttling made
AVX-512VL more attractive and possible to do with negligible difference.
Signed-off-by: Jason A. Donenfeld
Signed-off-by: Samuel Neves
Co-developed-by: Samuel Nev
This ports the SUPERCOP implementation for usage in kernel space. In
addition to the usual header, macro, and style changes required for
kernel space, it makes a few small changes to the code:
- The stack alignment is relaxed to 16 bytes.
- Superfluous mov statements have been removed.
- ldr
This comes from Dan Bernstein and Peter Schwabe's public domain NEON
code, and is included here in raw form so that subsequent commits that
fix these up for the kernel can see how it has changed. This code does
have some entirely cosmetic formatting differences, adding indentation
and so forth, so
Now that ChaCha20 is in Zinc, we can have the crypto API code simply
call into it. The crypto API expects to have a stored key per instance
and independent nonces, so we follow suit and store the key and
initialize the nonce independently.
Signed-off-by: Jason A. Donenfeld
Cc: Samuel Neves
Cc:
Now that Poly1305 is in Zinc, we can have the crypto API code simply
call into it. We have to do a little bit of bookkeeping here, because
the crypto API receives the key in the first few calls to update.
Signed-off-by: Jason A. Donenfeld
Cc: Samuel Neves
Cc: Andy Lutomirski
Cc: Greg KH
Cc: l
This implementation is the fastest available x86_64 implementation, and
unlike Sandy2x, it doesn't require use of the floating point registers at
all. Instead it makes use of BMI2 and ADX, available on recent
microarchitectures. The implementation was written by Armando
Faz-Hernández with contributi
This MIPS32r2 implementation comes from René van Dorst and me and
results in a nice speedup on the usual OpenWRT targets. The MIPS64
implementation from Andy Polyakov ported here results in a nice speedup
on commodity Octeon hardware, and has been modified slightly from the
original:
- The functi
These NEON and non-NEON implementations come from Andy Polyakov's
implementation, and are included here in raw form without modification,
so that subsequent commits that fix these up for the kernel can see how
it has changed.
While this is CRYPTOGAMS code, the originating code for this happens to
These wire Andy Polyakov's implementations up to the kernel. We make a
few small changes to the assembly:
- Entries and exits use the proper kernel convention macro.
- CPU feature checking is done in C by the glue code, so that has been
removed from the assembly.
- The function names have been r
Zinc stands for "Zinc Is Neat Crypto" or "Zinc as IN Crypto". It's also
short, easy to type, and plays nicely with the recent trend of naming
crypto libraries after elements. The guiding principle is "don't overdo
it". It's less of a library and more of a directory tree for organizing
well-curated
On Wed, Aug 15, 2018 at 09:00:04AM -0700, syzbot wrote:
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit:7796916146b8 Merge branch 'x86-cpu-for-linus' of git://git..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=164b192240
> kernel
On Wed, Sep 26, 2018 at 07:27:04AM -0700, syzbot wrote:
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit:a38523185b40 Merge tag 'libnvdimm-fixes-4.19-rc6' of git://..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1767b7fa40
> kernel
> On 5 Oct 2018, at 20:28, Jason A. Donenfeld wrote:
>
> Hey Andy,
>
>> On Fri, Oct 5, 2018 at 7:44 PM Andy Lutomirski wrote:
>> I *think* the only change to Zinc per se would be that the calls like
>> chacha20_simd() would be static calls or patchable functions or
>> whatever we want to cal
Hey Dan,
On Fri, Oct 05, 2018 at 03:05:38PM -, D. J. Bernstein wrote:
> Of course, there are other ARM microarchitectures, and there are many
> cases where different microarchitectures prefer different optimizations.
> The kernel already has boot-time benchmarks for different optimizations
> f
Hey Andy,
On Fri, Oct 5, 2018 at 7:44 PM Andy Lutomirski wrote:
> I *think* the only change to Zinc per se would be that the calls like
> chacha20_simd() would be static calls or patchable functions or
> whatever we want to call them. And there could be a debugfs to
> override the default select
On Fri, Oct 5, 2018 at 10:28 AM Ard Biesheuvel
wrote:
>
> On 5 October 2018 at 19:26, Andy Lutomirski wrote:
> > On Fri, Oct 5, 2018 at 10:15 AM Ard Biesheuvel
> > wrote:
> >>
> >> On 5 October 2018 at 15:37, Jason A. Donenfeld wrote:
> >> ...
> >> > Therefore, I think this patch goes in exactl
From: Richard Weinberger
Date: Fri, 5 Oct 2018 15:37:57 +0200
> And I strongly vote that Herbert Xu shall remain the maintainer of
> the whole crypto system (including zinc!) in the kernel.
I 100% agree with this.
On Fri, Oct 5, 2018 at 10:38 AM Jason A. Donenfeld wrote:
>
> On Fri, Oct 5, 2018 at 7:29 PM Andy Lutomirski wrote:
> > (None of this is to say that I disagree with Jason, though -- I'm not
> > entirely convinced that this makes sense for Zinc. But maybe it can
> > be done in a way that makes ev
On Fri, Oct 5, 2018 at 7:29 PM Andy Lutomirski wrote:
> (None of this is to say that I disagree with Jason, though -- I'm not
> entirely convinced that this makes sense for Zinc. But maybe it can
> be done in a way that makes everyone happy.)
Zinc indeed will continue to push in the simpler and
On Fri, Oct 5, 2018 at 10:23 AM Ard Biesheuvel
wrote:
>
> On 5 October 2018 at 19:20, Andy Lutomirski wrote:
> > On Fri, Oct 5, 2018 at 10:11 AM Ard Biesheuvel
> > wrote:
> >>
> >> On 5 October 2018 at 18:58, Andy Lutomirski wrote:
> >> > On Fri, Oct 5, 2018 at 8:24 AM Ard Biesheuvel
> >> > w
On 5 October 2018 at 19:26, Andy Lutomirski wrote:
> On Fri, Oct 5, 2018 at 10:15 AM Ard Biesheuvel
> wrote:
>>
>> On 5 October 2018 at 15:37, Jason A. Donenfeld wrote:
>> ...
>> > Therefore, I think this patch goes in exactly the wrong direction. I
>> > mean, if you want to introduce dynamic pa
On Fri, Oct 5, 2018 at 10:15 AM Ard Biesheuvel
wrote:
>
> On 5 October 2018 at 15:37, Jason A. Donenfeld wrote:
> ...
> > Therefore, I think this patch goes in exactly the wrong direction. I
> > mean, if you want to introduce dynamic patching as a means for making
> > the crypto API's dynamic dis
On 5 October 2018 at 19:20, Andy Lutomirski wrote:
> On Fri, Oct 5, 2018 at 10:11 AM Ard Biesheuvel
> wrote:
>>
>> On 5 October 2018 at 18:58, Andy Lutomirski wrote:
>> > On Fri, Oct 5, 2018 at 8:24 AM Ard Biesheuvel
>> > wrote:
>> >>
>> >> On 5 October 2018 at 17:08, Andy Lutomirski wrote:
>
On Fri, Oct 05, 2018 at 07:16:13PM +0200, Ard Biesheuvel wrote:
> On 5 October 2018 at 19:13, Eric Biggers wrote:
> > From: Eric Biggers
> >
> > aesni-intel_glue.c still calls crypto_fpu_init() and crypto_fpu_exit()
> > to register/unregister the "fpu" template. But these functions don't
> > exi
On Fri, Oct 5, 2018 at 10:11 AM Ard Biesheuvel
wrote:
>
> On 5 October 2018 at 18:58, Andy Lutomirski wrote:
> > On Fri, Oct 5, 2018 at 8:24 AM Ard Biesheuvel
> > wrote:
> >>
> >> On 5 October 2018 at 17:08, Andy Lutomirski wrote:
> >> >
> >> >
> >> >> On Oct 5, 2018, at 7:14 AM, Peter Zijlstr
On 5 October 2018 at 19:13, Eric Biggers wrote:
> From: Eric Biggers
>
> aesni-intel_glue.c still calls crypto_fpu_init() and crypto_fpu_exit()
> to register/unregister the "fpu" template. But these functions don't
> exist anymore, causing a build error. Remove the calls to them.
>
> Fixes: 944
On 5 October 2018 at 15:37, Jason A. Donenfeld wrote:
...
> Therefore, I think this patch goes in exactly the wrong direction. I
> mean, if you want to introduce dynamic patching as a means for making
> the crypto API's dynamic dispatch stuff not as slow in a post-spectre
> world, sure, go for it;
From: Eric Biggers
aesni-intel_glue.c still calls crypto_fpu_init() and crypto_fpu_exit()
to register/unregister the "fpu" template. But these functions don't
exist anymore, causing a build error. Remove the calls to them.
Fixes: 944585a64f5e ("crypto: x86/aes-ni - remove special handling of A
On 5 October 2018 at 18:58, Andy Lutomirski wrote:
> On Fri, Oct 5, 2018 at 8:24 AM Ard Biesheuvel
> wrote:
>>
>> On 5 October 2018 at 17:08, Andy Lutomirski wrote:
>> >
>> >
>> >> On Oct 5, 2018, at 7:14 AM, Peter Zijlstra wrote:
>> >>
>> >>> On Fri, Oct 05, 2018 at 10:13:25AM +0200, Ard Bies
On Fri, Oct 5, 2018 at 8:24 AM Ard Biesheuvel wrote:
>
> On 5 October 2018 at 17:08, Andy Lutomirski wrote:
> >
> >
> >> On Oct 5, 2018, at 7:14 AM, Peter Zijlstra wrote:
> >>
> >>> On Fri, Oct 05, 2018 at 10:13:25AM +0200, Ard Biesheuvel wrote:
> >>> diff --git a/include/linux/ffp.h b/include/l
On Fri, Oct 05, 2018 at 10:08:30AM +0800, Herbert Xu wrote:
> Hi Greg:
>
> This push fixes the following issues:
>
> - Out-of-bound stack access in qat.
> - Illegal schedule in mxs-dcp.
> - Memory corruption in chelsio.
> - Incorrect pointer computation in caam.
>
>
> Please pull from
>
> git
On 5 October 2018 at 17:08, Andy Lutomirski wrote:
>
>
>> On Oct 5, 2018, at 7:14 AM, Peter Zijlstra wrote:
>>
>>> On Fri, Oct 05, 2018 at 10:13:25AM +0200, Ard Biesheuvel wrote:
>>> diff --git a/include/linux/ffp.h b/include/linux/ffp.h
>>> new file mode 100644
>>> index ..8fc3b4c9b3
On 5 October 2018 at 17:05, D. J. Bernstein wrote:
> For the in-order ARM Cortex-A8 (the target for this code), adjacent
> multiply-add instructions forward summands quickly. A simple in-order
> dot-product computation has no latency problems, while interleaving
> computations, as suggested in thi
For the in-order ARM Cortex-A8 (the target for this code), adjacent
multiply-add instructions forward summands quickly. A simple in-order
dot-product computation has no latency problems, while interleaving
computations, as suggested in this thread, creates problems. Also, on
this microarchitecture,
> On Oct 5, 2018, at 7:14 AM, Peter Zijlstra wrote:
>
>> On Fri, Oct 05, 2018 at 10:13:25AM +0200, Ard Biesheuvel wrote:
>> diff --git a/include/linux/ffp.h b/include/linux/ffp.h
>> new file mode 100644
>> index ..8fc3b4c9b38f
>> --- /dev/null
>> +++ b/include/linux/ffp.h
>> @@ -0,
On 5 October 2018 at 16:14, Peter Zijlstra wrote:
> On Fri, Oct 05, 2018 at 10:13:25AM +0200, Ard Biesheuvel wrote:
>> diff --git a/include/linux/ffp.h b/include/linux/ffp.h
>> new file mode 100644
>> index ..8fc3b4c9b38f
>> --- /dev/null
>> +++ b/include/linux/ffp.h
>> @@ -0,0 +1,43 @
On Fri, Oct 05, 2018 at 10:13:25AM +0200, Ard Biesheuvel wrote:
> diff --git a/include/linux/ffp.h b/include/linux/ffp.h
> new file mode 100644
> index ..8fc3b4c9b38f
> --- /dev/null
> +++ b/include/linux/ffp.h
> @@ -0,0 +1,43 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef
On 5 October 2018 at 15:57, Peter Zijlstra wrote:
> On Fri, Oct 05, 2018 at 10:13:25AM +0200, Ard Biesheuvel wrote:
>> Add a function pointer abstraction that can be implemented by the arch
>> in a manner that avoids the downsides of function pointers, i.e., the
>> fact that they are typically loc
On Fri, Oct 05, 2018 at 10:13:25AM +0200, Ard Biesheuvel wrote:
> Add a function pointer abstraction that can be implemented by the arch
> in a manner that avoids the downsides of function pointers, i.e., the
> fact that they are typically located in a writable data section, and
> their vulnerabili
Am Freitag, 5. Oktober 2018, 15:46:29 CEST schrieb Jason A. Donenfeld:
> On Fri, Oct 5, 2018 at 3:38 PM Richard Weinberger
> wrote:
> > So we will have two competing crypto stacks in the kernel?
> > Having a lightweight crypto API is a good thing but I really don't like the
> > idea of having
On Fri, Oct 5, 2018 at 3:38 PM Richard Weinberger
wrote:
> So we will have two competing crypto stacks in the kernel?
> Having a lightweight crypto API is a good thing but I really don't like the
> idea of having zinc parallel to the existing crypto stack.
No, as you've seen in this patchset, t
Hi Ard,
On Fri, Oct 5, 2018 at 10:13 AM Ard Biesheuvel
wrote:
> At the moment, the Zinc library [1] is being proposed as a solution for that,
> and while it does address the usability problems, it does a lot more than
> that, and what we end up with is a lot less flexible than what we have now.
On Fri, Oct 5, 2018 at 3:14 PM Jason A. Donenfeld wrote:
> On Wed, Oct 3, 2018 at 8:49 AM Eric Biggers wrote:
> > It's not really about the name, though. It's actually about the whole way
> > of thinking about the submission. Is it a new special library with its own
> > things going o
Hi Eric,
On Wed, Oct 3, 2018 at 8:49 AM Eric Biggers wrote:
> It's not really about the name, though. It's actually about the whole way of
> thinking about the submission. Is it a new special library with its own
> things going on, or is it just some crypto helper functions? It's really jus
Hi,
this block:
|int caam_qi_shutdown(struct device *qidev)
| {
| struct cpumask old_cpumask = current->cpus_allowed;
…
| /*
| * QMan driver requires CGRs to be deleted from same CPU from where they
| * were instantiated. Hence we get the module removal execute f
Add a function pointer abstraction that can be implemented by the arch
in a manner that avoids the downsides of function pointers, i.e., the
fact that they are typically located in a writable data section, and
their vulnerability to Spectre-like defects.
The FFP (or fast function pointer) is calla
Move the PMULL based routines out of the crypto API into the core
CRC-T10DIF library.
Signed-off-by: Ard Biesheuvel
---
arch/arm64/crypto/crct10dif-ce-glue.c | 61 +---
1 file changed, 14 insertions(+), 47 deletions(-)
diff --git a/arch/arm64/crypto/crct10dif-ce-glue.c
b/arch/a
crc_t10dif() is a trivial wrapper around crc_t10dif_update() so move
it into the header file as a static inline function.
Signed-off-by: Ard Biesheuvel
---
include/linux/crc-t10dif.h | 6 +-
lib/crc-t10dif.c | 6 --
2 files changed, 5 insertions(+), 7 deletions(-)
diff --git a
Move the PMULL based routines out of the crypto API into the core
CRC-T10DIF library.
Signed-off-by: Ard Biesheuvel
---
arch/x86/crypto/crct10dif-pclmul_glue.c | 98 +++-
1 file changed, 13 insertions(+), 85 deletions(-)
diff --git a/arch/x86/crypto/crct10dif-pclmul_glue.c
b/ar
Move the PMULL based routines out of the crypto API into the core
CRC-T10DIF library.
Signed-off-by: Ard Biesheuvel
---
arch/arm/crypto/crct10dif-ce-glue.c | 78 ++--
1 file changed, 22 insertions(+), 56 deletions(-)
diff --git a/arch/arm/crypto/crct10dif-ce-glue.c
b/arch/arm/c
Use a patchable function pointer for the core CRC-T10DIF transform so
that we can allow modules to supersede this transform based on platform
capabilities.
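
The idea of letting a module supersede the core transform can be sketched in plain C with an ordinary function pointer. This is only an illustration under assumptions, not the patch itself: the names crc_t10dif_fn, crc_t10dif_impl, crc_t10dif_supersede, and crc_t10dif_table are hypothetical, and the table-driven routine merely stands in for an accelerated implementation. The CRC parameters (polynomial 0x8bb7, initial value 0, MSB first) are those of CRC-16/T10-DIF, whose check value for "123456789" is 0xd0db.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t (*crc_t10dif_fn)(uint16_t crc, const unsigned char *buf,
				  size_t len);

/* Generic bit-at-a-time transform, MSB first, polynomial 0x8bb7. */
uint16_t crc_t10dif_generic(uint16_t crc, const unsigned char *buf, size_t len)
{
	while (len--) {
		crc ^= (uint16_t)(*buf++ << 8);
		for (int i = 0; i < 8; i++)
			crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x8bb7)
					     : (uint16_t)(crc << 1);
	}
	return crc;
}

/* Stand-in for a platform-specific routine: byte-at-a-time table lookup. */
static uint16_t t10dif_table[256];

void crc_t10dif_table_init(void)
{
	for (unsigned int b = 0; b < 256; b++) {
		unsigned char byte = (unsigned char)b;

		t10dif_table[b] = crc_t10dif_generic(0, &byte, 1);
	}
}

uint16_t crc_t10dif_table(uint16_t crc, const unsigned char *buf, size_t len)
{
	while (len--)
		crc = (uint16_t)(crc << 8) ^
		      t10dif_table[((crc >> 8) ^ *buf++) & 0xff];
	return crc;
}

/* The "patchable" slot: defaults to the generic transform. */
static crc_t10dif_fn crc_t10dif_impl = crc_t10dif_generic;

/* Hypothetical hook a module would call; NULL restores the default. */
void crc_t10dif_supersede(crc_t10dif_fn fn)
{
	crc_t10dif_impl = fn ? fn : crc_t10dif_generic;
}

uint16_t crc_t10dif_update(uint16_t crc, const unsigned char *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	return crc_t10dif_impl(crc, buf, len);
}
```

Callers only ever use crc_t10dif_update(), so swapping the implementation is invisible to them. A plain pointer like this one lives in writable data and is called indirectly, which is exactly the cost the patchable variant is meant to avoid.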
Signed-off-by: Ard Biesheuvel
---
crypto/Kconfig | 1 +
crypto/Makefile| 2 +-
crypto/crct10dif_common.c | 82
Implement arm64 support for patchable function pointers by emitting
them as branch instructions (and a couple of NOPs in case the new
target is out of range of a normal branch instruction.)
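
To make the range limit concrete, the following sketch encodes an AArch64 unconditional branch (B) and checks whether a byte offset fits its 26-bit word-offset field. The helper names a64_branch_in_range and a64_encode_b are hypothetical, and the instruction sequence the patch uses for out-of-range targets is not shown in this message, so it is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define A64_B_OPCODE	0x14000000u	/* B: bits 31..26 = 000101 */
#define A64_NOP		0xd503201fu	/* NOP, usable as padding */

/*
 * B takes a signed 26-bit word offset, so the byte offset must be a
 * multiple of 4 in the range [-128 MiB, +128 MiB).
 */
bool a64_branch_in_range(int64_t offset)
{
	return (offset & 3) == 0 &&
	       offset >= -(INT64_C(1) << 27) &&
	       offset < (INT64_C(1) << 27);
}

/* Encode "B <pc + offset>"; the caller must check the range first. */
uint32_t a64_encode_b(int64_t offset)
{
	assert(a64_branch_in_range(offset));
	return A64_B_OPCODE | (uint32_t)(((uint64_t)offset >> 2) & 0x03ffffffu);
}
```

For example, a64_encode_b(8) yields 0x14000002, a branch two instructions forward. A target 128 MiB or more away fails a64_branch_in_range(), which is the case the extra NOP slots are reserved for.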
Signed-off-by: Ard Biesheuvel
---
arch/arm64/Kconfig | 1 +
arch/arm64/include/asm/ffp.h | 35 +
Move the PMULL based routines out of the crypto API into the core
CRC-T10DIF library.
Signed-off-by: Ard Biesheuvel
---
arch/powerpc/crypto/crct10dif-vpmsum_glue.c | 55 +++-
1 file changed, 6 insertions(+), 49 deletions(-)
diff --git a/arch/powerpc/crypto/crct10dif-vpmsum_glue.
Signed-off-by: Ard Biesheuvel
---
crypto/crct10dif_generic.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/crypto/crct10dif_generic.c b/crypto/crct10dif_generic.c
index 8e94e29dc6fc..9ea4242c4921 100644
--- a/crypto/crct10dif_generic.c
+++ b/crypto/crct10dif_generic.c
@@
Linux's crypto API is widely regarded as being difficult to use for cases
where support for asynchronous accelerators or runtime dispatch of algorithms
(i.e., passed in as a string) is not needed. This leads to kludgy code and
also to actual security issues [0], although arguably, using AES in the