This patch adds P9 NX support for 842 compression engine. Virtual
Accelerator Switchboard (VAS) is used to access 842 engine on P9.
For each NX engine per chip, set up a receive window using
vas_rx_win_open(), which configures the RxFIFO with the FIFO address,
lpid, pid and tid values. This unique (lpid, pid
This patch adds changes for checking P9-specific 842 engine
error codes. These errors are reported in the coprocessor status
block (CSB) on failures.
Signed-off-by: Haren Myneni
---
arch/powerpc/include/asm/icswx.h | 3 +++
drivers/crypto/nx/nx-842-powernv.c | 18 ++
drivers/cry
Move deleting coprocessor information on exit or failure into
nx842_delete_coprocs().
Signed-off-by: Haren Myneni
---
drivers/crypto/nx/nx-842-powernv.c | 25 -
1 file changed, 12 insertions(+), 13 deletions(-)
diff --git a/drivers/crypto/nx/nx-842-powernv.c
b/drivers/crypto
Updating the coprocessor list is moved into nx842_add_coprocs_list().
This function will be used for both the icswx and VAS code paths.
Signed-off-by: Haren Myneni
---
drivers/crypto/nx/nx-842-powernv.c | 12 +---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/crypto/nx/nx-8
CRB configuration is moved into nx842_configure_crb() so that it can
be used by both the icswx and VAS exec functions. The VAS function
will be added later with P9 support.
Signed-off-by: Haren Myneni
---
drivers/crypto/nx/nx-842-powernv.c | 57 +-
1 file changed, 38 insertion
Rename nx842_powernv_function to nx842_powernv_exec.
nx842_powernv_exec currently points to nx842_exec_icswx and
will point to the VAS exec function which will be added later
for P9 NX support.
Signed-off-by: Haren Myneni
---
drivers/crypto/nx/nx-842-powernv.c | 20 +---
1 file changed, 13
P9 introduces the Virtual Accelerator Switchboard (VAS) to communicate
with the NX 842 engine. Previously, the icswx function was used to
access NX. On PowerNV systems, the NX-842 driver invokes VAS functions
to configure an RxFIFO (receive window) for each NX engine. VAS uses
this FIFO to communicate the request to NX
On 07/17/2017 11:53 PM, Ram Pai wrote:
> On Mon, Jul 17, 2017 at 04:50:38PM -0700, Haren Myneni wrote:
>>
>> This patch adds P9 NX support for 842 compression engine. Virtual
>> Accelerator Switchboard (VAS) is used to access 842 engine on P9.
>>
>> For each NX engine per chip, setup receive window
Signed-off-by: Gary R Hook
---
 drivers/crypto/ccp/ccp-crypto-aes-xts.c | 16 +---
 drivers/crypto/ccp/ccp-crypto.h         |  2 +-
 drivers/crypto/ccp/ccp-ops.c            |  3 +++
 3 files changed, 17 insertions(+), 4 deletions(-)
diff --git a/drivers/crypto/ccp/ccp-crypto-aes-
The following series adds support for XTS-AES on version 5 CCPs,
both 128- and 256-bit, and enhances/clarifies/simplifies some
crypto layer code.
Changes since v1:
- rework the validation of the unit-size; move to a separate patch
- expand the key buffer to accommodate 256-bit keys
- use xts_che
The CCP supports a limited set of unit-size values. Change the check
for this parameter such that acceptable values match the enumeration.
Then clarify the conditions under which we must use the fallback
implementation.
Signed-off-by: Gary R Hook
---
drivers/crypto/ccp/ccp-crypto-aes-xts.c | 7
Vet the key using the available standard function
Signed-off-by: Gary R Hook
---
 drivers/crypto/ccp/ccp-crypto-aes-xts.c |  9 -
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/crypto/ccp/ccp-crypto-aes-xts.c
b/drivers/crypto/ccp/ccp-crypto-aes-xts.c
index 58a4244
Version 5 CCPs have some new requirements for XTS-AES: the type field
must be specified, and the key requires 512 bits, with each part
occupying 256 bits and padded with zeroes.
Signed-off-by: Gary R Hook
---
 drivers/crypto/ccp/ccp-dev-v5.c |  2 ++
 drivers/crypto/ccp/ccp-dev.h    |  2 ++
d
Modify the Kconfig help text to reflect the fact that random data from
hwrng is fed into the kernel random number generator's entropy pool.
Signed-off-by: PrasannaKumar Muralidharan
---
drivers/char/hw_random/Kconfig | 6 ++
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/char/
On 07/17/2017 04:48 PM, Lendacky, Thomas wrote:
On 7/17/2017 3:08 PM, Gary R Hook wrote:
Version 5 CCPs have differing requirements for XTS-AES: key components
are stored in a 512-bit vector. The context must be little-endian
justified. AES-256 is supported now, so propagate the cipher size to
t
This is a followup to 'crypto: scompress - eliminate percpu scratch buffers',
which attempted to replace the scompress per-CPU buffer entirely, but as
Herbert pointed out, this is not going to fly in the targeted use cases.
Instead, move the alloc/free of the buffers into the tfm init/exit hooks,
The scompress code allocates 2 x 128 KB of scratch buffers for each CPU,
so that clients of the async API can use synchronous implementations
even from atomic context. However, on systems such as Cavium Thunderx
(which has 96 cores), this adds up to a non-negligible 24 MB. Also,
32-bit systems may
When allocating the per-CPU scratch buffers, we allocate the source
and destination buffers separately, but bail immediately if the second
allocation fails, without freeing the first one. Fix that.
Signed-off-by: Ard Biesheuvel
---
crypto/scompress.c | 5 -
1 file changed, 4 insertions(+), 1
Due to the use of per-CPU buffers, scomp_acomp_comp_decomp() executes
with preemption disabled, and so whether the CRYPTO_TFM_REQ_MAY_SLEEP
flag is set is irrelevant, since we cannot sleep anyway. So disregard
the flag, and use GFP_ATOMIC unconditionally.
Cc: # v4.10+
Signed-off-by: Ard Biesheuvel
Am Freitag, 21. Juli 2017, 17:09:11 CEST schrieb Arnd Bergmann:
Hi Arnd,
> On Fri, Jul 21, 2017 at 10:57 AM, Stephan Müller
wrote:
> > Am Freitag, 21. Juli 2017, 05:08:47 CEST schrieb Theodore Ts'o:
> >> Um, the timer is the largest number of interrupts on my system. Compare:
> >>
On Fri, Jul 21, 2017 at 10:57 AM, Stephan Müller wrote:
> Am Freitag, 21. Juli 2017, 05:08:47 CEST schrieb Theodore Ts'o:
>> Um, the timer is the largest number of interrupts on my system. Compare:
>>
>> CPU0 CPU1 CPU2 CPU3
>> LOC:639655260388656558646
On Fri, Jul 21, 2017 at 01:39:13PM +0200, Oliver Mangold wrote:
> Better, but obviously there is still much room for improvement by reducing
> the number of calls to RDRAND.
Hmm, is there some way we can easily tell we are running on Ryzen? Or
do we believe this is going to be true for all AMD de
On 21 July 2017 at 14:44, Herbert Xu wrote:
> On Fri, Jul 21, 2017 at 02:42:20PM +0100, Ard Biesheuvel wrote:
>>
>> >> - Would you mind a patch that makes the code only use the per-CPU
>> >> buffers if we are running atomically to begin with?
>> >
>> > That would mean dropping the first packet so
On Fri, Jul 21, 2017 at 02:42:20PM +0100, Ard Biesheuvel wrote:
>
> >> - Would you mind a patch that makes the code only use the per-CPU
> >> buffers if we are running atomically to begin with?
> >
> > That would mean dropping the first packet so no.
> >
>
> I think you misunderstood me: the idea
On 21 July 2017 at 14:31, Herbert Xu wrote:
> On Fri, Jul 21, 2017 at 02:24:02PM +0100, Ard Biesheuvel wrote:
>>
>> OK, but that doesn't really answer any of my questions:
>> - Should we enforce that CRYPTO_ACOMP_ALLOC_OUTPUT is mutually
>> exclusive with CRYPTO_TFM_REQ_MAY_SLEEP, or should
>> cry
On Fri, Jul 21, 2017 at 02:24:02PM +0100, Ard Biesheuvel wrote:
>
> OK, but that doesn't really answer any of my questions:
> - Should we enforce that CRYPTO_ACOMP_ALLOC_OUTPUT is mutually
> exclusive with CRYPTO_TFM_REQ_MAY_SLEEP, or should
> crypto_scomp_sg_alloc() always use GFP_ATOMIC? We need
On 21 July 2017 at 14:24, Ard Biesheuvel wrote:
> On 21 July 2017 at 14:11, Herbert Xu wrote:
>> On Fri, Jul 21, 2017 at 02:09:39PM +0100, Ard Biesheuvel wrote:
>>>
>>> Right. And is req->dst guaranteed to be assigned in that case? Because
>>> crypto_scomp_sg_alloc() happily allocates pages and k
On 21 July 2017 at 14:11, Herbert Xu wrote:
> On Fri, Jul 21, 2017 at 02:09:39PM +0100, Ard Biesheuvel wrote:
>>
>> Right. And is req->dst guaranteed to be assigned in that case? Because
>> crypto_scomp_sg_alloc() happily allocates pages and kmalloc()s the
>> scatterlist if req->dst == NULL.
>>
>>
On Fri, Jul 21, 2017 at 02:09:39PM +0100, Ard Biesheuvel wrote:
>
> Right. And is req->dst guaranteed to be assigned in that case? Because
> crypto_scomp_sg_alloc() happily allocates pages and kmalloc()s the
> scatterlist if req->dst == NULL.
>
> Is there any way we could make these scratch buffer
On 21 July 2017 at 13:42, Herbert Xu wrote:
> On Thu, Jul 20, 2017 at 12:40:00PM +0100, Ard Biesheuvel wrote:
>> The scompress code unconditionally allocates 2 per-CPU scratch buffers
>> of 128 KB each, in order to avoid allocation overhead in the async
>> wrapper that encapsulates the synchronous
On Thu, Jul 20, 2017 at 12:40:00PM +0100, Ard Biesheuvel wrote:
> The scompress code unconditionally allocates 2 per-CPU scratch buffers
> of 128 KB each, in order to avoid allocation overhead in the async
> wrapper that encapsulates the synchronous compression algorithm, since
> it may execute in
On Fri, Jul 21, 2017 at 3:12 AM, Oliver Mangold wrote:
> Hi,
>
> I was wondering why reading from /dev/urandom is much slower on Ryzen than
> on Intel, and did some analysis. It turns out that the RDRAND instruction is
> at fault, which takes much longer on AMD.
>
> if I read this correctly:
>
> -
On 21.07.2017 11:26, Jan Glauber wrote:
Nice catch. How much does the performance improve on Ryzen when you
use arch_get_random_int()?
Okay, now I have some results for you:
On Ryzen 1800X (using arch_get_random_int()):
---
# dd if=/dev/urandom of=/dev/null bs=1M status=progress
8751415296 b
Hi Ted,
Snipping one comment:
> Practically no one uses /dev/random. It's essentially a deprecated
> interface; the primary interfaces that have been recommended for well
> over a decade is /dev/urandom, and now, getrandom(2). We only need
> 384 bits of randomness every 5 minutes to reseed the
On Fri, Jul 21, 2017 at 09:12:01AM +0200, Oliver Mangold wrote:
> Hi,
>
> I was wondering why reading from /dev/urandom is much slower on
> Ryzen than on Intel, and did some analysis. It turns out that the
> RDRAND instruction is at fault, which takes much longer on AMD.
>
> if I read this correc
Am Freitag, 21. Juli 2017, 05:08:47 CEST schrieb Theodore Ts'o:
Hi Theodore,
> On Thu, Jul 20, 2017 at 09:00:02PM +0200, Stephan Müller wrote:
> > I concur with your rationale where de-facto the correlation is effect is
> > diminished and eliminated with the fast_pool and the minimal entropy
> >
Hi,
I was wondering why reading from /dev/urandom is much slower on Ryzen
than on Intel, and did some analysis. It turns out that the RDRAND
instruction is at fault, which takes much longer on AMD.
if I read this correctly:
--- drivers/char/random.c ---
862 spin_lock_irqsave(&crn