From: Chris von Recklinghausen
> Sent: 12 April 2021 20:51
...
> > This is not about BIOS bugs. Hibernation is deep suspend/resume
> > grafted onto cold boot, and it is perfectly legal for the firmware to
> > present a different memory map to the OS after a cold boot. It is
> > Linux that decides t
From: Chris von Recklinghausen
> Sent: 08 April 2021 11:46
>
> Suspend fails on a system in fips mode because md5 is used for the e820
> integrity check and is not available. Use crc32 instead.
>
> Prior to this patch, MD5 is used only to create a digest to ensure
> integrity of the region, no
From: Corentin Labbe
> Sent: 16 November 2020 13:54
>
> The optimized cipher function needs lengths that are a multiple of 4 bytes,
> but it sometimes gets odd lengths. This is because SG data can be stored
> with an offset.
>
> So the fix is to also check that the offset is aligned to 4 bytes.
> Fixes: 6298e9
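The shape of the fix described above can be sketched in plain C; this is an illustrative stand-in, not the driver's actual code, and the function name is invented:

```c
#include <stdbool.h>
#include <stddef.h>

/* Sketch of the check described above: before taking the optimized
 * 4-byte-wide cipher path, verify that the scatterlist offset as well
 * as the length is 4-byte aligned, and fall back to the generic
 * implementation otherwise. Illustrative only. */
static bool can_use_optimized_path(size_t offset, size_t len)
{
	/* An unaligned offset yields odd addresses even when the
	 * total length itself is a multiple of 4. */
	return (offset % 4 == 0) && (len % 4 == 0);
}
```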
From: Nathan Chancellor
> Sent: 12 November 2020 21:49
>
> On Thu, Nov 12, 2020 at 10:21:35PM +0100, Christian Lamparter wrote:
> > Hello,
> >
> > On 12/11/2020 21:07, Nathan Chancellor wrote:
> > > Clang warns:
> > >
> > > drivers/crypto/amcc/crypto4xx_core.c:921:60: warning: operator '?:' has
>
From: Arvind Sankar
> Sent: 25 October 2020 23:54
...
> > That's odd, the BLEND loop is about 20 instructions.
> > I wouldn't expect unrolling to help - unless you manage
> > to use 16 registers for the active W[] values.
> >
>
> I am not sure about what's going on inside the hardware, but even wi
From: Arvind Sankar
> Sent: 25 October 2020 20:18
>
> On Sun, Oct 25, 2020 at 06:51:18PM +0000, David Laight wrote:
> > From: Arvind Sankar
> > > Sent: 25 October 2020 14:31
> > >
> > > Unrolling the LOAD and BLEND loops improves performance by ~8% on x86
From: Arvind Sankar
> Sent: 25 October 2020 14:31
>
> Unrolling the LOAD and BLEND loops improves performance by ~8% on x86_64
> (tested on Broadwell Xeon) while not increasing code size too much.
I can't believe unrolling the BLEND loop makes any difference.
Unrolling the LOAD one might - but y
> > register-constrained architectures. On x86-32 (tested on Broadwell
> > Xeon), this gives a 10% performance benefit.
> >
> > Signed-off-by: Arvind Sankar
> > Suggested-by: David Laight
> > ---
> > lib/crypto/sha256.c | 49 ++
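For readers without the patch at hand, here is a rough userspace sketch of the two shapes of the message-schedule ("BLEND") loop under discussion; the real code lives in lib/crypto/sha256.c, and s0/s1 are the standard SHA-256 sigma functions:

```c
#include <stdint.h>

/* Both versions compute the same W[16..63]; the rotation counts are
 * from FIPS 180-4. */
static inline uint32_t ror32(uint32_t v, int n)
{
	return (v >> n) | (v << (32 - n));
}
static inline uint32_t s0(uint32_t x) { return ror32(x, 7) ^ ror32(x, 18) ^ (x >> 3); }
static inline uint32_t s1(uint32_t x) { return ror32(x, 17) ^ ror32(x, 19) ^ (x >> 10); }

/* Rolled: one schedule word per iteration. */
static void blend_rolled(uint32_t W[64])
{
	for (int i = 16; i < 64; i++)
		W[i] = s1(W[i - 2]) + W[i - 7] + s0(W[i - 15]) + W[i - 16];
}

/* Unrolled by 8, the shape the patch proposes: identical arithmetic,
 * less loop-control overhead per word. */
static void blend_unrolled(uint32_t W[64])
{
	for (int i = 16; i < 64; i += 8) {
		W[i + 0] = s1(W[i - 2]) + W[i - 7] + s0(W[i - 15]) + W[i - 16];
		W[i + 1] = s1(W[i - 1]) + W[i - 6] + s0(W[i - 14]) + W[i - 15];
		W[i + 2] = s1(W[i + 0]) + W[i - 5] + s0(W[i - 13]) + W[i - 14];
		W[i + 3] = s1(W[i + 1]) + W[i - 4] + s0(W[i - 12]) + W[i - 13];
		W[i + 4] = s1(W[i + 2]) + W[i - 3] + s0(W[i - 11]) + W[i - 12];
		W[i + 5] = s1(W[i + 3]) + W[i - 2] + s0(W[i - 10]) + W[i - 11];
		W[i + 6] = s1(W[i + 4]) + W[i - 1] + s0(W[i - 9]) + W[i - 10];
		W[i + 7] = s1(W[i + 5]) + W[i + 0] + s0(W[i - 8]) + W[i - 9];
	}
}
```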
From: Ard Biesheuvel
> Sent: 21 October 2020 20:03
>
> On Wed, 21 Oct 2020 at 20:58, Joe Perches wrote:
> >
> > Like the __section macro, the __alias macro uses
> > macro # stringification to create quotes around
> > the section name used in the __attribute__.
> >
> > Remove the stringification a
From: Arvind Sankar
> Sent: 20 October 2020 21:40
>
> Putting the round constants and the message schedule arrays together in
> one structure saves one register, which can be a significant benefit on
> register-constrained architectures. On x86-32 (tested on Broadwell
> Xeon), this gives a 10% per
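The layout change being described can be sketched as follows; the struct and field names here are illustrative, not the patch's actual identifiers:

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of the idea above: keeping the round constants and the message
 * schedule in one struct lets a single base register address both arrays
 * with fixed offsets, freeing a register on register-starved targets. */
struct sha256_round_state {
	uint32_t K[64];		/* round constants */
	uint32_t W[64];		/* message schedule */
};
/* A round then reads s->K[i] and s->W[i] through one pointer. */
```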
From: Arvind Sankar
> Sent: 20 October 2020 15:07
> To: David Laight
>
> > On Tue, Oct 20, 2020 at 07:41:33AM +0000, David Laight wrote:
> > From: Arvind Sankar
> > > Sent: 19 October 2020 16:30
> > > To: Herbert Xu ; David S. Miller
> > > ; linux-
> >
From: Arvind Sankar
> Sent: 19 October 2020 16:30
> To: Herbert Xu ; David S. Miller
> ; linux-
> cry...@vger.kernel.org
> Cc: linux-ker...@vger.kernel.org
> Subject: [PATCH 4/5] crypto: lib/sha256 - Unroll SHA256 loop 8 times instead
> of 64
>
> This reduces code size substantially (on x86_64 wit
From: Yang Shen
> Sent: 24 August 2020 04:12
>
> Replace 'sprintf' with 'scnprintf' to avoid overrun.
>
> Signed-off-by: Yang Shen
> Reviewed-by: Zhou Wang
> ---
> drivers/crypto/hisilicon/zip/zip_main.c | 11 +++
> 1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/driv
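The reason scnprintf is the safer choice when building a string piecewise can be modelled in userspace; the kernel provides scnprintf itself, and this clamped wrapper only imitates its contract:

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Model of kernel scnprintf: never writes past the buffer and returns
 * the number of characters actually stored (not the would-be length),
 * so an accumulated offset can never run off the end of the buffer. */
static int scnprintf_model(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (size == 0)
		return 0;
	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);
	if (n < 0)
		return 0;
	/* vsnprintf returns the would-be length; clamp to what fit */
	return (size_t)n >= size ? (int)(size - 1) : n;
}
```

With sprintf the same accumulation pattern would keep writing past the end on overflow; here the offset simply stops growing at the buffer boundary.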
From: Eric Dumazet
> Sent: 07 August 2020 19:29
>
> On 8/7/20 2:18 AM, David Laight wrote:
> > From: Eric Dumazet
> >> Sent: 06 August 2020 23:21
> >>
> >> On 7/22/20 11:09 PM, Christoph Hellwig wrote:
> >>> Rework the remaining setsockopt
From: Eric Dumazet
> Sent: 06 August 2020 23:21
>
> On 7/22/20 11:09 PM, Christoph Hellwig wrote:
> > Rework the remaining setsockopt code to pass a sockptr_t instead of a
> > plain user pointer. This removes the last remaining set_fs(KERNEL_DS)
> > outside of architecture specific code.
> >
> >
From: Christoph Hellwig
> Sent: 27 July 2020 17:24
>
> On Mon, Jul 27, 2020 at 06:16:32PM +0200, Jason A. Donenfeld wrote:
> > Maybe sockptr_advance should have some safety checks and sometimes
> > return -EFAULT? Or you should always use the implementation where
> > being a kernel address is an e
From: Al Viro
> Sent: 27 July 2020 14:48
>
> On Mon, Jul 27, 2020 at 09:51:45AM +0000, David Laight wrote:
>
> > I'm sure there is code that processes options in chunks.
> > This probably means it is possible to put a chunk boundary
> > at the end of userspa
From: Ido Schimmel
> Sent: 27 July 2020 13:15
> On Thu, Jul 23, 2020 at 08:09:01AM +0200, Christoph Hellwig wrote:
> > Pass a sockptr_t to prepare for set_fs-less handling of the kernel
> > pointer from bpf-cgroup.
> >
> > Note that the get case is pretty weird in that it actually copies data
> > b
From: David Miller
> Sent: 24 July 2020 23:44
>
> From: Christoph Hellwig
> Date: Thu, 23 Jul 2020 08:08:42 +0200
>
> > setsockopt is the last place in architecture-independent code that still
> > uses set_fs to force the uaccess routines to operate on kernel pointers.
> >
> > This series adds a ne
From: 'Christoph Hellwig'
> Sent: 23 July 2020 15:45
>
> On Thu, Jul 23, 2020 at 02:42:11PM +0000, David Laight wrote:
> > From: Christoph Hellwig
> > > Sent: 23 July 2020 07:09
> > >
> > > The bpfilter user mode helper processes the optval
From: Christoph Hellwig
> Sent: 23 July 2020 07:09
>
> The bpfilter user mode helper processes the optval address using
> process_vm_readv. Don't send it kernel addresses fed under
> set_fs(KERNEL_DS) as that won't work.
What sort of operations is the bpf filter doing on the sockopt buffers?
An
From: Christoph Hellwig
> Sent: 23 July 2020 07:09
>
> This is mostly to prepare for cleaning up the callers, as bpfilter by
> design can't handle kernel pointers.
You've failed to fix the sense of the above...
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes
From: 'Christoph Hellwig'
> Sent: 22 July 2020 09:07
> On Tue, Jul 21, 2020 at 09:38:23AM +0000, David Laight wrote:
> > From: Christoph Hellwig
> > > Sent: 20 July 2020 13:47
> > >
> > > setsockopt is the last place in architecture-independent code
From: Christoph Hellwig
> Sent: 20 July 2020 13:47
>
> setsockopt is the last place in architecture-independent code that still
> uses set_fs to force the uaccess routines to operate on kernel pointers.
>
> This series adds a new sockptr_t type that can contain either a kernel
> or user pointer, a
From: Christoph Hellwig
> Sent: 20 July 2020 13:47
>
> Add a uptr_t type that can hold a pointer to either a user or kernel
> memory region, and simple helpers to copy to and from it. For
> architectures like x86 that have non-overlapping user and kernel
> address space it just is a union and use
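A minimal userspace sketch of the uptr_t/sockptr_t idea follows; on architectures with non-overlapping address spaces the kernel variant needs no discriminator, but this sketch keeps an explicit flag for clarity, and all names here are illustrative rather than the merged API:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A union that carries either a kernel or a user pointer, plus a copy
 * helper that picks the right access method. */
typedef struct {
	union {
		void *kernel;
		void *user;	/* stands in for void __user * */
	};
	bool is_kernel;
} sockptr_sketch_t;

static int copy_from_sockptr_sketch(void *dst, sockptr_sketch_t src, size_t n)
{
	if (src.is_kernel)
		memcpy(dst, src.kernel, n);
	else
		memcpy(dst, src.user, n);  /* real code: copy_from_user() */
	return 0;
}
```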
From: Eric Biggers
> Sent: 20 July 2020 17:38
...
> How does this not introduce a massive security hole when
> CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE?
>
> AFAICS, userspace can pass in a pointer >= TASK_SIZE,
> and this code makes it be treated as a kernel pointer.
One thought I've had is
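The guard Eric's question implies can be sketched like this; if kernel-vs-user is inferred from the address, a user-supplied pointer at or above TASK_SIZE must be rejected at the entry point, overflow included. The constant below is a stand-in, not the real per-architecture TASK_SIZE:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TASK_SIZE_SKETCH ((uintptr_t)0x00007fffffffffffULL)

/* Reject any user range that ends at or beyond the user/kernel split,
 * taking care that ptr + len cannot wrap around. */
static bool user_range_ok(uintptr_t ptr, size_t len)
{
	return len <= TASK_SIZE_SKETCH && ptr <= TASK_SIZE_SKETCH - len;
}
```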
From: Christoph Hellwig
> Sent: 20 July 2020 13:47
>
> setsockopt is the last place in architecture-independent code that still
> uses set_fs to force the uaccess routines to operate on kernel pointers.
>
> This series adds a new sockptr_t type that can contain either a kernel
> or user pointer, an
From: Christoph Hellwig
> Sent: 20 July 2020 13:47
>
> This is mostly to prepare for cleaning up the callers, as bpfilter by
> design can't handle kernel pointers.
^^^ user ??
David
From: Herbert Xu
> Sent: 05 June 2020 13:17
...
> Better yet use strscpy which will even return an error for you.
It really ought to return the buffer length on truncation.
Then you can loop:
	while (...)
		buf += strxxxcpy(buf, src, buf_end - buf);
and only check right at the
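The contract David describes can be sketched in userspace; strcpy_len() below is a hypothetical helper with the return-destination-size-on-truncation behaviour he proposes, not strscpy's actual behaviour (strscpy returns -E2BIG on truncation):

```c
#include <stddef.h>

/* Copies like strscpy, but returns the full destination size when the
 * source did not fit. Concatenation can then advance blindly: once a
 * copy truncates, the cursor lands exactly on buf_end, every later
 * call sees size 0, and a single buf == buf_end test at the end
 * detects the overflow. */
static size_t strcpy_len(char *dst, const char *src, size_t size)
{
	size_t i;

	if (size == 0)
		return 0;
	for (i = 0; i + 1 < size && src[i]; i++)
		dst[i] = src[i];
	dst[i] = '\0';
	return src[i] ? size : i;	/* full size on truncation */
}
```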
From: Herbert Xu
> Sent: 17 June 2019 15:56
> On Mon, Jun 17, 2019 at 04:54:16PM +0200, Arnd Bergmann wrote:
> >
> > Just converting the three testvec_config variables is what I originally
> > had in my patch. It got some configurations below the warning level,
> > but some others still had the pro
From: Ard Biesheuvel
> Sent: 14 June 2019 12:15
> (fix Eric's email address)
>
> On Fri, 14 Jun 2019 at 13:14, Ard Biesheuvel
> wrote:
> >
> > Using a bare block cipher in non-crypto code is almost always a bad idea,
> > not only for security reasons (and we've seen some examples of this in
> >
> A more interesting version would be to generate the lookup table
> for a byte followed by 3 zero bytes.
> You could then run four separate register dependency chains using the
> same 256 entry lookup table.
Not sure that works with a table lookup :-(
David
From: Jeff Lien
> Sent: 10 August 2018 20:12
>
> This patch provides a performance improvement for the CRC16 calculations
> done in read/write workloads using the T10 Type 1/2/3 guard field. For
> example, today with sequential write workloads (one thread/CPU of IO) we
> consume 100% of the C
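For reference, the CRC16 in question (T10-DIF: polynomial 0x8BB7, no reflection, zero init) has a simple bit-at-a-time definition. The patch's speedup comes from table or vector techniques; the sketch below only pins down the function being optimized, it is not the optimized code:

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-serial CRC-16/T10-DIF reference implementation. */
static uint16_t crc_t10dif_ref(const uint8_t *buf, size_t len)
{
	uint16_t crc = 0;

	for (size_t i = 0; i < len; i++) {
		crc ^= (uint16_t)buf[i] << 8;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x8BB7)
					     : (uint16_t)(crc << 1);
	}
	return crc;
}
```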
From: Antoine Tenart
> Sent: 02 May 2018 10:57
> Adds the AEAD_REQUEST_ON_STACK primitive to allow allocating AEAD
> requests on the stack, as it can already be done with various other
> crypto algorithms within the kernel.
>
> Signed-off-by: Antoine Tenart
> ---
> include/crypto/aead.h | 5
From: Salvatore Mesoraca
> Sent: 09 April 2018 17:38
...
> > You can also do much better than allocating MAX_BLOCKSIZE + MAX_ALIGNMASK
> > bytes by requesting 'long' aligned on-stack memory.
> > The easiest way is to define a union like:
> >
> > union crypto_tmp {
> > u8 buf[CRYPTO_MAX_TMP_
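The quoted mail is truncated, but the technique it describes can be sketched independently; the union below uses a stand-in buffer size rather than guessing the elided CRYPTO_* constant:

```c
/* Unioning the byte buffer with an unsigned long gives the whole
 * object 'long' alignment on the stack, with no manual alignment
 * arithmetic and no MAX_ALIGNMASK over-allocation. */
union crypto_tmp_sketch {
	unsigned char buf[64];	/* stand-in for the real max size */
	unsigned long align;
};
```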
From: Salvatore Mesoraca
> Sent: 09 April 2018 14:55
>
> v2:
> As suggested by Herbert Xu, the blocksize and alignmask checks
> have been moved to crypto_check_alg.
> So, now, all the other separate checks are not necessary.
> Also, the defines have been moved to include/cr
From: Eric Biggers
> Sent: 14 March 2018 18:32
...
> Also, I recall there being a long discussion a while back about how
> __aligned(16) doesn't work on local variables because the kernel's stack
> pointer
> isn't guaranteed to maintain the alignment assumed by the compiler (see commit
> b8fbe71f7
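The usual workaround for the problem recalled above: since the stack pointer may keep less than 16-byte alignment, `__aligned(16)` on a local cannot be trusted, so code over-allocates and aligns the pointer by hand. ptr_align16() below imitates the kernel's PTR_ALIGN() for this one case:

```c
#include <stdint.h>

/* Round a pointer up to the next 16-byte boundary. */
static inline void *ptr_align16(void *p)
{
	return (void *)(((uintptr_t)p + 15) & ~(uintptr_t)15);
}
/* usage: u8 raw[LEN + 15]; u8 *buf = ptr_align16(raw); */
```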
From: Juergen Gross
> Sent: 19 December 2017 08:05
..
>
> Exchanging 2 registers can be done without memory access via:
>
> 	xor reg1, reg2
> 	xor reg2, reg1
> 	xor reg1, reg2
That'll generate horrid data dependencies.
ISTR that there are some optimisations for the stack,
so even 'push reg1', 'mov
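The quoted three-XOR swap, written in C: each statement reads the result of the previous one, which is exactly the serial dependency chain David points at, whereas a swap through a temporary (or a push/pop pair) lets the CPU rename registers and overlap the moves:

```c
/* Swap two values with three XORs; correct, but fully serialized. */
static void xor_swap(unsigned long *a, unsigned long *b)
{
	*a ^= *b;
	*b ^= *a;
	*a ^= *b;
}
```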
From: Harsh Jain
> Sent: 03 October 2017 07:46
> It multiply GF(2^128) elements in the ble format.
> It will be used by chelsio driver to fasten gf multiplication.
^ speed up ??
David
From: Logan Gunthorpe
> Sent: 13 April 2017 23:05
> Straightforward conversion to the new helper, except due to
> the lack of error path, we have to warn if unmappable memory
> is ever present in the sgl.
>
> Signed-off-by: Logan Gunthorpe
> ---
> drivers/block/xen-blkfront.c | 33 +++
From: Daniel Axtens
> Sent: 15 March 2017 22:30
> Hi David,
>
> > While not part of this change, the unrolled loops look as though
> > they just destroy the cpu cache.
> > I'd like be convinced that anything does CRC over long enough buffers
> > to make it a gain at all.
> >
> > With modern (not t
From: Daniel Axtens
> Sent: 15 March 2017 12:38
> The core nuts and bolts of the crc32c vpmsum algorithm will
> also work for a number of other CRC algorithms with different
> polynomials. Factor out the function into a new asm file.
>
> To handle multiple users of the function, a use
From: Andy Lutomirski
> Sent: 10 January 2017 23:25
> There are some hashes (e.g. sha224) that have some internal trickery
> to make sure that only the correct number of output bytes are
> generated. If something goes wrong, they could potentially overrun
> the output buffer.
>
> Make the test mo
From: George Spelvin
> Sent: 17 December 2016 15:21
...
> uint32_t
> hsiphash24(char const *in, size_t len, uint32_t const key[2])
> {
> 	uint32_t c = key[0];
> 	uint32_t d = key[1];
> 	uint32_t a = 0x6c796765 ^ 0x736f6d65;
> 	uint32_t b = d ^ 0x74656462 ^ 0x646f7261;
I've
From: George Spelvin
> Sent: 15 December 2016 23:29
> > If a halved version of SipHash can bring significant performance boost
> > (with 32b words instead of 64b words) with an acceptable security level
> > (64-bit enough?) then we may design such a version.
>
> I was thinking if the key could be
From: Jason A. Donenfeld
> Sent: 15 December 2016 20:30
> These restore parity with the jhash interface by providing high
> performance helpers for common input sizes.
...
> +#define PREAMBLE(len) \
> + u64 v0 = 0x736f6d6570736575ULL; \
> + u64 v1 = 0x646f72616e646f6dULL; \
> + u64 v2 =
From: Jason A. Donenfeld
> Sent: 15 December 2016 20:30
> This gives a clear speed and security improvement. Siphash is both
> faster and is more solid crypto than the aging MD5.
>
> Rather than manually filling MD5 buffers, for IPv6, we simply create
> a layout by a simple anonymous struct, for w
From: Hannes Frederic Sowa
> Sent: 15 December 2016 14:57
> On 15.12.2016 14:56, David Laight wrote:
> > From: Hannes Frederic Sowa
> >> Sent: 15 December 2016 12:50
> >> On 15.12.2016 13:28, David Laight wrote:
> >>> From: Hannes Frederi
From: Hannes Frederic Sowa
> Sent: 15 December 2016 12:50
> On 15.12.2016 13:28, David Laight wrote:
> > From: Hannes Frederic Sowa
> >> Sent: 15 December 2016 12:23
> > ...
> >> Hmm? Even the Intel ABI expects alignment of unsigned long long to be 8
From: Hannes Frederic Sowa
> Sent: 15 December 2016 12:23
...
> Hmm? Even the Intel ABI expects alignment of unsigned long long to be 8
> bytes on 32 bit. Do you question that?
Yes.
The linux ABI for x86 (32 bit) only requires 32bit alignment for u64 (etc).
David
From: Hannes Frederic Sowa
> Sent: 14 December 2016 22:03
> On 14.12.2016 13:46, Jason A. Donenfeld wrote:
> > Hi David,
> >
> > On Wed, Dec 14, 2016 at 10:56 AM, David Laight
> > wrote:
> >> ...
> >>> +u64 siphash24(const u8 *data
From: Linus Torvalds
> Sent: 15 December 2016 00:11
> On Wed, Dec 14, 2016 at 3:34 PM, Jason A. Donenfeld wrote:
> >
> > Or does your reasonable dislike of "word" still allow for the use of
> > dword and qword, so that the current function names of:
>
> dword really is confusing to people.
>
> If
From: Jason A. Donenfeld
> Sent: 14 December 2016 18:46
...
> + ret = *chaining = siphash24((u8 *)&combined, offsetof(typeof(combined),
> end),
If you make the first argument 'const void *' you won't need the cast
on every call.
I'd also suggest making the key u64[2].
Davi
From: Jason A. Donenfeld
> Sent: 14 December 2016 13:44
> To: Hannes Frederic Sowa
> > __packed not only removes all padding of the struct but also changes the
> > alignment assumptions for the whole struct itself. The rule, the struct
> > is aligned by its maximum alignment of a member is no longe
From: Andy Lutomirski
> Sent: 12 December 2016 20:53
> The driver put a constant buffer of all zeros on the stack and
> pointed a scatterlist entry at it in two places. This doesn't work
> with virtual stacks. Use a static 16-byte buffer of zeros instead.
...
I didn't think you could dma from st
From: Paulo Flabiano Smorigo
> Sent: 19 July 2016 14:36
> Ignore assembly files generated by the perl script.
...
> diff --git a/drivers/crypto/vmx/.gitignore b/drivers/crypto/vmx/.gitignore
> new file mode 100644
> index 0000000..af4a7ce
> --- /dev/null
> +++ b/drivers/crypto/vmx/.gitignore
> @@
From: Paulo Flabiano Smorigo
> Sent: 11 July 2016 20:08
>
> This patch adds XTS subroutines using the VMX-crypto driver.
>
> It gives a boost of 20 times using XTS.
>
> This code has been adapted from the OpenSSL project in collaboration
> with the original author (Andy Polyakov).
Yep, typical openssl
From: Sowmini Varadhan
> Sent: 01 December 2015 18:37
...
> I was using esp-null merely to not have the crypto itself perturb
> the numbers (i.e., just focus on the s/w overhead for now), but here
> are the numbers for the stock linux kernel stack
> Gbps peak cpu util
> esp-null
From: Sowmini Varadhan
> Sent: 02 December 2015 12:12
> On (12/02/15 11:56), David Laight wrote:
> > >                 Gbps   peak cpu util
> > > esp-null        1.8    71%
> > > aes-gcm-c-256   1.6    79%
> > > aes-ccm-a-128   0.7    96%
> > >
From: Christophe Leroy
> Linux CodingStyle recommends using short names for local
> variables. ptr is just good enough for those 3-line functions.
> It helps keep single lines shorter than 80 characters.
...
> -static void to_talitos_ptr(struct talitos_ptr *talitos_ptr, dma_addr_t
> dma_add
From: Markus Stockhausen
> [PATCH v1 1/3] SHA1 for PPC/SPE - assembler
>
> This is the assembler code for SHA1 implementation with
> the SIMD SPE instruction set. With the enhanced instruction
> set we can operate on 2 32 bit words in parallel. That helps
> reducing the time to calculate W16-W79.
From: Markus Stockhausen
> 4K AES tables for big endian
I can't help feeling that you could give more information about how the
values are generated.
...
> + * These big endian AES encryption/decryption tables are designed to be
> simply
> + * accessed by a combination of rlwimi/lwz instruction
> By the way, I suspect previous code was chosen years ago because this
> version uses less stack but adds much more code bloat.
>
> size crypto/sha512_generic.o crypto/sha512_generic_old.o
> >    text    data     bss     dec     hex filename
> >   17369     704       0   18073    4699 c
Doesn't this badly overflow W[] ..
> +#define SHA512_0_15(i, a, b, c, d, e, f, g, h) \
> + t1 = h + e1(e) + Ch(e, f, g) + sha512_K[i] + W[i]; \
...
> + for (i = 0; i < 16; i += 8) {
...
> + SHA512_0_15(i + 7, b, c, d, e, f, g, h, a);
> + }
David
> Trying a dynamic memory allocation, and fallback on a single
> pre-allocated block of memory, shared by all cpus, protected by a
> spinlock
...
> -
> + static u64 msg_schedule[80];
> + static DEFINE_SPINLOCK(msg_schedule_lock);
> int i;
> - u64 *W = get_cpu_var(msg_schedule);
>
> Good catch. It can be generalized to any interrupts (soft and hard)
>
> Another solution is using two blocks, one used from interrupt context.
>
> static DEFINE_PER_CPU(u64[80], msg_schedule);
> static DEFINE_PER_CPU(u64[80], msg_schedule_irq);
>
> (Like we do for SNMP mibs on !x86 arches)
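A userspace sketch of the two-buffer idea quoted above: one scratch array for process context and a separate one for interrupt context, chosen by context, so an interrupt taken mid-hash cannot scribble on the buffer the interrupted code is using. The boolean parameter stands in for the kernel's in_interrupt() test on a per-CPU pair:

```c
#include <stdbool.h>
#include <stdint.h>

static uint64_t msg_schedule_sketch[80];
static uint64_t msg_schedule_irq_sketch[80];

/* Pick the scratch buffer by execution context. */
static uint64_t *get_msg_schedule(bool in_irq)
{
	return in_irq ? msg_schedule_irq_sketch : msg_schedule_sketch;
}
```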