Hi Ted,
On Thu, Dec 22, 2016 at 6:41 AM, Theodore Ts'o wrote:
> The bottom line is that I think we're really "pixel peeping" at this
> point --- which is what obsessed digital photographers will do when
> debating the quality of a Canon vs Nikon DSLR by blowing up a photo by
> a thousand times, a
On Wed, Dec 21, 2016 at 9:01 PM, George Spelvin
wrote:
> Andy Lutomirski wrote:
>> I don't even think it needs that. This is just adding a
>> non-destructive final operation, right?
>
> It is, but the problem is that SipHash is intended for *small* inputs,
> so the standard implementations aren't
On Thu, Dec 22, 2016 at 03:49:39AM +0100, Jason A. Donenfeld wrote:
>
> Funny -- while you guys were sending this back & forth, I was writing
> my reply to Andy which essentially arrives at the same conclusion.
> Given that we're all arriving at the same thing, and that Ted shot in
> this directio
Andy Lutomirski wrote:
> I don't even think it needs that. This is just adding a
> non-destructive final operation, right?
It is, but the problem is that SipHash is intended for *small* inputs,
so the standard implementations aren't broken into init/update/final
functions.
There's just one big f
Hi George,
On Thu, Dec 22, 2016 at 4:55 AM, George Spelvin
wrote:
> Do we have to go through this? No, the benchmark was *not* bogus.
> Then I replaced the kernel #includes with the necessary typedefs
> and #defines to make it compile in user-space.
> * I didn't iterate 100K times, I timed the f
> Plus the benchmark was bogus anyway, and when I built a more specific
> harness -- actually comparing the TCP sequence number functions --
> SipHash was faster than MD5, even on register starved x86. So I think
> we're fine and this chapter of the discussion can come to a close, in
> order to mov
On Thu, Dec 22, 2016 at 3:49 AM, Jason A. Donenfeld wrote:
> I did have two objections to this. The first was that my SipHash
> construction is faster. But in any case, they're both faster than the
> current MD5, so it's just extra rice. The second, and the more
> important one, was that batching
Hi Andy & Hannes,
On Thu, Dec 22, 2016 at 3:07 AM, Hannes Frederic Sowa
wrote:
> I wonder if Ted's proposal was analyzed further in terms of performance
> if get_random_int should provide cprng alike properties?
>
> For reference: https://lkml.org/lkml/2016/12/14/351
>
> The proposal made sense t
> On Wed, Dec 21, 2016 at 5:13 PM, George Spelvin
>> After some thinking, I still like the "state-preserving" construct
>> that's equivalent to the current MD5 code. Yes, we could just do
>> siphash(current_cpu || per_cpu_counter, global_key), but it's nice to
>> preserve a bit more.
>>
>> It requ
Hi Andy,
On Thu, Dec 22, 2016 at 12:42 AM, Andy Lutomirski wrote:
> So this is probably good enough, and making it better is hard. Changing it
> to:
>
> u64 entropy = (u64)random_get_entropy() + current->pid;
> result = siphash(..., entropy, ...);
> secret->chaining += result + entropy;
>
> wou
On Wed, Dec 21, 2016 at 6:07 PM, Hannes Frederic Sowa
wrote:
> On 22.12.2016 00:42, Andy Lutomirski wrote:
>> On Wed, Dec 21, 2016 at 3:02 PM, Jason A. Donenfeld wrote:
>>> unsigned int get_random_int(void)
>>> {
>>> - __u32 *hash;
>>> - unsigned int ret;
>>> -
>>> - if (arch_
On Wed, Dec 21, 2016 at 5:13 PM, George Spelvin
wrote:
> As a separate message, to disentangle the threads, I'd like to
> talk about get_random_long().
>
> After some thinking, I still like the "state-preserving" construct
> that's equivalent to the current MD5 code. Yes, we could just do
> sipha
On 22.12.2016 00:42, Andy Lutomirski wrote:
> On Wed, Dec 21, 2016 at 3:02 PM, Jason A. Donenfeld wrote:
>> unsigned int get_random_int(void)
>> {
>> - __u32 *hash;
>> - unsigned int ret;
>> -
>> - if (arch_get_random_int(&ret))
>> - return ret;
>> -
>> - ha
On Wed, Dec 21, 2016 at 9:25 AM, Linus Torvalds
wrote:
> On Wed, Dec 21, 2016 at 7:55 AM, George Spelvin
> wrote:
>>
>> How much does kernel_fpu_begin()/kernel_fpu_end() cost?
>
> It's now better than it used to be, but it's absolutely disastrous
> still. We're talking easily many hundreds of cyc
On Thu, Dec 22, 2016 at 2:40 AM, Stephen Hemminger
wrote:
> The networking tree (net-next) which is where you are submitting to is
> technically closed right now.
That's okay. At some point in the future it will be open. By then v83
of this patch set will be shiny and done, just waiting for th
On Thu, 22 Dec 2016 00:02:11 +0100
"Jason A. Donenfeld" wrote:
> SipHash is a 64-bit keyed hash function that is actually a
> cryptographically secure PRF, like HMAC. Except SipHash is super fast,
> and is meant to be used as a hashtable keyed lookup function, or as a
> general PRF for short inpu
As a separate message, to disentangle the threads, I'd like to
talk about get_random_long().
After some thinking, I still like the "state-preserving" construct
that's equivalent to the current MD5 code. Yes, we could just do
siphash(current_cpu || per_cpu_counter, global_key), but it's nice to
pr
> 64-bit x86_64:
> [0.509409] test_siphash: SipHash2-4 cycles: 4049181
> [0.510650] test_siphash: SipHash1-3 cycles: 2512884
> [0.512205] test_siphash: HalfSipHash1-3 cycles: 3429920
> [0.512904] test_siphash: JenkinsHash cycles: 978267
I'm not sure what these numbers m
Theodore Ts'o wrote:
> On Wed, Dec 21, 2016 at 01:37:51PM -0500, George Spelvin wrote:
>> SipHash annihilates the competition on 64-bit superscalar hardware.
>> SipHash dominates the field on 64-bit in-order hardware.
>> SipHash wins easily on 32-bit hardware *with enough registers*.
>> On register
On Wed, Dec 21, 2016 at 3:02 PM, Jason A. Donenfeld wrote:
> unsigned int get_random_int(void)
> {
> - __u32 *hash;
> - unsigned int ret;
> -
> - if (arch_get_random_int(&ret))
> - return ret;
> -
> - hash = get_cpu_var(get_random_int_hash);
> -
> - ha
Hi Ted,
On Thu, Dec 22, 2016 at 12:02 AM, Jason A. Donenfeld wrote:
> This duplicates the current algorithm for get_random_int/long
I should have mentioned this directly in the commit message, which I
forgot to update: this v7 adds the time-based key rotation, which,
while not strictly necessary
HalfSipHash, or hsiphash, is a shortened version of SipHash, which
generates 32-bit outputs using a weaker 64-bit key. It has *much* lower
security margins, and shouldn't be used for anything too sensitive, but
it could be used as a hashtable key function replacement, if the output
is never exposed
SHA1 is slower and less secure than SipHash, and so replacing syncookie
generation with SipHash makes natural sense. Some BSDs have been doing
this for several years in fact.
The speedup should be similar to -- and even more impressive than -- the
speedup from the sequence number fix in this series.
S
This gives a clear speed and security improvement. SipHash is both
faster and more solid crypto than the aging MD5.
Rather than manually filling MD5 buffers, for IPv6, we simply create
a layout by a simple anonymous struct, for which gcc generates
rather efficient code. For IPv4, we pass the va
This duplicates the current algorithm for get_random_int/long, but uses
siphash instead. This comes with several benefits. It's certainly
faster and more cryptographically secure than MD5. This patch also
separates hashed fields into three values instead of one, in order to
increase diffusion.
The
SipHash is a 64-bit keyed hash function that is actually a
cryptographically secure PRF, like HMAC. Except SipHash is super fast,
and is meant to be used as a hashtable keyed lookup function, or as a
general PRF for short input use cases, such as sequence numbers or RNG
chaining.
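As a rough illustration of the keyed-PRF idea described above, here is a
minimal user-space sketch of SipHash-2-4 for a byte buffer. It follows the
published SipHash design (two compression rounds per 8-byte word, four
finalization rounds), but it is not the kernel implementation from this
series: the names siphash24_sketch, sipround and load_le64 are made up for
this example, and no attempt is made at the alignment or unrolling tricks
a real implementation would use.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate a 64-bit word left by b bits (0 < b < 64). */
static uint64_t rotl64(uint64_t x, unsigned int b)
{
	return (x << b) | (x >> (64 - b));
}

/* Read 8 bytes as a little-endian 64-bit word, independent of host order. */
static uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}

/* One SipRound over the four-word internal state. */
static void sipround(uint64_t v[4])
{
	v[0] += v[1]; v[1] = rotl64(v[1], 13); v[1] ^= v[0]; v[0] = rotl64(v[0], 32);
	v[2] += v[3]; v[3] = rotl64(v[3], 16); v[3] ^= v[2];
	v[0] += v[3]; v[3] = rotl64(v[3], 21); v[3] ^= v[0];
	v[2] += v[1]; v[1] = rotl64(v[1], 17); v[1] ^= v[2]; v[2] = rotl64(v[2], 32);
}

/* Hypothetical helper name: SipHash-2-4 of data[0..len) under a 128-bit key. */
uint64_t siphash24_sketch(const uint8_t *data, size_t len, const uint8_t key[16])
{
	uint64_t k0 = load_le64(key), k1 = load_le64(key + 8);
	uint64_t v[4] = {
		k0 ^ 0x736f6d6570736575ULL,
		k1 ^ 0x646f72616e646f6dULL,
		k0 ^ 0x6c7967656e657261ULL,
		k1 ^ 0x7465646279746573ULL,
	};
	size_t full = len & ~(size_t)7;	/* bytes covered by whole 8-byte words */
	uint64_t m;

	/* Compression: absorb each full word with two rounds. */
	for (size_t i = 0; i < full; i += 8) {
		m = load_le64(data + i);
		v[3] ^= m;
		sipround(v);
		sipround(v);
		v[0] ^= m;
	}

	/* Final word: leftover bytes, with (len mod 256) in the top byte. */
	m = (uint64_t)len << 56;
	for (size_t i = 0; i < (len & 7); i++)
		m |= (uint64_t)data[full + i] << (8 * i);
	v[3] ^= m;
	sipround(v);
	sipround(v);
	v[0] ^= m;

	/* Finalization: four rounds, then fold the state into 64 bits. */
	v[2] ^= 0xff;
	for (int i = 0; i < 4; i++)
		sipround(v);
	return v[0] ^ v[1] ^ v[2] ^ v[3];
}
```

With the key bytes 00..0f and the 15-byte message 00..0e, this should
reproduce the test vector from the SipHash paper, 0xa129ca6149be45e5. The
whole hash is one pass with no separate init/update/final steps, which
matches the "short input" use case the commit message has in mind.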
For the first usa
Hey folks,
Again we've made huge progress, with this latest version now shipping
Jean-Philippe Aumasson's HalfSipHash, which should be much more competitive
with jhash (in addition to being more secure, of course).
There are dozens of little cleanups and improvements right and left throughout
thi
The md5_transform function is no longer used anywhere in the tree,
except for the crypto api's actual implementation of md5, so we can drop
the function from lib and put it as a static function of the crypto
file, where it belongs. There should be no new users of md5_transform,
anyway, since there
On Wed, Dec 21, 2016 at 11:27 PM, Theodore Ts'o wrote:
> And "with enough registers" includes ARM and MIPS, right? So the only
> real problem is 32-bit x86, and you're right, at that point, only
> people who might care are people who are using a space-radiation
> hardened 386 --- and they're not
On Wed, Dec 21, 2016 at 01:37:51PM -0500, George Spelvin wrote:
> SipHash annihilates the competition on 64-bit superscalar hardware.
> SipHash dominates the field on 64-bit in-order hardware.
> SipHash wins easily on 32-bit hardware *with enough registers*.
> On register-starved 32-bit machines, i
Christopher Covington reported a crash on aarch64 on recent Fedora
kernels:
kernel BUG at ./include/linux/scatterlist.h:140!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
Modules linked in:
CPU: 2 PID: 752 Comm: cryptomgr_test Not tainted 4.9.0-11815-ge93b1cc #162
Hardware name: linux,dummy-virt
On Fri, Dec 16, 2016 at 07:41:23PM +0530, abed mohammad kamaluddin wrote:
>
> Thanks Herbert. Are there timelines or ongoing efforts for moving
> IPcomp/Ipsec to use acomp? Or any proposals that have been or need to
> be taken up in this regard.
Someone needs to write the patches :)
--
Email: He
On Wed, Dec 21, 2016 at 7:37 PM, George Spelvin
wrote:
> SipHash annihilates the competition on 64-bit superscalar hardware.
> SipHash dominates the field on 64-bit in-order hardware.
> SipHash wins easily on 32-bit hardware *with enough registers*.
> On register-starved 32-bit machines, it really
Eric Dumazet wrote:
> Now I am quite confused.
>
> George said :
>> Cycles per byte on 1024 bytes of data:
>>                  Pentium  Core 2  Ivy
>>                  4        Duo     Bridge
>> SipHash-2-4      38.9     8.3     5.8
>> HalfSipHash-2-4  12.7     4.5     3.2
>> MD5
Linus wrote:
>> How much does kernel_fpu_begin()/kernel_fpu_end() cost?
>
> It's now better than it used to be, but it's absolutely disastrous
> still. We're talking easily many hundreds of cycles. Under some loads,
> thousands.
I think I've been thoroughly dissuaded, but just to clarify one thing
On Wed, Dec 21, 2016 at 7:55 AM, George Spelvin
wrote:
>
> How much does kernel_fpu_begin()/kernel_fpu_end() cost?
It's now better than it used to be, but it's absolutely disastrous
still. We're talking easily many hundreds of cycles. Under some loads,
thousands.
And I warn you already: it will
On Wed, 2016-12-21 at 11:39 -0500, Rik van Riel wrote:
> Does anybody still have a P4?
>
> If they do, they're probably better off replacing
> it with an Atom. The reduced power bills will pay
> for replacing that P4 within a year or two.
Well, maybe they have millions of units to replace.
>
>
On Mon, Dec 19, 2016 at 03:37:26PM -0800, Laura Abbott wrote:
> Christopher Covington reported a crash on aarch64 on recent Fedora
> kernels:
>
> kernel BUG at ./include/linux/scatterlist.h:140!
> Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> Modules linked in:
> CPU: 2 PID: 752 Comm: cryptomgr
On Mon, Dec 19, 2016 at 04:08:11PM +0530, Harsh Jain wrote:
> Hi Herbert,
>
> TLS default mode of operation is MAC-then-Encrypt for Authenc algos.
> Currently framework only supports EtM used in IPSec. User space
> programs like openssl cannot use af-alg interface to encrypt/decrypt
> in TLS mode.
On Wed, 2016-12-21 at 10:55 -0500, George Spelvin wrote:
> Actually, DJB just made a very relevant suggestion.
>
> As I've mentioned, the 32-bit performance problems are an x86-specific
> problem. ARM does very well, and other processors aren't bad at all.
>
> SipHash fits very nicely (and ru
On Wed, 2016-12-21 at 07:56 -0800, Eric Dumazet wrote:
> On Wed, 2016-12-21 at 15:42 +0100, Jason A. Donenfeld wrote:
> George said :
>
> > Cycles per byte on 1024 bytes of data:
> >                  Pentium  Core 2  Ivy
> >                  4        Duo     Bridge
> > SipHash-2-4
Hi Eric,
On Wed, Dec 21, 2016 at 4:56 PM, Eric Dumazet wrote:
> That really was for 1024 bytes blocks, so pretty much useless for our
> discussion ?
>
> Reading your numbers last week, I thought SipHash was faster, but George
> numbers are giving the opposite impression.
>
> I do not have a P4 to
Hi George,
On Wed, Dec 21, 2016 at 4:55 PM, George Spelvin
wrote:
> Actually, DJB just made a very relevant suggestion.
>
> As I've mentioned, the 32-bit performance problems are an x86-specific
> problem. ARM does very well, and other processors aren't bad at all.
>
> SipHash fits very nicely (
On Wed, 2016-12-21 at 15:42 +0100, Jason A. Donenfeld wrote:
> Hi Eric,
>
> I computed performance numbers for both 32-bit and 64-bit using the
> actual functions in which we're talking about replacing MD5 with SipHash.
> The basic harness is here [1] if you're curious. SipHash was a pretty
> clear winn
Actually, DJB just made a very relevant suggestion.
As I've mentioned, the 32-bit performance problems are an x86-specific
problem. ARM does very well, and other processors aren't bad at all.
SipHash fits very nicely (and runs very fast) in the MMX registers.
They're 64 bits, and there are 8 of
Hi Eric,
I computed performance numbers for both 32-bit and 64-bit using the
actual functions in which we're talking about replacing MD5 with SipHash.
The basic harness is here [1] if you're curious. SipHash was a pretty
clear winner for both cases.
x86_64:
[1.714302] secure_tcpv6_sequence_number_m
Hi George,
On Wed, Dec 21, 2016 at 7:34 AM, George Spelvin
wrote:
> In fact, I have an idea. Allow me to make the following concrete
> suggestion for using HalfSipHash with 128 bits of key material:
>
> - 64 bits are used as the key.
> - The other 64 bits are used as an IV which is prepended to
Hello
I have some comments inline
On Wed, Dec 21, 2016 at 11:56:12AM +, george.cher...@cavium.com wrote:
> From: George Cherian
>
> Enable the CPT VF driver. CPT is the cryptographic Accelaration Unit
typo acceleration
[...]
> +static inline void update_input_data(struct cpt_request_info *
Hello
I have some comments inline
On Wed, Dec 21, 2016 at 11:56:11AM +, george.cher...@cavium.com wrote:
> From: George Cherian
>
> Enable the Physical Function diver for the Cavium Crypto Engine (CPT)
typo driver
> found in Octeon-tx series of SoC's. CPT is the Cryptographic Acceleration
On 12/20/2016 10:41 AM, Binoy Jayan wrote:
> At a high level the goal is to maximize the size of data blocks that get
> passed to hardware accelerators, minimizing the overhead from setting up
> and tearing
> down operations in the hardware. Currently dm-crypt itself is a big blocker as
> it manua
From: George Cherian
Add the CPT options in crypto Kconfig and update the
crypto Makefile
Signed-off-by: George Cherian
Reviewed-by: David Daney
---
drivers/crypto/Kconfig | 1 +
drivers/crypto/Makefile | 1 +
2 files changed, 2 insertions(+)
diff --git a/drivers/crypto/Kconfig b/drivers/cr
From: George Cherian
Enable the CPT VF driver. CPT is the cryptographic Accelaration Unit
in Octeon-tx series of processors.
Signed-off-by: George Cherian
Reviewed-by: David Daney
---
drivers/crypto/cavium/cpt/Makefile | 3 +-
drivers/crypto/cavium/cpt/cptvf.h| 135 +++
From: George Cherian
Enable the Physical Function diver for the Cavium Crypto Engine (CPT)
found in Octeon-tx series of SoC's. CPT is the Cryptographic Acceleration
Unit. CPT includes microcoded GigaCypher symmetric engines (SEs) and
asymmetric engines (AEs).
Signed-off-by: George Cherian
Revie
From: George Cherian
This series adds support for the Cavium Cryptographic Acceleration Unit (CPT).
CPT is available in Cavium's Octeon-Tx SoC series.
The series was tested with ecryptfs and