Re: [PATCH v4] x86, crypto: ported aes-ni implementation to x86

2010-11-17 Thread Huang Ying
tion again > > * added alignment constraints for internal functions. > > > > > > arch/x86/crypto/aesni-intel_asm.S | 197 > > ++++++------ > > arch/x86/crypto/aesni-intel_glue.c | 22 +++- > > crypto/Kconfig | 12 ++-

Re: [PATCH v3] x86, crypto: ported aes-ni implementation to x86

2010-11-11 Thread Huang Ying
On Fri, 2010-11-12 at 15:30 +0800, Mathias Krause wrote: > On 12.11.2010, 01:33 Huang Ying wrote: > > Hi, Mathias, > > > > On Fri, 2010-11-12 at 06:18 +0800, Mathias Krause wrote: > >> All test were run five times in a row using a 256 bit key and doing i/o >

Re: [PATCH v3] x86, crypto: ported aes-ni implementation to x86

2010-11-11 Thread Huang Ying
57.5 255.3 > > Comparing the mean values gives us: > > x86: i586 aes-ni delta > ECB: 93.8 123.3 +31.4% Why the improvement of ECB is so small? I can not understand it. It should be as big as CBC. Best Regards, Huang Ying > CBC: 84.8 262.3 +209.3% >

Re: [PATCH v3] x86, crypto: ported aes-ni implementation to x86

2010-11-04 Thread Huang Ying
On Thu, 2010-11-04 at 00:38 -0700, Mathias Krause wrote: > On 03.11.2010, 23:27 Huang Ying wrote: > > On Wed, 2010-11-03 at 14:14 -0700, Mathias Krause wrote: > >> The AES-NI instructions are also available in legacy mode so the 32-bit > >> architecture may profit fro

Re: [PATCH v3] x86, crypto: ported aes-ni implementation to x86

2010-11-03 Thread Huang Ying
o test? Or you can add test_acipher_speed (similar with test_ahash_speed) to test cipher in asynchronous mode. Best Regards, Huang Ying -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: AES-NI for x86?

2010-10-25 Thread Huang Ying
't exist unless someone writes one. Are you volunteering? :) > > Ccing the author in case there is some reason why it won't work in > 32-bit mode. It can be done in 32-bit mode. We have not done that simply because we still have no time to work on that. You are welcome to do th

Re: [BUGFIX] Fix AES-NI CTR optimization compiling failure with gas 2.16.1

2010-03-23 Thread Huang Ying
Hi, Andrew, On Wed, 2010-03-24 at 05:23 +0800, Andrew Morton wrote: > On Fri, 12 Mar 2010 15:01:47 +0800 > Huang Ying wrote: > > > Andrew Morton reported that AES-NI CTR optimization failed to compile > > with gas 2.16.1, the error message is as follow: > >

[BUGFIX] Fix AES-NI CTR optimization compiling failure with gas 2.16.1

2010-03-11 Thread Huang Ying
S:753: Error: suffix or operands invalid for `movq' To fix this, a gas macro is defined to assemble movq with 64bit general purpose registers and XMM registers. The macro will generate the raw .byte sequence for needed instructions. Reported-by: Andrew Morton Signed-off-by: Huang Ying ---

[PATCH -v2] Speed testing support for ghash

2010-03-02 Thread Huang Ying
Because ghash needs setkey, the setkey and keysize template support for test_hash_speed is added. v2: - Move klen into struct hash_speed. Signed-off-by: Huang Ying --- crypto/tcrypt.c |7 +++ crypto/tcrypt.h | 29 + 2 files changed, 36 insertions

[PATCH] crypto, Speed testing support for ghash

2010-03-01 Thread Huang Ying
Because ghash needs setkey, the setkey and keysize template support for test_hash_speed is added. Signed-off-by: Huang Ying --- crypto/tcrypt.c | 122 +++- crypto/tcrypt.h |1 2 files changed, 78 insertions(+), 45 deletions(-) --- a

[PATCH] crypto: Add AES-NI accelerated CTR mode

2010-02-28 Thread Huang Ying
% reduction of encryption/decryption time. Signed-off-by: Huang Ying --- arch/x86/crypto/aesni-intel_asm.S | 115 arch/x86/crypto/aesni-intel_glue.c | 130 +++-- 2 files changed, 238 insertions(+), 7 deletions(-) --- a/arch/x86

[PATCH 1/2] x86, crypto, Use gas macro for AES-NI instructions

2009-11-09 Thread Huang Ying
-by: Huang Ying --- arch/x86/crypto/aesni-intel_asm.S | 517 -- 1 file changed, 173 insertions(+), 344 deletions(-) --- a/arch/x86/crypto/aesni-intel_asm.S +++ b/arch/x86/crypto/aesni-intel_asm.S @@ -16,6 +16,7 @@ */ #include +#include .text

[PATCH 2/2] x86, crypto, Use gas macro for PCLMULQDQ-NI and PSHUFB

2009-11-09 Thread Huang Ying
: Huang Ying --- arch/x86/crypto/ghash-clmulni-intel_asm.S | 29 ++--- 1 file changed, 10 insertions(+), 19 deletions(-) --- a/arch/x86/crypto/ghash-clmulni-intel_asm.S +++ b/arch/x86/crypto/ghash-clmulni-intel_asm.S @@ -17,7 +17,7 @@ */ #include -#include

[BUGFIX -v3 for .32] crypto, gcm, fix another complete call in complete fuction

2009-11-09 Thread Huang Ying
ach complete(), which accept struct aead_request *req instead of areq, so avoid using areq after it is destroyed. - Expand complete_for_next_step(). The fixing method is based on the idea of Herbert Xu. Signed-off-by: Huang Ying --- crypto/gcm.c |

Re: [BUGFIX -v2 for .32] crypto, gcm, fix another complete call in complete fuction

2009-11-09 Thread Huang Ying
On Tue, 2009-11-10 at 11:10 +0800, Herbert Xu wrote: > On Tue, Nov 10, 2009 at 10:49:59AM +0800, Huang Ying wrote: > > > > Yes. This is for performance only. Because crypto_gcm_reqctx(req) is not > > so trivial (it needs access tfm), and used by every xxx_done function,

Re: [PATCH 2/2] x86, crypto, Use gas macro for AES-NI instructions

2009-11-09 Thread Huang Ying
On Tue, 2009-11-10 at 02:56 +0800, Herbert Xu wrote: > On Thu, Nov 05, 2009 at 02:44:17PM +0800, Huang Ying wrote: > > Old binutils do not support AES-NI instructions, to make kernel can be > > compiled by them, .byte code is used instead of AES-NI assembly > > instructions

Re: [BUGFIX -v2 for .32] crypto, gcm, fix another complete call in complete fuction

2009-11-09 Thread Huang Ying
On Tue, 2009-11-10 at 03:02 +0800, Herbert Xu wrote: > On Mon, Nov 09, 2009 at 03:24:14PM +0800, Huang Ying wrote: > > The flow of the complete function (xxx_done) in gcm.c is as follow: > > Thanks the patch looks pretty good overall. > > > -static void gc

[BUGFIX -v2 for .32] crypto, gcm, fix another complete call in complete fuction

2009-11-08 Thread Huang Ying
ach complete(), which accept struct aead_request *req instead of areq, so avoid using areq after it is destroyed. - Expand complete_for_next_step(). The fixing method is based on the idea of Herbert Xu. Signed-off-by: Huang Ying --- crypto/gcm.c |

Re: [BUGFIX for .32] crypto, gcm, fix another complete call in complete fuction

2009-11-03 Thread Huang Ying
On Tue, 2009-11-03 at 23:53 +0800, Herbert Xu wrote: > On Tue, Nov 03, 2009 at 10:40:17AM +0800, Huang Ying wrote: > > The flow of the complete function (xxx_done) in gcm.c is as follow: > > > > void complete(struct crypto_async_request *areq, int err)

Re: [PATCH -v4] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-11-02 Thread Huang Ying
> > > I'm happy to revisit this once inst.h exists. > > No reason to not do most of the change first though, the way i suggested > it. How about something as below? But it seems not appropriate to put these bits into i387.h, that is, to combine C and gas syntax. Best Reg

[BUGFIX for crypto-dev] crypto, Fix irq_fpu_usable usage in clmulni-intel

2009-11-02 Thread Huang Ying
-intel_glue.c is not changed accordingly. This patch fixes this. Signed-off-by: Huang Ying --- arch/x86/crypto/ghash-clmulni-intel_glue.c |8 1 file changed, 4 insertions(+), 4 deletions(-) --- a/arch/x86/crypto/ghash-clmulni-intel_glue.c +++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c

[BUGFIX for .32] crypto, gcm, fix another complete call in complete fuction

2009-11-02 Thread Huang Ying
. - Expand complete_for_next_step(). Signed-off-by: Huang Ying --- crypto/gcm.c | 43 --- 1 file changed, 28 insertions(+), 15 deletions(-) --- a/crypto/gcm.c +++ b/crypto/gcm.c @@ -267,8 +267,7 @@ static int gcm_hash_final(struct aead_re return

[BUGFIX] Fix irq_fpu_usable usage in aesni

2009-10-18 Thread Huang Ying
not changed accordingly. This patch fixes this. Signed-off-by: Huang Ying --- arch/x86/crypto/aesni-intel_glue.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) --- a/arch/x86/crypto/aesni-intel_glue.c +++ b/arch/x86/crypto/aesni-intel_glue.c @@ -82,7 +82,7 @@ static int

[PATCH -v4] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-09-15 Thread Huang Ying
, performance increase about 2x. Signed-off-by: Huang Ying --- arch/x86/crypto/Makefile |3 arch/x86/crypto/ghash-clmulni-intel_asm.S | 157 + arch/x86/crypto/ghash-clmulni-intel_glue.c | 333 + arch/x86/include/asm/cpufeature.h

Re: [PATCH -v3] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-09-15 Thread Huang Ying
On Tue, 2009-09-15 at 22:42 +0800, Daniel Walker wrote: > On Tue, 2009-09-15 at 13:42 +0800, Huang Ying wrote: > > Hi, Herbert, > > > > The dependency to irq_fpu_usable has been merged by linus' tree. > >

[PATCH -v3] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-09-14 Thread Huang Ying
Hi, Herbert, The dependency to irq_fpu_usable has been merged by linus' tree. Best Regards, Huang Ying --> PCLMULQDQ is used to accelerate the most time-consuming part of GHASH, carry-less multiplication. More inf

[PATCH -v3] x86: Move kernel_fpu_using to irq_fpu_usable in asm/i387.h

2009-08-30 Thread Huang Ying
PCLMULQDQ accelerated GHASH implementation. v3: - Renamed to irq_fpu_usable to reflect the purpose of the function. v2: - Renamed to irq_is_fpu_using to reflect the real situation. Signed-off-by: Huang Ying CC: H. Peter Anvin --- arch/x86/crypto/aesni-intel_glue.c | 17 + arch/x86

Re: [BUGFIX] crypto: Fix ctr(aes) testing by specifying geniv

2009-08-13 Thread Huang Ying
lly we can't use seqiv on raw counter mode because it cannot > guarantee IV uniqueness. I think reverting to chainiv is the safer > option. I see seqiv is used in rfc3686 mode, it means seqiv can not be used on raw counter mode but can be used for rfc3686? Best Regards, Huang

Re: [BUGFIX] crypto: Fix ctr(aes) testing by specifying geniv

2009-08-11 Thread Huang Ying
On Thu, 2009-08-06 at 10:12 +0800, Huang Ying wrote: > On Wed, 2009-08-05 at 17:45 +0800, Herbert Xu wrote: > > On Mon, Aug 03, 2009 at 03:44:43PM +0800, Huang Ying wrote: > > > When doing "modprobe tcrypt mode=10", the following error will show > > > in dmes

Re: [PATCH -v2 5/5] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-08-06 Thread Huang Ying
On Thu, 2009-08-06 at 15:17 +0800, Herbert Xu wrote: > On Mon, Aug 03, 2009 at 03:45:31PM +0800, Huang Ying wrote: > > PCLMULQDQ is used to accelerate the most time-consuming part of GHASH, > > carry-less multiplication. More information about PCLMULQDQ can be > > fo

Re: [BUGFIX] crypto: Fix ctr(aes) testing by specifying geniv

2009-08-05 Thread Huang Ying
On Wed, 2009-08-05 at 17:45 +0800, Herbert Xu wrote: > On Mon, Aug 03, 2009 at 03:44:43PM +0800, Huang Ying wrote: > > When doing "modprobe tcrypt mode=10", the following error will show > > in dmesg. > > > > alg: skcipher: Failed to load transform for ctr(a

[PATCH -v2 2/5] crypto: Use GHASH digest algorithm in GCM

2009-08-03 Thread Huang Ying
asynchronous interface. v2: - Add parameter to gcm_base to choose ghash implementation. - Fix a memory leak about gcm_zeros (Thanks Sebastian) - Some minor fixes Signed-off-by: Huang Ying --- crypto/Kconfig |2 +- crypto/gcm.c | 580

[PATCH -v2 5/5] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-08-03 Thread Huang Ying
, its usage must be enclosed with kernel_fpu_begin/end, which can be used only in process context, the acceleration is implemented as crypto_ahash. That is, request in soft IRQ context will be deferred to the cryptd kernel thread. Signed-off-by: Huang Ying --- arch/x86/crypto/Makefile

[PATCH -v2 4/5] x86: Move kernel_fpu_using to irq_is_fpu_using in asm/i387.h

2009-08-03 Thread Huang Ying
This is used by AES-NI accelerated AES implementation and PCLMULQDQ accelerated GHASH implementation. v2: - Renamed to irq_is_fpu_using to reflect the real situation. Signed-off-by: Huang Ying CC: H. Peter Anvin --- arch/x86/crypto/aesni-intel_glue.c | 17 + arch/x86

[PATCH -v2 3/5] crypto: cryptd: Add support to access underlaying shash

2009-08-03 Thread Huang Ying
cryptd_alloc_ahash() will allocate a cryptd-ed ahash for specified algorithm name. The new allocated one is guaranteed to be cryptd-ed ahash, so the shash underlying can be gotten via cryptd_ahash_child(). Signed-off-by: Huang Ying --- crypto/cryptd.c | 35

[PATCH -v2 1/5] crypto: Add GHASH digest algorithm for GCM

2009-08-03 Thread Huang Ying
GHASH is implemented as a shash algorithm. The actual implementation is copied from gcm.c. This makes it possible to add architecture/hardware accelerated GHASH implementation. v2: - Fix a bug in Makefile (Thanks Sebastian) - Some other minor fixes Signed-off-by: Huang Ying --- crypto

[BUGFIX] crypto: Fix ctr(aes) testing by specifying geniv

2009-08-03 Thread Huang Ying
with geniv, but default geniv mode may not work with ctr(aes). As that of rfc3686, this is fixed via specifying geniv mode to "seqiv". Signed-off-by: Huang Ying --- crypto/ctr.c |2 ++ 1 file changed, 2 insertions(+) --- a/crypto/ctr.c +++ b/crypto/ctr.c @@ -219,6 +219,8 @@ static st

Re: [RFC 7/7] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-07-06 Thread Huang Ying
ren't any remaining DIGEST algorithms :) > > I'll get onto hmac. Thank you. Will post the updated version after you have done with hmac. Best Regards, Huang Ying

Re: [RFC 7/7] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-07-06 Thread Huang Ying
Hi, Herbert, On Sun, 2009-06-21 at 21:51 +0800, Herbert Xu wrote: > Huang Ying wrote: > > PCLMULQDQ is used to accelerate the most time-consuming part of GHASH, > > carry-less multiplication. More information about PCLMULQDQ can be > > found at: > > > > http://

Re: [RFC 2/7] crypto: Use GHASH digest algorithm in GCM

2009-06-21 Thread Huang Ying
On Mon, 2009-06-22 at 10:03 +0800, Herbert Xu wrote: > On Mon, Jun 22, 2009 at 09:41:16AM +0800, Huang Ying wrote: > > > > Can crypto_alloc_ahash("ghash",...) select among different ghash > > implementation automatically based on priority? I think > > crypto

Re: [RFC 2/7] crypto: Use GHASH digest algorithm in GCM

2009-06-21 Thread Huang Ying
On Sun, 2009-06-21 at 21:46 +0800, Herbert Xu wrote: > Huang Ying wrote: > > > > + ghash = crypto_alloc_ahash("ghash", 0, 0); > > + if (IS_ERR(ghash)) > > + return PTR_ERR(ghash); > > We should add this as an extra parameter t

Re: [BUGFIX 2/3] crypto: Remove CRYPTO_TFM_REQ_MAY_SLEEP flag in AES-NI accelerated ecb/cbc mode

2009-06-18 Thread Huang Ying
On Thu, 2009-06-18 at 19:40 +0800, Herbert Xu wrote: > On Mon, Jun 15, 2009 at 05:04:57PM +0800, Huang Ying wrote: > > Because AES-NI instructions will touch XMM state, corresponding code > > must be enclosed within kernel_fpu_begin/end, which used > > preempt_disable/enabl

Re: [RFC 1/7] crypto: Add GHASH digest algorithm for GCM

2009-06-18 Thread Huang Ying
On Thu, 2009-06-18 at 15:27 +0800, Sebastian Andrzej Siewior wrote: > * Huang Ying | 2009-06-18 10:08:27 [+0800]: > > >On Thu, 2009-06-18 at 04:04 +0800, Sebastian Andrzej Siewior wrote: > >> >+#include > >> >+#include > >> >+#include

Re: [RFC 2/7] crypto: Use GHASH digest algorithm in GCM

2009-06-17 Thread Huang Ying
On Thu, 2009-06-18 at 04:47 +0800, Sebastian Andrzej Siewior wrote: > * Huang Ying | 2009-06-11 15:10:28 [+0800]: > > >Remove the dedicated GHASH implementation in GCM, and uses the GHASH > >digest algorithm instead. This will make GCM uses hardware accelerated >

Re: [RFC 1/7] crypto: Add GHASH digest algorithm for GCM

2009-06-17 Thread Huang Ying
On Thu, 2009-06-18 at 04:04 +0800, Sebastian Andrzej Siewior wrote: > * Huang Ying | 2009-06-11 15:10:26 [+0800]: > > >GHASH is implemented as a shash algorithm. The actual implementation > >is copied from gcm.c. This makes it possible to add > >architecture/ha

Re: [RFC 6/7] x86: Move kernel_fpu_using to asm/i387.h

2009-06-17 Thread Huang Ying
ous #TS faults, while AES and PCLMUL need to check whether MMX/SSE registers are available. After some thinking, I think something as follow may be more appropriate: /* This may be useful for someone else */ static inline bool fpu_using(void) { return !(read_cr0() & X86_CR0_TS); }
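The CR0.TS test proposed above can be tried outside the kernel. What follows is only a userspace sketch: read_cr0() needs ring 0, so the CR0 value is passed in as an argument, and fpu_using_from_cr0() is a made-up name for illustration, not a kernel function. The bit position (CR0.TS is bit 3, value 0x8) is the architectural one.

```c
#include <stdbool.h>

/* CR0.TS (Task Switched) is bit 3 of CR0; the kernel calls it X86_CR0_TS. */
#define X86_CR0_TS 0x00000008UL

/*
 * Hypothetical userspace stand-in for the fpu_using() idea: the CR0 value
 * is a parameter here because read_cr0() is only usable in kernel mode.
 * Returns true when TS is clear, i.e. FPU/SSE use would not raise #NM.
 */
static inline bool fpu_using_from_cr0(unsigned long cr0)
{
	return !(cr0 & X86_CR0_TS);
}
```

With TS clear (for example a CR0 value of 0x80050033) the helper returns true; with TS set (0x8005003b) it returns false.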

[BUGFIX 3/3] crypto: Remove CRYPTO_TFM_REQ_MAY_SLEEP from fpu template

2009-06-15 Thread Huang Ying
kernel_fpu_begin/end used preempt_disable/enable, so sleep should be prevented between kernel_fpu_begin/end. Signed-off-by: Huang Ying --- arch/x86/crypto/fpu.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/arch/x86/crypto/fpu.c +++ b/arch/x86/crypto/fpu.c @@ -48,7 +48,7

[BUGFIX 2/3] crypto: Remove CRYPTO_TFM_REQ_MAY_SLEEP flag in AES-NI accelerated ecb/cbc mode

2009-06-15 Thread Huang Ying
Because AES-NI instructions will touch XMM state, corresponding code must be enclosed within kernel_fpu_begin/end, which used preempt_disable/enable. So sleep should be prevented between kernel_fpu_begin/end. Signed-off-by: Huang Ying --- arch/x86/crypto/aesni-intel_glue.c |4 1 file

[BUGFIX 1/3] crypto: Fix AES-NI cbc mode IV saving

2009-06-15 Thread Huang Ying
The original implementation of aesni_cbc_dec does not save the IV if input length % 4 == 0. This makes decryption of the next block fail. Signed-off-by: Huang Ying --- arch/x86/crypto/aesni-intel_asm.S |5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) --- a/arch/x86/crypto/aesni
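Why the saved IV matters can be shown with a toy model. The sketch below is not the aesni-intel_asm.S code and does not use AES: toy_dec_block() is a placeholder XOR "cipher", and all names are invented for illustration. It only shows the CBC rule that each call must leave the last ciphertext block in the IV buffer so that a following call can continue the chain.

```c
#include <stddef.h>
#include <string.h>

#define BLK 16

/* Toy "block cipher": XOR with a fixed byte. NOT AES; illustration only. */
static void toy_dec_block(unsigned char *dst, const unsigned char *src)
{
	for (size_t i = 0; i < BLK; i++)
		dst[i] = src[i] ^ 0x5a;
}

/*
 * CBC-decrypt nblocks blocks from src into dst (buffers must not overlap).
 * P[i] = D(C[i]) ^ C[i-1], with C[-1] = iv.
 */
static void toy_cbc_dec(unsigned char *dst, const unsigned char *src,
			size_t nblocks, unsigned char iv[BLK])
{
	const unsigned char *prev = iv;

	for (size_t b = 0; b < nblocks; b++) {
		const unsigned char *c = src + b * BLK;
		unsigned char *p = dst + b * BLK;

		toy_dec_block(p, c);
		for (size_t i = 0; i < BLK; i++)
			p[i] ^= prev[i];
		prev = c;
	}
	/* Save the IV for the next call, whatever the block count was. */
	if (nblocks)
		memcpy(iv, src + (nblocks - 1) * BLK, BLK);
}
```

Decrypting eight blocks in one call and decrypting them as two calls of four blocks give the same plaintext only because the first call updates iv. If the memcpy() were skipped for some block counts, the first block of the second call would be XORed with a stale IV.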

[RFC 7/7] crypto: Add PCLMULQDQ accelerated GHASH implementation

2009-06-11 Thread Huang Ying
, its usage must be enclosed with kernel_fpu_begin/end, which can be used only in process context, the acceleration is implemented as crypto_ahash. That is, request in soft IRQ context will be deferred to the cryptd kernel thread. Signed-off-by: Huang Ying --- arch/x86/crypto/Makefile

[RFC 2/7] crypto: Use GHASH digest algorithm in GCM

2009-06-11 Thread Huang Ying
asynchronous interface. Signed-off-by: Huang Ying --- crypto/Kconfig |2 crypto/gcm.c | 531 +++-- 2 files changed, 367 insertions(+), 166 deletions(-) --- a/crypto/gcm.c +++ b/crypto/gcm.c @@ -12,6 +12,7 @@ #include #include #include

[RFC 6/7] x86: Move kernel_fpu_using to asm/i387.h

2009-06-11 Thread Huang Ying
This is used by AES-NI accelerated AES implementation and PCLMULQDQ accelerated GHASH implementation. Signed-off-by: Huang Ying --- arch/x86/crypto/aesni-intel_glue.c |7 --- arch/x86/include/asm/i387.h|7 +++ 2 files changed, 7 insertions(+), 7 deletions(-) --- a/arch

[RFC 4/7] crypto: use crypto_shash instead of crypto_hash in cryptd hash

2009-06-11 Thread Huang Ying
crypto_hash interface has some issue and will be replaced by crypto_shash. This patch replace crypto_hash in cryptd hash with crypto_shash. Signed-off-by: Huang Ying --- crypto/cryptd.c | 118 ++-- 1 file changed, 73 insertions(+), 45

[RFC 1/7] crypto: Add GHASH digest algorithm for GCM

2009-06-11 Thread Huang Ying
GHASH is implemented as a shash algorithm. The actual implementation is copied from gcm.c. This makes it possible to add architecture/hardware accelerated GHASH implementation. Signed-off-by: Huang Ying --- crypto/Kconfig |7 + crypto/Makefile|2 crypto/ghash-generic.c

[RFC 5/7] crypto: cryptd: Add support to access underlaying shash

2009-06-11 Thread Huang Ying
cryptd_alloc_ahash() will allocate a cryptd-ed ahash for specified algorithm name. The new allocated one is guaranteed to be cryptd-ed ahash, so the shash underlying can be gotten via cryptd_ahash_child(). Signed-off-by: Huang Ying --- crypto/cryptd.c | 35

[RFC 3/7] crypto: Add crypto_spawn_shash

2009-06-11 Thread Huang Ying
Needed to use shash in cryptd hash. Signed-off-by: Huang Ying --- crypto/shash.c |6 ++ include/crypto/algapi.h |8 2 files changed, 14 insertions(+) --- a/include/crypto/algapi.h +++ b/include/crypto/algapi.h @@ -240,6 +240,14 @@ static inline struct cipher_alg

[RFC 0/7] crypto: PCLMULQDQ accelerated GHASH

2009-06-11 Thread Huang Ying
Hi, Herbert, This patchset adds PCLMULQDQ accelerated GHASH. Because conversion from crypto_hash to crypto_shash has not been done, this patchset is not intended to be merged now. Please take a look at the general design. Best Regards, Huang Ying

Re: GCM benchmark

2009-04-09 Thread Huang Ying
On Thu, 2009-04-09 at 16:21 +0800, Herbert Xu wrote: > On Thu, Apr 09, 2009 at 03:50:21PM +0800, Huang Ying wrote: > > Hi, Herbert, > > > > I am working on GCM acceleration with Intel new PCLMULQDQ instructions > > now. Can you tell me how to do GCM benchma

Re: Accelerate GCM with PCLMULQDQ-NI

2009-03-29 Thread Huang Ying
On Sun, 2009-03-29 at 15:43 +0800, Herbert Xu wrote: > On Wed, Mar 18, 2009 at 04:52:12PM +0800, Huang Ying wrote: > > > > To accelerate GCM with it, I make the following design: > > > > 1. Implement ghash as an ahash algorithm, Use ghash in gcm > > imp

Accelerate GCM with PCLMULQDQ-NI

2009-03-18 Thread Huang Ying
that of AES-NI, that is, XMM registers are used. To accelerate GCM with it, I make the following design: 1. Implement ghash as an ahash algorithm, Use ghash in gcm implementation. 2. Provide a new implementation of ghash with PCLMULQDQ-NI. What do you think about that? Best Regards, Huang Ying

[PATCH -v3 3/3] crypto: Add AES-NI support for more modes

2009-03-17 Thread Huang Ying
cryption time can be reduced to 50% of general mode implementation + aes-aesni implementation. v2: Add description of mode acceleration support in Kconfig v3: Fix some bugs of CTR block size, LRW and XTS min/max key size. Signed-off-by: Huang Ying --- arch/x86/crypto/aesni-intel_glue.c | 267

[PATCH -v3 2/3] crypto: Add fpu template, a wrapper for blkcipher touching FPU

2009-03-17 Thread Huang Ying
" template, which makes these operations to be invoked for each request. v2: Make FPU mode invisible to end user Signed-off-by: Huang Ying --- crypto/Kconfig |5 + crypto/Makefile |1 crypto/fpu.c| 166 3 files changed, 172

[PATCH -v3 1/3] crypto: Fix tfm allocation in cryptd_alloc_ablkcipher

2009-03-17 Thread Huang Ying
Use crypto_alloc_base() instead of crypto_alloc_ablkcipher() to allocate underlying tfm in cryptd_alloc_ablkcipher. Because crypto_alloc_ablkcipher() prefer GENIV encapsulated crypto instead of raw one, while cryptd_alloc_ablkcipher needed the raw one. Signed-off-by: Huang Ying --- crypto

[PATCH -v2 2/3] crypto: Add fpu template, a wrapper for blkcipher touching FPU

2009-03-09 Thread Huang Ying
" template, which makes these operations to be invoked for each request. v2: Make FPU mode invisible to user Signed-off-by: Huang Ying --- crypto/Kconfig |5 + crypto/Makefile |1 crypto/fpu.c| 166 3 files changed, 172

[PATCH -v2 3/3] crypto: Add AES-NI support for more modes

2009-03-09 Thread Huang Ying
cryption time can be reduced to 50% of general mode implementation + aes-aesni implementation. v2: Add description of mode acceleration support in Kconfig Signed-off-by: Huang Ying --- arch/x86/crypto/aesni-intel_glue.c | 256 + crypto/Kconfig

[PATCH -v2 1/3] crypto: Fix tfm allocation in cryptd_alloc_ablkcipher

2009-03-09 Thread Huang Ying
Use crypto_alloc_base() instead of crypto_alloc_ablkcipher() to allocate underlying tfm in cryptd_alloc_ablkcipher. Because crypto_alloc_ablkcipher() prefer GENIV encapsulated crypto instead of raw one, while cryptd_alloc_ablkcipher needed the raw one. Signed-off-by: Huang Ying --- crypto

[PATCH 3/3] crypto: Add AES-NI support for more modes

2009-03-04 Thread Huang Ying
cryption time can be reduced to 50% of general mode implementation + aes-aesni implementation. Signed-off-by: Huang Ying --- arch/x86/crypto/aesni-intel_glue.c | 256 + crypto/Kconfig |1 2 files changed, 257 insertions(+) --- a/arch

[PATCH 2/3] crypto: Add fpu template, a wrapper for blkcipher touching FPU

2009-03-04 Thread Huang Ying
" template, which makes these operations to be invoked for each request. Signed-off-by: Huang Ying --- crypto/Kconfig |7 ++ crypto/Makefile |1 crypto/fpu.c| 166 3 files changed, 174 insertions(+) --- a/crypto/Kconfig

[PATCH 1/3] crypto: Fix tfm allocation in cryptd_alloc_ablkcipher

2009-03-04 Thread Huang Ying
Use crypto_alloc_base() instead of crypto_alloc_ablkcipher() to allocate underlying tfm in cryptd_alloc_ablkcipher. Because crypto_alloc_ablkcipher() prefer GENIV encapsulated crypto instead of raw one, while cryptd_alloc_ablkcipher needed the raw one. Signed-off-by: Huang Ying --- crypto

Re: Bug of dm-crypt?

2009-02-27 Thread Huang Ying
Hi, Milan, On Fri, 2009-02-27 at 16:41 +0800, Milan Broz wrote: > Herbert Xu wrote: > > On Fri, Feb 27, 2009 at 01:31:56PM +0800, Huang Ying wrote: > >> I had ever heard from you that the only thing guaranteed in the > >> completion function of async ablkcipher cr

[BUGFIX] dm-crypt: Fix a bug of async cryption complete function

2009-02-27 Thread Huang Ying
equest. Signed-off-by: Huang Ying --- drivers/md/dm-crypt.c | 14 +++--- 1 file changed, 11 insertions(+), 3 deletions(-) --- a/drivers/md/dm-crypt.c +++ b/drivers/md/dm-crypt.c @@ -60,6 +60,8 @@ struct dm_crypt_io { }; struct dm_crypt_request { + struct ablkcipher_reques

Bug of dm-crypt?

2009-02-26 Thread Huang Ying
ion: kcryptd_async_done. This makes my AES-NI cryptd usage panic. Do you think that is a bug? Best Regards, Huang Ying

[PATCH -v2 3/3] crypto: Uses kcrypto_wq instead of keventd_wq in chainiv

2009-02-09 Thread Huang Ying
Uses kcrypto_wq instead of keventd_wq in chainiv keventd_wq has potential starvation problem, so use dedicated kcrypto_wq instead. Signed-off-by: Huang Ying --- crypto/Kconfig |1 + crypto/chainiv.c |3 ++- 2 files changed, 3 insertions(+), 1 deletion(-) --- a/crypto/Kconfig +++ b

[PATCH -v2 2/3] crypto: Per-CPU cryptd thread implementation based on kcrypto_wq

2009-02-09 Thread Huang Ying
-- The middle value of elapsed time is: wo cryptwq: 0.31 w cryptwq: 0.26 The performance gain is about (0.31-0.26)/0.26 = 0.192. Signed-off-by: Huang Ying --- crypto/Kconfig |1 crypto/cryptd.c | 220 ++-- 2 files
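The gain figure quoted above divides by the faster (new) time. A small sketch of that arithmetic follows; the function names are invented here. Dividing by the old time instead gives the fraction of elapsed time saved, which is a smaller number for the same measurements.

```c
/*
 * Relative gain with the new (faster) time as the base, matching the
 * quoted (0.31-0.26)/0.26.
 */
static double relative_gain(double before, double after)
{
	return (before - after) / after;
}

/* Fraction of the old elapsed time that was saved: (before-after)/before. */
static double time_reduction(double before, double after)
{
	return (before - after) / before;
}
```

For before = 0.31 and after = 0.26, relative_gain() is about 0.192 and time_reduction() is about 0.161.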

[PATCH -v2 1/3] crypto: Use dedicated workqueue for crypto subsystem

2009-02-09 Thread Huang Ying
Use dedicated workqueue for crypto subsystem A dedicated workqueue named kcrypto_wq is created to be used by crypto subsystem. The system shared keventd_wq is not suitable for encryption/decryption, because of potential starvation problem. Signed-off-by: Huang Ying --- crypto/Kconfig

Re: [PATCH 2/3] crypto: Per-CPU cryptd thread implementation based on kcrypto_wq

2009-02-03 Thread Huang Ying
On Tue, 2009-02-03 at 17:10 +0800, Andrew Morton wrote: > On Mon, 02 Feb 2009 14:42:20 +0800 Huang Ying wrote: > > > Original cryptd thread implementation has scalability issue, this > > patch solve the issue with a per-CPU thread implementation. > > > > struct c

[PATCH 3/3] crypto: Uses kcrypto_wq instead of keventd_wq in chainiv

2009-02-01 Thread Huang Ying
keventd_wq has potential starvation problem, so use dedicated kcrypto_wq instead. Signed-off-by: Huang Ying --- crypto/Kconfig |1 + crypto/chainiv.c |3 ++- 2 files changed, 3 insertions(+), 1 deletion(-) --- a/crypto/Kconfig +++ b/crypto/Kconfig @@ -56,6 +56,7 @@ config

[PATCH 2/3] crypto: Per-CPU cryptd thread implementation based on kcrypto_wq

2009-02-01 Thread Huang Ying
594minor)pagefaults 0swaps 0.04user 0.35system 0:00.26elapsed 154%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6557minor)pagefaults 0swaps --w end -- The middle value of elapsed time is: wo cryptwq: 0.31 w cryptwq: 0.26 The performance g

[PATCH 1/3] crypto: Use dedicated workqueue for crypto subsystem

2009-02-01 Thread Huang Ying
A dedicated workqueue named kcrypto_wq is created to be used by crypto subsystem. The system shared keventd_wq is not suitable for encryption/decryption, because of potential starvation problem. Signed-off-by: Huang Ying --- crypto/Kconfig |3 +++ crypto/Makefile

Re: [RFC] per-CPU cryptd thread implementation based on workqueue

2009-02-01 Thread Huang Ying
On Thu, 2009-01-22 at 15:30 +0800, Herbert Xu wrote: > On Thu, Jan 22, 2009 at 03:15:58PM +0800, Huang Ying wrote: The only needed spin lock usage is cryptd_tfm_in_queue() now, I think we can protect that via RCU, what's your opinions? > > Yes. Except that, now we do not need a sp

Re: [RFC] per-CPU cryptd thread implementation based on workqueue

2009-02-01 Thread Huang Ying
ystem 0:00.26elapsed 154%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6557minor)pagefaults 0swaps --w cryptowq end -- The middle value of elapsed time is: wo cryptwq: 0.31 w cryptwq: 0.26 The performance gain is about (0.31-0.26)/0.26 = 0.1

Re: [RFC] per-CPU cryptd thread implementation based on workqueue

2009-02-01 Thread Huang Ying
Sorry for my late. Last week is Chinese new year holiday. On Sat, 2009-01-24 at 15:07 +0800, Andrew Morton wrote: > On Thu, 22 Jan 2009 10:32:17 +0800 Huang Ying wrote: > > > Use dedicate workqueue for crypto > > > > - A dedicated workqueue named kcrypto_wq is created.

Re: [RFC] per-CPU cryptd thread implementation based on workqueue

2009-01-21 Thread Huang Ying
On Thu, 2009-01-22 at 11:04 +0800, Herbert Xu wrote: > On Thu, Jan 22, 2009 at 10:32:17AM +0800, Huang Ying wrote: > > > > This is the first attempt to use a dedicate workqueue for crypto. It is > > not intended to be merged. Please feedback your comments, especially on >

Re: [RFC] per-CPU cryptd thread implementation based on workqueue

2009-01-21 Thread Huang Ying
On Fri, 2009-01-16 at 11:31 +0800, Herbert Xu wrote: > On Fri, Jan 16, 2009 at 11:10:36AM +0800, Huang Ying wrote: > > > > The scalability of current cryptd implementation is not good. So a > > per-CPU cryptd kthread implementation is necessary. The per-CPU kthread > >

[PATCH crypto -v6 2/2] AES-NI: Add support to Intel AES-NI instructions for x86_64 platform

2009-01-15 Thread Huang Ying
. Signed-off-by: Huang Ying --- arch/x86/crypto/Makefile |3 arch/x86/crypto/aesni-intel_asm.S | 896 + arch/x86/crypto/aesni-intel_glue.c | 461 +++ arch/x86/include/asm/cpufeature.h |1 crypto/Kconfig

[PATCH crypto -v6 1/2] AES-NI: Add support to access underlying blkcipher under cryptd ablkcipher

2009-01-15 Thread Huang Ying
processing in cryptd_alloc_ablkcipher() Signed-off-by: Huang Ying --- crypto/cryptd.c | 35 +++ include/crypto/cryptd.h | 27 +++ 2 files changed, 62 insertions(+) --- a/crypto/cryptd.c +++ b/crypto/cryptd.c @@ -12,6 +12,7

[PATCH crypto -v5 2/2] AES-NI: Add support to Intel AES-NI instructions for x86_64 platform

2009-01-15 Thread Huang Ying
. Signed-off-by: Huang Ying --- arch/x86/crypto/Makefile |3 arch/x86/crypto/aesni-intel_asm.S | 896 + arch/x86/crypto/aesni-intel_glue.c | 461 +++ arch/x86/include/asm/cpufeature.h |1 crypto/Kconfig

[PATCH crypto -v5 1/2] AES-NI: Add support to access underlying blkcipher under cryptd ablkcipher

2009-01-15 Thread Huang Ying
processing in cryptd_alloc_ablkcipher() Signed-off-by: Huang Ying --- crypto/cryptd.c | 33 + include/crypto/cryptd.h | 27 +++ 2 files changed, 60 insertions(+) --- a/crypto/cryptd.c +++ b/crypto/cryptd.c @@ -12,6 +12,7

Re: [PATCH crypto -v4 2/2] AES-NI: Add support to Intel AES-NI instructions for x86_64 platform

2009-01-15 Thread Huang Ying
On Fri, 2009-01-16 at 11:26 +0800, Herbert Xu wrote: > On Fri, Jan 16, 2009 at 10:37:02AM +0800, Huang Ying wrote: > > > > But after checking blkcipher_walk_done() in 2.6.28, If input argument > > err != 0 and walk->flags & BLKCIPHER_WALK_SLOW != 0, when > > blk

[RFC] per-CPU cryptd thread implementation based on workqueue

2009-01-15 Thread Huang Ying
, create a dedicate workqueue for crypto subsystem. This way, chainiv can use this crypto workqueue too. I will implement it if you have no plan to do it yourself. Best Regards, Huang Ying

Re: [PATCH crypto -v4 2/2] AES-NI: Add support to Intel AES-NI instructions for x86_64 platform

2009-01-15 Thread Huang Ying
On Fri, 2009-01-16 at 09:53 +0800, Herbert Xu wrote: > On Fri, Jan 16, 2009 at 09:20:58AM +0800, Huang Ying wrote: > > On Thu, 2009-01-15 at 17:47 +0800, roel kluin wrote: > > > > > > + kernel_fpu_begin(); > > > > + while ((nbytes = walk.nbytes))

Re: [PATCH crypto -v4 2/2] AES-NI: Add support to Intel AES-NI instructions for x86_64 platform

2009-01-15 Thread Huang Ying
e to break > out of the loop? > i.e. > > while (!err && (nbytes = walk.nbytes)) > > (if that's erroneous, it occurs in other places as well) It seems that it is a bug. But it seems that the similar code in geode-aes.c and padlock-aes.c has same bug. I think we should fix them too. Best Regards, Huang Ying
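The loop condition suggested above can be modelled in userspace. This is a sketch with a mock walk structure, not the kernel's blkcipher_walk API: struct mock_walk, mock_start(), mock_done() and process() are all hypothetical names, and the mock deliberately leaves nbytes non-zero on failure to stand in for the case being discussed.

```c
#include <stddef.h>

#define CHUNK 64

/* Hypothetical stand-in for the walk state; not the real kernel struct. */
struct mock_walk {
	size_t nbytes;		/* bytes available in the current step */
	size_t total;		/* bytes still to be processed overall */
	int fail_at_step;	/* step at which mock_done() fails, or -1 */
	int step;
};

static void mock_start(struct mock_walk *w, size_t total, int fail_at_step)
{
	w->total = total;
	w->nbytes = total < CHUNK ? total : CHUNK;
	w->fail_at_step = fail_at_step;
	w->step = 0;
}

/* Mock "done" call: consumes the current step and sets up the next one. */
static int mock_done(struct mock_walk *w)
{
	if (w->step++ == w->fail_at_step)
		return -1;	/* nbytes is left non-zero on purpose */
	w->total -= w->nbytes;
	w->nbytes = w->total < CHUNK ? w->total : CHUNK;
	return 0;
}

/* Loop in the suggested form: stop on error even if nbytes is non-zero. */
static int process(struct mock_walk *w, int *steps)
{
	size_t nbytes;
	int err = 0;

	*steps = 0;
	while (!err && (nbytes = w->nbytes)) {
		(void)nbytes;	/* a real loop would transform nbytes bytes here */
		(*steps)++;
		err = mock_done(w);
	}
	return err;
}
```

Without the !err test, the failing case in this mock would never terminate, because nbytes stays at its old value after the error.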

Re: [PATCH crypto -v4 1/2] AES-NI: Add support to access underlying blkcipher under cryptd ablkcipher

2009-01-15 Thread Huang Ying
On Thu, 2009-01-15 at 17:23 +0800, Herbert Xu wrote: > On Thu, Jan 15, 2009 at 05:21:47PM +0800, Huang Ying wrote: > > On Thu, 2009-01-15 at 16:47 +0800, Herbert Xu wrote: > > > On Thu, Jan 15, 2009 at 04:28:33PM +0800, Huang Ying wrote: > > > > > > >

Re: [PATCH crypto -v4 1/2] AES-NI: Add support to access underlying blkcipher under cryptd ablkcipher

2009-01-15 Thread Huang Ying
On Thu, 2009-01-15 at 16:47 +0800, Herbert Xu wrote: > On Thu, Jan 15, 2009 at 04:28:33PM +0800, Huang Ying wrote: > > > > + tfm = crypto_alloc_ablkcipher(cryptd_alg_name, type, mask); > > + BUG_ON(crypto_ablkcipher_tfm(tfm)->__crt_alg->cra_module != > > +

[PATCH crypto -v4 1/2] AES-NI: Add support to access underlying blkcipher under cryptd ablkcipher

2009-01-15 Thread Huang Ying
cryptd_alloc_ablkcipher() will allocate a cryptd-ed ablkcipher for specified algorithm name. The new allocated one is guaranteed to be cryptd-ed ablkcipher, so the blkcipher underlying can be gotten via cryptd_ablkcipher_child(). Signed-off-by: Huang Ying --- crypto/cryptd.c | 30

[PATCH crypto -v4 2/2] AES-NI: Add support to Intel AES-NI instructions for x86_64 platform

2009-01-15 Thread Huang Ying
. Signed-off-by: Huang Ying --- arch/x86/crypto/Makefile |3 arch/x86/crypto/aesni-intel_asm.S | 896 + arch/x86/crypto/aesni-intel_glue.c | 460 ++ arch/x86/include/asm/cpufeature.h |1 crypto/Kconfig

[RFC PATCH crypto -v3 2/2] AES-NI: Add support to Intel AES-NI instructions for x86_64 platform

2009-01-13 Thread Huang Ying
are implementation. - AES key scheduling algorithm is re-implemented with higher performance. - ablkcipher asynchronous mechanism is used to delay a crypto request to work queue context when the FPU state is in use by another kernel context. Signed-off-by: Huang Ying --- arch/x86/crypto/Makef

[RFC PATCH crypto -v3 1/2] AES-NI: Add support to access underlying blkcipher under cryptd ablkcipher

2009-01-13 Thread Huang Ying
cryptd_alloc_ablkcipher() will allocate a cryptd-ed ablkcipher for specified algorithm name. The new allocated one is guaranteed to be cryptd-ed ablkcipher, so the blkcipher underlying can be gotten via cryptd_ablkcipher_child(). Signed-off-by: Huang Ying --- crypto/cryptd.c | 19

Re: Use cryptd(%s) as cryptd-ed algorithm name instead of %s

2009-01-13 Thread Huang Ying
On Wed, 2009-01-14 at 14:53 +0800, Herbert Xu wrote: > On Wed, Jan 14, 2009 at 02:44:08PM +0800, Huang Ying wrote: > > Because: > > > > 1. if use %s, you can only request cryptd(), not > >cryptd(), because generated new algorithm instance has > >algori

Use cryptd(%s) as cryptd-ed algorithm name instead of %s

2009-01-13 Thread Huang Ying
version. Signed-off-by: Huang Ying --- crypto/cryptd.c |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) --- a/crypto/cryptd.c +++ b/crypto/cryptd.c @@ -215,7 +215,9 @@ static struct crypto_instance *cryptd_al ctx->state = state; - memcpy(inst->alg.cra_nam
