So far, my tests reveal the following:

# Note that the tests below were run on a Ryzen system #

[Without patch]
* Generated a checksum of a big file (e.g. a 5GB file) with openssl
$ time /usr/bin/openssl dgst -sha256 /var/tmp/5Gfile
SHA256(/var/tmp/5Gfile)= 8d448d81521cbc1bfdc04dd199d448bd3c49374221007bd0846d8d39a70dd4f8

real 0m12.835s
user 0m12.344s
sys 0m0.484s

[With patch]
* Generated a checksum of a big file (e.g. a 5GB file) with openssl
$ time /usr/bin/openssl dgst -sha256 /var/tmp/5Gfile
SHA256(/var/tmp/5Gfile)= 8d448d81521cbc1bfdc04dd199d448bd3c49374221007bd0846d8d39a70dd4f8

real 0m3.471s
user 0m2.956s
sys 0m0.516s

** Description changed:

+ [Impact]
+
+ * Context:
+
+ AMD added support in their processors for SHA Extensions[1] (CPU flag:
+ sha_ni[2]) starting with Ryzen[3] CPUs. Note that Ryzen CPUs come in
+ 64-bit only (confirmed with an AMD representative). The current OpenSSL
+ version still calls the SSSE3 routine for SHA on Ryzen; as a result, a
+ number of extensions are effectively masked on Ryzen and show no
+ improvement.
+
+ [1] /proc/cpuinfo
+ processor : 0
+ vendor_id : AuthenticAMD
+ cpu family : 23
+ model : 1
+ model name : AMD Ryzen 5 1600 Six-Core Processor
+ flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16
+ sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap
+ clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
+
+ [2] - sha_ni: SHA1/SHA256 Instruction Extensions
+
+ [3] - https://en.wikipedia.org/wiki/Ryzen
+ ...
+ All models support: x87, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AES, CLMUL, AVX, AVX2, FMA, CVT16/F16C, ABM, BMI1, BMI2, SHA.[5]
+ ...
+
+ * Program to perform the CPUID check:
+
+ Reference :
+ https://software.intel.com/en-us/articles/intel-sha-extensions
+
+ ... Availability of the Intel® SHA Extensions on a particular processor
+ can be determined by checking the SHA CPUID bit in CPUID.(EAX=07H,
+ ECX=0):EBX.SHA [bit 29]. The following C function, using inline
+ assembly, performs the CPUID check:
+
+ --
+ int CheckForIntelShaExtensions() {
+    int a, b, c, d;
+
+    // Look for CPUID.7.0.EBX[29]
+    // EAX = 7, ECX = 0
+    a = 7;
+    c = 0;
+
+    asm volatile ("cpuid"
+         :"=a"(a), "=b"(b), "=c"(c), "=d"(d)
+         :"a"(a), "c"(c)
+    );
+
+    // Intel® SHA Extensions feature bit is EBX[29]
+    return ((b >> 29) & 1);
+ }
+ --
+
+ On a CPU with sha_ni the program returns "1". Otherwise it returns "0".
+
+ [Test Case]
+
+ * Reproducible with Xenial/Zesty/Artful release.
+
+ * Generated a checksum of a big file (e.g. a 5GB file) with openssl
+ $ time /usr/bin/openssl dgst -sha256 /var/tmp/5Gfile
+ SHA256(/var/tmp/5Gfile)= 8d448d81521cbc1bfdc04dd199d448bd3c49374221007bd0846d8d39a70dd4f8
+
+ real 0m12.835s
+ user 0m12.344s
+ sys 0m0.484s
+
+
+ [Regression Potential]
+
+ * None expected; it basically allows openssl to benefit from the SHA
+ extension (mostly performance-wise) if the AMD CPU has the
+ capability.
+
+ * Generated a checksum of a big file (e.g. a 5GB file) with openssl
+ $ time /usr/bin/openssl dgst -sha256 /var/tmp/5Gfile
+ SHA256(/var/tmp/5Gfile)= 8d448d81521cbc1bfdc04dd199d448bd3c49374221007bd0846d8d39a70dd4f8
+
+ real 0m3.471s
+ user 0m2.956s
+ sys 0m0.516s
+
+
+ [Other Info]
+
+ * Debian Bug : https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=861145
+
+ * Upstream Repository : https://github.com/openssl/openssl.git
+
+ * Upstream Commits :
+ 1aed5e1 crypto/x86*cpuid.pl: move extended feature detection.
+ ## This fix moves extended feature detection past basic feature detection where it belongs.
+
+ f8418d8 crypto/x86_64cpuid.pl: move extended feature detection upwards.
+ ## This commit for x86_64cpuid.pl addressed the problem, but messed up processor vendor detection.
+
+ [Original Description]
+
  * Context

  AMD added support in their processors for SHA Extensions[1] (CPU flag:
  sha_ni[2]) starting with Ryzen[3] CPUs. Note that Ryzen CPUs come in
  64-bit only (confirmed with an AMD representative). The current OpenSSL
  version still calls the SSSE3 routine for SHA on Ryzen; as a result, a
  number of extensions are effectively masked on Ryzen and show no
  improvement.

  [1] /proc/cpuinfo
  processor : 0
  vendor_id : AuthenticAMD
  cpu family : 23
  model : 1
  model name : AMD Ryzen 5 1600 Six-Core Processor
  flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold

  [2] - sha_ni: SHA1/SHA256 Instruction Extensions

  [3] - https://en.wikipedia.org/wiki/Ryzen
  ...
  All models support: x87, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AES, CLMUL, AVX, AVX2, FMA, CVT16/F16C, ABM, BMI1, BMI2, SHA.[5]
  ...

  * Program to perform the CPUID check

  Reference :
  https://software.intel.com/en-us/articles/intel-sha-extensions

  ... Availability of the Intel® SHA Extensions on a particular processor
  can be determined by checking the SHA CPUID bit in CPUID.(EAX=07H,
  ECX=0):EBX.SHA [bit 29].
  The following C function, using inline assembly, performs the CPUID
  check:

  --
  int CheckForIntelShaExtensions() {
     int a, b, c, d;

     // Look for CPUID.7.0.EBX[29]
     // EAX = 7, ECX = 0
     a = 7;
     c = 0;

     asm volatile ("cpuid"
          :"=a"(a), "=b"(b), "=c"(c), "=d"(d)
          :"a"(a), "c"(c)
     );

     // Intel® SHA Extensions feature bit is EBX[29]
     return ((b >> 29) & 1);
  }
  --

  On a CPU with sha_ni the program returns "1". Otherwise it returns "0".

  * Upstream work:

  - Repository : https://github.com/openssl/openssl.git

  - Commits :
  1aed5e1 crypto/x86*cpuid.pl: move extended feature detection.
  ## This fix moves extended feature detection past basic feature detection where it belongs.

  f8418d8 crypto/x86_64cpuid.pl: move extended feature detection upwards.
  ## This commit for x86_64cpuid.pl addressed the problem, but messed up processor vendor detection.

** Description changed:

  [Impact]

  * Context:

  AMD added support in their processors for SHA Extensions[1] (CPU flag:
  sha_ni[2]) starting with Ryzen[3] CPUs. Note that Ryzen CPUs come in
  64-bit only (confirmed with an AMD representative). The current OpenSSL
  version still calls the SSSE3 routine for SHA on Ryzen; as a result, a
  number of extensions are effectively masked on Ryzen and show no
  improvement.

  [1] /proc/cpuinfo
  processor : 0
  vendor_id : AuthenticAMD
  cpu family : 23
  model : 1
  model name : AMD Ryzen 5 1600 Six-Core Processor
  flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold

  [2] - sha_ni: SHA1/SHA256 Instruction Extensions

  [3] - https://en.wikipedia.org/wiki/Ryzen
  ...
  All models support: x87, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AES, CLMUL, AVX, AVX2, FMA, CVT16/F16C, ABM, BMI1, BMI2, SHA.[5]
  ...

  * Program to perform the CPUID check:

  Reference :
  https://software.intel.com/en-us/articles/intel-sha-extensions

  ... Availability of the Intel® SHA Extensions on a particular processor
  can be determined by checking the SHA CPUID bit in CPUID.(EAX=07H,
  ECX=0):EBX.SHA [bit 29]. The following C function, using inline
  assembly, performs the CPUID check:

  --
  int CheckForIntelShaExtensions() {
-    int a, b, c, d;
+    int a, b, c, d;

-    // Look for CPUID.7.0.EBX[29]
-    // EAX = 7, ECX = 0
-    a = 7;
-    c = 0;
+    // Look for CPUID.7.0.EBX[29]
+    // EAX = 7, ECX = 0
+    a = 7;
+    c = 0;

-    asm volatile ("cpuid"
-         :"=a"(a), "=b"(b), "=c"(c), "=d"(d)
-         :"a"(a), "c"(c)
-    );
+    asm volatile ("cpuid"
+         :"=a"(a), "=b"(b), "=c"(c), "=d"(d)
+         :"a"(a), "c"(c)
+    );

-    // Intel® SHA Extensions feature bit is EBX[29]
-    return ((b >> 29) & 1);
+    // Intel® SHA Extensions feature bit is EBX[29]
+    return ((b >> 29) & 1);
  }
  --

  On a CPU with sha_ni the program returns "1".
  Otherwise it returns "0".

  [Test Case]

- * Reproducible with Xenial/Zesty/Artful release.
-
- * Generated a checksum of a big file (e.g. a 5GB file) with openssl
- $ time /usr/bin/openssl dgst -sha256 /var/tmp/5Gfile
+ * Reproducible with Xenial/Zesty/Artful release.
+
+ * Generated a checksum of a big file (e.g. a 5GB file) with openssl
+ $ time /usr/bin/openssl dgst -sha256 /var/tmp/5Gfile
  SHA256(/var/tmp/5Gfile)= 8d448d81521cbc1bfdc04dd199d448bd3c49374221007bd0846d8d39a70dd4f8

  real 0m12.835s
  user 0m12.344s
  sys 0m0.484s

+ Performance is clearly better with the patch, which takes
+ advantage of the SHA extension.

- [Regression Potential]
+ [Regression Potential]

- * None expected; it basically allows openssl to benefit from the SHA
+ * None expected; it basically allows openssl to benefit from the SHA
  extension (mostly performance-wise) if the AMD CPU has the
  capability.

- * Generated a checksum of a big file (e.g. a 5GB file) with openssl
- $ time /usr/bin/openssl dgst -sha256 /var/tmp/5Gfile
+ * Generated a checksum of a big file (e.g. a 5GB file) with openssl
+ $ time /usr/bin/openssl dgst -sha256 /var/tmp/5Gfile
  SHA256(/var/tmp/5Gfile)= 8d448d81521cbc1bfdc04dd199d448bd3c49374221007bd0846d8d39a70dd4f8

  real 0m3.471s
  user 0m2.956s
  sys 0m0.516s

+ [Other Info]

- [Other Info]
-
  * Debian Bug : https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=861145

  * Upstream Repository : https://github.com/openssl/openssl.git

  * Upstream Commits :
  1aed5e1 crypto/x86*cpuid.pl: move extended feature detection.
  ## This fix moves extended feature detection past basic feature detection where it belongs.

  f8418d8 crypto/x86_64cpuid.pl: move extended feature detection upwards.
  ## This commit for x86_64cpuid.pl addressed the problem, but messed up processor vendor detection.

  [Original Description]

  * Context

  AMD added support in their processors for SHA Extensions[1] (CPU flag:
  sha_ni[2]) starting with Ryzen[3] CPUs. Note that Ryzen CPUs come in
  64-bit only (confirmed with an AMD representative).
  The current OpenSSL version still calls the SSSE3 routine for SHA on
  Ryzen; as a result, a number of extensions are effectively masked on
  Ryzen and show no improvement.

  [1] /proc/cpuinfo
  processor : 0
  vendor_id : AuthenticAMD
  cpu family : 23
  model : 1
  model name : AMD Ryzen 5 1600 Six-Core Processor
  flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold

  [2] - sha_ni: SHA1/SHA256 Instruction Extensions

  [3] - https://en.wikipedia.org/wiki/Ryzen
  ...
  All models support: x87, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AES, CLMUL, AVX, AVX2, FMA, CVT16/F16C, ABM, BMI1, BMI2, SHA.[5]
  ...

  * Program to perform the CPUID check

  Reference :
  https://software.intel.com/en-us/articles/intel-sha-extensions

  ... Availability of the Intel® SHA Extensions on a particular processor
  can be determined by checking the SHA CPUID bit in CPUID.(EAX=07H,
  ECX=0):EBX.SHA [bit 29]. The following C function, using inline
  assembly, performs the CPUID check:

  --
  int CheckForIntelShaExtensions() {
     int a, b, c, d;

     // Look for CPUID.7.0.EBX[29]
     // EAX = 7, ECX = 0
     a = 7;
     c = 0;

     asm volatile ("cpuid"
          :"=a"(a), "=b"(b), "=c"(c), "=d"(d)
          :"a"(a), "c"(c)
     );

     // Intel® SHA Extensions feature bit is EBX[29]
     return ((b >> 29) & 1);
  }
  --

  On a CPU with sha_ni the program returns "1". Otherwise it returns "0".

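For reference, the same CPUID.(EAX=07H, ECX=0):EBX bit 29 check can be sketched without hand-written inline assembly by using the cpuid.h helper header shipped with GCC and Clang. This is only a minimal sketch under that assumption, not what OpenSSL itself does (the upstream commits listed in this report touch the generated assembly in crypto/x86*cpuid.pl). The names bit_is_set, cpu_has_sha_extensions and SHA_EBX_BIT are hypothetical and chosen for illustration. Unlike the function quoted above, the sketch first checks that the CPU reports leaf 7 at all.

```c
#include <assert.h>
#include <cpuid.h>   /* GCC/Clang helper header: __get_cpuid_max, __cpuid_count */

/* Bit position of the SHA feature flag in CPUID.(EAX=07H, ECX=0):EBX. */
#define SHA_EBX_BIT 29u

/* Hypothetical helper: return 1 if the given bit is set in a register value. */
int bit_is_set(unsigned int reg, unsigned int bit)
{
    assert(bit < 32u);
    return (int)((reg >> bit) & 1u);
}

/* Hypothetical name: return 1 if the CPU reports SHA extensions, else 0. */
int cpu_has_sha_extensions(void)
{
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    /* Leaf 7 is only meaningful if the maximum basic leaf is at least 7. */
    if (__get_cpuid_max(0, 0) < 7)
        return 0;

    /* Query leaf 7, sub-leaf 0; the feature flags of interest land in EBX. */
    __cpuid_count(7, 0, eax, ebx, ecx, edx);
    return bit_is_set(ebx, SHA_EBX_BIT);
}
```

Calling cpu_has_sha_extensions() from a small main() and printing the result should give 1 on a CPU that lists sha_ni in /proc/cpuinfo and 0 otherwise. The code only builds for x86/x86_64 targets.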
  * Upstream work:

  - Repository : https://github.com/openssl/openssl.git

  - Commits :
  1aed5e1 crypto/x86*cpuid.pl: move extended feature detection.
  ## This fix moves extended feature detection past basic feature detection where it belongs.

  f8418d8 crypto/x86_64cpuid.pl: move extended feature detection upwards.
  ## This commit for x86_64cpuid.pl addressed the problem, but messed up processor vendor detection.

--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1674399

Title:
  OpenSSL CPU detection for AMD Ryzen CPUs