** Description changed:

- AMD added support in their processors for SHA Extensions starting with
- the Ryzen CPUs. The current OpenSSL version on Ryzen still calls the
- SHA routine for SSSE3; as a result, a number of extensions are
- effectively masked on Ryzen and show no improvement.
+ * Context

- It has been brought to my attention that :
- "CPUID detection in OpenSSL does not properly detect potential optimizations for AMD processors."
+ AMD added support in their processors for SHA Extensions[1] starting
+ with the Ryzen CPUs. The current OpenSSL version on Ryzen still calls
+ the SHA routine for SSSE3; as a result, a number of extensions are
+ effectively masked on Ryzen and show no improvement.

- After further verification on my side :
+ [1] /proc/cpuinfo
+ processor : 0
+ vendor_id : AuthenticAMD
+ cpu family : 23
+ model : 1
+ model name : AMD Ryzen 5 1600 Six-Core Processor
+ flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold

- Extended feature flags were not pulled on AMD processors; as a result, a
- number of extensions were effectively masked on Ryzen CPUs. It should
- have been reported for Excavator since it implements the AVX2 extension,
- but apparently nobody noticed ...

- The GitHub PR:
- https://github.com/openssl/openssl/pull/2849
+ * Program to perform the CPUID check
+
+ Reference :
+ https://software.intel.com/en-us/articles/intel-sha-extensions
+
+ --
+ int CheckForIntelShaExtensions() {
+     int a, b, c, d;
+
+     // Look for CPUID.7.0.EBX[29]
+     // EAX = 7, ECX = 0
+     a = 7;
+     c = 0;
+
+     asm volatile ("cpuid"
+         :"=a"(a), "=b"(b), "=c"(c), "=d"(d)
+         :"a"(a), "c"(c)
+     );
+
+     // Intel® SHA Extensions feature bit is EBX[29]
+     return ((b >> 29) & 1);
+ }
+ --
+
+ * Upstream work:
+
+ - Repository : https://github.com/openssl/openssl.git
+ - Commits :
+
+ 1aed5e1 crypto/x86*cpuid.pl: move extended feature detection.
+ f8418d8 crypto/x86_64cpuid.pl: move extended feature detection upwards.

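For anyone who wants to reproduce the check without hand-written inline asm, here is a minimal sketch of the same CPUID.7.0.EBX[29] test. It assumes a GCC or Clang toolchain that ships <cpuid.h>. The function names sha_bit_from_ebx and cpu_has_sha_extensions are made up for this illustration and do not come from OpenSSL or from the Intel article. Unlike the snippet in the description, it first asks the CPU for its highest basic leaf, since leaf 7 is not meaningful on processors that do not implement it.

```c
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header (toolchain assumption) */
#endif

/* Extract the SHA feature bit: EBX[29] of CPUID leaf 7, subleaf 0. */
int sha_bit_from_ebx(unsigned int ebx)
{
    return (int)((ebx >> 29) & 1u);
}

/* Hypothetical helper: returns 1 if CPUID reports SHA extensions, else 0. */
int cpu_has_sha_extensions(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    /* Leaf 7 is only valid when the highest basic leaf is at least 7. */
    if (__get_cpuid_max(0, NULL) < 7)
        return 0;

    /* EAX = 7, ECX = 0, same query as in the description above. */
    __cpuid_count(7, 0, eax, ebx, ecx, edx);
    return sha_bit_from_ebx(ebx);
#else
    /* Not an x86 build: there is no CPUID instruction to query. */
    return 0;
#endif
}
```

Calling cpu_has_sha_extensions() from a small main() and printing the result should agree with whether sha_ni shows up in /proc/cpuinfo on the same machine. This only confirms what the hardware reports, not which code path OpenSSL actually selects.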
** Description changed:

  * Context

- AMD added support in their processors for SHA Extensions[1] starting
- with the Ryzen CPUs. The current OpenSSL version on Ryzen still calls
- the SHA routine for SSSE3; as a result, a number of extensions are
- effectively masked on Ryzen and show no improvement.
+ AMD added support in their processors for SHA Extensions[1] (sha_ni)
+ starting with the Ryzen CPUs. The current OpenSSL version on Ryzen still
+ calls the SHA routine for SSSE3; as a result, a number of extensions are
+ effectively masked on Ryzen and show no improvement.

  [1] /proc/cpuinfo
  processor : 0
  vendor_id : AuthenticAMD
  cpu family : 23
  model : 1
  model name : AMD Ryzen 5 1600 Six-Core Processor
  flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold

-
  * Program to perform the CPUID check

  Reference :
  https://software.intel.com/en-us/articles/intel-sha-extensions

  --
  int CheckForIntelShaExtensions() {
-     int a, b, c, d;
+     int a, b, c, d;

-     // Look for CPUID.7.0.EBX[29]
-     // EAX = 7, ECX = 0
-     a = 7;
-     c = 0;
+     // Look for CPUID.7.0.EBX[29]
+     // EAX = 7, ECX = 0
+     a = 7;
+     c = 0;

-     asm volatile ("cpuid"
-         :"=a"(a), "=b"(b), "=c"(c), "=d"(d)
-         :"a"(a), "c"(c)
-     );
+     asm volatile ("cpuid"
+         :"=a"(a), "=b"(b), "=c"(c), "=d"(d)
+         :"a"(a), "c"(c)
+     );

-     // Intel® SHA Extensions feature bit is EBX[29]
-     return ((b >> 29) & 1);
+     // Intel® SHA Extensions feature bit is EBX[29]
+     return ((b >> 29) & 1);
  }
  --

  * Upstream work:

  - Repository : https://github.com/openssl/openssl.git
  - Commits :

  1aed5e1 crypto/x86*cpuid.pl: move extended feature detection.
  f8418d8 crypto/x86_64cpuid.pl: move extended feature detection upwards.

** Description changed:

  * Context

- AMD added support in their processors for SHA Extensions[1] (sha_ni)
- starting with the Ryzen CPUs. The current OpenSSL version on Ryzen still
- calls the SHA routine for SSSE3; as a result, a number of extensions are
- effectively masked on Ryzen and show no improvement.
+ AMD added support in their processors for SHA Extensions[1] / CPU flag:
+ sha_ni[2] starting with the Ryzen CPUs. The current OpenSSL version on
+ Ryzen still calls the SHA routine for SSSE3; as a result, a number of
+ extensions are effectively masked on Ryzen and show no improvement.

  [1] /proc/cpuinfo
  processor : 0
  vendor_id : AuthenticAMD
  cpu family : 23
  model : 1
  model name : AMD Ryzen 5 1600 Six-Core Processor
  flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
+
+ [2] - sha_ni: SHA1/SHA256 Instruction Extensions

  * Program to perform the CPUID check

  Reference :
  https://software.intel.com/en-us/articles/intel-sha-extensions

  --
  int CheckForIntelShaExtensions() {
      int a, b, c, d;

      // Look for CPUID.7.0.EBX[29]
      // EAX = 7, ECX = 0
      a = 7;
      c = 0;

      asm volatile ("cpuid"
          :"=a"(a), "=b"(b), "=c"(c), "=d"(d)
          :"a"(a), "c"(c)
      );

      // Intel® SHA Extensions feature bit is EBX[29]
      return ((b >> 29) & 1);
  }
  --

  * Upstream work:

  - Repository : https://github.com/openssl/openssl.git
  - Commits :

  1aed5e1 crypto/x86*cpuid.pl: move extended feature detection.
  f8418d8 crypto/x86_64cpuid.pl: move extended feature detection upwards.

** Description changed:

  * Context

  AMD added support in their processors for SHA Extensions[1] / CPU flag:
  sha_ni[2] starting with the Ryzen CPUs. The current OpenSSL version on
  Ryzen still calls the SHA routine for SSSE3; as a result, a number of
  extensions are effectively masked on Ryzen and show no improvement.

  [1] /proc/cpuinfo
  processor : 0
  vendor_id : AuthenticAMD
  cpu family : 23
  model : 1
  model name : AMD Ryzen 5 1600 Six-Core Processor
  flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold

  [2] - sha_ni: SHA1/SHA256 Instruction Extensions

  * Program to perform the CPUID check

  Reference :
  https://software.intel.com/en-us/articles/intel-sha-extensions

  --
  int CheckForIntelShaExtensions() {
      int a, b, c, d;

      // Look for CPUID.7.0.EBX[29]
      // EAX = 7, ECX = 0
      a = 7;
      c = 0;

      asm volatile ("cpuid"
          :"=a"(a), "=b"(b), "=c"(c), "=d"(d)
          :"a"(a), "c"(c)
      );

      // Intel® SHA Extensions feature bit is EBX[29]
      return ((b >> 29) & 1);
  }
  --
+ On a CPU with sha_ni the program returns "1". Otherwise it returns "0".
+
  * Upstream work:

  - Repository : https://github.com/openssl/openssl.git
  - Commits :

  1aed5e1 crypto/x86*cpuid.pl: move extended feature detection.
  f8418d8 crypto/x86_64cpuid.pl: move extended feature detection upwards.

** Description changed:

  * Context

  AMD added support in their processors for SHA Extensions[1] / CPU flag:
  sha_ni[2] starting with the Ryzen CPUs. The current OpenSSL version on
  Ryzen still calls the SHA routine for SSSE3; as a result, a number of
  extensions are effectively masked on Ryzen and show no improvement.

  [1] /proc/cpuinfo
  processor : 0
  vendor_id : AuthenticAMD
  cpu family : 23
  model : 1
  model name : AMD Ryzen 5 1600 Six-Core Processor
  flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold

  [2] - sha_ni: SHA1/SHA256 Instruction Extensions

  * Program to perform the CPUID check

  Reference :
  https://software.intel.com/en-us/articles/intel-sha-extensions

  --
  int CheckForIntelShaExtensions() {
      int a, b, c, d;

      // Look for CPUID.7.0.EBX[29]
      // EAX = 7, ECX = 0
      a = 7;
      c = 0;

      asm volatile ("cpuid"
          :"=a"(a), "=b"(b), "=c"(c), "=d"(d)
          :"a"(a), "c"(c)
      );

      // Intel® SHA Extensions feature bit is EBX[29]
      return ((b >> 29) & 1);
  }
  --
  On a CPU with sha_ni the program returns "1". Otherwise it returns "0".

  * Upstream work:

  - Repository : https://github.com/openssl/openssl.git
+
  - Commits :
-
  1aed5e1 crypto/x86*cpuid.pl: move extended feature detection.
  f8418d8 crypto/x86_64cpuid.pl: move extended feature detection upwards.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1674399

Title:
  OpenSSL CPU detection for AMD Ryzen CPUs

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1674399/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
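
As a cross-check against the [1] /proc/cpuinfo listing in the description, the kernel's view of the flag can also be tested programmatically. The sketch below is only an illustration: line_has_token and cpuinfo_has_flag are hypothetical helper names, and it assumes a Linux-style cpuinfo file whose "flags" line fits in the 8192-byte buffer used here. It matches whole tokens only, so a flag such as "sha_ni" is not confused with a longer name that merely contains it.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if `name` appears as a whole whitespace-separated token in `line`. */
int line_has_token(const char *line, const char *name)
{
    size_t len = strlen(name);
    const char *p = line;

    if (len == 0)
        return 0;

    while ((p = strstr(p, name)) != NULL) {
        /* The match must start at the line start or after a blank. */
        int starts = (p == line) || p[-1] == ' ' || p[-1] == '\t';
        /* The match must end at a blank, a newline, or the end of the string. */
        char after = p[len];
        int ends = after == '\0' || after == ' ' || after == '\t' || after == '\n';

        if (starts && ends)
            return 1;
        p += 1;   /* keep searching after this partial match */
    }
    return 0;
}

/* Scan a cpuinfo-style file for a "flags" line that contains `name`.
 * Returns 1 if found, 0 if not found, -1 if the file cannot be opened.
 * Assumption: each "flags" line fits in the buffer below. */
int cpuinfo_has_flag(const char *path, const char *name)
{
    char line[8192];
    int found = 0;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;

    while (!found && fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "flags", 5) == 0)
            found = line_has_token(line, name);
    }
    fclose(f);
    return found;
}
```

On the Ryzen 5 1600 shown in [1], cpuinfo_has_flag("/proc/cpuinfo", "sha_ni") would be expected to report the flag as present. Like the CPUID program, this says nothing about whether OpenSSL itself detects the extension.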