On 4/10/24 at 20:37, Bill Allombert wrote:
The test suite works with 3rd generation AMD EPYC processors,
but fails with 4th generation.

Ah! Does the VM set /proc/cpuinfo correctly ?

I don't know. Is there an easy way to check that "/proc/cpuinfo is correct"?

Some years ago, I saw problems where a VM advertised support for CPU
features that were not actually supported.

That would certainly be surprising from AWS; I would assume they have
good engineers for such things.

/usr/bin/mlucas is actually a shell script that tries the various binaries
in /usr/libexec/mlucas/ until it finds one that works on the machine,
by doing

/usr/libexec/mlucas/mlucas-avx512 -fftlen 192 -iters 100 -radset 0
/usr/libexec/mlucas/mlucas-avx2 -fftlen 192 -iters 100 -radset 0
/usr/libexec/mlucas/mlucas-avx -fftlen 192 -iters 100 -radset 0
/usr/libexec/mlucas/mlucas-sse2 -fftlen 192 -iters 100 -radset 0
... until this succeeds.
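
The fallback described above can be sketched roughly as follows. This is a hypothetical reconstruction for illustration, not the script shipped in the package: the function name pick_mlucas_binary is an assumption, and only the binary names and the self-test arguments come from the message.

```shell
#!/bin/sh
# Hypothetical sketch of a wrapper like /usr/bin/mlucas (not the packaged script).
# Try each variant from most to least demanding and print the first one
# whose short self-test exits with status 0.
pick_mlucas_binary() {
    libexec_dir=$1
    for variant in avx512 avx2 avx sse2; do
        candidate="$libexec_dir/mlucas-$variant"
        # Skip variants that are not installed.
        [ -x "$candidate" ] || continue
        # Same self-test arguments as quoted in the message.
        if "$candidate" -fftlen 192 -iters 100 -radset 0 >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    # No variant passed the self-test.
    return 1
}
```

A wrapper built this way would then run something like `exec "$(pick_mlucas_binary /usr/libexec/mlucas)" "$@"`. Note that a probe of this kind only looks at the exit status of the short self-test, so it cannot catch a binary that passes the probe and crashes later on a different workload.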

Could you tell me what you get on epyc 3 and epyc 4 ?

Sure.

On a c6a.large instance (EPYC 3rd generation) the first one fails
with exit status 1 and this error:

INFO: testing qfloat routines...
System total RAM = 3825, free RAM = 193
INFO: 193 MB of free system RAM detected.
CPU Family = x86_64, OS = Linux, 64-bit Version, compiled with Gnu C [or other compatible], Version 14.2.0.
has_avx512: CPUID returns [a,b,c,d] = [  A00F11,   20800,FEFA3203,178BFBFF]
#define USE_AVX512 invoked but no FMA support detected on this CPU! Check get_cpuid functionality and CPU type.
ERROR: at line 2079 of file upstream/src/util.c
Assertion failed: #define USE_AVX512 invoked but no FMA support detected on this CPU! Check get_cpuid functionality and CPU type.

so the following one (mlucas-avx2) is chosen.
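
As a cross-check of the register dump in that log, here is a small sketch that decodes the values by hand. It assumes the four numbers printed by has_avx512 are the EAX/EBX/ECX/EDX results of CPUID leaf 1 (the log does not say which leaf), and it uses the documented bit positions for that leaf: ECX bit 12 is FMA and ECX bit 28 is AVX. The helper names are made up for this example.

```shell
#!/bin/sh
# Decode a CPUID leaf-1 EAX signature into family/model/stepping.
decode_signature() {
    eax=$(( $1 ))
    stepping=$(( eax & 0xF ))
    base_model=$(( (eax >> 4) & 0xF ))
    base_family=$(( (eax >> 8) & 0xF ))
    ext_model=$(( (eax >> 16) & 0xF ))
    ext_family=$(( (eax >> 20) & 0xFF ))
    family=$base_family
    model=$base_model
    # AMD rule: the extended fields only apply when the base family is 0xF.
    if [ "$base_family" -eq 15 ]; then
        family=$(( base_family + ext_family ))
        model=$(( (ext_model << 4) | base_model ))
    fi
    echo "family=$family model=$model stepping=$stepping"
}

# Succeed if bit number $2 is set in register value $1.
bit_is_set() {
    [ $(( ($1 >> $2) & 1 )) -eq 1 ]
}

decode_signature 0xA00F11                     # prints family=25 model=1 stepping=1
bit_is_set 0xFEFA3203 12 && echo "FMA bit set"
bit_is_set 0xFEFA3203 28 && echo "AVX bit set"
```

If the leaf-1 assumption holds, the decoded signature matches the c6a.large /proc/cpuinfo (family 25, model 1, stepping 1), and the FMA bit is set in the logged ECX, which agrees with the fma flag listed there. That would suggest the "no FMA support" wording of the assertion is generic and that the feature actually missing on this machine is AVX-512 itself, but this is an inference from the log, not something checked against the Mlucas source.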

On a c7a.large instance (EPYC 4th generation), the first one exits with
exit status 0, but then "mlucas -s m" segfaults and creates a core dump,
which I've just put here in case it helps:

https://people.debian.org/~sanvila/build-logs/mlucas/core.gz

(gdb says "Core was generated by `/usr/libexec/mlucas/mlucas-avx512 -s m'")

(Beware: it is 956 MB when uncompressed.)

I attach the contents of /proc/cpuinfo for the c6a.large and c7a.large 
instances.
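
To compare the two attachments quickly, something like the following sketch could be used. It assumes each dump is saved to its own file with the flags entry on a single line, as in a real /proc/cpuinfo (the copies in this mail are line-wrapped); the file names and helper names are hypothetical.

```shell
#!/bin/sh
# Print the flags of the first processor entry, one per line, sorted.
cpu_flags() {
    grep -m1 '^flags' "$1" | cut -d: -f2 | tr -s ' ' '\n' | grep -v '^$' | sort -u
}

# Print the flags present in the second file but not in the first.
new_flags() {
    old=$(mktemp) && new=$(mktemp) || return 1
    cpu_flags "$1" > "$old"
    cpu_flags "$2" > "$new"
    # comm -13 keeps only the lines unique to the second (sorted) input.
    comm -13 "$old" "$new"
    rm -f "$old" "$new"
}
```

For example, `new_flags cpuinfo-c6a.txt cpuinfo-c7a.txt` would list what the c7a.large advertises on top of the c6a.large, which for the dumps attached here includes the avx512* flags. This only shows what the VM advertises; it does not prove that the advertised features actually work.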

Maybe one of those is broken and I can just remove it.
(I already disabled sse2 on i386 due to crashes)

Looks reasonable, since this is also a crash.

Thanks.
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 25
model           : 1
model name      : AMD EPYC 7R13 Processor
stepping        : 1
microcode       : 0xa0011d5
cpu MHz         : 2649.998
cache size      : 512 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 16
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb 
rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf 
tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe 
popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm 
sse4a misalignsse 3dnowprefetch topoext invpcid_single ssbd ibrs ibpb stibp 
vmmcall fsgsbase bmi1 avx2 smep bmi2 invpcid rdseed adx smap clflushopt clwb 
sha_ni xsaveopt xsavec xgetbv1 clzero xsaveerptr rdpru wbnoinvd arat npt 
nrip_save vaes vpclmulqdq rdpid
bugs            : sysret_ss_attrs null_seg spectre_v1 spectre_v2 
spec_store_bypass srso
bogomips        : 5299.99
TLB size        : 2560 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : AuthenticAMD
cpu family      : 25
model           : 1
model name      : AMD EPYC 7R13 Processor
stepping        : 1
microcode       : 0xa0011d5
cpu MHz         : 2649.998
cache size      : 512 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
apicid          : 1
initial apicid  : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 16
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb 
rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf 
tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe 
popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm 
sse4a misalignsse 3dnowprefetch topoext invpcid_single ssbd ibrs ibpb stibp 
vmmcall fsgsbase bmi1 avx2 smep bmi2 invpcid rdseed adx smap clflushopt clwb 
sha_ni xsaveopt xsavec xgetbv1 clzero xsaveerptr rdpru wbnoinvd arat npt 
nrip_save vaes vpclmulqdq rdpid
bugs            : sysret_ss_attrs null_seg spectre_v1 spectre_v2 
spec_store_bypass srso
bogomips        : 5299.99
TLB size        : 2560 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management:

processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 25
model           : 17
model name      : AMD EPYC 9R14
stepping        : 1
microcode       : 0xa101148
cpu MHz         : 2599.998
cache size      : 1024 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 16
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb 
rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf 
tsc_known_freq pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic 
movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy 
abm sse4a misalignsse 3dnowprefetch topoext perfctr_core invpcid_single ssbd 
perfmon_v2 ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase bmi1 avx2 smep bmi2 
invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd 
sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512_bf16 clzero 
xsaveerptr rdpru wbnoinvd arat avx512vbmi pku ospke avx512_vbmi2 gfni vaes 
vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid flush_l1d
bugs            : sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass srso
bogomips        : 5199.99
TLB size        : 3584 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : AuthenticAMD
cpu family      : 25
model           : 17
model name      : AMD EPYC 9R14
stepping        : 1
microcode       : 0xa101148
cpu MHz         : 2599.998
cache size      : 1024 KB
physical id     : 0
siblings        : 2
core id         : 1
cpu cores       : 2
apicid          : 1
initial apicid  : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 16
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb 
rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf 
tsc_known_freq pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic 
movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy 
abm sse4a misalignsse 3dnowprefetch topoext perfctr_core invpcid_single ssbd 
perfmon_v2 ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase bmi1 avx2 smep bmi2 
invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd 
sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512_bf16 clzero 
xsaveerptr rdpru wbnoinvd arat avx512vbmi pku ospke avx512_vbmi2 gfni vaes 
vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid flush_l1d
bugs            : sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass srso
bogomips        : 5199.99
TLB size        : 3584 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management:
