https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81616
--- Comment #22 from Jan Hubicka <hubicka at ucw dot cz> ---
Hi,
this is the same base (so you can see there is some noise) compared to
Haswell tuning:

                  Base      Base   Base        Peak      Peak   Peak
Benchmark     Ref Time  Run Time  Ratio    Ref Time  Run Time  Ratio
164.gzip          1400      57.1   2452 *      1400      58.7   2384 *
175.vpr           1400      37.1   3776 *      1400      38.3   3659 *
176.gcc           1100      20.0   5500 *      1100      20.1   5464 *
181.mcf           1800      21.6   8327 *      1800      20.9   8617 *
186.crafty        1000      20.4   4905 *      1000      21.0   4760 *
197.parser        1800      51.3   3506 *      1800      51.9   3466 *
252.eon           1300      18.2   7162 *      1300      19.2   6781 *
253.perlbmk                             X                            X
254.gap                                 X                            X
255.vortex                              X                            X
256.bzip2         1500      42.4   3537 *      1500      44.1   3401 *
300.twolf         3000      56.4   5317 *      3000      56.3   5328 *
Est. SPECint_base2000              4632
Est. SPECint2000                                                4548

168.wupwise       1600      28.2   5667 *      1600      28.7   5580 *
171.swim          3100      26.3  11807 *      3100      27.4  11304 *
172.mgrid         1800      26.0   6930 *      1800      31.0   5810 *
173.applu         2100      25.5   8239 *      2100      25.6   8193 *
177.mesa          1400      23.4   5970 *      1400      22.9   6116 *
178.galgel                              X                            X
179.art           2600      10.9  23807 *      2600      10.4  25014 *
183.equake        1300      12.9  10039 *      1300      12.9  10060 *
187.facerec       1900      17.3  11009 *      1900      20.8   9135 *
188.ammp          2200      34.2   6441 *      2200      34.2   6428 *
189.lucas         2000      20.7   9683 *      2000      20.7   9679 *
191.fma3d         2100      29.7   7060 *      2100      31.5   6660 *
200.sixtrack      1100      38.6   2847 *      1100      40.9   2687 *
301.apsi          2600      33.1   7866 *      2600      32.7   7952 *
Est. SPECfp_base2000               8045
Est. SPECfp2000                                                 7766

So mesa, art and mcf seem to benefit from Haswell tuning. Mesa is a
vectorization problem (we vectorize a cold loop and introduce too much
register pressure).

What is interesting, however, is that Zen tuning with 256-bit vectorization
seems to be worse than Haswell tuning. I will run Haswell with 128-bit
vector size.

What your matrix multiplication benchmark runs into is an issue with the
multiply-and-add instruction. Once the machine is free I will try it, but
disabling fmadd may solve the regression.

Honza
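
For reference, the multiply-and-add pattern mentioned above can be sketched as follows. This is a hypothetical kernel written only for illustration, not the benchmark from this PR, and the name matmul_acc is an assumption. The inner statement is a multiply followed by an add, which GCC may contract into a fused multiply-add when the target supports FMA. Whether disabling that actually helps on Zen is untested here.

```c
#include <stddef.h>

/* Hypothetical kernel (not the benchmark from this PR):
   c += a * b for n x n row-major matrices.

   Possible experiments, assuming the file is saved as mm.c:
     gcc -O3 -march=znver1 -c mm.c            FMA may be used
     gcc -O3 -march=znver1 -mno-fma -c mm.c   FMA instructions disabled
   -ffp-contract=off keeps FMA available to the target but stops the
   compiler from fusing the multiply and the add.  */
void
matmul_acc (size_t n, const double *a, const double *b, double *c)
{
  for (size_t i = 0; i < n; i++)
    for (size_t k = 0; k < n; k++)
      {
        double aik = a[i * n + k];
        for (size_t j = 0; j < n; j++)
          /* Multiply followed by add: the candidate for fmadd.  */
          c[i * n + j] += aik * b[k * n + j];
      }
}
```

Comparing the run times of the two builds on the same machine would show whether the fused instruction accounts for the difference.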