Re: Interesting statistics on vectorization for Skylake avx512 (i9-7900) - 8.1 vs. 7.3.
On Thu, May 3, 2018 at 8:43 PM, Toon Moene wrote:
> Consider the attached Fortran code (the most expensive routine,
> computation-wise, in our weather forecasting model).
>
> verint.s.7.3 is the result of:
>
> gfortran -g -O3 -S -march=native -mtune=native verint.f
>
> using release 7.3.
>
> verint.s.8.1 is the result of:
>
> gfortran -g -O3 -S -march=native -mtune=native verint.f
>
> using the recently released GCC 8.1.
>
> $ wc -l verint.s.7.3 verint.s.8.1
>   7818 verint.s.7.3
>   6087 verint.s.8.1
>
> $ grep vfma verint.s.7.3 | wc -l
> 381
> $ grep vfma verint.s.8.1 | wc -l
> 254
>
> but:
>
> $ grep vfma verint.s.7.3 | grep -v ss | wc -l
> 127
> $ grep vfma verint.s.8.1 | grep -v ss | wc -l
> 127
>
> and:
>
> $ grep movaps verint.s.7.3 | wc -l
> 306
> $ grep movaps verint.s.8.3 | wc -l
> 270
>
> Finally:
>
> $ grep zmm verint.s.7.3 | wc -l
> 1494
> $ grep zmm verint.s.8.1 | wc -l
> 0
> $ grep ymm verint.s.7.3 | wc -l
> 379
> $ grep ymm verint.s.8.1 | wc -l
> 1464
>
> I haven't had the opportunity to test this for speed (is quite complicated,
> as I have to build several support libraries with 8.1, like openmpi, netcdf,
> hdf{4|5}, fftw ...)

GCC 8 has changes to prefer AVX256 by default for Skylake-avx512,
even with AVX512 available. You can change that with
-mprefer-vector-width=512 or by changing the avx256_optimal tune
via -mtune-ctrl=^avx256_optimal

There are now also measures in place to avoid fma in certain
situations where it doesn't help performance.

So - performance measurements would be nice to have ;)

Richard.

>
> --
> Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
> Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
> At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
> Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
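The grep/wc pipelines quoted above can be collected into one small script, so that the same counts are produced for each assembly listing in a single pass. The following is only a sketch in Python: the function names and the sample lines are made up for illustration (they are not taken from verint.s), and the counting deliberately mirrors "grep PATTERN | grep -v EXCLUDE | wc -l", i.e. it counts matching lines, not individual instructions.

```python
import re


def count_lines(lines, include, exclude=None):
    """Count lines matching `include` that do not match `exclude`.

    Mirrors `grep include | grep -v exclude | wc -l`.
    """
    inc = re.compile(include)
    exc = re.compile(exclude) if exclude else None
    return sum(
        1
        for line in lines
        if inc.search(line) and not (exc and exc.search(line))
    )


def summarize(lines):
    """Return the counts used in the comparison above for one listing."""
    lines = list(lines)
    return {
        "vfma_total": count_lines(lines, r"vfma"),
        # Lines containing "ss" are dropped, as with `grep -v ss`.
        "vfma_non_scalar": count_lines(lines, r"vfma", r"ss"),
        "zmm": count_lines(lines, r"zmm"),
        "ymm": count_lines(lines, r"ymm"),
    }


# Hypothetical sample lines, for illustration only.
sample = [
    "\tvfmadd231ss\t%xmm1, %xmm2, %xmm0",
    "\tvfmadd231ps\t%ymm1, %ymm2, %ymm0",
    "\tvfmadd132ps\t%zmm3, %zmm4, %zmm5",
    "\tvmovaps\t%ymm0, (%rdi)",
]
print(summarize(sample))
```

For a real listing one would pass the open file instead of the sample, e.g. summarize(open("verint.s.8.1")), and compare the resulting dictionaries for the two compiler versions.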
Re: GCC Compiler Optimization ignores or mistreats MFENCE memory barrier related instruction
Hi Alex,

I agree that floating-point division doesn't touch memory, but the fdiv
result (a stack register) is stored back to memory, i.e. fResult. So
shouldn't the compiler barrier in the inline asm (the "memory" clobber)
prevent instructions like "fstps fResult(%rip)" from being moved past
the fence?

BTW, if we make fDivident and fResult = 0.0f globals, the emitted code
looks OK, i.e.

#gcc -S test.c -O3 -mmmx -mno-sse

        flds    .LC0(%rip)
        fsts    fDivident(%rip)
        fdivs   .LC1(%rip)
        fstps   fResult(%rip)
#APP
# 10 "test.c" 1
        mfence
# 0 "" 2
#NO_APP
        flds    fResult(%rip)
        movl    $.LC2, %edi
        xorl    %eax, %eax
        fstpl   (%rsp)
        call    printf

So I strongly believe that it's a compiler issue; please feel free to
correct me if I'm wrong.

Thank you, and waiting for your reply.

~Umesh

On Fri, Apr 13, 2018 at 5:58 PM, Alexander Monakov wrote:
> On Fri, 13 Apr 2018, Vivek Kinhekar wrote:
>> The mfence instruction with memory clobber asm instruction should create a
>> barrier between division and printf instructions.
>
> No, floating-point division does not touch memory, so the asm does not (and
> need not) restrict its motion.
>
> Alexander
ANN: gcc-python-plugin 0.16
gcc-python-plugin is a plugin for GCC 4.6 onwards which embeds the
CPython interpreter within GCC, allowing you to write new compiler
warnings in Python, generate code visualizations, etc.

This release adds support for gcc 7 and gcc 8 (along with continued
support for gcc 4.6, 4.7, 4.8, 4.9, 5 and 6).

The upstream location for the plugin has moved from fedorahosted.org
to https://github.com/davidmalcolm/gcc-python-plugin

Additionally, this release contains the following improvements:

* add gcc.RichLocation for GCC 6 onwards
* gcc.Location
  * add caret, start, finish attributes for GCC 7 onwards
  * add gcc.Location.offset_column() method

Tarball releases are available at:
  https://github.com/davidmalcolm/gcc-python-plugin/releases

Prebuilt-documentation can be seen at:
  http://gcc-python-plugin.readthedocs.org/en/latest/index.html

The plugin and checker are Free Software, licensed under the GPLv3 or
later.

Enjoy!
Dave Malcolm
gcc-8-20180504 is now available
Snapshot gcc-8-20180504 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/8-20180504/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 8 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-8-branch revision 259960

You'll find:

 gcc-8-20180504.tar.xz                Complete GCC

  SHA256=b49b674524449c999c0966271c2fc4488a2db8cec8d65e78ba6665408577f572
  SHA1=9b4f388d4c8f58d0a4fcfe888a7bc8ca86679d39

Diffs from 8-20180427 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-8
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.
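A downloaded tarball can be checked against the SHA256 value published in the announcement before it is unpacked. The sketch below uses Python's standard hashlib module; the helper names (sha256_of_file, matches_published) and the local file path in the usage comment are assumptions made for illustration, not part of any GCC tooling.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large tarballs need not fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare a file's digest with a published hex value (case-insensitive)."""
    return sha256_of_file(path) == published_hex.strip().lower()


# Usage sketch (the local path is hypothetical):
# ok = matches_published(
#     "gcc-8-20180504.tar.xz",
#     "b49b674524449c999c0966271c2fc4488a2db8cec8d65e78ba6665408577f572",
# )
```

If the comparison fails, the download is incomplete or corrupted and should be fetched again, possibly from another mirror.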