https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608
            Bug ID: 90608
           Summary: Inline masked minloc/maxloc calls
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ktkachov at gcc dot gnu.org
  Target Milestone: ---

Created attachment 46402
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46402&action=edit
minloc example

One of the benchmarks we care about performs much better when compiled with
ifort than with gfortran. One of the differences seems to be that ifort inlines
calls to some minloc/maxloc intrinsics, whereas gfortran emits calls into
libgfortran (for example _gfortran_mminloc0_4_i4) to perform minloc/maxloc
operations combined with a mask over array sections.

Inlining these intrinsics seems to enable further optimisations later in the
pipeline, such as vectorisation.

Attached is a small but representative standalone example of the kind of calls
it would be good to inline.

I'm not too familiar with Fortran or the frontend, but it already seems to
inline some minloc/maxloc intrinsics in trans-intrinsic.c. Would it be possible
to extend that to cover these masked cases?
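
For readers less familiar with the intrinsic, here is a minimal C sketch of the
loop that a masked MINLOC over a rank-1 integer array amounts to once it is
inlined rather than called through the runtime library. It is not the code that
gfortran or ifort generates, and it is not the libgfortran implementation; the
function name masked_minloc_i4 and its signature are hypothetical, chosen only
for illustration. It assumes the usual Fortran semantics: the result is the
1-based position of the first minimal element among those selected by the mask,
or 0 if the mask selects nothing.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a GCC or libgfortran function): position of the
   first minimum of a[0..n-1] among elements whose mask entry is true.
   Returns a 1-based index as Fortran's MINLOC does, or 0 if no element is
   selected by the mask. */
static int32_t masked_minloc_i4(const int32_t *a, const bool *mask, size_t n)
{
    int32_t best_pos = 0;          /* 0 means "nothing selected yet" */
    int32_t best_val = INT32_MAX;  /* only meaningful once best_pos != 0 */

    for (size_t i = 0; i < n; i++) {
        /* Strict '<' keeps the first occurrence when values tie. */
        if (mask[i] && (best_pos == 0 || a[i] < best_val)) {
            best_val = a[i];
            best_pos = (int32_t)(i + 1);
        }
    }
    return best_pos;
}
```

Once the operation is visible to the compiler as a plain loop like this, rather
than hidden behind an opaque library call, the usual loop optimisations have
something to work on, which is the point of the request above.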