https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79347

--- Comment #3 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
Further work is needed. I did not fix the actual issue in the vectorizer
because I am really lost in the new prologue code. I hope Bin will chime in
and help me ;)

The situation is not that bad overall, but it could be significantly better ;)

For tramp3d the mismatch count looks as follows:

tramp.ii.156t.ch_vect          185
tramp.ii.157t.ifcvt            267
tramp.ii.158t.vect             1076
...
tramp.ii.226t.optimized        888

The increase from 185 to 267 consists of fake mismatches that should go away
after folding, so the vectorizer itself introduces about 800 new errors.
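
To make the arithmetic explicit, here is a minimal sketch that computes the
per-pass change in mismatch count from the three mainline numbers listed above.
The helper name `deltas` is made up for illustration; it is not part of any GCC
tooling.

```python
# Mismatch counts copied from the mainline tramp3d dump list above.
counts = [
    ("ch_vect", 185),
    ("ifcvt", 267),
    ("vect", 1076),
]


def deltas(rows):
    """Return (pass_name, change vs. the previous pass) for each pass after the first."""
    return [(name, n - prev) for (_, prev), (name, n) in zip(rows, rows[1:])]


for name, change in deltas(counts):
    # ifcvt adds the fake mismatches; vect adds the "about 800" new ones.
    print(f"{name}: {change:+d}")
```

The vect entry is the difference 1076 - 267, which is where the "about 800"
figure comes from.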

On gcc 6 the scores are:
tramp3d-v4.cpp.146t.ch_vect    331
tramp3d-v4.cpp.147t.ifcvt      331
tramp3d-v4.cpp.148t.vect       1050
....
tramp3d-v4.cpp.211t.optimized  1039

So the overall mismatch count on gcc 6 is worse, but the vectorizer is somewhat
less disturbing there. The main reason for the decrease of the mismatch count
on mainline is stronger early opts. On mainline I get:
tramp.ii.091t.ccp2             0
tramp.ii.093t.cunrolli         17
tramp.ii.094t.backprop         17
tramp.ii.095t.phiprop          17
tramp.ii.096t.forwprop2        17
tramp.ii.097t.objsz2           17
tramp.ii.098t.alias            17
tramp.ii.099t.retslot          17
tramp.ii.100t.fre3             42
tramp.ii.101t.mergephi2        40
tramp.ii.102t.thread1          98
tramp.ii.103t.vrp1             265
tramp.ii.105t.dce2             169

while gcc 6 gets:
tramp3d-v4.cpp.084t.oaccdevlow 0
tramp3d-v4.cpp.086t.ccp2       0
tramp3d-v4.cpp.087t.cunrolli   23
tramp3d-v4.cpp.088t.backprop   23
tramp3d-v4.cpp.089t.phiprop    23
tramp3d-v4.cpp.090t.forwprop2  23
tramp3d-v4.cpp.091t.objsz2     23
tramp3d-v4.cpp.092t.alias      23
tramp3d-v4.cpp.093t.retslot    23
tramp3d-v4.cpp.094t.fre3       31
tramp3d-v4.cpp.095t.mergephi2  29
tramp3d-v4.cpp.096t.vrp1       466
tramp3d-v4.cpp.098t.dce2       426


The report on vrp1 itself is wrong because it dumps bbs multiple times, but
clearly vrp1+thread1 now introduce a lot fewer problems than vrp1 did in gcc 6.
Most of those mismatches are justified because we prove some branches with
non-0 probability to be impossible, or we thread in a way contradicting the
profile.

Honza
