kmclaughlin abandoned this revision.
kmclaughlin added a comment.

I just wanted to give an update on this patch, which I'm abandoning for the 
time being:

@lebedev.ri raised some good questions about the approach taken and whether the 
additional compile time spent would be worth the additional opportunities for 
vectorisation. After posting the last update, I collected some benchmark 
results using SPEC CPU 2017 to get a better understanding of the impact of 
these changes and found that several benchmarks showed performance regressions 
for fixed-width vectorisation.

The biggest outliers (in terms of percentage runtime change) were:
520.omnetpp_r: -3.00%
500.perlbench_r: -2.00%
502.gcc_r: -1.52%

I also collected results after adding a threshold on the number of cases to be 
unswitched (set to 4), as was included in the first draft of this patch. This 
configuration also showed regressions in the benchmarks run and no significant 
improvements. Both sets of results showed increased compile times for many 
benchmarks.

The same benchmarks as above, with the threshold of 4 set:
520.omnetpp_r: -3.46%
500.perlbench_r: -1.20%
502.gcc_r: -1.22%

Results were collected on a Neoverse-N1 machine. Given that these results 
indicate this isn't the best approach to take, I'm abandoning the patch for 
now. When this is picked up in future, it will likely be better to follow 
either the suggestion to prevent canonicalisation of branches & compares into 
switch statements (under a given number of cases) in the first place, or to 
teach the loop vectoriser to recognise switches.
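
For anyone picking this up later, here is a minimal sketch of the kind of loop 
under discussion. The function names and case values are hypothetical and are 
not taken from the patch or its tests. The first function uses a small switch 
in the loop body; the second is the equivalent chain of compares and branches 
that the first alternative above would aim to preserve. Whether either form is 
actually vectorised depends on the compiler version and target, so this is 
illustrative only.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical example: a loop whose body is a small switch (3 cases plus
// a default), i.e. below the threshold of 4 mentioned above.
void scale_switch(int32_t *out, const int32_t *in, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    switch (in[i]) {
    case 0:  out[i] = 10;    break;
    case 3:  out[i] = 20;    break;
    case 7:  out[i] = 30;    break;
    default: out[i] = in[i]; break;
    }
  }
}

// The same computation written as a chain of compares and branches.
void scale_branches(int32_t *out, const int32_t *in, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    const int32_t v = in[i];
    if (v == 0)
      out[i] = 10;
    else if (v == 3)
      out[i] = 20;
    else if (v == 7)
      out[i] = 30;
    else
      out[i] = v;
  }
}
```

Both functions compute the same result for any input, so comparing the 
optimised IR or the vectoriser remarks for the two is one way to check which 
form the vectoriser handles on a given target.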


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108138/new/

https://reviews.llvm.org/D108138

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits