Hi Jakub,

I looked through your patch. It looks good enough, although it will likely need to be improved to get better vectorization for AVX2. One general issue is that you introduced a new pass to undo if-conversion, which leads to a restriction in if-conversion that prohibits chaining conditions:

/* Avoid creating mask loads/stores if we'd need to chain conditions, to make it easier to undo them. */
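
For illustration, here is a hand-written C sketch of what chaining conditions means for mask loads/stores. It is not what the patch or gcc actually emits; the function names and the mask_load/mask_store helpers are made up for this example and only model the semantics of masked memory accesses in scalar code.

```c
#include <stddef.h>

/* Original loop: c[i] is read and a[i] is written only under
   nested conditions. */
void scalar_loop(int *a, const int *b, const int *c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (b[i] > 0)
            if (c[i] < 10)
                a[i] = b[i] + c[i];
}

/* Hypothetical helpers modelling masked accesses: memory is touched
   only when the mask is set. */
static int mask_load(const int *p, int mask)
{
    return mask ? *p : 0;
}

static void mask_store(int *p, int mask, int value)
{
    if (mask)
        *p = value;
}

/* If-converted form: straight-line loop body, control flow replaced
   by masks. The store mask m2 is chained from m1. */
void if_converted_loop(int *a, const int *b, const int *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int m1 = b[i] > 0;                  /* outer condition */
        int cv = mask_load(&c[i], m1);      /* load guarded by m1 */
        int m2 = m1 & (cv < 10);            /* chained condition */
        int v  = m2 ? b[i] + cv : 0;        /* select, no overflow when masked off */
        mask_store(&a[i], m2, v);           /* store guarded by m1 && m2 */
    }
}
```

The restriction quoted above would reject the second condition here, since the store mask depends on both m1 and the inner comparison.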

I assume you could do this without an undo pass by simply creating a copy of the handled loop, then restoring the copy on failure or deleting it on success (many compilers use this approach for software pipelining of loops, aka modulo scheduling). Is this difficult to implement in the gcc framework, or did you simply miss it?

Also, the current implementation is based on if-conversion, although predication is preferable for this and would allow us to vectorize more loop patterns. What is your opinion?

Best regards,
Yuri.