[Abe wrote:]
TTBOMK, that "expense" is temporary, pending fixes/improvement to the new converter. IOW, I don`t think any of the relevant regressions are as a result of "essential complexity".
[Richard wrote:]
I don't see how you can fix this without scatter support in the CPU.
Well, for some test cases, I can see in the GIMPLE dumps that a/the scratchpad wasn`t needed here and wasn`t needed there. IOW, the converter that "knows how" to indirect via scratchpads is doing so _always_ rather than only when it`s truly needed. This is causing additional indirection which the vectorizer rejects. Making the new converter "smarter" should fix this. Overall, however, I see your point and agree with it. Gather support is needed for the loads and scatter for the stores, in the general case. [Richard wrote:]
But then you should restrict the if-conversion to targets that can do gather vectorization.
I think adding this requirement -- to generalize, _two_ req.s: _gather_ support for _load_ if conversion with [a] scratchpad[s], and scatter support for _store_ if conversion with [a] scratchpad[s] -- is a reasonable and probably wise goal. Doing so should eventually enable us to conclude automatically "it is safe to enable this kind of if conversion for/on this target" and enable the relevant transformations automatically when given e.g. "-O3" and autovectorization. [Abe wrote]
As of now, the relevant "improvement" in/due-to the new converter is not always effective [...] We intend to fix this.
[Richard wrote:]
This should be fixed before the patch can go in.
The current plan is to add new switches that control which [if any] if converter is active, and to have the old one _and_ the new one be included in GCC trunk for the time being. Then I/we can have more eyes on the new code, helping with bugs etc., without having any impact at all on code that only either is compiled with the old flags [which shall retain their old semantics for now] or with no relevant flags whatsoever. Is that plan OK? [Richard wrote:]
There is the existing mechanism to if-convert only a copy of the loop guarded with IFN_LOOP_VECOTRIZED so the cost model is that of the vectorizer when it tries to vectorize the if-converted copy. If it doesn't vectorize that copy the non-if-converted loop is preserved, otherwise it is deleted.
I think I found the code to which you are referring here, and I tried to use it [so as to not reduplicate efforts], but my first effort at doing so was unsuccessful AFAIK. Trying again -- this time with help from Jakub, I hope -- is on my work agenda. [Richard wrote:]
I've argued in the past that the GIMPLE if-converter should always go that path (thus strictly be a vectorization enabler only).
If I understand correctly what you mean by that, then I agree.
if-conversion is already on by default if vectorizing.
> We only do not introduce store-data-races by default. Thanks. Do you know if the e.g. GCC 5.1 [and 5.2, if bug fix not included] release[s] were "safe" even without your recent bug fix [July 10] for code _not_ compiled with "-ftree-loop-if-convert-stores"? I ask b/c the test case I wrote which was able to cause GCC to generate crashing code [for a correct-under-low-optimization test case] intentionally had a crash due to dereferencing a null pointer in a _load_. IIRC, getting GCC to generate that bad code required "-ftree-loop-if-convert-stores" despite it being a faulty _load_. FYI: I have made some progress on the SPEC testing, but I do not yet have the work at a point where I`m ready to report numbers. Regards, Abe