On 12/7/18 11:48 AM, Wilco Dijkstra wrote:
> Hi,
>
>>> Ultimately, the best solution here will probably depend on which we
>>> think is more likely, copysign or the example I give above.
>> I'd tend to suspect we'd see more pure integer bit twiddling than the
>> copysign stuff.
>
> All we need to do is to clearly separate the integer and FP/SIMD cases.
> Copysign should always expand into a pattern that cannot generate
> integer instructions. This could be done by adding a bit/bif pattern with
> UNSPEC for the DI/SImode case or use V2DI/V2SI in the copysign
> expansion.
As I've noted, adding those unspecs is likely to get in the way of
things like CSE, combine, etc.
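
To make the two competing cases concrete, here is a rough C sketch of
copysign written as explicit bit operations next to a plain integer bit
insert. It is not taken from the patch, and the function names are made up
for illustration. Both reduce to the same (a & ~mask) | (b & mask) shape,
which is why one RTL pattern can end up matching either. Which instruction
a given compiler actually picks for each is not something this sketch
claims.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copysign spelled out as bit operations.
   Keeps the magnitude bits of x and takes the sign bit of y. */
static double copysign_bits(double x, double y)
{
    const uint64_t sign = UINT64_C(1) << 63;
    uint64_t xb, yb, rb;
    double r;

    memcpy(&xb, &x, sizeof xb);      /* view the doubles as raw bits */
    memcpy(&yb, &y, sizeof yb);
    rb = (xb & ~sign) | (yb & sign); /* bit-select on the sign mask */
    memcpy(&r, &rb, sizeof r);
    return r;
}

/* Hypothetical helper: pure integer bit twiddling of the same shape.
   Replaces the low `width` bits of dst with the low `width` bits of src. */
static uint64_t insert_low_bits(uint64_t dst, uint64_t src, unsigned width)
{
    /* Avoid shifting by 64, which is undefined in C. */
    uint64_t mask = width >= 64 ? ~UINT64_C(0)
                                : (UINT64_C(1) << width) - 1;
    return (dst & ~mask) | (src & mask);
}
```

The first function wants its operands to stay in FP/SIMD registers, the
second in general registers; the expression shape alone does not say which.
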
>
>> Could we have the bfxil pattern have an alternative that accepts vector
>> regs and generates bit in appropriate circumstances?
>
> We already do that in too many cases, and it only makes the problem
> worse since the register allocator cannot cost these patterns at all (let
> alone accurately). This is particularly bad when the expansions are
> wildly different and emit extra instructions which cannot be optimized
> after register allocation.
I'm not sure what you mean by it can't cost them.  Costs certainly
factor into the algorithms used by IRA/LRA.  But I would agree that it's
not particularly good at costing across register banks and modeling the
cost of reloads it'll have to generate if it needs to move a value from
one bank to another.

>
> We simply need to make an early choice which register file to use.
GCC fundamentally isn't designed to do that.

>
>> Hmm, maybe the other way around would be better. Have the "bit"
>> pattern with a general register alternative that generates bfxil when
>> presented with general registers.
>
> We already have that, and that's a very complex pattern which already
> results in inefficient integer code.
>
> For the overlapping cases between bfi and bfxil the mid-end should really
> simplify one into the other to avoid having to have multiple MD patterns
> for equivalent expressions. This may solve the problem.
Well, the bfxil pattern is general enough to handle both, the problem is
it only works on one register file.

>
>> I would generally warn against hiding things in unspecs like you've
>> suggested above. We're seeing cases where that's been on in the x86
>> backend and it's inhibiting optimizations in various places.
>
> In the general case we can't describe a clear preference for a specific
> register file without support for scalar vector types (eg. V1DI, V1SI) or
> having a way to set virtual register preferences at expand time.
I'm going to step away from this problem.
It looked like it might be tractable, but there's clearly a lot more to it
and someone with more experience on aarch64 will have to run with it.

Patch withdrawn.

jeff