https://github.com/RKSimon requested changes to this pull request.

We really shouldn't need separate 128 AND 256 variants. This is a general hop 
(horizontal op) pattern that you should be able to adapt:
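```cpp
#include <immintrin.h> // provides __v16hi (16 x i16, 256-bit)

using T = __v16hi;
T hop(T x, T y) {
    // Derive the shape from the type so the same body works for any width.
    unsigned NumElems = sizeof(x) / sizeof(x[0]);
    unsigned NumLanes = sizeof(x) / 16; // 128-bit lanes
    unsigned NumElemsPerLane = NumElems / NumLanes;
    unsigned HalfElemsPerLane = NumElemsPerLane / 2;

    T r;
    for (unsigned L = 0; L != NumElems; L += NumElemsPerLane) {
        // Low half of each result lane: adjacent pairs from x.
        for (unsigned E = 0; E != HalfElemsPerLane; ++E) {
            r[L + E] = x[L+(2*E)+0] - x[L+(2*E)+1];
        }
        // High half of each result lane: adjacent pairs from y.
        for (unsigned E = 0; E != HalfElemsPerLane; ++E) {
            r[L + E + HalfElemsPerLane] = y[L+(2*E)+0] - y[L+(2*E)+1];
        }
    }
    return r;
}
```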
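
To illustrate why one body can serve both widths, here is a rough, portable sketch of the same loop templated on element type and count. It is only an illustration, not the code this PR should contain: `hsub_lanes` is a hypothetical name, and `std::array` stands in for the vector types so the lane arithmetic can be checked on any host.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: std::array replaces __v8hi / __v16hi here purely so the
// indexing can be exercised without vector extensions or AVX flags.
template <typename T, std::size_t N>
std::array<T, N> hsub_lanes(const std::array<T, N> &x,
                            const std::array<T, N> &y) {
    static_assert((N * sizeof(T)) % 16 == 0, "expects whole 128-bit lanes");
    constexpr std::size_t NumLanes = (N * sizeof(T)) / 16;
    constexpr std::size_t NumElemsPerLane = N / NumLanes;
    constexpr std::size_t HalfElemsPerLane = NumElemsPerLane / 2;

    std::array<T, N> r{};
    for (std::size_t L = 0; L != N; L += NumElemsPerLane) {
        for (std::size_t E = 0; E != HalfElemsPerLane; ++E) {
            // Low half of the lane comes from x, high half from y.
            r[L + E] = static_cast<T>(x[L + 2 * E] - x[L + 2 * E + 1]);
            r[L + E + HalfElemsPerLane] =
                static_cast<T>(y[L + 2 * E] - y[L + 2 * E + 1]);
        }
    }
    return r;
}
```

With `N = 8` and `T = short` this degenerates to a single lane (the 128-bit case); with `N = 16` the outer loop runs twice (the 256-bit case), so no width-specific copy is needed.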

https://github.com/llvm/llvm-project/pull/156822
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
