Looks great to me
Jose
- Original Message -
> From: Roland Scheidegger
>
> SSE can't handle true vector shifts (with variable shift count),
> so llvm is turning them into a mess of extracts, scalar shifts and inserts.
> It is however possible to emulate them in lp_build_minify with floa
On 11/05/2013 11:22 AM, srol...@vmware.com wrote:
From: Roland Scheidegger
SSE can't handle true vector shifts (with variable shift count),
so llvm is turning them into a mess of extracts, scalar shifts and inserts.
It is however possible to emulate them in lp_build_minify with float muls,
which should be way faster (saves over 20 instructions p
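
The float-multiply idea in the message above can be sketched for a single lane as follows. This is only an illustration under stated assumptions, not the code from the patch: the names minify_shift and minify_float_mul are hypothetical, the minify operation is assumed to be max(1, base_size >> level), the base size is assumed to be below 2^24 so that it converts to float exactly, and the level is assumed to be below 32. A vectorized version would apply the same steps to every lane at once.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reference version (hypothetical name): plain integer shift, clamped to 1. */
static uint32_t minify_shift(uint32_t base_size, uint32_t level)
{
    uint32_t size = base_size >> level;
    return size > 1 ? size : 1;
}

/* Emulated version (hypothetical name): replace the variable shift with a
 * multiply by 2^-level, built directly from the IEEE-754 exponent field. */
static uint32_t minify_float_mul(uint32_t base_size, uint32_t level)
{
    /* Assumptions of this sketch: the size is exactly representable as a
     * float, and the level keeps the exponent in the normal range. */
    assert(base_size < (1u << 24));
    assert(level < 32);

    /* Biased exponent (127 - level) with a zero mantissa encodes 2^-level. */
    uint32_t bits = (127u - level) << 23;
    float scale;
    memcpy(&scale, &bits, sizeof scale);

    /* Multiplying by a power of two is exact here, so truncating the
     * product gives the same value as base_size >> level. */
    float size = (float)base_size * scale;

    /* Clamp to 1 while still in float, then truncate toward zero. */
    if (size < 1.0f)
        size = 1.0f;
    return (uint32_t)size;
}
```

For example, minify_float_mul(256, 3) and minify_shift(256, 3) both give 32, and both clamp to 1 once the level exceeds the number of significant bits in the base size. The only per-lane integer shift left is by the constant 23, which SSE2 can do on a whole vector in one instruction.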