Re: [Mesa-dev] [PATCH] gallivm: optimize lp_build_minify for sse

2013-11-05 Thread Jose Fonseca
Looks great to me

Jose

- Original Message -
> From: Roland Scheidegger
>
> SSE can't handle true vector shifts (with variable shift count),
> so llvm is turning them into a mess of extracts, scalar shifts and inserts.
> It is however possible to emulate them in lp_build_minify with floa

Re: [Mesa-dev] [PATCH] gallivm: optimize lp_build_minify for sse

2013-11-05 Thread Brian Paul
On 11/05/2013 11:22 AM, srol...@vmware.com wrote:
From: Roland Scheidegger

SSE can't handle true vector shifts (with variable shift count), so llvm is turning them into a mess of extracts, scalar shifts and inserts. It is however possible to emulate them in lp_build_minify with float muls, whic

[Mesa-dev] [PATCH] gallivm: optimize lp_build_minify for sse

2013-11-05 Thread sroland
From: Roland Scheidegger

SSE can't handle true vector shifts (with variable shift count), so llvm is turning them into a mess of extracts, scalar shifts and inserts. It is however possible to emulate them in lp_build_minify with float muls, which should be way faster (saves over 20 instructions p
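
For readers unfamiliar with the trick, here is a rough sketch of how a per-lane right shift can be replaced by a float multiply when computing a minified mip size, max(base_size >> level, 1). This is not the code from the patch: the function names `pow2_neg` and `minify4_float` are hypothetical, the lanes are written as a plain C loop rather than gallivm IR, and building 2^-level by writing the float exponent bits directly is an assumption about one way to do it. It also assumes non-negative sizes small enough to be exact in a float (below 2^24) and levels well below 127.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build the float 2^-level by writing the biased
 * exponent (127 - level) into bits 23..30. The shift by 23 is a constant
 * shift, which SSE2 can apply to all lanes at once. */
static float pow2_neg(uint32_t level)
{
    uint32_t bits = (127u - level) << 23;
    float f;
    memcpy(&f, &bits, sizeof f);   /* bit cast without aliasing issues */
    return f;
}

/* Hypothetical 4-lane minify: out[i] = max(base_size[i] >> level[i], 1),
 * computed with a float multiply instead of a variable shift. */
static void minify4_float(const int32_t base_size[4],
                          const int32_t level[4],
                          int32_t out[4])
{
    for (int i = 0; i < 4; i++) {
        /* Scaling by a power of two is exact for these inputs. */
        float scaled = (float)base_size[i] * pow2_neg((uint32_t)level[i]);
        /* Truncation toward zero matches a right shift for values >= 0. */
        int32_t size = (int32_t)scaled;
        out[i] = size < 1 ? 1 : size;
    }
}
```

Each step in the loop body (int-to-float conversion, multiply, truncating conversion, clamp) has a direct packed SSE/SSE2 counterpart, so no lane needs to be extracted and shifted on its own.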