Re: [Mesa-dev] [PATCH V2] mesa: add SSE optimisation for glDrawElements

2014-10-25 Thread Jason Ekstrand
On Fri, Oct 24, 2014 at 10:38 AM, Daniel Stone wrote:
> Hi,
>
> On 24 October 2014 18:51, Emil Velikov wrote:
>
>> Sigh... why can't everyone be like Gentoo - set compiler flags and
>> rebuild for your machine/cpu :P
>>
>> Apart from the Makefile.sources change spotted by Matt, can you make use

Re: [Mesa-dev] [PATCH V2] mesa: add SSE optimisation for glDrawElements

2014-10-25 Thread Timothy Arceri
On Fri, 2014-10-24 at 09:11 -0700, Matt Turner wrote:
> On Fri, Oct 24, 2014 at 5:47 AM, Timothy Arceri wrote:
> > Makes use of SSE to speed up compute of min and max elements
> >
> > Callgrind cpu usage results from pts benchmarks:
> >
> > Openarena 0.8.8: 3.67% -> 1.03%
> > UrbanTerror: 2.36% ->

Re: [Mesa-dev] [PATCH V2] mesa: add SSE optimisation for glDrawElements

2014-10-24 Thread Timothy Arceri
On Fri, 2014-10-24 at 18:38 +0100, Daniel Stone wrote:
> Hi,
>
> On 24 October 2014 18:51, Emil Velikov wrote:
> Sigh... why can't everyone be like Gentoo - set compiler flags and
> rebuild for your machine/cpu :P
>
> Apart from the Makefile.sources ch

Re: [Mesa-dev] [PATCH V2] mesa: add SSE optimisation for glDrawElements

2014-10-24 Thread Roland Scheidegger
On 24.10.2014 at 23:06, Ian Romanick wrote:
> On 10/24/2014 05:47 AM, Timothy Arceri wrote:
>> Makes use of SSE to speed up compute of min and max elements
>>
>> Callgrind cpu usage results from pts benchmarks:
>>
>> Openarena 0.8.8: 3.67% -> 1.03%
>> UrbanTerror: 2.36% -> 0.81%
>>
>> Signed-off-b

Re: [Mesa-dev] [PATCH V2] mesa: add SSE optimisation for glDrawElements

2014-10-24 Thread Matt Turner
On Fri, Oct 24, 2014 at 2:06 PM, Ian Romanick wrote:
> On 10/24/2014 05:47 AM, Timothy Arceri wrote:
>> +   vec_count = count & ~0x3;
>> +   ui_indices_ptr = (__m128i*)ui_indices;
>> +   for (i = 0; i < vec_count / 4; i++) {
>> +      ui_indices4 = _mm_loadu_si128(&ui_indices_ptr[i]);
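
The quoted loop is cut off after the unaligned load, so the rest of the patch is not visible here. As a rough, self-contained sketch of the general technique (not the code from the patch), the following assumes GCC or Clang on x86 and uses the SSE4.1 intrinsics `_mm_min_epu32` and `_mm_max_epu32`. The function name `uint_array_min_max_sse41` and its signature are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <smmintrin.h>   /* SSE4.1: _mm_min_epu32, _mm_max_epu32 */

/* Hypothetical helper, not the function from the patch.
 * The target attribute lets GCC/Clang compile this one function
 * with SSE4.1 enabled even without -msse4.1 on the command line. */
__attribute__((target("sse4.1")))
static void
uint_array_min_max_sse41(const unsigned *indices, size_t count,
                         unsigned *min_out, unsigned *max_out)
{
   unsigned min_ui, max_ui;
   /* Round count down to a multiple of 4, as in the quoted loop. */
   size_t vec_count = count & ~(size_t)0x3;
   size_t i;

   assert(count > 0);
   min_ui = indices[0];
   max_ui = indices[0];

   if (vec_count) {
      unsigned lanes_min[4], lanes_max[4];
      __m128i min4 = _mm_set1_epi32((int)min_ui);
      __m128i max4 = min4;

      /* Process four 32-bit indices per iteration with unaligned loads. */
      for (i = 0; i < vec_count; i += 4) {
         __m128i v = _mm_loadu_si128((const __m128i *)(indices + i));
         min4 = _mm_min_epu32(min4, v);   /* unsigned per-lane minimum */
         max4 = _mm_max_epu32(max4, v);   /* unsigned per-lane maximum */
      }

      /* Reduce the four lanes to a single scalar min and max. */
      _mm_storeu_si128((__m128i *)lanes_min, min4);
      _mm_storeu_si128((__m128i *)lanes_max, max4);
      for (i = 0; i < 4; i++) {
         if (lanes_min[i] < min_ui) min_ui = lanes_min[i];
         if (lanes_max[i] > max_ui) max_ui = lanes_max[i];
      }
   }

   /* Scalar tail for the remaining count % 4 elements. */
   for (i = vec_count; i < count; i++) {
      if (indices[i] < min_ui) min_ui = indices[i];
      if (indices[i] > max_ui) max_ui = indices[i];
   }

   *min_out = min_ui;
   *max_out = max_ui;
}
```

The unsigned variants matter for index data: a signed comparison would treat indices at or above 2^31 as negative and report the wrong minimum and maximum.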

Re: [Mesa-dev] [PATCH V2] mesa: add SSE optimisation for glDrawElements

2014-10-24 Thread Ian Romanick
On 10/24/2014 05:47 AM, Timothy Arceri wrote:
> Makes use of SSE to speed up compute of min and max elements
>
> Callgrind cpu usage results from pts benchmarks:
>
> Openarena 0.8.8: 3.67% -> 1.03%
> UrbanTerror: 2.36% -> 0.81%
>
> Signed-off-by: Timothy Arceri
> ---
> src/mesa/Android.libmesa

Re: [Mesa-dev] [PATCH V2] mesa: add SSE optimisation for glDrawElements

2014-10-24 Thread Daniel Stone
Hi,

On 24 October 2014 18:51, Emil Velikov wrote:
> Sigh... why can't everyone be like Gentoo - set compiler flags and
> rebuild for your machine/cpu :P
>
> Apart from the Makefile.sources change spotted by Matt, can you make use
> of USE_SSE41 ? Take a look at commit b3121bfd413 for the whys an
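
The quoted exchange contrasts building with machine-specific compiler flags against a build that also has to run on CPUs without SSE4.1. As a rough sketch of one common approach to the latter (not Mesa's USE_SSE41 mechanism and not the contents of the referenced commit), the code below picks an implementation at run time using the GCC/Clang builtin `__builtin_cpu_supports`. All names here (`HAVE_SSE41_PATH`, `min_max_scalar`, `min_max_sse41`, `select_min_max`) are hypothetical.

```c
#include <stddef.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <smmintrin.h>
#define HAVE_SSE41_PATH 1
#else
#define HAVE_SSE41_PATH 0
#endif

/* Portable fallback; count must be at least 1. */
static void
min_max_scalar(const unsigned *v, size_t count,
               unsigned *min_out, unsigned *max_out)
{
   unsigned lo = v[0], hi = v[0];
   for (size_t i = 1; i < count; i++) {
      if (v[i] < lo) lo = v[i];
      if (v[i] > hi) hi = v[i];
   }
   *min_out = lo;
   *max_out = hi;
}

#if HAVE_SSE41_PATH
/* SSE4.1 variant, compiled with SSE4.1 enabled for this function only;
 * count must be at least 1. */
__attribute__((target("sse4.1")))
static void
min_max_sse41(const unsigned *v, size_t count,
              unsigned *min_out, unsigned *max_out)
{
   size_t vec_count = count & ~(size_t)0x3;
   unsigned lanes[8];
   __m128i lo4 = _mm_set1_epi32((int)v[0]);
   __m128i hi4 = lo4;

   for (size_t i = 0; i < vec_count; i += 4) {
      __m128i x = _mm_loadu_si128((const __m128i *)(v + i));
      lo4 = _mm_min_epu32(lo4, x);
      hi4 = _mm_max_epu32(hi4, x);
   }

   /* lanes[0..3] hold the per-lane minima, lanes[4..7] the maxima. */
   _mm_storeu_si128((__m128i *)&lanes[0], lo4);
   _mm_storeu_si128((__m128i *)&lanes[4], hi4);

   unsigned lo = lanes[0], hi = lanes[4];
   for (size_t i = 1; i < 4; i++) {
      if (lanes[i] < lo) lo = lanes[i];
      if (lanes[4 + i] > hi) hi = lanes[4 + i];
   }
   /* Scalar tail for the last count % 4 elements. */
   for (size_t i = vec_count; i < count; i++) {
      if (v[i] < lo) lo = v[i];
      if (v[i] > hi) hi = v[i];
   }
   *min_out = lo;
   *max_out = hi;
}
#endif

typedef void (*min_max_fn)(const unsigned *, size_t, unsigned *, unsigned *);

/* Return the SSE4.1 variant only when the running CPU reports support. */
static min_max_fn
select_min_max(void)
{
#if HAVE_SSE41_PATH
   __builtin_cpu_init();
   if (__builtin_cpu_supports("sse4.1"))
      return min_max_sse41;
#endif
   return min_max_scalar;
}
```

A caller would typically run `select_min_max()` once and cache the returned pointer, so the CPU check is not repeated on every draw call.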

Re: [Mesa-dev] [PATCH V2] mesa: add SSE optimisation for glDrawElements

2014-10-24 Thread Emil Velikov
Hi Timothy,

On 24/10/14 12:47, Timothy Arceri wrote:
> Makes use of SSE to speed up compute of min and max elements
>
> Callgrind cpu usage results from pts benchmarks:
>
> Openarena 0.8.8: 3.67% -> 1.03%
> UrbanTerror: 2.36% -> 0.81%
>
Sigh... why can't everyone be like Gentoo - set compiler fl

Re: [Mesa-dev] [PATCH V2] mesa: add SSE optimisation for glDrawElements

2014-10-24 Thread Matt Turner
On Fri, Oct 24, 2014 at 5:47 AM, Timothy Arceri wrote:
> Makes use of SSE to speed up compute of min and max elements
>
> Callgrind cpu usage results from pts benchmarks:
>
> Openarena 0.8.8: 3.67% -> 1.03%
> UrbanTerror: 2.36% -> 0.81%
>
> Signed-off-by: Timothy Arceri
> ---
> src/mesa/Android.

Re: [Mesa-dev] [PATCH V2] mesa: add SSE optimisation for glDrawElements

2014-10-24 Thread Timothy Arceri
On Fri, 2014-10-24 at 23:47 +1100, Timothy Arceri wrote:
> +#ifdef __SSE4_1__
> +#include "main/glheader.h"

Just noticed this extra header after sending out the patch. Fixed now.

> +#include "main/sse_minmax.h"
> +#include
> +
> +void
> +_mesa_uint_array_min_max(const unsigned *ui_indices, unsi

[Mesa-dev] [PATCH V2] mesa: add SSE optimisation for glDrawElements

2014-10-24 Thread Timothy Arceri
Makes use of SSE to speed up compute of min and max elements

Callgrind cpu usage results from pts benchmarks:

Openarena 0.8.8: 3.67% -> 1.03%
UrbanTerror: 2.36% -> 0.81%

Signed-off-by: Timothy Arceri
---
 src/mesa/Android.libmesa_dricore.mk | 3 +-
 src/mesa/Makefile.am                | 3 +-