Nice, just a few comments below.
On 10/22/2014 10:02 PM, Timothy Arceri wrote:
Makes use of SSE to speed up compute of min and max elements
Callgrind cpu usage results from pts benchmarks:
Openarena 0.8.8: 3.67% -> 1.03%
UrbanTerror: 2.36% -> 0.81%
Signed-off-by: Timothy Arceri
---
src/me
On 10/24/2014 06:28 AM, Timothy Arceri wrote:
On Thu, 2014-10-23 at 09:20 -0600, Brian Paul wrote:
Nice, just a few comments below.
On 10/22/2014 10:02 PM, Timothy Arceri wrote:
Makes use of SSE to speed up compute of min and max elements
Callgrind cpu usage results from pts benchmarks:
Ope
On Thu, 2014-10-23 at 09:20 -0600, Brian Paul wrote:
> Nice, just a few comments below.
>
>
> On 10/22/2014 10:02 PM, Timothy Arceri wrote:
> > Makes use of SSE to speed up compute of min and max elements
> >
> > Callgrind cpu usage results from pts benchmarks:
> >
> > Openarena 0.8.8: 3.67% -> 1
On Thu, 2014-10-23 at 17:08 -0400, Ilia Mirkin wrote:
> On Thu, Oct 23, 2014 at 4:56 PM, Timothy Arceri wrote:
> > On Thu, 2014-10-23 at 09:20 -0600, Brian Paul wrote:
> >>
> >> Can something similar be done for 16-bit values?
> >>
> >
> > Yes there are _mm_max_epu16 and _mm_min_epu16 intrinsics t
On Thu, Oct 23, 2014 at 2:08 PM, Ilia Mirkin wrote:
> On Thu, Oct 23, 2014 at 4:56 PM, Timothy Arceri wrote:
>> On Thu, 2014-10-23 at 09:20 -0600, Brian Paul wrote:
>>>
>>> Can something similar be done for 16-bit values?
>>>
>>
>> Yes there are _mm_max_epu16 and _mm_min_epu16 intrinsics too.
>
>
On Thu, Oct 23, 2014 at 4:56 PM, Timothy Arceri wrote:
> On Thu, 2014-10-23 at 09:20 -0600, Brian Paul wrote:
>>
>> Can something similar be done for 16-bit values?
>>
>
> Yes there are _mm_max_epu16 and _mm_min_epu16 intrinsics too.
And those only need SSE2 iirc. There are also _mm256_* intrinsi
On Thu, Oct 23, 2014 at 1:56 PM, Timothy Arceri wrote:
> On Thu, 2014-10-23 at 09:20 -0600, Brian Paul wrote:
>>
>> Can something similar be done for 16-bit values?
>>
>
> Yes there are _mm_max_epu16 and _mm_min_epu16 intrinsics too.
And for 8-bit as well, which is part of SSE(1).
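
For illustration only, here is a minimal sketch of the 8-bit case mentioned above. It is not taken from the patch: the helper name minmax_u8 and its signature are hypothetical. It uses the 128-bit _mm_min_epu8/_mm_max_epu8 intrinsics, which are declared in emmintrin.h (SSE2), and falls back to a plain scalar loop when SSE2 is not available at compile time.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper (not from the patch): find the smallest and largest
 * value in an array of unsigned 8-bit indices.
 * For count == 0 the result is min = 0xff, max = 0. */
static void
minmax_u8(const uint8_t *idx, size_t count,
          uint8_t *min_out, uint8_t *max_out)
{
   uint8_t lo = 0xff, hi = 0;
   size_t i = 0;

#if defined(__SSE2__)
   if (count >= 16) {
      /* Start from the identity values for unsigned min and max. */
      __m128i vmin = _mm_set1_epi8((char)0xff);
      __m128i vmax = _mm_setzero_si128();
      uint8_t tmp_min[16], tmp_max[16];

      /* Process 16 indices per iteration with unaligned loads. */
      for (; i + 16 <= count; i += 16) {
         __m128i v = _mm_loadu_si128((const __m128i *)(idx + i));
         vmin = _mm_min_epu8(vmin, v);
         vmax = _mm_max_epu8(vmax, v);
      }

      /* Reduce the 16 per-lane results to single scalars. */
      _mm_storeu_si128((__m128i *)tmp_min, vmin);
      _mm_storeu_si128((__m128i *)tmp_max, vmax);
      for (int k = 0; k < 16; k++) {
         if (tmp_min[k] < lo) lo = tmp_min[k];
         if (tmp_max[k] > hi) hi = tmp_max[k];
      }
   }
#endif

   /* Scalar loop: handles the tail, or everything without SSE2. */
   for (; i < count; i++) {
      if (idx[i] < lo) lo = idx[i];
      if (idx[i] > hi) hi = idx[i];
   }

   *min_out = lo;
   *max_out = hi;
}
```

The same loop shape should carry over to 16-bit indices by swapping in _mm_min_epu16/_mm_max_epu16 and stepping 8 elements at a time, assuming the required instruction set is enabled for the build.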
On Thu, 2014-10-23 at 09:20 -0600, Brian Paul wrote:
>
> Can something similar be done for 16-bit values?
>
Yes there are _mm_max_epu16 and _mm_min_epu16 intrinsics too.
> -Brian
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists
On Thu, 2014-10-23 at 09:20 -0700, Matt Turner wrote:
> On Thu, Oct 23, 2014 at 2:13 AM, Timothy Arceri <t_arc...@yahoo.com.au> wrote:
> > On Wed, 2014-10-22 at 22:49 -0700, Matt Turner wrote:
> > > On Wed, Oct 22, 2014 at 10:30 PM, Matt Turner wrote:
> > > > On Wed, Oct 22, 2014 at 9:02
On Thu, Oct 23, 2014 at 2:13 AM, Timothy Arceri wrote:
> On Wed, 2014-10-22 at 22:49 -0700, Matt Turner wrote:
>> On Wed, Oct 22, 2014 at 10:30 PM, Matt Turner wrote:
>> > On Wed, Oct 22, 2014 at 9:02 PM, Timothy Arceri
>> > wrote:
>> >> I almost wasn't going to bother sending this out since it
On Wed, 2014-10-22 at 22:49 -0700, Matt Turner wrote:
> On Wed, Oct 22, 2014 at 10:30 PM, Matt Turner wrote:
> > On Wed, Oct 22, 2014 at 9:02 PM, Timothy Arceri
> > wrote:
> >> I almost wasn't going to bother sending this out since it uses SSE4.1
> >> and it's recommended to use glDrawRangeElemen
On Wed, Oct 22, 2014 at 10:30 PM, Matt Turner wrote:
> On Wed, Oct 22, 2014 at 9:02 PM, Timothy Arceri wrote:
>> I almost wasn't going to bother sending this out since it uses SSE4.1
>> and it's recommended to use glDrawRangeElements anyway. But since these games
>> are still often used for benchm
On Wed, Oct 22, 2014 at 9:02 PM, Timothy Arceri wrote:
> Makes use of SSE to speed up compute of min and max elements
>
> Callgrind cpu usage results from pts benchmarks:
>
> Openarena 0.8.8: 3.67% -> 1.03%
> UrbanTerror: 2.36% -> 0.81%
>
> Signed-off-by: Timothy Arceri
> ---
> src/mesa/Makefile
Makes use of SSE to speed up compute of min and max elements
Callgrind cpu usage results from pts benchmarks:
Openarena 0.8.8: 3.67% -> 1.03%
UrbanTerror: 2.36% -> 0.81%
Signed-off-by: Timothy Arceri
---
src/mesa/Makefile.am | 3 +-
src/mesa/main/sse_minmax.c| 75