Re: SSE in libthr

2015-04-14 Thread Eric van Gyzen
SE to offset the extra context-switch cost. SSE does not provide a clear benefit in the current libthr code with the current compiler, but it does provide a clear loss in some cases. Therefore, disabling SSE in libthr is a non-loss for most, and a gain for some. I refrained from disabling SSE in lib

Re: SSE in libthr

2015-04-06 Thread John Baldwin
On Saturday, March 28, 2015 10:41:48 AM Adrian Chadd wrote:
> Ok, so how do we reduce the amount of FPU save and restores, or make
> them cheaper?

Or make them more useful. If you are using SSE/AVX more often between context switches in ways that are beneficial then that might offset the cost of

Re: SSE in libthr

2015-03-28 Thread Adrian Chadd
Ok, so how do we reduce the amount of FPU save and restores, or make them cheaper? -a ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@f

Re: SSE in libthr

2015-03-28 Thread John-Mark Gurney
Eric van Gyzen wrote this message on Fri, Mar 27, 2015 at 17:43 -0400:
> On 03/27/2015 16:49, Rui Paulo wrote:
> >
> > Regarding your patch, I think we should disable even more, if possible.
> > How about:
> >
> > CFLAGS+=-mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3
>
> Yes, I was co

Re: SSE in libthr

2015-03-28 Thread David Chisnall
On 28 Mar 2015, at 13:54, Julian Elischer wrote:
>
> the point is that clang will do this anywhere it can, because it isn't taking
> into account the side effects, just the speed of the commands themselves.

This is also something that is not going to decrease. Clang now enables the SLP vect

Re: SSE in libthr

2015-03-28 Thread Julian Elischer
-trivial amount. I'd like to disable SSE in libthr. In more detail: In libthr/thread/thr_mutex.c, we find the following: #define MUTEX_INIT_LINK(m) do {\ (m)->m_qe.tqe_prev = NULL; \ (m)->m_qe.tqe_

Re: SSE in libthr

2015-03-28 Thread Konstantin Belousov
a non-trivial > > amount. I'd like to disable SSE in libthr. > > How about saving and restoring the FPU/SSE state eagerly instead of the > current CR0.TS-based lazy method? There is overhead associated with #NM > exception handling (fpudna) which is not worth it if FPU/SSE

Re: SSE in libthr

2015-03-27 Thread Tomoaki AOKI
If SIMD instructions are used for string processing, and FPU (AVX) contexts are NOT saved/restored properly on process (thread) switching, possibly processed string is destroyed by other process (thread). Can't it be a security risk? (Broken string parameter for syscalls, etc) If so, FPU (AVX) cont

Re: SSE in libthr

2015-03-27 Thread Tomoaki AOKI
read_mutex_unlock. This reduces performance by a non-trivial amount. I'd > like to disable SSE in libthr. > > In more detail: > > In libthr/thread/thr_mutex.c, we find the following: > > #define MUTEX_INIT_LINK(m) do {

Re: SSE in libthr

2015-03-27 Thread Adrian Chadd
On 27 March 2015 at 16:03, Alan Somers wrote:
> On Fri, Mar 27, 2015 at 4:36 PM, Adrian Chadd wrote:
>> hi,
>>
>> please don't try to microoptimise crap like strlen().
>>
>> The TL;DR for performant high-throughput code is: if strlen() or
>> memcpy() is the thing that's costing you the most, you'

Re: SSE in libthr

2015-03-27 Thread Alan Somers
On Fri, Mar 27, 2015 at 4:36 PM, Adrian Chadd wrote:
> hi,
>
> please don't try to microoptimise crap like strlen().
>
> The TL;DR for performant high-throughput code is: if strlen() or
> memcpy() is the thing that's costing you the most, you're doing it
> wrong.
>
> -adrian

I respectfully di

Re: SSE in libthr

2015-03-27 Thread Adrian Chadd
hi, please don't try to microoptimise crap like strlen(). The TL;DR for performant high-throughput code is: if strlen() or memcpy() is the thing that's costing you the most, you're doing it wrong. -adrian

Re: SSE in libthr

2015-03-27 Thread Eric van Gyzen
On 03/27/2015 16:49, Rui Paulo wrote:
>
> Regarding your patch, I think we should disable even more, if possible. How
> about:
>
> CFLAGS+=-mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3

Yes, I was considering copying all of the similar flags that we use in the kernel. That seems wise.

Re: SSE in libthr

2015-03-27 Thread Konstantin Belousov
y a non-trivial amount. > > I'd > > like to disable SSE in libthr. > > > > In more detail: > > > > In libthr/thread/thr_mutex.c, we find the following: > > > > #define MUTEX_INIT_LINK(m) do {\ > >

Re: SSE in libthr

2015-03-27 Thread Jilles Tjoelker
On Fri, Mar 27, 2015 at 03:26:17PM -0400, Eric van Gyzen wrote:
> In a nutshell:
> Clang emits SSE instructions on amd64 in the common path of
> pthread_mutex_unlock. This reduces performance by a non-trivial
> amount. I'd like to disable SSE in libthr.

How about saving and

Re: SSE in libthr

2015-03-27 Thread Rui Paulo
On Mar 27, 2015, at 12:26, Eric van Gyzen wrote:
>
> In a nutshell:
>
> Clang emits SSE instructions on amd64 in the common path of
> pthread_mutex_unlock. This reduces performance by a non-trivial amount. I'd
> like to disable SSE in libthr.
>
> In more

Re: SSE in libthr

2015-03-27 Thread Daniel Eischen
On Fri, 27 Mar 2015, Eric van Gyzen wrote:

In a nutshell: Clang emits SSE instructions on amd64 in the common path of pthread_mutex_unlock. This reduces performance by a non-trivial amount. I'd like to disable SSE in libthr.

This makes sense to me.

Re: SSE in libthr

2015-03-27 Thread Adrian Chadd
Wow. I remember seeing this in the work application - all packet pushing in userland, but there are locks being acquired. I was wondering what exactly was triggering the FPU save/restore code. Now I know. Yes, if there are no other objections, I'd love to see this in -HEAD and stable/10. -adrian

SSE in libthr

2015-03-27 Thread Eric van Gyzen
In a nutshell: Clang emits SSE instructions on amd64 in the common path of pthread_mutex_unlock. This reduces performance by a non-trivial amount. I'd like to disable SSE in libthr. In more detail: In libthr/thread/thr_mutex.c, we find the following: #define MUTEX_INIT_L
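
The pattern discussed in this thread can be sketched roughly as follows. This is not the libthr source: `struct link`, `struct node`, and `node_init_link` are hypothetical names made up for illustration, and the sketch only assumes that the code in question clears two adjacent pointer fields. On amd64 an optimizing compiler may merge two such stores into a single 16-byte SSE store; whether it actually does depends on the compiler version and flags.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a queue link: two adjacent pointers. */
struct link {
    struct link  *next;
    struct link **prev;
};

/* Hypothetical object that embeds the link, loosely like a mutex. */
struct node {
    int         owner;
    struct link qe;
};

/*
 * Clear both link pointers. The two 8-byte stores are adjacent in
 * memory, so a vectorizing compiler is free to emit one 16-byte
 * SSE store instead (an assumption to verify, not a guarantee).
 */
void
node_init_link(struct node *n)
{
    assert(n != NULL);
    n->qe.next = NULL;
    n->qe.prev = NULL;
}
```

One way to see what a given compiler does is to compare the assembly from `cc -O2 -S` with and without `-mno-sse` and look for `xmm` register use in `node_init_link`.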