Re: GCC does not optimize well enough with vectors on bitshift

2025-03-11 Thread Matt Godbolt
While this doesn't affect your example in this particular case: please don't use `-march=native` on Compiler Explorer for these examples - this will pick whatever architecture your individual query is served from, which may be any of the available AMD or Intel CPUs we run on. There ought to be a pop

Re: On(c)e more: optimizer failure

2021-08-21 Thread Matt Godbolt
Ok! Thanks; sorry for the misunderstanding on my side. --matt On Sat, Aug 21, 2021 at 2:53 PM Stefan Kanthak wrote: > Matt Godbolt wrote: > > > I believe your example doesn't take into account that the values can be > NaN > > which compares false in all situatio

Re: On(c)e more: optimizer failure

2021-08-21 Thread Matt Godbolt
I believe your example doesn't take into account that the values can be NaN which compares false in all situations. If you allow the compiler to optimize without supporting NaN (-ffast-math), I think it generates the code you want: https://godbolt.org/z/1ra7zcsnd --matt On Sat, Aug 21, 2021 at 1:
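
The NaN point above can be illustrated with a minimal sketch. This is not the code from the thread (that is only available behind the Compiler Explorer link); `is_ordered` and `pick_larger` are hypothetical helpers, and the behaviour described assumes default IEEE 754 semantics, i.e. compiling without `-ffast-math`.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: true when at least one of <, ==, > holds.
// Under IEEE 754 rules all three are false if either operand is NaN,
// so the compiler cannot fold this expression to "true".
bool is_ordered(double a, double b) {
    return (a < b) || (a == b) || (a > b);
}

// Hypothetical helper: looks like "max", but the result depends on
// operand order when a NaN is involved, because a false comparison
// always selects b.
double pick_larger(double a, double b) {
    return a > b ? a : b;
}

// Small self-check documenting the behaviour described above.
void nan_self_check() {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    assert(!is_ordered(nan, 1.0));        // no ordering relation holds
    assert(pick_larger(nan, 1.0) == 1.0); // comparison false, b selected
}
```

With `-ffast-math` the compiler is allowed to assume NaN never occurs, so it may simplify such comparisons; the self-check above is only meaningful without that flag.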

Re: volatile access optimization (C++ / x86_64)

2015-01-05 Thread Matt Godbolt
On Mon, Jan 5, 2015 at 11:53 AM, DJ Delorie wrote: > > Matt Godbolt writes: >> GCC's code generation uses a "load; add; store" for volatiles, instead >> of a single "add 1, [metric]". > > GCC doesn't know if a target's load/add/store

Re: volatile access optimization (C++ / x86_64)

2014-12-30 Thread Matt Godbolt
On Tue, Dec 30, 2014 at 5:05 AM, Torvald Riegel wrote: > I agree with Andrew. My understanding of volatile is that the generated > code must do exactly what the abstract machine would do. That makes sense. I suppose I don't understand what the difference is in terms of an abstract machine of "lo

Re: volatile access optimization (C++ / x86_64)

2014-12-27 Thread Matt Godbolt
> On Sat, Dec 27, 2014 at 11:57 AM, Andrew Haley wrote: > Is it faster? Have you measured it? Is it so much faster that it's critical > for your > application? Well, I couldn't really leave this be: I did a little bit of benchmarking using my company's proprietary benchmarking library, which I

Re: volatile access optimization (C++ / x86_64)

2014-12-27 Thread Matt Godbolt
On Sat, Dec 27, 2014 at 11:57 AM, Andrew Haley wrote: > On 27/12/14 00:02, Matt Godbolt wrote: >> On Fri, Dec 26, 2014 at 5:19 PM, Andrew Haley wrote: >>> On 26/12/14 22:49, Matt Godbolt wrote: >>>> On Fri, Dec 26, 2014 at 4:26 PM, Andrew Haley wrote: >>>

Re: volatile access optimization (C++ / x86_64)

2014-12-26 Thread Matt Godbolt
On Fri, Dec 26, 2014 at 5:20 PM, NightStrike wrote: > Have you tried release and acquire/consume instead? Yes; these emit the same instructions in this case. http://goo.gl/e94Ya7 Regards, Matt

Re: volatile access optimization (C++ / x86_64)

2014-12-26 Thread Matt Godbolt
On Fri, Dec 26, 2014 at 5:19 PM, Andrew Haley wrote: > On 26/12/14 22:49, Matt Godbolt wrote: >> On Fri, Dec 26, 2014 at 4:26 PM, Andrew Haley wrote: >>> On 26/12/14 20:32, Matt Godbolt wrote: >> I realise my understanding could be wrong here! >> If not though, b

Re: volatile access optimization (C++ / x86_64)

2014-12-26 Thread Matt Godbolt
On Fri, Dec 26, 2014 at 4:51 PM, Marc Glisse wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50677 Thanks Marc

Re: volatile access optimization (C++ / x86_64)

2014-12-26 Thread Matt Godbolt
On Fri, Dec 26, 2014 at 4:26 PM, Andrew Haley wrote: > On 26/12/14 20:32, Matt Godbolt wrote: >> Is there a reason why (in principle) the volatile increment can't be >> made into a single add? Clang and ICC both emit the same code for the >> volatile and non-volatile cas

volatile access optimization (C++ / x86_64)

2014-12-26 Thread Matt Godbolt
Hi all, I'm investigating ways to have single-threaded writers write to memory areas which are then (very infrequently) read from another thread for monitoring purposes. Things like "number of units of work done". I initially modeled this with relaxed atomic operations. This generates a "lock xad
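
As a rough sketch of the setup described above, the counter variants discussed in this thread might look like the following. The names (`atomic_metric`, `volatile_metric`, the `bump_*` functions) are hypothetical and the original test case is not reproduced here; which instructions a given compiler emits for each variant is best checked on Compiler Explorer rather than assumed.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical counters standing in for "number of units of work done".
std::atomic<std::uint64_t> atomic_metric{0};
volatile std::uint64_t volatile_metric = 0;

// Relaxed atomic read-modify-write: safe with multiple writers, but on
// x86_64 it is implemented with a lock-prefixed instruction.
void bump_atomic() {
    atomic_metric.fetch_add(1, std::memory_order_relaxed);
}

// Volatile increment: one volatile read followed by one volatile write.
// It is not an atomic read-modify-write and gives no cross-thread
// guarantees in the C++ memory model.
void bump_volatile() {
    volatile_metric = volatile_metric + 1;
}

// Single-writer alternative: separate relaxed load and store. Only
// correct when exactly one thread ever writes the counter.
void bump_single_writer() {
    const std::uint64_t v = atomic_metric.load(std::memory_order_relaxed);
    atomic_metric.store(v + 1, std::memory_order_relaxed);
}
```

A monitoring thread would read the atomic counter with `atomic_metric.load(std::memory_order_relaxed)`. The single-writer variant trades the multi-writer guarantee for avoiding an atomic read-modify-write.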

Re: Missed optimization case

2014-12-23 Thread Matt Godbolt
On Tue, Dec 23, 2014 at 2:25 PM, Andi Kleen wrote: > > Please file a bug with a test case. No need to worry about the phase > too much initially, just fill in a reasonable component. > Thanks - filed as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64396 -matt

Missed optimization case

2014-12-22 Thread Matt Godbolt
Hi all, While digging into some GCC-generated code, I noticed a missed opportunity in GCC that Clang and ICC seem to take advantage of. All versions of GCC (up to 4.9.0) seem to have the same trouble. The following source (for x86_64) shows up the problem: - #include #define add_carry32(sum

Re: Build problem with 4.8.0 RC-20130316 and in-tree binutils

2013-03-20 Thread Matt Godbolt
On Wed, Mar 20, 2013 at 8:42 AM, Ian Lance Taylor wrote: > On Wed, Mar 20, 2013 at 6:36 AM, Matt Godbolt wrote: >> >> Thanks for the quick reply. I definitely have --enable-shared set in >> the configuration, and had so with 4.7. However I'm not certain that >>

Re: Build problem with 4.8.0 RC-20130316 and in-tree binutils

2013-03-20 Thread Matt Godbolt
On Wed, Mar 20, 2013 at 8:18 AM, Ian Lance Taylor wrote: > On Wed, Mar 20, 2013 at 5:35 AM, Matt Godbolt wrote: >> >> I'm having trouble building the RC 4.8.0 with an in-tree binutils on >> an Ubuntu 12.04 x86_64. It seems that while building GCC, the runtime >>

Build problem with 4.8.0 RC-20130316 and in-tree binutils

2013-03-20 Thread Matt Godbolt
Hi all, I'm having trouble building the RC 4.8.0 with an in-tree binutils on an Ubuntu 12.04 x86_64. It seems that while building GCC, the runtime library path does not include the objdir/prev-*/.libs directories; so whenever any of the built binutils programs are run they fail as their shared li