Re: Project Ranger
> The Ranger is far enough along now that we have confidence in both its
> approach and ability to perform, and would like to solicit feedback on
> what you think of it, any questions, possible uses, as well as
> potential requirements to integrate with trunk later this stage.

The PDF document mentions that you first intended to support symbolic ranges but eventually dropped them as "too complex, and ultimately not necessary". I don't entirely disagree with the former part, but I'm curious about the latter part: how do you intend to deal in the long term with cases that do require symbolic information to optimize things?

The TODO page seems to acknowledge the loophole but only mentions a plan to deal with equivalences, which is not sufficient in the general case (as acknowledged too on the page).

--
Eric Botcazou
Re: Enabling -ftree-slp-vectorize on -O2/Os
On Tue, May 29, 2018 at 5:24 PM Allan Sandfeld Jensen wrote:
> On Tuesday, 29 May 2018 16:57:56 CEST Richard Biener wrote:
> > so the situation improves but isn't fully fixed (STLF issues maybe?)
>
> That raises the question if it helps in these cases even in -O3?

That's a good question indeed. We might end up (partly) BB vectorizing loop bodies that we'd otherwise loop vectorize with SLP. Benchmarking with BB vectorization disabled at -O3+ isn't something I've done in the past. I'm now doing a 2-run with -march=haswell -Ofast [-fno-tree-slp-vectorize] for the FP benchmarks.

Note that there were some cases where disabling vectorization wholly improved things.

> Anyway it doesn't look good for it. Did the binary size at least improve with
> prefer-avx128, or was that also worse or insignificant?

Similar to the AVX256 results. I guess where AVX256 applied we now simply do two vector ops with AVX128.

Richard.

> 'Allan
Re: Project Ranger
On 05/30/2018 03:41 AM, Eric Botcazou wrote:
> > The Ranger is far enough along now that we have confidence in both its
> > approach and ability to perform, and would like to solicit feedback on
> > what you think of it, any questions, possible uses, as well as
> > potential requirements to integrate with trunk later this stage.
>
> The PDF document mentions that you first intended to support symbolic
> ranges but eventually dropped them as "too complex, and ultimately not
> necessary". I don't entirely disagree with the former part, but I'm
> curious about the latter part: how do you intend to deal in the long term
> with cases that do require symbolic information to optimize things?
>
> The TODO page seems to acknowledge the loophole but only mentions a plan
> to deal with equivalences, which is not sufficient in the general case
> (as acknowledged too on the page).

First, we'll collect the cases that demonstrate a unique situation we care about. I have 4 very specific cases that show current shortcomings, not just with the Ranger, but a couple we don't handle with VRP today. I'll eventually get those put onto the wiki so the list can be updated.

I think most of these cases that care about symbolics are not so much range related, but rather an algorithmic layer on top. Any follow-on optimization to either enhance or replace VRP or anything similar will simply use the Ranger as a client. If it turns out there are cases where we *have* to remember the end point of a range as a symbolic, then the algorithm can track that symbolic along with the range, and request a re-evaluation of the range when the value of that symbolic changes.

That's the high-level view. I'm not convinced the symbolic has to be in the range in order to solve problems, for 2 reasons:

1) The Ranger maintains some definition chains internally and has a clear idea of what ssa_names can affect the outcome of a range. It tracks all these dependencies on a per-bb basis in the gori-map structure as imports and exports.

The iterative approach I mentioned in the document would use this info to decide that ranges in a block need to be re-evaluated because an input to this block has changed. This is similar to the way VRP and other passes iterate until nothing changes. If we get a better range for an ssa_name than we had before, push it on the stack to look for potential re-evaluation, and keep going. This is what I referred to as the Level 4 ranger in the document. I kind of glossed over it because I didn't want to get into the full-blown design I originally had, nor to rework it based on the current incarnation of the Ranger. I wanted to primarily focus on what we currently have that is working so we can move forward with it.

I don't think we need to track the symbolic name in the range because the Ranger tracks these dependencies for each ssa-name and can indicate when we may need to reevaluate them. There is an exported routine from the block ranger:

    tree single_import (tree name);

If there is a sole import, it will return the ssa-name that NAME is dependent on that can affect the range of NAME. We added that API so Aldy's new threading code could utilize this ability to a small degree.

Bottom line: The Ranger has information that a pass can use to decide that a range could benefit from being reevaluated. This identifies any symbolic component of a range from that block.

2) This entire approach is modeled on walking the IL to evaluate a range. If we put symbolics and expressions in the range, we are really duplicating information that is already in the IL, and have to make a choice of exactly what and how we do it.

    BB3:
        j_1 = q_6 / 2
        i_2 = j_1 + 3
        if (i_2 < k_4)

We could store the range of k_4 as [i_2 + 1, MAX] (which seems the obvious one); we could also store it as [j_1 + 4, MAX] or even [q_6 / 2 + 4, MAX]. But we have to decide in advance, and we have extra work to do if it turns out to be one of the other names we ended up wanting.

At some point later on we decide we either don't know anything about i_2 (or j_1, or q_6), or we have found a range for it, and now need to take that value and evaluate the expression stashed in the range in order to get the final result. Note that whatever algorithm is doing this must also keep track of this range somehow in order to use it later.

With the Ranger model, the same algorithm gets a range, and if it thinks it might need to be re-evaluated for whatever reason, it can just track an extra bit of info (like i_2 for instance) alongside the range (rather than in it). If it thinks the range needs to be re-evaluated, it can simply request a new range from the Ranger.

You also don't have to decide whether to track the range with i_2 or j_1 (or even q_6). The Ranger can tell you that the range it gives you for k_4 is accurate unless you get a new value for q_6. That is really what you want to track. You might later wan
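A minimal sketch of the "track the dependency alongside the range" idea, hard-wired to the BB3 example above. This is not Ranger code: `Range`, `ToyRanger`, `single_import_of_k4`, `range_of_k4_on_true_edge`, `TrackedRange`, `query_k4` and `refresh_if_stale` are hypothetical names made up for illustration, all names are assumed to be of type int, and overflow in `i_2 = j_1 + 3` is ignored.

```cpp
#include <cassert>
#include <climits>
#include <map>
#include <string>

// Closed interval [lo, hi]; long long so int-range arithmetic cannot overflow.
// Hypothetical stand-in for a real range class.
struct Range {
  long long lo;
  long long hi;
};

// Toy "ranger" that only understands this one block:
//   j_1 = q_6 / 2;  i_2 = j_1 + 3;  if (i_2 < k_4)
class ToyRanger {
public:
  // Record the best known range for an ssa-name (only q_6 matters here).
  void set_range(const std::string &name, Range r) { known_[name] = r; }

  // The one name outside the block that can change the range of k_4.
  std::string single_import_of_k4() const { return "q_6"; }

  // On the true edge k_4 > i_2, i.e. k_4 >= q_6 / 2 + 4.
  // The lower bound is recomputed from the IL each time, never stored
  // symbolically inside the range.
  Range range_of_k4_on_true_edge() const {
    Range q = {INT_MIN, INT_MAX};  // nothing known about q_6 yet
    auto it = known_.find("q_6");
    if (it != known_.end())
      q = it->second;
    assert(q.lo <= q.hi);
    // Truncating division is monotone, so the smallest q_6 gives the
    // smallest i_2.
    return {q.lo / 2 + 4, INT_MAX};
  }

private:
  std::map<std::string, Range> known_;
};

// What the client keeps: a plain numeric range plus, next to it, the
// name whose change makes the range stale.
struct TrackedRange {
  Range range;
  std::string import;
};

TrackedRange query_k4(const ToyRanger &r) {
  return {r.range_of_k4_on_true_edge(), r.single_import_of_k4()};
}

// Called when `changed` got a better range; re-queries only if the
// tracked range actually depends on it. Returns true if it re-queried.
bool refresh_if_stale(const ToyRanger &r, TrackedRange &t,
                      const std::string &changed) {
  if (changed != t.import)
    return false;
  t.range = r.range_of_k4_on_true_edge();
  return true;
}
```

In this toy, a client would call `query_k4` once, and later, after learning say q_6 = [10, 20] and recording it with `set_range`, call `refresh_if_stale(r, t, "q_6")` to get [9, MAX] for k_4. The client never had to choose between storing i_2 + 1, j_1 + 4 or q_6 / 2 + 4.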
Re: Project Ranger
On Tue, 2018-05-29 at 19:53 -0400, Andrew MacLeod wrote:

[...snip...]

> The code is located on an svn branch *ssa-range*. It is based on trunk
> at revision *259405*, circa mid April 2018.

Is this svn branch mirrored on gcc's git mirror? I tried to clone it from there, but failed.

[...snip...]

Thanks
Dave
Re: Project Ranger
On 05/30/2018 10:39 AM, David Malcolm wrote:
> On Tue, 2018-05-29 at 19:53 -0400, Andrew MacLeod wrote:
>
> [...snip...]
>
> > The code is located on an svn branch *ssa-range*. It is based on trunk
> > at revision *259405*, circa mid April 2018.
>
> Is this svn branch mirrored on gcc's git mirror? I tried to clone it
> from there, but failed.
>
> [...snip...]
>
> Thanks
> Dave

I don't know :-) I know that svn diff -r 259405 worked and appeared to give me a diff of everything. Aldy uses git, maybe he can tell you.

That was a merge from trunk, to start with, into an existing branch I had cut a month earlier. The ACTUAL original branch cut was revision *258524*, if that makes any difference.

Andrew
Re: Project Ranger
On May 30 2018, David Malcolm wrote:
> On Tue, 2018-05-29 at 19:53 -0400, Andrew MacLeod wrote:
>
> [...snip...]
>
>> The code is located on an svn branch *ssa-range*. It is based on trunk
>> at revision *259405*, circa mid April 2018.
>
> Is this svn branch mirrored on gcc's git mirror? I tried to clone it
> from there, but failed.

It's in refs/remotes/ssa-range, which isn't fetched by default.

Andreas.

--
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
Re: Project Ranger
On 05/30/2018 10:51 AM, Andreas Schwab wrote:
> On May 30 2018, David Malcolm wrote:
> > On Tue, 2018-05-29 at 19:53 -0400, Andrew MacLeod wrote:
> >
> > [...snip...]
> >
> > > The code is located on an svn branch *ssa-range*. It is based on trunk
> > > at revision *259405*, circa mid April 2018.
> >
> > Is this svn branch mirrored on gcc's git mirror? I tried to clone it
> > from there, but failed.
>
> It's in refs/remotes/ssa-range, which isn't fetched by default.

Right. From my tree:

    $ git branch -a | grep svn.ssa.range
      remotes/svn/ssa-range
Re: Enabling -ftree-slp-vectorize on -O2/Os
On Wed, May 30, 2018 at 11:25 AM Richard Biener wrote:
> On Tue, May 29, 2018 at 5:24 PM Allan Sandfeld Jensen wrote:
> > On Tuesday, 29 May 2018 16:57:56 CEST Richard Biener wrote:
> > > so the situation improves but isn't fully fixed (STLF issues maybe?)
> >
> > That raises the question if it helps in these cases even in -O3?
>
> That's a good question indeed. We might end up (partly) BB vectorizing
> loop bodies that we'd otherwise loop vectorize with SLP. Benchmarking
> with BB vectorization disabled at -O3+ isn't something I've done in the
> past. I'm now doing a 2-run with -march=haswell -Ofast
> [-fno-tree-slp-vectorize] for the FP benchmarks.

                 Base      Base   Base         Peak      Peak   Peak
Benchmarks       Ref.  Run Time  Ratio         Ref.  Run Time  Ratio
-------------- ------  --------  -----       ------  --------  -----
410.bwaves      13590       178   76.4 *     13590       180   75.5 S
410.bwaves      13590       180   75.6 S     13590       179   76.0 *
416.gamess      19580       604   32.4 S     19580       576   34.0 S
416.gamess      19580       604   32.4 *     19580       575   34.0 *
433.milc         9180       339   27.1 *      9180       345   26.6 S
433.milc         9180       343   26.7 S      9180       343   26.8 *
434.zeusmp       9100       234   38.9 *      9100       234   38.9 S
434.zeusmp       9100       234   38.8 S      9100       234   38.9 *
435.gromacs      7140       251   28.5 *      7140       251   28.4 *
435.gromacs      7140       252   28.3 S      7140       252   28.3 S
436.cactusADM   11950       278   43.0 S     11950       222   53.8 S
436.cactusADM   11950       223   53.7 *     11950       221   54.1 *
437.leslie3d     9400       214   43.9 *      9400       215   43.6 *
437.leslie3d     9400       217   43.3 S      9400       222   42.4 S
444.namd         8020       302   26.5 S      8020       303   26.5 S
444.namd         8020       302   26.6 *      8020       303   26.5 *
447.dealII      11440       259   44.2 *     11440       246   46.6 *
447.dealII      11440       259   44.1 S     11440       246   46.6 S
450.soplex       8340       219   38.0 *      8340       219   38.0 *
450.soplex       8340       221   37.7 S      8340       221   37.7 S
453.povray       5320       108   49.2 *      5320       109   48.7 S
453.povray       5320       108   49.1 S      5320       109   48.8 *
454.calculix     8250       270   30.6 *      8250       269   30.6 *
454.calculix     8250       271   30.5 S      8250       270   30.5 S
459.GemsFDTD    10610       308   34.5 S     10610       306   34.7 S
459.GemsFDTD    10610       306   34.7 *     10610       306   34.7 *
465.tonto        9840       428   23.0 S      9840       423   23.3 *
465.tonto        9840       426   23.1 *      9840       423   23.2 S
470.lbm         13740       253   54.4 S     13740       252   54.5 *
470.lbm         13740       252   54.5 *     13740       252   54.5 S
481.wrf         11170       265   42.1 *     11170       265   42.2 S
481.wrf         11170       266   42.1 S     11170       264   42.3 *
482.sphinx3     19490       401   48.6 *     19490       402   48.5 S
482.sphinx3     19490       405   48.1 S     19490       399   48.9 *

so we can indeed see similar detrimental effects on 416.gamess; 447.dealII seems to improve with BB vectorization. That means the 416.gamess slowdown is definitely worth investigating since it reproduces with both AVX128 and AVX256 and with loop vectorization. I'll open a bug for it.

> Note that there were some cases where disabling vectorization wholly
> improved things.
>
> > Anyway it doesn't look good for it. Did the binary size at least improve with
> > prefer-avx128, or was that also worse or insignificant?
>
> Similar to the AVX256 results. I guess where AVX256 applied we now simply
> do two vector ops with AVX128.
>
> Richard.
>
> > 'Allan
bootstrap failure due to declaration mismatch in r260956
Honza,

I think your r260956 is missing the following hunk:

Index: include/simple-object.h
===================================================================
--- include/simple-object.h	(revision 260969)
+++ include/simple-object.h	(working copy)
@@ -203,7 +203,7 @@ simple_object_release_write (simple_object_write *
 extern const char *
 simple_object_copy_lto_debug_sections (simple_object_read *src_object,
 				       const char *dest,
-				       int *err);
+				       int *err, int rename);
 
 #ifdef __cplusplus
 }

Martin
Re: bootstrap failure due to declaration mismatch in r260956
On Wed, 30 May 2018, Martin Sebor wrote:
> I think your r260956 is missing the following hunk:

If this fixes the bootstrap for you (also ran into this myself just now), can you please go ahead and commit?

We can always sort out things later, if there are details to be tweaked, but fixing a bootstrap failure with a simple one-liner like this, let's not get process-heavy and just do it. ;-)

Gerald
Re: bootstrap failure due to declaration mismatch in r260956
On 05/30/2018 12:27 PM, Gerald Pfeifer wrote:
> On Wed, 30 May 2018, Martin Sebor wrote:
> > I think your r260956 is missing the following hunk:
>
> If this fixes the bootstrap for you (also ran into this myself
> just now), can you please go ahead and commit?
>
> We can always sort out things later, if there are details to be
> tweaked, but fixing a bootstrap failure with a simple one-liner
> like this, let's not get process-heavy and just do it. ;-)

Jakub already committed the missing change in r260970 so bootstrap should be working again.

Thanks
Martin
Re: bootstrap failure due to declaration mismatch in r260956
> On 05/30/2018 12:27 PM, Gerald Pfeifer wrote:
> > On Wed, 30 May 2018, Martin Sebor wrote:
> > > I think your r260956 is missing the following hunk:
> >
> > If this fixes the bootstrap for you (also ran into this myself
> > just now), can you please go ahead and commit?
> >
> > We can always sort out things later, if there are details to be
> > tweaked, but fixing a bootstrap failure with a simple one-liner
> > like this, let's not get process-heavy and just do it. ;-)
>
> Jakub already committed the missing change in r260970 so bootstrap
> should be working again.

I apologize for that. I left svn commit waiting for the commit log entry and did not notice that :(
Thanks for fixing it!

Honza

> Thanks
> Martin
gcc-6-20180530 is now available
Snapshot gcc-6-20180530 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/6-20180530/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 6 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-6-branch revision 260976

You'll find:

 gcc-6-20180530.tar.xz                Complete GCC

  SHA256=34b2ec9f1c047cde51d35c7aea31952f0f40006dd64df78943edfffc6294d9ff
  SHA1=9ab3ed7e4e237611dcfabf42d4e46bbb9fb5a7a4

Diffs from 6-20180523 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-6
link is updated and a message is sent to the gcc list. Please do not use
a snapshot before it has been announced that way.