Inferring that the condition of a for loop is initially true?
This is more of a question than a bug report, so I'm trying to send it to the list rather than filing a bugzilla issue.

I think it's quite common to write for- and while-loops where the condition is always initially true. A simple example might be

  double
  average (const double *a, size_t n)
  {
    double sum;
    size_t i;
    assert (n > 0);
    for (i = 0, sum = 0; i < n; i++)
      sum += a[i];
    return sum / n;
  }

The programmer could do the micro-optimization of rewriting it as a do-while-loop instead. It would be nice if gcc could infer that the condition is initially true, and convert to a do-while loop automatically. Converting to a do-while-loop should produce slightly better code, omitting the typical jump to enter the loop at the end where the condition is checked. It would also make analysis of where variables are written more accurate, which is my main concern at the moment.

My questions are:

1. Does gcc attempt to do this optimization?

2. If it does, how often does it succeed on loops in real programs?

3. Can I help the compiler to do that inference?

The code I had some trouble with is at https://git.lysator.liu.se/nettle/nettle/blob/master/ecc-mod.c. A simplified version with only the interesting code path would be

  void
  ecc_mod (mp_size_t mn, mp_size_t bn, mp_limb_t *rp)
  {
    mp_limb_t hi;
    mp_size_t sn = mn - bn;
    mp_size_t rn = 2*mn;

    assert (bn < mn);

    while (rn >= 2 * mn - bn)
      {
        rn -= sn;
        ... code which sets hi ...
      }
    ... code which uses hi ...
  }

The intention is that the loop will run at least once, but nevertheless, in this context, rewriting it as a do-while would make the code uglier, imo. It looks obvious at first glance that the initial value rn = 2*mn makes the condition rn >= 2*mn - bn true. All values are unsigned, and the assert backs up that mn - bn won't underflow. This is also used only with pretty small mn, so that 2*mn won't overflow, but unfortunately, that's not obvious to the compiler.
In theory, the function could be called with, e.g., on a 64-bit machine, mn == (1 << 63) and bn == 1, and then the initial loop condition would be 0 >= ~0, which is false. So overflow cases seem to rule out the optimization.

I've seen gcc warn about hi being used uninitialized in this function (but only on sparc, and on a machine I no longer use, so I'm afraid I can't provide details). I've also seen warnings from the clang static analyzer (which I guess makes different tradeoffs than gcc -Wall when it comes to false positives).

I would guess it's quite common that conditions which are always true for relevant inputs may be false due to overflow with extreme inputs, which in the case of unsigned variables doesn't leave any "undefined behaviour"-freedom to the compiler. What's the best way to tell the compiler, to promise that there will be no overflows involving mn? I could try adding an assert, like

  rn = 2*mn;
  assert (rn > mn);

to rule out overflow, but that looks a bit unobvious to humans and possibly unobvious to the compiler too.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
Re: Inferring that the condition of a for loop is initially true?
Marc Glisse writes:

> assert is not what you want, since it completely disappears with
> -DNDEBUG. clang has __builtin_assume, with gcc you want a test and
> __builtin_unreachable. Replacing your assert with
>
>   if (n == 0) __builtin_unreachable ();

To me, it would make sense to have assert expand to something like that, when runtime checks are disabled with -DNDEBUG. After all, if a program fails an assertion, it can not be expected to produce any meaningful results, even with run-time checks disabled using -DNDEBUG. And it's somewhat counter-intuitive if -DNDEBUG produces a binary in which the code downstream from assert statements is slightly less well optimized.

I imagine the C standard might specify that assert "completely disappears" if -DNDEBUG is defined. But maybe there's room for a -fstrict-assert optimization flag, to let assert always guide optimization?

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
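A sketch of what such a -fstrict-assert expansion might look like, expressed as a macro. STRICT_ASSERT is an illustrative name, not an existing gcc feature, and note that, unlike the standard assert, the argument must remain a valid expression under -DNDEBUG:

```c
#include <assert.h>

/* Illustrative only: with NDEBUG, keep the condition as an
   optimization hint instead of discarding it. __builtin_unreachable
   marks the failure path as impossible, so gcc can assume !(x) never
   holds; the branch itself is typically folded away when x has no
   side effects. */
#ifdef NDEBUG
# define STRICT_ASSERT(x) \
    do { if (!(x)) __builtin_unreachable (); } while (0)
#else
# define STRICT_ASSERT(x) assert (x)
#endif
```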
Re: Inferring that the condition of a for loop is initially true?
Jeff Law writes:

>> 1. Does gcc attempt to do this optimization?
>
> Yes. It happens as a side effect of jump threading and there are also
> dedicated passes to rotate the loop.
>
>> 2. If it does, how often does it succeed on loops in real programs?
>
> Often. The net benefit is actually small though and sometimes this kind
> of loop rotation can impede vectorization.

Thanks for the info.

>> 3. Can I help the compiler to do that inference?
>
> In general, I'd advise against it. You end up with ugly code which
> works with specific versions of the compiler, but which needs regular
> tweaking as the internal implementations of various optimizers change
> over time.

What would make sense to me is some annotation of the valid range of variables, if it can be done in a way which is friendly to both compiler and humans.

> I'd have to have a self-contained example to dig into what's really
> going on, but my suspicion is either overflow or fairly weak range data
> and simplification due to the symbolic ranges.

Self-contained example below (the #define of GMP_NUMB_BITS must be manually changed if tested on some architecture where long isn't 64 bits).

Actually, I see now that the mn variable is read from a struct field of type unsigned short. So as long as size_t (or unsigned long) is larger than unsigned short, the expression 2*mn can't overflow, and a compiler tracking the possible range of mn as the range of unsigned short should be able to infer that the loop condition of the second loop is initially true. To make the same inference for the first loop (which doesn't matter for the validity of the hi variable), the compiler would also need to be told that bn > 0.
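The rotation Jeff describes turns a for-loop into the equivalent of a do-while. For the average example from the first message, a hand-rotated sketch (roughly the shape the transformed loop takes; the compiler's internal form differs in detail) would be:

```c
#include <assert.h>
#include <stddef.h>

/* Hand-rotated version of the earlier average example: the body is a
   do-while, so the condition is only tested at the bottom. Because
   the assert promises n > 0, no guard test before the loop is needed
   here; in general the compiler must keep one. */
double
average_rotated (const double *a, size_t n)
{
  double sum = 0;
  size_t i = 0;
  assert (n > 0);
  do
    sum += a[i];
  while (++i < n);
  return sum / n;
}
```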
Regards,
/Niels


typedef unsigned long mp_size_t;
typedef unsigned long mp_limb_t;
typedef mp_limb_t *mp_ptr;
typedef const mp_limb_t *mp_srcptr;

/* Must match size of mp_limb_t */
#define GMP_NUMB_BITS 64

#define assert(x) do {} while(0)

mp_limb_t mpn_addmul_1 (mp_ptr, mp_srcptr, mp_size_t, mp_limb_t);
mp_limb_t mpn_add_n (mp_ptr, mp_srcptr, mp_srcptr, mp_size_t);
mp_limb_t sec_add_1 (mp_limb_t *rp, mp_limb_t *ap, mp_size_t n, mp_limb_t b);

#define cnd_add_n(cnd, rp, ap, n) mpn_addmul_1 ((rp), (ap), (n), (cnd) != 0)

struct ecc_modulo
{
  unsigned short bit_size;
  unsigned short size;
  unsigned short B_size;
  const mp_limb_t *m;
  /* B^size mod m. Expected to have at least 32 leading zeros
     (equality for secp_256r1). */
  const mp_limb_t *B;
  /* 2^{bit_size} - p, same value as above, but shifted. */
  const mp_limb_t *B_shifted;
};

/* Computes r mod m, input 2*m->size, output m->size. */
void
ecc_mod (const struct ecc_modulo *m, mp_limb_t *rp)
{
  mp_limb_t hi;
  mp_size_t mn = m->size;
  mp_size_t bn = m->B_size;
  mp_size_t sn = mn - bn;
  mp_size_t rn = 2*mn;
  mp_size_t i;
  unsigned shift;

  assert (bn < mn);

  /* FIXME: Could use mpn_addmul_2. */
  /* Eliminate sn limbs at a time */
  if (m->B[bn-1] < ((mp_limb_t) 1 << (GMP_NUMB_BITS - 1)))
    {
      /* Multiply sn + 1 limbs at a time, so we get a mn+1 limb
         product. Then we can absorb the carry in the high limb */
      while (rn > 2 * mn - bn)
        {
          rn -= sn;

          for (i = 0; i <= sn; i++)
            rp[rn+i-1] = mpn_addmul_1 (rp + rn - mn - 1 + i,
                                       m->B, bn, rp[rn+i-1]);
          rp[rn-1] = rp[rn+sn-1]
            + mpn_add_n (rp + rn - sn - 1, rp + rn - sn - 1,
                         rp + rn - 1, sn);
        }
      goto final_limbs;
    }
  else
    {
      /* The loop below always runs at least once. But the analyzer
         doesn't realize that, and complains about hi being used later
         on without a well defined value. */
#ifdef __clang_analyzer__
      hi = 0;
#endif
      while (rn >= 2 * mn - bn)
        {
          rn -= sn;

          for (i = 0; i < sn; i++)
            rp[rn+i] = mpn_addmul_1 (rp + rn - mn + i,
                                     m->B, bn, rp[rn+i]);

          hi = mpn_add_n (rp + rn - sn, rp + rn - sn, rp + rn, sn);
          hi = cnd_add_n (hi, rp + rn - mn, m->B, mn);
          assert (hi == 0);
        }
    }
  if (rn > mn)
    {
    final_limbs:
      sn = rn - mn;

      for (i = 0; i < sn; i++)
        rp[mn+i] = mpn_addmul_1 (rp + i, m->B, bn, rp[mn+i]);

      hi = mpn_add_n (rp + bn, rp + bn, rp + mn, sn);
      hi = sec_add_1 (rp + bn + sn, rp + bn + sn, mn - bn - sn, hi);
    }

  shift = m->size * GMP_NUMB_BITS - m->bit_size;
  if (shift > 0)
    {
      /* Combine hi with top bits, add in */
      hi = (hi << shift) | (rp[mn-1] >> (GMP_NUMB_BITS - shift));
      rp[mn-1] = (rp[mn-1]
                  & (((mp_limb_t) 1 << (GMP_NUMB_BITS - shift)) - 1))
        + mpn_addmul_1 (rp, m->B_shifted, mn-1, hi);
    }
  else
    {
      hi = cnd_add_n (hi, rp, m->B_shifted, mn);
      assert (hi == 0);
    }
}

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
Re: Inferring that the condition of a for loop is initially true?
Andreas Schwab writes:

> The problem is that assert is not allowed to evaluate the expression
> with -DNDEBUG, and side effects must not be carried out.

I'm suggesting that with -DNDEBUG, assert(x) should let the compiler assume that x is true, but without producing any code to evaluate it at runtime. E.g., given

  void
  foo (size_t n)
  {
    size_t i;
    assert (n < 17);
    for (i = n; i < 17; i++)
      {
        ...
      }
  }

the compiler could infer that the loop body is executed at least once for any valid input to the function. (Where assert is used to declare what valid inputs are like.)

Then I guess one can't implement that as simply as

  #define assert(x) do { if (!(x)) __builtin_unreachable (); } while (0)

if that would require the compiler to generate code to evaluate x at runtime (I'm not familiar with how __builtin_unreachable works). So one might need something like __builtin_assume, which never produces any code to evaluate its argument at runtime.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
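Concretely, Marc's test-and-unreachable pattern applied to the foo example might look like the sketch below (returning an iteration count so the effect is observable; whether gcc then rotates the loop depends on version and optimization level):

```c
#include <stddef.h>

/* The explicit test plus __builtin_unreachable tells gcc that n < 17
   holds on entry, so it may infer the loop body runs at least once.
   Calling foo with n >= 17 is then undefined behaviour. */
size_t
foo (size_t n)
{
  size_t i, count = 0;
  if (!(n < 17))
    __builtin_unreachable ();
  for (i = n; i < 17; i++)
    count++;
  return count;
}
```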
Re: Inferring that the condition of a for loop is initially true?
Joseph Myers writes:

> On Mon, 18 Sep 2017, Niels Möller wrote:
>
>> I'm suggesting that with -DNDEBUG, assert(x) should let the compiler
>> assume that x is true, but without producing any code to evaluate it at
>> runtime.
>
> There's no requirement that x is even a valid expression with -DNDEBUG.
> Consider code that does
>
>   int x;
>   #ifndef NDEBUG
>   int other_variable_used_in_assertion = something ();
>   #endif
>   /* ... */
>   assert (other_variable_used_in_assertion == x);

Ouch, didn't think about that case. And I'd expect there's a lot of real code like that out there. That makes extending the standard assert facility more difficult.

Still, reusing the familiar name "assert" in the source code would be nice, but the behavior I ask for would need to be triggered by something other than -DNDEBUG, e.g., a gcc-specific define or command line option. And I guess we can't hijack something like -DNDEBUG=2, even though I'd expect that to be pretty close to working in practice.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
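Given that constraint, one option (a sketch, not an existing facility) is to leave assert alone and introduce a separately named macro that is always compiled, so its argument must always be a valid expression:

```c
/* ASSUME is an illustrative name. Unlike assert, it is never compiled
   out entirely, so code whose assertion expression only exists under
   #ifndef NDEBUG would fail to compile rather than silently change
   meaning. clang's __builtin_assume takes the expression directly;
   for gcc, the test-and-unreachable pattern has the same effect. */
#if defined(__clang__)
# define ASSUME(x) __builtin_assume (x)
#elif defined(__GNUC__)
# define ASSUME(x) do { if (!(x)) __builtin_unreachable (); } while (0)
#else
# define ASSUME(x) ((void) 0)
#endif
```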
Broken links to online gcc manual
I tried to find the gcc manual online, and it was harder than I expected. The page https://www.gnu.org/software/gcc/ links to https://www.gnu.org/software/gcc/onlinedocs/, which is dead (404 Not found). And the long list at https://www.gnu.org/manual/ links to http://gcc.gnu.org/onlinedocs/gcc/, which also is dead (also 404). Not sure if the problem is on the gcc or gnu side of the web, or both. I'm using iceweasel 17.0.9, with javascript mostly disabled, in case the client matters. BTW, most, but not all, of the links I found use https, see above. I imagine you strive for using https exclusively. Regards, /Niels Möller -- Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26. Internet email is subject to wholesale government surveillance.
[gnu.org #858034] Re: Broken links to online gcc manual
Jonathan Wakely writes:

> On 21 September 2013 09:11, Niels Möller wrote:
>
>> The page https://www.gnu.org/software/gcc/ links to
>> https://www.gnu.org/software/gcc/onlinedocs/, which is dead (404 Not
>> found).
>
> Works for me, I get a 302 redirect:

After a night's sleep, I realize that most likely, my browser was configured to aggressively try to always use https rather than http. I don't remember turning that on, and I haven't yet tried to disable it, but I'll have to investigate.

Anyway, that aggressive behaviour totally breaks with the virtual hosting setup at gcc.gnu.org/sourceware.org. Probably nothing easy you can do about that on your end.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
gcc extensibility
I originally wrote this email as a reply to Ian Lance Taylor on a different list, and he suggested that I send it also to the gcc list. Please cc me on replies, since I'm not subscribed to the list. I hope I'm not being too off-topic or off-the-mark.

Let me write down some reflections on gcc extensibility, even if I'm not familiar at all with gcc internals.

1. I imagine the plugin API ought to stay in plain C, right?

2. Then there are at least two ways to think about the plugin API to, e.g., the gcc tree abstraction. Either one can define a C API one thinks the plugins will like, and then implement that on top of the internal C++ interfaces. These will be small wrapper functions, which lets the internal interfaces evolve without affecting the plugins. Or one sticks to a single "unified" tree API, to be used *both* internally and by plugins.

I suspect the second option is the right one, because it brings some equality between plugin authors and gcc developers. It should make it easier to adopt a plugin into gcc proper. Together with (1), this forces the internal interface to be C rather than C++, which I guess you may see as a serious disadvantage.

Going for a unified API, one gets less independence between plugins and gcc internals, but in return one gets less clutter, and I think it may improve quality. Otherwise, it seems likely that one ends up with an internal interface which is powerful but somewhat ugly (internal use only, right?) and an external interface which may be beautiful on the surface, but in practice is a second-class citizen and awkward for doing interesting things with.

3. What is the purpose of the plugin API? I can see that it should make it easier to prototype new optimization passes. Both for educational purposes and research, as well as by the gcc developers themselves.
One of the goals you stated was "I think parts of GCC needs to move toward being a component of tools other than pure compilation, such as refactoring, profile analysis, debugging." I think we can all agree that's highly desirable.

To be more concrete, I think it would be useful to have access to information from the parse tree, from the symbol table (both for compiler and linker), dataflow analysis, etc, when editing C code in emacs. Is a plugin API the right tool for that type of integration? Or should one also have a gcc library, and have various other specialized tools link to that library? Maybe the organization of valgrind could provide some inspiration, with a couple of different tools on top of the same machinery.

Happy hacking,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
Re: gcc extensibility
Gabriel Dos Reis writes:

> It is a false equality. The needs of plugins authors are not necessarily
> the same as the need of GCC development itself.

I'm not so sure of that. To be concrete, can you give some examples of things that a plugin might want to do with an interface to the tree abstraction in gcc, which gcc itself would never do? And vice versa? The typical needs may be different, but I don't think it's wise to a priori rule out the possibility of designing, e.g., a tree API which is both powerful enough and clean enough to serve both plugins and the rest of gcc well.

> You really do not want the existence of plugins to hinder (internal)
> evolution of GCC itself.

Agreed. The question is how much interface stability is needed by plugins, and how often gcc internals must change important interfaces.

I would expect that binary compatibility is not terribly important, i.e., it's acceptable if plugins sometimes have to be recompiled even for a new minor release (and we don't want to allow proprietary binary-only plugins anyway, right?). Breaking source-level compatibility ought to be avoided for minor releases. But the current situation, where, due to the name mangling, it seems difficult for a plugin to be compatible even with different configurations of the *same* minor release of gcc, seems a bit too inconvenient. Then I'd also expect that certain interfaces, e.g., the tree interface, can be a lot more stable than others.

Bottom line: When interface changes in gcc really are needed, then just do them (preferably in connection with a major release), and plugins will have to follow.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.