Inferring that the condition of a for loop is initially true?

2017-09-14 Thread Niels Möller
This is more of a question than a bug report, so I'm trying to send it
to the list rather than filing a bugzilla issue.

I think it's quite common to write for- and while-loops where the
condition is always initially true. A simple example might be

double average (const double *a, size_t n) 
{
  double sum;
  size_t i;

  assert (n > 0);
  for (i = 0, sum = 0; i < n; i++)
    sum += a[i];
  return sum / n;
}

The programmer could do the micro-optimization of rewriting it as a
do-while loop instead. It would be nice if gcc could infer that the
condition is initially true, and convert it to a do-while loop
automatically.
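
For comparison, a hand-rotated version of the example might look like
this (a sketch; it behaves the same as long as the assert holds):

double average (const double *a, size_t n)
{
  double sum = 0;
  size_t i = 0;

  assert (n > 0);
  do
    sum += a[i];
  while (++i < n);
  return sum / n;
}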

Converting to a do-while loop should produce slightly better code,
omitting the typical initial jump to the bottom of the loop, where the
condition is checked. It would also make analysis of where variables
are written more accurate, which is my main concern at the moment.

My questions are:

1. Does gcc attempt to do this optimization? 

2. If it does, how often does it succeed on loops in real programs?

3. Can I help the compiler to do that inference?

The code I had some trouble with is at
https://git.lysator.liu.se/nettle/nettle/blob/master/ecc-mod.c. A
simplified version with only the interesting code path would be

void
ecc_mod (mp_size_t mn, mp_size_t bn, mp_limb_t *rp)
{
  mp_limb_t hi;
  mp_size_t sn = mn - bn;
  mp_size_t rn = 2*mn;

  assert (bn < mn);

  while (rn >= 2 * mn - bn)
    {
      rn -= sn;
      ... code which sets hi ...
    }

  ... code which uses hi ...
}

The intention is that the loop will run at least once, but in this
context, rewriting it as a do-while loop would nevertheless make the
code uglier, imo.

It looks obvious at first glance that the initial value rn = 2*mn
makes the condition rn >= 2*mn - bn true. All values are unsigned, and
the assert backs up that mn - bn won't underflow.

This is also used with pretty small mn, so that 2*mn won't overflow,
but unfortunately, that's not obvious to the compiler. In theory, on a
64-bit machine the function could be called with, e.g., mn == (1 << 63)
and bn == 1, and then the initial loop condition would be 0 >= ~0,
which is false. So the overflow cases seem to rule out the
optimization.
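
Spelled out as a standalone check (a sketch, assuming a 64-bit
unsigned long standing in for mp_size_t):

#include <stdio.h>

int main (void)
{
  /* Hypothetical extreme input on a 64-bit machine. */
  unsigned long mn = 1UL << 63, bn = 1;
  unsigned long rn = 2*mn;          /* wraps around to 0 */
  unsigned long bound = 2*mn - bn;  /* wraps around to ~0UL */

  /* Prints: rn = 0, bound = 18446744073709551615, condition = 0 */
  printf ("rn = %lu, bound = %lu, condition = %d\n",
          rn, bound, (int) (rn >= bound));
  return 0;
}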

I've seen gcc warn about hi being used uninitialized in this function (but
only on sparc, and on a machine I no longer use, so I'm afraid I can't
provide details). I've also seen warnings from the clang static
analyzer (which I guess makes different tradeoffs than gcc -Wall when
it comes to false positives).

I would guess it's quite common that conditions which are always true
for relevant inputs can be false due to overflow with extreme inputs,
and since unsigned wrap-around is well defined, that doesn't leave any
"undefined behaviour" freedom for the compiler.

What's the best way to promise the compiler that there will be no
overflows involving mn? I could try adding an assert, like

  rn = 2*mn; assert (rn > mn);

to rule out overflow, but that looks a bit non-obvious to humans, and
possibly to the compiler too.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.


Re: Inferring that the condition of a for loop is initially true?

2017-09-18 Thread Niels Möller
Marc Glisse  writes:

> assert is not what you want, since it completely disappears with
> -DNDEBUG. clang has __builtin_assume, with gcc you want a test and
> __builtin_unreachable. Replacing your assert with
> if(n==0)__builtin_unreachable();

To me, it would make sense to have assert expand to something like
that when runtime checks are disabled with -DNDEBUG, along the lines
of the sketch below.
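
A minimal sketch of the idea (hypothetical; the C standard requires
assert to expand to ((void) 0) when NDEBUG is defined, so this would
have to be an opt-in extension; fprintf and abort stand in for the
usual failure path):

#include <stdio.h>
#include <stdlib.h>

#ifdef NDEBUG
  /* Keep the condition as an optimization hint, but emit no runtime
     check. */
# define assert(x) do { if (!(x)) __builtin_unreachable (); } while (0)
#else
# define assert(x) \
  do { if (!(x)) { fprintf (stderr, "Assertion `%s' failed.\n", #x); \
                   abort (); } } while (0)
#endif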

After all, if a program fails an assertion, it cannot be expected to
produce any meaningful results, even with run-time checks disabled
using -DNDEBUG. And it's somewhat counter-intuitive if -DNDEBUG
produces a binary in which the code downstream of assert statements is
slightly less well optimized.

I imagine the C standard might specify that assert "completely
disappears" when NDEBUG is defined. But maybe there's room for a
-fstrict-assert optimization flag, to let assert always guide
optimization?

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.


Re: Inferring that the condition of a for loop is initially true?

2017-09-18 Thread Niels Möller
Jeff Law  writes:

>> 1. Does gcc attempt to do this optimization? 
> Yes.  It happens as a side effect of jump threading and there are also
> dedicated passes to rotate the loop.
>
>> 
>> 2. If it does, how often does it succeed on loops in real programs?
> Often.  The net benefit is actually small though and sometimes this kind
> of loop rotation can impede vectorization.

Thanks for the info.

>> 3. Can I help the compiler to do that inference?
> In general, I'd advise against it.  You end up with ugly code which
> works with specific versions of the compiler, but which needs regular
> tweaking as the internal implementations of various optimizers change
> over time.

What would make sense to me is some annotation of the valid ranges of
variables, if it can be done in a way which is friendly to both compiler
and humans.

> I'd have to have a self-contained example to dig into what's really
> going on, but my suspicion is either overflow or fairly weak range data
> and simplification due to the symbolic ranges.

Self-contained example below (the #define of GMP_NUMB_BITS must be
manually changed if tested on some architecture where long isn't 64
bits).

Actually, I see now that the mn variable is read from a struct field of
type unsigned short. So as long as size_t (or unsigned long) is larger
than unsigned short, the expression 2*mn can't overflow, and a
compiler tracking the possible range of mn (the range of unsigned
short) should be able to infer that the loop condition of the second
loop is initially
true.

To make the same inference for the first loop (which doesn't matter for
validity of the hi variable), the compiler would also need to be told
that bn > 0.
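
Following the __builtin_unreachable suggestion from earlier in the
thread, one way to state that would be something like (a sketch,
placed right after bn is read from the struct):

  if (bn == 0)
    __builtin_unreachable ();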

Regards,
/Niels

typedef unsigned long mp_size_t;
typedef unsigned long mp_limb_t;
typedef mp_limb_t *mp_ptr;
typedef const mp_limb_t *mp_srcptr;
/* Must match size of mp_limb_t */
#define GMP_NUMB_BITS 64

#define assert(x) do {} while(0)

mp_limb_t mpn_addmul_1 (mp_ptr, mp_srcptr, mp_size_t, mp_limb_t);
mp_limb_t mpn_add_n (mp_ptr, mp_srcptr, mp_srcptr, mp_size_t);
mp_limb_t
sec_add_1 (mp_limb_t *rp, mp_limb_t *ap, mp_size_t n, mp_limb_t b);
#define cnd_add_n(cnd, rp, ap, n) mpn_addmul_1 ((rp), (ap), (n), (cnd) != 0)

struct ecc_modulo
{
  unsigned short bit_size;
  unsigned short size;
  unsigned short B_size;

  const mp_limb_t *m;
  /* B^size mod m. Expected to have at least 32 leading zeros
     (equality for secp_256r1). */
  const mp_limb_t *B;
  /* 2^{bit_size} - p, same value as above, but shifted. */
  const mp_limb_t *B_shifted;
};

/* Computes r mod m, input 2*m->size, output m->size. */
void
ecc_mod (const struct ecc_modulo *m, mp_limb_t *rp)
{
  mp_limb_t hi;
  mp_size_t mn = m->size;
  mp_size_t bn = m->B_size;
  mp_size_t sn = mn - bn;
  mp_size_t rn = 2*mn;
  mp_size_t i;
  unsigned shift;

  assert (bn < mn);

  /* FIXME: Could use mpn_addmul_2. */
  /* Eliminate sn limbs at a time */
  if (m->B[bn-1] < ((mp_limb_t) 1 << (GMP_NUMB_BITS - 1)))
    {
      /* Multiply sn + 1 limbs at a time, so we get a mn+1 limb
         product. Then we can absorb the carry in the high limb */
      while (rn > 2 * mn - bn)
        {
          rn -= sn;

          for (i = 0; i <= sn; i++)
            rp[rn+i-1] = mpn_addmul_1 (rp + rn - mn - 1 + i, m->B, bn,
                                       rp[rn+i-1]);
          rp[rn-1] = rp[rn+sn-1]
            + mpn_add_n (rp + rn - sn - 1, rp + rn - sn - 1, rp + rn - 1, sn);
        }
      goto final_limbs;
    }
  else
    {
      /* The loop below always runs at least once. But the analyzer
         doesn't realize that, and complains about hi being used later
         on without a well defined value. */
#ifdef __clang_analyzer__
      hi = 0;
#endif
      while (rn >= 2 * mn - bn)
        {
          rn -= sn;

          for (i = 0; i < sn; i++)
            rp[rn+i] = mpn_addmul_1 (rp + rn - mn + i, m->B, bn, rp[rn+i]);

          hi = mpn_add_n (rp + rn - sn, rp + rn - sn, rp + rn, sn);
          hi = cnd_add_n (hi, rp + rn - mn, m->B, mn);
          assert (hi == 0);
        }
    }

  if (rn > mn)
    {
    final_limbs:
      sn = rn - mn;

      for (i = 0; i < sn; i++)
        rp[mn+i] = mpn_addmul_1 (rp + i, m->B, bn, rp[mn+i]);

      hi = mpn_add_n (rp + bn, rp + bn, rp + mn, sn);
      hi = sec_add_1 (rp + bn + sn, rp + bn + sn, mn - bn - sn, hi);
    }

  shift = m->size * GMP_NUMB_BITS - m->bit_size;
  if (shift > 0)
    {
      /* Combine hi with top bits, add in */
      hi = (hi << shift) | (rp[mn-1] >> (GMP_NUMB_BITS - shift));
      rp[mn-1] = (rp[mn-1] & (((mp_limb_t) 1 << (GMP_NUMB_BITS - shift)) - 1))
        + mpn_addmul_1 (rp, m->B_shifted, mn-1, hi);
    }
  else
    {
      hi = cnd_add_n (hi, rp, m->B_shifted, mn);
      assert (hi == 0);
    }
}



-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.


Re: Inferring that the condition of a for loop is initially true?

2017-09-18 Thread Niels Möller
Andreas Schwab  writes:

> The problem is that assert is not allowed to evaluate the expression
> with -DNDEBUG, and side effects must not be carried out.

I'm suggesting that with -DNDEBUG, assert(x) should let the compiler
assume that x is true, but without producing any code to evaluate it at
runtime.

E.g, given

  void foo (size_t n)
  {
    size_t i;
    assert (n < 17);
    for (i = n; i < 17; i++)
      { ... }
  }

the compiler could infer that the loop body is executed at least once
for any valid input to the function (where assert is used to declare
what the valid inputs are).

Then I guess one can't implement that as simply as

#define assert(x) do { if (!(x)) __builtin_unreachable (); } while (0)

if that would require the compiler to generate code to evaluate x at
runtime (I'm not familiar with how __builtin_unreachable works). So one
might need something like __builtin_assume which never produces any code
to evaluate its argument at runtime.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.


Re: Inferring that the condition of a for loop is initially true?

2017-09-18 Thread Niels Möller
Joseph Myers  writes:

> On Mon, 18 Sep 2017, Niels Möller wrote:

>> I'm suggesting that with -DNDEBUG, assert(x) should let the compiler
>> assume that x is true, but without producing any code to evaluate it at
>> runtime.
>
> There's no requirement that x is even a valid expression with -DNDEBUG.  
> Consider code that does
>
>   int x;
> #ifndef NDEBUG
>   int other_variable_used_in_assertion = something ();
> #endif
>   /* ... */
>   assert (other_variable_used_in_assertion == x);

Ouch, didn't think about that case. And I'd expect there's a lot of real
code like that out there.

That makes extending the standard assert facility more difficult.

Still, reusing the familiar name "assert" in the source code would be
nice, but the behavior I'm asking for would need to be triggered by
something other than -DNDEBUG, say a gcc-specific define or
command-line option. And I guess we can't hijack something like
-DNDEBUG=2, even though I'd expect that to be pretty close to working
in practice.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.


Broken links to online gcc manual

2013-09-21 Thread Niels Möller
I tried to find the gcc manual online, and it was harder than I
expected.

The page https://www.gnu.org/software/gcc/ links to
https://www.gnu.org/software/gcc/onlinedocs/, which is dead (404 Not
found). And the long list at https://www.gnu.org/manual/ links to
http://gcc.gnu.org/onlinedocs/gcc/, which also is dead (also 404).

Not sure if the problem is on the gcc or gnu side of the web, or both.

I'm using iceweasel 17.0.9, with javascript mostly disabled, in case the
client matters.

BTW, most, but not all, of the links I found use https (see above). I
imagine you strive to use https exclusively.

Regards,
/Niels Möller

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.



[gnu.org #858034] Re: Broken links to online gcc manual

2013-09-21 Thread Niels Möller
Jonathan Wakely  writes:

> On 21 September 2013 09:11, Niels Möller wrote:
>> The page https://www.gnu.org/software/gcc/ links to
>> https://www.gnu.org/software/gcc/onlinedocs/, which is dead (404 Not
>> found).
>
> Works for me, I get a 302 redirect:

After a night's sleep, I realize that most likely my browser was
configured to aggressively try to always use https rather than http. I
don't remember turning that on, and I haven't yet tried to disable it,
but I'll have to investigate.

Anyway, that aggressive behaviour totally breaks with the virtual
hosting setup at gcc.gnu.org/sourceware.org. Probably there's nothing
easy you can do about that on your end.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.


gcc extensibility

2012-03-29 Thread Niels Möller
I originally wrote this email as a reply to Ian Lance Taylor on a
different list, and he suggested that I send it also to the gcc list.
Please cc me on replies, since I'm not subscribed to the list. I hope
I'm not being too off-topic or off-the-mark.

Let me write down some reflections on gcc extensibility, even if I'm not
familiar at all with gcc internals.

1. I imagine the plugin API ought to stay in plain C, right?

2. Then there are at least two ways to think about the plugin API to,
   e.g., the gcc tree abstraction.

   Either one can define a C API one thinks the plugins will like, and
   then implement that on top of the internal C++ interfaces. These will
   be small wrapper functions, which lets the internal interfaces evolve
   without affecting the plugins.

   Or one sticks to a single "unified" tree API, to be used *both*
   internally and by plugins.

   I suspect the second option is the right one, because it brings some
   equality between plugin authors and gcc developers. It should make it
   easier to adopt a plugin into gcc proper. Together with (1), this
   forces the internal interface to be C rather than C++, which I guess
   you may see as a serious disadvantage.

   Going for a unified API, one gets less independence between plugins
   and gcc internals, but in return, one gets less clutter, and I think
   it may improve quality. Otherwise, it seems likely that one ends up
   with an internal interface which is powerful but somewhat ugly
   (internal use only, right?) and an external interface which may be
   beautiful on the surface, but in practice is a second-class citizen
   and awkward to do interesting things with.

3. What is the purpose of the plugin API? I can see that it should make
   it easier to prototype new optimization passes. Both for educational
   purposes, and research, as well as by the gcc developers themselves.

   One of the goals you stated was "I think parts of GCC needs to move
   toward being a component of tools other than pure compilation, such
   as refactoring, profile analysis, debugging."

   I think we can all agree that's highly desirable. To be more
   concrete, I think it would be useful to have access to information from
   the parse tree, from the symbol table (both for compiler and linker),
   dataflow analysis, etc, when editing C code in emacs. Is a plugin API
   the right tool for that type of integration? Or should one also have
   a gcc library, and have various other specialized tools link to that
   library?

   Maybe the organization of valgrind could provide some inspiration,
   with a couple of different tools on top of the same machinery.

Happy hacking,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.



Re: gcc extensibility

2012-03-29 Thread Niels Möller
Gabriel Dos Reis  writes:

> It is a false equality.  The needs of plugins authors are not necessarily
> the same as the need of GCC development itself.

I'm not so sure of that. To be concrete, can you give some examples of
things that a plugin might want to do with an interface to the
tree-abstraction in gcc, which gcc itself would never do? And vice
versa?

The typical needs may be different, but I don't think it's wise to
rule out a priori the possibility of designing, e.g., a tree API which
is both powerful enough and clean enough to serve both plugins and the
rest of gcc well.

> You really do not want the existence of plugins to hinder (internal)
> evolution of GCC itself.

Agreed. The question is how much interface stability is needed by plugins,
and how often gcc internals must change important interfaces. I would
expect that binary compatibility is not terribly important, i.e., it's
acceptable if plugins sometimes have to be recompiled even for a new
minor release (and we don't want to allow proprietary binary-only
plugins anyway, right?).

Breaking source-level compatibility ought to be avoided for minor
releases. But the current situation, where name mangling makes it
difficult for a plugin to be compatible even with different
configurations of the *same* minor release of gcc, seems a bit too
inconvenient.

Then I'd also expect that certain interfaces, e.g., the tree interface,
can be a lot more stable than others.

Bottom line: When interface changes in gcc really are needed, then just
do them (preferably in connection with a major release), and plugins
will have to follow.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.