[Bug tree-optimization/28364] New: poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread zackw at panix dot com
Consider the following test program:

#include 
bool has_bad_chars(std::string const & path)
{
  for (std::string::const_iterator c = path.begin(); c != path.end(); c++)
{
  unsigned char x = static_cast(*c);
  if (x <= 0x1f || x == 0x5c || x == 0x7f)
return true;
}
  return false;
}

At -O2, GCC 4.1 chooses to duplicate the entire body of the loop for no good
reason; the code is not rendered more straight-line by this, and in fact it
executes strictly more instructions even for a single-character string.  I'll
attach an assembly file showing what it did (Z13has_bad_charsRKSs) and what it
should have done (_Z14has_bad_chars2RKSs).  The bad transformation is done by
the .t45.ch pass, which acronym does not mean anything to me.


-- 
   Summary: poor optimization choices when iterating over a
std::string (probably not c++-specific)
   Product: gcc
   Version: 4.1.2
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: zackw at panix dot com
 GCC build triplet: i486-linux-gnu
  GCC host triplet: i486-linux-gnu
GCC target triplet: i486-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364



[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread zackw at panix dot com


--- Comment #1 from zackw at panix dot com  2006-07-12 22:33 ---
Created an attachment (id=11874)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11874&action=view)
assembly output (bad on top, hand-corrected below)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364



[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread zackw at panix dot com


--- Comment #3 from zackw at panix dot com  2006-07-12 22:42 ---
I should mention that the exact command line flags were -O2
-fomit-frame-pointer -march=pentium4, and that I hand-tweaked the label numbers
for ease of reading.

Also, -fno-tree-ch does suppress this bad optimization, but in exchange we get
mildly worse code from the loop optimizer proper - it uses [reg+reg] indexing
and a 0..n count instead of [reg] indexing and a base..limit count.  The code
is pretty short so I'll just paste it here (meaningless labels removed):

_Z17has_bad_chars_newRKSs:
pushl   %ebx
movl8(%esp), %eax
movl(%eax), %eax
xorl%ecx, %ecx
movl-12(%eax), %ebx
.L2:
cmpl%ecx, %ebx
je  .L10
movzbl  (%ecx,%eax), %edx
cmpb$31, %dl
jbe .L4
cmpb$92, %dl
je  .L4
addl$1, %ecx
cmpb$127, %dl
jne .L2
.L4:
movl$1, %eax
popl%ebx
.p2align 4,,2
ret
.L10:
xorl%eax, %eax
popl%ebx
.p2align 4,,2
ret

Looking at the code, I see that the entire purpose of tree-ch is to duplicate
loop bodies in this fashion, and the justification given is that it "increases
effectiveness of code motion and reduces the need for loop preconditioning",
which I take to cover the above degradation in addressing mode choice.  I'm not
an optimizer expert, but surely there is a way to get the best of both worlds
here...?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364



[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread zackw at panix dot com


--- Comment #4 from zackw at panix dot com  2006-07-12 22:48 ---
I remembered that I had a build of 4.2 from last week lying around.  It
generates even worse code - still with the duplication of most of the loop,
plus a bunch of unnecessary register fiddling and bad addressing mode choice.


-- 

zackw at panix dot com changed:

   What|Removed |Added

  Known to fail||4.1.2 4.2.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364



[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread zackw at panix dot com


--- Comment #7 from zackw at panix dot com  2006-07-12 23:19 ---
Created an attachment (id=11875)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=11875&action=view)
C test case (with interesting implications)

I've found a plain C test case.  In the process, I've found that the way
libstdc++  is coded interacts interestingly with the optimizer.

In the attached file, has_bad_chars_bad is a literal translation to C of the
code seen by the optimizers after inlining for the original C++ test case. 
Yes, libstdc++  does the moral equivalent of ((struct
rep*)path)[-1].len.  This function compiles to the same bad code as my original
test case.

has_bad_chars_good, on the other hand, is how I naively thought  worked
on the first read-through.  That one compiles to code which looks optimal to
me.  I suspect some optimizer or other is not smart enough to see through this
particular construct ... it would be good to make it do so, since we want
libstdc++  to generate good code.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364



[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread zackw at panix dot com


--- Comment #8 from zackw at panix dot com  2006-07-12 23:21 ---
Zdenek: I don't see how you can say that what we get is optimal code "unless
optimizing for size".  The code generated *will* be slower than the
alternative.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364



[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread zackw at panix dot com


--- Comment #10 from zackw at panix dot com  2006-07-12 23:33 ---
I-cache.  Also, more iterations before the branch predictors figure out what's
going on.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364



[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread zackw at panix dot com


--- Comment #12 from zackw at panix dot com  2006-07-13 03:09 ---
Subject: Re:  poor optimization choices when iterating over a std::string
(probably not c++-specific)

> > I-cache.
> this only matters if this increase in code size will make the code go
> out of instruction cache.

The real program that this is taken from is a large C++ application
which is guaranteed to go out of cache - it's got slightly less than
four megabytes of .text - the actual goal is to make sure all of its
inner inner inner loops do stay in cache.  And this is one of 'em.

> > Also, more iterations before the branch predictors figure out what's
> > going on.
> But also possibly more consistent behavior with respect to branch
> prediction, in case the loop is often exited in the first iteration.

Again, in real life I know a priori that the function is rarely, if
ever, called with a zero-length string.

-

I went through the tree dumps for my week-old 4.2.0 for the test
program with a fine comb.  They are quite instructive.  If tree-ch
were doing what it says on the label -- copying the loop header --
everything would be fine; dom1 cleans up the two copies of the header
later.  However, ch isn't just copying the loop header; it is also
copying the *entire body of the loop*, which nothing can fix. I call
that a clear bug.

zw


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364



[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread zackw at panix dot com


--- Comment #14 from zackw at panix dot com  2006-07-13 03:40 ---
Subject: Re:  poor optimization choices when iterating over a std::string
(probably not c++-specific)

It's a validation routine, yes, which is run over every pathname the
program is working on, and there can be hundreds or thousands of
those.

And why the heck shouldn't I be able to use std::string in inner
loops?  I sure don't want to be using bare char*...


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364



[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-12 Thread zackw at panix dot com


--- Comment #17 from zackw at panix dot com  2006-07-13 04:23 ---
Subject: Re:  poor optimization choices when iterating over a std::string
(probably not c++-specific)

> One more comment, a loop with an early exit is whole issue and that is the
> reason why all of the code in the loop is considered the header. There are a
> couple of loop headers in this case, one for each exit which makes it harder 
> to
> deal with in general.

I didn't know that, and it is not obvious from the optimizer dumps.  Thanks for
explaining.

> What you did not mention is which how would this loop exit normally, via the
> return 1 or all the way through the loop.  There is no enough information from
> GCC's point of view to figure that out without profiling (for this case).  GCC
> is assuming that the loop exits in the first if statement which seems
> reasoniable. Maybe you should try with profiling information and see what GCC
> does for this testcase.

 Feedback-directed optimization is only good for making
compilers look better on benchmarks.  It's useless in real life. 

I can, in fact, get good code out of gcc 4.1 by beating it over the
head with __builtin_expect, but I don't think I should have to do
that.  I think my suggested version is better code no matter whether
or not the loop exits early.

4.2 still makes what I consider to be bad addressing mode choices
after that change, but Zdenek did say he would look at that.  It also
puts the "return 1" exit block in the middle of the loop in spite of
being told that all three conditions leading to that are unlikely.

struct rep
{
  unsigned long len;
  unsigned long alloc;
  unsigned long dummy;
};

struct data
{
  char * ptr;
};

struct string
{
  struct rep R;
  struct data D;
};

int
has_bad_chars(struct data *path)
{
  char *c;
  for (c = path->ptr;
   __builtin_expect(c < path->ptr + ((struct rep *)path)[-1].len, 1);
   c++)
{
  unsigned char x = (unsigned char)(*c);
  if (__builtin_expect(x <= 0x1f || x == 0x5c || x == 0x7f, 0))
return 1;
}
  return 0;
}


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364



[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-13 Thread zackw at panix dot com


--- Comment #20 from zackw at panix dot com  2006-07-13 08:25 ---
Subject: Re:  poor optimization choices when iterating over a std::string
(probably not c++-specific)

> > However, ch isn't just copying the loop header; it is also
> > copying the *entire body of the loop*, which nothing can fix. I call
> > that a clear bug.
>
> how do you define a loop header?

I was under the impression it was just the one basic block called out
in the .ch dump, e.g.

;; Loop 1
;;  header 6, latch 5
;;  depth 1, level 1, outer 0

-- basic block 6 happens to contain just the code from the syntactic
loop condition.  Andrew informs me that this is wrong, and that in
this case the header is the entire loop, but I will come back at that
with 'ch should never be duplicating the entire loop; if the header is
the entire loop, it should do something more sensible, like duplicate
just the first basic block or something.'


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364



[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2006-07-13 Thread zackw at panix dot com


--- Comment #21 from zackw at panix dot com  2006-07-13 08:28 ---
Subject: Re:  poor optimization choices when iterating over a std::string
(probably not c++-specific)

> on your real program, how much performance do you gain by hand-rewriting
> the assembler to look the way you like?  Just to make sure there really
> is a problem.

I'm a little annoyed by the suggestion that this wouldn't be a real
problem if I couldn't measure a performance difference.

Depending on workload, other activity on the same machine, and phase
of moon, this loop is between .1% and 1% of runtime, and my tweaks
make it go about a third faster.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364



[Bug libstdc++/31464] New: Extension request: publicly visible forward-declaration headers for and all STL containers

2007-04-03 Thread zackw at panix dot com
It would be nice if there were publicly visible headers analogous to 
for  and all the STL headers.  To be more concrete: for every STL
header   there should be a  that contains *only*
forward declarations of the types with complete declarations in . 
The existing  is an example of exactly what I have in mind,
and should be promoted to .

The value of these headers would be primarily in writing application headers;
it is common to need to include  or  or whatever solely to be able
to refer to the types it defines in function prototypes.  By providing
forward-declaration headers, the application headers would not need to drag in
the full text of the STL headers.  Application source files could then include
the complete-declaration headers only for the types they actually used.


-- 
   Summary: Extension request: publicly visible forward-declaration
headers for  and all STL containers
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: libstdc++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: zackw at panix dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31464



[Bug fortran/31519] spurious ICE messages when module does not compile

2007-04-15 Thread zackw at panix dot com


--- Comment #6 from zackw at panix dot com  2007-04-15 21:22 ---
Bugs where the compiler proper crashes when run under the driver, but not when
run directly, can often be reproduced by varying the amount of space taken up
by environment variables, e.g.

x=
while :; do
  x="x$x"
  X=$x ./f951 arguments... || break
done


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31519



[Bug preprocessor/11242] [mingw32] #include takes my memory directory instead of the standard memory header file

2007-04-20 Thread zackw at panix dot com


--- Comment #6 from zackw at panix dot com  2007-04-21 02:56 ---
I am inclined to think that this is an operating system bug and should be
worked around in the mingw32 libraries, not in cpplib.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11242



[Bug driver/30972] Call to _access has invalid parameter when linked with msvcrt (for vista)

2007-09-06 Thread zackw at panix dot com


--- Comment #5 from zackw at panix dot com  2007-09-06 17:35 ---
Nah, probably no one will ever get round to it and it's not that important.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30972



[Bug c/37591] New: suppress "signed and unsigned" warnings when signed value known to be positive

2008-09-19 Thread zackw at panix dot com
In a case such as this, GCC ought to be able to prove that the signed variable
is positive and therefore suppress "signed and unsigned" warnings.  I see this
in both C and C++.

#define MAX(a,b) ((a) > (b) ? (a) : (b))
#define MIN(a,b) ((a) < (b) ? (a) : (b))

unsigned int
constrain(unsigned int index, unsigned int offset, unsigned int limit)
{
  int adj = index - offset;
  adj = MAX(adj, 0);
  return MIN(adj, limit); /* { dg-bogus "signed and unsigned" } */
}


-- 
   Summary: suppress "signed and unsigned" warnings when signed
value known to be positive
   Product: gcc
   Version: 4.3.2
Status: UNCONFIRMED
  Keywords: diagnostic
  Severity: enhancement
  Priority: P3
 Component: c
AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: zackw at panix dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37591



[Bug c/37591] suppress "signed and unsigned" warnings when signed value known to be positive

2008-09-19 Thread zackw at panix dot com


--- Comment #2 from zackw at panix dot com  2008-09-19 21:28 ---
I'd be fine with it being like uninitialized value warnings.  The false
positives here are *really* annoying.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37591



[Bug c/37591] suppress "signed and unsigned" warnings when signed value known to be positive

2008-09-22 Thread zackw at panix dot com


--- Comment #5 from zackw at panix dot com  2008-09-22 15:46 ---
I'm not monitoring consensus of developers anymore, but I think we *should*
either move these warnings to the middle end or do some CCP/VRP in the front
ends.  The -Wuninitialized warnings are a lot less trouble than the
signed/unsigned warnings, right now.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37591



[Bug tree-optimization/57230] New: tree-ssa-strlen incorrectly optimizes a strlen to 0

2013-05-09 Thread zackw at panix dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57230

Bug ID: 57230
   Summary: tree-ssa-strlen incorrectly optimizes a strlen to 0
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zackw at panix dot com

GCC 4.7 and 4.8 mis-optimize this test case:

  int main(void)
  {
char pstring[] = " hello world";
pstring[0] = (char) (__builtin_strlen(pstring) - 1);
__builtin_printf("%zd\n", __builtin_strlen(pstring));
return 0;
  }

The value written to pstring[0] is (char)11, which is nonzero, so both calls to
__builtin_strlen should return 12.  However, tree-ssa-strlen replaces the
second call with a constant 0.

Here are tree dumps right before (-fdump-tree-phiopt2) ...

  main ()
  {
char pstring[13];
long unsigned int _3;
unsigned char _4;
unsigned char _5;
char _6;
long unsigned int _8;

:
pstring = " hello world";
_3 = __builtin_strlen (&pstring);
_4 = (unsigned char) _3;
_5 = _4 + 255;
_6 = (char) _5;
pstring[0] = _6;
_8 = __builtin_strlen (&pstring);
__builtin_printf ("%zd\n", _8);
pstring ={v} {CLOBBER};
return 0;
  }

... and right after (-fdump-tree-strlen):

  main ()
  {
char pstring[13];
long unsigned int _3;
unsigned char _4;
unsigned char _5;
char _6;
long unsigned int _8;

:
pstring = " hello world";
_3 = __builtin_strlen (&pstring);
_4 = (unsigned char) _3;
_5 = _4 + 255;
_6 = (char) _5;
pstring[0] = _6;
_8 = 0;
__builtin_printf ("%zd\n", _8);
pstring ={v} {CLOBBER};
return 0;
  }

This is both a missed and an incorrect optimization; it should have replaced
the *first* call to __builtin_strlen with 12, and ideally also been able to
figure out that the second call was unaffected by the write to pstring[0]. 
However, only the replacement of the second call with 0 is a correctness issue.

Simplified from Debian #707118
<http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=707118> (which I'm about to
reopen).  Observed with both 4.7.3 and 4.8.0.


[Bug tree-optimization/57230] [4.7/4.8/4.9 Regression] tree-ssa-strlen incorrectly optimizes a strlen to 0

2013-05-10 Thread zackw at panix dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57230

--- Comment #5 from Zack Weinberg  ---
It might be good to include stores to nonzero offsets in the test case, and
stores to bytes that used to be internal NULs, something like

int main(void)
{
  char s[] = "abc\0def";
  s[1] = (char) strlen(s);
  if (strlen(s) != 3) abort();
  s[2] = (char) strlen(s);
  if (strlen(s) != 3) abort();
  s[3] = (char) strlen(s);
  if (strlen(s) != 7) abort();
  return 0;
}

I dunno how much of this is worth trying to optimize, but I'll also mention
that if the values at a certain offset before and after a store are not known
for certain but are both known to be nonzero, that store doesn't have to
invalidate a cached string length.


[Bug tree-optimization/57230] [4.7/4.8/4.9 Regression] tree-ssa-strlen incorrectly optimizes a strlen to 0

2013-05-10 Thread zackw at panix dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57230

--- Comment #8 from Zack Weinberg  ---
(In reply to Jakub Jelinek from comment #6)
> We have just one strlen pass instance, and even if we optimize the first
> strlen
> there, having strlen pass duplicate constant propagation functionality just
> to handle this weird testcase (storing strlen sizes inside of characters is
> nothing common) doesn't make sense.

I think I gave the wrong impression there; I was trying to exhibit a more
comprehensive *correctness* test for this transformation, using strlen() as a
convenient-to-hand function known in context to return a nonzero value.  Lemme
try again.

  extern __SIZE_TYPE__ strlen(const char *);
  extern int rand(void);
  extern void abort(void) __attribute__((noreturn));
  #define assert(x) do { if (!(x)) abort(); } while (0)

  /* Returns a value that might or might not be zero. */
  static inline char maybe_zero() { return rand() & 0x10; }
  /* Returns an unknown but definitely-nonzero value. */
  static inline char nonzero() { return (rand() & 0x10) | 0x01; }

  int main(void)
  {
char s[] = "abc\0def";

/* These operations do not change the length of the string. */
s[0] = nonzero();  assert(strlen(s) == 3);
s[1] = nonzero();  assert(strlen(s) == 3);
s[2] = nonzero();  assert(strlen(s) == 3);
s[3] = '\0';   assert(strlen(s) == 3);
s[4] = '\0';   assert(strlen(s) == 3);
s[4] = 'd';assert(strlen(s) == 3);

/* These operations change the length of the string, to a
   value (in principle) computable at compile time. */
s[3] = nonzero();  assert(strlen(s) == 7);
s[5] = '\0';   assert(strlen(s) == 5);
s[2] = '\0';   assert(strlen(s) == 2);
s[2] = 'c';assert(strlen(s) == 5);
s[5] = nonzero();  assert(strlen(s) == 7);

/* This operation requires runtime recalculation of the
   length of the string. */
s[4] = maybe_zero();
assert(strlen(s) == (s[4] ? 7 : 4));
return 0;
  }

Of all of that, the most important thing IMHO (given the original bug report)
is to verify that writing '\0' at a nonzero *offset* within a string does not
cause a subsequent strlen() to report the length of the string as zero.


[Bug c++/41233] New: Templated conversion operator produces symbol name that won't demangle

2009-09-02 Thread zackw at panix dot com
This program

  class Source;
  class Dest;

  struct converter
  {
converter(Source *s) {}
template operator D*();
  };

  Dest* f(Source* s) { return converter(s); }

when compiled by several different versions of g++ (I tried 4.3.4 and 4.4.1;
the person who asked me to reduce the test case found the problem on OSX with
some version of Apple-modified gcc 4.2), produces this symbol name for the
conversion operator:

  _ZN9convertercvPT_I4DestEEv

c++filt from binutils 2.19 will not demangle this symbol.  I have not been able
to figure out for certain whether this is an incorrect mangling, a demangler
bug, or both.  These variations demangle to close approximations of the right
thing:

  _ZN9convertercvT_IN4DestEEEv -> converter::operator Dest()
  _ZN9convertercvT_IPN4DestEEEv -> converter::operator Dest*()

but 

  _ZN9convertercvPT_IN4DestEEEv

does not demangle, either.


-- 
   Summary: Templated conversion operator produces symbol name that
won't demangle
   Product: gcc
   Version: 4.4.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: zackw at panix dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41233



[Bug c/47623] New: false *negative* uninitialized warning

2011-02-06 Thread zackw at panix dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47623

   Summary: false *negative* uninitialized warning
   Product: gcc
   Version: 4.5.2
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: za...@panix.com


Consider

int foo(int x)
{
   int a;
   if (x) return a;
   return 0;
}

This function should trigger the "‘a’ may be used uninitialized" warning when
compiled with -O -Wall, but it doesn't; there are no diagnostics at all.  The
assembly output (x86-64) is just

foo:
movl$0, %eax
ret

Looking at optimization dumps, this is what we have right before CCP1:

foo (int x)
{
  int a;

:
  if (x_2(D) != 0)
goto ;
  else
goto ;

:
  a_4 = a_3(D);
  goto ;

:
  a_5 = 0;

:
  # a_1 = PHI 
  return a_1;
}

but right *after* CCP1, we have instead

:
  # a_1 = PHI 
  return 0;

-- I presume that CCP notices that one of the inputs to the PHI is undefined
and ignores that possibility under the "well, if that happened, there would be
undefined behavior, so we can assume that can't happen" principle.  CDDCE1 then
erases the remainder of the code and we're left with

foo (int x)
{
:
  return 0;
}

which has no uninitialized variable accesses, of course.  I'm not sure exactly
where uninitialized warnings happen these days, but clearly it's after this
point.

As far as a fix, I recommend either that any pass that can make use of the
"that variable is uninitialized so we're going to ignore the possibility of
that control path" principle should emit -Wuninitialized warnings, or (my
personal preference) we scrap the notion of doing -Wuninitialized from inside
the optimizers, and switch to something predictable, like Java's definite
assignment rules.


[Bug c++/47723] internal compiler: Segmentation fault - gcc 4.5.2 Arch Linux

2011-02-13 Thread zackw at panix dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47723

--- Comment #1 from Zack Weinberg  2011-02-13 23:27:10 
UTC ---
> I have run the build a second time and it makes it past the point
> where it segfaulted, 

This means your problem is almost certainly *not* a bug in GCC, but rather a
hardware fault: most commonly a bad RAM module.  Please run a memory tester
against your computer (a good one is at http://www.memtest.org/ ) -- you may
have to run it for several hours for definitive results.


[Bug tree-optimization/57315] New: LTO and/or vectorizer performance regression on salsa20 core, 4.7->4.8

2013-05-17 Thread zackw at panix dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57315

Bug ID: 57315
   Summary: LTO and/or vectorizer performance regression on
salsa20 core, 4.7->4.8
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zackw at panix dot com

I'm seeing a significant performance regression from 4.7 to 4.8 (targeting
x86-64) on the "salsa20" core function (this is a stream cipher).  Repro
instructions:

$ git clone git://github.com/zackw/rngstats.git
# ...

$ make -s cipher-test CC=gcc-4.7 && ./cipher-test >&/dev/null && ./cipher-test
KAT:   aes128... ok
KAT:   aes256... ok
KAT: arc4... ok
KAT:  isaac64... ok
KAT:  salsa20_128... ok
KAT:  salsa20_256... ok
TIME:  aes128... 2000 keys,   3.47834s ->  574.987 keys/s
TIME:  aes256... 2000 keys,   3.62452s ->  551.797 keys/s
TIME:arc4... 2000 keys,   2.21746s ->  901.933 keys/s
TIME: isaac64... 2000 keys,   2.03467s ->  982.962 keys/s
TIME: salsa20_128... 2000 keys,   2.31960s ->  862.217 keys/s
TIME: salsa20_256... 2000 keys,   2.31932s ->  862.320 keys/s

$ make -s clean cipher-test CC=gcc-4.8 && ./cipher-test >&/dev/null &&
./cipher-test
KAT:   aes128... ok
KAT:   aes256... ok
KAT: arc4... ok
KAT:  isaac64... ok
KAT:  salsa20_128... ok
KAT:  salsa20_256... ok
TIME:  aes128... 2000 keys,   2.49224s ->  802.491 keys/s
TIME:  aes256... 2000 keys,   3.62372s ->  551.919 keys/s
TIME:arc4... 2000 keys,   2.22794s ->  897.689 keys/s
TIME: isaac64... 2000 keys,   2.05087s ->  975.194 keys/s
TIME: salsa20_128... 2000 keys,   3.53085s ->  566.436 keys/s
TIME: salsa20_256... 2000 keys,   2.53003s ->  790.505 keys/s

The regression shows in the last two TIME: lines for each build.  The relevant
code is probably in ciphers/salsa20.c, or else in worker.c.

Note that there are other programs in this repository, and they require unusual
libraries to build.  I recommend you do not attempt a "make all", and if you
get errors, try commenting out the CFLAGS.mpi and LIBS.mpi lines in the
Makefile.


[Bug tree-optimization/57315] LTO and/or vectorizer performance regression on salsa20 core, 4.7->4.8

2013-05-28 Thread zackw at panix dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57315

--- Comment #2 from Zack Weinberg  ---
Created attachment 30210
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30210&action=edit
self-contained test case

Here's a self-contained test case.

$ gcc-4.7 -std=c99 -O2 -march=native salsa20-regr.c && ./a.out
 875.178 keys/s
$ gcc-4.8 -std=c99 -O2 -march=native salsa20-regr.c && ./a.out
 808.869 keys/s

$ gcc-4.7 -std=c99 -O3 -march=native salsa20-regr.c && ./a.out
 867.879 keys/s
$ gcc-4.8 -std=c99 -O3 -march=native salsa20-regr.c && ./a.out
 800.794 keys/s

$ gcc-4.7 -std=c99 -O3 -fwhole-program -march=native salsa20-regr.c && ./a.out 
 606.605 keys/s
$ gcc-4.8 -std=c99 -O3 -fwhole-program -march=native salsa20-regr.c && ./a.out 
 571.935 keys/s

These numbers are stable to within about 1 key/s.  So there's a 6-8% regression
from 4.7 to 4.8 regardless of optimization level, but also -O3 and -O3
-fwhole-program are inferior to -O2 for this program, with both compilers. 
(-O2 -fwhole-program is within noise of just -O2 for both.)

With 4.8, -march=native on my computer expands to

-march=corei7-avx -mcx16 -msahf -mno-movbe -maes -mpclmul -mpopcnt -mno-abm
-mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-bmi2 -mno-tbm -mavx
-mno-avx2 -msse4.2 -msse4.1 -mno-lzcnt -mno-rtm -mno-hle -mno-rdrnd -mno-f16c
-mno-fsgsbase -mno-rdseed -mno-prfchw -mno-adx -mfxsr -mxsave -mxsaveopt
--param l1-cache-size=0 --param l1-cache-line-size=0 --param l2-cache-size=256
-mtune=corei7-avx


[Bug c++/50436] New: Crash or hang on invalid template code

2011-09-16 Thread zackw at panix dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50436

 Bug #: 50436
   Summary: Crash or hang on invalid template code
Classification: Unclassified
   Product: gcc
   Version: 4.6.1
Status: UNCONFIRMED
  Keywords: ice-on-invalid-code
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: za...@panix.com


This (invalid) test case causes gcc 4.6.1 to crash (segmentation fault). 
Variations cause it to go into an infinite loop instead.  I tripped over the
infinite loop while delta-minimizing a much, much larger test case for an
unrelated problem.

template  struct VI {};
template 
struct IP
{
  static const bool r = IP::r;
};
template  struct V
{
  VI::r> vi;
};
struct X;
struct Y
{
  V v;
};


[Bug c++/50436] Crash or hang on invalid template code

2011-09-16 Thread zackw at panix dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50436

--- Comment #1 from Zack Weinberg  2011-09-16 19:13:13 
UTC ---
Here's a variant that hangs.

template  struct VI {};
template 
struct IP
{
  static const bool r = IP::r;
};
template 
struct V
{
  static const bool r = IP::r;
  VI vi;
};
struct X;
struct Y
{
  V v;
}


[Bug c++/50442] New: Constructing T from implicit conversion to T& ambiguous in C++0x mode, not C++98

2011-09-16 Thread zackw at panix dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50442

 Bug #: 50442
   Summary: Constructing T from implicit conversion to T&
ambiguous in C++0x mode, not C++98
Classification: Unclassified
   Product: gcc
   Version: 4.6.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: za...@panix.com


This test case ...

template  struct MoveRef { operator T& () {} };
template  MoveRef  Move(T&) {}
struct Thing {};
Thing foo(const Thing* p) { return Thing(Move(*p)); }

... generates these diagnostics from g++ 4.6.1 in c++0x/gnu++0x mode, but not
in c++98/gnu++98 mode:

t.cc: In function ‘Thing foo(const Thing*)’:
t.cc:4:50: error: call of overloaded ‘Thing(MoveRef)’ is ambiguous
t.cc:4:50: note: candidates are:
t.cc:3:8: note: constexpr Thing::Thing(const Thing&)
t.cc:3:8: note: constexpr Thing::Thing(Thing&&)

This is a regression from g++ 4.5.x and earlier.


[Bug c++/50595] New: template overload resolution insufficiently sensitive to name dependency?

2011-10-02 Thread zackw at panix dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50595

 Bug #: 50595
   Summary: template overload resolution insufficiently sensitive
to name dependency?
Classification: Unclassified
   Product: gcc
   Version: 4.6.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: za...@panix.com


Created attachment 25401
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25401
test case

Per http://stackoverflow.com/questions/7630806 the expected output of the
attached test case should be

f(char): 1
f(int):  T(1)
f(int):  t
f(char): 1
f(char): T(1)
f(char): t

since the second and third calls to f() in g() have arguments that are
template-dependent, so resolution of *those* function calls (but not the first
call) should be deferred until the point of instantiation (h()), at which time
both f(int) and f(char) are visible. What actually happens with g++ 4.6 is

f(char): 1
f(char): T(1)
f(char): t
f(char): 1
f(char): T(1)
f(char): t

I'm not 100% convinced by the argument above, but I am convinced enough to see
what y'all think.


[Bug driver/30972] Call to _access has invalid parameter when linked with msvcrt (for vista)

2007-05-10 Thread zackw at panix dot com


--- Comment #3 from zackw at panix dot com  2007-05-10 08:36 ---
What the heck are we doing calling access() in the first place?  We should just
go ahead and try to execute .../cc1.exe for all relevant paths.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30972



[Bug libstdc++/31906] "-Xcompiler" is inserted after "-Xlinker" when building libstdc++

2007-05-11 Thread zackw at panix dot com


--- Comment #3 from zackw at panix dot com  2007-05-12 07:34 ---
I believe Mike means to suggest you use -Wl,yada,yada,yada instead of -Xlinker
yada -Xlinker yada -Xlinker yada.  (I also suspect this will help.)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31906



[Bug c++/17422] Duplicate work, missed optimizations for registration of static destructors

2007-06-05 Thread zackw at panix dot com


--- Comment #4 from zackw at panix dot com  2007-06-06 00:41 ---
Subject: Re:  Duplicate work, missed optimizations for registration of static
destructors

It's better, but build_cleanup is still being called twice in the
non-__cxa_atexit case, and I can't tell whether there is any
extremely-short-lived garbage still in the __cxa_atexit case.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17422



[Bug middle-end/32441] [4.3 Regression] ICE in expand_expr_real_1, at expr.c:7109

2007-07-06 Thread zackw at panix dot com


--- Comment #9 from zackw at panix dot com  2007-07-06 17:59 ---
Ian Taylor and I found a simpler patch that does not use langhooks, and also
has the virtue of working. ;-)  Committed as rev 126424.


-- 

zackw at panix dot com changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32441



[Bug target/32659] New: abort during reload on (subreg:PSI (reg:DI ...) 4)

2007-07-06 Thread zackw at panix dot com
The attached test case produces this RTL sequence after lreg, when compiled at
-O2 -mcpu=m32cm:

(insn 12 11 13 2 ba-bug.ii:24 (set (reg:DI 26 [ p.2 ])
(mem/s:DI (reg/f:PSI 31) [3 S8 A8])) 176 {movdi_splittable}
(expr_list:R
EG_DEAD (reg/f:PSI 31)
(nil)))

(insn 13 12 14 2 ba-bug.ii:24 (set (reg:PSI 25 [ p$second ])
(subreg:PSI (reg:DI 26 [ p.2 ]) 4)) 172 {movpsi_op} (nil))

[For reference, in -mcpu=m32cm mode, this target uses PSImode for 24-bit
pointers, which appear to be being padded to 32 bits in memory.]

During processing of insn 13, reload demands a MODE_PARTIAL_INT mode with at
least 64 bits.  There is no such mode, and the compiler aborts in
smallest_mode_for_size.

The test case is simplified from libstdc++ bitmap_alloc.cc.  It is extremely
fragile; any further simplification and the problem RTL sequence will be
eliminated by earlier optimizers.


-- 
   Summary: abort during reload on (subreg:PSI (reg:DI ...) 4)
   Product: gcc
   Version: 4.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: zackw at panix dot com
GCC target triplet: m32c-unknown-elf


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32659



[Bug target/32659] abort during reload on (subreg:PSI (reg:DI ...) 4)

2007-07-06 Thread zackw at panix dot com


--- Comment #1 from zackw at panix dot com  2007-07-06 22:12 ---
Created an attachment (id=13858)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13858&action=view)
test case


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32659



[Bug target/32659] abort during reload on (subreg:PSI (reg:DI ...) 4)

2007-07-06 Thread zackw at panix dot com


--- Comment #2 from zackw at panix dot com  2007-07-07 00:05 ---


*** This bug has been marked as a duplicate of 32656 ***


-- 

zackw at panix dot com changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution||DUPLICATE


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32659



[Bug middle-end/32656] [4.3 regression] m32c: ICE in smallest_mode_for_size, at stor-layout.c:220

2007-07-06 Thread zackw at panix dot com


--- Comment #2 from zackw at panix dot com  2007-07-07 00:05 ---
*** Bug 32659 has been marked as a duplicate of this bug. ***


-- 

zackw at panix dot com changed:

   What|Removed |Added

 CC||zackw at panix dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32656



[Bug middle-end/32656] [4.3 regression] m32c: ICE in smallest_mode_for_size, at stor-layout.c:220

2007-07-06 Thread zackw at panix dot com


--- Comment #3 from zackw at panix dot com  2007-07-07 00:06 ---
Created an attachment (id=13859)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13859&action=view)
Minimized test case

Here's a minimal test case (from my duplicate report of this, #32659).  I
analyzed it a bit there - the important thing to know is that reload wants a
MODE_PARTIAL_INT with 64 bits; as the target has no such mode, we abort.  I
think reload wants this in order to deal with a (subreg:PSI (reg:DI))
construct.


-- 

zackw at panix dot com changed:

   What|Removed |Added

  Attachment #13856|0   |1
is obsolete||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32656



[Bug middle-end/32656] [4.3 regression] m32c: ICE in smallest_mode_for_size, at stor-layout.c:220

2007-07-06 Thread zackw at panix dot com


--- Comment #4 from zackw at panix dot com  2007-07-07 00:07 ---
Adding DJ to cc: list, confirming.


-- 

zackw at panix dot com changed:

   What|Removed |Added

 CC||dj at redhat dot com
 Status|UNCONFIRMED |NEW
 Ever Confirmed|0   |1
   Last reconfirmed|-00-00 00:00:00 |2007-07-07 00:07:50
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32656



[Bug middle-end/32656] [4.3 regression] m32c: ICE in smallest_mode_for_size, at stor-layout.c:220

2007-07-06 Thread zackw at panix dot com


--- Comment #5 from zackw at panix dot com  2007-07-07 00:15 ---
Created an attachment (id=13860)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13860&action=view)
Even smaller test case


-- 

zackw at panix dot com changed:

   What|Removed |Added

  Attachment #13859|0   |1
is obsolete||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32656



[Bug middle-end/32656] [4.3 regression] m32c: ICE in smallest_mode_for_size, at stor-layout.c:220

2007-07-06 Thread zackw at panix dot com


--- Comment #6 from zackw at panix dot com  2007-07-07 00:20 ---
Created an attachment (id=13861)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13861&action=view)
and smaller still

I can't believe I didn't think of these reductions three hours ago...


-- 

zackw at panix dot com changed:

   What|Removed |Added

  Attachment #13860|0   |1
is obsolete||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32656



[Bug c/88576] -fno-math-errno causes GCC to consider that malloc does not set errno

2018-12-22 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88576

Zack Weinberg  changed:

   What|Removed |Added

 CC||zackw at panix dot com

--- Comment #5 from Zack Weinberg  ---
The C standard doesn't say malloc _will_ set errno on failure, but it also
doesn't say it _won't_, and all library functions are allowed to clobber the
value of errno unless it is specifically documented that they won't (N1570 7.5
[Errors], para 3, last sentence).

In any case, an option named -fno-math-errno has no business affecting the
treatment of functions that have nothing to do with mathematics.

[Bug c/88576] -fno-math-errno causes GCC to consider that malloc does not set errno

2018-12-22 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88576

Zack Weinberg  changed:

   What|Removed |Added

 CC||rguenther at suse dot de

--- Comment #8 from Zack Weinberg  ---
(In reply to Aurelien Jarno from comment #7)
> (In reply to Zack Weinberg from comment #5)
> > The C standard doesn't say malloc _will_ set errno on failure, but it also
> 
> Well at least POSIX says:
> Otherwise, it shall return a null pointer and set errno to indicate the
> error. 

Yeah, I wasn't denying that, I was responding to Andrew taking the attitude
that this was fine because ISO C proper _didn't_ say that.

I dug into the code a little and it looks like this was an intentional re-use
of -fno-math-errno to also mean "and neither will malloc", in the patch for PR 
42944.  I think that's wrong, but perhaps Richard Biener would like to argue
otherwise...

[Bug c/88576] -fno-math-errno causes GCC to consider that malloc does not set errno

2018-12-22 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88576

--- Comment #9 from Zack Weinberg  ---
... whoops, hit send a little too early.  AFAICT, the relevant code is
call_may_clobber_ref_p_1 in tree-ssa-alias.c; I would say that the uses of
flag_errno_math under the cases BUILT_IN_MALLOC, ALIGNED_ALLOC, CALLOC, STRDUP,
STRNDUP, POSIX_MEMALIGN, REALLOC in that function are all wrong, and GCC should
unconditionally assume errno may be clobbered by those builtins.  If this
behavior is felt to be valuable, it should get its own -f switch.

(The uses of flag_errno_math under BUILT_IN_GAMMA*, LGAMMA*, and REMQUO* are
appropriate, though, as those _are_ math functions.)

[Bug middle-end/88576] -fno-math-errno causes GCC to consider that malloc does not set errno

2019-01-10 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88576

--- Comment #14 from Zack Weinberg  ---
I don't see why it would _ever_ make sense for -fno-alloc-errno to default to
the setting of -fno-math-errno.  The math functions and the memory allocation
functions are independent components of the C library.  Each toggle's default
should be settable independently by the target configuration.

[Bug c/3885] Incorrect "invalid suffix on integer constant" error

2018-04-11 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3885

Zack Weinberg  changed:

   What|Removed |Added

 CC||zackw at panix dot com

--- Comment #14 from Zack Weinberg  ---
I think we should reopen this bug and treat it as a request to improve the
diagnostics.

GCC is correct to reject tokens of the form "0x000E-0x0001", but the
accumulation of duplicate bug reports suggests that we could do a better job of
explaining what is wrong.  Proposal: whenever we are about to issue the
"invalid suffix on (integer/floating) constant" error, if the first character
of the "invalid suffix" is + or -, we say instead something like

error: invalid numeric constant ""
note: ('+'/'-') after ('e'/'p'/'E'/'P') continues a pp-number
note: to perform (addition/subtraction), put a space before the ('+'/'-')

Printing the entire pp-number token instead of just the "suffix", mentioning
the standardese term "pp-number", and calling the token an "invalid numeric
constant" instead of an "invalid integer/floating constant" should clue people
in to _why_ this apparently perverse tokenization is happening.

[Bug rtl-optimization/86028] New: Dead stores created by va_start/va_arg are not fully cleaned up

2018-06-01 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86028

Bug ID: 86028
   Summary: Dead stores created by va_start/va_arg are not fully
cleaned up
   Product: gcc
   Version: 8.1.0
Status: UNCONFIRMED
  Severity: minor
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zackw at panix dot com
  Target Milestone: ---

On any ABI where arguments to a variadic function are passed in the same places
that they would be if they were arguments to a non-variadic function, it should
be possible to optimize 'foo_wrapper' in the following test program all the way
down to a tail-call to 'foo' and nothing else:

#include 

extern int a;
extern int b;
extern void *c;

int __attribute__((noinline))
foo(int x, int y, void *z)
{
  a = x;
  b = y;
  c = z;
  return 0;
}

int
foo_wrapper(int x, int y, ...)
{
  va_list ap;
  void *z;
  va_start(ap, y);
  z = va_arg(ap, void *);
  va_end(ap);
  return foo(x, y, z);
}

('foo' is included in this test program so that one can easily verify that no
argument shuffling is needed.)

gcc-8.1 targeting x86-64-linux, x86-32-linux, or aarch64-linux (all of which
meet the above ABI requirement) does not manage to do this.  It actually does
the best job for x86-32, where everything is on the stack:

foo_wrapper:
pushl   12(%esp)
pushl   12(%esp)
pushl   12(%esp)
callfoo
addl$12, %esp
ret

This is literally duplicating 'foo_wrapper's incoming arguments into a new
frame in order to call 'foo'.  The instructions are unnecessary, but they are
not dead in the formal sense.   Perhaps the issue here is just that variadic
functions aren't being considered for sibcall optimization?

For the targets where arguments are passed in registers, the code generation is
worse, e.g. aarch64:

foo_wrapper:
stp x29, x30, [sp, -64]!
add x3, sp, 48
add x4, sp, 64
mov x29, sp
stp x4, x4, [sp, 16]
str x3, [sp, 32]
stp wzr, wzr, [sp, 40]
str x2, [sp, 56]
bl  foo
ldp x29, x30, [sp], 64
ret

The actual arguments to foo_wrapper are in x0, x1, and x2, and that's also
where foo wants them, and they aren't touched at all.  All of the computation
done here is dead.

(I noticed this while messing around with glibc's syscall wrappers, which
really do things like this.  'foo_wrapper' has the type signature of 'fcntl',
for instance.)

[Bug c/91554] New: if (!__builtin_constant_p (x)) warning_function() works in inline when x is int, not when x is void *

2019-08-26 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91554

Bug ID: 91554
   Summary: if (!__builtin_constant_p (x)) warning_function()
works in inline when x is int, not when x is void *
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zackw at panix dot com
  Target Milestone: ---

This snippet

```
extern void thefun_called_with_nonnull_arg (void)
__attribute__((__warning__(
"'thefun' called with second argument not NULL")));

extern int real_thefun (void *, void *);

static inline int
thefun (void *a, void *b)
{
   if (!__builtin_constant_p(b) || b != 0)
   thefun_called_with_nonnull_arg();
   return real_thefun(a, b);
}

int warning_expected (void *a, void *b)
{
return thefun(a, b);
}
int warning_not_expected (void *a)
{
return thefun(a, 0);
}
```

generates warnings from _both_ `warning_expected` and `warning_not_expected`,
on all versions of GCC I can conveniently test (see
https://godbolt.org/z/V-FHtZ ).  If I change the type of `b` to be `int`
throughout, or if I convert the static inline to a macro

```
#define thefun(a, b) \
  (((!__builtin_constant_p(b) || (b) != 0) \
? thefun_called_with_nonnull_arg() \
: (void) 0),   \
   real_thefun(a, b))
```

then I get a warning only for `warning_expected`, which is the behavior I want.

It seems to me that whether or not you can use `__builtin_constant_p` to guard
a call to a function declared with attribute warning, should not depend on the
type of __builtin_constant_p's argument.

[Bug c/91554] if (!__builtin_constant_p (x)) warning_function() works in inline when x is int, not when x is void *

2019-08-27 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91554

--- Comment #2 from Zack Weinberg  ---
Additional fun detail:

```
static inline int
thefun (void *a, void *b)
{
   if (!__builtin_constant_p((__UINTPTR_TYPE__)b) || b != 0)
   thefun_called_with_nonnull_arg();
   return real_thefun(a, b);
}
```

still warns for any use of `thefun`, but

```
static inline int
thefun (void *a, void *b)
{
   if (!__builtin_constant_p((short)(__UINTPTR_TYPE__)b) || b != 0)
   thefun_called_with_nonnull_arg();
   return real_thefun(a, b);
}
```

works as intended!  `(int)(__UINTPTR_TYPE__)` also works as intended on targets
where __UINTPTR_TYPE__ is bigger than int.

https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/builtins.c;h=f902e246f1f347be4d4dc04e339fa865393039fe#l8462
looks suspicious to me.  Note also the STRIP_NOPS shortly above, which might
explain why it matters whether the pointer is cast to a different-sized integer
type.

[Bug c/91554] if (!__builtin_constant_p (fn_arg)) warning_function() works in inline when fn_arg is int, not when it is void *

2019-08-27 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91554

--- Comment #4 from Zack Weinberg  ---
(In reply to Richard Biener from comment #3)
> I guess you want to use
> 
> __builtin_constant_p (b != 0)
> 
> instead.

That wouldn't do what I want.  The goal is to warn for any argument _other
than_ a compile-time null pointer.  `!__builtin_constant_p(b) || b != 0` does
just that (it might be easier to understand the De Morgan equivalent
`!(__builtin_constant_p(b) && b == 0)`.  This is in aid of deprecating the
second argument to gettimeofday (see
https://sourceware.org/git/?p=glibc.git;a=blob;f=NEWS;h=75453bbda865c7d51df39177caef40b16e086dcf#l53
and
https://sourceware.org/git/?p=glibc.git;a=blob;f=manual/time.texi;h=cb234bd08fae1841034a2bdccf4e1d246be23034#l557
).

> The docs don't explain what a "constant at compile time" is
> so whether for example the address of a global or the address of an
> automatic var would be "constant".  But I'd say the above incorrectly
> disregards the NULL-pointer case.

It seems like this code pre-dates tree optimizations, I would suggest removing
these lines

8470   || AGGREGATE_TYPE_P (TREE_TYPE (arg))
8471   || POINTER_TYPE_P (TREE_TYPE (arg))

(and fixing the comment above to match) and seeing if that breaks anything.

[Bug preprocessor/80005] cpp expands __has_include() filename parts

2019-06-12 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80005

Zack Weinberg  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-06-12
 CC||zackw at panix dot com
 Ever confirmed|0   |1

--- Comment #1 from Zack Weinberg  ---
Confirming.  This has now bitten an attempt to use __has_include in glibc, see
discussion starting at
https://sourceware.org/ml/libc-alpha/2019-06/msg00254.html .

The proper fix, IMO, would be to apply the same special tokenization rules to
the argument of __has_include that are used for the argument of #include. This
is not exactly the same as "don't macro-expand the argument."  Rather, in

  #if !__has_include()

the character sequence  should be processed as a single
"h-char-sequence" token, in the language of C99 6.10.2.  So there never is a
"linux" identifier token to macro-expand (nor is there a "memfd" or "h"
identifier).

However, if we instead have

  #define HEADER_TO_LOOK_FOR "special.h"
  #if __has_include (HEADER_TO_LOOK_FOR)

then HEADER_TO_LOOK_FOR should be macro-expanded.

Also, h-char-sequences and q-char-sequences are not string literals.  For
example, if we have

  #if __has_include("X11\Xlib.h")

the \X is _not_ an ill-formed escape sequence.

[Bug c++/90885] GCC should warn about 2^16 and 2^32 and 2^64

2019-06-14 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90885

--- Comment #13 from Zack Weinberg  ---
Since examples of this error were observed with base 10, I think the warning
should cover 10^i for decimal literal i, too.

Relatedly, “note: ^ performs exclusive or, not exponentiation” might be a nice
addition to the existing error for ^ with a float for either operand.

[Bug lto/48200] Implement function attribute for symbol versioning (.symver)

2018-08-06 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48200

Zack Weinberg  changed:

   What|Removed |Added

 CC||zackw at panix dot com

--- Comment #26 from Zack Weinberg  ---
/subscribe

I've tripped over this problem myself in the context of the new
password-hashing library, libxcrypt (see
https://github.com/besser82/libxcrypt/issues/24 )  Our symbol-versioning
techniques are lifted from glibc (see
https://github.com/besser82/libxcrypt/blob/aae4c1baea534d2e4c9dfe2faf42ee0c5f7a6f22/crypt-port.h#L122
).  Function attributes seem like a good replacement to me, although I'd ask
that people think about how a library author might write macros that will
seamlessly fall back to the older technique.

[Bug tree-optimization/28364] poor optimization choices when iterating over a std::string (probably not c++-specific)

2018-03-02 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28364

--- Comment #30 from Zack Weinberg  ---
It's been a very long time and I don't know exactly what changed, but GCC 7.3
generates essentially the same code for both of the functions in the "C test
case" and I would not describe that code as "bad".

I can still make it duplicate the entire body of the loop by relatively small
tweaks, though.  For instance, 

int has_bad_chars(char *str, __SIZE_TYPE__ len)
{
  for (char *c = str; c < str + len; c++)
{
  unsigned char x = (unsigned char)(*c);
  if (x <= 0x1f || x == 0x5c || x == 0x7f)
return 1;
}
  return 0;
}

generates significantly worse code (doubling cache footprint for no gain in
branch predictability or any other metric) with -O2 than -Os.

Also, the code generated for the body of the loop (with both the original test
case and the above) is more complicated than it needs to be, but perhaps that
should be a new bug report.

[Bug tree-optimization/84673] New: Overcomplicated code generation for a chain of mutually exclusive conditions

2018-03-02 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84673

Bug ID: 84673
   Summary: Overcomplicated code generation for a chain of
mutually exclusive conditions
   Product: gcc
   Version: 7.3.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zackw at panix dot com
  Target Milestone: ---

This function

int has_bad_chars(unsigned char *str, __SIZE_TYPE__ len)
{
  for (unsigned char *c = str; c < str + len; c++)
{
  unsigned char x = *c;
  if (__builtin_expect (x <= 0x1f || x == 0x5c || x == 0x7f,
0))
return 1;
}
  return 0;
}

compiles with GCC 7.3.1 at -Os -march=native on a current-generation x86-64 to

has_bad_chars:
addq%rdi, %rsi
.L2:
cmpq%rsi, %rdi
jnb .L7
movb(%rdi), %al
cmpb$31, %al
setbe   %cl
cmpb$92, %al
sete%dl
orb %dl, %cl
jne .L5
cmpb$127, %al
je  .L5
incq%rdi
jmp .L2
.L7:
xorl%eax, %eax
ret
.L5:
movl$1, %eax
ret

It is six bytes shorter, and also I think more efficient, to generate this
instead:

has_bad_chars:
.LFB0:
.cfi_startproc
addq%rdi, %rsi
.L2:
cmpq%rsi, %rdi
jnb .L7
movb(%rdi), %al
cmpb$31, %al
jbe .L5
cmpb$92, %al
je  .L5
cmpb$127, %al
je  .L5
incq%rdi
jmp .L2
.L7:
xorl%eax, %eax
ret
.L5:
movl$1, %eax
ret

The same thing happens at -O2, but also a chunk of the loop body gets
pointlessly duplicated above the loop (it looks like it tried to unroll the
loop and got stuck halfway).

[Bug target/84772] New: powerpc-spe: Spurious "is used uninitialized" warning, or possibly incorrect codegen for va_arg(long double)

2018-03-08 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84772

Bug ID: 84772
   Summary: powerpc-spe: Spurious "is used uninitialized" warning,
or possibly incorrect codegen for va_arg(long double)
   Product: gcc
   Version: 7.3.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zackw at panix dot com
  Target Milestone: ---

The following test case is a mechanical reduction from the implementation of
vfprintf in GNU libc:

extern void *memset(void *, int, __SIZE_TYPE__);
void f(int *a, __builtin_va_list b)
{
  memset(a, 2, sizeof(int));
  for (;;)
__builtin_va_arg(b, long double);
}

I get a spurious uninitialized value warning on the __builtin_va_arg line with
a powerpc-spe compiler, but not with the exact same build targeting a different
architecture -- not even a different powerpc variant:

$ powerpc-glibc-linux-gnuspe-gcc -S -O -Wall -Werror vfp_min3.c; echo $?
vfp_min3.c: In function ‘f’:
vfp_min3.c:6:25: error: ‘va_arg_tmp.3’ is used uninitialized in this function
[-Werror=uninitialized]
 __builtin_va_arg(b, long double);
 ^~~~
cc1: all warnings being treated as errors
1
$ powerpc-glibc-linux-gnu-gcc -S -O -Wall -Werror vfp_min3.c; echo $?
0

$ powerpc-glibc-linux-gnuspe-gcc --version
powerpc-glibc-linux-gnuspe-gcc (GCC) 7.3.1 20180307 [gcc-7-branch revision
258338]
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ powerpc-glibc-linux-gnu-gcc --version
powerpc-glibc-linux-gnu-gcc (GCC) 7.3.1 20180307 [gcc-7-branch revision 258338]
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

This is an extra frustrating spurious-uninitialized bug because it's
complaining about a variable that doesn't even exist in the source code -- I
guess it's a  temporary used inside the expansion of __builtin_va_arg?  So it
might not even be *spurious*, I suppose, it might be telling me that that
expansion is wrong!

[Bug target/84772] powerpc-spe: Spurious "is used uninitialized" warning, or possibly incorrect codegen for va_arg(long double)

2018-03-09 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84772

--- Comment #2 from Zack Weinberg  ---
You don't appear to have the exact same build as me.  But there's something
fishy going on with that, because as far as I can tell, SVN rev 258338 is a
*trunk* revision, not a gcc-7-branch revision.

Anyway, I'm gonna attach a .stdarg dump from the compiler that does trigger the
problem, maybe comparing that to the .stdarg dump you get will help?

[Bug target/84772] powerpc-spe: Spurious "is used uninitialized" warning, or possibly incorrect codegen for va_arg(long double)

2018-03-09 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84772

--- Comment #3 from Zack Weinberg  ---
Created attachment 43607
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43607&action=edit
-fdump-tree-stdarg output for test case from problem compiler

I do not entirely understand how to read this (what do the _123(D) suffixes on
everything mean?) but right after  there sure does appear to be a read of
the variable va_arg_tmp.3, which was introduced by this pass, and which does
not appear to have been previously initialized.

[Bug target/84772] powerpc-spe: Spurious "is used uninitialized" warning, or possibly incorrect codegen for va_arg(long double)

2018-03-09 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84772

--- Comment #4 from Zack Weinberg  ---
The last actual _change_ in my GCC build is

r258315 | denisc | 2018-03-07 04:13:12 -0500 (Wed, 07 Mar 2018) | 9 lines

Backport from mainline
2018-02-07  Georg-Johann Lay  

PR target/84209
* config/avr/avr.h (GENERAL_REGNO_P, GENERAL_REG_P): New macros.
* config/avr/avr.md: Only post-reload split REG-REG moves if
either register is GENERAL_REG_P.

[Bug target/84772] powerpc-spe: Spurious "is used uninitialized" warning, or possibly incorrect codegen for va_arg(long double)

2018-03-09 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84772

--- Comment #7 from Zack Weinberg  ---
I no longer remember enough about GIMPLE to comment on your actual proposed
fix, but I do have a small nitpick on the test case:

+va_arg (ap, long double);  /* { dg-bogus "may be used
uninitialized in this function" } */  

I think that needs to be just { dg-bogus "used uninitialized" }, as you might
get either the "may be used initialized" or the "is used uninitialized"
message.

[Bug libstdc++/56166] New: std::string::clear() can allocate memory despite being marked noexcept

2013-01-31 Thread zackw at panix dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56166



 Bug #: 56166

   Summary: std::string::clear() can allocate memory despite being

marked noexcept

Classification: Unclassified

   Product: gcc

   Version: 4.7.2

Status: UNCONFIRMED

  Severity: normal

  Priority: P3

 Component: libstdc++

AssignedTo: unassig...@gcc.gnu.org

ReportedBy: za...@panix.com





The attached test case (which is regrettably large, but I haven't found a

shorter way to trigger the problem) will call terminate() when compiled with

-std=c++11, despite there being no point at which an exception should escape

main().  The problem is deep inside std::basic_string:



/usr/include/c++/4.7/bits/basic_string.h

796   /**

797*  Erases the string, making it empty.

798*/

799   void

800   clear() _GLIBCXX_NOEXCEPT

801   { _M_mutate(0, this->size(), 0); }



_M_mutate can allocate memory even when it's being asked to erase the string

(presumably due to internal reference-counting -- if you take 's2' out of the

test case the crash goes away), and thus can throw bad_alloc.  In C++11 mode,

clear() is marked 'noexcept', so we wind up in terminate().


[Bug libstdc++/56166] std::string::clear() can throw despite being marked noexcept

2013-01-31 Thread zackw at panix dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56166



--- Comment #1 from Zack Weinberg  2013-01-31 18:21:49 
UTC ---

Created attachment 29320

  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29320

test case



In case you're wondering, this was an attempt to do at least *some* testing of

a library's robustness in the face of memory allocation failure.


[Bug libstdc++/56166] std::string::clear() can throw despite being marked noexcept

2013-01-31 Thread zackw at panix dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56166



--- Comment #4 from Zack Weinberg  2013-02-01 00:48:43 
UTC ---

Is vstring going to be promoted to std::string in the *near* future? it doesn't

seem done to me, eg there is no stringstream for it, and it appears to generate

bulkier code. (not scientifically benchmarked)



that said, it doesn't fail this test case, but I haven't checked how its

clear() works, so I'm not sure it has no bug here.


[Bug middle-end/48580] missed optimization: integer overflow checks

2013-02-02 Thread zackw at panix dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580



--- Comment #18 from Zack Weinberg  2013-02-02 21:59:37 
UTC ---

I find it a little disappointing that what should have been a straightforward

additional optimization has gotten totally derailed into bikeshedding of an

enormous class of builtins which seem unlikely ever to gain any traction.  I

just want the code in the original bug report to be optimized.  That would be

enough.


[Bug c++/70178] New: Loop-invariant memory loads from std::string innards are not hoisted

2016-03-10 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70178

Bug ID: 70178
   Summary: Loop-invariant memory loads from std::string innards
are not hoisted
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zackw at panix dot com
  Target Milestone: ---

Consider

#include 
#include 
#include 
using std::string;

inline unsigned char hexval(unsigned char c)
{
if (c >= '0' && c <= '9')
return c - '0';
else if (c >= 'A' && c <= 'F')
return c - 'A' + 10;
else if (c >= 'a' && c <= 'f')
return c - 'a' + 10;
else
throw "input character not a hexadecimal digit";
}

void hex2ascii_1(const string& in, string& out)
{
size_t inlen = in.length();
if (inlen % 2 != 0)
throw "input length not a multiple of 2";
out.clear();
out.reserve(inlen / 2);
for (string::const_iterator p = in.begin(); p != in.end(); p++)
{
   unsigned char c = hexval(*p);
   p++;
   c = (c << 4) + hexval(*p);
   out.push_back(c);
}
}

void hex2ascii_2(const string& in, string& out)
{
size_t inlen = in.length();
if (inlen % 2 != 0)
throw "input length not a multiple of 2";
out.clear();
out.reserve(inlen / 2);
std::transform(in.begin(), in.end() - 1, in.begin() + 1,
   std::back_inserter(out),
   [](unsigned char a, unsigned char b)
   { return (hexval(a) << 4) + hexval(b); });
}

It seems to me that both of these should be optimizable to the equivalent thing
you would write in C, with all the pointers in registers ...

void hex2ascii_hypothetical(const string& in, string& out)
{
size_t inlen = in.length();
if (inlen % 2 != 0)
throw "input length not a multiple of 2";
out.clear();
out.reserve(inlen / 2);

const unsigned char *p = in._M_data();
const unsigned char *limit = p + in._M_length();
unsigned char *q = out._M_data();
// (check for pointer wrap-around here?)

while (p < limit)
{
*q++ = (hexval(p[0]) << 4) + hexval(p[1]);
p += 2;
}
}

Maybe it wouldn't be able to deduce that capacity overflow is impossible by
construction, but that's a detail.  The important thing is that g++ 5 and 6
cannot hoist the pointer initializations out of the loop as shown.  They reload
p, limit, and q from memory (that is, from the relevant string objects) on
every iteration.

[Bug c++/70178] Loop-invariant memory loads from std::string innards are not hoisted

2016-03-11 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70178

--- Comment #2 from Zack Weinberg  ---
That was my working hypothesis as well.  Isn't there some way we can annotate
s->data to reassure the compiler that *this* char* doesn't alias?  I don't know
enough about the guts of std::string to know whether 'restrict' is accurate,
but it certainly shouldn't be pointing to anything but string contents.

[Bug c++/70178] Loop-invariant memory loads from std::string innards are not hoisted

2016-03-14 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70178

--- Comment #5 from Zack Weinberg  ---
It seems to me that a pair of extended integer types (signed and unsigned),
that are the same size as `char` but don't get the special TBAA treatment for
"character types", would be easier to implement and validate than figuring out
what "restrict" means when applied to a data member, and probably just as
effective in this context.  `std::string` doesn't expose bare writable pointers
so I *think* this has no compatibility implications.

Note that I do *not* think it would be safe for [u]int8_t to be (defined as)
those types.

[Bug preprocessor/71102] New: _Pragma("GCC warning ...") should concatenate string literals

2016-05-13 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71102

Bug ID: 71102
   Summary: _Pragma("GCC warning ...") should concatenate string
literals
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: preprocessor
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zackw at panix dot com
  Target Milestone: ---

Consider

#define warn1(msg) _Pragma (#msg)
#define warn(msg) warn1 (GCC warning msg)
#define lengthy_explanation(name) \
"This is a long explanation about why `" #name "' is deprecated."

#define SYMBOL warn(lengthy_explanation(SYMBOL)) 1

int main(void) { return SYMBOL; }

With gcc (5 and 6), the warning message printed is

test.c:9:13: warning: This is a long explanation about why `

which is obviously not what was wanted.  (The column position of the diagnostic
is also wrong, but that's secondary.)  One could theoretically work around this
by not using string literals in lengthy_explanation() but that risks blowing up
in one's face if any of the words in the explanation happen to be macros (and
also I'm not sure how I would get the quotation marks in there).

n.b. clang gets this right:

test.c:9:25: warning: This is a long explanation about why `SYMBOL' is
  deprecated. [-W#pragma-messages]

[Bug c/82134] New: warn_unused_result triggers on empty structs even when they are used

2017-09-07 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82134

Bug ID: 82134
   Summary: warn_unused_result triggers on empty structs even when
they are used
   Product: gcc
   Version: 7.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zackw at panix dot com
  Target Milestone: ---

If a function that returns an empty struct is tagged with attribute
warn_unused_result, any call will trigger the warning, even if the return value
is (nominally) used.  For instance

struct Empty {};

extern void use_empty(struct Empty);

__attribute__((warn_unused_result))
extern struct Empty make_empty(void);

void should_warn(void)
{
make_empty();
}

void shouldnt_warn_1(void)
{
use_empty(make_empty());
}

void shouldnt_warn_2(void)
{
struct Empty e = make_empty();
use_empty(e);
}

-->

test.c: In function ‘should_warn’:
test.c:10:5: warning: ignoring return value of ‘make_empty’, declared with
attribute warn_unused_result [-Wunused-result]
 make_empty();
 ^~~~
test.c: In function ‘shouldnt_warn_1’:
test.c:15:5: warning: ignoring return value of ‘make_empty’, declared with
attribute warn_unused_result [-Wunused-result]
 use_empty(make_empty());
 ^~~
test.c: In function ‘shouldnt_warn_2’:
test.c:20:22: warning: ignoring return value of ‘make_empty’, declared with
attribute warn_unused_result [-Wunused-result]
 struct Empty e = make_empty();
  ^~~~

with both GCC 6 and GCC 7.

(From here:
https://stackoverflow.com/questions/46096628/gcc-empty-structs-and-wunused-result)

[Bug c/82134] warn_unused_result triggers on empty structs even when they are used

2017-09-07 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82134

--- Comment #2 from Zack Weinberg  ---
The claim in the Stack Overflow post was that this was useful in a scenario
involving machine-generated code that couldn't return void for some external
reason, but they didn't go into any kind of detail.

It has always been my opinion that warnings in general should be made
optimization-independent, including early dead code optimizations like this,
with _possible_ carefully-defined-and-documented exceptions for `if (0) { /*
don't warn about stuff in here */ }` and similar.  But I realize that's a long
way from where GCC is or has ever been.

[Bug c/82752] Support %OB, %Ob strftime formats

2017-10-27 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82752

Zack Weinberg  changed:

   What|Removed |Added

 CC||zackw at panix dot com

--- Comment #1 from Zack Weinberg  ---
And in strptime, %OB, %Ob, and %Oh should be supported.

[Bug c/83236] New: "Did you mean" suggestions maybe shouldn't offer implementation-private names

2017-11-30 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83236

Bug ID: 83236
   Summary: "Did you mean" suggestions maybe shouldn't offer
implementation-private names
   Product: gcc
   Version: 7.2.0
Status: UNCONFIRMED
  Severity: minor
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zackw at panix dot com
  Target Milestone: ---

On Linux/glibc,

$ cat > test.c <
ino_t inode;
EOF
$ gcc -std=c89 -fsyntax-only test.c
test.c:2:1: error: unknown type name ‘ino_t’; did you mean ‘__ino_t’?
 ino_t inode;
 ^
 __ino_t
$ gcc --version
gcc (Debian 7.2.0-16) 7.2.0

__ino_t is an implementation detail of glibc's headers, and application
programmers shouldn't be encouraged to use it.

To first order, I don't think any implementation-namespace names (/^_[_A-Z]/)
should be offered as suggestions.  To second order, they could be suggested
when the unknown symbol is itself in the implementation namespace, or when the
suggestion is a _documented_ impl-namespace symbol like __DATE__ or _IOFBF or
_Bool.

[Bug c/83236] "Did you mean" suggestions maybe shouldn't offer implementation-private names

2017-12-01 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83236

--- Comment #3 from Zack Weinberg  ---
Maybe name_reserved_for_implementation_p should be a langhook?

[Bug c/83236] "Did you mean" suggestions maybe shouldn't offer implementation-private names

2017-12-04 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83236

--- Comment #5 from Zack Weinberg  ---
I was just thinking that other language front-ends might want to offer
spell-checking suggestions with their own rules for which names are/aren't
appropriate to suggest in context, but maybe we can worry about that when it
happens.

[Bug c/43691] Code segfault when compiled with -Os, -O2, or -O3

2010-04-08 Thread zackw at panix dot com


--- Comment #4 from zackw at panix dot com  2010-04-08 17:28 ---
(In reply to comment #0)
> When this testcase, using inline assembly, is compiled with -Os, -O2, or -O3
> it segfaults. -O0 and -O1 allow it to run correctly.
> 
> Moving the inline assembly into a separate file and including it in the
> compilation allow the program to run correctly at all -O levels.

>From these symptoms, it is practically certain that you have done something
wrong with the asm inputs and outputs.  I don't have an Alpha compiler to hand,
but just from looking at your code, I bet it will work correctly if you rewrite
it like so:

unsigned long rewritten(const unsigned long b[2]) {
unsigned long ofs, output;

asm(
"cmoveq %0,64,%1# ofs= (b[0] ? ofs : 64);\n"
"cmoveq %0,%2,%0# temp   = (b[0] ? b[0] : b[1]);\n"
"cttz   %0,%0   # output = cttz(temp);\n"
: "=r" (output), "=r" (ofs)
: "r" (b[1]), "0" (b[0]), "1" (0)
);
return output + ofs;
}

(I've assumed that the semantic of "cmoveq a,b,c" is "if (a==0) c=b;")

The trick with asm() is to do as little as possible.  I assume that the reason
the assembly version beats the pure-C version is the cmoveq's, so I stripped
the setup code and the addition.  This allows me to express the _real_ argument
constraints rather than fake ones, which lets me be confident that the
optimizers will do what you want.  Note that this also means "volatile" is
unnecessary.

As a general principle, if you find yourself writing an asm() with a big long
list of earlyclobber outputs but no inputs, you are doing it wrong.


-- 

zackw at panix dot com changed:

   What|Removed         |Added
--------
 CC||zackw at panix dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43691



[Bug c/43691] Code segfault when compiled with -Os, -O2, or -O3

2010-04-08 Thread zackw at panix dot com


-- 

zackw at panix dot com changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43691



[Bug driver/68038] New: "Internal compiler error: Killed: program cc1" should read "Virtual memory exhausted"

2015-10-20 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68038

Bug ID: 68038
   Summary: "Internal compiler error: Killed: program cc1" should
read "Virtual memory exhausted"
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: driver
  Assignee: unassigned at gcc dot gnu.org
      Reporter: zackw at panix dot com
  Target Milestone: ---

99+% of the time, when the compiler proper dies on signal 9, it's because we're
on Linux and the out-of-memory killer has killed the compiler proper because it
ate all available RAM.  But the error message you get when this happens
*sounds* like there's a bug in GCC.

I suggest the driver be changed to special-case SIGKILL and print a different
message, along the lines of

   gcc: compilation process (cc1) killed by SIGKILL
   This almost always means your computer does not have enough RAM to
   compile this program.  Try adding swap space or reducing the
   number of concurrent compile jobs.


[Bug c++/68265] New: Arbitrary syntactic nonsense silently accepted after 'int (*){}' until the next close brace

2015-11-09 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68265

Bug ID: 68265
   Summary: Arbitrary syntactic nonsense silently accepted after
'int (*){}' until the next close brace
   Product: gcc
   Version: 5.2.1
Status: UNCONFIRMED
  Keywords: accepts-invalid
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zackw at panix dot com
  Target Milestone: ---

(From https://stackoverflow.com/questions/33614455/ :)

The C++ compiler fails to diagnose ill-formed constructs such as

  int main()
  {
  int (*) {}
 any amount of syntactic nonsense
 on multiple lines, with *punctuation* and ++operators++ even...
 will be silently discarded
 until the next close brace
  }

With -pedantic -std=c++98 you do get "warning: extended initializer lists only
available with -std=c++11 or -std=gnu++11", but with -std=c++11, not a peep.

If any one (or more) of the tokens 'int ( * ) { }' are removed, you do get an
error.  Also, the C compiler does not have the same bug.

[Bug c++/68265] Arbitrary syntactic nonsense silently accepted after 'int (*){}' until the next close brace

2015-11-10 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68265

--- Comment #2 from Zack Weinberg  ---
This problem apparently goes back at least as far as 4.8.  Stack Overflow
people found a number of variations, please consult

https://stackoverflow.com/questions/23033043/is-it-a-new-c11-style-of-comments
https://stackoverflow.com/questions/23015482/strange-code-that-compiles-with-g

[Bug lto/61886] [4.8/4.9/5 Regression] LTO breaks fread with _FORTIFY_SOURCE=2

2015-02-11 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61886

--- Comment #34 from Zack Weinberg  ---
> As I tried to explain, it is currently design decision to have one declaration
> and one symtam node for one symbol.  The one decl rule was introduced by
> Codesourcery (Zack) in 2003-4. He updated frontends to avoid copying and
> dropped code that dealt with duplicated declarations.  Due to lack of sanity
> checking some cases remained, like this one (because at that time we did not
> really have proper asm name hash).  There are couple open PRs since that time
> that was never corrected.

It's been long enough that I don't recall precisely what my goals were, but I
don't think it was my intention to enforce a "one decl per asm name" rule at
the time.  I was trying to deal with C front-end issues: according to
https://gcc.gnu.org/ml/gcc-patches/2004-01/msg00808.html, "at least bug 12267,
bug 12336, bug 12373, bug 12391, bug 12560, bug 12934, bug 13129".  My
recollection is that the fundamental problem was getting more than one
TREE_DECL object per *declared name* in -funit-at-a-time mode, which was shiny
and new back then!

... Digging through the mailing list archives, maybe I'm remembering wrong. 
https://gcc.gnu.org/ml/gcc-patches/2004-03/msg01440.html has

> The old code took the copying
> approach; this was bad because it violated the basic assumption made
> elsewhere in the compiler that there is exactly one DECL node for each
> assembly-level symbol.  Hence all the bugs.

I do not remember anymore what parts of the compiler were making those
assumptions, but I'm guessing it was probably the debug information generators,
or mostly those.  Anyhow, I don't think y'all should be taking a decision I
made for the C front end under time and peer pressure (people were *really mad*
at me for leaving some of those bugs unfixed for several releases in a row) in
2004 as a permanent compiler-wide design constraint ten years later :)

(Note also bug 13801, and the discussion of its fix, here:
https://gcc.gnu.org/ml/gcc-patches/2004-08/msg00085.html -- we wound up having
to back down from the "not only is there only one DECL per assembly level
symbol, it only ever accumulates information" rule that I originally wanted,
due to C90 conformance problems.)


[Bug lto/62249] Spurious FORTIFY_SOURCE warning with -flto, poll

2014-10-10 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62249

--- Comment #4 from Zack Weinberg  ---
The "delta-minimized self-contained test case" should require only `-O2 -flto`
to reproduce the problem.  The other test case, however, should require
`-D_FORTIFY_SOURCE=2` in addition, and to be compiled against recent glibc.

Note also that this does not happen with 4.8 or older.


[Bug preprocessor/61638] New: "warning: multi-line comment" unclear and has false positives

2014-06-28 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61638

Bug ID: 61638
   Summary: "warning: multi-line comment" unclear and has false
positives
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Keywords: diagnostic
  Severity: enhancement
  Priority: P3
 Component: preprocessor
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zackw at panix dot com

The warning for a // comment continued on the next line by a backslash is
"warning: multi-line comment", which doesn't make a whole lot of sense if you
don't already know that this is a possibility.  (Caret diagnostics help, but
the caret is in the wrong place: it points at the //, not the \.)

It also warns when the construct is harmless, e.g. because the next line has
another // comment on it: this naturally happens when someone feels like
drawing a tree

  //A
  //   / \
  //  B   C

This peeves people enough to write snarky comments like this (from
https://code.monotone.ca/p/monotone/source/tree/h:net.venge.monotone/test/unit/tests/graph.cc#L260
):

  // This tests the nasty case described in the giant comment above
  // get_uncommon_ancestors:
  //
  //  9
  //  |\  . Extraneous dots brought to you by the
  //  8 \ . Committee to Shut Up the C Preprocessor
  // /|  \. (CSUCPP), and viewers like you and me.
  /// |   |
  // (... ASCII art continues ...)

So I would like to suggest the following four improvements:

1) The warning message should be changed to "warning: backslash-newline
continues // comment onto the next line".
2) The caret should be corrected to point to the backslash, not the //.
3) The warning should be suppressed if the next line is blank or contains only
another // comment.  (It should *not* be suppressed if the next line is blank
up to a /*, because that genuinely does change the meaning of the code.
4) "warning: backslash and newline separated by space" should not be suppressed
in // comments, because see bug 8270 for a whole bunch of people being very
confused why putting a space after the backslash in their ASCII art doesn't
make it not a backslash-newline.  (It should still be suppressed in /*
comments.)


[Bug tree-optimization/62112] New: Optimize out malloc when block is unused or write-only

2014-08-12 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62112

Bug ID: 62112
   Summary: Optimize out malloc when block is unused or write-only
   Product: gcc
   Version: 4.9.1
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zackw at panix dot com

This program

#include 
#include 

int
main(void)
{
size_t n = 1000;
float *x = calloc(n,sizeof(float));
float *y = malloc(n*sizeof(float));
if (x && y)
  memcpy(y,x,sizeof(float)*n);
return 0;
}

can be optimized (in the absence of `-fno-builtin-(memcpy|malloc|calloc)`) to

int main(void) { return 0; }

because: the memory block pointed to by `y` is write-only, so the `memcpy` and
`malloc` can be discarded; after that is done, the memory block pointed to by
`x` is unused, so that allocation can be discarded as well.

`calloc` is used here to avoid any question of UB due to reading uninitialized
memory even within `memcpy`.  The optimization should apply to all
heap-allocation functions, including especially C++ operator new (as long as
the constructor has no side effects outside the just-allocated object).

Clang 3.5 does perform this optimization.


[Bug tree-optimization/62112] Optimize out malloc when block is unused or write-only

2014-08-12 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62112

--- Comment #2 from Zack Weinberg  ---
I observe that the `memcpy` does get lowered to inline code.  Is it just a
phase-ordering problem that we then don't detect the stores as dead?


[Bug c++/50595] template overload resolution insufficiently sensitive to name dependency?

2014-08-13 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50595

Zack Weinberg  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #2 from Zack Weinberg  ---
Per http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#197 GCC's
behavior is correct.


[Bug lto/62249] New: Spurious FORTIFY_SOURCE warning with -flto, poll

2014-08-24 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62249

Bug ID: 62249
   Summary: Spurious FORTIFY_SOURCE warning with -flto, poll
   Product: gcc
   Version: 4.9.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: lto
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zackw at panix dot com

Created attachment 33388
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33388&action=edit
test case (compile on recent glibc system)

Compile the attached program on a recent-glibc system with -D_FORTIFY_SOURCE=2
-O2 -flto and you get a spurious warning:

$ gcc -flto -O2 -D_FORTIFY_SOURCE=2 test.c
In function ‘__poll_alias’,
inlined from ‘main’ at test.c:17:9:
/usr/include/x86_64-linux-gnu/bits/poll2.h:41:2: warning: call to
‘__poll_chk_warn’ declared with attribute warning: poll called with fds buffer
too small file nfds entries
  return __poll_chk (__fds, __nfds, __timeout, __bos (__fds));

$ gcc -fwhole-program -O2 -D_FORTIFY_SOURCE=2 test.c
$

Inspection of bits/poll2.h leads me to believe that the glibc folks expect this
construct ...

__fortify_function int
poll (struct pollfd *__fds, nfds_t __nfds, int __timeout)
{
  if (__bos (__fds) != (__SIZE_TYPE__) -1)
{
  if (! __builtin_constant_p (__nfds))
return __poll_chk (__fds, __nfds, __timeout, __bos (__fds));
  else if (__bos (__fds) / sizeof (*__fds) < __nfds)
return __poll_chk_warn (__fds, __nfds, __timeout, __bos (__fds));
}

  return __poll_alias (__fds, __nfds, __timeout);
}

... to have the dead arm of the inner 'if' eliminated (one of them must be
dead, because the condition is a call to __builtin_constant_p) before warnings
are issued.

[Bug lto/62249] Spurious FORTIFY_SOURCE warning with -flto, poll

2014-08-24 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62249

--- Comment #1 from Zack Weinberg  ---
Created attachment 33389
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33389&action=edit
Delta-minimized self-contained test case

Requires no headers anymore.

Delta-minimization revealed something interesting:

int __poll_chk (struct pollfd *, unsigned, int, unsigned);
int __poll_chk_warn (struct pollfd *, unsigned, int, unsigned)
  __asm__ ("__poll_chk")
  __attribute__ ((__warning__ ("poll called with fds buffer too small")));

The __asm__("__poll_chk") annotation is essential; if it's removed the bug goes
away.  I don't know anything about the guts of LTO, but to me that suggests a
bug in its unification of function decls, rather than a bug in dead-code
removal.


[Bug lto/62249] Spurious FORTIFY_SOURCE warning with -flto, poll

2014-08-24 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62249

--- Comment #2 from Zack Weinberg  ---
Incidentally, yes, the test case is based on a real program.


[Bug sanitizer/62307] New: -fsanitize=undefined doesn't pay attention to __attribute__((returns_nonnull))

2014-08-29 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62307

Bug ID: 62307
   Summary: -fsanitize=undefined doesn't pay attention to
__attribute__((returns_nonnull))
   Product: gcc
   Version: 4.9.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: sanitizer
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zackw at panix dot com
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
jakub at gcc dot gnu.org, kcc at gcc dot gnu.org

Consider

  extern int *__errno_location(void)
__attribute__ ((__const__))
__attribute__ ((__returns_nonnull__));

  void set_errno(int x)
  {
(*__errno_location()) = x;
  }

(This is a close approximation to how GNU libc defines errno -- there currently
isn't a returns_nonnull annotation, but I'm about to file a bug for _that_ ;-)

With 4.9.1, -fsanitize=undefined -O2 -fdump-tree-optimized, I get

  set_errno (int x)
  {
int * _1;

:
_1 = __errno_location ();
if (_1 == 0B)
  goto ;
else
  goto ;

:
__builtin___ubsan_handle_type_mismatch (&*.Lubsan_data0, 0);

:
*_1 = x_3(D);
return;
  }

The returns_nonnull annotation should cause the compiler to delete the runtime
check.  (None of the RTL passes delete it either.)

This type of unnecessary check is the single largest source of calls to
__builtin___ubsan_handle_type_mismatch in a real program.  I see a number of
other cases where we could be doing better at eliminating these checks on "this
pointer is already known to be non-NULL" grounds, e.g.

  if (_191 == 0B)
goto ;
  else
goto ;

  :
  __builtin___ubsan_handle_type_mismatch (&*.Lubsan_data26, 0);

  :
  *_191 = _37;
  if (_191 == 0B)
goto ;
  else
goto ;

  :
  __builtin___ubsan_handle_type_mismatch (&*.Lubsan_data27, 0);

... so maybe it's a pass-ordering problem?  Or rather, the fact that
UBSAN_NULL() is an opaque operation until .sanopt, which prevents nearly all of
the tree optimizers from doing anything with it?


[Bug sanitizer/62307] -fsanitize=undefined doesn't pay attention to __attribute__((returns_nonnull))

2014-08-29 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62307

--- Comment #2 from Zack Weinberg  ---
(In reply to Marc Glisse from comment #1)
> -fsanitize=null seems to imply -fno-delete-null-pointer-checks, so I assume
> this is on purpose. It would actually be quite natural for the sanitizer to
> insert an extra check after every call to a returns_nonnull function,
> checking that the result is indeed !=0. Otherwise yes, sanopt is way too
> late for any other optimization to take place.

I had a different mental model of how ubsan was supposed to function -- I was
expecting it to insert explicit checks before every construct with potentially
runtime-undefined behavior (not due to array bounds or pointer lifetimes), but
then for the compiler to try as hard as possible to *eliminate* the checks
where provably unnecessary.  I was hoping, in fact, to be able to use
-fsanitize=undefined -fdump-tree-optimized as a poor man's "tell me all the
places where the program can't be proven *not* to have runtime-undefined
behavior."


[Bug tree-optimization/63271] New: Should commute arithmetic with vector load

2014-09-15 Thread zackw at panix dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63271

Bug ID: 63271
   Summary: Should commute arithmetic with vector load
   Product: gcc
   Version: 4.9.1
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zackw at panix dot com

Consider

#include 

__m128i foo(char C)
{
  return _mm_set_epi8(   0,C,  2*C,  3*C,
   4*C,  5*C,  6*C,  7*C,
   8*C,  9*C, 10*C, 11*C,
  12*C, 13*C, 14*C, 15*C);
}

__m128i bar(char C)
{
  __m128i v = _mm_set_epi8(0, 1, 2, 3, 4, 5, 6, 7,
   8, 9,10,11,12,13,14,15);
  v *= C;
  return v;
}

I *believe* these functions compute the same value, and should therefore
generate identical code, but with gcc 4.9 foo() generates considerably larger
and slower code.

The test case is expressed in terms of x86  but I have no reason
to believe it isn't a generic missed optimization in the tree-level vectorizer.


[Bug rtl-optimization/48580] New: missed optimization: integer overflow checks

2011-04-12 Thread zackw at panix dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580

   Summary: missed optimization: integer overflow checks
   Product: gcc
   Version: 4.6.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: rtl-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: za...@panix.com


To the best of my knowledge, this is the only safe way (without -fwrapv) to
check whether the product of two signed integers overflowed:

bool product_does_not_overflow(signed x, signed y)
{
  unsigned tmp = x * unsigned(y);

  return signed(tmp) > 0 && tmp / x == unsigned(y);
}

(I believe C and C++ are the same in this regard but I could be wrong.  If
there is a better way to write this test I would love to know about it.)

g++ 4.6 produces this assembly dump on x86-64:

_Z25product_does_not_overflowii:
movl%esi, %edx
xorl%eax, %eax
imull%edi, %edx
testl%edx, %edx
jle.L2
movl%edx, %eax
xorl%edx, %edx
divl%edi
cmpl%eax, %esi
sete%al
.L2:
rep
ret

but, if I understand the semantics of IMUL correctly, it could do this instead:

_Z25product_does_not_overflowii:
xorl%eax, %eax
imull%edi, %esi
setno%al
ret

which is a pretty substantial micro-win, particularly in getting rid of a
divide.


[Bug rtl-optimization/48580] missed optimization: integer overflow checks

2011-04-12 Thread zackw at panix dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580

--- Comment #2 from Zack Weinberg  2011-04-12 20:40:41 
UTC ---
(In reply to comment #1)
> 
> Two signed integers given that they are known to be positive, anyway.  
> This may return unexpected results if either or both arguments are 
> negative or zero.
...
> (If the function gets called with one constant operand, you can make it 
> inline and use __builtin_constant_p to replace a divide with a range check 
> on the other operand.  That's only useful for some cases of overflow 
> checks, of course.)

In the code that this is cut down from, both arguments are known to be strictly
positive, but neither is constant.  (They're only signed for historical
reasons, I think, but it would be a huge amount of work to change that.)

> I sort of think GCC should have built-in functions exposing C and C++ 
> interfaces for: each basic arithmetic operation, defined to wrap; each 
> basic arithmetic operation, defined to saturate; each basic arithmetic 
> operation, defined to have undefined overflow; each basic arithmetic 
> operation, with a separate overflow flag being set; each basic arithmetic 
> operation, defined to trap on overflow.  All of these for both signed and 
> unsigned and for any desired number of bits (up to the maximum number 
> supported for arithmetic, so generally 1-64 bits on 32-bit configurations 
> and 1-128 bits on 64-bit configurations); except for the defined-to-trap 
> case, all would still have undefined behavior on division by 0.  You could 
> then have optimizations mapping generic C idioms to such built-in 
> operations where the target supports efficient code for the operations.  
> But this rather relies on the no-undefined-overflow work being finished 
> first so that some of the required operations actually exist inside GCC, 
> before they can easily be exposed to the user.

So you see this as more of a tree-level than an RTL-level missed optimization,
then?  Your plan sounds fine to me, although I might look for a less ambitious
but more likely to get done soon approach, personally.


[Bug rtl-optimization/48580] missed optimization: integer overflow checks

2011-04-12 Thread zackw at panix dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580

--- Comment #4 from Zack Weinberg  2011-04-12 21:03:01 
UTC ---
On Tue, Apr 12, 2011 at 1:52 PM, joseph at codesourcery dot com
 wrote:
>> In the code that this is cut down from, both arguments are known to be 
>> strictly
>> positive, but neither is constant.  (They're only signed for historical
>> reasons, I think, but it would be a huge amount of work to change that.)
>
> My point in noting the need for the integers to be positive was really
> that unless the compiler knows they are positive, the transformation
> you're asking for appears to be incorrect - the semantics of your function
> are that a product with either term 0 counts as overflowing, but using a
> processor overflow flag would report it as not overflowing.

Well, if the compiler didn't know that, it could still use the
overflow flag plus an extra test for either input operand being zero,
couldn't it?  The C idiom has to test for a zero result, because e.g.
0x4000_U * 16 wraps to zero.

(The original code does in fact check for x or y  <= 0 in a place
where VRP would notice; I should have said that instead of "known to
be strictly positive", sorry for any confusion.)


[Bug rtl-optimization/48580] missed optimization: integer overflow checks

2011-04-12 Thread zackw at panix dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580

--- Comment #5 from Zack Weinberg  2011-04-12 21:04:34 
UTC ---
Addendum: what would *you* describe as the correct C idiom for
ensuring that the product of two signed integers was positive and did
not overflow the range of a same-sized signed integer, assuming
nothing about either multiplicand?


[Bug c/45861] New: Possible missed optimization - array ops vs shift-and-mask

2010-10-01 Thread zackw at panix dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45861

   Summary: Possible missed optimization - array ops vs
shift-and-mask
   Product: gcc
   Version: 4.5.1
Status: UNCONFIRMED
  Severity: minor
  Priority: P3
 Component: c
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: za...@panix.com


Created attachment 21938
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=21938
test case

The attached .c file contains two functions, which (unless I screwed up)
compute exactly the same (mathematical) function - they take an array of 8
bytes, permute its elements, and stuff them into a 64-bit integer, which is
then returned.  However, GCC generates very different code for each (on
x86-64).  There seem to be two missed optimization opportunities here:

1) I don't know *which* of the two code generation possibilities here is
better, but it seems like GCC ought to know and ought to generate that code for
both functions.

2) Could we be taking advantage of SSEn vector permute instructions here?


[Bug c/44179] New: warn about sizeof(char)

2010-05-17 Thread zackw at panix dot com
sizeof(char) is defined to be 1, and is therefore almost always unnecessary. 
It would be nice if the C and C++ front ends could complain about it.


-- 
   Summary: warn about sizeof(char)
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: zackw at panix dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44179



[Bug c/44179] warn about sizeof(char)

2010-05-17 Thread zackw at panix dot com


--- Comment #2 from zackw at panix dot com  2010-05-17 23:59 ---
So it has to be a little smarter than the obvious, so what else is new.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44179



  1   2   >