[Bug c/94196] Multiple issues with attributes

2020-04-19 Thread nate at thatsmathematics dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94196

Nate Eldredge  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nate at thatsmathematics dot com

--- Comment #2 from Nate Eldredge  ---
Another case is that the pure attribute is not respected on a function pointer,
though the related const attribute is.

For example:

#include <stdio.h>

int __attribute__((pure,noipa)) my_pure_func(int x) {
printf("pure func called with %d\n", x);
return x;
}

int __attribute__((const,noipa)) my_const_func(int x) {
printf("const func called with %d\n", x);
return x;
}

int __attribute__((pure)) (*pure_ptr)(int) = my_pure_func;
int __attribute__((const)) (*const_ptr)(int) = my_const_func;

int a,b,c,d;

void foo(void) {
a = pure_ptr(1) + pure_ptr(1);
b = my_pure_func(2) + my_pure_func(2);
c = const_ptr(3) + const_ptr(3);
d = my_const_func(4) + my_const_func(4);
}

int main(void) {
foo();
return 0;
}


gives a warning:

pure.c:13:1: warning: ‘pure’ attribute ignored [-Wattributes]
   13 | int __attribute__((pure)) (*pure_ptr)(int) = my_pure_func;
  | ^~~

and indeed `my_pure_func(1)` is called twice.  The others are handled correctly
and only called once.

[Bug tree-optimization/93982] New: Assignment incorrectly omitted by -foptimize-strlen

2020-02-29 Thread nate at thatsmathematics dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93982

Bug ID: 93982
   Summary: Assignment incorrectly omitted by -foptimize-strlen
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: nate at thatsmathematics dot com
  Target Milestone: ---

Created attachment 47937
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47937&action=edit
Reduced testcase

If the attached testcase is compiled with `-O1 -foptimize-strlen' on amd64, the
function foo is miscompiled: the assignment to res.target[1] is omitted.  It
also happens with -O2, but not with -O1 alone or -O3.

This bug is somewhat similar to bug 93213.  It is a regression from 9.2.0.

The generated assembly is:

foo:
subq $8, %rsp
call my_alloc
movq $.LC0, (%rax)
movq $.LC1, 16(%rax)
movq $.LC1, 24(%rax)
movq $.LC1, 32(%rax)
addq $8, %rsp
ret

Note the absence of `movq $.LC1, 8(%rax)'.

I tested with trunk, latest pull from git, revision 117baab8.  In particular
the patch for bug 93213 (e13f37d9) is included.  The program is compiled
correctly by gcc 9.2.0 with the same options (and all others I tried). I did a
git bisect and the offending commit is 34fcf41e.

The cast in foo() at first looked questionable from a strict aliasing
standpoint, but I believe the code is legal since the memory returned by calloc
has no declared type, and we never access this memory except as objects of type
`const char *'.  Also, the miscompilation persists with -fno-strict-aliasing.

I am no gcc expert, but I dug into the source a little bit, out of curiosity.  
It looks like the deletion happens in handle_store(), at
gcc/tree-ssa-strlen.c:5021.  It seems that in this function, the code is being
treated as if the string "12345678" itself were being stored at address
res.target, rather than the address of the string; as if the code were
`strcpy(res.target, "12345678")'.  In particular, it thinks the trailing null
was stored at address res.target+8.  The following statement, `res.target[1] =
""', is likewise treated as if it were `strcpy(res.target+8, "")', which would
also just store a null byte at res.target+8, so it is seen as redundant and is
removed.
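
The attachment is not reproduced in this message.  As a rough sketch of the
kind of code described (not the attached testcase itself), assume a
hypothetical `my_alloc` returning zeroed storage for five pointers; the names
`fill_targets` and `slot1_was_stored` are likewise illustrative.  Each
assignment stores the address of a literal, not its characters, so none of
them is redundant:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the testcase's allocator: zeroed storage for
   five pointers, kept out of line so the optimizer cannot see through it. */
__attribute__((noinline))
const char **my_alloc(void)
{
    const char **p = calloc(5, sizeof(const char *));
    assert(p != NULL);
    return p;
}

/* Stores five pointers.  Slot 1 receives the address of "", which is a
   pointer-sized store, not a one-byte store of '\0'. */
__attribute__((noinline))
const char **fill_targets(void)
{
    const char **target = my_alloc();
    target[0] = "12345678";
    target[1] = "";
    target[2] = "";
    target[3] = "";
    target[4] = "";
    return target;
}

/* Returns 1 if slot 1 holds a pointer to an empty string.  If the store to
   slot 1 were dropped, it would still be NULL from calloc. */
int slot1_was_stored(void)
{
    const char **t = fill_targets();
    int ok = (t[1] != NULL && t[1][0] == '\0');
    free(t);
    return ok;
}
```

On a correctly compiling toolchain `slot1_was_stored()` returns 1; with the
omitted store described above one would expect it to return 0 instead.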

I would like to acknowledge StackOverflow user BrodieG for initially
discovering this bug and helping to investigate, as well as users KamilCuk and
John Bollinger for helpful comments. The original question is at
https://stackoverflow.com/q/60406042/634919.

Output of `gcc -v`:

Using built-in specs.
COLLECT_GCC=/home/nate/gcc/bin/gcc
COLLECT_LTO_WRAPPER=/home/nate/do-not-backup/gcc-inst/bin/../libexec/gcc/x86_64-pc-linux-gnu/10.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../configure --prefix=/home/nate/gcc --disable-bootstrap
--enable-languages=c
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 10.0.1 20200229 (experimental) (GCC)

[Bug tree-optimization/94015] New: [10 Regression] Another assignment incorrectly omitted by -foptimize-strlen

2020-03-03 Thread nate at thatsmathematics dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94015

Bug ID: 94015
   Summary: [10 Regression] Another assignment incorrectly omitted
by -foptimize-strlen
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: nate at thatsmathematics dot com
CC: dmalcolm at gcc dot gnu.org, jakub at gcc dot gnu.org,
law at redhat dot com, marxin at gcc dot gnu.org,
msebor at gcc dot gnu.org, nate at thatsmathematics dot com
  Target Milestone: ---

Created attachment 47957
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47957&action=edit
Test case

This is apparently the same bug as in bug 93982, which I don't think has been
fixed completely.

I am using trunk, commit 9b4f00dd (not sure how to make that a link?).  It
includes f26688fb which was supposed to fix bug 93982.  This is again a
regression from 9.2.0.

If the attached testcase is compiled with `-O1 -foptimize-strlen -fpie', or
-O2, on amd64, the function foo is miscompiled: the assignment to s[7] is
omitted.  This bug is somewhat similar to bug 93213.

The generated assembly is:

foo:
subq $8, %rsp
call alloc
leaq .LC0(%rip), %rdx
movq %rdx, (%rax)
addq $8, %rsp
ret

9.2.0 adds `movb $0, 7(%rax)' as it should.

I wasn't able to create a test that failed at runtime on Linux, since by
default everything is loaded in the low half of memory, so the address of the
string literal has zero in its high byte and s[7] gets set to zero anyway.  But
the compiler can't know that will happen.  There is probably a linker or loader
option to change this, but I could not immediately figure out the correct
incantation.  I can try harder if it would help.
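
To make the point about the high byte concrete, here is a small sketch.  It
works on a plain 64-bit integer standing in for the address of the literal;
the helper names `stored_byte` and `explicit_zero_matters` are made up for
illustration.  It shows that byte 7 of the stored image is simply whatever the
value has there, so the explicit store of zero is only a no-op when the
address happens to have a zero in that position:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns byte `idx` of the in-memory image of a 64-bit value, the way
   s[idx] would read it after the value had been stored at s. */
unsigned char stored_byte(uint64_t value, size_t idx)
{
    unsigned char image[sizeof value];
    assert(idx < sizeof value);
    memcpy(image, &value, sizeof value);
    return image[idx];
}

/* Hypothetical helper: would an explicit s[7] = 0 after storing `addr` at s
   change memory?  Only when byte 7 of the stored address is nonzero. */
int explicit_zero_matters(uint64_t addr)
{
    return stored_byte(addr, 7) != 0;
}
```

On little-endian amd64, byte 7 is the most significant byte, so for any
address in the low half of the address space `explicit_zero_matters` returns
0, which is why the testcase cannot fail at runtime there.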

-fdump-tree-all shows that the statement is deleted by the strlen pass, as
before.  The output of the preceding pass looks like:

foo ()
{
  char * s;
  <bb 2> [local count: 1073741824]:
  s_3 = alloc ();
  MEM[(char * *)s_3] = "1234567";
  MEM[(char *)s_3 + 7B] = 0;
  return;
}

I didn't single step the compiler code this time, but I presume the issue is
that although the strlen pass now knows that the store in `MEM[(char * *)s_3] =
"1234567";' is the size of a pointer (8 bytes), it still thinks those 8 bytes
are the string "1234567\0" rather than its address.  So it still thinks it will
result in a null byte stored at `s_3 + 7B`, making the following line
redundant.

[Bug tree-optimization/94015] [10 Regression] Another assignment incorrectly omitted by -foptimize-strlen

2020-03-03 Thread nate at thatsmathematics dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94015

--- Comment #4 from Nate Eldredge  ---
Comment on attachment 47959
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47959
gcc10-pr94015.patch

I'm not qualified to opine on the proposed fix, but just wanted to note that,
as I mentioned above, running your testcase doesn't actually exercise the bug
on amd64/linux because the high byte of the address of the literal is always
zero, so the test passes whether s[7]='\0' is "optimized" out or not.  More
tests are always good, I suppose, but I just want to emphasize that running
this testcase is not the way to check whether the bug is fixed.  I wasn't able
to come up with a good runtime way to check whether the code was correctly
compiled.  Maybe someone more clever than me can think of a way?

[Bug other/97473] New: Spilled function parameters not aligned properly on multiple non-x86 targets

2020-10-17 Thread nate at thatsmathematics dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97473

Bug ID: 97473
   Summary: Spilled function parameters not aligned properly on
multiple non-x86 targets
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: nate at thatsmathematics dot com
  Target Milestone: ---

Created attachment 49394
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49394&action=edit
Test case

Suppose we have a type V requiring alignment, such as with
__attribute__((aligned(N))).  In current versions of gcc, both 10.2 and recent
trunk, it appears that local (auto) variables of this type are properly aligned
on the stack, at least on all the targets I tested.  However, on many targets
other than x86, alignment is apparently not respected for function parameters
of this type when their address is taken. 

The function parameter may actually be passed in a register, in which case when
its address is taken, it must be spilled to the stack.  But on the failing
targets, the spilled copy is not sufficiently aligned, and so for instance,
other functions which receive a pointer to this variable will find it does not
have the alignment that it should.

I'm not sure if this is a bug or a limitation, but it's quite counterintuitive,
since function parameters generally can be treated like local variables for
most other purposes.  I couldn't find any mention of this in the documentation
or past bug reports.

This can be reproduced by a very short C example like the following:

typedef int V __attribute__((aligned(64)));

void g(V *);

void f(V x) {
g(&x);
}

The function g can get a pointer that is not aligned to 64 bytes.  A more
complete test case is attached, which I tested mainly on ARM and AArch64 with
gcc 10.2 and also trunk.  It seems to happen with or without optimization, so
long as one prevents IPA of g.  Inspection of the assembly shows gcc does not
generate any code to align the objects beyond the stack alignment guaranteed by
the ABI (8 bytes for ARM, 16 bytes for AArch64).

It fails on (complete gcc -v output below):

- aarch64-linux-gnu 10.2.0 and trunk from today
- arm-linux-gnueabihf 10.2.0 and trunk from last week
- alpha-linux-gnu 10.2.0
- sparc64-linux-gnu 10.2.0
- mips-linux-gnu 10.2.0

It succeeds on:

- x86_64-linux-gnu 10.2.0, also with -m32

On x86_64-linux-gnu, gcc generates instructions to align the stack and place
the spilled copy of x at an aligned address, and the testcase passes there. 
(Perhaps this was implemented to support AVX?)  With -m32 it copies x from its
original unaligned position on the stack into an aligned stack slot.

As noted, auto variables of the same type do get proper alignment on all the
platforms I tested, and so one can work around with `V tmp = x; g(&tmp);`.
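
A sketch of that workaround next to the problematic pattern, with a helper to
check the alignment at runtime.  The names `is_aligned`, `param_is_aligned`
and `copy_is_aligned` are illustrative and not taken from the attached test
case:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int V __attribute__((aligned(64)));

/* Returns 1 if p is a multiple of n, where n must be a power of two. */
__attribute__((noinline))
int is_aligned(const void *p, size_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0);
    return ((uintptr_t)p & (n - 1)) == 0;
}

/* Takes the address of the parameter itself.  On the failing targets listed
   above, the spilled copy may not be 64-byte aligned. */
int param_is_aligned(V x)
{
    return is_aligned(&x, 64);
}

/* Workaround: copy into an ordinary local, which does get aligned. */
int copy_is_aligned(V x)
{
    V tmp = x;
    return is_aligned(&tmp, 64);
}
```

`copy_is_aligned` should return 1 on the platforms tested, while
`param_is_aligned` is the case that can return 0 on the affected targets.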

For what it's worth, clang on ARM and AArch64 does align the spilled copies.

I was not sure which pass of the compiler is responsible for this so I just
chose component "other".  I didn't think "target" was appropriate as this
affects many targets, though not all.

This issue was brought to my attention by StackOverflow user Alf (thanks!), see
https://stackoverflow.com/questions/64287587/memory-alignment-issues-with-gcc-vector-extension-and-arm-neon.
Alf's original program was in C++ for ARM32 with NEON and the hard-float ABI,
and involved mixing functions that passed vector types (like int32x4_t) either
by value or by reference.  In this setting they can be passed by value in
SIMD registers, but in memory they require 16-byte alignment.  This was
violated, resulting in bus errors at runtime.  So there is "real life" code
affected by this.

I tried including full `gcc -v` output from all versions tested, but it seems
to be triggering the bugzilla spam filter, so I'm omitting it.  Hopefully it
isn't needed, but let me know if it is.

[Bug other/97473] Spilled function parameters not aligned properly on multiple non-x86 targets

2020-10-18 Thread nate at thatsmathematics dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97473

--- Comment #2 from Nate Eldredge  ---
Possibly related to bug 84877 ?

[Bug target/30527] Use of input/output operands in __asm__ templates not fully documented

2023-01-21 Thread nate at thatsmathematics dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30527

Nate Eldredge  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nate at thatsmathematics dot com

--- Comment #8 from Nate Eldredge  ---
For arm/aarch64 in particular, there is an extra wrinkle, which is that
armclang *does* support and document the template modifiers.  See
https://developer.arm.com/documentation/100067/0610/armclang-Inline-Assembler/Inline-assembly-template-modifiers?lang=en.
As best I can tell, they are exactly the same as what gcc already supports
(without documenting them).

So that makes this into a compatibility issue.  People may be writing code for
armclang using the modifiers, and then want to build with gcc instead.  In
practice it will work fine, but from the gcc docs, you wouldn't know it.

On these targets, some of the modifiers are pretty important, and there are
fairly basic things that you simply can't do without them.  For example, on
aarch64, the b/h/s/d/q modifiers to get the names of various scalar pieces of a
vector register (v15 -> b15 / h15 / s15 / d15 / q15).  It's just impossible to
write any scalar floating-point asm without this, or SIMD code using the
"across vector" instructions like ADDV which need a scalar output operand.  

Or, the c modifier to suppress the leading # on an immediate.  This one is
documented for x86, where the need for it is similarly obvious, but no
indication in the docs that it works on arm/aarch64 as well.

I really do think it would be a good idea for these to become officially
supported and documented by gcc, at least for these targets.
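
As an illustration of the scalar floating-point case, here is a minimal sketch
using the d modifier on aarch64, with a plain C fallback elsewhere so the
function can be built on any target.  The function name `add_doubles` is
invented for this example, and the modifier is used as described above, not as
something the gcc manual guaranteed at the time of writing:

```c
#include <assert.h>

static_assert(sizeof(double) == 8, "the d-register view holds a 64-bit double");

/* Adds two doubles.  On aarch64 the "w" constraint picks an FP/SIMD
   register, and %d0 prints it as d<N> instead of v<N>, which is the form
   the scalar fadd instruction needs. */
double add_doubles(double a, double b)
{
#if defined(__aarch64__) && defined(__GNUC__)
    double r;
    __asm__("fadd %d0, %d1, %d2" : "=w"(r) : "w"(a), "w"(b));
    return r;
#else
    return a + b;   /* fallback for other targets */
#endif
}
```

Without the modifier the operand would be printed as a vector register name,
which the assembler does not accept for the scalar form of the instruction.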

[Bug target/104039] New: AArch64 Redundant instruction moving general to vector register

2022-01-14 Thread nate at thatsmathematics dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104039

Bug ID: 104039
   Summary: AArch64 Redundant instruction moving general to vector
register
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: nate at thatsmathematics dot com
  Target Milestone: ---

Compiling the following code on AArch64 with -O2 or -O3:

typedef unsigned long u64x2 __attribute__((vector_size(16)));

u64x2 combine(unsigned long a, unsigned long b) {
u64x2 v = {a,b};
return v;
}

yields the following assembly:

combine:
fmov d0, x0
ins v0.d[1], x1
ins v0.d[1], x1
ret

where the second ins is entirely redundant with the first and serves no
apparent purpose.  (Unless it is something extremely clever...)

This seems to be a regression from 8.x to 9.x; Godbolt's 8.5 looks correct with
just one ins, but 9.3 has the two.

Originally noticed by Peter Cordes on StackOverflow:
https://stackoverflow.com/questions/70717360/how-to-load-vector-registers-from-integer-registers-in-arm64-m1/70718572#comment125016906_70717360

[Bug target/104110] New: AArch64 unnecessary use of call-preserved register

2022-01-18 Thread nate at thatsmathematics dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104110

Bug ID: 104110
   Summary: AArch64 unnecessary use of call-preserved register
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: nate at thatsmathematics dot com
  Target Milestone: ---

gcc misses an optimization (or in some sense deoptimizes) by using a
call-preserved register to save a trivial constant across a function call.

Source code:

void bar(unsigned);
unsigned foo(unsigned c) {
bar(1U << c);
return 1;
}

Output from gcc -O3 on AArch64:

foo:
stp x29, x30, [sp, -32]!
mov x29, sp
str x19, [sp, 16]
mov w19, 1
lsl w0, w19, w0
bl  bar
mov w0, w19
ldr x19, [sp, 16]
ldp x29, x30, [sp], 32
ret

Note that x19 is used unnecessarily to save the constant 1 across the function
call, causing an unnecessary push and pop.  It would have been better to just
use some call-clobbered register for the constant 1 before the function call,
and then a simple `mov w0, 1` afterward.

Same behavior with -O, -O2, -Os.  Tested on godbolt, affects yesterday's trunk
and all the way back to 5.4.

Might be related to bug 70801 or bug 71768 but I am not sure.

[Bug target/110780] New: aarch64 NEON redundant displaced ld3

2023-07-23 Thread nate at thatsmathematics dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110780

Bug ID: 110780
   Summary: aarch64 NEON redundant displaced ld3
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: nate at thatsmathematics dot com
  Target Milestone: ---

Compile the following with gcc 14.0.0 20230723 on aarch64 with -O3:

#include <stdint.h>
void CSI2toBE12(uint8_t* pCSI2, uint8_t* pBE, uint8_t* pCSI2LineEnd)
{
while (pCSI2 < pCSI2LineEnd) {
pBE[0] = pCSI2[0];
pBE[1] = ((pCSI2[2] & 0xf) << 4) | (pCSI2[1] >> 4);
pBE[2] = ((pCSI2[1] & 0xf) << 4) | (pCSI2[2] >> 4);
pCSI2 += 3;
pBE += 3;
}
}

Godbolt: https://godbolt.org/z/WshTPKzY5

In the inner loop (.L5 of the godbolt asm) we have

ld3 {v25.16b - v27.16b}, [x3]
add x6, x3, 1
// no intervening stores
ld3 {v25.16b - v27.16b}, [x6]

The second load is redundant.  v25, v26 are the same as what was already in
v26, v27 respectively.  The value loaded into v27 is new but it is not used in
the subsequent code.

This might also account for some extra later complexity, because it means that
the last 48 bytes of the input can't be handled by this loop (or else the
second load would be out of bounds by one byte) and so must be handled
specially.

[Bug libstdc++/104928] std::counting_semaphore on Linux can sleep forever

2023-12-09 Thread nate at thatsmathematics dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104928

--- Comment #2 from Nate Eldredge  ---
This bug is still present.  Tested and reproduced with g++ 13.1.0 (Ubuntu
package), and by inspection of the source code, it's still in the trunk as
well.

Encountered on StackOverflow:
https://stackoverflow.com/questions/77626624/race-condition-in-morriss-algorithm

[Bug libstdc++/104928] std::counting_semaphore on Linux can sleep forever

2023-12-10 Thread nate at thatsmathematics dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104928

--- Comment #4 from Nate Eldredge  ---
@Jonathan: I think that patch set is on the right track, but it has some other
serious bugs.  For one, __atomic_wait_address calls __detail::__wait_impl with
__args._M_old uninitialized (g++ -O3 -Wall catches it).  There is another
uninitialized warning about __wait_addr that I haven't yet confirmed.  

Lastly, in __wait_impl, there is a test `if (__args &
__wait_flags::__spin_only)`, but __spin_only has two bits set, one of which is
__do_spin.  So in effect, __do_spin (which is set by default on Linux) is taken
to imply __spin_only, with the result that it *only* ever spins, without ever
sleeping.  Thus every semaphore (and maybe other waits too) becomes a spinlock,
which is Not Good.
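
The pitfall described in that paragraph can be sketched in isolation.  The
flag values below are hypothetical, chosen only so that the "spin only" flag
contains the "do spin" bit as described; they are not copied from the patch
set, and the sketch makes no claim about what the patches actually do:

```c
#include <assert.h>

/* Hypothetical flag values: "spin only" is a two-bit flag that includes
   the "do spin" bit. */
enum wait_flags {
    WAIT_DO_SPIN   = 4,
    WAIT_SPIN_ONLY = 8 | WAIT_DO_SPIN
};

static_assert((WAIT_SPIN_ONLY & WAIT_DO_SPIN) != 0,
              "spin-only shares a bit with do-spin");

/* Loose test: true if ANY bit of WAIT_SPIN_ONLY is set, so WAIT_DO_SPIN
   alone is enough to satisfy it. */
int is_spin_only_loose(unsigned flags)
{
    return (flags & WAIT_SPIN_ONLY) != 0;
}

/* Strict test: true only if ALL bits of WAIT_SPIN_ONLY are set. */
int is_spin_only_strict(unsigned flags)
{
    return (flags & WAIT_SPIN_ONLY) == WAIT_SPIN_ONLY;
}
```

With these values, `is_spin_only_loose(WAIT_DO_SPIN)` is true while
`is_spin_only_strict(WAIT_DO_SPIN)` is false, which is the difference between
"always spin, never sleep" and the intended behavior.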

Should I take this up on gcc-patches, or elsewhere?

[Bug libstdc++/104928] std::counting_semaphore on Linux can sleep forever

2023-12-10 Thread nate at thatsmathematics dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104928

--- Comment #5 from Nate Eldredge  ---
Oh wait, disregard that last, I realized that I only applied one of the two
patches.  Let me try again.

[Bug libstdc++/104928] std::counting_semaphore on Linux can sleep forever

2023-12-11 Thread nate at thatsmathematics dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104928

--- Comment #7 from Nate Eldredge  ---
@Jonathan: Done,
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640119.html (sorry, may
not be linked to original threads).

[Bug tree-optimization/117742] New: Inefficient code for __builtin_clear_padding

2024-11-22 Thread nate at thatsmathematics dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117742

Bug ID: 117742
   Summary: Inefficient code for __builtin_clear_padding
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: nate at thatsmathematics dot com
  Target Milestone: ---

With all versions of `gcc -O3` on godbolt, the following C++ code

struct A {
char c;
int l;
};

A clear_padding(A x) {
__builtin_clear_padding(&x);
return x;
}

compiles on aarch64 to

clear_padding(A):
sub sp, sp, #16
strh wzr, [sp]
strb wzr, [sp, 2]
ldr x1, [sp]
add sp, sp, 16
bfi x0, x1, 8, 24
ret

where for some reason it writes zeros to the stack and then loads them back to
insert into the struct.  This is wholly unnecessary and the function could be
the single instruction `and x0, x0, #0xffffffff000000ff`.

This also shows up in places like `std::atomic<T>::compare_exchange_weak()`
which need to clear padding.

Other target architectures are similar.
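
For reference, a sketch of the operation as plain integer arithmetic, assuming
the struct is passed as an 8-byte little-endian image in one register (c in
byte 0, padding in bytes 1 to 3, l in bytes 4 to 7).  The macro and function
names are illustrative:

```c
#include <assert.h>
#include <stdint.h>

struct A {
    char c;
    int l;
};

static_assert(sizeof(struct A) == 8,
              "sketch assumes a 4-byte int with 4-byte alignment");

/* Keeps byte 0 (c) and bytes 4..7 (l) of the little-endian image and
   clears the three padding bytes in between. */
#define A_VALUE_MASK UINT64_C(0xffffffff000000ff)

uint64_t clear_padding_bits(uint64_t image)
{
    return image & A_VALUE_MASK;
}
```

This is the same effect as the `bfi x0, x1, 8, 24` in the generated code,
which inserts 24 zero bits starting at bit 8.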

[Bug c++/118927] "double free detected in tcache 2" with import std and -Q (or without -quiet passed to cc1plus)

2025-02-18 Thread nate at thatsmathematics dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118927

--- Comment #6 from Nate Eldredge  ---
After some brief digging, it seems like the problem is that
`cxx_printable_name_internal` can be called recursively by `lang_decl_name`
(via `announce_function`).  This is bad because, with its static ring buffer,
it's not reentrant.  In particular, at the site of the call to `lang_decl_name`
(https://github.com/gcc-mirror/gcc/blame/427386042f056a2910882bf0c632b4db68c52bbb/gcc/cp/tree.cc#L2770),
the ring buffer is in an inconsistent state, as one of its entries has just
been freed but not marked as invalid.  So a recursive call may think that entry
is valid, and decide to free it again to make room for a new one.

The ring buffer design seems problematic under the circumstances.  Is caching
the printable name really an important optimization?  If so, then maybe a less
primitive caching structure, with more sensible lifetime management, would be
appropriate.
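
To illustrate the hazard, here is a deliberately simplified model of a static
ring-buffer cache whose name producer can re-enter the cache.  It is not gcc's
code: `printable_name`, `make_name` and the ring layout are all hypothetical.
The sketch marks the slot invalid before calling out, which is the step
described above as missing:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define RING_SIZE 4

static char *ring[RING_SIZE];   /* cached names, NULL when a slot is empty */
static unsigned ring_pos;       /* next slot to recycle */

static char *make_name(int id, int depth);

/* Returns a cached printable name for `id`.  `depth` controls how many
   times the name producer re-enters this function. */
const char *printable_name(int id, int depth)
{
    unsigned slot = ring_pos++ % RING_SIZE;

    free(ring[slot]);
    ring[slot] = NULL;          /* mark invalid BEFORE calling out */

    char *name = make_name(id, depth);   /* may recurse into this function */

    free(ring[slot]);           /* a recursive call may have refilled it */
    ring[slot] = name;
    return name;
}

/* Hypothetical name producer.  With depth > 0 it re-enters the cache first,
   loosely modeling a name printer that triggers further announcements. */
static char *make_name(int id, int depth)
{
    char buf[32];
    char *s;

    if (depth > 0)
        (void) printable_name(id + 100, depth - 1);
    snprintf(buf, sizeof buf, "decl_%d", id);
    s = malloc(strlen(buf) + 1);
    assert(s != NULL);
    strcpy(s, buf);
    return s;
}
```

If the `ring[slot] = NULL` line were removed, a recursive call that wrapped
around to the same slot would see the stale pointer and free it a second time.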

[Bug c++/118927] "double free detected in tcache 2" with import std and -Q

2025-02-18 Thread nate at thatsmathematics dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118927

--- Comment #4 from Nate Eldredge  ---
Note for testing that the compiler doesn't crash when compiling only
bits/std.cc, or when a-std.ii and a-foo.ii are concatenated into a single file.
It seems to be important that they are passed as separate files on the command
line.

I should also point out that the compiler is configured with --enable-checking.

[Bug c++/118927] New: "double free detected in tcache 2" with import std and -Q

2025-02-18 Thread nate at thatsmathematics dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118927

Bug ID: 118927
   Summary: "double free detected in tcache 2" with import std and
-Q
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: nate at thatsmathematics dot com
  Target Milestone: ---

Created attachment 60523
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60523&action=edit
Main source file foo.cc

I was trying to use gcc 15 trunk (commit 101e3101e) on arm64 gentoo (vm on
Apple M3) to compile the following code from a recent article by Stroustrup
(https://cacm.acm.org/blogcacm/21st-century-c/):

import std;  // make all of the standard library available
using namespace std;

int main()   // print unique lines from input
{
  unordered_map<string,int> m;  // hash table
  for (string line; getline (cin,line); )
    if (m[line]++ == 0)
      cout<<line<<'\n';
}

Following
https://stackoverflow.com/questions/76154680/how-to-use-module-std-with-gcc, I
used

g++ -Q -std=c++20 -fmodules -fsearch-include-path bits/std.cc foo.cc

The compile fails with "free(): double free detected in tcache 2" and the
following backtrace:

0x259e35b internal_error(char const*, ...)
../../gcc/diagnostic-global-context.cc:517
0x12a617b crash_signal
../../gcc/toplev.cc:322
0xb54343 cxx_printable_name_internal
../../gcc/cp/tree.cc:2768
0x12a6a3f announce_function(tree_node*)
../../gcc/toplev.cc:230
0x8ddd03 start_preparsed_function(tree_node*, tree_node*, int)
../../gcc/cp/decl.cc:18590
0x9feb17 maybe_clone_body(tree_node*)
../../gcc/cp/optimize.cc:585
0x9a8f87 post_load_processing
../../gcc/cp/module.cc:18834
0x9d81ef lazy_load_pendings(tree_node*)
../../gcc/cp/module.cc:20769
0xabd3a7 lookup_template_class(tree_node*, tree_node*, tree_node*, tree_node*,
int)
../../gcc/cp/pt.cc:10241
0xac04cf tsubst(tree_node*, tree_node*, int, tree_node*)
../../gcc/cp/pt.cc:16549
0xac08bf tsubst_entering_scope
../../gcc/cp/pt.cc:10095
0xac08bf tsubst(tree_node*, tree_node*, int, tree_node*)
../../gcc/cp/pt.cc:17092
0x93ea8f dump_template_bindings
../../gcc/cp/error.cc:596
0x933b6f dump_function_decl
../../gcc/cp/error.cc:1982
0x93fb27 decl_as_string(tree_node*, int)
../../gcc/cp/error.cc:3334
0x93fb27 lang_decl_name(tree_node*, int, bool)
../../gcc/cp/error.cc:3369
0xb5435b cxx_printable_name_internal
../../gcc/cp/tree.cc:2770
0x12a6a3f announce_function(tree_node*)
../../gcc/toplev.cc:230
0x8ddd03 start_preparsed_function(tree_node*, tree_node*, int)
../../gcc/cp/decl.cc:18590
0x9feb17 maybe_clone_body(tree_node*)
../../gcc/cp/optimize.cc:585

When -Q is omitted, the program compiles and runs correctly.

g++ -v output:

Using built-in specs.
COLLECT_GCC=/home/nate/do-not-backup/gcc-inst/bin/g++
COLLECT_LTO_WRAPPER=/home/nate/do-not-backup/gcc-inst/libexec/gcc/aarch64-unknown-linux-gnu/15.0.1/lto-wrapper
Target: aarch64-unknown-linux-gnu
Configured with: ../configure --enable-checking --enable-languages=c++
--prefix=/home/nate/do-not-backup/gcc-inst
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 15.0.1 20250218 (experimental) (GCC)

[Bug c++/118927] "double free detected in tcache 2" with import std and -Q

2025-02-18 Thread nate at thatsmathematics dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118927

--- Comment #1 from Nate Eldredge  ---
Created attachment 60525
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60525&action=edit
Preprocessor output for foo.cc

[Bug c++/118927] "double free detected in tcache 2" with import std and -Q

2025-02-18 Thread nate at thatsmathematics dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118927

--- Comment #2 from Nate Eldredge  ---
Created attachment 60526
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60526&action=edit
Preprocessor output for std.cc

[Bug c++/118927] "double free detected in tcache 2" with import std and -Q

2025-02-18 Thread nate at thatsmathematics dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118927

--- Comment #3 from Nate Eldredge  ---
Created attachment 60527
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60527&action=edit
stderr output from compilation

[Bug c++/118927] "double free detected in tcache 2" with import std and -Q

2025-02-18 Thread nate at thatsmathematics dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118927

--- Comment #5 from Nate Eldredge  ---
Oh, it's actually triggered by the compilation of a-foo.ii itself, but the
gcm.cache has to have previously been created.

The gcm.cache is too large to post as an attachment, but I can share it another
way if it's needed.

[Bug driver/118975] -undef is passed to the linker

2025-02-21 Thread nate at thatsmathematics dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118975

--- Comment #1 from Nate Eldredge  ---
I should have said, credit to StackOverflow user anol for finding this.

https://stackoverflow.com/questions/79457581/gcc-undef-leads-to-cannot-find-entry-symbol-start-defaulting-to-x/79457825#79457825

[Bug driver/118975] New: -undef is passed to the linker

2025-02-21 Thread nate at thatsmathematics dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118975

Bug ID: 118975
   Summary: -undef is passed to the linker
   Product: gcc
   Version: 14.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: driver
  Assignee: unassigned at gcc dot gnu.org
  Reporter: nate at thatsmathematics dot com
  Target Milestone: ---

When the `-undef` option is given to gcc for a command that involves linking,
the option is passed to the linker as well as the compiler.  This makes no
sense as `-undef` should only affect preprocessing, and it causes the linker to
misbehave.

When running `gcc -undef -v m.c` for a file `m.c` containing only `int
main(void) { }` in my test, the linker command line is:

/usr/libexec/gcc/aarch64-unknown-linux-gnu/14/collect2 [...] -undef .../Scrt1.o
[...] 

GNU ld parses this as equivalent to `--undefined=.../Scrt1.o`, so the path
`.../Scrt1.o` is treated as a symbol to be undefined, rather than as a file to
be linked.  I get a message about the `_start` entry point being undefined, and
the resulting executable doesn't work.
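
The parsing behavior can be modeled with `getopt_long_only`, which accepts
long options with a single dash and allows unambiguous abbreviations.  This is
only a sketch under the assumption that the linker's option parser behaves
this way; the one-entry option table and the function names are invented for
the example:

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>
#include <string.h>

/* Scans argv and returns the argument given to --undefined (or to any
   unambiguous abbreviation of it), or NULL if the option is absent. */
const char *undefined_symbol_arg(int argc, char **argv)
{
    static const struct option longopts[] = {
        { "undefined", required_argument, NULL, 'u' },
        { NULL, 0, NULL, 0 }
    };
    const char *symbol = NULL;
    int c;

    assert(argc >= 1);
    optind = 1;   /* restart scanning */
    opterr = 0;   /* no diagnostics on stderr */
    while ((c = getopt_long_only(argc, argv, "", longopts, NULL)) != -1) {
        if (c == 'u')
            symbol = optarg;
    }
    return symbol;
}

/* Models the command line above: "-undef" followed by an object file.
   Returns 1 if the object file was consumed as the option's argument. */
int undef_swallows_next_arg(void)
{
    char a0[] = "ld", a1[] = "-undef", a2[] = "Scrt1.o";
    char *argv[] = { a0, a1, a2, NULL };
    const char *sym = undefined_symbol_arg(3, argv);
    return sym != NULL && strcmp(sym, "Scrt1.o") == 0;
}
```

In this model `-undef` is matched as an abbreviation of `--undefined`, and
because that option requires an argument, the following word is taken as the
symbol name instead of as an input file.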

Tested locally with Gentoo's ebuild of 14.2.1_p20241221 p7 on arm64, but also
reproducible with all recent versions on godbolt, e.g.
https://godbolt.org/z/3s6qGex8o