[Bug c/93432] New: variable is used uninitialized, but gcc shows no warning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93432 Bug ID: 93432 Summary: variable is used uninitialized, but gcc shows no warning Product: gcc Version: 9.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: sven.koehler at gmail dot com Target Milestone: ---

Consider the following code:

#include <stdio.h>

int test53(int y) {
  int z;
  for (int x = 0; x < 10; x = x + 1, y = y + 1, z = z + 1) {
    if (y < 10) {
      continue;
    }
    z = 1;
  }
  return z;
}

int main() {
  printf("%d\n", test53(0));
  printf("%d\n", test53(5));
  printf("%d\n", test53(10));
  return 0;
}

If y < 10 in the first iteration, the variable z is used uninitialized. Clearly, the continue can potentially skip the initialization of z. Even with -Wall and -Wextra, gcc does not produce a warning.
[Bug c/93432] variable is used uninitialized, but gcc shows no warning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93432 --- Comment #3 from Sven --- I'm not sure how you optimize the uninit use away. When running the example, the first printf typically yields a random value. So the uninitialized value is certainly used as a return value.
[Bug other/58133] New: GCC should emit arm assembly following the unified syntax
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58133 Bug ID: 58133 Summary: GCC should emit arm assembly following the unified syntax Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: other Assignee: unassigned at gcc dot gnu.org Reporter: sven.koehler at gmail dot com

I'm not aware of any command line switch to make gcc generate unified syntax. The asm code that gcc generates in thumb mode follows the old divided syntax. All documentation by Atmel these days is about the unified syntax. Also, binutils has decided to disassemble to unified syntax by default. A code example and how to reproduce the issue is given below.

Command:

arm-softfloat-eabi-gcc -mcpu=arm7tdmi -O2 -mthumb -S -o - foo.c

Contents of foo.c:

int main() {
  return 0;
}

Generated assembly (divided syntax):

main:
	mov	r0, #0
	bx	lr

Disassembly (unified syntax):

   0:	2000	movs	r0, #0
   2:	4770	bx	lr

According to the documentation, mov r0, #0 is not a valid 16-bit instruction. The documentation however assumes the unified syntax. http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489i/Cihcdbca.html

It might be argued that the transition from classic to unified syntax is too error-prone or too much work. For me, it causes some trouble. I will have to switch between unified syntax and divided syntax in inline assembly, since gcc uses divided and clang uses unified syntax.
[Bug other/58133] GCC should emit arm assembly following the unified syntax
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58133 --- Comment #1 from Sven --- It seems that for targets like -mcpu=cortex-m4, gcc does generate unified syntax. So is the unified syntax only used for newer targets that use the thumb2 instruction set, whereas the divided syntax is used for older thumb1 targets?
[Bug other/58133] GCC should emit arm assembly following the unified syntax
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58133 --- Comment #3 from Sven ---

(In reply to Richard Earnshaw from comment #2)
> This is a known issue.

So what needs to be done? Where do I find the source/configuration/whatever of the code generator for thumb mode?
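[Editorial note] One workaround for mixing compilers that disagree on the default is to switch the assembler syntax explicitly inside inline asm using the gas `.syntax` directive. The sketch below is ARM-only (it will not assemble on other hosts) and the helper name is mine; newer GCC releases also accept -masm-syntax-unified to declare that inline assembly is written in unified syntax.

```c
/* Sketch (assumption: ARM target, gas assembler): force unified syntax
 * for this statement, then restore the divided syntax that older GCC
 * thumb output assumes. */
static inline unsigned add_one(unsigned x)
{
    unsigned r;
    __asm__(".syntax unified\n\t"
            "adds %0, %1, #1\n\t"   /* unified-syntax form of the add */
            ".syntax divided"       /* restore for surrounding compiler output */
            : "=r"(r)
            : "r"(x)
            : "cc");
    return r;
}
```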
[Bug c/51628] __attribute__((packed)) is unsafe in some cases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51628 --- Comment #49 from Sven ---

(In reply to W.H. Ding from comment #47)
> Hi, everyone
>
> I wonder if this issue has to do with the bug-like problem I encountered
> when accessing an unaligned stand-alone global variable (rather than a
> member of a packed struct). A test case is as follows:
>
> char g_c = 'x';
> int g_d __attribute__((aligned(1))) = 13;
>
> int main(void)
> {
>   g_c = 'z';
>   g_d = 33; // Crash on, in my case, ARM Cortex-M0
>   return 0;
> }

This doesn't work. The aligned attribute is for providing additional alignment hints. The GCC documentation clearly states that aligned can increase the alignment. So g_d is still 4 byte aligned, and correctly so. Also, you cannot use the packed attribute (which reduces the alignment to 1) for simple types. You can only use it for structs. Try a packed struct that contains a single int. That will work.
[Bug c/51628] __attribute__((packed)) is unsafe in some cases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51628 --- Comment #50 from Sven ---

(In reply to Sven from comment #49)
> This doesn't work. The aligned attribute is for providing additional
> alignment hints. The GCC documentation clearly states that aligned can
> increase the alignment. So g_d is still 4 byte aligned, and correctly so.

Submitted too soon. Should have been: "The GCC documentation clearly states that the aligned attribute can only increase the alignment."
[Bug c/51628] __attribute__((packed)) is unsafe in some cases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51628 --- Comment #41 from Sven ---

(In reply to Alexey Salmin from comment #39)
> .. when the packed attribute is preserved in the pointer.

What do you mean by that? GCC documentation explicitly forbids using the packed attribute for anything but structs, unions, and enums.
[Bug c/51628] __attribute__((packed)) is unsafe in some cases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51628 --- Comment #44 from Sven ---

(In reply to Alexey Salmin from comment #42)
> typedef struct unaligned_int128_t_ {
>   __int128_t value;
> } __attribute__((packed)) unaligned_int128_t;

You can combine the packed attribute with the aligned attribute. Then you can define one struct with aligned(4) and one with aligned(8). Does the warning trigger if you cast between those types? Or does the cast simply override the warning, because it's now the programmer's responsibility to make sure that the alignment is correct?
[Bug target/48429] ARM __attribute__((interrupt("FIQ"))) not optimizing register allocation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48429 --- Comment #2 from Sven --- Even 8 years later, this bug is not fixed (gcc 8.3). I believe clang/llvm has the same problem. Anyhow, that's not the only problem. The moment that a function is called, registers r0 to r3 (and maybe others) have to be saved. Then again, you probably don't call a function in your FIQ since you want it to be fast. With the help of gcc's generated assembly, I write my FIQ handler in assembler. I don't mind, but fixing this would be very convenient.
[Bug c/86968] New: Unaligned big-endian access on armv7-a yields 4 ldrb instructions rather than ldr+rev
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86968 Bug ID: 86968 Summary: Unaligned big-endian access on armv7-a yields 4 ldrb instructions rather than ldr+rev Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: sven.koehler at gmail dot com Target Milestone: --- Created attachment 44547 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44547&action=edit C source code illustrating the problem

armv7-a supports unaligned memory access. In case of unaligned little-endian access, gcc generates a single ldr instruction. Also, for aligned big-endian access, gcc generates an ldr followed by a rev instruction (which reverses the byte order). However, when the big-endian access is not aligned, gcc does not use ldr+rev. Instead, it generates 4 ldrb instructions plus the code to combine the 4 bytes into a single register. Find the source and the generated assembler code attached. My compiler command was:

arm-none-eabi-gcc -O3 -mthumb -S -o - -march=armv7-a endian.c

The version of gcc is 8.2.0.
[Bug c/86968] Unaligned big-endian access on armv7-a yields 4 ldrb instructions rather than ldr+rev
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86968 --- Comment #1 from Sven --- Created attachment 44548 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44548&action=edit the generated assembler code
[Bug middle-end/86968] Unaligned big-endian (scalar_storage_order) access on armv7-a yields 4 ldrb instructions rather than ldr+rev
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86968 --- Comment #3 from Sven --- I'm not familiar with GCC internals, so my following comments may be completely off. This has been classified as a "missed optimization". I would not expect the optimizer to change 4 ldrb into a single ldr. This seems like a code generation issue to me, not an optimization issue. As can be seen with -O0, the aligned big-endian access uses a single ldr, but the unaligned big-endian access uses the 4 ldrb. It seems gcc behaves like this:

if (big_endian_access) {
  if (aligned_access) {
    issue "ldr"
    issue "rev"
  } else {
    issue 4 "ldrb" in big-endian order
  }
} else {
  if (aligned_access || arch_supports_unaligned) {
    issue "ldr"
  } else {
    issue 4 "ldrb" in little-endian order
  }
}

When instead, gcc should behave like this:

if (aligned_access || arch_supports_unaligned) {
  issue "ldr"
  if (big_endian_access) {
    issue "rev"
  }
} else {
  if (big_endian_access) {
    issue 4 "ldrb" in big-endian order
  } else {
    issue 4 "ldrb" in little-endian order
  }
}
[Bug c/69502] attribute aligned reduces alignment contrary to documentation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69502 --- Comment #5 from Sven ---

(In reply to sandra from comment #4)
> Fixed on trunk.

It's a good thing that the documentation reflects the behavior of gcc. But on the other hand, having the aligned attribute work in both directions is a bad idea, IMHO. Using an attribute to specify an alignment guarantee (setting a lower bound on the actual alignment) is a benign thing. However, forcibly lowering the alignment guarantee usually indicates some sort of "trickery" that may force the compiler to circumvent certain limitations of the underlying platform. These two concepts (increasing alignment, lowering alignment) should be kept strictly separate.
[Bug c/51628] __attribute__((packed)) is unsafe in some cases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51628 --- Comment #28 from Sven ---

(In reply to Eric Gallager from comment #27)
> gcc 8 adds -Wpacked-not-aligned; does that fix this bug?

I couldn't find documentation on what this switch is supposed to do. Can you point me in the right direction? Is there some commit explaining -Wpacked-not-aligned? FYI: the LLVM developers have added -Waddress-of-packed-member, which addresses the issue discussed here.
[Bug c/51628] __attribute__((packed)) is unsafe in some cases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51628 --- Comment #31 from Sven --- https://gcc.gnu.org/viewcvs/gcc/trunk/?view=log&pathrev=251180 I am reading the commit message, and the example doesn't make any sense. The aligned attribute is for providing additional alignment guarantees in addition to the type's default alignment. In particular, the documentation of the aligned type attribute specifically states that this attribute cannot decrease alignment. It can only _increase_ alignment. In particular, aligned(4) in combination with a 64-bit type makes little sense.
[Bug c/51628] __attribute__((packed)) is unsafe in some cases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51628 --- Comment #34 from Sven ---

(In reply to H.J. Lu from comment #32)
> long long is aligned to 4 bytes in struct for i386.

Understood. So the aligned(4) was just added to explicitly restate the alignment? Anyhow, the two warnings added by that commit seem unrelated to this issue. The warnings added check the alignment of variables and members in memory. When they fall below the given threshold, a warning is issued. This bug report however is about a different issue. Every member of a packed struct has an alignment guarantee of 1 - no more than that. However, a regular int* pointer, when dereferenced, guarantees an alignment of 4 (on i386 and arm, for example). So the following code should produce a warning:

struct foo {
  int i;
} __attribute__((packed));

struct foo *x;
int *y = &(x->i);

The issue becomes more obvious with this struct:

struct foo2 {
  char c;
  int i;
} __attribute__((packed));

However, both foo and foo2 only have an alignment guarantee of at most 1, so also the int member inside both structs has an alignment guarantee of at most 1. As mentioned in the initial post, gcc will generate the proper machine code to read an unaligned int (on arm for example) when using x->i directly. However, when using *y, gcc will assume 4 byte alignment. That is to be expected, hence gcc should warn about the fact, and the address of a (potentially) unaligned int is assigned to a regular int* pointer.
[Bug c/51628] __attribute__((packed)) is unsafe in some cases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51628 --- Comment #35 from Sven ---

(In reply to Sven from comment #34)
> That is to be expected, hence gcc should warn about the fact, and the
> address of a (potentially) unaligned int is assigned to a regular int*
> pointer.

Sorry, typo: That is to be expected, hence gcc should warn about the fact that the address of a (potentially) unaligned int is assigned to a regular int* pointer.
[Bug c/51628] __attribute__((packed)) is unsafe in some cases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51628 --- Comment #23 from Sven --- FYI: I have asked the llvm folks to add a warning to their compiler for when a pointer to a member of a packed struct is assigned to an "ordinary" pointer with higher alignment guarantees. Clearly, I agree with comment #18 that compilers should warn about this. See https://llvm.org/bugs/show_bug.cgi?id=22821
[Bug c/51628] __attribute__((packed)) is unsafe in some cases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51628 --- Comment #24 from Sven --- Comment #4 mentions

typedef int myint __attribute__((aligned(1)));

That shouldn't even work. The GCC documentation on Type Attributes mentions that "The aligned attribute can only increase the alignment". It goes on to mention the packed attribute, which can be used to decrease alignment. But as far as I know, that attribute was designed for structs. Anyhow, it seems that the aligned attribute is intended for increasing the alignment only - not for decreasing. Yet, when I checked __alignof(myint) with both gcc and clang, it was in fact decreased from 4 to 1. Not sure why. That seems to contradict the documentation.
[Bug c/97833] New: -Wconversion behaves erratically
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97833 Bug ID: 97833 Summary: -Wconversion behaves erratically Product: gcc Version: 10.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: sven.koehler at gmail dot com Target Milestone: --- Created attachment 49559 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49559&action=edit non-working example

Find attached an example for which -Wconversion behaves incomprehensibly. Why does it yield a warning for test2 but not for test1 and test3? This happens with gcc for 32-bit arm and gcc for x86_64. In all 3 functions, we have 2 shift operations. The operand is uint16_t, which is promoted to int. The result of each shift operation is cast to uint16_t, so the operands of the bitwise or are again uint16_t. Both operands of the bitwise or are then promoted to int. So basically, in all 3 cases the code is returning an int. However, -Wconversion warns only in 1 case. Also, why does it matter whether x is shifted by 0 or 1? Why does a shift by 0 result in a warning, and a shift by 1 does not? Why does it matter whether x and y are originally uint8_t being cast to uint16_t (test2) or uint16_t (test3) originally? In both cases, the result of the shifts is cast to uint16_t. Is gcc trying to keep track of the range of the individual expressions? Is gcc somehow failing when a shift by 0 occurs? I believe that a shift by zero is defined behavior.
[Bug c/101950] New: __builtin_clrsb is never inlined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101950 Bug ID: 101950 Summary: __builtin_clrsb is never inlined Product: gcc Version: 11.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: sven.koehler at gmail dot com Target Milestone: ---

With gcc 11.1 on ARM 32-bit and Intel, I don't see that __builtin_clrsb is inlined. On AARCH64 it is inlined and the cls instruction is used, as expected. I use the C code below to compare the generated assembly. For ARM, I use -O3 -mcpu=cortex-a53 -marm and for Intel I just use -O3. On ARM 32-bit, clrsb1 seems to be the fastest code (see below for the assembly code), since clz handles zero correctly. On Intel, bsr does not handle zero, hence the workaround of setting the lsb before calling __builtin_clzl (see below for the assembly code). On Intel, clrsb1 is slightly longer and uses a jump to handle the zero case. clang apparently uses variant clrsb1 on ARM and Intel, and it's inlined on both architectures when using -O3.

#define SHIFT (sizeof(x)*8-1)

int clz(unsigned long x) {
  if (x == 0) {
    return sizeof(x)*8;
  }
  return __builtin_clzl(x);
}

int clsb(long x) {
  return clz(x ^ (x >> SHIFT));
}

int clrsb1(long x) {
  return clsb(x)-1;
}

int clrsb2(long x) {
  x = ((x << 1) ^ (x >> SHIFT)) | 1;
  return __builtin_clzl(x);
}

int clrsb3(long x) {
  return __builtin_clrsbl(x);
}

on ARM 32-bit:

clrsb1:
	eor	x0, x0, x0, asr 63
	clz	x0, x0
	sub	w0, w0, #1
	ret

on Intel:

clrsb2:
	lea	rax, [rdi+rdi]
	sar	rdi, 63
	xor	rax, rdi
	or	rax, 1
	bsr	rax, rax
	xor	eax, 63
	ret
[Bug middle-end/101973] New: subtraction of clz is not optimized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101973 Bug ID: 101973 Summary: subtraction of clz is not optimized Product: gcc Version: 11.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: sven.koehler at gmail dot com Target Milestone: ---

On Intel x86_64, the generated code for __builtin_clz(x) looks something like this: clz(x) = 63 - bsr(x). Since Intel does not seem to have a way to do 63-y in a single instruction, XOR is used instead and the actual assembly code corresponds to clz(x) = 63 ^ bsr(x). Since bsr(x) is in the range 0 to 63, the XOR with 63 is equivalent to 63 - bsr(x). However, when we actually need the index of the most significant non-zero bit, we have another 63-y, as in this function:

int bsr(unsigned long x) {
  return sizeof(x)*8 - 1 - __builtin_clzl(x);
}

With -O3, GCC emits the following assembly code:

bsr:
	bsr	rdi, rdi
	mov	eax, 63
	xor	rdi, 63
	sub	eax, edi
	ret

The XOR with 63 and the subtraction from 63 cancel each other out in this special case. LLVM/clang performs this optimization. One might also consider the arbitrary case of z-clz(x) as a test case. On Intel, this is equivalent to bsr(x)+(z-63).
[Bug middle-end/101973] subtraction of clz is not optimized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101973

Sven changed:

           What    |Removed     |Added
---------------------------------------
             Status|UNCONFIRMED |RESOLVED
         Resolution|---         |FIXED

--- Comment #2 from Sven --- OK. Closing this myself.