https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121262
Bug ID: 121262
Summary: (x86) GCC sometimes produces 'cmp' instructions of larger register width
Product: gcc
Version: 15.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: Explorer09 at gmail dot com
Target Milestone: ---

I'm filing this issue for x86 architectures only, but the issue might also exist for other architecture targets.

When a value is loaded from a pointed-to buffer and stored in a variable of a larger bit width, GCC can recognize that the upper bits of the new variable are zero, but it does not use that fact when producing the 'cmp' instruction.

An example shows the issue:

```c
#include <stdint.h>

uint64_t func1_a(uint32_t *p) {
    uint64_t value = *p;
    value &= 0xFFFFFFFF; // Should be no-op
    if (value < 0x12345678) {
        return value;
    }
    return 0x40000000;
}

uint64_t func1_b(uint32_t *p) {
    uint64_t value = *p;
    value &= 0xFFFFFFFF; // Should be no-op
    if ((uint32_t)value < 0x12345678) {
        return value;
    }
    return 0x40000000;
}
```

x86-64 gcc 15.1 with the '-Os' option (I tested this in Compiler Explorer) produces:

```assembly
func1_a:
        movl    (%rdi), %eax
        movl    $1073741824, %edx
        cmpq    $305419896, %rax
        cmovnb  %rdx, %rax
        ret
func1_b:
        movl    (%rdi), %eax
        movl    $1073741824, %edx
        cmpl    $305419896, %eax
        cmovnb  %edx, %eax
        ret
```

Note that "cmp rax, <constant>" (`cmpq` in the listing above) is used in `func1_a`. The 64-bit compare is unnecessary and makes the code one byte larger than "cmp eax, <constant>" (`cmpl`).

This issue does not appear when "value" does not come from a pointed-to buffer. (You can compare the code with `func1_noptr` in the additional test code below.)
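As a sanity check on the equivalence claimed above, here is a minimal sketch. The helper names `below_limit_wide`, `below_limit_narrow` and `forms_agree` are hypothetical and not part of the test case. They only illustrate that a zero-extended 32-bit load leaves bits 32..63 clear, so comparing at either width should give the same answer.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of the test case. */

/* Compare after widening to 64 bits, mirroring the condition in func1_a. */
static int below_limit_wide(const uint32_t *p)
{
    uint64_t value = *p;            /* zero-extended: bits 32..63 are 0 */
    return value < 0x12345678;
}

/* Compare at the original 32-bit width, mirroring the condition in func1_b. */
static int below_limit_narrow(const uint32_t *p)
{
    uint64_t value = *p;
    return (uint32_t)value < 0x12345678;
}

/* Returns 1 if both comparison forms agree for the given input. */
static int forms_agree(uint32_t x)
{
    return below_limit_wide(&x) == below_limit_narrow(&x);
}
```

Calling `forms_agree` on the boundary values (0, 0x12345677, 0x12345678, 0xFFFFFFFF) is enough to exercise both sides of the comparison.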
Additional test code:

```c
uint32_t func2_a(uint16_t *p) {
    uint32_t value = *p;
    value &= 0xFFFF; // Should be no-op
    if (value < 0x1234) {
        return value;
    }
    return 0x4000;
}

uint32_t func2_b(uint16_t *p) {
    uint32_t value = *p;
    value &= 0xFFFF; // Should be no-op
    if ((uint16_t)value < 0x1234) {
        return value;
    }
    return 0x4000;
}

uint32_t func3_a(uint8_t *p) {
    uint32_t value = *p;
    value &= 0xFF; // Should be no-op
    if (value <= 0x7F) {
        return value;
    }
    return (uint32_t)-1;
}

uint32_t func3_b(uint8_t *p) {
    uint32_t value = *p;
    value &= 0xFF; // Should be no-op
    if ((uint8_t)value <= 0x7F) {
        return value;
    }
    return (uint32_t)-1;
}

// `func1_noptr`, `func2_noptr` and `func3_noptr` have no issues
uint64_t func1_noptr(uint32_t x) {
    uint64_t value = x;
    if (value < 0x12345678) {
        return value;
    }
    return 0x40000000;
}

uint32_t func2_noptr(uint16_t x) {
    uint32_t value = x;
    if (value < 0x1234) {
        return value;
    }
    return 0x4000;
}

uint32_t func3_noptr(uint8_t x) {
    uint32_t value = x;
    if (value <= 0x7F) {
        return value;
    }
    return (uint32_t)-1;
}
```

The 8-bit version of the test code, `func3`, is the actual problem I'm facing. I want to check whether a byte in an array is in the [0x00, 0x7F] range (this is part of checking whether a string is ASCII), and GCC misses the opportunity to produce smaller code there. `func3_a` and `func3_b` should be equivalent.
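For context, the ASCII check mentioned above could look roughly like the sketch below. The function name `is_ascii` and its signature are assumptions made for illustration; the real code is not shown in this report. Each byte is widened to `uint32_t` before the range test, which is the same pattern as `func3_a`.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the report): returns 1 if every byte in
 * buf[0..len) lies in the ASCII range [0x00, 0x7F], and 0 otherwise.
 * An empty buffer counts as ASCII.
 */
static int is_ascii(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        uint32_t value = buf[i];    /* widened load, as in func3_a */
        if (value > 0x7F) {
            return 0;               /* high bit set: not ASCII */
        }
    }
    return 1;
}
```

Since the loaded byte is zero-extended, writing the test as `(uint8_t)value > 0x7F` instead would give the same result; only the width of the generated compare might differ.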