https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120221
Bug ID: 120221 Summary: Missed optimization related to switch handling Product: gcc Version: 15.1.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: christophe.jaillet at wanadoo dot fr Target Milestone: --- Hi, while looking at generate code around the GENMASK, FIELD_PREP and co macro in Linux, I compared the generated code from cht_wc_extcon_get_id() in drivers/extcon/extcon-intel-cht-wc.c I use default linux flags, so this should be compiled at -O2. The code looks like: #define CHT_WC_PWRSRC_USBID_MASK GENMASK(4, 3) #define CHT_WC_PWRSRC_USBID_SHIFT 3 #define CHT_WC_PWRSRC_RID_ACA 0 #define CHT_WC_PWRSRC_RID_GND 1 #define CHT_WC_PWRSRC_RID_FLOAT 2 static int cht_wc_extcon_get_id(struct cht_wc_extcon_data *ext, int pwrsrc_sts) { switch ((pwrsrc_sts & CHT_WC_PWRSRC_USBID_MASK) >> CHT_WC_PWRSRC_USBID_SHIFT) { case CHT_WC_PWRSRC_RID_GND: return INTEL_USB_ID_GND; case CHT_WC_PWRSRC_RID_FLOAT: return INTEL_USB_ID_FLOAT; case CHT_WC_PWRSRC_RID_ACA: return INTEL_USB_RID_A; default: return INTEL_USB_ID_FLOAT; } } The generated code, on x86_64 looks like: testl %eax, %eax jne .L142 movslq 8(%rsp), %rax shrq $3, %rax andl $3, %eax je .L106 cmpq $1, %rax jne .L107 In order to get rid of the >>, I tried to make a better use of FIELD_PREP_CONST and I changed the code into: #define CHT_WC_PWRSRC_USBID_MASK GENMASK(4, 3) #define CHT_WC_PWRSRC_RID_ACA FIELD_PREP_CONST(CHT_WC_PWRSRC_USBID_MASK, 0) #define CHT_WC_PWRSRC_RID_GND FIELD_PREP_CONST(CHT_WC_PWRSRC_USBID_MASK, 1) #define CHT_WC_PWRSRC_RID_FLOAT FIELD_PREP_CONST(CHT_WC_PWRSRC_USBID_MASK, 2) static int cht_wc_extcon_get_id(struct cht_wc_extcon_data *ext, int pwrsrc_sts) { switch (pwrsrc_sts & CHT_WC_PWRSRC_USBID_MASK) { case CHT_WC_PWRSRC_RID_GND: return INTEL_USB_ID_GND; case CHT_WC_PWRSRC_RID_FLOAT: return INTEL_USB_ID_FLOAT; case CHT_WC_PWRSRC_RID_ACA: return INTEL_USB_RID_A; default: return INTEL_USB_ID_FLOAT; } } The generated code now looks like: testl %eax, %eax jne .L142 movl 8(%rsp), %eax andl $24, %eax je .L106 cmpl $8, %eax jne .L107 The shrq is correctly removed, and we compare against 8 in place of 1 which looks correct. However, I wonder why the initial version used movslq and cmpq, while the other version only andl and cmpl. Using the 'q' version looks useless to me and not-optimal. Should gcc manage by itself to see that 'q' version are not needed in both cases?