https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117421
Alexander Monakov <amonakov at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #5 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
Not necessarily: lbu-li-bne is fine when mismatch on the first character is
expected, or when you are evaluating the compiler using dynamic instruction
count under Qemu as RISC-V developers in GCC sometimes do today.

Your load-xor-or sequence could be useful on all targets, not just RISC-V, if
we expect misprediction penalties on early characters to outweigh extra work,
but the compiler has little basis to make that decision. We probably could
fine-tune it a little, e.g. if we know that load ports vs. branch units are
balanced 2:1, we should try to do two loads per each branch using your xor-or
reduction.

But then again, it won't change anything for the "default" RISC-V codegen.
-mtune=generic-ooo also implies fast unaligned access.

I feel this bugreport became a bit unfocused. For the benefit of anyone looking
at it in future, can you restate what it is about?
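
For readers following the thread, the three shapes mentioned in comment #5 can be sketched at the C source level. This is only an illustration under stated assumptions: the function names and the 4-byte constant "abcd" are hypothetical and do not come from the bug report, and an optimizing compiler is free to turn any of these forms into another, so the source shape does not guarantee the instruction sequence.

```c
/* Hypothetical helpers: test whether the 4 bytes at p equal "abcd". */

/* One load and one conditional branch per character, roughly the
   lbu-li-bne shape.  Exits at the first mismatch, which is cheap when
   an early mismatch is the expected case. */
static int is_abcd_branchy(const unsigned char *p)
{
    if (p[0] != 'a') return 0;
    if (p[1] != 'b') return 0;
    if (p[2] != 'c') return 0;
    if (p[3] != 'd') return 0;
    return 1;
}

/* load-xor-or reduction: every byte is loaded and XORed with the
   expected value, the differences are ORed together, and a single
   test decides the result.  More work, but only one branch. */
static int is_abcd_xor_or(const unsigned char *p)
{
    unsigned diff = (unsigned)(p[0] ^ 'a');
    diff |= (unsigned)(p[1] ^ 'b');
    diff |= (unsigned)(p[2] ^ 'c');
    diff |= (unsigned)(p[3] ^ 'd');
    return diff == 0;
}

/* Middle ground suggested for a 2:1 balance of load ports to branch
   units: two loads feed each branch through a small xor-or reduction. */
static int is_abcd_paired(const unsigned char *p)
{
    if (((p[0] ^ 'a') | (p[1] ^ 'b')) != 0) return 0;
    if (((p[2] ^ 'c') | (p[3] ^ 'd')) != 0) return 0;
    return 1;
}
```

All three return the same result for any input; they differ only in how many branches sit between the first load and the final answer, which is the trade-off the comment says the compiler has little basis to decide.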