https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70048
--- Comment #23 from Richard Henderson <rth at gcc dot gnu.org> ---
(In reply to Jiong Wang from comment #21)
> Please check the documentation at
> http://infocenter.arm.com/help/topic/com.arm.doc.uan0015b/
> Cortex_A57_Software_Optimization_Guide_external.pdf, page 14, the line
> describing "Load register, register offset, scale by 2".

Interesting that only HImode loads suffer that penalty, and that QI, SI
and DImode loads (scale by 4/8) don't.  But never mind.

> Agreed.  While double-checking the code, I found that for the
> performance-related "scale by 2" situation, the aarch64 backend has
> already made it an illegitimate address.
>
> There is the following check in aarch64_classify_address; the
> "GET_MODE_SIZE (mode) != 16" is catching that.
>
>   bool allow_reg_index_p =
>     !load_store_pair_p
>     && (GET_MODE_SIZE (mode) != 16 || aarch64_vector_mode_supported_p (mode))
>     && !aarch64_vect_struct_mode_p (mode);
>
> So if the address is something like (base for short + index * 2), then it
> will go through aarch64_legitimize_address.

Um, no, it won't.  That's a 16-byte test, not a 16-bit one.  So: not a
load/store pair & not 16 bytes & not a vector struct mode = allow_reg_index,
which means a (base + index * 2) HImode address stays legitimate and never
reaches aarch64_legitimize_address.

> Thus I think your second patch at #c10 with my minor modification is
> still the proper fix for the current stage.

That said, I agree.  I don't think we should change the move expanders at
this time.

Patch committed as approved.
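
To make the bytes-vs-bits point concrete, here is a minimal, self-contained
C sketch of the allow_reg_index_p condition.  The enum values and the
helpers (get_mode_size, vector_mode_supported_p, vect_struct_mode_p) are
hypothetical stand-ins for GCC's machine-mode machinery, not the real
internal API; load_store_pair_p is fixed to false for the single scalar
load case.

  #include <stdbool.h>
  #include <stdio.h>

  /* Hypothetical stand-ins for GCC's machine modes: the enumerator value
     is the mode size in bytes, mirroring what GET_MODE_SIZE returns.  */
  enum mode { QImode = 1, HImode = 2, SImode = 4, DImode = 8, TImode = 16 };

  static unsigned get_mode_size (enum mode m) { return (unsigned) m; }

  /* Assume plain scalar integer modes: none are vector modes.  */
  static bool vector_mode_supported_p (enum mode m) { (void) m; return false; }
  static bool vect_struct_mode_p (enum mode m) { (void) m; return false; }

  int
  main (void)
  {
    const bool load_store_pair_p = false;   /* a single scalar load */
    const enum mode modes[] = { QImode, HImode, SImode, DImode, TImode };

    for (unsigned i = 0; i < sizeof modes / sizeof modes[0]; i++)
      {
        enum mode mode = modes[i];
        /* Same shape as the condition quoted above.  */
        bool allow_reg_index_p =
          !load_store_pair_p
          && (get_mode_size (mode) != 16 || vector_mode_supported_p (mode))
          && !vect_struct_mode_p (mode);
        printf ("mode size %2u bytes: allow_reg_index_p = %d\n",
                get_mode_size (mode), allow_reg_index_p);
      }
    return 0;
  }

Running this prints allow_reg_index_p = 1 for the 2-byte HImode and 0 only
for the 16-byte mode: the "!= 16" test rejects 128-bit modes, so it does
nothing to keep register-index (scale by 2) addressing away from HImode
loads.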