Matthew Fortune <matthew.fort...@imgtec.com> writes:
> > >> > Do you have an objection to allowing an option to disable these instructions (despite the reason for wanting to do so)?
> > >>
> > >> Yes this seems like a bad workaround for broken code.
> > >>
> > > Well, we work around broken hardware all the time. What would you suggest that Steve do instead?
> >
> > We work around broken hardware all the time yes but that is something
> > which is hard to change. Software is easier to fix. That does not
> > mean in the compiler. The compiler team should not get themselves in
> > the business of working around broken software that depends on
> > undefined behavior (this undefined behavior is very obvious and easier
> > to fix in the source of the problem). Report the bug to Google.
>
> I'm mostly in agreement with Andrew on this. I'd like to understand more
> about the use-case and what happens for other architectures (only ARM
> and x86 are really relevant here I guess). I believe x86 will just
> happily load/store doubles to 4-byte aligned addresses but I'm not sure
> for ARM. If ARM has to take unaligned access faults for this then I
> think MIPS simply has to do the same.
>
> Unless the application is allocating thousands of small chunks of data
> and the padding is significant in the overall memory usage then I'd have
> thought the allocator could be fixed relatively easily. I'd also expect
> that to be generally a performance win for the application on all
> architectures.
>
> 'If' we were to make a change to the compiler then I would make it a
> little more generic/focus on the actual issue which relates to alignment.
> I'm not sure what I would call the option but its effect would be for
> the compiler to not rely on any greater than 'X' alignment and disable
> the use of any load/store instructions which need larger alignment.
>
> That still feels like a bad feature though.
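For reference, fixing this at the source could look roughly like the sketch below: either round allocations up to 8-byte alignment, or go through memcpy when a double may only be 4-byte aligned. The helper names are hypothetical and not taken from the application or from any patch in this thread.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a double from an address that may only be
   4-byte aligned. memcpy makes no alignment assumption about p, so the
   compiler has to pick loads that are safe for any alignment. */
static double load_double_unaligned(const void *p)
{
    double d;
    memcpy(&d, p, sizeof d);
    return d;
}

/* Hypothetical helper: the matching store. */
static void store_double_unaligned(void *p, double d)
{
    memcpy(p, &d, sizeof d);
}

/* Hypothetical helper: round an address up to the next multiple of 8,
   as an allocator could do before handing out storage for doubles. */
static uintptr_t align_up_8(uintptr_t addr)
{
    return (addr + 7u) & ~(uintptr_t)7u;
}
```

The allocator-side fix (align_up_8) is the one that keeps LDC1/SDC1 usable; the memcpy route only avoids the undefined behavior at the access site.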
Opinions aside, there are actually some technical problems with this feature. In particular, we have to consider that the O32 ABI is about to start its long-awaited transition to use 64-bit floating-point registers.

We know that removing LDC1/SDC1 is possible for the original O32 FP32 ABI, as doubles just get handled with pairs of LWC1/SWC1. Now if we consider O32 FP64, it is not possible to use LWC1 to access the upper 32 bits of doubles, but we do have MTHC1/MFHC1, so the data would have to move via GPRs. That doubles the penalty in a way, as GPR-FPR transfers are generally slow. (I think the patch as it stands will fail if used with O32 FP64 but I haven't looked in detail.)

The bigger problem comes from the transitional ABI, O32 FPXX. Firstly, O32 FPXX on MIPS32r2 onwards: this will be OK as it has MTHC1/MFHC1 so can use the same solution as O32 FP64. Next, O32 FPXX on MIPS II: this is simply impossible; there is no way to load a double-precision FPR given the constraints of the O32 FPXX ABI and the removal of LDC1/SDC1.

While the patch could be fixed to account for all of this, we need to determine if the case which won't work is the exact one which is needed (and I believe it is). I.e. Android applications for MIPS are MIPS32r1 O32 FP32 NAN1985 currently and will transition from FP32 to FPXX to enable the use of things like MSA. Since MIPS32r1 does not have MTHC1/MFHC1, this solution will not work.

We'll have to talk with the Google engineers to explain the issues.

Matthew
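As a rough summary of the cases above, here is a small C model of how a double could be accessed once LDC1/SDC1 are removed. The enum, struct and function names are made up for illustration and are not part of GCC or the patch; `has_mxhc1` stands for MTHC1/MFHC1 being available (MIPS32r2 onwards).

```c
#include <stdbool.h>

/* The three O32 floating-point ABI variants discussed in this mail. */
enum fp_abi { O32_FP32, O32_FP64, O32_FPXX };

/* Hypothetical target description. */
struct target {
    enum fp_abi abi;
    bool has_mxhc1;   /* MTHC1/MFHC1 available (MIPS32r2 onwards) */
};

/* How a double can be moved to/from memory without LDC1/SDC1. */
enum dbl_access { VIA_LWC1_PAIR, VIA_GPR_MXHC1, IMPOSSIBLE };

static enum dbl_access double_access_without_ldc1(struct target t)
{
    switch (t.abi) {
    case O32_FP32:
        /* Doubles live in register pairs: two LWC1/SWC1 suffice. */
        return VIA_LWC1_PAIR;
    case O32_FP64:
    case O32_FPXX:
        /* Upper 32 bits are only reachable through MTHC1/MFHC1,
           so the data has to go via GPRs; without them there is
           no option left. */
        return t.has_mxhc1 ? VIA_GPR_MXHC1 : IMPOSSIBLE;
    }
    return IMPOSSIBLE;
}
```

With `{ O32_FPXX, false }`, which corresponds to the MIPS32r1 Android case after the move to FPXX, the model lands on IMPOSSIBLE.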