Hi Richard,

> Could you give details? I thought it was always known that trapped
> system register accesses were slow. In the previous versions, the
> checks seemed to be presented as an up-front price worth paying for
> faster atomic operations, on the systems that would use those paths.
> Now the checks are being presented as something that are good to remove
> to make the code simpler and faster.

The system register checks came from early versions (~2 years ago) when there was no HWCAP defined yet, so Victor added them for testing. The idea was to leave them for a while so you could get new atomics on an older kernel, and remove them once newer kernels became available.

> There have been a few changes to this code in the current release cycle,
> and each time it seems like the new version is being presented as better
> than the previous one with single-sentence justifications.

I'm not sure which commits with single-sentence justifications you mean? There has been only 1 commit in the current cycle after the rcpc3 code was added, and that was a minor cleanup that also fixed a bug.

> Could we instead have a comment in the code discussing the various
> approaches that we could take, including the ones that previous versions
> took, describes the trade-offs, and explains why we've chosen to do what
> we've chosen to do?

This should be in the kernel documentation - the advice is: use HWCAPs and avoid system register reads. Note I fixed libgcc/config/aarch64/cpuinfo.c to remove all system register reads as well.

Cheers,
Wilco
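
P.S. To illustrate the "use HWCAPs, avoid system register reads" advice, here is a minimal sketch of feature selection driven purely by the AT_HWCAP word. It is not the libgcc code: the enum and the names select_atomics_path, hwcap_has and current_hwcap are made up for this example, and the fallback bit values are assumed to match the Linux AArch64 uapi definitions of HWCAP_ATOMICS and HWCAP_USCAT.

```c
#include <assert.h>
#if defined(__linux__)
#include <sys/auxv.h>   /* getauxval, AT_HWCAP */
#endif

/* Fallbacks for non-AArch64 builds; assumed to match the Linux
   AArch64 uapi values.  */
#ifndef HWCAP_ATOMICS
#define HWCAP_ATOMICS (1UL << 8)    /* LSE atomics */
#endif
#ifndef HWCAP_USCAT
#define HWCAP_USCAT (1UL << 25)     /* LSE2 (unaligned single-copy atomicity) */
#endif

/* Hypothetical names, for illustration only.  */
enum atomics_path { ATOMICS_LLSC, ATOMICS_LSE, ATOMICS_LSE2 };

/* True if every bit in MASK is set in HWCAP.  */
static int
hwcap_has (unsigned long hwcap, unsigned long mask)
{
  assert (mask != 0);
  return (hwcap & mask) == mask;
}

/* Choose a code path from the HWCAP word alone: no MRS reads,
   so nothing can trap into the kernel.  */
static enum atomics_path
select_atomics_path (unsigned long hwcap)
{
  if (!hwcap_has (hwcap, HWCAP_ATOMICS))
    return ATOMICS_LLSC;            /* fall back to LDXR/STXR loops */
  if (hwcap_has (hwcap, HWCAP_USCAT))
    return ATOMICS_LSE2;
  return ATOMICS_LSE;
}

/* Read the HWCAP word from the auxiliary vector.  Returns 0 (no
   features) when not running on AArch64 Linux.  */
static unsigned long
current_hwcap (void)
{
#if defined(__linux__) && defined(__aarch64__)
  return getauxval (AT_HWCAP);
#else
  return 0;
#endif
}
```

A caller would do select_atomics_path (current_hwcap ()) once at startup and cache the result. An older kernel that does not report a given HWCAP bit simply gets the conservative path.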