On Thu, 20 Jan 2022 at 13:25, Idan Horowitz <idan.horow...@gmail.com> wrote:
>
> On Thu, 20 Jan 2022 at 14:32, Peter Maydell <peter.mayd...@linaro.org> wrote:
> >
> > But the code you are effectively removing is never executed
> > for the instructions where you're changing the access function.
> > If you're proposing this as a performance improvement, can
> > you provide before-and-after benchmarks demonstrating that
> > improvement ?
> >
>
> I wanted to say that in my micro-benchmark of DC IVAC I saw a 1%
> decrease in runtime, but I failed to replicate it again now, so I must
> have accidentally ran it together with one of my other patches last
> time.
> Sorry for wasting your time with the review.

No worries. Incidentally, it's not surprising that if you
microbenchmark the cache instructions the trap-checking appears
as a large component of it -- for QEMU cache ops are NOPs so
trap checking is the *only* thing that the instruction has to do.
It's probably worth looking at benchmarks of real workloads to
try to identify whether any particular instruction is a significant
component before spending much time on trying to improve its
performance.

-- PMM