On Sat, May 16, 2020 at 07:02:25PM -0700, David Miller wrote: > > The KS8851SNL/SNLI and KS8851-16MLL/MLLI/MLLU are very much the same pieces > > of silicon, except the former has an SPI interface, while the later has a > > parallel bus interface. Thus far, Linux has two separate drivers for each > > and they are diverging considerably. > > > > This series unifies them into a single driver with small SPI and parallel > > bus specific parts. The approach here is to first separate out the SPI > > specific parts into a separate file, then add parallel bus accessors in > > another separate file and then finally remove the old parallel bus driver. > > The reason for replacing the old parallel bus driver is because the SPI > > bus driver is much higher quality. > > What strikes me in these changes is all of the new indirect jumps in > the fast paths of TX and RX packet processing. It's just too much for > my eyes. :-) > > Especially in the presence of Spectre mitigations, these costs are > quite non-trivial. > > Seriously, I would recommend that instead of having these small > indirect helpers, just inline the differences into two instances of > the RX interrupt and the TX handler.
I agree. However in terms of performance there's a bigger problem: Previously ks8851.c (SPI driver) had 8-bit and 32-bit register accessors. The present series drops them and performs a 32-bit access as two 16-bit accesses and an 8-bit access as one 16-bit access because that's what ks8851_mll.c (16-bit parallel bus driver) does. That has a real, measurable performance impact because in the case of 8-bit accesses, another 8 bits need to be transferred over the SPI bus, and in the case of 32-bit accesses, *two* SPI transfers need to be performed. The 8-bit and 32-bit accesses happen in ks8851_rx_pkts(), i.e. in the RX hotpath. I've provided numbers for the performance impact and even a patch to solve them but it was dismissed and not included in the present series: https://lore.kernel.org/netdev/20200420140700.6632hztejwcgj...@wunner.de/ The reason given for the dismissal was that I had performed the measurements on 4.19 which is allegedly "long dead" (in Andrew Lunn's words). However I can assure you that performing two SPI transfers has not magically become as fast as performing one SPI transfer since 4.19. So the argument is nonsense. Nevertheless I was going to repeat the performance measurements on a recent kernel but haven't gotten around to that yet because the measurements need to be performed with CONFIG_PREEMPT_RT_FULL to be reliable (a vanilla kernel is too jittery), so I have to create a new branch with RT patches on the test machine, which is fairly involved and time consuming. I think it's fair that the two drivers are unified, but the performance for the SPI variant shouldn't be unnecessarily diminished in the process. Thanks, Lukas