On 28/4/18 1:11 am, William Busacker wrote:
>
> Can someone point me in the direction of material that can explain how
> RTEMS uses the MMU on an ARM processor (specifically the ARM11 that the
> Raspberry Pi uses)? I want to see if there are any optimizations I make
> in code to take better advantage of how the memory access system works.

I am not across all of the RPi configuration, so what follows is what I
know happens on other ARM devices.

The MMU setup is here:

 https://git.rtems.org/rtems/tree/bsps/arm/raspberrypi/start/mm_config_table.c

The values in the table are configured into the MMU by the BSP start
hooks, of which there are two. They are here for the RPi:

 https://git.rtems.org/rtems/tree/bsps/arm/raspberrypi/start/bspstarthooks.c

The hooks are called from the generic ARM start up code:

 https://git.rtems.org/rtems/tree/bsps/arm/shared/start/start.S

> My reasoning for trying this is I have a bit of software that is taking
> several 12x12 matrices and vectors of similar size and
> multiplying/adding/inverting them all together a few hundred times over.
> These are all float type, and while I know the sheer calculation time is
> quite high the measured execution time is much higher than it should be
> leading to me suspect that there is memory bottle neck. What I would
> like to know is how the caching system works so I can maybe make
> adjustments to take better advantage of the cache and possibly reduce
> execution time.

I hope the links above provide you with enough information to figure
this out. Please report back what you find.

I know that on a Zynq, which is initialised in a similar way, the memory
bandwidth is high and you need specialized hand-crafted NEON
instructions to get the maximum from a single ARM core.

What compiler flags are you using? I see in:

 https://git.rtems.org/rtems/tree/bsps/arm/raspberrypi/config/raspberrypi2.cfg

that the RPi2 has NEON. Are you using it? I know Eigen has explicit
vectorization for NEON and that makes a difference:

 http://eigen.tuxfamily.org/index.php?title=Main_Page

> This code is also being generated using Matlab's C Autocoder so that
> code itself isn't exactly readable (but being that I'm using a
> University Matlab license for free, I can't complain too much) so I'd
> like to try and keep manual adjustments to a minimum.
> If anyone knows of tricks to get autocoder to play nicer, that would
> be great too.

Sorry, I do not use it.

I would run Linux on a similar RPi, compile the code there, and compare
the compiler options and the generated code against the RTEMS build. I
would also run the code in some form of test and benchmark it.

Chris
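
The benchmarking suggestion above could be sketched roughly as follows.
This is only a minimal sketch, assuming row-major 12x12 float matrices
held in flat arrays. The names mat12_mul and bench_mat12_mul are made up
for illustration and come from neither RTEMS nor the MATLAB-generated
code. clock() is standard C, but its resolution differs between
platforms, so a large repetition count is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define N 12 /* matrix dimension taken from the question */

/* c = a * b for row-major N x N matrices stored as flat arrays.
 * The i-k-j loop order walks b and c along rows, so the inner
 * loop touches consecutive addresses. */
static void mat12_mul(const float *a, const float *b, float *c)
{
    for (size_t i = 0; i < N; ++i) {
        for (size_t j = 0; j < N; ++j)
            c[i * N + j] = 0.0f;
        for (size_t k = 0; k < N; ++k) {
            const float aik = a[i * N + k];
            for (size_t j = 0; j < N; ++j)
                c[i * N + j] += aik * b[k * N + j];
        }
    }
}

/* Returns the average CPU seconds per multiply over `reps` runs.
 * `checksum` receives a value derived from the results so the
 * compiler has a reason to keep the work. */
static double bench_mat12_mul(unsigned reps, float *checksum)
{
    static float a[N * N], b[N * N], c[N * N];
    float sum = 0.0f;

    assert(reps > 0);
    for (size_t i = 0; i < N * N; ++i) {
        a[i] = (float)(i % 7) * 0.5f;
        b[i] = (float)(i % 5) - 2.0f;
    }

    clock_t t0 = clock();
    for (unsigned r = 0; r < reps; ++r) {
        a[0] = (float)(r & 3u); /* vary an input between runs */
        mat12_mul(a, b, c);
        sum += c[r % (N * N)];  /* consume one result element */
    }
    clock_t t1 = clock();

    *checksum = sum;
    return (double)(t1 - t0) / CLOCKS_PER_SEC / reps;
}
```

Calling bench_mat12_mul(100000, &cs) from an RTEMS task and from a Linux
main() on similar hardware, built with the same and then with different
compiler options, gives two numbers to compare before looking at the
MMU or cache configuration.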