On 08.02.24 11:43, Cedric Berger wrote:
Hello Sebastian,
On 08.02.2024 11:09, Sebastian Huber wrote:
Hello Cedric,
On 08.02.24 10:53, Cedric Berger wrote:
Hello,
I have a question: does RTEMS really want to support FPU operations in
ISRs?
Because if the answer is "no", then I believe that we could simplify
the RTEMS code (and for me the mental model of the whole thing) by
running the FPU with both FPCCR.ASPEN and FPCCR.LSPEN = 0.
This means that during an IRQ/exception, the simple exception frame (32
bytes) would always be used instead of a combination of the simple
and extended frame (32 or 116 bytes).
This would then improve the real-time guarantees of the system, by
having a shorter and more deterministic IRQ response time.
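As a rough illustration of the proposal above, here is a minimal C sketch that clears ASPEN and LSPEN in an FPCCR value. The bit positions (31 and 30) and the register address 0xE000EF34 are taken from the ARMv7-M architecture; the helper name fpccr_without_auto_save is hypothetical and not part of RTEMS.

```c
#include <assert.h>
#include <stdint.h>

/* FPCCR bit positions as defined by the ARMv7-M architecture. */
#define FPCCR_ASPEN (UINT32_C(1) << 31) /* automatic state preservation */
#define FPCCR_LSPEN (UINT32_C(1) << 30) /* lazy state preservation */

/* Both bits together form the architectural reset value of FPCCR. */
static_assert((FPCCR_ASPEN | FPCCR_LSPEN) == UINT32_C(0xC0000000),
              "ASPEN and LSPEN are the two top bits of FPCCR");

/*
 * Hypothetical helper (not an RTEMS function): returns the given FPCCR
 * value with ASPEN and LSPEN cleared and all other bits unchanged.
 */
static uint32_t fpccr_without_auto_save(uint32_t fpccr)
{
  return fpccr & ~(FPCCR_ASPEN | FPCCR_LSPEN);
}

/*
 * On a real target this would be applied to the memory-mapped register,
 * for example:
 *
 *   volatile uint32_t *fpccr = (volatile uint32_t *) 0xE000EF34;
 *   *fpccr = fpccr_without_auto_save(*fpccr);
 *
 * followed by the usual barriers. This part is left as a comment so that
 * the sketch also compiles on a host.
 */
```

With both bits cleared, the hardware would stack only the basic frame on exception entry, so any FPU state an ISR or a later thread switch might clobber would have to be saved by software.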
If you don't use the FPU in ISRs, then the overhead is just a space
overhead with the lazy FPU save/restore.
Yes, but it still means more cache lines, right?
Yes, but this should not really matter if we don't write to these lines.
And functions like _ARMV7M_Pendable_service_call and
_ARMV7M_Supervisor_call now have to save/restore 116 bytes instead of 32
bytes, right?
The functions are only used if you switch threads. If the ISR returns to
the interrupted thread immediately, then you don't have to save/restore
stuff.
If you switch from an ISR to another thread you have to save/restore
the volatile FPU context anyway.
Yes, my idea was to simply move d0-d7 out of struct
ARMV7M_Exception_frame and into struct Context_Control.
And killing functions like
_ARMV7M_Trigger_lazy_floating_point_context_save()
I am not sure if it is that simple if you implement the deferred FPU
switching.
This would also simplify the context switching code, by centralizing
the saving of the FPU context in RTEMS only, and enabling
optimisation like only saving/restoring the FPU when switching
between tasks defined with RTEMS_FLOATING_POINT.
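To make the optimisation mentioned above concrete, here is a host-side C sketch of a context switch that touches the FPU state only for tasks flagged as floating-point tasks. All names (task_context, is_fp_task, fpu_bank, switch_fpu_context) are hypothetical; the flag merely stands in for a task created with RTEMS_FLOATING_POINT, and the array simulates the FPU register bank. This is not the RTEMS implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define FPU_REG_COUNT 32 /* s0-s31 on a single-precision ARMv7-M FPU */

/* Hypothetical per-task context; not the RTEMS Context_Control layout. */
typedef struct {
  bool is_fp_task; /* stands in for the RTEMS_FLOATING_POINT attribute */
  uint32_t fpu_regs[FPU_REG_COUNT];
} task_context;

/* Simulated FPU register bank, so the sketch runs on a host. */
static uint32_t fpu_bank[FPU_REG_COUNT];

/*
 * Saves the FPU bank only if the executing task uses the FPU and restores
 * it only if the heir does. Returns the number of bank copies performed.
 */
static unsigned switch_fpu_context(task_context *executing,
                                   const task_context *heir)
{
  unsigned copies = 0;

  assert(executing != NULL && heir != NULL);

  if (executing->is_fp_task) {
    memcpy(executing->fpu_regs, fpu_bank, sizeof(fpu_bank));
    ++copies;
  }

  if (heir->is_fp_task) {
    memcpy(fpu_bank, heir->fpu_regs, sizeof(fpu_bank));
    ++copies;
  }

  return copies;
}
```

A switch between two integer-only tasks performs no copies in this sketch, which is where any saving would come from. It relies on the compiler never emitting FPU instructions in tasks without the flag.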
What do you think? Am I missing something? Would it be a good idea?
From experience, working with the RTEMS_FLOATING_POINT in applications
is quite annoying. Is there really a measurable and significant
performance improvement if you enable the deferred FPU switching? Can
you guarantee that the compiler will not generate FPU or vector
instructions for integer operations? In this version or a GCC release
in the future?
Obviously, since I'm not God, I won't be able to provide any guarantee
regarding the future :)
But I believe that if GCC started to use FPU for integer operations,
many people would complain:
FreeBSD requires fpu_kern_enter/fpu_kern_leave to use FPU in the kernel,
and Linux requires kernel_fpu_begin/kernel_fpu_end to use FPU ops in the
kernel.
I'm pretty sure Linus will give GCC developers a hard time if they
start to use FPU for integer operations anytime soon...
I am definitely sure that on PowerPC the AltiVec unit is used to
optimize memory copies and initializations. I agree that it is unlikely
that GCC will use the FPU for integer operations.
I would be willing to work on that if there is some kind of agreement
here.
If you change the ARMv7-M CPU port to use the deferred FPU switching,
then you surely break existing applications which then have to use
RTEMS_FLOATING_POINT. I would do this only if there were a clear
and measurable performance improvement. For the measurements we would
need a benchmark.
OK, so no to using deferred FPU switching for the moment, at least
without a benchmark.
From my point of view, yes.
But what about just running with FPCCR.ASPEN and FPCCR.LSPEN = 0, and
always saving the FPU in _CPU_Context_switch when switching tasks?
You have to consider that if you switch after an ISR to another thread,
then you have to save the volatile FPU context of the interrupted
thread. If you switch back to the thread interrupted by the ISR, then
you have to restore the volatile FPU context.
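The two cases described above can be sketched on a host as follows. The sketch assumes that hardware stacking is disabled (ASPEN = 0) and that ISRs never touch the FPU, so the volatile FPU registers (s0-s15 and FPSCR) need software handling only when the ISR exit switches to a different thread. The types and the function isr_exit are hypothetical and the FPU is simulated by a global variable; this is not how RTEMS implements it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Caller-saved FPU state that an interrupted thread may have live. */
typedef struct {
  uint32_t s[16]; /* s0-s15 */
  uint32_t fpscr;
} volatile_fpu_context;

/* Hypothetical thread descriptor for this sketch only. */
typedef struct {
  int id;
  volatile_fpu_context vfp;
} thread;

/* Simulated volatile FPU registers. */
static volatile_fpu_context fpu;

/*
 * Called at ISR exit. Returns the thread that continues execution.
 * If the ISR returns to the interrupted thread, nothing is saved or
 * restored. Otherwise the interrupted thread's volatile FPU state is
 * saved and the heir's state is restored.
 */
static thread *isr_exit(thread *interrupted, thread *heir)
{
  assert(interrupted != NULL && heir != NULL);

  if (heir == interrupted) {
    return interrupted; /* direct return: FPU state is untouched */
  }

  interrupted->vfp = fpu; /* save state of the interrupted thread */
  fpu = heir->vfp;        /* restore state of the heir */
  return heir;
}
```

The save in the second case is needed because an interrupt can arrive at any instruction, so the interrupted thread may hold live values in s0-s15 that a normal function-call boundary would not have to preserve.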
It would only break existing applications which use the FPU inside IRQs,
is that a problem? Is the ISR FPU behaviour/requirements documented
somewhere in RTEMS?
To be honest, my main motivation here is trying to simplify the code and
more importantly improve debuggability and my understanding of how the
system works. My head kind of hurts trying to understand exactly at which
point in the code a lazy FPU save can occur.
Yes, the ARMv7-M context switching is unusually complicated. I am not
sure if the deferred FPU context switching will simplify things.
--
embedded brains GmbH & Co. KG
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.hu...@embedded-brains.de
phone: +49-89-18 94 741 - 16
fax: +49-89-18 94 741 - 08