[RFC] Adding RISC-V Vector support to RTEMS

2024-03-14 Thread Ken.Unger

Hello RTEMS experts,

We're in the process of implementing support for RTEMS on a new RISC-V
platform. Among other things, our processor core supports the RISC-V Vector
ISA (RVV), with its 32 vector registers which in our case are 512 bits (VLEN)
deep. RVV is used by applications to accelerate a variety of code/algorithms
utilizing either the auto-vectorizer in GCC/Clang, or through C intrinsics or
direct assembly coding.

My query here is regarding context saving, and options for optimization.
Before I get to that, here are a few points of background.

#1. Use of RVV by the compiler (GCC/Clang) is dictated by the ISA string
provided, e.g. -march=rv64imadcv, where v is for vector. The compiler
preprocessor symbol "__riscv_vector" can then be used for conditionally
compiled code, similar to what is done for floating point.

#2. Enablement of RVV can be controlled via the VS field of the machine status
CSR (mstatus). This allows one to disable RVV entirely (triggering an
exception if RVV instructions are executed), but also allows one to track the
dirty/clean state of the vector register file
(https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#vector-context-status-in-mstatus).
So, one can conditionally save/restore the registers, although the 2KB of
stack space (32 x (512b / 8)) would still need to be allocated.
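
To make #2 concrete, here is a minimal C sketch of decoding the VS field and
sizing the save area. The bit position (mstatus[10:9]) and the four state
encodings follow the linked v-spec section. The helper names are made up for
illustration and are not RTEMS APIs. The size covers only v0-v31; the vector
CSRs (vstart, vl, vtype, vcsr) would need a few extra words on top.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* VS occupies mstatus[10:9] per the RISC-V V specification. */
#define MSTATUS_VS_SHIFT 9
#define MSTATUS_VS_MASK  (UINT64_C(3) << MSTATUS_VS_SHIFT)

/* Encodings of the VS field. */
typedef enum {
  VS_OFF     = 0, /* vector instructions raise an illegal-instruction trap */
  VS_INITIAL = 1,
  VS_CLEAN   = 2,
  VS_DIRTY   = 3  /* vector state was modified since it was last marked clean */
} vs_state;

/* Hypothetical helper: extract VS from a raw mstatus value. */
static inline vs_state mstatus_get_vs(uint64_t mstatus)
{
  return (vs_state) ((mstatus & MSTATUS_VS_MASK) >> MSTATUS_VS_SHIFT);
}

/* Hypothetical policy: only a dirty register file has to be written out. */
static inline bool vector_save_needed(uint64_t mstatus)
{
  return mstatus_get_vs(mstatus) == VS_DIRTY;
}

/* Bytes needed for v0-v31 at a given VLEN (in bits). */
static inline size_t vector_file_bytes(size_t vlen_bits)
{
  return 32u * (vlen_bits / 8u);
}
```

For VLEN = 512, vector_file_bytes(512) gives the 2048 bytes mentioned above.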

#3. The C ABI specifies that vector registers are caller-saved. So, any system
call (or any function call for that matter) does not need to preserve these
registers.

A relatively straightforward approach is to say that RTEMS + Application is
built with vectors enabled (i.e. -march=rv64imadcv). One does not need to
save/restore the vector registers within Context_Control because of #3.
However, because we are now building RTEMS with V enabled, we need to add this
save/restore to the CPU_Interrupt_frame and incur that cost on all interrupts,
plus the stack space for all tasks. (A small secondary effect is that
CPU_INTERRUPT_FRAME_SIZE is now larger than the immediate addressing range.)
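
The secondary effect can be checked with a little arithmetic. RISC-V loads,
stores and addi take a 12-bit signed immediate, so an sp-relative offset must
lie in -2048..2047. The sketch below assumes a made-up base frame size of 256
bytes (not the real CPU_INTERRUPT_FRAME_SIZE), and the function names are
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if the offset is encodable as a 12-bit signed immediate. */
static inline bool fits_imm12(long offset)
{
  return offset >= -2048 && offset <= 2047;
}

/*
 * Frame size once v0-v31 are appended to an existing frame.
 * base_frame_bytes is an assumed value supplied by the caller.
 */
static inline size_t frame_size_with_vectors(
  size_t base_frame_bytes,
  size_t vlen_bits
)
{
  return base_frame_bytes + 32u * (vlen_bits / 8u);
}
```

With VLEN = 512 the vector area alone is 2048 bytes, so any frame that also
holds the integer registers exceeds the range, and the entry/exit code would
need a scratch register to form the larger offsets.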

Is there an alternative to consider? Might one build RTEMS itself without V,
leaving that only for Applications/Tasks, perhaps adding a Task attribute, and
thereby taking the vector save/restore penalty only when switching into or out
of a vector-enabled task? (Perhaps similar to HW floating point from the
past.) One would then use an RTEMS config flag to enable vector support,
although qualified at runtime by validating the existence of the vector ISA
(misa CSR). Or maybe that direction would require too many changes? E.g. I
see that "is_fp" is used more specifically, rather than the more general
rtems_attribute in Thread_Control and in the arguments to
_Context_Initialize(). Anyway, I'm happy to hear any thoughts on this
subject. Note that I'm new to RTEMS, so I may not have some past context.
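
As a rough illustration of the per-task idea, the decision at a context switch
could look like the C sketch below. Everything here is hypothetical: the
uses_vector flag stands in for a possible task attribute (analogous to is_fp),
and none of these names exist in RTEMS. The only hardware fact relied on is
that misa sets one bit per extension letter, so V is bit 'V' - 'A' = 21.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-task flag, analogous to the existing is_fp. */
typedef struct {
  bool uses_vector;
} model_task;

/* What the switch code would have to do for the vector state. */
typedef struct {
  bool save_executing;
  bool restore_heir;
} vector_switch_plan;

/* Runtime qualification: misa has one bit per extension letter. */
static inline bool misa_has_vector(uint64_t misa)
{
  return ((misa >> ('V' - 'A')) & 1u) != 0;
}

/* Pay the save/restore cost only for tasks that declared vector use. */
static inline vector_switch_plan plan_vector_switch(
  const model_task *executing,
  const model_task *heir,
  bool hw_has_vector
)
{
  vector_switch_plan plan = { false, false };

  if (hw_has_vector) {
    plan.save_executing = executing->uses_vector;
    plan.restore_heir = heir->uses_vector;
  }

  return plan;
}
```

A switch between two tasks without the attribute then costs nothing extra,
which is the saving being asked about.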

Thank you,
Ken
___
devel mailing list
devel@rtems.org
http://lists.rtems.org/mailman/listinfo/devel

RE: [RFC] Adding RISC-V Vector support to RTEMS

2024-04-04 Thread Ken.Unger

Just a late note to say thanks for the direction, Sebastian. What you suggest
makes sense to me: requiring (and enforcing) that vector registers not be used
within _RISCV_Interrupt_dispatch() while preserving the vector state across
_Thread_Do_dispatch(). (ref: cpukit/score/cpu/riscv/riscv-exception-handler.S)
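
A toy C model of that scheme, as understood here: the vector unit is switched
off while handlers run, and the vector state is saved and restored only around
a thread dispatch. This is a simulation with made-up types and names, not the
code in riscv-exception-handler.S. The choice to save whenever the unit was
enabled (rather than only when dirty) is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MODEL_VREG_BYTES 2048 /* 32 registers x 64 bytes for VLEN = 512 */

typedef enum { VS_OFF, VS_INITIAL, VS_CLEAN, VS_DIRTY } model_vs;

/* Toy stand-in for the hardware vector state; not an RTEMS type. */
typedef struct {
  model_vs vs;
  uint8_t  vregs[MODEL_VREG_BYTES];
} model_cpu;

/* Stand-in for a dispatch in which another thread overwrites the registers. */
static void model_clobbering_dispatch(model_cpu *cpu)
{
  cpu->vs = VS_DIRTY;
  memset(cpu->vregs, 0xFF, sizeof(cpu->vregs));
}

/* Returns true if the vector state was saved and restored. */
static bool model_interrupt(
  model_cpu *cpu,
  bool dispatch_needed,
  void (*dispatch)(model_cpu *)
)
{
  const model_vs interrupted_vs = cpu->vs;
  uint8_t saved[MODEL_VREG_BYTES];
  bool did_save = false;

  /* Handlers run with the unit off, so stray vector use would trap. */
  cpu->vs = VS_OFF;
  /* ... interrupt handlers would run here ... */

  if (dispatch_needed) {
    if (interrupted_vs != VS_OFF) {
      memcpy(saved, cpu->vregs, sizeof(saved));
      did_save = true;
    }

    dispatch(cpu); /* other threads may use the vector unit freely */

    if (did_save) {
      memcpy(cpu->vregs, saved, sizeof(saved));
    }
  }

  /* Back to the interrupted thread with its original VS value. */
  cpu->vs = interrupted_vs;
  return did_save;
}
```

Interrupts that do not lead to a dispatch pay nothing for the vector state in
this model, which is the main attraction over saving it in every interrupt
frame.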

We'll upstream/PR in a few months once we have things cleaned up and once you
are clear of the GitLab transition. Related to this, we'll be adding a new
RISC-V BSP with some shared/shareable components for the RISC-V AIA (Advanced
Interrupt Architecture) blocks APLIC and IMSIC.

Ken

-----Original Message-----
From: Sebastian Huber  
Sent: Friday, March 15, 2024 7:30 AM
To: Ken Unger - C34024 ; devel@rtems.org
Subject: Re: [RFC] Adding RISC-V Vector support to RTEMS

Hello Ken,

On 14.03.24 18:33, ken.un...@microchip.com wrote:
> Hello RTEMS experts,
>
> We’re in the process of implementing support for RTEMS on a new RISC-V 
> platform.  Among other things, our processor core supports the RISC-V 
> Vector ISA (RVV), with its 32 vector registers which in our case are 512
> bits (VLEN) deep.   RVV is used by applications to accelerate a variety
> of code/algorithms utilizing either the auto-vectorizer in GCC/Clang, 
> or through C intrinsics or direct assembly coding.
>
> My query here is regarding context saving, and options for 
> optimization.  Before I get to that, here’s a few points of background.
>
> #1. Use of RVV by the compiler (GCC/Clang) is dictated by the ISA 
> string provided, e.g -march=rv64imadcv,  where v is for vector.  The 
> compiler preprocessor symbol "__riscv_vector" can then be used for 
> conditionally compiled code, similar to what is done for floating point.
>
> #2. Enablement of RVV can be controlled via a machine status CSR.   This
> allows one to disable RVV entirely (triggering an exception if RVV 
> instructions are executed), but also allows one to track the 
> dirty/clean state of the vector register file.
> (https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#vector-context-status-in-mstatus).
> So, one can conditionally save/restore the registers, although the 2KB of
> stack space (32 x (512b / 8)) would need to be allocated.
>
> #3. The C ABI specifies that vector registers are Caller saved.  So, 
> any system call (or any function call for that matter) does not need 
> to preserve these registers.
>
> A relatively straightforward approach is to say that RTEMS + 
> Application is built with vectors enabled (i.e  -march=rv64imadcv).  
> One does not need to save/restore the vector registers within 
> Context_Control because of #3, however,

Yes, in this case you don't have to save/restore the vector registers in 
_CPU_Context_switch(). This is similar to the SPARC architecture. For SPARC, we 
implemented a lazy floating point switch
(cpukit/score/cpu/sparc/include/rtems/score/cpu.h):

/*
  * The SPARC ABI is a bit special with respect to the floating point context.
  * The complete floating point context is volatile.  Thus, from an ABI point
  * of view nothing needs to be saved and restored during a context switch.
  * Instead the floating point context must be saved and restored during
  * interrupt processing.  Historically, the deferred floating point switch was
  * used for SPARC and the complete floating point context is saved and
  * restored during a context switch to the new floating point unit owner.
  * This is a bit dangerous since post-switch actions (e.g. signal handlers)
  * and context switch extensions may silently corrupt the floating point
  * context.
  *
  * The floating point unit is disabled for interrupt handlers.  Thus, in case
  * an interrupt handler uses the floating point unit then this will result in a
  * trap (INTERNAL_ERROR_ILLEGAL_USE_OF_FLOATING_POINT_UNIT).
  *
  * In uniprocessor configurations, a lazy floating point context switch is
  * used.  In case an active floating point thread is interrupted (PSR[EF] == 1)
  * and a thread dispatch is carried out, then this thread is registered as the
  * floating point owner.  When a floating point owner is present during a
  * context switch, the floating point unit is disabled for the heir thread
  * (PSR[EF] == 0).  The floating point disabled trap checks that the use of the
  * floating point unit is allowed and saves/restores the floating point context
  * on demand.
  *
  * In SMP configurations, the deferred floating point switch is not supported
  * in principle.  So, use here a synchronous floating point switching.
  * Synchronous means that the volatile floating point context is saved and
  * restored around a thread dispatch issued during interrupt processing.  Thus
  * post-switch actions and context switch extensions may safely use the
  * floating point unit.
  */
#if SPARC_HAS_FPU == 1
  #if defined(RTEMS_SMP)
    #define SPARC_USE_SYNCHRONOUS_FP_SWITCH
