Re: [PATCH] tester: Limit simultaneous QEMU jobs to 1

2021-08-31 Thread Chris Johns
On 31/8/21 3:20 pm, Sebastian Huber wrote:
> On 30/08/2021 20:32, Kinsey Moore wrote:
>> On 8/30/2021 12:12, Sebastian Huber wrote:
>>> On 24/08/2021 20:45, Kinsey Moore wrote:
>>>> diff --git a/tester/rtems/testing/bsps/a53_ilp32_qemu.ini
>>>> b/tester/rtems/testing/bsps/a53_ilp32_qemu.ini
>>>> index 3beba06..581c59c 100644
>>>> --- a/tester/rtems/testing/bsps/a53_ilp32_qemu.ini
>>>> +++ b/tester/rtems/testing/bsps/a53_ilp32_qemu.ini
>>>> @@ -36,3 +36,4 @@ bsp   = a53_ilp32_qemu
>>>>   arch  = aarch64
>>>>   tester    = %{_rtscripts}/qemu.cfg
>>>>   bsp_qemu_opts = %{qemu_opts_base} -serial mon:stdio -machine
>>>> virt,gic-version=3 -cpu cortex-a53 -m 4096
>>>> +jobs  = 1
>>>
>>> Does this overwrite the command line option or is this a default value?
>>>
>> When this is set in the tester configuration, the command line switch has no
>> effect but it can be overridden in the user-config.
> 
> Overruling the command line option is not that great. I have a vastly
> different test run duration with --jobs=1 vs. --jobs=48 with more or less the
> same test results.

What does "more or less" mean?

I appreciate the efforts Kinsey has gone to in looking into why we have this
happening and I also believe we need to keep pushing towards repeatable results.
If limiting to 1 gives us repeatable results on qemu then I prefer this over
tainted test results with intermittent tags.

> I think this option should be split into a "force-jobs" and
> "default-jobs" option.

I am sorry I do not understand these options?

The command line is ignored because the value is fixed on purpose, and I am
not seeing a reason to change this.

When specified in a config it is a physical limit. A user being able to change
the number of TFTP jobs on the command line does not make sense.

This tool's focus is testing on hardware and I see that as more important. And
as I have said before if we have problematic tests maybe the test or the tool
generating the results needs to be investigated.

I see this issue as something specific to the design of qemu and a few of our
tests. I can guess at some of the reasons qemu does this, but being able to
have the tick timer's clock synced with the CPU clock is also important in
some types of simulation, i.e. our case and these problematic tests. We are a
real-time operating system, so needing this to be right, or close to right, in
simulation does not seem unreasonable.

This discussion sends a clear message: tier 1 archs and BSPs are very important
to this project.

Chris

Re: [PATCH] tester: Limit simultaneous QEMU jobs to 1

2021-08-31 Thread Sebastian Huber

On 31/08/2021 09:00, Chris Johns wrote:

On 31/8/21 3:20 pm, Sebastian Huber wrote:

On 30/08/2021 20:32, Kinsey Moore wrote:

On 8/30/2021 12:12, Sebastian Huber wrote:

On 24/08/2021 20:45, Kinsey Moore wrote:

diff --git a/tester/rtems/testing/bsps/a53_ilp32_qemu.ini
b/tester/rtems/testing/bsps/a53_ilp32_qemu.ini
index 3beba06..581c59c 100644
--- a/tester/rtems/testing/bsps/a53_ilp32_qemu.ini
+++ b/tester/rtems/testing/bsps/a53_ilp32_qemu.ini
@@ -36,3 +36,4 @@ bsp   = a53_ilp32_qemu
   arch  = aarch64
   tester    = %{_rtscripts}/qemu.cfg
   bsp_qemu_opts = %{qemu_opts_base} -serial mon:stdio -machine
virt,gic-version=3 -cpu cortex-a53 -m 4096
+jobs  = 1


Does this overwrite the command line option or is this a default value?


When this is set in the tester configuration, the command line switch has no
effect but it can be overridden in the user-config.


Overruling the command line option is not that great. I have a vastly different
test run duration with --jobs=1 vs. --jobs=48 with more or less the same test
results.


What does "more or less" mean?


On Qemu some tests have no reliable outcome. If I run with --jobs=48
only two of these tests fail compared to --jobs=1.




I appreciate the efforts Kinsey has gone to in looking into why we have this
happening and I also believe we need to keep pushing towards repeatable results.
If limiting to 1 gives us repeatable results on qemu then I prefer this over
tainted test results with intermittent tags.


During development waiting one minute is much better than waiting 13
minutes. Repeatable tests are one aspect, but there are other aspects
too. Overruling command line options is not that great. If you run with 
default values, it is all right to trade off repeatable results against 
a fast test run. However, if I want to run with --jobs=N, I want to run 
with N jobs and not just one.
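
For example, on a sufficiently large host that would be an invocation
like the following (BSP name and paths are illustrative only):

  rtems-test --rtems-bsp=a53_ilp32_qemu --jobs=48 \
    build/aarch64/a53_ilp32_qemu/testsuites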



I think this option should be split into a "force-jobs" and
"default-jobs" option.


I am sorry I do not understand these options?


force-jobs forces the jobs to N regardless of what is specified on the 
command line. Maybe a warning or error should be emitted if the command 
line option conflicts with the configuration option.


default-jobs selects the job count if no --jobs command line option is 
specified.
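
For illustration, the a53_ilp32_qemu.ini above might then read like this
(these option names are only this proposal, they do not exist in the
tester today):

  bsp           = a53_ilp32_qemu
  arch          = aarch64
  tester        = %{_rtscripts}/qemu.cfg
  default-jobs  = 1
  # force-jobs = 1 would instead pin the count and warn if --jobs disagrees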




The command line is ignored because the value is fixed on purpose, and I am
not seeing a reason to change this.


Ignoring command line options is not really a pleasant user experience.



When specified in a config it is a physical limit. A user being able to change
the number of TFTP jobs on the command line does not make sense.


Yes, for physical limits this makes sense.



This tool's focus is testing on hardware and I see that as more important. And
as I have said before if we have problematic tests maybe the test or the tool
generating the results needs to be investigated.

I see this issue as something specific to the design of qemu and a few of our
tests. I can guess at some of the reasons qemu does this, but being able to
have the tick timer's clock synced with the CPU clock is also important in
some types of simulation, i.e. our case and these problematic tests. We are a
real-time operating system, so needing this to be right, or close to right, in
simulation does not seem unreasonable.

This discussion sends a clear message: tier 1 archs and BSPs are very important
to this project.


There are several ways to address the sporadic test failures on Qemu. 
You could for example also change the tests to make them independent of 
the simulator timing. For now, my approach would be to change the 
default jobs count for the Qemu BSPs and still let the user overrule the 
default with a custom value to get for example a faster test run.


--
embedded brains GmbH
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.hu...@embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Registergericht: Amtsgericht München
Registernummer: HRB 157899
Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler
Unsere Datenschutzerklärung finden Sie hier:
https://embedded-brains.de/datenschutzerklaerung/

Re: [PATCH v1 2/5] cpukit: Add Exception Manager

2021-08-31 Thread Sebastian Huber

On 30/08/2021 17:13, Kinsey Moore wrote:

On 8/30/2021 07:50, Sebastian Huber wrote:

On 30/08/2021 14:27, Kinsey Moore wrote:

On 8/30/2021 00:42, Sebastian Huber wrote:

Hello Kinsey,

why can't you use the existing fatal error extension for this? You 
just have to test for an RTEMS_FATAL_SOURCE_EXTENSION source.  The 
fatal code is a pointer to the exception frame.


Unfortunately, the fatal error extensions framework necessarily 
assumes that the exception is fatal and so does not include the 
machinery to perform a thread dispatch or restore the exception frame 
for additional execution. It could theoretically be done in the fatal 
error extensions context, but it would end up being reimplemented for 
every architecture and you'd have to unwind the stack manually. I'm 
sure there are other ragged edges that would have to be smoothed over 
as well.


Non-interrupt exceptions are not uniformly handled across 
architectures in RTEMS currently. Adding the 
RTEMS_FATAL_SOURCE_EXTENSION fatal source was an attempt to do this. I 
am not that fond of adding a second approach unless there are strong 
technical reasons to do this.
This was in an effort to formalize how recoverable exceptions are 
handled. Currently, it's done on SPARC by handling exception traps as
you would an interrupt trap since they share a common architecture on 
that platform. This representation varies quite a bit among platforms, 
so we needed a different mechanism.


I recently changed the non-interrupt exception handling on sparc, since 
it was not robust against a corrupt stack pointer:


http://devel.rtems.org/ticket/4459



The initial fatal extensions are quite robust, you only need a stack,
valid read-only data and valid code. So, using a user extension is
the right thing to do, but I don't think we need a new one.


Doing the non-interrupt exception processing on the stack which caused 
the exception is a bit problematic, since the stack pointer might be 
corrupt as well. It is more robust to switch to for example the 
interrupt stack. If the exception was caused by an interrupt, then 
this exception is not recoverable.


The non-interrupt exception processing occurs on the interrupt stack, 
not the thread/user stack. In the AArch64 support code provided, the 
stack is switched back to the thread/user stack before thread dispatch 
and exception frame restoration occurs.


You can only switch back to the thread stack if it is valid. Doing a
thread dispatch should only be done if you are sure that the system
state is still intact. This is probably not the case for most exceptions.







If the non-interrupt exception was caused by a thread, then you could 
do some high level actions for some exceptions, such as floating-point 
exceptions and arithmetic exceptions. If you get a data abort or 
instruction error, then it is probably better to terminate the system.
I leave that decision to the handlers defined on this framework. In the 
case of the exception-to-signal mapping, I'm carrying over the existing 
exception set from the SPARC implementation.


It is probably this code:

+    case EXCEPTION_DATA_ABORT_READ:
+    case EXCEPTION_DATA_ABORT_WRITE:
+    case EXCEPTION_DATA_ABORT_UNSPECIFIED:
+    case EXCEPTION_INSTRUCTION_ABORT:
+    case EXCEPTION_MMU_UNSPECIFIED:
+    case EXCEPTION_ACCESS_ALIGNMENT:
+      signal = SIGSEGV;
+      break;
+
+    default:
+      /*
+       * Covers unknown, PC/SP alignment, illegal execution state, and any new
+       * exception classes that get added.
+       */
+      signal = SIGILL;
+      break;
+  }

Using signals to handle these exceptions is like playing Russian roulette.



Non-interrupt exception handling is always architecture-dependent. It 
is just a matter of how you organize it. In general, the most sensible
way to deal with non-interrupt exceptions is to log the error somehow 
and terminate the system. The mapping to signals is a bit of a special 
case if you ask me. My preferred way to handle non-interrupt 
exceptions would be to


1. switch to a dedicated stack

2. save the complete register set to the CPU exception frame

3. call the fatal error extensions with RTEMS_FATAL_SOURCE_EXTENSION 
and the CPU exception frame (with interrupts disabled)


Add a new API to query/alter the CPU exception frame, switch to the 
stack indicated by the CPU exception frame, and restore the context 
stored in the CPU exception frame. With this architecture-dependent
CPU exception frame support it should be possible to implement a high 
level mapper to signals.
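
A rough sketch of how such a high level signal mapper could sit on top
of the fatal error extension, assuming the hypothetical frame API
proposed above (none of the exception_frame_* calls exist today):

  #include <rtems.h>
  #include <signal.h>

  /* Hypothetical frame query/alter/restore API from the proposal. */
  bool exception_frame_is_recoverable( const rtems_exception_frame *frame );
  int  exception_frame_to_signal( const rtems_exception_frame *frame );
  void exception_frame_restore( rtems_exception_frame *frame );

  static void signal_mapper_extension(
    rtems_fatal_source source,
    bool               always_set_to_false,
    rtems_fatal_code   code
  )
  {
    rtems_exception_frame *frame;

    (void) always_set_to_false;

    if ( source != RTEMS_FATAL_SOURCE_EXTENSION ) {
      return;                     /* not a CPU exception */
    }

    /* The fatal code is a pointer to the CPU exception frame. */
    frame = (rtems_exception_frame *) code;

    if ( exception_frame_is_recoverable( frame ) ) {
      raise( exception_frame_to_signal( frame ) );
      exception_frame_restore( frame );   /* would not return */
    }

    /* Otherwise fall through: the next fatal extension or the default
     * handler terminates the system as before. */
  }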


What you've described is basically what is happening here (the dedicated 
stack is currently the interrupt/exception stack on AArch64), but the 
low level details are necessarily contained within the CPU port in patch 
3/5. Support for this framework is not required for any CPU port, but 
CPU ports that do support it repurpose the existing code underlying the 
fatal error extensions with the additional support you describe.

[PATCH 3/5] score: Remove _Thread_queue_First_locked()

2021-08-31 Thread Sebastian Huber
The _Thread_queue_First_locked() was only used in one place.  Move the code of
this inline function to this place.
---
 cpukit/include/rtems/score/threadqimpl.h | 28 ----------------------------
 cpukit/score/src/threadqfirst.c          | 10 +++++++++-
 2 files changed, 9 insertions(+), 29 deletions(-)

diff --git a/cpukit/include/rtems/score/threadqimpl.h 
b/cpukit/include/rtems/score/threadqimpl.h
index 0a24d2da3b..19d9704dec 100644
--- a/cpukit/include/rtems/score/threadqimpl.h
+++ b/cpukit/include/rtems/score/threadqimpl.h
@@ -1135,34 +1135,6 @@ RTEMS_INLINE_ROUTINE bool _Thread_queue_Is_empty(
   return queue->heads == NULL;
 }
 
-/**
- * @brief Returns the first thread on the thread queue if it exists, otherwise
- * @c NULL.
- *
- * The caller must be the owner of the thread queue lock.  The thread queue
- * lock is not released.
- *
- * @param the_thread_queue The thread queue.
- * @param operations The thread queue operations.
- *
-* @retval first The first thread on the thread queue according to the enqueue
- * order.
- * @retval NULL No thread is present on the thread queue.
- */
-RTEMS_INLINE_ROUTINE Thread_Control *_Thread_queue_First_locked(
-  Thread_queue_Control  *the_thread_queue,
-  const Thread_queue_Operations *operations
-)
-{
-  Thread_queue_Heads *heads = the_thread_queue->Queue.heads;
-
-  if ( heads != NULL ) {
-    return ( *operations->first )( heads );
-  } else {
-    return NULL;
-  }
-}
-
 /**
  * @brief Returns the first thread on the thread queue if it exists, otherwise
  * @c NULL.
diff --git a/cpukit/score/src/threadqfirst.c b/cpukit/score/src/threadqfirst.c
index 9908523298..8edbc1645f 100644
--- a/cpukit/score/src/threadqfirst.c
+++ b/cpukit/score/src/threadqfirst.c
@@ -27,11 +27,19 @@ Thread_Control *_Thread_queue_First(
   const Thread_queue_Operations *operations
 )
 {
+  Thread_queue_Heads   *heads;
   Thread_Control   *the_thread;
   Thread_queue_Context  queue_context;
 
   _Thread_queue_Acquire( the_thread_queue, &queue_context );
-  the_thread = _Thread_queue_First_locked( the_thread_queue, operations );
+  heads = the_thread_queue->Queue.heads;
+
+  if ( heads != NULL ) {
+    the_thread = ( *operations->first )( heads );
+  } else {
+    the_thread = NULL;
+  }
+
   _Thread_queue_Release( the_thread_queue, &queue_context );
 
   return the_thread;
-- 
2.26.2



[PATCH 0/5] Thread queue related fixes and clean ups

2021-08-31 Thread Sebastian Huber
This patch set fixes some issues which popped up in the new validation tests.

Sebastian Huber (5):
  score: Fix priority discipline handling
  score: Fix blocking message queue receive
  score: Remove _Thread_queue_First_locked()
  score: Remove _Thread_queue_Unblock_critical()
  score: Update priority only if necessary

 cpukit/include/rtems/posix/muteximpl.h |  48 +-
 cpukit/include/rtems/score/coremsgimpl.h   |  20 ++-
 cpukit/include/rtems/score/coremuteximpl.h |  51 +-
 cpukit/include/rtems/score/coresemimpl.h   |  20 ++-
 cpukit/include/rtems/score/threadqimpl.h   | 168 ---
 cpukit/posix/src/condsignalsupp.c  |  31 ++--
 cpukit/posix/src/sempost.c |  14 +-
 cpukit/score/src/coremsgseize.c|  20 ++-
 cpukit/score/src/semaphore.c   |  28 +---
 cpukit/score/src/threadchangepriority.c|   2 +-
 cpukit/score/src/threadqenqueue.c  | 180 +++--
 cpukit/score/src/threadqfirst.c|  10 +-
 cpukit/score/src/threadqflush.c|  22 +--
 13 files changed, 289 insertions(+), 325 deletions(-)

-- 
2.26.2



[PATCH 5/5] score: Update priority only if necessary

2021-08-31 Thread Sebastian Huber
In _Thread_queue_Flush_critical(), update the priority of the thread
queue owner only if necessary.  The scheduler update priority operation
could be expensive.
---
 cpukit/include/rtems/score/threadqimpl.h |  4 ++--
 cpukit/score/src/threadchangepriority.c  |  2 +-
 cpukit/score/src/threadqflush.c          | 22 +++++++++++++---------
 3 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/cpukit/include/rtems/score/threadqimpl.h 
b/cpukit/include/rtems/score/threadqimpl.h
index f42c67cc47..33cdb3058d 100644
--- a/cpukit/include/rtems/score/threadqimpl.h
+++ b/cpukit/include/rtems/score/threadqimpl.h
@@ -366,8 +366,8 @@ RTEMS_INLINE_ROUTINE void _Thread_queue_Context_clear_priority_updates(
  *
  * @return The priority update count of @a queue_context.
  */
-RTEMS_INLINE_ROUTINE size_t _Thread_queue_Context_save_priority_updates(
-  Thread_queue_Context *queue_context
+RTEMS_INLINE_ROUTINE size_t _Thread_queue_Context_get_priority_updates(
+  const Thread_queue_Context *queue_context
 )
 {
   return queue_context->Priority.update_count;
diff --git a/cpukit/score/src/threadchangepriority.c 
b/cpukit/score/src/threadchangepriority.c
index ac2e9a6d0c..613d0cd7af 100644
--- a/cpukit/score/src/threadchangepriority.c
+++ b/cpukit/score/src/threadchangepriority.c
@@ -212,7 +212,7 @@ void _Thread_Priority_perform_actions(
*/
 
   the_thread = start_of_path;
-  update_count = _Thread_queue_Context_save_priority_updates( queue_context );
+  update_count = _Thread_queue_Context_get_priority_updates( queue_context );
 
   while ( true ) {
 Thread_queue_Queue *queue;
diff --git a/cpukit/score/src/threadqflush.c b/cpukit/score/src/threadqflush.c
index 357e3d696e..42b35a499b 100644
--- a/cpukit/score/src/threadqflush.c
+++ b/cpukit/score/src/threadqflush.c
@@ -71,15 +71,15 @@ size_t _Thread_queue_Flush_critical(
   Thread_queue_Context  *queue_context
 )
 {
-  size_t  flushed;
-  Chain_Control   unblock;
-  Thread_Control *owner;
-  Chain_Node *node;
-  Chain_Node *tail;
+  size_t            flushed;
+  size_t            priority_updates;
+  Chain_Control     unblock;
+  Chain_Node       *node;
+  Chain_Node       *tail;
 
   flushed = 0;
+  priority_updates = 0;
   _Chain_Initialize_empty( &unblock );
-  owner = queue->owner;
 
   while ( true ) {
 Thread_queue_Heads *heads;
@@ -99,8 +99,7 @@ size_t _Thread_queue_Flush_critical(
 
 /*
  * We do not have enough space in the queue context to collect all priority
- * updates, so clear it each time.  We unconditionally do the priority
- * update for the owner later if it exists.
+ * updates, so clear it each time and accumulate the priority updates.
  */
 _Thread_queue_Context_clear_priority_updates( queue_context );
 
@@ -120,6 +119,8 @@ size_t _Thread_queue_Flush_critical(
   );
 }
 
+    priority_updates +=
+      _Thread_queue_Context_get_priority_updates( queue_context );
     ++flushed;
   }
 
@@ -145,9 +146,12 @@ size_t _Thread_queue_Flush_critical(
   node = next;
 } while ( node != tail );
 
-    if ( owner != NULL ) {
+    if ( priority_updates != 0 ) {
+      Thread_Control  *owner;
       ISR_lock_Context lock_context;
 
+      owner = queue->owner;
+      _Assert( owner != NULL );
       _Thread_State_acquire( owner, &lock_context );
       _Scheduler_Update_priority( owner );
       _Thread_State_release( owner, &lock_context );
-- 
2.26.2



[PATCH 4/5] score: Remove _Thread_queue_Unblock_critical()

2021-08-31 Thread Sebastian Huber
This function was only used in one place.  Replace it with a call to
_Thread_queue_Resume().
---
 cpukit/include/rtems/score/threadqimpl.h | 34 -
 cpukit/score/src/threadqenqueue.c| 38 
 2 files changed, 12 insertions(+), 60 deletions(-)

diff --git a/cpukit/include/rtems/score/threadqimpl.h 
b/cpukit/include/rtems/score/threadqimpl.h
index 19d9704dec..f42c67cc47 100644
--- a/cpukit/include/rtems/score/threadqimpl.h
+++ b/cpukit/include/rtems/score/threadqimpl.h
@@ -938,13 +938,12 @@ Status_Control _Thread_queue_Enqueue_sticky(
  * @param[in, out] the_thread The thread to extract.
  * @param[in, out] queue_context The thread queue context.
  *
- * @return Returns the unblock indicator for _Thread_queue_Unblock_critical().
- * True indicates, that this thread must be unblocked by the scheduler later in
- * _Thread_queue_Unblock_critical(), and false otherwise.  In case false is
- * returned, then the thread queue enqueue procedure was interrupted.  Thus it
- * will unblock itself and the thread wait information is no longer accessible,
- * since this thread may already block on another resource in an SMP
- * configuration.
+ * @return Returns the unblock indicator.  True indicates, that this thread
+ * must be unblocked by the scheduler using _Thread_Remove_timer_and_unblock(),
+ * and false otherwise.  In case false is returned, then the thread queue
+ * enqueue procedure was interrupted.  Thus it will unblock itself and the
+ * thread wait information is no longer accessible, since this thread may
+ * already block on another resource in an SMP configuration.
  */
 bool _Thread_queue_Extract_locked(
   Thread_queue_Queue*queue,
@@ -953,27 +952,6 @@ bool _Thread_queue_Extract_locked(
   Thread_queue_Context  *queue_context
 );
 
-/**
- * @brief Unblocks the thread which was on the thread queue before.
- *
- * The caller must be the owner of the thread queue lock.  This function will
- * release the thread queue lock.  Thread dispatching is disabled before the
- * thread queue lock is released and an unblock is necessary.  Thread
- * dispatching is enabled once the sequence to unblock the thread is complete.
- *
- * @param unblock The unblock indicator returned by
- * _Thread_queue_Extract_locked().
- * @param queue The actual thread queue.
- * @param[in, out] the_thread The thread to extract.
- * @param[in, out] lock_context The lock context of the lock acquire.
- */
-void _Thread_queue_Unblock_critical(
-  bool                unblock,
-  Thread_queue_Queue *queue,
-  Thread_Control *the_thread,
-  ISR_lock_Context   *lock_context
-);
-
 /**
  * @brief Resumes the extracted or surrendered thread.
  *
diff --git a/cpukit/score/src/threadqenqueue.c 
b/cpukit/score/src/threadqenqueue.c
index 833d37ee61..4b138ba4d1 100644
--- a/cpukit/score/src/threadqenqueue.c
+++ b/cpukit/score/src/threadqenqueue.c
@@ -11,7 +11,7 @@
  *   _Thread_queue_Path_acquire_critical(),
  *   _Thread_queue_Path_release_critical(),
  *   _Thread_queue_Resume(),_Thread_queue_Surrender(),
- *   _Thread_queue_Surrender_sticky(), and _Thread_queue_Unblock_critical().
+ *   _Thread_queue_Surrender_sticky().
  */
 
 /*
@@ -584,27 +584,6 @@ bool _Thread_queue_Extract_locked(
   return _Thread_queue_Make_ready_again( the_thread );
 }
 
-void _Thread_queue_Unblock_critical(
-  bool                unblock,
-  Thread_queue_Queue *queue,
-  Thread_Control *the_thread,
-  ISR_lock_Context   *lock_context
-)
-{
-  if ( unblock ) {
-    Per_CPU_Control *cpu_self;
-
-    cpu_self = _Thread_Dispatch_disable_critical( lock_context );
-    _Thread_queue_Queue_release( queue, lock_context );
-
-    _Thread_Remove_timer_and_unblock( the_thread, queue );
-
-    _Thread_Dispatch_enable( cpu_self );
-  } else {
-    _Thread_queue_Queue_release( queue, lock_context );
-  }
-}
-
 void _Thread_queue_Resume(
   Thread_queue_Queue   *queue,
   Thread_Control   *the_thread,
@@ -645,25 +624,20 @@ void _Thread_queue_Extract( Thread_Control *the_thread )
   queue = the_thread->Wait.queue;
 
   if ( queue != NULL ) {
-    bool unblock;
-
     _Thread_Wait_remove_request( the_thread, &queue_context.Lock_context );
     _Thread_queue_Context_set_MP_callout(
       &queue_context,
       _Thread_queue_MP_callout_do_nothing
     );
-    unblock = _Thread_queue_Extract_locked(
+#if defined(RTEMS_MULTIPROCESSING)
+    _Thread_queue_MP_set_callout( the_thread, &queue_context );
+#endif
+    ( *the_thread->Wait.operations->extract )(
       queue,
-      the_thread->Wait.operations,
       the_thread,
       &queue_context
     );
-    _Thread_queue_Unblock_critical(
-      unblock,
-      queue,
-      the_thread,
-      &queue_context.Lock_context.Lock_context
-    );
+    _Thread_queue_Resume( queue, the_thread, &queue_context );
   } else {
     _Thread_Wait_release( the_thread, &queue_context );
   }
-- 
2.26.2


[PATCH 2/5] score: Fix blocking message queue receive

2021-08-31 Thread Sebastian Huber
In order to ensure FIFO fairness across schedulers, the thread queue
surrender operation must be used to dequeue a thread from the thread
queue.  The thread queue extract operation is intended for timeouts.

Add _Thread_queue_Resume() which may be used to make extracted or
surrendered threads ready again.

Remove the now unused _Thread_queue_Extract_critical() function.

Close #4509.
---
 cpukit/include/rtems/score/coremsgimpl.h | 20 
 cpukit/include/rtems/score/threadqimpl.h | 59 ++--
 cpukit/score/src/coremsgseize.c  | 20 
 cpukit/score/src/threadqenqueue.c| 45 ++
 4 files changed, 62 insertions(+), 82 deletions(-)

diff --git a/cpukit/include/rtems/score/coremsgimpl.h 
b/cpukit/include/rtems/score/coremsgimpl.h
index 161cf8f124..7f01769010 100644
--- a/cpukit/include/rtems/score/coremsgimpl.h
+++ b/cpukit/include/rtems/score/coremsgimpl.h
@@ -616,7 +616,8 @@ RTEMS_INLINE_ROUTINE Thread_Control *_CORE_message_queue_Dequeue_receiver(
   Thread_queue_Context*queue_context
 )
 {
-  Thread_Control *the_thread;
+  Thread_queue_Heads *heads;
+  Thread_Control *the_thread;
 
   /*
*  If there are pending messages, then there can't be threads
@@ -634,14 +635,18 @@ RTEMS_INLINE_ROUTINE Thread_Control *_CORE_message_queue_Dequeue_receiver(
*  There must be no pending messages if there is a thread waiting to
*  receive a message.
*/
-  the_thread = _Thread_queue_First_locked(
-    &the_message_queue->Wait_queue,
-    the_message_queue->operations
-  );
-  if ( the_thread == NULL ) {
+  heads = the_message_queue->Wait_queue.Queue.heads;
+  if ( heads == NULL ) {
     return NULL;
   }
 
+  the_thread = ( *the_message_queue->operations->surrender )(
+    &the_message_queue->Wait_queue.Queue,
+    heads,
+    NULL,
+    queue_context
+  );
+
   *(size_t *) the_thread->Wait.return_argument = size;
   the_thread->Wait.count = (uint32_t) submit_type;
 
@@ -651,9 +656,8 @@ RTEMS_INLINE_ROUTINE Thread_Control *_CORE_message_queue_Dequeue_receiver(
 size
   );
 
-  _Thread_queue_Extract_critical(
+  _Thread_queue_Resume(
     &the_message_queue->Wait_queue.Queue,
-    the_message_queue->operations,
     the_thread,
     queue_context
   );
diff --git a/cpukit/include/rtems/score/threadqimpl.h 
b/cpukit/include/rtems/score/threadqimpl.h
index 2465fc4499..0a24d2da3b 100644
--- a/cpukit/include/rtems/score/threadqimpl.h
+++ b/cpukit/include/rtems/score/threadqimpl.h
@@ -975,56 +975,23 @@ void _Thread_queue_Unblock_critical(
 );
 
 /**
- * @brief Extracts the thread from the thread queue and unblocks it.
+ * @brief Resumes the extracted or surrendered thread.
  *
- * The caller must be the owner of the thread queue lock.  This function will
- * release the thread queue lock and restore the default thread lock.  Thread
- * dispatching is disabled before the thread queue lock is released and an
- * unblock is necessary.  Thread dispatching is enabled once the sequence to
- * unblock the thread is complete.  This makes it possible to use the thread
- * queue lock to protect the state of objects embedding the thread queue and
- * directly enter _Thread_queue_Extract_critical() to finalize an operation in
- * case a waiting thread exists.
- *
- * @code
- * #include <rtems/score/threadqimpl.h>
- *
- * typedef struct {
- *   Thread_queue_Control  Queue;
- *   Thread_Control   *owner;
- * } Mutex;
+ * This function makes the thread ready again.  If necessary, the thread is
+ * unblocked and its thread timer removed.
  *
- * void _Mutex_Release( Mutex *mutex )
- * {
- *   Thread_queue_Context  queue_context;
- *   Thread_Control   *first;
- *
- *   _Thread_queue_Context_initialize( &queue_context, NULL );
- *   _Thread_queue_Acquire( &mutex->Queue, queue_context );
- *
- *   first = _Thread_queue_First_locked( &mutex->Queue );
- *   mutex->owner = first;
- *
- *   if ( first != NULL ) {
- * _Thread_queue_Extract_critical(
- *   &mutex->Queue.Queue,
- *   mutex->Queue.operations,
- *   first,
- *   &queue_context
- *   );
- * }
- * @endcode
+ * The thread shall have been extracted from the thread queue or surrendered by
+ * the thread queue right before the call to this function.  The caller shall
+ * be the owner of the thread queue lock.
  *
- * @param queue The actual thread queue.
- * @param operations The thread queue operations.
- * @param[in, out] the_thread The thread to extract.
- * @param[in, out] queue_context The thread queue context of the lock acquire.
+ * @param queue is the actual thread queue.
+ * @param[in, out] the_thread is the thread to make ready and unblock.
+ * @param[in, out] queue_context is the thread queue context.
  */
-void _Thread_queue_Extract_critical(
-  Thread_queue_Queue            *queue,
-  const Thread_queue_Operations *operations,
-  Thread_Control                *the_thread,
-  Thread_queue_Context          *queue_context
+void _Thread_queue_Resume(
+  Thread_queue_Queue   *queue,
+  Thread_Control       *the_thread,
+  Thread_queue_Context *queue_context
+);

[PATCH 1/5] score: Fix priority discipline handling

2021-08-31 Thread Sebastian Huber
The priority queues in clustered scheduling configurations use a per
scheduler priority queue rotation to ensure FIFO fairness across
schedulers.  This mechanism is implemented in the thread queue surrender
operation.  Unfortunately some semaphore and message queue directives
wrongly used the thread queue extract operation.  Fix this through the
use of _Thread_queue_Surrender().

Update #4358.
---
 cpukit/include/rtems/posix/muteximpl.h |  48 +
 cpukit/include/rtems/score/coremuteximpl.h |  51 +-
 cpukit/include/rtems/score/coresemimpl.h   |  20 ++--
 cpukit/include/rtems/score/threadqimpl.h   |  43 
 cpukit/posix/src/condsignalsupp.c  |  31 +++---
 cpukit/posix/src/sempost.c |  14 +--
 cpukit/score/src/semaphore.c   |  28 ++
 cpukit/score/src/threadqenqueue.c  | 109 +
 8 files changed, 196 insertions(+), 148 deletions(-)

diff --git a/cpukit/include/rtems/posix/muteximpl.h 
b/cpukit/include/rtems/posix/muteximpl.h
index 5d20bc1ef6..3decb6f4ac 100644
--- a/cpukit/include/rtems/posix/muteximpl.h
+++ b/cpukit/include/rtems/posix/muteximpl.h
@@ -382,10 +382,7 @@ RTEMS_INLINE_ROUTINE Status_Control _POSIX_Mutex_Ceiling_surrender(
   Thread_queue_Context *queue_context
 )
 {
-  unsigned intnest_level;
-  ISR_lock_Contextlock_context;
-  Per_CPU_Control*cpu_self;
-  Thread_queue_Heads *heads;
+  unsigned int nest_level;
 
   if ( !_POSIX_Mutex_Is_owner( the_mutex, executing ) ) {
 _POSIX_Mutex_Release( the_mutex, queue_context );
@@ -400,48 +397,13 @@ RTEMS_INLINE_ROUTINE Status_Control _POSIX_Mutex_Ceiling_surrender(
 return STATUS_SUCCESSFUL;
   }
 
-  _Thread_Resource_count_decrement( executing );
-
-  _Thread_queue_Context_clear_priority_updates( queue_context );
-  _Thread_Wait_acquire_default_critical( executing, &lock_context );
-  _Thread_Priority_remove(
+  return _Thread_queue_Surrender_priority_ceiling(
+    &the_mutex->Recursive.Mutex.Queue.Queue,
     executing,
     &the_mutex->Priority_ceiling,
-    queue_context
+    queue_context,
+    POSIX_MUTEX_PRIORITY_CEILING_TQ_OPERATIONS
   );
-  _Thread_Wait_release_default_critical( executing, &lock_context );
-
-  cpu_self = _Thread_queue_Dispatch_disable( queue_context );
-
-  heads = the_mutex->Recursive.Mutex.Queue.Queue.heads;
-
-  if ( heads != NULL ) {
-    const Thread_queue_Operations *operations;
-    Thread_Control                *new_owner;
-
-    operations = POSIX_MUTEX_PRIORITY_CEILING_TQ_OPERATIONS;
-    new_owner = ( *operations->first )( heads );
-    _POSIX_Mutex_Set_owner( the_mutex, new_owner );
-    _Thread_Resource_count_increment( new_owner );
-    _Thread_Priority_add(
-      new_owner,
-      &the_mutex->Priority_ceiling,
-      queue_context
-    );
-    _Thread_queue_Extract_critical(
-      &the_mutex->Recursive.Mutex.Queue.Queue,
-      operations,
-      new_owner,
-      queue_context
-    );
-  } else {
-    _POSIX_Mutex_Set_owner( the_mutex, NULL );
-    _POSIX_Mutex_Release( the_mutex, queue_context );
-  }
-
-  _Thread_Priority_update( queue_context );
-  _Thread_Dispatch_enable( cpu_self );
-  return STATUS_SUCCESSFUL;
 }
 
 #define POSIX_MUTEX_ABSTIME_TRY_LOCK ((uintptr_t) 1)
diff --git a/cpukit/include/rtems/score/coremuteximpl.h 
b/cpukit/include/rtems/score/coremuteximpl.h
index 426c4c5a95..757efbde9b 100644
--- a/cpukit/include/rtems/score/coremuteximpl.h
+++ b/cpukit/include/rtems/score/coremuteximpl.h
@@ -529,10 +529,7 @@ RTEMS_INLINE_ROUTINE Status_Control _CORE_ceiling_mutex_Surrender(
   Thread_queue_Context   *queue_context
 )
 {
-  unsigned int  nest_level;
-  ISR_lock_Context  lock_context;
-  Per_CPU_Control  *cpu_self;
-  Thread_Control   *new_owner;
+  unsigned int nest_level;
 
   _CORE_mutex_Acquire_critical( &the_mutex->Recursive.Mutex, queue_context );
 
@@ -549,53 +546,13 @@ RTEMS_INLINE_ROUTINE Status_Control _CORE_ceiling_mutex_Surrender(
 return STATUS_SUCCESSFUL;
   }
 
-  _Thread_Resource_count_decrement( executing );
-
-  _Thread_queue_Context_clear_priority_updates( queue_context );
-  _Thread_Wait_acquire_default_critical( executing, &lock_context );
-  _Thread_Priority_remove(
+  return _Thread_queue_Surrender_priority_ceiling(
+    &the_mutex->Recursive.Mutex.Wait_queue.Queue,
     executing,
     &the_mutex->Priority_ceiling,
-    queue_context
-  );
-  _Thread_Wait_release_default_critical( executing, &lock_context );
-
-  new_owner = _Thread_queue_First_locked(
-    &the_mutex->Recursive.Mutex.Wait_queue,
+    queue_context,
     CORE_MUTEX_TQ_OPERATIONS
   );
-  _CORE_mutex_Set_owner( &the_mutex->Recursive.Mutex, new_owner );
-
-  cpu_self = _Thread_Dispatch_disable_critical(
-    &queue_context->Lock_context.Lock_context
-  );
-
-  if ( new_owner != NULL ) {
-#if defined(RTEMS_MULTIPROCESSING)
-    if ( _Objects_Is_local_id( new_owner->Object.id ) )
-#endif
-    {
-      _Thread_Resource_count_increment( new_owner );
-      _Thread_Priority_add(
- 

[PATCH rtems-libbsd] imx: Remove ccm functions already defined in RTEMS

2021-08-31 Thread Christian Mauderer
The imx_ccm_*_hz functions are all defined in RTEMS, so don't duplicate them
in libbsd. Otherwise some applications get linker errors.

Update #3869
---
 freebsd/sys/arm/freescale/imx/imx6_ccm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/freebsd/sys/arm/freescale/imx/imx6_ccm.c 
b/freebsd/sys/arm/freescale/imx/imx6_ccm.c
index 78bbd5c1..7fdb69b8 100644
--- a/freebsd/sys/arm/freescale/imx/imx6_ccm.c
+++ b/freebsd/sys/arm/freescale/imx/imx6_ccm.c
@@ -368,6 +368,7 @@ imx6_ccm_sata_enable(void)
return 0;
 }
 
+#ifndef __rtems__
 uint32_t
 imx_ccm_ecspi_hz(void)
 {
@@ -408,6 +409,7 @@ imx_ccm_ahb_hz(void)
 {
return (13200);
 }
+#endif /* __rtems__ */
 
 void
 imx_ccm_ipu_enable(int ipu)
-- 
2.31.1



[PATCH] score: Document Futex Handler

2021-08-31 Thread Sebastian Huber
Use EAGAIN instead of EWOULDBLOCK to be in line with the Linux man page.
These error numbers have the same values in Newlib.
---
 cpukit/score/src/futex.c | 42 ++--
 1 file changed, 40 insertions(+), 2 deletions(-)

diff --git a/cpukit/score/src/futex.c b/cpukit/score/src/futex.c
index b65a843704..f3a1ae3994 100644
--- a/cpukit/score/src/futex.c
+++ b/cpukit/score/src/futex.c
@@ -1,11 +1,12 @@
 /**
  * @file
  *
- * @ingroup RTEMSScore
+ * @ingroup RTEMSScoreFutex
  *
  * @brief This source file contains the implementation of
  *   _Futex_Wait() and _Futex_Wake().
  */
+
 /*
  * Copyright (c) 2015, 2016 embedded brains GmbH.  All rights reserved.
  *
@@ -20,6 +21,18 @@
  * http://www.rtems.org/license/LICENSE.
  */
 
+/**
+ * @defgroup RTEMSScoreFutex Futex Handler
+ *
+ * @ingroup RTEMSScore
+ *
+ * @brief This group contains the Futex Handler implementation.
+ *
+ * The behaviour of the futex operations is defined by Linux, see also:
+ *
+ * https://man7.org/linux/man-pages/man2/futex.2.html
+ */
+
 #ifdef HAVE_CONFIG_H
 #include "config.h"
 #endif
@@ -84,6 +97,22 @@ static void _Futex_Queue_release(
   _ISR_Local_enable( level );
 }
 
+/**
+ * @brief Performs the ``FUTEX_WAIT`` operation.
+ *
+ * @param[in, out] _futex is the futex object.
+ *
+ * @param[in] uaddr is the address to the futex state.
+ *
+ * @param val is the expected futex state value.
+ *
+ * @retval 0 Returns zero if the futex state is equal to the expected value.
+ *   In this case the calling thread is enqueued on the thread queue of the
+ *   futex object.
+ *
+ * @retval EAGAIN Returns EAGAIN if the futex state is not equal to the
+ *   expected value.
+ */
 int _Futex_Wait( struct _Futex_Control *_futex, int *uaddr, int val )
 {
   Futex_Control*futex;
@@ -113,7 +142,7 @@ int _Futex_Wait( struct _Futex_Control *_futex, int *uaddr, int val )
     eno = 0;
   } else {
     _Futex_Queue_release( futex, level, &queue_context );
-    eno = EWOULDBLOCK;
+    eno = EAGAIN;
   }
 
   return eno;
@@ -143,6 +172,15 @@ static Thread_Control *_Futex_Flush_filter(
   return the_thread;
 }
 
+/**
+ * @brief Performs the ``FUTEX_WAKE`` operation.
+ *
+ * @param[in, out] _futex is the futex.
+ *
+ * @param count is the maximum count of threads to wake up.
+ *
+ * @return Returns the count of woken up threads.
+ */
 int _Futex_Wake( struct _Futex_Control *_futex, int count )
 {
   Futex_Control *futex;
-- 
2.31.1
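
For reference, a minimal use of these two operations, assuming the
declarations in Newlib's <sys/lock.h>; the one-shot event object below
is illustrative only and zero-initialization of the futex control is
assumed to be valid:

  #include <sys/lock.h>  /* struct _Futex_Control, _Futex_Wait(), _Futex_Wake() */
  #include <limits.h>

  static struct _Futex_Control event_futex;
  static int                   event_state;   /* 0 = not signalled */

  static void event_wait( void )
  {
    /* EAGAIN from _Futex_Wait() means the state changed between the
     * check and the enqueue, so simply check the state again. */
    while ( __atomic_load_n( &event_state, __ATOMIC_ACQUIRE ) == 0 ) {
      (void) _Futex_Wait( &event_futex, &event_state, 0 );
    }
  }

  static void event_signal( void )
  {
    __atomic_store_n( &event_state, 1, __ATOMIC_RELEASE );
    (void) _Futex_Wake( &event_futex, INT_MAX );   /* wake all waiters */
  }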



Re: [PATCH v1 2/5] cpukit: Add Exception Manager

2021-08-31 Thread Kinsey Moore

On 8/31/2021 04:31, Sebastian Huber wrote:

On 30/08/2021 17:13, Kinsey Moore wrote:

On 8/30/2021 07:50, Sebastian Huber wrote:

On 30/08/2021 14:27, Kinsey Moore wrote:

On 8/30/2021 00:42, Sebastian Huber wrote:

Hello Kinsey,

why can't you use the existing fatal error extension for this? You 
just have to test for an RTEMS_FATAL_SOURCE_EXTENSION source.  The 
fatal code is a pointer to the exception frame.


Unfortunately, the fatal error extensions framework necessarily 
assumes that the exception is fatal and so does not include the 
machinery to perform a thread dispatch or restore the exception 
frame for additional execution. It could theoretically be done in 
the fatal error extensions context, but it would end up being 
reimplemented for every architecture and you'd have to unwind the 
stack manually. I'm sure there are other ragged edges that would 
have to be smoothed over as well.


Non-interrupt exceptions are not uniformly handled across 
architectures in RTEMS currently. Adding the 
RTEMS_FATAL_SOURCE_EXTENSION fatal source was an attempt to do this. 
I am not that fond of adding a second approach unless there are 
strong technical reasons to do this.
This was in an effort to formalize how recoverable exceptions are 
handled. Currently, it's done on SPARC by handling exception traps
as you would an interrupt trap since they share a common architecture 
on that platform. This representation varies quite a bit among 
platforms, so we needed a different mechanism.


I recently changed the non-interrupt exception handling on sparc, 
since it was not robust against a corrupt stack pointer:


http://devel.rtems.org/ticket/4459



The initial fatal extensions are quite robust, you only need a
stack, valid read-only data and valid code. So, using a user
extension is the right thing to do, but I don't think we need a new
one.


Doing the non-interrupt exception processing on the stack which 
caused the exception is a bit problematic, since the stack pointer 
might be corrupt as well. It is more robust to switch to for example 
the interrupt stack. If the exception was caused by an interrupt, 
then this exception is not recoverable.


The non-interrupt exception processing occurs on the interrupt stack, 
not the thread/user stack. In the AArch64 support code provided, the 
stack is switched back to the thread/user stack before thread 
dispatch and exception frame restoration occurs.


You can only switch back to the thread stack if it is valid. Doing a
thread dispatch should only be done if you are sure that the system
state is still intact. This is probably not the case for most exceptions.
If the handler has declared that it handled the exception and corrected 
the cause underlying the exception then the system state should be 
valid. If it can't make that claim then it should not have handled the 
exception.
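
In code, the contract would look something like this sketch (the frame
type is the public rtems_exception_frame; everything else here is a
hypothetical illustration, not the API from this patch set):

  #include <rtems.h>
  #include <stdbool.h>

  /* Hypothetical architecture-specific query and fix-up helpers. */
  bool frame_is_fp_trap( const rtems_exception_frame *frame );
  void frame_clear_fp_trap_cause( rtems_exception_frame *frame );

  static bool fp_trap_handler( rtems_exception_frame *frame )
  {
    if ( !frame_is_fp_trap( frame ) ) {
      /* Cannot claim the system state is intact: do not handle the
       * exception and let the system terminate. */
      return false;
    }

    /* Correct the underlying cause, then declare the exception
     * handled so that resuming execution is safe. */
    frame_clear_fp_trap_cause( frame );
    return true;
  }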







If the non-interrupt exception was caused by a thread, then you 
could do some high level actions for some exceptions, such as 
floating-point exceptions and arithmetic exceptions. If you get a 
data abort or instruction error, then it is probably better to 
terminate the system.
I leave that decision to the handlers defined on this framework. In 
the case of the exception-to-signal mapping, I'm carrying over the 
existing exception set from the SPARC implementation.


It is probably this code:

+    case EXCEPTION_DATA_ABORT_READ:
+    case EXCEPTION_DATA_ABORT_WRITE:
+    case EXCEPTION_DATA_ABORT_UNSPECIFIED:
+    case EXCEPTION_INSTRUCTION_ABORT:
+    case EXCEPTION_MMU_UNSPECIFIED:
+    case EXCEPTION_ACCESS_ALIGNMENT:
+      signal = SIGSEGV;
+      break;
+
+    default:
+      /*
+       * Covers unknown, PC/SP alignment, illegal execution state, and any new
+       * exception classes that get added.
+       */
+      signal = SIGILL;
+      break;
+  }

Using signals to handle these exceptions is like playing Russian 
roulette.
You're right. Specifically, SP alignment faults should be moved to the 
not-handled section because they're not actually handled here and would 
have to be to proceed with further execution. I'll make that change, thanks.




Non-interrupt exception handling is always architecture-dependent. 
It is just a matter of how you organize it. In general, the most
sensible way to deal with non-interrupt exceptions is to log the 
error somehow and terminate the system. The mapping to signals is a 
bit of a special case if you ask me. My preferred way to handle 
non-interrupt exceptions would be to


1. switch to a dedicated stack

2. save the complete register set to the CPU exception frame

3. call the fatal error extensions with RTEMS_FATAL_SOURCE_EXTENSION 
and the CPU exception frame (with interrupts disabled)


Add a new API to query/alter the CPU exception frame, switch to the 
stack indicated by the CPU exception frame, and restore the context 
stored in the CPU exception frame. With this architecture-dependent
CPU exception frame support it should be possible to implement a high
level mapper to signals.

Re: [PATCH] score: Document Futex Handler

2021-08-31 Thread Joel Sherrill
On Tue, Aug 31, 2021, 7:31 AM Sebastian Huber <sebastian.hu...@embedded-brains.de> wrote:

> Use EAGAIN instead of EWOULDBLOCK to be in line with the Linux man page.
>

EAGAIN AND Linux man page for what?

These error numbers have the same values in Newlib.
> ---
>  cpukit/score/src/futex.c | 42 ++--
>  1 file changed, 40 insertions(+), 2 deletions(-)
>
> diff --git a/cpukit/score/src/futex.c b/cpukit/score/src/futex.c
> index b65a843704..f3a1ae3994 100644
> --- a/cpukit/score/src/futex.c
> +++ b/cpukit/score/src/futex.c
> @@ -1,11 +1,12 @@
>  /**
>   * @file
>   *
> - * @ingroup RTEMSScore
> + * @ingroup RTEMSScoreFutex
>   *
>   * @brief This source file contains the implementation of
>   *   _Futex_Wait() and _Futex_Wake().
>   */
> +
>  /*
>   * Copyright (c) 2015, 2016 embedded brains GmbH.  All rights reserved.
>   *
> @@ -20,6 +21,18 @@
>   * http://www.rtems.org/license/LICENSE.
>   */
>
> +/**
> + * @defgroup RTEMSScoreFutex Futex Handler
> + *
> + * @ingroup RTEMSScore
> + *
> + * @brief This group contains the Futex Handler implementation.
> + *
> + * The behaviour of the futex operations is defined by Linux, see also:
> + *
> + * https://man7.org/linux/man-pages/man2/futex.2.html
> + */
> +
>  #ifdef HAVE_CONFIG_H
>  #include "config.h"
>  #endif
> @@ -84,6 +97,22 @@ static void _Futex_Queue_release(
>_ISR_Local_enable( level );
>  }
>
> +/**
> + * @brief Performs the ``FUTEX_WAIT`` operation.
> + *
> + * @param[in, out] _futex is the futex object.
> + *
> + * @param[in] uaddr is the address to the futex state.
> + *
> + * @param val is the expected futex state value.
> + *
> + * @retval 0 Returns zero if the futex state is equal to the expected
> value.
> + *   In this case the calling thread is enqueued on the thread queue of
> the
> + *   futex object.
> + *
> + * @retval EAGAIN Returns EAGAIN if the futex state is not equal to the
> + *   expected value.
> + */
>  int _Futex_Wait( struct _Futex_Control *_futex, int *uaddr, int val )
>  {
>Futex_Control*futex;
> @@ -113,7 +142,7 @@ int _Futex_Wait( struct _Futex_Control *_futex, int
> *uaddr, int val )
>  eno = 0;
>} else {
>  _Futex_Queue_release( futex, level, &queue_context );
> -eno = EWOULDBLOCK;
> +eno = EAGAIN;
>}
>
>return eno;
> @@ -143,6 +172,15 @@ static Thread_Control *_Futex_Flush_filter(
>return the_thread;
>  }
>
> +/**
> + * @brief Performs the ``FUTEX_WAKE`` operation.
> + *
> + * @param[in, out] _futex is the futex.
> + *
> + * @param count is the maximum count of threads to wake up.
> + *
> + * @return Returns the count of woken up threads.
> + */
>  int _Futex_Wake( struct _Futex_Control *_futex, int count )
>  {
>Futex_Control *futex;
> --
> 2.31.1
>

Re: [PATCH rtems-libbsd] imx: Remove ccm functions already defined in RTEMS

2021-08-31 Thread Gedare Bloom
ok

On Tue, Aug 31, 2021 at 5:43 AM Christian Mauderer wrote:
>
> The imx_ccm_*_hz functions are all defined in RTEMS, so don't duplicate them
> in libbsd. Otherwise some applications get linker errors.
>
> Update #3869
> ---
>  freebsd/sys/arm/freescale/imx/imx6_ccm.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/freebsd/sys/arm/freescale/imx/imx6_ccm.c 
> b/freebsd/sys/arm/freescale/imx/imx6_ccm.c
> index 78bbd5c1..7fdb69b8 100644
> --- a/freebsd/sys/arm/freescale/imx/imx6_ccm.c
> +++ b/freebsd/sys/arm/freescale/imx/imx6_ccm.c
> @@ -368,6 +368,7 @@ imx6_ccm_sata_enable(void)
> return 0;
>  }
>
> +#ifndef __rtems__
>  uint32_t
>  imx_ccm_ecspi_hz(void)
>  {
> @@ -408,6 +409,7 @@ imx_ccm_ahb_hz(void)
>  {
> return (13200);
>  }
> +#endif /* __rtems__ */
>
>  void
>  imx_ccm_ipu_enable(int ipu)
> --
> 2.31.1
>


Re: [PATCH rtems-libbsd] imx: Remove ccm functions already defined in RTEMS

2021-08-31 Thread Gedare Bloom
Sorry, I think libbsd is still a bit slushy; wait for Chris to OK it, thanks.

On Tue, Aug 31, 2021 at 3:21 PM Gedare Bloom wrote:
>
> ok
>
> On Tue, Aug 31, 2021 at 5:43 AM Christian Mauderer wrote:
> >
> > The imx_ccm_*_hz functions are all defined in RTEMS, so don't duplicate them
> > in libbsd. Otherwise some applications get linker errors.
> >
> > Update #3869
> > ---
> >  freebsd/sys/arm/freescale/imx/imx6_ccm.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/freebsd/sys/arm/freescale/imx/imx6_ccm.c 
> > b/freebsd/sys/arm/freescale/imx/imx6_ccm.c
> > index 78bbd5c1..7fdb69b8 100644
> > --- a/freebsd/sys/arm/freescale/imx/imx6_ccm.c
> > +++ b/freebsd/sys/arm/freescale/imx/imx6_ccm.c
> > @@ -368,6 +368,7 @@ imx6_ccm_sata_enable(void)
> > return 0;
> >  }
> >
> > +#ifndef __rtems__
> >  uint32_t
> >  imx_ccm_ecspi_hz(void)
> >  {
> > @@ -408,6 +409,7 @@ imx_ccm_ahb_hz(void)
> >  {
> > return (13200);
> >  }
> > +#endif /* __rtems__ */
> >
> >  void
> >  imx_ccm_ipu_enable(int ipu)
> > --
> > 2.31.1
> >


Re: [PATCH v1 2/5] cpukit: Add Exception Manager

2021-08-31 Thread Chris Johns
On 31/8/21 11:35 pm, Kinsey Moore wrote:
> On 8/31/2021 04:31, Sebastian Huber wrote:
>> On 30/08/2021 17:13, Kinsey Moore wrote:
>>> On 8/30/2021 07:50, Sebastian Huber wrote:
 On 30/08/2021 14:27, Kinsey Moore wrote:
> On 8/30/2021 00:42, Sebastian Huber wrote:
>> Hello Kinsey,
>>
>> why can't you use the existing fatal error extension for this? You just
>> have to test for an RTEMS_FATAL_SOURCE_EXTENSION source.  The fatal code
>> is a pointer to the exception frame.
>
> Unfortunately, the fatal error extensions framework necessarily assumes
> that the exception is fatal and so does not include the machinery to
> perform a thread dispatch or restore the exception frame for additional
> execution. It could theoretically be done in the fatal error extensions
> context, but it would end up being reimplemented for every architecture and
> you'd have to unwind the stack manually. I'm sure there are other ragged
> edges that would have to be smoothed over as well.

 Non-interrupt exceptions are not uniformly handled across architectures in
 RTEMS currently. Adding the RTEMS_FATAL_SOURCE_EXTENSION fatal source was an
 attempt to do this. I am not that fond of adding a second approach unless
 there are strong technical reasons to do this.
>>> This was in an effort to formalize how recoverable exceptions are handled.
>>> Currently, it's done on SPARC by handling exception traps as you would an
>>> interrupt trap since they share a common architecture on that platform. This
>>> representation varies quite a bit among platforms, so we needed a different
>>> mechanism.
>>
>> I recently changed the non-interrupt exception handling on sparc, since it was
>> not robust against a corrupt stack pointer:
>>
>> http://devel.rtems.org/ticket/4459
>>

 The initial fatal extensions are quite robust, you only need a stack, valid
 read-only data and valid code. So, using a user extension is the right
 thing to do, but I don't think we need a new one.

 Doing the non-interrupt exception processing on the stack which caused the
 exception is a bit problematic, since the stack pointer might be corrupt as
 well. It is more robust to switch to for example the interrupt stack. If the
 exception was caused by an interrupt, then this exception is not recoverable.
>>>
>>> The non-interrupt exception processing occurs on the interrupt stack, not the
>>> thread/user stack. In the AArch64 support code provided, the stack is
>>> switched back to the thread/user stack before thread dispatch and exception
>>> frame restoration occurs.
>>
>> You can only switch back to the thread stack if it is valid. Doing a thread
>> dispatch should only be done if you are sure that the system state is still
>> intact. This is probably not the case for most exceptions.
> If the handler has declared that it handled the exception and corrected the
> cause underlying the exception then the system state should be valid. If it
> can't make that claim then it should not have handled the exception.

There are valid use cases for this such as an unhandled fault in a task
suspending it. Till did this for EPICS years ago.

Our job is to provide the support to implement system level requirements and we
should leave the analysis and discussions about being valid to our users.

Libdebugger is a good example of a system that handles faults of all types and
RTEMS keeps running without an issue.

 If the non-interrupt exception was caused by a thread, then you could do
 some high level actions for some exceptions, such as floating-point
 exceptions and arithmetic exceptions. If you get a data abort or instruction
 error, then it is probably better to terminate the system.
>>> I leave that decision to the handlers defined on this framework. In the case
>>> of the exception-to-signal mapping, I'm carrying over the existing exception
>>> set from the SPARC implementation.
>>
>> It is probably this code:
>>
>> +    case EXCEPTION_DATA_ABORT_READ:
>> +    case EXCEPTION_DATA_ABORT_WRITE:
>> +    case EXCEPTION_DATA_ABORT_UNSPECIFIED:
>> +    case EXCEPTION_INSTRUCTION_ABORT:
>> +    case EXCEPTION_MMU_UNSPECIFIED:
>> +    case EXCEPTION_ACCESS_ALIGNMENT:
>> +      signal = SIGSEGV;
>> +      break;
>> +
>> +    default:
>> +      /*
>> +       * Covers unknown, PC/SP alignment, illegal execution state, and any new
>> +       * exception classes that get added.
>> +       */
>> +      signal = SIGILL;
>> +      break;
>> +  }
>>
>> Using signals to handle these exceptions is like playing Russian roulette.
> You're right. Specifically, SP alignment faults should be moved to the
> not-handled section because they're not actually handled here and would have to
> be to proceed with further execution. I'll make that change, thanks.
> Non-interrupt exception handling is always architecture-dependent. 

Re: [PATCH] tester: Limit simultaneous QEMU jobs to 1

2021-08-31 Thread Chris Johns
On 31/8/21 6:30 pm, Sebastian Huber wrote:
> On 31/08/2021 09:00, Chris Johns wrote:
>> On 31/8/21 3:20 pm, Sebastian Huber wrote:
>>> On 30/08/2021 20:32, Kinsey Moore wrote:
 On 8/30/2021 12:12, Sebastian Huber wrote:
> On 24/08/2021 20:45, Kinsey Moore wrote:
>> diff --git a/tester/rtems/testing/bsps/a53_ilp32_qemu.ini
>> b/tester/rtems/testing/bsps/a53_ilp32_qemu.ini
>> index 3beba06..581c59c 100644
>> --- a/tester/rtems/testing/bsps/a53_ilp32_qemu.ini
>> +++ b/tester/rtems/testing/bsps/a53_ilp32_qemu.ini
>> @@ -36,3 +36,4 @@ bsp   = a53_ilp32_qemu
>>    arch  = aarch64
>>    tester    = %{_rtscripts}/qemu.cfg
>>    bsp_qemu_opts = %{qemu_opts_base} -serial mon:stdio -machine
>> virt,gic-version=3 -cpu cortex-a53 -m 4096
>> +jobs  = 1
>
> Does this overwrite the command line option or is this a default value?
>
 When this is set in the tester configuration, the command line switch has no
 effect but it can be overridden in the user-config.
>>>
>>> Overruling the command line option is not that great. I have a vastly
>>> different test run duration with --jobs=1 vs. --jobs=48 with more or less
>>> the same test results.
>>
>> What does "more or less" mean?
> 
> On Qemu some tests have no reliable outcome. If I run with --jobs=48 only two
> of these tests fail compared to --jobs=1.

It seems the experience varies between archs and hosts. It is the origin of this
patch series.

>> I appreciate the efforts Kinsey has gone to in looking into why we have this
>> happening and I also believe we need to keep pushing towards repeatable
>> results.
>> If limiting to 1 gives us repeatable results on qemu then I prefer this over
>> tainted test results with intermittent tags.
> 
> During development waiting one minute is much better than waiting 13 minutes.
> Repeatable tests are one aspect, but there are other aspects too. Overruling
> command line options is not that great. If you run with default values, it is
> all right to trade off repeatable results against a fast test run. However,
> if I want to run with --jobs=N, I want to run with N jobs and not just one.

Yes I agree. How we manage this so it is apparent seems to be the key issue 
here.

>>> I think this option should be split into a "force-jobs" and
>>> "default-jobs" option.
>>
>> I am sorry I do not understand these options?
> 
> force-jobs forces the jobs to N regardless of what is specified on the command
> line. Maybe a warning or error should be emitted if the command line option
> conflicts with the configuration option.
> 
> default-jobs selects the job count if no --jobs command line option is 
> specified.

What about adding a `max-jobs` field which is 0 for no limit? This cannot be
exceeded?

Then `default-jobs` can be used as the default, again 0 means no limit?

>> The command line is ignored because the value is fixed on purpose, and I am
>> not seeing a reason to change this.
> 
> Ignoring command line options is not really a pleasant user experience.

Yes, it is not. It was added in a hurry without much thought when I added the
TFTP support.

>> When specified in a config it is a physical limit. A user being able to
>> change the number of TFTP jobs on the command line does not make sense.
> 
> Yes, for physical limits this makes sense.

We need to manage this case for new users.

>> This tool's focus is testing on hardware and I see that as more important.
>> And as I have said before if we have problematic tests maybe the test or the
>> tool generating the results needs to be investigated.
>>
>> I see this issue as something specific to the design of qemu and a few of our
>> tests. I can guess at some of the reasons qemu does this, but being able to
>> have the tick timer's clock synced with the CPU clock is also important in
>> some types of simulation, i.e. our case and these problematic tests. We are a
>> real-time operating system, so needing this to be right, or close to right,
>> in simulation does not seem unreasonable.
>>
>> This discussion sends a clear message: tier 1 archs and BSPs are very
>> important to this project.
> 
> There are several ways to address the sporadic test failures on Qemu. You
> could for example also change the tests to make them independent of the
> simulator timing. For now, my approach would be to change the default jobs
> count for the Qemu BSPs and still let the user overrule the default with a
> custom value to get for example a faster test run.

This is sensible. In summary (a config sketch follows the list):

1. Add `max-jobs` as a config-file-only setting with a default of 0.

2. Change the config `jobs` to `default-jobs`, again with 0 as the default.

3. Let the command line override the default jobs and raise an error if it
exceeds the maximum jobs allowed.

4. Provide a clear notice at the start and end of a run if the jobs used do not
match the default.
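
For illustration, a minimal sketch of how points 1 and 2 could look in a BSP
tester config, assuming the `max-jobs` and `default-jobs` names proposed above
(these fields are a proposal here, not existing tester options):

  # Existing per-BSP fields (bsp, arch, tester, bsp_qemu_opts) stay as-is.
  # Proposed fields; 0 means no limit:
  max-jobs     = 0
  default-jobs = 1

With that, `--jobs=N` on the command line would override `default-jobs`, and
the tester would error out if N exceeded a non-zero `max-jobs`.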

Chris

Re: [PATCH v1 2/5] cpukit: Add Exception Manager

2021-08-31 Thread Kinsey Moore

On 8/31/2021 17:50, Chris Johns wrote:

On 31/8/21 11:35 pm, Kinsey Moore wrote:

On 8/31/2021 04:31, Sebastian Huber wrote:

On 30/08/2021 17:13, Kinsey Moore wrote:

On 8/30/2021 07:50, Sebastian Huber wrote:

On 30/08/2021 14:27, Kinsey Moore wrote:

On 8/30/2021 00:42, Sebastian Huber wrote:

Hello Kinsey,

why can't you use the existing fatal error extension for this? You just
have to test for an RTEMS_FATAL_SOURCE_EXCEPTION source. The fatal code
is a pointer to the exception frame.

Unfortunately, the fatal error extensions framework necessarily assumes
that the exception is fatal and so does not include the machinery to
perform a thread dispatch or restore the exception frame for additional
execution. It could theoretically be done in the fatal error extensions
context, but it would end up being reimplemented for every architecture and
you'd have to unwind the stack manually. I'm sure there are other ragged
edges that would have to be smoothed over as well.

Non-interrupt exceptions are not uniformly handled across architectures in
RTEMS currently. Adding the RTEMS_FATAL_SOURCE_EXCEPTION fatal source was an
attempt to unify this. I am not that fond of adding a second approach unless
there are strong technical reasons to do so.

This was in an effort to formalize how recoverable exceptions are handled.
Currently, it's done on SPARC by handling exception traps as you would an
interrupt trap since they share a common architecture on that platform. This
representation varies quite a bit among platforms, so we needed a different
mechanism.

I recently changed the non-interrupt exception handling on sparc, since it was
not robust against a corrupt stack pointer:

http://devel.rtems.org/ticket/4459


The initial fatal extensions are quite robust: you only need a stack, valid
read-only data, and valid code. So, using a user extension is the right thing
to do, but I don't think we need a new one.
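
For reference, a minimal sketch (not the patch under review) of a fatal
extension that picks out CPU exceptions, assuming the
RTEMS_FATAL_SOURCE_EXCEPTION source, whose fatal code is a pointer to the
exception frame:

  #include <rtems.h>
  #include <rtems/fatal.h>

  /* Runs on system termination; needs only a stack, read-only data, and
   * valid code. */
  static void example_fatal_extension(
    rtems_fatal_source source,
    bool               always_set_to_false,
    rtems_fatal_code   code
  )
  {
    (void) always_set_to_false;

    if ( source == RTEMS_FATAL_SOURCE_EXCEPTION ) {
      /* For this source, the fatal code carries the CPU exception frame. */
      rtems_exception_frame_print( (const rtems_exception_frame *) code );
    }
  }

Hooked in via CONFIGURE_INITIAL_EXTENSIONS, this works even when the workspace
and heap are unusable.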

Doing the non-interrupt exception processing on the stack which caused the
exception is a bit problematic, since the stack pointer might be corrupt as
well. It is more robust to switch to, for example, the interrupt stack. If the
exception was caused by an interrupt, then this exception is not recoverable.

The non-interrupt exception processing occurs on the interrupt stack, not the
thread/user stack. In the AArch64 support code provided, the stack is
switched back to the thread/user stack before thread dispatch and exception
frame restoration occurs.

You can only switch back to the thread stack if it is valid. Doing a thread
dispatch should only be done if you are sure that the system state is still
intact. This is probably not the case for most exceptions.

If the handler has declared that it handled the exception and corrected the
cause underlying the exception then the system state should be valid. If it
can't make that claim then it should not have handled the exception.

There are valid use cases for this, such as an unhandled fault in a task
suspending it; Till did this for EPICS years ago. A hypothetical sketch of that
use case follows.
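
As a purely hypothetical sketch of that use case (the handler signature, and
the assumption that directives may be invoked from this context, are
placeholders for whatever the final Exception Manager API provides, not the
API from this patch series):

  #include <rtems.h>

  /* Hypothetical handler: runs on behalf of the faulting thread.  It claims
   * the exception and parks the task for later inspection by a monitor or
   * debugger instead of terminating the system. */
  static bool suspend_faulting_task( void *frame )
  {
    (void) frame;

    (void) rtems_task_suspend( RTEMS_SELF );

    return true; /* handled: the rest of the system keeps running */
  }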

Our job is to provide the support to implement system level requirements and we
should leave the analysis and discussions about being valid to our users.

Libdebugger is a good example of a system that handles faults of all types and
RTEMS keeps running without an issue.


If the non-interrupt exception was caused by a thread, then you could do
some high level actions for some exceptions, such as floating-point
exceptions and arithmetic exceptions. If you get a data abort or instruction
error, then it is probably better to terminate the system.

I leave that decision to the handlers defined in this framework. In the case
of the exception-to-signal mapping, I'm carrying over the existing exception
set from the SPARC implementation.

It is probably this code:

+    case EXCEPTION_DATA_ABORT_READ:
+    case EXCEPTION_DATA_ABORT_WRITE:
+    case EXCEPTION_DATA_ABORT_UNSPECIFIED:
+    case EXCEPTION_INSTRUCTION_ABORT:
+    case EXCEPTION_MMU_UNSPECIFIED:
+    case EXCEPTION_ACCESS_ALIGNMENT:
+      signal = SIGSEGV;
+      break;
+
+    default:
+      /*
+       * Covers unknown, PC/SP alignment, illegal execution state, and any
+       * new exception classes that get added.
+       */
+      signal = SIGILL;
+      break;
+  }

Using signals to handle these exceptions is like playing Russian roulette.

You're right. Specifically, SP alignment faults should be moved to the
not-handled section because they're not actually handled here and would have
to be for execution to proceed. I'll make that change, thanks. A sketch of the
adjusted mapping is below.
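
As a sketch of what that adjustment might look like (the EXCEPTION_* names
below are stand-ins inferred from the comment in the snippet above, not
identifiers taken from the patch):

  #include <signal.h>
  #include <stdbool.h>

  /* Stand-in exception classes for illustration only. */
  typedef enum {
    EXCEPTION_SP_ALIGNMENT,
    EXCEPTION_DATA_ABORT_READ,
    EXCEPTION_UNKNOWN
  } example_exception_class;

  /* Returns false when the fault is not actually recoverable here, so the
   * caller declines to handle it and the system terminates instead of
   * faulting again on return from the exception. */
  static bool map_exception_to_signal( example_exception_class ec, int *signal )
  {
    switch ( ec ) {
      case EXCEPTION_SP_ALIGNMENT:
        return false; /* the stack pointer itself is unusable */

      case EXCEPTION_DATA_ABORT_READ:
        *signal = SIGSEGV;
        return true;

      default:
        *signal = SIGILL;
        return true;
    }
  }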

Non-interrupt exception handling is always architecture-dependent. It is just
a matter of how you organize it. In general, the most sensible way to deal
with non-interrupt exceptions is to log the error somehow and terminate the
system. The mapping to signals is a bit of a special case if you ask me. My
preferred way to handle non-interrupt exceptions would be to

1. switch to a dedicated stack

1. switch to a dedicated stack

2. save th

Re: [PATCH] tester: Limit simultaneous QEMU jobs to 1

2021-08-31 Thread Kinsey Moore

On 8/31/2021 18:00, Chris Johns wrote:

This is sensible. In summary:

1. Add `max-jobs` as a config-file-only setting with a default of 0.

2. Change the config `jobs` to `default-jobs`, again with 0 as the default.

3. Let the command line override the default jobs and raise an error if it
exceeds the maximum jobs allowed.

4. Provide a clear notice at the start and end of a run if the jobs used do not
match the default.


I'll work toward this solution.


Kinsey


Re: [PATCH] score: Document Futex Handler

2021-08-31 Thread Sebastian Huber

On 31/08/2021 22:31, Joel Sherrill wrote:
On Tue, Aug 31, 2021, 7:31 AM Sebastian Huber wrote:


Use EAGAIN instead of EWOULDBLOCK to be in line with the Linux man page.


EAGAIN and the Linux man page for what?


The behaviour of the futex operations is defined by Linux:

https://man7.org/linux/man-pages/man2/futex.2.html

Using the same error numbers helps to avoid confusion.

When you look at the history of the Linux man page, you see that they
replaced EWOULDBLOCK with EAGAIN over time. At the time of the RTEMS
futex implementation, the man page used EWOULDBLOCK.
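
For illustration, a small Linux-side sketch of the semantics in question (this
is the behaviour the RTEMS futex mirrors, not RTEMS code): FUTEX_WAIT fails
with EAGAIN, historically spelled EWOULDBLOCK, when the futex word no longer
holds the expected value.

  #include <errno.h>
  #include <linux/futex.h>
  #include <stddef.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  /* Returns 0 after a wake-up, EAGAIN if the word changed before we could
   * block (the caller retries its fast path), or another errno value. */
  static int example_futex_wait( int *uaddr, int expected )
  {
    long rc = syscall( SYS_futex, uaddr, FUTEX_WAIT, expected, NULL, NULL, 0 );

    if ( rc == -1 ) {
      /* On Linux, EWOULDBLOCK and EAGAIN have the same value, which is why
       * the man page could change the name without changing the ABI. */
      return errno;
    }

    return 0;
  }

Since EWOULDBLOCK == EAGAIN on Linux, code checking either name keeps working.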



Re: [PATCH] tester: Limit simultaneous QEMU jobs to 1

2021-08-31 Thread Chris Johns
On 1/9/21 12:25 pm, Kinsey Moore wrote:
> I'll work toward this solution.

Thanks, I appreciate your support in this area.

Chris
___
devel mailing list
devel@rtems.org
http://lists.rtems.org/mailman/listinfo/devel


Re: [PATCH rtems-libbsd] imx: Remove ccm functions already defined in RTEMS

2021-08-31 Thread Chris Johns
On 1/9/21 7:26 am, Gedare Bloom wrote:
> Sorry, I think libbsd is still a bit slushy, wait for Chris to OK it, thx

It is, so please wait a day or so. I have just finished the last stage of
target testing on PowerPC hardware.

I wanted to meet with Joel today but he could not make it at the last minute.

Thanks
Chris


Re: [PATCH rtems-libbsd] imx: Remove ccm functions already defined in RTEMS

2021-08-31 Thread Christian MAUDERER

Hello Gedare and Chris,

On 01.09.21 at 07:55, Chris Johns wrote:

On 1/9/21 7:26 am, Gedare Bloom wrote:

Sorry, I think libbsd is still a bit slushy, wait for Chris to OK it, thx


I planned to wait for Chris' work anyway.



It is, so please wait a day or so. I have just finished the last stage of
target testing on PowerPC hardware.



Of course I'll wait a few days. My patch is a really small one and it 
should be easy to rebase it.


Best regards

Christian


I wanted to meet with Joel today but he could not make it at the last minute.

Thanks
Chris


