The Classic RTEMS and POSIX APIs have at least three weaknesses.
* Dynamic memory (the workspace) is used to allocate object pools. This
requires a complex configuration with heavy use of the C preprocessor.
* Objects are created via function calls which return an object identifier.
The object operations use this identifier and map it internally to the
object representation.
* The object operations use a rich set of options and attributes. These
parameters must be evaluated and validated on each call to figure out what
to do.
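
To make the second and third points concrete, here is a minimal toy sketch of
the two styles. It is not RTEMS code: all names (toy_mutex, toy_create,
toy_lock_by_id, toy_lock_direct) are hypothetical and the "lock" is only a
flag. It just shows where the identifier validation and mapping happen.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names are RTEMS APIs. */
struct toy_mutex {
  int locked;
};

#define TOY_POOL_SIZE 4

/* Pool whose size must be configured up front. */
static struct toy_mutex toy_pool[TOY_POOL_SIZE];
static unsigned char toy_in_use[TOY_POOL_SIZE];

/* Identifier-based style: creation hands out an identifier from the pool. */
static unsigned int toy_create(void)
{
  for (unsigned int i = 0; i < TOY_POOL_SIZE; ++i) {
    if (!toy_in_use[i]) {
      toy_in_use[i] = 1;
      toy_pool[i].locked = 0;
      return i + 1; /* 0 is reserved as the invalid identifier */
    }
  }

  return 0; /* pool exhausted, the configured size was too small */
}

/* Every operation validates the identifier and maps it to the object. */
static int toy_lock_by_id(unsigned int id)
{
  if (id == 0 || id > TOY_POOL_SIZE || !toy_in_use[id - 1]) {
    return -1;
  }

  toy_pool[id - 1].locked = 1;
  return 0;
}

/* Self-contained style: the caller owns the storage, no lookup is needed. */
static void toy_lock_direct(struct toy_mutex *mutex)
{
  assert(mutex != NULL);
  mutex->locked = 1;
}
```

In the identifier-based variant each call pays for the range check and the
table lookup. In the self-contained variant the pointer is the object.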

For applications that use fine-grained locking, mapping the identifier to
the object representation and evaluating the parameters adds a significant
overhead that may degrade the performance dramatically. An example is the
FreeBSD network stack, which uses hundreds of locks in a basic setup. Here
the performance can be easily measured in terms of throughput and processor
utilization. The port of the FreeBSD network stack now uses its own priority
inheritance mutex implementation, which is not based on the Classic RTEMS
objects. The blocking part, however, uses the standard thread queues. The
overall implementation is quite simple.

Another example which benefits from self-contained objects is OpenMP. For
OpenMP, the performance of the POSIX configuration of libgomp differs
significantly from an optimized implementation using self-contained objects
available via Newlib <sys/lock.h>, see https://devel.rtems.org/ticket/2274.
Some test cases are more than a hundred times slower in the POSIX
configuration of libgomp.

Since Newlib should use locks to protect some global data structures
(https://devel.rtems.org/ticket/1247) and GCC uses locks for the C++ and
OpenMP support, the application configuration must account for these locks.
It is difficult to figure out how many and which objects will be used by
Newlib and GCC for a particular application. It would be much easier with
self-contained objects, where the object user has the responsibility to
provide the storage space. This could be a statically initialized global
object or an object embedded in a structure.
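
As a sketch of what "the user provides the storage" could look like, the
following uses the mutex layout and the _MUTEX_INITIALIZER shown later in
this mail. The names example_global_lock, example_counter, example_counters
and example_lock_storage are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct _Thread_Control;

struct _Thread_queue_Heads;

struct _Ticket_lock_Control {
  unsigned int _next_ticket;
  unsigned int _now_serving;
};

struct _Thread_queue_Queue {
  struct _Thread_queue_Heads *_heads;
  struct _Ticket_lock_Control _Lock;
};

struct _Mutex_Control {
  struct _Thread_queue_Queue _Queue;
  struct _Thread_Control *_owner;
};

#define _MUTEX_INITIALIZER { { 0, { 0, 0 } }, 0 }

/* A statically initialized global lock, no pool configuration involved. */
static struct _Mutex_Control example_global_lock = _MUTEX_INITIALIZER;

/* Hypothetical user structure which embeds its lock. */
struct example_counter {
  struct _Mutex_Control lock;
  unsigned long value;
};

static struct example_counter example_counters[3] = {
  { _MUTEX_INITIALIZER, 0 },
  { _MUTEX_INITIALIZER, 0 },
  { _MUTEX_INITIALIZER, 0 }
};

/* The storage used by the locks is known to the user at compile time. */
static size_t example_lock_storage(void)
{
  size_t embedded = sizeof(example_counters) / sizeof(example_counters[0]);

  assert(embedded == 3);
  return sizeof(example_global_lock)
    + embedded * sizeof(struct _Mutex_Control);
}
```

Nothing here has to be counted in a central configuration. Each user of a
lock simply declares it where it is needed.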

A list of requirements for self-contained lock objects follows.
* The initial value of the lock object structure components shall be zero.
This makes it possible to use memset(lock, 0, sizeof(*lock)) for
initialization. Statically initialized lock objects can reside in the .bss
section.
* The lock object structure definition shall be independent of RTEMS header
files and the RTEMS configuration. So only standard types and pointers to
types with a forward declaration can be used. The recent change of the
thread queue implementation makes this possible.
* The lock shall avoid priority inversion problems.
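
The first requirement can be illustrated with a small sketch based on the
mutex layout shown later in this mail. The helper names
(example_mutex_is_initial, example_runtime_init, example_bss_lock) are
hypothetical. The sketch also assumes that a null pointer is represented by
all-bits-zero, which holds on the usual targets but is not guaranteed by the
C standard.

```c
#include <assert.h>
#include <string.h>

struct _Thread_Control;

struct _Thread_queue_Heads;

struct _Ticket_lock_Control {
  unsigned int _next_ticket;
  unsigned int _now_serving;
};

struct _Thread_queue_Queue {
  struct _Thread_queue_Heads *_heads;
  struct _Ticket_lock_Control _Lock;
};

struct _Mutex_Control {
  struct _Thread_queue_Queue _Queue;
  struct _Thread_Control *_owner;
};

#define _MUTEX_INITIALIZER { { 0, { 0, 0 } }, 0 }

/* No explicit initializer: zero-initialized, so it can live in .bss. */
static struct _Mutex_Control example_bss_lock;

/* Run-time initialization as described in the requirement. */
static void example_runtime_init(struct _Mutex_Control *mutex)
{
  assert(mutex != NULL);
  /* Assumption: all-bits-zero is a null pointer on this target. */
  memset(mutex, 0, sizeof(*mutex));
}

/* Returns 1 if every component has its initial (zero) value. */
static int example_mutex_is_initial(const struct _Mutex_Control *mutex)
{
  return mutex->_Queue._heads == NULL
    && mutex->_Queue._Lock._next_ticket == 0
    && mutex->_Queue._Lock._now_serving == 0
    && mutex->_owner == NULL;
}
```

With this property the static initializer, memset() and plain .bss placement
all lead to the same initial state.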

Self-contained objects exist as a prototype implementation and show
excellent results in terms of performance. The data structures defined in
Newlib must be independent of RTEMS build configurations, such as SMP
enabled/disabled, profiling enabled/disabled, debug enabled/disabled, etc.
The basic structure is like this:

struct _Thread_Control;

struct _Thread_queue_Heads;

struct _Ticket_lock_Control {
  unsigned int _next_ticket;
  unsigned int _now_serving;
};

struct _Thread_queue_Queue {
  struct _Thread_queue_Heads *_heads;
  struct _Ticket_lock_Control _Lock;
};

struct _Mutex_Control {
  struct _Thread_queue_Queue _Queue;
  struct _Thread_Control *_owner;
};

So, a mutex object consists of only 16 bytes on a 32-bit architecture. It
supports uni-processor and SMP configurations (the SMP support needs 8 bytes
for the ticket lock). Two implementation details are exposed to Newlib.
1. The SMP lock data structure. One possible alternative to a ticket lock
is an MCS lock. It uses only one pointer instead of two integers. So this
could be addressed with a union in case we really use MCS locks in the
future.
2. The thread queue structure. This is only a pointer to the thread queue
heads and the lock. This should be acceptable, since it is unlikely that
the thread queue structure will change soon, because this structure is
already highly optimized.
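
For readers not familiar with ticket locks, here is a minimal sketch of the
idea together with the size arithmetic from above. It is not the RTEMS
implementation. All names are hypothetical, the counters use C11 atomics
instead of plain unsigned int, and it assumes that atomic_uint has the same
size as unsigned int, which is the case on the usual targets.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical types which mirror the layout described in this mail. */
struct example_ticket_lock {
  atomic_uint next_ticket;
  atomic_uint now_serving;
};

struct example_queue {
  void *heads;
  struct example_ticket_lock lock;
};

struct example_mutex {
  struct example_queue queue;
  void *owner;
};

/* Zero-initialized object with static storage duration. */
static struct example_mutex example_mutex_object;

/* Two pointers plus two integers: 16 bytes with 32-bit pointers. */
static int example_mutex_size_is_minimal(void)
{
  return sizeof(struct example_mutex)
    == 2 * sizeof(void *) + 2 * sizeof(unsigned int);
}

/* Take a ticket, then spin until this ticket is served (FIFO order). */
static void example_ticket_acquire(struct example_ticket_lock *lock)
{
  unsigned int my_ticket = atomic_fetch_add_explicit(
    &lock->next_ticket,
    1U,
    memory_order_relaxed
  );

  while (
    atomic_load_explicit(&lock->now_serving, memory_order_acquire)
      != my_ticket
  ) {
    /* Busy wait */
  }
}

/* Only the lock owner writes now_serving, so load plus store is enough. */
static void example_ticket_release(struct example_ticket_lock *lock)
{
  unsigned int current;

  assert(lock != NULL);
  current = atomic_load_explicit(&lock->now_serving, memory_order_relaxed);
  atomic_store_explicit(
    &lock->now_serving,
    current + 1U,
    memory_order_release
  );
}
```

An MCS lock would replace the two counters with a single tail pointer, which
is why a union of both variants would not grow the structure.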

I see four use cases for self-contained objects.
1. The new network stack (it already uses its own implementation of
self-contained objects).
2. The OpenMP support of GCC (libgomp). Here it is a must-have for
performance reasons.
3. The Newlib internal locks. The advantages compared to using Classic API
objects are better performance, no configuration issues, and a smaller
memory footprint.
4. The GCC thread model. Same advantages as before. In addition, this would
enable an easy-to-use and efficient C11 and C++11 thread support.

In the long run this could lead to a very small footprint system without
dependencies on dynamic memory and with a purely static initialization.
What are your opinions?
--
Sebastian Huber, embedded brains GmbH
/*
 * Copyright (c) 2015 embedded brains GmbH.  All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
 * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
 * A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT
 * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
 * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
 * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
 * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
 * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */

#ifndef _SYS_LOCK_H_
#define _SYS_LOCK_H_

#include <sys/cdefs.h>
#include <stddef.h>

__BEGIN_DECLS

struct _Thread_Control;

struct _Thread_queue_Heads;

struct _Ticket_lock_Control {
  unsigned int _next_ticket;
  unsigned int _now_serving;
};

struct _Thread_queue_Queue {
  struct _Thread_queue_Heads *_heads;
  struct _Ticket_lock_Control _Lock;
};

struct _Mutex_Control {
  struct _Thread_queue_Queue _Queue;
  struct _Thread_Control *_owner;
};

struct _Mutex_recursive_Control {
  struct _Mutex_Control _Mutex;
  unsigned int _nest_level;
};

struct _Semaphore_Control {
  struct _Thread_queue_Queue _Queue;
  unsigned int _count;
};

struct _Futex_Control {
  struct _Thread_queue_Queue _Queue;
};

#define _THREAD_QUEUE_INITIALIZER { 0, { 0, 0 } }

#define _MUTEX_INITIALIZER { _THREAD_QUEUE_INITIALIZER, 0 }

#define _MUTEX_RECURSIVE_INITIALIZER { _MUTEX_INITIALIZER, 0 }

#define _SEMAPHORE_INITIALIZER(_count) { _THREAD_QUEUE_INITIALIZER, _count }

#define _FUTEX_INITIALIZER { _THREAD_QUEUE_INITIALIZER }

static inline void
_Mutex_Initialize(struct _Mutex_Control *_mutex)
{
  struct _Mutex_Control _init = _MUTEX_INITIALIZER;

  *_mutex = _init;
}

void _Mutex_Acquire(struct _Mutex_Control *);

int _Mutex_Try_acquire(struct _Mutex_Control *);

void _Mutex_Release(struct _Mutex_Control *);

static inline void
_Mutex_Destroy(struct _Mutex_Control *_mutex)
{
  (void)_mutex;
}

static inline void
_Mutex_recursive_Initialize(struct _Mutex_recursive_Control *_mutex)
{
  struct _Mutex_recursive_Control _init = _MUTEX_RECURSIVE_INITIALIZER;

  *_mutex = _init;
}

void _Mutex_recursive_Acquire(struct _Mutex_recursive_Control *);

int _Mutex_recursive_Try_acquire(struct _Mutex_recursive_Control *);

void _Mutex_recursive_Release(struct _Mutex_recursive_Control *);

static inline void
_Mutex_recursive_Destroy(struct _Mutex_recursive_Control *_mutex)
{
  (void)_mutex;
}

static inline void
_Semaphore_Initialize(struct _Semaphore_Control *_semaphore,
    unsigned int _count)
{
  struct _Semaphore_Control _init = _SEMAPHORE_INITIALIZER(_count);

  *_semaphore = _init;
}

void _Semaphore_Wait(struct _Semaphore_Control *);

void _Semaphore_Post(struct _Semaphore_Control *);

static inline void
_Semaphore_Destroy(struct _Semaphore_Control *_semaphore)
{
  (void)_semaphore;
}

static inline void
_Futex_Initialize(struct _Futex_Control *_futex)
{
  struct _Futex_Control _init = _FUTEX_INITIALIZER;

  *_futex = _init;
}

int _Futex_Wait(struct _Futex_Control *, int *, int);

int _Futex_Wake(struct _Futex_Control *, int);

static inline void
_Futex_Destroy(struct _Futex_Control *_futex)
{
  (void)_futex;
}

int _Sched_Count(void);

int _Sched_Index(void);

int _Sched_Name_to_index(const char *, size_t);

int _Sched_Processor_count(int);

/* Newlib internal locks */

typedef struct _Mutex_Control _LOCK_T;

typedef struct _Mutex_recursive_Control _LOCK_RECURSIVE_T;

#define __LOCK_INIT(_qualifier, _designator) \
  _qualifier _LOCK_T _designator = _MUTEX_INITIALIZER

/* A recursive lock needs the recursive type to match its initializer */
#define __LOCK_INIT_RECURSIVE(_qualifier, _designator) \
  _qualifier _LOCK_RECURSIVE_T _designator = _MUTEX_RECURSIVE_INITIALIZER

#define __lock_init(_lock) _Mutex_Initialize(&(_lock))
#define __lock_acquire(_lock) _Mutex_Acquire(&(_lock))
#define __lock_try_acquire(_lock) _Mutex_Try_acquire(&(_lock))
#define __lock_release(_lock) _Mutex_Release(&(_lock))
#define __lock_close(_lock) _Mutex_Destroy(&(_lock))

#define __lock_init_recursive(_lock) _Mutex_recursive_Initialize(&(_lock))
#define __lock_acquire_recursive(_lock) _Mutex_recursive_Acquire(&(_lock))
#define __lock_try_acquire_recursive(_lock) \
  _Mutex_recursive_Try_acquire(&(_lock))
#define __lock_release_recursive(_lock) _Mutex_recursive_Release(&(_lock))
#define __lock_close_recursive(_lock) _Mutex_recursive_Destroy(&(_lock))

__END_DECLS

#endif /* _SYS_LOCK_H_ */