Hello,

I am currently working on an update of libbsd to the latest FreeBSD head (see also https://devel.rtems.org/ticket/3472). It was quite a smooth process until May 2018. FreeBSD seems to receive a significant amount of funding to perform better on NUMA systems. They started to use lock-free data structures in the kernel and included the Concurrency Kit in the base system:

https://github.com/freebsd/freebsd/tree/master/sys/contrib/ck

The weak point of lock-free data structures is memory reclamation. FreeBSD introduced an epoch memory reclamation API:

https://github.com/freebsd/freebsd/blob/master/share/man/man9/epoch.9

It is now used for basic synchronization in the network stack and is hard to avoid. The Concurrency Kit and the epoch memory reclamation API are interesting features for RTEMS as well. The FreeBSD implementation needs a thread pinning feature, which is hard to implement in RTEMS. It turned out that thread pinning is only used as an optimization, see also:

https://lists.freebsd.org/pipermail/freebsd-hackers/2018-August/053165.html

Supporting everything in RTEMS is a lot of work, so I have to make some trade-offs. The implementation of this API must be as efficient as possible since it is used in the critical paths of the network stack. I will try to use a single global epoch and thread-specific records, as suggested by Matthew Macy, to avoid the need for per-processor data structures and thread pinning. One key issue is that epoch records must not be destroyed:

https://www.mankier.com/3/ck_epoch_register

The consequence of this is that unlimited thread objects may lead to undefined behaviour with this implementation approach. Also, thread-local storage cannot be used since it is reinitialized once a thread is restarted or reused. The epoch record must be included in the Thread_Control and must not be touched by _Thread_Initialize(). This means I have to move the API and its implementation along with the Concurrency Kit to RTEMS.
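
To make the record placement more concrete, here is a minimal sketch of a single global epoch with one record per thread object. All names (demo_thread, demo_epoch_enter(), and so on) are hypothetical; this is not the ck_epoch or FreeBSD epoch API, and the fixed table of threads only stands in for Thread_Control objects. It assumes a single writer calls demo_epoch_try_advance() and leaves out the deferred free lists entirely. The point is only that the record lives in the thread object and survives a re-initialization.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; not the ck_epoch or FreeBSD API. */
#define DEMO_MAX_THREADS 4

typedef struct {
  atomic_uint epoch;  /* global epoch observed at section entry */
  atomic_bool active; /* true while inside a read-side section */
} demo_epoch_record;

/* Stand-in for Thread_Control: the record is part of the thread object. */
typedef struct {
  demo_epoch_record epoch_record;
  int other_state;
} demo_thread;

static atomic_uint demo_global_epoch;
static demo_thread demo_threads[DEMO_MAX_THREADS];

/* Models thread restart or reuse: everything except the epoch record
 * is reset, so the record is never destroyed. */
void demo_thread_reinitialize(demo_thread *t)
{
  t->other_state = 0;
}

/* Enter a read-side section: mark the record active, then publish the
 * global epoch this thread observed. */
void demo_epoch_enter(demo_thread *t)
{
  atomic_store(&t->epoch_record.active, true);
  atomic_store(&t->epoch_record.epoch, atomic_load(&demo_global_epoch));
}

/* Leave the read-side section. */
void demo_epoch_exit(demo_thread *t)
{
  assert(atomic_load(&t->epoch_record.active));
  atomic_store(&t->epoch_record.active, false);
}

/* Writer side (single writer assumed): the global epoch may only move
 * forward if every active record has observed the current epoch. */
bool demo_epoch_try_advance(void)
{
  unsigned current = atomic_load(&demo_global_epoch);

  for (size_t i = 0; i < DEMO_MAX_THREADS; ++i) {
    demo_epoch_record *r = &demo_threads[i].epoch_record;

    if (atomic_load(&r->active) && atomic_load(&r->epoch) != current) {
      return false; /* a reader is still in an older epoch */
    }
  }

  atomic_store(&demo_global_epoch, current + 1);
  return true;
}
```

A real implementation would additionally queue retired objects per epoch and free them only after enough successful advances. No per-processor state and no pinning is needed here, at the cost of scanning all thread records on the writer side.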

Alternatively, I could try to implement the thread pinning feature. I am not sure if it is possible at all. It will definitely not work well together with mutex obtain timeouts.

Adding support for general purpose per-processor data structures would be quite easy. We just have to collect all per-processor data in a linker section and duplicate the section content for each secondary processor. Then we can use _Per_CPU_Information[] to get a pointer to the corresponding memory area.
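
A rough sketch of the linker section idea, assuming GCC and GNU ld on an ELF target: for a section whose name is a valid C identifier, GNU ld provides __start_<name> and __stop_<name> symbols. The section name demo_percpu, the DEMO_PER_CPU macro, and the demo_areas[] table (standing in for a pointer that would live in _Per_CPU_Information[]) are all hypothetical. In RTEMS the copies would more likely come from the linker command file or the workspace than from malloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_CPU_COUNT 4

/* Hypothetical macro: collect a variable in the per-processor section. */
#define DEMO_PER_CPU __attribute__((section("demo_percpu"), used))

DEMO_PER_CPU int demo_counter = 7;
DEMO_PER_CPU long demo_bytes = 0;

/* Section bounds provided by GNU ld (assumption: ELF target). */
extern char __start_demo_percpu[];
extern char __stop_demo_percpu[];

/* Stand-in for a per-processor area pointer in _Per_CPU_Information[]. */
static char *demo_areas[DEMO_CPU_COUNT];

/* Duplicate the section content for each secondary processor. */
int demo_per_cpu_initialize(void)
{
  size_t size = (size_t) (__stop_demo_percpu - __start_demo_percpu);

  demo_areas[0] = __start_demo_percpu; /* boot processor uses the original */

  for (unsigned cpu = 1; cpu < DEMO_CPU_COUNT; ++cpu) {
    demo_areas[cpu] = malloc(size);

    if (demo_areas[cpu] == NULL) {
      return -1;
    }

    memcpy(demo_areas[cpu], __start_demo_percpu, size);
  }

  return 0;
}

/* Translate the address of a variable in the original section into the
 * address of the copy that belongs to the given processor. */
void *demo_per_cpu_get(unsigned cpu, void *original)
{
  ptrdiff_t offset = (char *) original - __start_demo_percpu;

  assert(cpu < DEMO_CPU_COUNT);
  return demo_areas[cpu] + offset;
}
```

Each processor then sees its own copy with the initial values from the section, for example demo_per_cpu_get(2, &demo_counter) yields a pointer to the counter of processor 2.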

--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax     : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP     : Public key available on request.

This message is not a business communication within the meaning of the EHUG.
