Hello Chris,

Am 19.03.21 um 02:11 schrieb Chris Johns:
On 3/3/21 7:41 pm, Christian MAUDERER wrote:
Hello Chris,

Am 03.03.21 um 02:17 schrieb Chris Johns:
On 2/3/21 7:26 pm, Christian MAUDERER wrote:
Hello Chris,

Am 02.03.21 um 01:03 schrieb Chris Johns:
On 1/3/21 7:24 pm, Christian MAUDERER wrote:
Hello Chris,

thanks for the review.

Am 26.02.21 um 19:04 schrieb Chris Johns:
On 26/2/21 2:01 am, Christian Mauderer wrote:
Dynamically allocate a big enough file descriptor set for select(). A
better solution would be to use kqueue() instead of select().
---
     .../racoon/rtems-bsd-racoon-session-data.h    |  6 +--
     ipsec-tools/src/racoon/session.c              | 40 +++++++++++++++++++
     2 files changed, 43 insertions(+), 3 deletions(-)

diff --git a/ipsec-tools/src/racoon/rtems-bsd-racoon-session-data.h
b/ipsec-tools/src/racoon/rtems-bsd-racoon-session-data.h
index b869a1518..196107a35 100644
--- a/ipsec-tools/src/racoon/rtems-bsd-racoon-session-data.h
+++ b/ipsec-tools/src/racoon/rtems-bsd-racoon-session-data.h
@@ -2,11 +2,11 @@
     #include <rtems/linkersets.h>
     #include "rtems-bsd-racoon-data.h"
     /* session.c */
-RTEMS_LINKER_RWSET_CONTENT(bsd_prog_racoon, static fd_set active_mask);
-RTEMS_LINKER_RWSET_CONTENT(bsd_prog_racoon, static fd_set preset_mask);
+RTEMS_LINKER_RWSET_CONTENT(bsd_prog_racoon, static _types_fd_set
*allocated_active_mask);
+RTEMS_LINKER_RWSET_CONTENT(bsd_prog_racoon, static _types_fd_set
*allocated_preset_mask);
     RTEMS_LINKER_RWSET_CONTENT(bsd_prog_racoon, static int nfds);
     RTEMS_LINKER_RWSET_CONTENT(bsd_prog_racoon, static int signals[]);
     RTEMS_LINKER_RWSET_CONTENT(bsd_prog_racoon, static sig_atomic_t volatile
volatile sigreq[]);
-RTEMS_LINKER_RWSET_CONTENT(bsd_prog_racoon, static struct fd_monitor
fd_monitors[]);
+RTEMS_LINKER_RWSET_CONTENT(bsd_prog_racoon, static struct fd_monitor
*allocated_fd_monitors);
     RTEMS_LINKER_RWSET_CONTENT(bsd_prog_racoon, static struct
fd_monitor_list
fd_monitor_tree[]);
     RTEMS_LINKER_RWSET_CONTENT(bsd_prog_racoon, static struct sched
scflushsa);
diff --git a/ipsec-tools/src/racoon/session.c
b/ipsec-tools/src/racoon/session.c
index 65124c15e..90120c761 100644
--- a/ipsec-tools/src/racoon/session.c
+++ b/ipsec-tools/src/racoon/session.c
@@ -65,6 +65,10 @@
     #include <sys/stat.h>
     #include <paths.h>
     #include <err.h>
+#ifdef __rtems__
+#include <sys/param.h>
+#include <rtems/libio_.h>
+#endif /* __rtems__ */
       #include <netinet/in.h>
     #include <resolv.h>
@@ -123,8 +127,16 @@ static void check_sigreq __P((void));
     static void check_flushsa __P((void));
     static int close_sockets __P((void));
     +#ifndef __rtems__
     static fd_set preset_mask, active_mask;
     static struct fd_monitor fd_monitors[FD_SETSIZE];
+#else /* __rtems__ */
+static fd_set *allocated_preset_mask, *allocated_active_mask;
+static struct fd_monitor *allocated_fd_monitors;
+#define preset_mask (*allocated_preset_mask)
+#define active_mask (*allocated_active_mask)
+#define fd_monitors (allocated_fd_monitors)
+#endif /* __rtems__ */
     static TAILQ_HEAD(fd_monitor_list, fd_monitor)
fd_monitor_tree[NUM_PRIORITIES];
     static int nfds = 0;
     @@ -134,7 +146,11 @@ static struct sched scflushsa =
SCHED_INITIALIZER();
     void
     monitor_fd(int fd, int (*callback)(void *, int), void *ctx, int
priority)
     {
+#ifndef __rtems__
         if (fd < 0 || fd >= FD_SETSIZE) {
+#else /* __rtems__ */
+    if (fd < 0 || fd >= rtems_libio_number_iops) {
+#endif /* __rtems__ */
             plog(LLV_ERROR, LOCATION, NULL, "fd_set overrun");
             exit(1);
         }
@@ -158,7 +174,11 @@ monitor_fd(int fd, int (*callback)(void *, int), void
*ctx, int priority)
     void
     unmonitor_fd(int fd)
     {
+#ifndef __rtems__
         if (fd < 0 || fd >= FD_SETSIZE) {
+#else /* __rtems__ */
+    if (fd < 0 || fd >= rtems_libio_number_iops) {
+#endif /* __rtems__ */
             plog(LLV_ERROR, LOCATION, NULL, "fd_set overrun");
             exit(1);
         }
@@ -186,7 +206,22 @@ session(void)
         struct fd_monitor *fdm;
           nfds = 0;
+#ifndef __rtems__
         FD_ZERO(&preset_mask);
+#else /* __rtems__ */
+    allocated_preset_mask = calloc(sizeof(fd_set),
+        howmany(rtems_libio_number_iops, sizeof(fd_set) * 8));

Does `maxfiles` work here?


To be honest: I'm not sure.

According to the comment in file.h:

extern int maxfiles; /* kernel limit on number of open files */


Yes.

Sounds like it _can_ be the same as the maximum file number but doesn't
have to.

I think we need to have them be the same value.

I didn't find where we implement it. It's declared as an extern int
maxfiles but
I didn't find any definition. I found it only in libbsd in
freebsd/sys/kern/uipc_socket.c where it is defined like follows:

#define maxfiles rtems_libio_number_iops

Ah OK. I knew it had been assigned somewhere and yes it looks like it is local
to that file.


So question is: Where and how is maxfiles defined?


I have provided a value in the rtemsbsd init file as part of the set og globals
we need to maintained.

Somehow I missed that. Where can I find it?

Again sorry, I was in a rush and I was not clear. I have add this in my new
changes for NFSv4 support and this is what I sorted out.


Ah, OK. In that case I'll keep the rtems_libio_number_iops at the moment if it
is OK for you.

Yes that is fine.


Thanks.




+    if (allocated_preset_mask == NULL)
+        errx(1, "failed to allocate preset_mask");
+    allocated_active_mask = calloc(sizeof(fd_set),
+        howmany(rtems_libio_number_iops, sizeof(fd_set) * 8));
+    if (allocated_active_mask == NULL)
+        errx(1, "failed to allocate active_mask");
+    allocated_fd_monitors = calloc(
+        rtems_libio_number_iops, sizeof(struct fd_monitor));
+    if (allocated_fd_monitors == NULL)
+        errx(1, "failed to allocate fd_monitors");

At the core of this issue is the rotating fd allocation that we have in
libio. A
report from a FreeBSD machine I have is:

$ sysctl -a | grep maxfiles
kern.maxfiles: 1037243
kern.maxfilesperproc: 933516

I doubt a select process in FreeBSD needs a select array with that many
bits. I
have added similar code else where but I am wondering if this is really
what we
want to do. A side effect for us is the stack usage is not bounded and that
is a
problem. I think our newlib based max setting is too small.

I think we have to distinguish between FreeBSD kernel space and user space.
If I
have seen it correctly, FreeBSD uses kqueues or maybe sometimes poll
instead of
select in kernel space most of the time. That avoids the problem with big file
numbers.

Yes kqueue is nice but we need to support existing code and this is a primary
role libbsd needs to perform. The lack is signals in libbsd is another
source of
issue.


I didn't want to say that we should not support select. I only wanted to say
that FreeBSD most likely just avoids that problem in kernel space so that they
don't have to support a select with 1037243 files. In user space they limit a
select to 1024 files.

Yes they do. My suggestion of us using 1024 is FreeBSD applications are fine
with this then they should be portable to RTEMS and libbsd and the need for us
to patch code is removed.


1024 for a select is quite much for an embedded system. It means that we will
need 128 bytes of stack for each select call in the future. It's a quite big
change.

The select call in LibBSD double this internally.


Doesn't make it better.


Another possibility would be 256 (32 bytes) like some other *BSD do.

Maybe we should start a new discussion thread about the right FD_SETSIZE so that
more developers get involved?

Default for FreeBSD seems to be a FD_SETSIZE of 1024. See the Notes section in
the select man page:

https://www.freebsd.org/cgi/man.cgi?query=select&sektion=2&manpath=freebsd-release-ports#end




Yes.

I think select is used in user space applications most of the time. I would
guess that FreeBSD has some file number mapping between kernel and user space
that would allow every application to open 1024 files.

Yes. There are other checks and arrays in the kernel's select that handle the
size of 1024 or smaller. We should align ourselves with FreeBSD. A select call
uses a lot of stack.


I have no problem if we align more with FreeBSD. But I'm not sure how much
effort that would be. And we have to be careful not to increase the memory usage
for small targets too much.

We need to find a path between that sort of space optimization and the
functionality libbsd brings. The select call is costly in terms of stack size as
the kern_select call has:

          /*
           * The magic 2048 here is chosen to be just enough for FD_SETSIZE
           * infds with the new FD_SETSIZE of 1024, and more than enough for
           * FD_SETSIZE infds, outfds and exceptfds with the old FD_SETSIZE
           * of 256.
           */
          fd_mask s_selbits[howmany(2048, NFDBITS)];

I do not know if this means we need to be more concerned about the size of less.


Hm. They could have used FD_SETSIZE instead of the magic value with a comment...

It might pay to check the surrounding code a litte more before raising something
like that on the FreeBSds hackers list. There are subtle optimisations happening
to make this call perform.

I didn't plan to rise it on the FreeBSD mailing list. If I plan to fight against existing magic numbers I'll start in RTEMS before I fight them in FreeBSD ;-)


I think a big problem for RTEMS is that we have shared file numbers for all
processes. This makes 1024 a lot less compared to starting from 0 for every
process.

We are a single process so I do not follow. We are no different to other
processes. We have lived with a small file set size for years until libio was
changed to cycle over the fd numbers

We started to include libbsd with some applications that are more used to desktop or server environments like wpa supplicant or racoon. These applications originally didn't expect to run in a single process environment but have an own process for each of them.

I'm just not sure whether increasing FD_SETSIZE is the best solution for that problem. It only moves it so that it does happen in fewer applications at the cost that all applications that use select and can work with the current limit will need more stack.

That means: I wouldn't object to such a patch but I don't want to be the one creating and suggesting it.


FYI I have a major set of changes to libbsd that enables FreeBSD descriptor
support and moves the libio support to specific interfaces.

OK. When do you plan to add that and to which branches?

I am looking 6-freebsd-12 and master. Thses changes are a major rework and not
suitable for 5.

I assumed that.


I would like to add this
patch set to 5 sooner or later because it fixes a potential bug in racoon for
that branch too.

Sorry I missed this for 5. I will need to think about this on 5. I do not think
it is a problem but I am not sure about it on 6 and beyond. I would prefer we
consider a better fix for select that the limited 64 descriptors we currently
have.

Maybe we should take a multi step approach:

Use the workaround like I suggested for 5 and (at the moment) for master. A
similar workaround is already used in quite some other locations in libbsd. So
it's not a new method.

Start a ticket and a discussion for a better solution.

Sounds a good approach. I am fine with the changes because it is what we have to
do and this was not meant to halt the progress, rather I wanted to have the
discussion and raise the concerns.


OK. Thanks.

If you don't add further objections, I'll create a ticket for the patch and apply it in the next days. I'll create an extra ticket and mail thread for applying it to 5 afterwards.

Best regards

Christian

Like I said: Maybe we should create a ticket and start an extra
discussion thread to get more ideas?

I'm not sure yet how that
could look like. Some impressions from other systems:

- FreeBSD, OpenBSD, NetBSD and DragonFly BSD are using the same method we use
(or most likely the other way round) only with different FD_SETSIZEs:

https://cgit.freebsd.org/src/tree/sys/sys/select.h
https://github.com/NetBSD/src/blob/trunk/sys/sys/fd_set.h
https://github.com/openbsd/src/blob/master/sys/sys/select.h
https://gitweb.dragonflybsd.org/dragonfly.git/blob/HEAD:/sys/sys/_fd_set.h

- Linux Kernels nolibc.h used in Tests does the same again (GPL or MIT):

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/include/nolibc/nolibc.h#n2522



- Apple is the same again. I'm not sure about the Apple license therefore no
link here so I don't spoil anyone beneath myself.

There are others that could be worth a look like Haiku, Musl-Libc, and so on.
But my general impression is that there is no much better solution at least in
the bigger systems except for avoiding select or setting a bigger FD_SETSIZE.
The most common ones seem to be 256 or 1024. But that only moves the problem a
bit further away.

Yes select does not offer a great range of solutions. As Sebastian has stated a
number of times kqueue does and this should be encouraged. And kqueue by passes
the signal issue that seems to acompany select, eg timer signals.

I see the central quesiton being the role libbsd plays and I feel it is to aid
porting code to RTEMS and easily as possible. New code written for RTEMS should
avoid select.


Agreed: Select is evil ;-)

Just kidding: It's clear that we have to handle application that use select too
due to portability reasons. I'm just not sure what a good solution would be. It
feels like every solution is a bad one: If we increase FD_SETSIZE to typical
values on desktop systems (256 or 1024), we will need a lot of stack and only
cover a part of the applications. If we change every location in the libraries
it increases maintenance effort.

We also keep in mind if an application uses select _and_ signals it will be
broken on LibBSD. PTP2 was like that.

Chris


--
--------------------------------------------
embedded brains GmbH
Herr Christian MAUDERER
Dornierstr. 4
82178 Puchheim
Germany
email: christian.maude...@embedded-brains.de
phone: +49-89-18 94 741 - 18
fax:   +49-89-18 94 741 - 08

Registergericht: Amtsgericht München
Registernummer: HRB 157899
Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler
Unsere Datenschutzerklärung finden Sie hier:
https://embedded-brains.de/datenschutzerklaerung/
_______________________________________________
devel mailing list
devel@rtems.org
http://lists.rtems.org/mailman/listinfo/devel

Reply via email to