Package: musl-dev
Tags: upstream
User: helm...@debian.org
Usertags: rebootstrap

Hi Reiner,

I got quite a bit further with a musl bootstrap, but now it fails
building systemd and this one is hard. It probably requires some
discussion.

The immediate issue is that #include <printf.h> does not work, because
musl does not ship that header. This is a known problem: it has been
filed on the systemd side and rejected there, filed on the musl side and
rejected there, and is now carried as one patch among many in a number
of distros.
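
For illustration, here is a minimal sketch of the kind of call that
needs this header. parse_printf_format() is the GNU extension declared
in <printf.h>. The helper name format_matches and the limit of 8
arguments are made up for this example and are not taken from systemd.
It compiles against glibc and should fail at the #include on unpatched
musl.

```c
#include <printf.h>   /* GNU extension header, absent from musl */
#include <stddef.h>

#define MAX_ARGS 8    /* arbitrary limit chosen for this sketch */

/* Hypothetical helper: returns 1 if fmt consumes exactly the argument
 * types listed in expected[0..n_expected-1], and 0 otherwise. */
static int format_matches(const char *fmt, const int *expected,
                          size_t n_expected)
{
        int types[MAX_ARGS];
        size_t n, i;

        if (n_expected > MAX_ARGS)
                return 0;

        /* Fills at most MAX_ARGS entries and returns the number of
         * arguments the format string requires. */
        n = parse_printf_format(fmt, MAX_ARGS, types);
        if (n != n_expected)
                return 0;

        for (i = 0; i < n; i++)
                if (types[i] != expected[i])
                        return 0;
        return 1;
}
```

A caller would pass something like { PA_INT, PA_INT | PA_FLAG_LONG,
PA_DOUBLE } for the format "%d %ld %f".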

I suppose the best musl + systemd patchset is the one from the
OpenEmbedded people at:
https://git.openembedded.org/openembedded-core/tree/meta/recipes-core/systemd/systemd
You quickly notice that this is not for the faint of heart. It's
probably around 100 kB worth of patches. NixOS also uses exactly these
patches.

In most of them the issue goes like this:

musl people (Rich Felker in particular) say that the requested
functionality is not specified by POSIX and thus not included in musl.
Get it specified and musl will add it.

systemd people say that it doesn't make sense to ship copies of the
relevant functions and that the C library should provide it.

This is not the first time we face missing functionality in musl. An
earlier example was the fts library. Back then, we agreed that Debian's
musl should include fts and you added it to the Debian packaging.

Now does the same work for printf.h? And strndupa? And basename? And
probably more?
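
To give an idea of the size of these two, here is a rough sketch of
what fallbacks could look like. The names compat_strndupa and
compat_basename are hypothetical. The first relies on GNU statement
expressions and alloca(), so it assumes gcc or clang. The second
mimics the GNU (non-modifying) basename and not the POSIX one from
<libgen.h>. This is not the code from the OpenEmbedded patches.

```c
#include <alloca.h>
#include <string.h>

/* Hypothetical stand-in for GNU strndupa(): copies at most n bytes of
 * s into stack memory and NUL-terminates the copy. The result is only
 * valid until the calling function returns. */
#define compat_strndupa(s, n)                                          \
        ({                                                             \
                const char *compat_s_ = (s);                           \
                size_t compat_len_ = strnlen(compat_s_, (n));          \
                char *compat_new_ = alloca(compat_len_ + 1);           \
                compat_new_[compat_len_] = '\0';                       \
                (char *) memcpy(compat_new_, compat_s_, compat_len_);  \
        })

/* Hypothetical stand-in for the GNU basename(): returns a pointer to
 * the part after the last '/', without modifying path. A path ending
 * in '/' yields the empty string. */
static inline const char *compat_basename(const char *path)
{
        const char *p = strrchr(path, '/');

        return p ? p + 1 : path;
}
```

Neither is hard to write. The question is rather where such code
should live.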

I think we need a decision on how to deal with these issues in the
future to avoid conflict.

Constraints:
 * musl upstream will not include these patches.
 * Neither systemd upstream nor Debian systemd will include the
   OpenEmbedded patchset. It is way too big.
 * On Debian, we expect a system libc to provide more than musl
   provides.

Possibly there can be some kind of addon library that fills in the
missing pieces and also gains the fts functionality? Maybe such a
library could be shared with OpenEmbedded and NixOS to reduce their
patch stack?

What do you think?

I've attached a demo patch to show how printf.h could be included in a
way that works in practice. However, since the code is taken from
systemd and is LGPL-licensed, you cannot include it in the package as
is.

Helmut
--- a/debian/clean
+++ b/debian/clean
@@ -4,4 +4,6 @@
 include/crypt.h
 src/crypt/
 include/fts.h
+include/printf.h
 src/fts/
+src/printf/
--- a/debian/musl-printf/printf.c
+++ b/debian/musl-printf/printf.c
@@ -0,0 +1,273 @@
+/*-*- Mode: C; c-basic-offset: 8; indent-tabs-mode: nil -*-*/
+
+/***
+  This file is part of systemd.
+
+  Copyright 2014 Emil Renner Berthing <syst...@esmil.dk>
+
+  With parts from the musl C library
+  Copyright 2005-2014 Rich Felker, et al.
+
+  systemd is free software; you can redistribute it and/or modify it
+  under the terms of the GNU Lesser General Public License as published by
+  the Free Software Foundation; either version 2.1 of the License, or
+  (at your option) any later version.
+
+  systemd is distributed in the hope that it will be useful, but
+  WITHOUT ANY WARRANTY; without even the implied warranty of
+  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+  Lesser General Public License for more details.
+
+  You should have received a copy of the GNU Lesser General Public License
+  along with systemd; If not, see <http://www.gnu.org/licenses/>.
+***/
+
+#include <printf.h>
+
+#include <stddef.h>
+#include <string.h>
+
+static const char *consume_nonarg(const char *fmt)
+{
+        do {
+                if (*fmt == '\0')
+                        return fmt;
+        } while (*fmt++ != '%');
+        return fmt;
+}
+
+static const char *consume_num(const char *fmt)
+{
+        for (;*fmt >= '0' && *fmt <= '9'; fmt++)
+                /* do nothing */;
+        return fmt;
+}
+
+static const char *consume_argn(const char *fmt, size_t *arg)
+{
+        const char *p = fmt;
+        size_t val = 0;
+
+        if (*p < '1' || *p > '9')
+                return fmt;
+        do {
+                val = 10*val + (*p++ - '0');
+        } while (*p >= '0' && *p <= '9');
+
+        if (*p != '$')
+                return fmt;
+        *arg = val;
+        return p+1;
+}
+
+static const char *consume_flags(const char *fmt)
+{
+        while (1) {
+                switch (*fmt) {
+                case '#':
+                case '0':
+                case '-':
+                case ' ':
+                case '+':
+                case '\'':
+                case 'I':
+                        fmt++;
+                        continue;
+                }
+                return fmt;
+        }
+}
+
+enum state {
+        BARE,
+        LPRE,
+        LLPRE,
+        HPRE,
+        HHPRE,
+        BIGLPRE,
+        ZTPRE,
+        JPRE,
+        STOP
+};
+
+enum type {
+        NONE,
+        PTR,
+        INT,
+        UINT,
+        ULLONG,
+        LONG,
+        ULONG,
+        SHORT,
+        USHORT,
+        CHAR,
+        UCHAR,
+        LLONG,
+        SIZET,
+        IMAX,
+        UMAX,
+        PDIFF,
+        UIPTR,
+        DBL,
+        LDBL,
+        MAXTYPE
+};
+
+static const short pa_types[MAXTYPE] = {
+        [NONE]   = PA_INT,
+        [PTR]    = PA_POINTER,
+        [INT]    = PA_INT,
+        [UINT]   = PA_INT,
+        [ULLONG] = PA_INT | PA_FLAG_LONG_LONG,
+        [LONG]   = PA_INT | PA_FLAG_LONG,
+        [ULONG]  = PA_INT | PA_FLAG_LONG,
+        [SHORT]  = PA_INT | PA_FLAG_SHORT,
+        [USHORT] = PA_INT | PA_FLAG_SHORT,
+        [CHAR]   = PA_CHAR,
+        [UCHAR]  = PA_CHAR,
+        [LLONG]  = PA_INT | PA_FLAG_LONG_LONG,
+        [SIZET]  = PA_INT | PA_FLAG_LONG,
+        [IMAX]   = PA_INT | PA_FLAG_LONG_LONG,
+        [UMAX]   = PA_INT | PA_FLAG_LONG_LONG,
+        [PDIFF]  = PA_INT | PA_FLAG_LONG_LONG,
+        [UIPTR]  = PA_INT | PA_FLAG_LONG,
+        [DBL]    = PA_DOUBLE,
+        [LDBL]   = PA_DOUBLE | PA_FLAG_LONG_DOUBLE
+};
+
+#define S(x) [(x)-'A']
+#define E(x) (STOP + (x))
+
+static const unsigned char states[]['z'-'A'+1] = {
+        { /* 0: bare types */
+                S('d') = E(INT), S('i') = E(INT),
+                S('o') = E(UINT),S('u') = E(UINT),S('x') = E(UINT), S('X') = E(UINT),
+                S('e') = E(DBL), S('f') = E(DBL), S('g') = E(DBL),  S('a') = E(DBL),
+                S('E') = E(DBL), S('F') = E(DBL), S('G') = E(DBL),  S('A') = E(DBL),
+                S('c') = E(CHAR),S('C') = E(INT),
+                S('s') = E(PTR), S('S') = E(PTR), S('p') = E(UIPTR),S('n') = E(PTR),
+                S('m') = E(NONE),
+                S('l') = LPRE,   S('h') = HPRE, S('L') = BIGLPRE,
+                S('z') = ZTPRE,  S('j') = JPRE, S('t') = ZTPRE
+        }, { /* 1: l-prefixed */
+                S('d') = E(LONG), S('i') = E(LONG),
+                S('o') = E(ULONG),S('u') = E(ULONG),S('x') = E(ULONG),S('X') = E(ULONG),
+                S('e') = E(DBL),  S('f') = E(DBL),  S('g') = E(DBL),  S('a') = E(DBL),
+                S('E') = E(DBL),  S('F') = E(DBL),  S('G') = E(DBL),  S('A') = E(DBL),
+                S('c') = E(INT),  S('s') = E(PTR),  S('n') = E(PTR),
+                S('l') = LLPRE
+        }, { /* 2: ll-prefixed */
+                S('d') = E(LLONG), S('i') = E(LLONG),
+                S('o') = E(ULLONG),S('u') = E(ULLONG),
+                S('x') = E(ULLONG),S('X') = E(ULLONG),
+                S('n') = E(PTR)
+        }, { /* 3: h-prefixed */
+                S('d') = E(SHORT), S('i') = E(SHORT),
+                S('o') = E(USHORT),S('u') = E(USHORT),
+                S('x') = E(USHORT),S('X') = E(USHORT),
+                S('n') = E(PTR),
+                S('h') = HHPRE
+        }, { /* 4: hh-prefixed */
+                S('d') = E(CHAR), S('i') = E(CHAR),
+                S('o') = E(UCHAR),S('u') = E(UCHAR),
+                S('x') = E(UCHAR),S('X') = E(UCHAR),
+                S('n') = E(PTR)
+        }, { /* 5: L-prefixed */
+                S('e') = E(LDBL),S('f') = E(LDBL),S('g') = E(LDBL), S('a') = E(LDBL),
+                S('E') = E(LDBL),S('F') = E(LDBL),S('G') = E(LDBL), S('A') = E(LDBL),
+                S('n') = E(PTR)
+        }, { /* 6: z- or t-prefixed (assumed to be same size) */
+                S('d') = E(PDIFF),S('i') = E(PDIFF),
+                S('o') = E(SIZET),S('u') = E(SIZET),
+                S('x') = E(SIZET),S('X') = E(SIZET),
+                S('n') = E(PTR)
+        }, { /* 7: j-prefixed */
+                S('d') = E(IMAX), S('i') = E(IMAX),
+                S('o') = E(UMAX), S('u') = E(UMAX),
+                S('x') = E(UMAX), S('X') = E(UMAX),
+                S('n') = E(PTR)
+        }
+};
+
+size_t parse_printf_format(const char *fmt, size_t n, int *types)
+{
+        size_t i = 0;
+        size_t last = 0;
+
+        memset(types, 0, n * sizeof(*types));
+
+        while (1) {
+                size_t arg;
+                unsigned int state;
+
+                fmt = consume_nonarg(fmt);
+                if (*fmt == '\0')
+                        break;
+                if (*fmt == '%') {
+                        fmt++;
+                        continue;
+                }
+                arg = 0;
+                fmt = consume_argn(fmt, &arg);
+                /* flags */
+                fmt = consume_flags(fmt);
+                /* width */
+                if (*fmt == '*') {
+                        size_t warg = 0;
+                        fmt = consume_argn(fmt+1, &warg);
+                        if (warg == 0)
+                                warg = ++i;
+                        if (warg > last)
+                                last = warg;
+                        if (warg <= n && types[warg-1] == NONE)
+                                types[warg-1] = INT;
+                } else
+                        fmt = consume_num(fmt);
+                /* precision */
+                if (*fmt == '.') {
+                        fmt++;
+                        if (*fmt == '*') {
+                                size_t parg = 0;
+                                fmt = consume_argn(fmt+1, &parg);
+                                if (parg == 0)
+                                        parg = ++i;
+                                if (parg > last)
+                                        last = parg;
+                                if (parg <= n && types[parg-1] == NONE)
+                                        types[parg-1] = INT;
+                        } else {
+                                if (*fmt == '-')
+                                        fmt++;
+                                fmt = consume_num(fmt);
+                        }
+                }
+                /* length modifier and conversion specifier */
+                state = BARE;
+                do {
+                        unsigned char c = *fmt++;
+
+                        if (c < 'A' || c > 'z')
+                                continue;
+                        state = states[state]S(c);
+                        if (state == 0)
+                                continue;
+                } while (state < STOP);
+
+                if (state == E(NONE))
+                        continue;
+
+                if (arg == 0)
+                        arg = ++i;
+                if (arg > last)
+                        last = arg;
+                if (arg <= n)
+                        types[arg-1] = state - STOP;
+        }
+
+        if (last > n)
+                last = n;
+        for (i = 0; i < last; i++)
+                types[i] = pa_types[types[i]];
+
+        return last;
+}
--- a/debian/musl-printf/printf.h
+++ b/debian/musl-printf/printf.h
@@ -0,0 +1,52 @@
+/*-*- Mode: C; c-basic-offset: 8; indent-tabs-mode: nil -*-*/
+
+/***
+  This file is part of systemd.
+
+  Copyright 2014 Emil Renner Berthing <syst...@esmil.dk>
+
+  With parts from the GNU C Library
+  Copyright 1991-2014 Free Software Foundation, Inc.
+
+  systemd is free software; you can redistribute it and/or modify it
+  under the terms of the GNU Lesser General Public License as published by
+  the Free Software Foundation; either version 2.1 of the License, or
+  (at your option) any later version.
+
+  systemd is distributed in the hope that it will be useful, but
+  WITHOUT ANY WARRANTY; without even the implied warranty of
+  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+  Lesser General Public License for more details.
+
+  You should have received a copy of the GNU Lesser General Public License
+  along with systemd; If not, see <http://www.gnu.org/licenses/>.
+***/
+
+#ifndef _PRINTF_H_
+#define _PRINTF_H_
+
+#include <stddef.h>
+
+enum {				/* C type: */
+  PA_INT,			/* int */
+  PA_CHAR,			/* int, cast to char */
+  PA_WCHAR,			/* wide char */
+  PA_STRING,			/* const char *, a '\0'-terminated string */
+  PA_WSTRING,			/* const wchar_t *, wide character string */
+  PA_POINTER,			/* void * */
+  PA_FLOAT,			/* float */
+  PA_DOUBLE,			/* double */
+  PA_LAST
+};
+
+/* Flag bits that can be set in a type returned by `parse_printf_format'.  */
+#define	PA_FLAG_MASK		0xff00
+#define	PA_FLAG_LONG_LONG	(1 << 8)
+#define	PA_FLAG_LONG_DOUBLE	PA_FLAG_LONG_LONG
+#define	PA_FLAG_LONG		(1 << 9)
+#define	PA_FLAG_SHORT		(1 << 10)
+#define	PA_FLAG_PTR		(1 << 11)
+
+size_t parse_printf_format(const char *fmt, size_t n, int *types);
+
+#endif /* !_PRINTF_H_ */
--- a/debian/rules
+++ b/debian/rules
@@ -69,7 +69,12 @@
 	cp debian/musl-fts/fts.c debian/musl-fts/config.h src/fts/
 	cp debian/musl-fts/fts.h include/

-override_dh_auto_configure: debian/scripts/$(MUSL_TRIPLE).path copy_fts
+copy_printf:
+	mkdir -p src/printf
+	cp debian/musl-printf/printf.c src/printf/
+	cp debian/musl-printf/printf.h include/
+
+override_dh_auto_configure: debian/scripts/$(MUSL_TRIPLE).path copy_fts copy_printf
 	dh_auto_configure -- --libdir=/usr/lib/$(MUSL_TRIPLE) --includedir=/usr/include/$(MUSL_TRIPLE) --host=$(DEB_HOST_GNU_TYPE) --enable-gcc-wrapper=$(GCC_WRAPPER) --enable-debug

 execute_after_dh_auto_install:
