Re: [systemd-devel] give unprivileged nspawn container write access to host wayland socket

2021-11-22 Thread systemd-devel


Hey Nozz,

I've tried the exact same setup and ran into this problem. I've explained it a 
bit better here[1].
Since Linux kernel 5.12 there are filesystem ID mappings that can be used 
for that in combination with --private-users=pick.
I've written the pull request[0] to add support for that to nspawn. In my 
opinion this is the best way to share such a socket.
There is not yet a systemd release containing the pull request.
I guess your socket is located on a tmpfs, and I'm not sure whether the tmpfs 
implementation in Linux supports those mappings yet; last time I checked (when 
I wrote the pull request) it didn't.
Yes, support for filesystem ID mappings depends on the source filesystem. You 
could work around this by moving the socket to another location, for example 
onto an ext4 filesystem, until tmpfs supports it as well.


Alternatively, you could use extended ACLs for that.
Another option would be to allow access for "other" on the socket, but not on 
the parent folder, and use --bind as is.
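
The last option can be sketched as follows. This is only an illustration of the permission bits involved: the directory and file are stand-ins created in a temporary location (a plain file takes the place of the real socket), and the helper name allow_other_rw is made up for this example. Whether relaxing the mode is acceptable depends on who else can reach the socket.

```python
import os
import stat
import tempfile


def allow_other_rw(path: str) -> int:
    """Add read/write permission for "other" to path and return the new mode."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    new_mode = mode | stat.S_IROTH | stat.S_IWOTH
    os.chmod(path, new_mode)
    return new_mode


# Stand-ins for the runtime directory and the socket (hypothetical paths).
runtime_dir = tempfile.mkdtemp()
os.chmod(runtime_dir, 0o700)            # parent folder stays private to the owner
socket_path = os.path.join(runtime_dir, "wayland-1")
open(socket_path, "w").close()          # a plain file stands in for the socket
os.chmod(socket_path, 0o600)

print(oct(allow_other_rw(socket_path)))  # 0o606
```

Because the parent folder keeps mode 0700, other host users still cannot reach the socket by path; only a bind mount of the socket itself exposes it.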


Best regards,
nd

[0] https://github.com/systemd/systemd/pull/19828
[1] https://lists.freedesktop.org/archives/systemd-devel/2021-May/046503.html


OpenPGP_signature
Description: OpenPGP digital signature


[systemd-devel] systemd-nspawn with filesystem id mapping

2021-05-30 Thread systemd-devel
Hi!

I was very pleased to see the "nspawn: add support for kernel 5.12 ID mapping 
mounts #19438"-pull request and went right at it to try it out.
The following was tested on the current git head of systemd running on 
archlinux.

What I try to achieve on a high level is kind of emulating bubblewrap and 
executing chromium under wayland with gpu acceleration and working audio using 
PipeWire.
For that I need to pass some sockets and devices to the container using 
--bind-ro. I want to use --private-users=pick to have easier separation 
between multiple containers.
That means I do not know the running uid of the process before nspawn spawns my 
container. That results in problems accessing the sockets.
Until now I used setfacl to work around this limitation and allow access to the 
sockets.
I was hoping to be able to skip that with --private-users-ownership=map .

I'm passing three sockets belonging to uid 1000 on the host to a container with 
--private-users=pick and try to access them as uid 1000 (name "user") in the 
container.
Everything is happening on an ext4 file system. I'd prefer btrfs but that is 
(so far) lacking id mapping support.
The full call looks like this:

statepath="/machines/state/chromium/${profilename}"
systemd-nspawn \
    -D /machines/images/archlinux-chromium/ \
    --private-users=pick \
    --private-users-ownership=map \
    --no-new-privileges=yes \
    --as-pid2 \
    --machine "chromium-${profilename}" \
    --user user \
    --bind-ro /var/run/user/1000/pulse/native:/sockets/pulse/native \
    --bind-ro /var/run/user/1000/wayland-1:/sockets/wayland-1 \
    --bind-ro /var/run/user/1000/pipewire-0:/sockets/pipewire-0 \
    --bind "${statepath}:/home/user" \
    --bind /dev/dri/renderD128 \
    -E WAYLAND_DISPLAY=wayland-1 \
    -E XDG_RUNTIME_DIR=/sockets \
    chromium --enable-features=UseOzonePlatform --ozone-platform=wayland

This results in the following output:

Spawning container chromium-default on /machines/images/archlinux-chromium.
Press ^] three times within 1s to kill container.
Selected user namespace base 552206336 and range 65536.
Failed to create mount point /machines/images/archlinux-chromium/sockets/pipewire-0: Value too large for defined data type
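
To make the uid problem concrete, here is a small sketch of the arithmetic behind the base and range reported in the output above. It assumes a simple contiguous mapping (container id N becomes host id base + N); the function name host_id is made up for illustration.

```python
def host_id(base: int, container_id: int, id_range: int = 65536) -> int:
    """Map a container uid/gid to the host id for a contiguous mapping."""
    if not 0 <= container_id < id_range:
        raise ValueError(f"id {container_id} is outside the mapped range")
    return base + container_id


base = 552206336              # "Selected user namespace base" from the output
print(host_id(base, 0))       # 552206336 (container root)
print(host_id(base, 1000))    # 552207336 (container "user")
```

Under that assumption, the process running as uid 1000 inside the container appears on the host as uid 552207336, which is not the owner (uid 1000) of the sockets.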

I've run strace on it, this results in the following relevant output:

[pid   524] mount("/machines/state/chromium/default", "/proc/self/fd/8", NULL, MS_BIND|MS_REC, NULL) = 0
[pid   524] close(8) = 0
[pid   524] newfstatat(AT_FDCWD, "/var/run/user/1000/pipewire-0", {st_mode=S_IFSOCK|0666, st_size=0, ...}, 0) = 0
[pid   524] openat(AT_FDCWD, "/machines/images/archlinux-chromium", O_RDONLY|O_CLOEXEC|O_PATH|O_DIRECTORY) = 8
[pid   524] openat(8, "sockets", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|O_PATH) = 10
[pid   524] newfstatat(10, "", {st_mode=S_IFDIR|0700, st_size=4096, ...}, AT_EMPTY_PATH) = 0
[pid   524] close(8) = 0
[pid   524] openat(10, "pipewire-0", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|O_PATH) = -1 ENOENT (No such file or directory)
[pid   524] close(10) = 0
[pid   524] newfstatat(AT_FDCWD, "/machines/images/archlinux-chromium/sockets", {st_mode=S_IFDIR|0700, st_size=4096, ...}, 0) = 0
[pid   524] openat(AT_FDCWD, "/machines/images/archlinux-chromium/sockets/pipewire-0", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|O_PATH) = -1 ENOENT (No such file or directory)
[pid   524] openat(AT_FDCWD, "/machines/images/archlinux-chromium/sockets/pipewire-0", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0644) = -1 EOVERFLOW (Value too large for defined data type)
[pid   524] writev(2, [{iov_base="Failed to create mount point /ma"..., iov_len=122}, {iov_base="\n", iov_len=1}], 2Failed to create mount point /machines/images/archlinux-chromium/sockets/pipewire-0: Value too large for defined data type
) = 123

This maps to the touch in nspawn-mount.c at line 754.
If I skip the --bind(-ro) part this works fine (except chromium of course not 
working), same if I keep the binds and remove the --private-users-ownership=map.
I'm kind of lost on how to go on about this issue at this point.
Have I made a mistake or wrong assumption about how that should work?
Should I open an issue on github about that?
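
For anyone trying to reproduce this, a small filter like the following can pull the failing calls out of a longer strace log. It is a rough sketch that assumes one complete syscall per line; the regular expression and the name failed_calls are made up for this example, and the sample lines are copied from the trace above.

```python
import re

# Matches e.g. '[pid   524] openat(...) = -1 ENOENT (No such file or directory)'
FAILED = re.compile(r"^(?:\[pid\s+\d+\]\s+)?(\w+)\(.*\)\s*=\s*-1\s+(E[A-Z0-9]+)")


def failed_calls(lines):
    """Return (syscall, errno name) for every call that returned -1."""
    result = []
    for line in lines:
        m = FAILED.match(line.strip())
        if m:
            result.append((m.group(1), m.group(2)))
    return result


sample = [
    '[pid   524] close(8) = 0',
    '[pid   524] openat(10, "pipewire-0", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|O_PATH)'
    ' = -1 ENOENT (No such file or directory)',
    '[pid   524] openat(AT_FDCWD, "/machines/images/archlinux-chromium/sockets/pipewire-0",'
    ' O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0644)'
    ' = -1 EOVERFLOW (Value too large for defined data type)',
]
for name, err in failed_calls(sample):
    print(name, err)
```

The ENOENT results are expected (the mount point does not exist yet); the EOVERFLOW on the O_CREAT call is the one that makes nspawn give up.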

Thanks,
nd
___
systemd-devel mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] systemd-nspawn with filesystem id mapping

2021-06-04 Thread systemd-devel
Hi again,

after some more debugging, this EOVERFLOW seems to be the result of a call to 
may_o_create in fs/namei.c in the kernel.
There is a check:

if (!fsuidgid_has_mapping(dir->dentry->d_sb, mnt_userns))
        return -EOVERFLOW;

This seems to be the one returning EOVERFLOW to nspawn, which causes the 
container spawn to fail.
My guess would be that this is a systemd bug when combining filesystem id 
mapping with --bind.
Before I spend more time debugging this, has anyone so far used --bind 
with --private-users=pick and --private-users-ownership=map successfully?

As far as I understand, pull request #19438 didn't add any handling to the 
mount_bind function. Was this maybe overlooked?
In my understanding a remount_idmap is missing in that function, and the 
touch needs to be done in the correct user namespace or with mapped 
uid/gids.
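
One way to picture the check is the toy model below. It is a simplified illustration, not the kernel's implementation: the names IdMap, to_fs_id and may_create are made up, and the mapping (on-disk ids 0..65535 shown as base..base+65535, with the base taken from the earlier nspawn output) is an assumption.

```python
from typing import NamedTuple, Optional


class IdMap(NamedTuple):
    fs_start: int      # first id as stored on the filesystem
    mapped_start: int  # first id as seen through the id-mapped mount
    count: int         # number of consecutive ids covered


def to_fs_id(caller_id: int, idmap: IdMap) -> Optional[int]:
    """Translate a caller's id back to an on-disk id, or None if unmapped."""
    offset = caller_id - idmap.mapped_start
    if 0 <= offset < idmap.count:
        return idmap.fs_start + offset
    return None


def may_create(caller_uid: int, caller_gid: int, idmap: IdMap) -> bool:
    """Toy counterpart of the check: both ids need a mapping."""
    return (to_fs_id(caller_uid, idmap) is not None
            and to_fs_id(caller_gid, idmap) is not None)


# Assumed mapping: on-disk 0..65535 shown as base..base+65535.
idmap = IdMap(fs_start=0, mapped_start=552206336, count=65536)

print(may_create(0, 0, idmap))                  # False: host root is unmapped
print(may_create(552207336, 552207336, idmap))  # True: container uid 1000
```

In this model, a process that creates the mount point as host uid 0 has no on-disk id to write, which would be consistent with the touch failing while the same operation from inside the mapped range would succeed.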

I'm new to the systemd source code, could somebody confirm that I'm on the 
right track there and not heading in the wrong direction?

Thanks,
nd





[systemd-devel] (no subject)

2015-08-22 Thread systemd-devel-bounces
Subject: Re: [systemd-devel] user/session buses
From: Mantas Mikulėnas
To: Michał Zegan, [email protected]
Date: Sat, 22 Aug 2015 14:58:24

Well, you just wouldn't have more than one graphical session. That's part
of the general plan afaik.

Note that this is already half-broken, because some of those programs
actually *expect* to be unique *per user* – e.g. dconf-daemon for writing to
the dconf db – and having two copies of it in two sessions might be bad…

On Sat, Aug 22, 2015, 13:36 Michał Zegan <[email protected]> wrote:

> Hello.
>
> I believe, although may be wrong, that session buses were used to
> enforce single instances of programs, like a program registered a name
> on dbus and another instance of the same program could not run.
> How would it affect user buses in case of multiple graphical user sessions?


Re: [systemd-devel] Where does resolved takes its data from?

2018-09-05 Thread gima+ml . systemd-devel
systemd-resolved has a D-Bus API, which network configuration managers 
such as systemd-networkd and NetworkManager use to set the hostname 
resolution configuration that systemd-resolved works with.


You can see the runtime configuration of systemd-resolved by running 
`systemd-resolve --status`. To see what protocols (DNS, LLMNR, MDNS) are 
used to resolve a specific hostname, use `systemd-resolve 
somemachine.local`, for example.


The protocols that are used during hostname resolution can be toggled 
per-interface using the same command, or they can be set via the D-Bus 
API by some network configuration manager.


Caution: The following is "as far as I know":
Please note that the systemd-resolved D-Bus API provides methods to do 
hostname resolution with more control over the resolution method than 
the functions provided by the GNU C library. These latter functions 
inspect the `hosts:` entry of `/etc/nsswitch.conf` to determine the 
plugins that are used for hostname resolution. One of them should be 
`resolve`, so that queries made through the GNU C hostname resolution 
API are directed to systemd-resolved.


Sorry if this veered into the territory of "I didn't ask this question". 
I just thought that clarifying the whole picture could help in better 
setting up hostname resolution.
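
As a quick way to check the `hosts:` entry mentioned above, something like the following could be used. It is only a sketch: the function name hosts_modules is made up, the parsing is deliberately simple, and the example content is illustrative rather than taken from a real system.

```python
import re
from typing import List


def hosts_modules(nsswitch_text: str) -> List[str]:
    """Return the NSS modules listed on the `hosts:` line, in order."""
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()          # drop comments
        if line.startswith("hosts:"):
            rest = line[len("hosts:"):]
            rest = re.sub(r"\[[^\]]*\]", " ", rest)  # drop [STATUS=action] items
            return rest.split()
    return []


# Illustrative content; a real file lives at /etc/nsswitch.conf.
example = """
passwd: files systemd
hosts: files resolve [!UNAVAIL=return] dns
"""
modules = hosts_modules(example)
print(modules)               # ['files', 'resolve', 'dns']
print("resolve" in modules)  # True
```

On a real machine the same function can be fed the contents of `/etc/nsswitch.conf` to see whether `resolve` is listed at all.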




[systemd-devel] systemd.link MACAddress= matches OpenVPN tun device

2018-10-29 Thread gima+ml . systemd-devel
Name    : systemd (commit c38499d476026d999558a7eee9c95ca2fa41e115)
Version : 239.2-1

I have a systemd.link file that gives my usb modem a more recognizable 
name. I saw some renaming errors in the journal and noticed that systemd 
also tried to rename my VPN device. This shouldn't happen and I 
investigated. Here's the result:


It appears that the `50-usbmodem.link` file is being applied to the 
`tunvpn` device, even though the file has a MACAddress filter to match 
only the usbmodem.



I have the following file:


/etc/systemd/network/50-usbmodem.link

[Match]
MACAddress=aa:bb:cc:dd:ee:ff

[Link]
Name=usbmodem



And by running the following command, it can be seen that the problem 
really occurs.



$ udevadm test-builtin net_setup_link /sys/class/net/tunvpn/

calling: test-builtin
Load module index
Parsed configuration file /etc/systemd/network/50-usbmodem.link
Created link configuration context.
ID_NET_DRIVER=tun
Config file /etc/systemd/network/50-usbmodem.link applies to device tunvpn
link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.

ID_NET_LINK_FILE=/etc/systemd/network/50-usbmodem.link
ID_NET_NAME=usbmodem
Unload module index
Unloaded link configuration context.


The tun device has no ethernet address, as it's a L3 interface, so the 
MACAddress really really shouldn't match.

$ ip link show tunvpn
xx: tunvpn:  mtu 1500 qdisc fq_codel state UNKNOWN mode DEFAULT group default qlen 100
    link/none



I fixed this temporarily by adding the following line to the [Match] 
section:



Driver=huawei_cdc_ncm


I'm not entirely sure, but this appears to be a bug.

Maybe relevant section: 
https://github.com/systemd/systemd/blob/c38499d476026d999558a7eee9c95ca2fa41e115/src/udev/net/link-config.c#L218
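
To spell out the behavior I would expect, here is a small model of the matching. It describes the expected semantics only and says nothing about what systemd 239 actually does: the functions load_match and expected_match are made up, the device dictionaries are hand-written stand-ins, and comparing values case-insensitively is a simplification.

```python
import configparser

LINK_FILE = """\
[Match]
MACAddress=aa:bb:cc:dd:ee:ff

[Link]
Name=usbmodem
"""


def load_match(text):
    """Parse the [Match] section of a .link file into a dict."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str          # keep key case (MACAddress, Driver)
    parser.read_string(text)
    return dict(parser["Match"])


def expected_match(match, device):
    """Expected semantics: every listed condition must hold, and an
    attribute the device lacks (None) never satisfies a condition."""
    for key, wanted in match.items():
        actual = device.get(key)
        if actual is None or actual.lower() != wanted.lower():
            return False
    return True


match = load_match(LINK_FILE)
usbmodem = {"MACAddress": "aa:bb:cc:dd:ee:ff", "Driver": "huawei_cdc_ncm"}
tunvpn = {"MACAddress": None, "Driver": "tun"}   # link/none: no hardware address

print(expected_match(match, usbmodem))   # True
print(expected_match(match, tunvpn))     # False
```

In this model the tun device is rejected because it has no address to compare, which is the outcome the MACAddress= filter was meant to give without the extra Driver= line.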





[systemd-devel] Missing PropertiesChanged signal for service start

2019-02-11 Thread ashitha v via systemd-devel
Hi,

I have a service file as follows:

[Unit]
Description= "Daemon description"
After=a.service b.service c.service
OnFailure=failure_handler@%p.service

[Service]
WorkingDirectory=/usr/sbin
ExecStartPre=/bin/sleep 30
ExecStart=


When this service starts, I expected a signal indicating state=active.
When I reboot the system multiple times, the signal indicating
"active" is sometimes missing.

I got the signal with ActiveState=activating, SubState=start-pre every
time, but the signal indicating ActiveState="active" and
SubState="running" was missing on some reboots.

The service is running and shows the active state every time. What is the
reason for the missing signal? I am also checking whether the sleep in
ExecStartPre is required for this service, and I am wondering if it has
something to do with the missing signal.


Thanks
Ashitha

Re: [systemd-devel] Mount a remote FS as a user

2019-02-12 Thread Daniel Tihelka via systemd-devel
Hello,
OK, thanks for the clarification.

I was afraid that the situation is like you have described. Still it
surprises me that even the sshfs case cannot be handled by user
instance of systemd ...

Do you have any information on whether the kernel is going to open autofs
to unprivileged clients?
Or would it be possible to write a D-Bus-capable daemon (or use/extend
udisks or systemd capabilities?) that handles the mounts for a
particular user? That is, a user would provide the remote host, fs type,
username, password, required mount point and access permissions, and the
daemon would then mount it for the user as required. Or does this approach
have a security flaw I don't see?

Thanks,
DT

On Mon, Feb 11, 2019 at 6:27 PM Lennart Poettering wrote:
>
> On Mo, 11.02.19 15:59, Daniel Tihelka ([email protected]) wrote:
>
> > Hello,
> > I can mount a shared file system (sshfs in particular) as an ordinary user.
> >
> > Now I would like to have it handled by systemd on-demand (automount).
> > However, creating the automount unit and starting it fails with error:
>
> autofs (the kernel subsystem behind the .automount unit type) is
> accessible to privileged clients only, and systemd --user is not
> privileged in general. This means what you are trying to do is simply
> not supported by the kernel.
>
> We could start supporting this if the kernel would open up autofs for
> unpriv clients, like it did for fuse mounts. However, I don't see that
> happening any time soon.
>
> Sorry!
>
> Lennart
>
> --
> Lennart Poettering, Red Hat

Re: [systemd-devel] Missing PropertiesChanged signal for service start

2019-02-12 Thread ashitha v via systemd-devel
Hi Lennart,

I missed some details in the previous mail.
This is seen on systemd 230. Unfortunately, I cannot do a systemd upgrade
now.

Subscribe() is called on service org.freedesktop.systemd1, path
/org/freedesktop/systemd1, interface org.freedesktop.systemd1.Manager. To make
sure that the signal was not missed due to an error in the Subscribe()
logic, I also ran a script that runs "/usr/bin/dbus-monitor
--system" and redirects its output to a log file. The dbus-monitor
script is guaranteed to start before the service in question, so it
doesn't miss any signal. When the issue happens, I don't see any signal
indicating the active state in the dbus-monitor log file.
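
The log check can be automated along these lines. This is a sketch under assumptions: the regular expression and the name active_states are made up, and the sample is a hand-written excerpt in the style of dbus-monitor output, not a real capture from the affected system.

```python
import re

ACTIVE_STATE = re.compile(r'string "ActiveState"\s+variant\s+string "([^"]+)"')


def active_states(log_text):
    """Return every ActiveState value announced in the log, in order."""
    return ACTIVE_STATE.findall(log_text)


# Hand-written excerpt in the style of dbus-monitor output (not a real capture).
sample = '''
   string "org.freedesktop.systemd1.Unit"
   array [
      dict entry(
         string "ActiveState"
         variant             string "activating"
      )
      dict entry(
         string "SubState"
         variant             string "start-pre"
      )
   ]
'''
states = active_states(sample)
print(states)               # ['activating']
print("active" in states)   # False
```

A log from a boot where the issue occurs would show "activating" without a later "active" entry for the unit.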

Thanks
Ashitha

On Tue, Feb 12, 2019 at 2:14 AM Lennart Poettering wrote:

> On Mo, 11.02.19 19:50, systemd Mailing List (
> [email protected]) wrote:
>
> > Hi,
> >
> > I have a service file as follows:
> >
> > [Unit]
> > Description= "Daemon description"
> > After=a.service b.service c.service
> > OnFailure=failure_handler@%p.service
> >
> > [Service]
> > WorkingDirectory=/usr/sbin
> > ExecStartPre=/bin/sleep 30
> > ExecStart=
> >
> >
> > When this service starts I expected a signal indicating state=active.
> > When I reboot the system multiple times, the signal indicating
> > "active" is sometimes missing.
> >
> > I got the signal ActiveState=activating, SubState=start-pre at all
> > times. But signal indicating ActiveState="active" and
> > SubState="running" was missing for some reboots.
> >
> > The service is running and shows active state all the time. What is
> > reason for missing signal? I am also checking if the sleep in the
> > ExecStartPre is required for this service. I am wondering if that has
> > something to do with the missing signal.
>
> Have you called Subscribe() on the manager object? Unless there's at
> least one client doing that (which hasn't disconnected yet) these
> messages are not necessarily generated.
>
> Also, which systemd version is this? There have been some bugfixes in
> this area in the past, hence make sure to run a current version of systemd.
>
> Lennart
>
> --
> Lennart Poettering, Red Hat
>


-- 
thanks
Ashitha

Re: [systemd-devel] Delegate= on slice before v237

2019-02-13 Thread Filipe Brandenburger via systemd-devel
Hey Lennart,

Thanks for the clarification.

On Tue, Feb 12, 2019 at 2:17 AM Lennart Poettering wrote:

> On Mo, 11.02.19 16:39, Filipe Brandenburger ([email protected]) wrote:
> > Before systemd v237 (when Delegate= was no longer allowed on slice
> > units)... Did setting Delegate=yes on a slice have *any* effect at all?
> >
> > Or did it just do nothing (and a slice with Delegate=no or no setting
> > behave just the same)?
> >
> > Reason I ask is: I want to scrap this code
> > <https://github.com/opencontainers/runc/blob/v1.0.0-rc6/libcontainer/cgroups/systemd/apply_systemd.go#L195>
> > in libcontainer that tries to detect whether Delegate= is accepted in a
> > slice unit. (I'll just default it to false, never try it.)
> >
> > I'd like to be able to say that Delegate=yes never really did anything at
> > all on slice units... So I'm trying to confirm that is really the case
> > before stating it.
>
> So, it wasn't supposed to do anything, and what it does differs on
> cgroupsv2 and cgroupsv1.


libcontainer is pretty much cgroupv1 only, so that's what I'm concerned
about.


> The fact it wasn't refused outright was an
> accident, and because it was one I am not entirely sure what the
> precise effect of allowing it was. However, I am pretty sure it at
> least had two effects:
>
> 1. it would turn on all controllers for the cgroup
>

I don't *think* this is why libcontainer was trying to enable it, since a
few lines down it's explicitly enabling all the controllers by
setting MemoryAccounting, CPUAccounting and BlockIOAccounting during
transient unit creation:
https://github.com/opencontainers/runc/blob/v1.0.0-rc6/libcontainer/cgroups/systemd/apply_systemd.go#L275


> 2. it would stop systemd from ever migrating foreign processes below
>    that slice, which is primarily relevant only when changing cgroup
>    related props on the slice dynamically I guess.
>

I'm not sure I follow... Do you mean if libcontainer would write
to memory.limit_in_bytes (or one of the other properties of the memory or
other controller managed by systemd, such as cpu), then systemd would not
end up overwriting this as it does some other operation on the cgroup?

I'm not completely sure I understand what "migrate foreign processes"
means, given slices don't really hold any pids directly... Do you mean to
scope and service units below that slice?

In any case, for now I'll probably leave that alone... Though as I revamp
libcontainer support for the unified hierarchy, I'll try to skip that check in
that case, which might make this a legacy-only setting, so it's not that
important to fully get rid of it for a while...

Cheers!
Filipe