Slurmd is running inside the container, and no, I have not tried running other 
eBPF programs inside the container. Is there something that you would 
recommend trying, just to see whether eBPF is functional?

One thing that I have tried is running:
bpftool prog show
This works on the bare-metal server hosting the LXD containers. Within the 
containers, I first had to install linux-tools-generic and linux-tools-common, 
and then the same command fails:
root@gpu-4:~# bpftool prog show
Error: can't get next program: Operation not permitted
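
A minimal sketch of a quick check that could be run inside the container to narrow this down. It assumes a Linux /proc filesystem and bash; the function names are made up for illustration. It reports whether the process sits in a remapped user namespace (as in an unprivileged container), the value of kernel.unprivileged_bpf_disabled, and whether bpftool prog show succeeds. Listing programs needs CAP_SYS_ADMIN as seen by the host kernel, which root in a remapped user namespace does not hold, so the error above would be expected in that case.

```shell
#!/usr/bin/env bash
# Sketch: report conditions that commonly make bpf() fail inside a container.

# Return 0 if the given uid_map line is the identity mapping of the initial
# user namespace ("0 0 4294967295"), non-zero otherwise.
is_identity_uid_map() {
    local inside outside count
    read -r inside outside count <<<"$1"
    [[ "$inside" == 0 && "$outside" == 0 && "$count" == 4294967295 ]]
}

report_bpf_environment() {
    local map
    if [[ -r /proc/self/uid_map ]]; then
        map=$(head -n 1 /proc/self/uid_map)
        if is_identity_uid_map "$map"; then
            echo "user namespace: initial (bare metal or privileged container)"
        else
            echo "user namespace: remapped (unprivileged container?)"
        fi
    fi
    # 0 means unprivileged bpf() is allowed; 1 or 2 means it is disabled.
    if [[ -r /proc/sys/kernel/unprivileged_bpf_disabled ]]; then
        echo "kernel.unprivileged_bpf_disabled: $(cat /proc/sys/kernel/unprivileged_bpf_disabled)"
    fi
    if command -v bpftool >/dev/null 2>&1; then
        if bpftool prog show >/dev/null 2>&1; then
            echo "bpftool prog show: ok"
        else
            echo "bpftool prog show: failed"
        fi
    fi
}

report_bpf_environment
```

Running the same script on the bare-metal host and in the container and comparing the output should show whether the user namespace is the difference.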

I think this points either to a limitation of unprivileged containers or, 
more likely, to a misconfiguration of the container on my end, since I am 
new to this. For example, we added the GPUs and MIG devices using lxc config 
device add, but maybe we also need to add lxc.cgroup2.devices.allow lines, 
based on 
https://linuxcontainers.org/lxc/manpages/man5/lxc.container.conf.5.html. I 
should possibly also look into privilege escalation (?) as described at 
https://documentation.ubuntu.com/lxd/latest/explanation/bpf/ but I don't quite 
understand this yet.
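
For reference, a sketch of LXD-side settings that look relevant, to be run on the host. The container name gpu-4 is an assumption taken from the hostname in the prompt above. security.syscalls.intercept.bpf and security.syscalls.intercept.bpf.devices are documented LXD instance options, but whether they are sufficient for slurmd's device constraint has not been tested here.

```shell
# Run on the LXD host, not inside the container.
CONTAINER=gpu-4   # assumption: container name matches the hostname

# Show the effective configuration, including keys inherited from profiles.
lxc config show "$CONTAINER" --expanded

# Ask LXD to handle bpf() calls that load device-cgroup programs on behalf
# of the unprivileged container.
lxc config set "$CONTAINER" security.syscalls.intercept.bpf true
lxc config set "$CONTAINER" security.syscalls.intercept.bpf.devices true

# Restart so the new settings take effect.
lxc restart "$CONTAINER"
```

This interception appears to cover only device-cgroup programs, so bpftool prog show would likely still report "Operation not permitted" inside the container even if slurmd itself starts working.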

-- 
slurm-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
