On Wed 17 Dec 2014 20:52:30 +0000, Simon Kelley wrote:
> There's a potential problem with this: dnsmasq has an option to invoke
> child processes when the DHCP lease database changes, using the
> --dhcp-script option. By making this change, those processes are going
> to be invoked with read-only /usr. That's probably fine in most cases,
> but there's no certainty that someone's script doesn't write /usr, and
> for that script, this is a non-backwards compatible change.

Is it sensible for the default dnsmasq.service and the default
dnsmasq.conf to "match"? That is, dnsmasq.service could block things
dnsmasq doesn't do by default anyway.

If the sysadmin chooses to turn on --dhcp-script in
/etc/dnsmasq.d/foo.conf, they can also turn off hardening in
/etc/systemd/system/dnsmasq.service.d/foo.conf.

I agree this is a backwards-incompatible change, but you can drop a
comment in debian/NEWS to say something like

    This version of dnsmasq enables systemd hardening by default.
    If you are upgrading and have previously enabled --dhcp-script,
    you may need to change ProtectSystem=strict to ProtectSystem=no
    in dnsmasq.service, e.g.
    "systemctl edit dnsmasq && systemctl restart dnsmasq".

Attached is the full hardening I'm running on Debian 11.
It definitely won't work for everyone, but surely SOME of these can be
turned on for 99.9% of users, i.e. be on by default?

# Security hardening.
#
# dnsmasq needs to bind to low ports (53, 67).
# It doesn't support socket activation.
# <grawity> (and, it needs a raw socket for DHCPv4 which systemd.socket can't do)
# Therefore we have two choices:
#
#  1. dnsmasq manages priv-dropping;
#     systemd allows access to seteuid, setpcap, /etc/passwd, &c
#
#  2. systemd manages priv-dropping;
#     systemd restricts it, but dnsmasq and its children permanently keep CAP_NET_*
#     (since dnsmasq lacks permission to remove them after it has initialized).
#
# For now, I'm going with #1.
[Service]
# Lease files are typically found in here.
# It is root:root rw-r--r--.
# Therefore either we start as root, or
# we have CAP_DAC_OVERRIDE ambient capability.
ReadWritePaths=-/var/lib/misc
CapabilityBoundingSet=CAP_DAC_OVERRIDE
# ProtectSystem=strict blocks /run, so
# re-allow write access to /run/dnsmasq (pidfiles, mainly).
RuntimeDirectory=%p
CapabilityBoundingSet=CAP_SETGID CAP_SETUID CAP_SETPCAP
CapabilityBoundingSet=CAP_NET_BIND_SERVICE
CapabilityBoundingSet=CAP_NET_ADMIN CAP_NET_RAW
RestrictAddressFamilies=AF_UNIX AF_NETLINK AF_INET AF_INET6
RestrictNamespaces=yes
DevicePolicy=closed
NoNewPrivileges=yes
PrivateDevices=yes
PrivateMounts=yes
PrivateTmp=yes
ProtectClock=yes
ProtectControlGroups=yes
ProtectHome=yes
ProtectKernelLogs=yes
ProtectKernelModules=yes
ProtectKernelTunables=yes
ProtectProc=noaccess
ProtectSystem=strict
ProcSubset=pid
RestrictSUIDSGID=yes
SystemCallArchitectures=native
SystemCallFilter=@system-service
#SystemCallFilter=~@privileged
#SystemCallFilter=@chown @setuid capset capset32
SystemCallFilter=~@resources
RestrictRealtime=yes
LockPersonality=yes
MemoryDenyWriteExecute=yes
UMask=0077
ProtectHostname=yes
# Upstream has this:
#     ExecReload=/bin/kill -HUP $MAINPID
# which gets this:
#     kill[572676]: kill: (535038): Operation not permitted
#
# As a quick and dirty fix, just run kill with full privs.
ExecReload=
ExecReload=+/bin/kill -HUP $MAINPID

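
For admins who do enable --dhcp-script, the opt-out suggested earlier in this message could be a drop-in along these lines. This is only a sketch: it assumes the hardening ships in the packaged dnsmasq.service, foo.conf is the placeholder file name used above, and the ReadWritePaths= path is a made-up example rather than anything dnsmasq provides.

```ini
# /etc/systemd/system/dnsmasq.service.d/foo.conf
[Service]
# Let child processes started via --dhcp-script write to /usr and friends again.
ProtectSystem=no
# Narrower alternative (hypothetical path): leave ProtectSystem=strict alone
# and open up only the directory the script really writes to.
#ReadWritePaths=/usr/local/share/dhcp-hook
```

"systemctl edit dnsmasq" creates such a drop-in and reloads the unit configuration; dnsmasq still needs a restart afterwards for the change to apply.
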
$ systemd-analyze security dnsmasq.service
  NAME  DESCRIPTION  EXPOSURE
✗ PrivateNetwork=  Service has access to the host's network  0.5
✗ User=/DynamicUser=  Service runs as root user  0.4
✗ CapabilityBoundingSet=~CAP_SET(UID|GID|PCAP)  Service may change UID/GID identities/capabilities  0.3
✓ CapabilityBoundingSet=~CAP_SYS_ADMIN  Service has no administrator privileges
✓ CapabilityBoundingSet=~CAP_SYS_PTRACE  Service has no ptrace() debugging abilities
✗ RestrictAddressFamilies=~AF_(INET|INET6)  Service may allocate Internet sockets  0.3
✓ RestrictNamespaces=~CLONE_NEWUSER  Service cannot create user namespaces
✓ RestrictAddressFamilies=~…  Service cannot allocate exotic sockets
✓ CapabilityBoundingSet=~CAP_(CHOWN|FSETID|SETFCAP)  Service cannot change file ownership/access mode/capabilities
✗ CapabilityBoundingSet=~CAP_(DAC_*|FOWNER|IPC_OWNER)  Service may override UNIX file/IPC permission checks  0.2
✗ CapabilityBoundingSet=~CAP_NET_ADMIN  Service has network configuration privileges  0.2
✓ CapabilityBoundingSet=~CAP_SYS_MODULE  Service cannot load kernel modules
✓ CapabilityBoundingSet=~CAP_SYS_RAWIO  Service has no raw I/O access
✓ CapabilityBoundingSet=~CAP_SYS_TIME  Service processes cannot change the system clock
✗ DeviceAllow=  Service has a device ACL with some special devices  0.1
✗ IPAddressDeny=  Service does not define an IP address allow list  0.2
✓ KeyringMode=  Service doesn't share key material with other services
✓ NoNewPrivileges=  Service processes cannot acquire new privileges
✓ NotifyAccess=  Service child processes cannot alter service state
✓ PrivateDevices=  Service has no access to hardware devices
✓ PrivateMounts=  Service cannot install system mounts
✓ PrivateTmp=  Service has no access to other software's temporary files
✗ PrivateUsers=  Service has access to other users  0.2
✓ ProtectClock=  Service cannot write to the hardware clock or system clock
✓ ProtectControlGroups=  Service cannot modify the control group file system
✓ ProtectHome=  Service has no access to home directories
✓ ProtectKernelLogs=  Service cannot read from or write to the kernel log ring buffer
✓ ProtectKernelModules=  Service cannot load or read kernel modules
✓ ProtectKernelTunables=  Service cannot alter kernel tunables (/proc/sys, …)
✗ ProtectProc=  0.1
✓ ProtectSystem=  Service has strict read-only access to the OS file hierarchy
✓ RestrictAddressFamilies=~AF_PACKET  Service cannot allocate packet sockets
✓ RestrictSUIDSGID=  SUID/SGID file creation by service is restricted
✓ SystemCallArchitectures=  Service may execute system calls only with native ABI
✓ SystemCallFilter=~@clock  System call allow list defined for service, and @clock is not included
✓ SystemCallFilter=~@debug  System call allow list defined for service, and @debug is not included
✓ SystemCallFilter=~@module  System call allow list defined for service, and @module is not included
✓ SystemCallFilter=~@mount  System call allow list defined for service, and @mount is not included
✓ SystemCallFilter=~@raw-io  System call allow list defined for service, and @raw-io is not included
✓ SystemCallFilter=~@reboot  System call allow list defined for service, and @reboot is not included
✓ SystemCallFilter=~@swap  System call allow list defined for service, and @swap is not included
✗ SystemCallFilter=~@privileged  System call allow list defined for service, and @privileged is included (e.g. chown is allowed)  0.2
✓ SystemCallFilter=~@resources  System call allow list defined for service, and @resources is not included
✓ AmbientCapabilities=  Service process does not receive ambient capabilities
✓ CapabilityBoundingSet=~CAP_AUDIT_*  Service has no audit subsystem access
✓ CapabilityBoundingSet=~CAP_KILL  Service cannot send UNIX signals to arbitrary processes
✓ CapabilityBoundingSet=~CAP_MKNOD  Service cannot create device nodes
✗ CapabilityBoundingSet=~CAP_NET_(BIND_SERVICE|BROADCAST|RAW)  Service has elevated networking privileges  0.1
✓ CapabilityBoundingSet=~CAP_SYSLOG  Service has no access to kernel logging
✓ CapabilityBoundingSet=~CAP_SYS_(NICE|RESOURCE)  Service has no privileges to change resource use parameters
✓ RestrictNamespaces=~CLONE_NEWCGROUP  Service cannot create cgroup namespaces
✓ RestrictNamespaces=~CLONE_NEWIPC  Service cannot create IPC namespaces
✓ RestrictNamespaces=~CLONE_NEWNET  Service cannot create network namespaces
✓ RestrictNamespaces=~CLONE_NEWNS  Service cannot create file system namespaces
✓ RestrictNamespaces=~CLONE_NEWPID  Service cannot create process namespaces
✓ RestrictRealtime=  Service realtime scheduling access is restricted
✓ SystemCallFilter=~@cpu-emulation  System call allow list defined for service, and @cpu-emulation is not included
✓ SystemCallFilter=~@obsolete  System call allow list defined for service, and @obsolete is not included
✗ RestrictAddressFamilies=~AF_NETLINK  Service may allocate netlink sockets  0.1
✗ RootDirectory=/RootImage=  Service runs within the host's root directory  0.1
  SupplementaryGroups=  Service runs as root, option does not matter
✓ CapabilityBoundingSet=~CAP_MAC_*  Service cannot adjust SMACK MAC
✓ CapabilityBoundingSet=~CAP_SYS_BOOT  Service cannot issue reboot()
✓ Delegate=  Service does not maintain its own delegated control group subtree
✓ LockPersonality=  Service cannot change ABI personality
✓ MemoryDenyWriteExecute=  Service cannot create writable executable memory mappings
  RemoveIPC=  Service runs as root, option does not apply
✓ RestrictNamespaces=~CLONE_NEWUTS  Service cannot create hostname namespaces
✓ UMask=  Files created by service are accessible only by service's own user by default
✓ CapabilityBoundingSet=~CAP_LINUX_IMMUTABLE  Service cannot mark files immutable
✓ CapabilityBoundingSet=~CAP_IPC_LOCK  Service cannot lock memory into RAM
✓ CapabilityBoundingSet=~CAP_SYS_CHROOT  Service cannot issue chroot()
✓ ProtectHostname=  Service cannot change system host/domainname
✓ CapabilityBoundingSet=~CAP_BLOCK_SUSPEND  Service cannot establish wake locks
✓ CapabilityBoundingSet=~CAP_LEASE  Service cannot create file leases
✓ CapabilityBoundingSet=~CAP_SYS_PACCT  Service cannot use acct()
✓ CapabilityBoundingSet=~CAP_SYS_TTY_CONFIG  Service cannot issue vhangup()
✓ CapabilityBoundingSet=~CAP_WAKE_ALARM  Service cannot program timers that wake up the system
✗ RestrictAddressFamilies=~AF_UNIX  Service may allocate local sockets  0.1
✓ ProcSubset=  Service has no access to non-process /proc files (/proc subset=)

→ Overall exposure level for dnsmasq.service: 2.5 OK 🙂
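
For comparison, choice #2 from the comments in the unit above (systemd drops privileges instead of dnsmasq) might look roughly like the fragment below. I have not tested it. It assumes a "dnsmasq" system user exists, that dnsmasq is pointed at a lease file under the StateDirectory= path, and it would replace the capability lines above rather than extend them; how dnsmasq's own user-switching behaves when it is not started as root is also left unverified here.

```ini
[Service]
# ASSUMPTION: a "dnsmasq" system user exists on this machine.
User=dnsmasq
# Hand the network capabilities straight to the unprivileged process...
AmbientCapabilities=CAP_NET_BIND_SERVICE CAP_NET_ADMIN CAP_NET_RAW
# ...and allow nothing else. The empty assignment resets any earlier
# CapabilityBoundingSet= lines (no CAP_SET*ID, no CAP_DAC_OVERRIDE).
CapabilityBoundingSet=
CapabilityBoundingSet=CAP_NET_BIND_SERVICE CAP_NET_ADMIN CAP_NET_RAW
# /var/lib/misc is root-owned, so give the service its own writable
# /var/lib/dnsmasq instead.
# ASSUMPTION: dnsmasq is started with
#   --dhcp-leasefile=/var/lib/dnsmasq/dnsmasq.leases
StateDirectory=dnsmasq
```

The trade-off is the one already noted in the comments: dnsmasq and any child processes keep CAP_NET_* for their whole lifetime, in exchange for losing the root, CAP_SET* and CAP_DAC_OVERRIDE findings in the report.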