I can't find anything wrong with /run/named. On the system now:

buzz:~# ls -la /run/named
total 8
drwxrwxr-x  2 root bind   80 Apr 26 17:46 .
drwxr-xr-x 58 root root 1940 Apr 26 16:23 ..
-rw-r--r--  1 bind bind    7 Apr 26 17:46 named.pid
-rw-------  1 bind bind  102 Apr 26 17:46 session.key
buzz:~# dpkg -l bind9
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name           Version              Architecture Description
+++-==============-====================-============-==========================>
ii  bind9          1:9.11.5.P4+dfsg-5.1 amd64        Internet Domain Name Server
buzz:~#

To recap the situation:

1. System running "n-1" testing version of bind9, everything operates normally
2. System updates to bind9 1:9.16.2-3, bind9 refuses to start
3. I look at directories, which exist, and compare settings/permissions with my 
secondary server (rpi3 running Debian 10.3), and can't find anything that looks 
different. Even removing "-u bind" from startup doesn't change anything! 
(although I confirmed at that time that "touch /run/named/xxxx" as root worked.)
4. Do (paraphrased for brevity) "install bind9/stable" to downgrade to the 
known-good package version (rough command sketched just after this list)
5. bind9 starts and runs normally
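
For reference, the step 4 downgrade was along these lines (exact package set 
from memory, so treat it as approximate; apt pulled the matching stable 
library packages in as dependencies):

apt install bind9/stable bind9utils/stable dnsutils/stable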

To clarify, step 2 was an "apply all outstanding updates" activity, so it is 
possible that another update applied in that bundle is responsible for the 
breakage. However, step 4 modified only bind9 and its supporting libraries, and 
it was the ONLY thing I did to move from broken to stable operation. Therefore 
I conclude that, even if it is not itself the root cause, the latest testing 
version is clearly less tolerant of something in my configuration than the 
stable version is.
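
I have not dug through it yet, but the full contents of that update bundle 
should still be recoverable from the apt/dpkg logs, so next session I will 
pull the list with something like this (allowing for the rotated .1/.gz files):

grep -B2 -A4 bind9 /var/log/apt/history.log
grep ' upgrade ' /var/log/dpkg.log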

I have modified my zone files a lot (well, this IS a name server... ;) ) but I 
do not believe I have meddled with any security-related settings. I just went 
and looked specifically for signs of life from AppArmor, and all I could find 
was this in /var/log/daemon.log:

buzz:/var/log# grep apparmor daemon.log.1
Apr 25 19:53:19 buzz apparmor.systemd[1211]: Restarting AppArmor
Apr 25 19:53:19 buzz apparmor.systemd[1211]: Reloading AppArmor profiles
Apr 25 19:53:19 buzz apparmor.systemd[1248]: Warning: found usr.sbin.chronyd in /etc/apparmor.d/force-complain, forcing complain mode
Apr 25 19:53:19 buzz apparmor.systemd[1248]: Warning from /etc/apparmor.d (/etc/apparmor.d/usr.sbin.chronyd line 60): Warning failed to create cache: usr.sbin.chronyd
buzz:/var/log#

And this, from a system boot while troubleshooting the issue (at the same time 
as the above):

buzz:/var/log# grep -Fi apparmor messages*
messages.1:Apr 25 19:53:19 buzz kernel: [    0.184487] AppArmor: AppArmor initialized
messages.1:Apr 25 19:53:19 buzz kernel: [    0.510404] AppArmor: AppArmor Filesystem Enabled
messages.1:Apr 25 19:53:19 buzz kernel: [    1.407974] AppArmor: AppArmor sha1 policy hashing enabled
buzz:/var/log#
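
One thing I did not think to capture while it was broken is whether an AppArmor 
profile for named was actually loaded, and in which mode (enforce/complain). 
Next time I will grab that as well, roughly:

aa-status | grep -i named
grep -i named /sys/kernel/security/apparmor/profiles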

Thanks to noisy kvm messages, I cannot positively confirm that there is nothing 
of interest in dmesg... but I would have expected any such messages to be 
recorded somewhere in /var/log if they were being generated, and there is 
nothing beyond what I pasted above.
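
Rather than reading dmesg raw past all the kvm chatter, next time I will filter 
it, e.g.:

dmesg | grep -iE 'apparmor|audit|denied|named'
journalctl -k | grep -iE 'apparmor|audit|denied'

If AppArmor (or some other LSM) were the thing refusing access, I would expect 
audit/DENIED lines to show up there.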

I am totally happy to try upgrading and playtesting again, but I would argue 
that "the error messages provide enough information for you to fix the problem" 
is inaccurate when the message is "permission denied", the user of interest 
clearly has the required permissions on the target directory (which exists), 
and EVEN THE ROOT ACCOUNT gets "permission denied" -- and yet no other error 
messages are generated anywhere on the system.
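
For what it's worth, here is the checklist I intend to run against /run/named 
the next time it is in the failed state, to rule out ACLs, odd mounts, or 
service sandboxing (happy to hear which of these are pointless):

namei -l /run/named
getfacl /run/named
findmnt /run
systemctl cat bind9

getfacl assumes the acl package is installed, and the unit may be named.service 
rather than bind9.service with the newer package -- I have not checked.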

Does anybody have any additional suggestions on what to look at the next time I 
start a troubleshooting session?

Thanks,

Scott
