Turns out the patch I submitted didn't fix the issue. Sorry about that. Please ignore the patch in my previous email.
On Wed, Jul 15, 2020 at 11:08 PM Francesc Zacarias <fzacaria...@gmail.com> wrote: > Hi! > > I'm experiencing the same issue as described in this bug with debian sid > and mysql-server-5.7 (5.7.26-1+b1). I noticed that one out of three times > my system boots, logrotate service fails with the following error: > > ``` > Jul 09 07:12:01 baal systemd[1]: Starting Rotate log files... > Jul 09 07:12:01 baal logrotate[851]: error: error running shared > postrotate script for '/var/log/mysql.log /var/log/mysql/*log> > Jul 09 07:12:01 baal systemd[1]: logrotate.service: Main process exited, > code=exited, status=1/FAILURE > Jul 09 07:12:01 baal systemd[1]: logrotate.service: Failed with result > 'exit-code'. > Jul 09 07:12:01 baal systemd[1]: Failed to start Rotate log files. > ``` > > Looking at journalctl, I can see that every time this issue occurs, mysqld > is started and logrotate fails *at the same time*: > > ``` > Jul 09 07:12:01 baal mysqld[847]: 2020-07-09T06:12:01.283924Z 0 [Note] > /usr/sbin/mysqld: ready for connections. > Jul 09 07:12:01 baal mysqld[847]: Version: '5.7.26-1+b1' socket: > '/var/run/mysqld/mysqld.sock' port: 3306 (Debian) > ... > Jul 09 07:12:01 baal systemd[1]: logrotate.service: Main process exited, > code=exited, status=1/FAILURE > Jul 09 07:12:01 baal systemd[1]: logrotate.service: Failed with result > 'exit-code'. > ``` > > This and the fact that the issue does not reproduce every time makes me > suspect that is a race-condition between logrotate and mysqld. Looking at > the code in /etc/logrotate.d/mysql-server it seems like logrotate first > checks if mysql is running before issuing the "mysqladmin flush-logs" > command. That check is done by issuing "mysqladmin ping" (this also checks > that /etc/mysql/debian.cnf is valid). If this check fails, then it will try > to verify if the mysql process exists. If it does, then it will fail with > an exit code = 1, which causes the error messages in my logs above. > I'm sure you already noticed the issue here: what if mysqld has not > started yet by the time "mysqladmin ping" runs, but it's up right by the > time killall is executed? Apparently, this happens quite often in my system. > > To fix this, I suggest reverting the checks: first validate that mysqld is > running. If not, then there is no reason to issue "mysqladmin flush-logs" > and the script can finish successfully. If mysqld is running *and* > "mysqladmin ping" fails, then it should definitely throw an error. > > This approach is not completely free of race-conditions either, but it > should be more rare. I'm attaching a patch with the solution I just > described. I've had this patch applied to my system for a week and the > issue has not been reproduced since. The *best* solution would be to simply > attempt to apply "flush-logs" and react differently depending on the exit > code. Unfortunately, it seems like mysqladmin does not return different > exit codes on different errors (in my tests it always returns 1), so this > approach is not possible at the moment. > > Please, apply this patch and fix this issue! > > Kind regards, > Francesc > > >