Previous versions of mysql are supposed to have nasty security issues.
I'm not sure why I had mysql instead of mariadb anyway.


On Mon, May 11, 2020 at 9:29 AM Relu Patrascu <r...@vectorinstitute.ai> wrote:
>
> We've experienced the same problem on several versions of slurmdbd
> (18, 19) so we downgraded mysql and put a hold on the package.
>
> Hey Dustin, funny we meet here :)
> Relu
>
> On Tue, May 5, 2020 at 3:43 PM Dustin Lang <dstnd...@gmail.com> wrote:
> >
> > I tried upgrading Slurm to 18.08.9 and I am still getting this segmentation fault!
> >
> >
> >
> > On Tue, May 5, 2020 at 2:39 PM Dustin Lang <dstnd...@gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> Apparently my colleague upgraded the mysql client and server, but, as far 
> >> as I can tell, this was only 5.7.29 to 5.7.30, and checking the mysql 
> >> release notes I don't see anything that looks suspicious there...
> >>
> >> cheers,
> >> --dustin
> >>
> >>
> >> On Tue, May 5, 2020 at 1:37 PM Dustin Lang <dstnd...@gmail.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> We're running Slurm 17.11.12.  Everything has been working fine, and then 
> >>> suddenly slurmctld is crashing and slurmdbd is crashing.
> >>>
> >>> We use fair-share as part of the queuing policy, and previously set up 
> >>> accounts with sacctmgr; that has been working fine for months.
> >>>
> >>> If I run slurmdbd in debug mode,
> >>>
> >>>  slurmdbd -D -v -v -v -v -v
> >>>
> >>> it eventually (after being contacted by slurmctld) segfaults with:
> >>>
> >>> ...
> >>> slurmdbd: debug2: DBD_NODE_STATE: NODE:cn049 STATE:UP REASON:(null) TIME:1588695584
> >>> slurmdbd: debug4: got 0 commits
> >>> slurmdbd: debug2: DBD_NODE_STATE: NODE:cn050 STATE:UP REASON:(null) TIME:1588695584
> >>> slurmdbd: debug4: got 0 commits
> >>> slurmdbd: debug4: got 0 commits
> >>> slurmdbd: debug2: DBD_GET_TRES: called
> >>> slurmdbd: debug4: got 0 commits
> >>> slurmdbd: debug2: DBD_GET_QOS: called
> >>> slurmdbd: debug4: got 0 commits
> >>> slurmdbd: debug2: DBD_GET_USERS: called
> >>> slurmdbd: debug4: got 0 commits
> >>> slurmdbd: debug2: DBD_GET_ASSOCS: called
> >>> slurmdbd: debug4: 10(as_mysql_assoc.c:2033) query
> >>> call get_parent_limits('assoc_table', 'root', 'slurm_cluster', 0); select @par_id, @mj, @msj, @mwpj, @mtpj, @mtpn, @mtmpj, @mtrm, @def_qos_id, @qos, @delta_qos;
> >>> Segmentation fault (core dumped)
> >>>
> >>>
> >>> It looks (running slurmdbd in gdb) like that segfault is coming from
> >>>
> >>> https://github.com/SchedMD/slurm/blob/slurm-17-11-12-1/src/plugins/accounting_storage/mysql/as_mysql_assoc.c#L2073
> >>>
> >>> and if I connect to the mysql database directly and call that stored 
> >>> procedure, I get
> >>>
> >>> mysql> call get_parent_limits('assoc_table', 'root', 'slurm_cluster', 0);
> >>> +---------------------+-----------------+-------------------------+----------------------+---------------------------+-------------+-----------------------------------------------------------------+-------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------+-----------------------------+
> >>> | @par_id := id_assoc | @mj := max_jobs | @msj := max_submit_jobs | @mwpj := max_wall_pj | @def_qos_id := def_qos_id | @qos := qos | @delta_qos := REPLACE(CONCAT(delta_qos, @delta_qos), ',,', ',') | @mtpj := CONCAT(@mtpj, if (@mtpj != '' && max_tres_pj != '', ',', ''), max_tres_pj) | @mtpn := CONCAT(@mtpn, if (@mtpn != '' && max_tres_pn != '', ',', ''), max_tres_pn) | @mtmpj := CONCAT(@mtmpj, if (@mtmpj != '' && max_tres_mins_pj != '', ',', ''), max_tres_mins_pj) | @mtrm := CONCAT(@mtrm, if (@mtrm != '' && max_tres_run_mins != '', ',', ''), max_tres_run_mins) | @my_acct_new := parent_acct |
> >>> +---------------------+-----------------+-------------------------+----------------------+---------------------------+-------------+-----------------------------------------------------------------+-------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------+-----------------------------+
> >>> |                   1 |            NULL |                    NULL |                 NULL |                      NULL | ,1,         | NULL                                                            | NULL                                                                                | NULL                                                                                | NULL                                                                                             | NULL                                                                                            |                             |
> >>> +---------------------+-----------------+-------------------------+----------------------+---------------------------+-------------+-----------------------------------------------------------------+-------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------+-----------------------------+
> >>>
> >>> and if I run
> >>>
> >>> mysql> call get_parent_limits('assoc_table', 'root', 'slurm_cluster', 0); select @par_id, @mj, @msj, @mwpj, @mtpj, @mtpn, @mtmpj, @mtrm, @def_qos_id, @qos, @delta_qos;
> >>>
> >>> I get
> >>>
> >>> +---------+------+------+-------+-------+-------+--------+-------+-------------+------+------------+
> >>> | @par_id | @mj  | @msj | @mwpj | @mtpj | @mtpn | @mtmpj | @mtrm | @def_qos_id | @qos | @delta_qos |
> >>> +---------+------+------+-------+-------+-------+--------+-------+-------------+------+------------+
> >>> |       1 | NULL | NULL |  NULL | NULL  | NULL  | NULL   | NULL  |        NULL | ,1,  | NULL       |
> >>> +---------+------+------+-------+-------+-------+--------+-------+-------------+------+------------+
> >>>
> >>> but I don't know what to do about this.
> >>>
> >>> We use another product ("Bright Cluster Manager") to manage some aspects 
> >>> of the cluster and Slurm installation, so we are hesitant to just upgrade 
> >>> Slurm.
> >>>
> >>> I would appreciate any tips.
> >>>
> >>> Thanks,
> >>> --dustin
> >>>
> >>>
>
>
> --
>
> +1-647-680-7564
>
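
For anyone who wants to capture the backtrace without an interactive
session, here is a minimal sketch of the "running slurmdbd in gdb" step
from the original message. It assumes gdb and Python 3.8+ are available on
the slurmdbd host. The helper names gdb_backtrace_cmd and run_under_gdb are
made up for illustration; the slurmdbd flags are the ones quoted above.

```python
import shlex
import subprocess


def gdb_backtrace_cmd(program, program_args):
    """Build a gdb command line that runs `program` in batch mode.

    gdb executes each -ex command in order: `run` starts the program and
    returns control when it stops (for example on SIGSEGV), then `bt full`
    prints the stack with local variables, and batch mode exits gdb.
    """
    return [
        "gdb", "-batch",
        "-ex", "run",
        "-ex", "bt full",
        "--args", program, *program_args,
    ]


def run_under_gdb(cmd):
    """Run the prepared gdb command and return its combined output as text."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        check=False,
    )
    return result.stdout


# Same invocation as in the thread: foreground mode with five -v flags.
cmd = gdb_backtrace_cmd("slurmdbd", ["-D", "-v", "-v", "-v", "-v", "-v"])

# Print the command so it can be reviewed or pasted into a shell.
print(shlex.join(cmd))
```

Calling run_under_gdb(cmd) on the database host (not done here) should
return the slurmdbd debug log followed by the backtrace, which is easier to
attach to a bug report than a screenshot of an interactive session.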
