Hello!

I'm trying to test an upgrade of our production Slurm db on a test cluster. 
Specifically, I'm trying to verify an upgrade from 20.11.7 to 21.08.4. I have a 
dump of the production db, imported it as normal, and then started slurmdbd to 
perform the conversion. I've verified everything I can think of, but I suspect 
I'm missing a timeout-related MariaDB tweak or something similar to 
prevent the db from "going away" during the conversion. See the slurmdbd log 
below. I've tried the upgrade both ways: via the systemd start script, 
and by starting slurmdbd by hand. Has anyone run into this before? 

Here are my innodb.conf settings:

[mysqld]
innodb_buffer_pool_size=10000M
innodb_log_file_size=64M
innodb_lock_wait_timeout=10000
max_allowed_packet=16M
net_read_timeout=10000
connect_timeout=10000
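
As a sanity check that the server picked these values up, the size settings can be converted to bytes and compared with what slurmdbd prints at debug2 level. This is only a rough sketch: the helper names (to_bytes, parse_settings) are made up for illustration, and it assumes the K/M/G suffixes are multiples of 1024.

```python
# Sketch: convert my.cnf-style size values to bytes so they can be compared
# with the numbers slurmdbd reports in _check_database_variables.
# Assumption: K/M/G suffixes are powers of 1024.

SUFFIXES = {"K": 1024, "M": 1024**2, "G": 1024**3}


def to_bytes(value: str) -> int:
    """Return the byte count for a value such as '64M' or '16384'."""
    value = value.strip().upper()
    if value and value[-1] in SUFFIXES:
        return int(value[:-1]) * SUFFIXES[value[-1]]
    return int(value)


def parse_settings(text: str) -> dict:
    """Parse 'key=value' lines, skipping section headers, comments and blanks."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("[", "#", ";")) or "=" not in line:
            continue
        key, _, val = line.partition("=")
        settings[key.strip()] = val.strip()
    return settings


# The settings exactly as listed above.
CONFIG = """\
[mysqld]
innodb_buffer_pool_size=10000M
innodb_log_file_size=64M
innodb_lock_wait_timeout=10000
max_allowed_packet=16M
net_read_timeout=10000
connect_timeout=10000
"""

settings = parse_settings(CONFIG)
for name in ("innodb_buffer_pool_size", "innodb_log_file_size", "max_allowed_packet"):
    print(f"{name}: {to_bytes(settings[name])}")
```

innodb_log_file_size comes out to 67108864, which matches the slurmdbd log. The buffer pool comes out to 10485760000, while slurmdbd reports 10737418240 (exactly 10 GiB); I assume the server rounds the pool size up to a multiple of its chunk size, but I have not confirmed that.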

===
[root@storm mariadb]# time slurmdbd -D -vvv
slurmdbd: WARNING: MessageTimeout is too high for effective fault-tolerance
slurmdbd: debug:  Log file re-opened
slurmdbd: pidfile not locked, assuming no running daemon
slurmdbd: debug:  auth/munge: init: Munge authentication plugin loaded
slurmdbd: debug2: accounting_storage/as_mysql: init: mysql_connect() called for 
db slurm_acct_db
slurmdbd: debug2: Attempting to connect to localhost:3306
slurmdbd: accounting_storage/as_mysql: _check_mysql_concat_is_sane: MySQL 
server version is: 5.5.5-10.3.28-MariaDB
slurmdbd: debug2: accounting_storage/as_mysql: _check_database_variables: 
innodb_buffer_pool_size: 10737418240
slurmdbd: debug2: accounting_storage/as_mysql: _check_database_variables: 
innodb_log_file_size: 67108864
slurmdbd: debug2: accounting_storage/as_mysql: _check_database_variables: 
innodb_lock_wait_timeout: 10000
slurmdbd: accounting_storage/as_mysql: as_mysql_convert_tables_pre_create: 
pre-converting usage table for monsoon
slurmdbd: error: mysql_query failed: 1054 Unknown column 'resv_secs' in 
'monsoon_usage_day_table'
alter table "monsoon_usage_day_table" change resv_secs plan_secs bigint 
unsigned default 0 not null;
slurmdbd: accounting_storage/as_mysql: as_mysql_convert_alter_query: The 
database appears to have been altered by a previous upgrade attempt, continuing 
with upgrade.
slurmdbd: error: mysql_query failed: 1054 Unknown column 'resv_secs' in 
'monsoon_usage_hour_table'
alter table "monsoon_usage_hour_table" change resv_secs plan_secs bigint 
unsigned default 0 not null;
slurmdbd: accounting_storage/as_mysql: as_mysql_convert_alter_query: The 
database appears to have been altered by a previous upgrade attempt, continuing 
with upgrade.
slurmdbd: error: mysql_query failed: 1054 Unknown column 'resv_secs' in 
'monsoon_usage_month_table'
alter table "monsoon_usage_month_table" change resv_secs plan_secs bigint 
unsigned default 0 not null;
slurmdbd: accounting_storage/as_mysql: as_mysql_convert_alter_query: The 
database appears to have been altered by a previous upgrade attempt, continuing 
with upgrade.
slurmdbd: accounting_storage/as_mysql: as_mysql_convert_tables_pre_create: 
pre-converting job table for monsoon
slurmdbd: adding column container after consumed_energy in table 
"monsoon_step_table"
slurmdbd: adding column submit_line after req_cpufreq_gov in table 
"monsoon_step_table"
slurmdbd: debug:  Table "monsoon_step_table" has changed.  Updating...
slurmdbd: error: mysql_query failed: 2013 Lost connection to MySQL server 
during query
alter table "monsoon_step_table" modify `job_db_inx` bigint unsigned not null, 
modify `deleted` tinyint default 0 not null, modify `exit_code` int default 0 
not null, modify `id_step` int not null, modify `step_het_comp` int unsigned 
default 0xfffffffe not null, modify `kill_requid` int default -1 not null, 
modify `nodelist` text not null, modify `nodes_alloc` int unsigned not null, 
modify `node_inx` text, modify `state` smallint unsigned not null, modify 
`step_name` text not null, modify `task_cnt` int unsigned not null, modify 
`task_dist` int default 0 not null, modify `time_start` bigint unsigned default 
0 not null, modify `time_end` bigint unsigned default 0 not null, modify 
`time_suspended` bigint unsigned default 0 not null, modify `user_sec` bigint 
unsigned default 0 not null, modify `user_usec` int unsigned default 0 not 
null, modify `sys_sec` bigint unsigned default 0 not null, modify `sys_usec` 
int unsigned default 0 not null, modify `act_cpufreq` double unsigned default 
0.0 not null, modify `consumed_energy` bigint unsigned default 0 not null, add 
`container` text after consumed_energy, modify `req_cpufreq_min` int unsigned 
default 0 not null, modify `req_cpufreq` int unsigned default 0 not null, 
modify `req_cpufreq_gov` int unsigned default 0 not null, add `submit_line` 
text after req_cpufreq_gov, modify `tres_alloc` text not null default '', 
modify `tres_usage_in_ave` text not null default '', modify `tres_usage_in_max` 
text not null default '', modify `tres_usage_in_max_taskid` text not null 
default '', modify `tres_usage_in_max_nodeid` text not null default '', modify 
`tres_usage_in_min` text not null default '', modify `tres_usage_in_min_taskid` 
text not null default '', modify `tres_usage_in_min_nodeid` text not null 
default '', modify `tres_usage_in_tot` text not null default '', modify 
`tres_usage_out_ave` text not null default '', modify `tres_usage_out_max` text 
not null default '', modify `tres_usage_out_max_taskid` text not null default 
'', modify `tres_usage_out_max_nodeid` text not null default '', modify 
`tres_usage_out_min` text not null default '', modify 
`tres_usage_out_min_taskid` text not null default '', modify 
`tres_usage_out_min_nodeid` text not null default '', modify 
`tres_usage_out_tot` text not null default '', drop primary key, add primary 
key (job_db_inx, id_step, step_het_comp), drop key no_step_comp, add key 
no_step_comp (job_db_inx, id_step);
slurmdbd: accounting_storage/as_mysql: init: Accounting storage MYSQL plugin 
failed
slurmdbd: error: mysql_commit failed: 2006 MySQL server has gone away
slurmdbd: error: rollback failed
slurmdbd: error: Couldn't load specified plugin name for 
accounting_storage/mysql: Plugin init() callback failed
slurmdbd: error: cannot create accounting_storage context for 
accounting_storage/mysql
slurmdbd: fatal: Unable to initialize accounting_storage/mysql accounting 
storage plugin

real    11m52.197s
user    0m0.009s
sys     0m0.001s
===

Thank you for any ideas!

Best,
Chris
 
-- 
Christopher Coffey
High-Performance Computing
Northern Arizona University
928-523-1167
 
 
