Liron Aravot has posted comments on this change. Change subject: core:WIP: introducing OvfAutoUpdate ......................................................................
Patch Set 3: (17 inline comments) .................................................... File backend/manager/dbscripts/upgrade/03_01_1490_add_vm_generation_columns.sql Line 1: select fn_db_add_column('vm_static', 'ovf_generation', 'BIGINT'); Line 2: select fn_db_add_column('vm_static', 'db_generation', 'BIGINT default 1'); I've added a different table, in postgres writers do not interrupt readers so no problem with that in terms of concurrency. using vm_static might cause us to have a bigger used space for db snapshots, i don't see this as an issue that should cause us to use different table, as agreed with amureini right now i've separated the tables. Line 3: UPDATE vm_static set ovf_generation = 1; Line 1: select fn_db_add_column('vm_static', 'ovf_generation', 'BIGINT'); Line 2: select fn_db_add_column('vm_static', 'db_generation', 'BIGINT default 1'); Line 3: UPDATE vm_static set ovf_generation = 1; Line 4: ALTER TABLE vm_static ALTER COLUMN ovf_generation set default 0; changed .................................................... File backend/manager/dbscripts/upgrade/pre_upgrade/0000_config.sql Line 548: select fn_db_add_config_value('VmPoolMonitorMaxAttempts','3','general'); Line 549: select fn_db_add_config_value('VmPriorityMaxValue','100','general'); Line 550: --Handling Keyboard Layout configuration for VNC Line 551: select fn_db_add_config_value('VncKeyboardLayout','en-us','general'); Line 552: select fn_db_add_config_value('WaitForVdsInitInSec','60','general'); Done Line 553: select fn_db_add_config_value('OvfUpdateIntervalInMinutes','1','general'); Line 554: --The default network connectivity check timeout Line 555: select fn_db_add_config_value('NetworkConnectivityCheckTimeoutInSeconds','120','general'); Line 556: -- AutoRecoveryConfiguration Line 549: select fn_db_add_config_value('VmPriorityMaxValue','100','general'); Line 550: --Handling Keyboard Layout configuration for VNC Line 551: select fn_db_add_config_value('VncKeyboardLayout','en-us','general'); Line 552: select fn_db_add_config_value('WaitForVdsInitInSec','60','general'); Line 553: select fn_db_add_config_value('OvfUpdateIntervalInMinutes','1','general'); Done Line 554: --The default network connectivity check timeout Line 555: select fn_db_add_config_value('NetworkConnectivityCheckTimeoutInSeconds','120','general'); Line 556: -- AutoRecoveryConfiguration Line 557: select fn_db_add_config_value('AutoRecoveryAllowedTypes','{\"storage domains\":\"false\",\"hosts\":\"true\"}','general'); .................................................... File backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/InitBackendServicesOnStartupBean.java Line 37: log.infoFormat("InitResourceManager: {0}", new Date()); Line 38: ResourceManager.getInstance().init(); Line 39: AsyncTaskManager.getInstance().InitAsyncTaskManager(); Line 40: log.infoFormat("AsyncTaskManager: {0}", new Date()); Line 41: OvfDataUpdater.getInstance().InitOvfDataUpdater(); Done Line 42: log.infoFormat("OvfDataUpdater: {0}", new Date()); Line 43: Line 44: if (Config.<Boolean> GetValue(ConfigValues.EnableVdsLoadBalancing)) { Line 45: VdsLoadBalancer.EnableLoadBalancer(); .................................................... File backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/OvfDataUpdater.java Line 28: private static final Log log = LogFactory.getLog(OvfDataUpdater.class); Line 29: private static final OvfDataUpdater INSTANCE = new OvfDataUpdater(); Line 30: Line 31: private OvfDataUpdater(){ Line 32: SchedulerUtil scheduler = SchedulerUtilQuartzImpl.getInstance(); Done Line 33: scheduler.scheduleAFixedDelayJob(this, "ovfUpdate_timer", new Class[] {}, Line 34: new Object[] {}, Config.<Integer> GetValue(ConfigValues.OvfUpdateIntervalInMinutes), Line 35: Config.<Integer> GetValue(ConfigValues.OvfUpdateIntervalInMinutes), TimeUnit.SECONDS); Line 36: } Line 45: Line 46: @OnTimerMethodAnnotation("ovfUpdate_timer") Line 47: public void ovfUpdate_timer() { Line 48: log.info("OvfDataUpdater: Attempting to update VMs/Templates Ovf."); Line 49: List<storage_pool> storagePools = DbFacade.getInstance().getStoragePoolDao().getAll(); Done Line 50: for (storage_pool pool : storagePools) { Line 51: try { Line 52: if (StoragePoolStatus.Up == pool.getstatus()) { Line 53: log.infoFormat("OvfDataUpdater: Attempting to update VMs/Templates Ovf in Data Center {0}", Line 56: getVmsForUpdate(DbFacade.getInstance() Line 57: .getVmDao() Line 58: .getAllVmsForOvfUpdateForStoragePool(pool.getId())); Line 59: VmCommand.updateVmInSpm(pool.getId(), vmsForUpdate); Line 60: if (!vmsForUpdate.isEmpty()) { Running it within one stored procedure (sorry, didn't understand your meaning before) , i've done it with atomic updates to prevent db deadlocks - but we need generally to be very careful with those batch updates..even if they are simple update statement they can be hazardous. performance and halt : when we will run it for 100 vms\templates for example have success in acquiring 99 and wait for the 100 vm, it means that we have database ROW EXCLUSIVE locks in our transaction on that rows till we acquire that last lock, meaning that no one else can update them - which will halt all update operation related to vms in the system and our transaction is on wait till all acquired. each wrong implementation of update within transaction can cause that halt (and we still have many commands that run vds command within transaction) deadlocks:can cause to a db deadlock potentially if there will be a try to increment generation version of two vms within the same transaction somewhere in the system (for example - in ReconstructCommand after suggested comment) - we have to take care to perform the operation in a sorted way. we have few possible solutions for generally dealing with possible db deadlocks -perform 'atomic' updates, so no deadlock is possible as only one ROW EXCLUSIVE lock is acquired per db transaction. -separate into two tables, it will solve the problem for that issue we have - but we might encounter it when other commands are added. - sort before each update so we'll take to locks on the same order, still we take locks on many rows and block other system operations while having dependencies, we should add some util class for doing that. - perform operations with retries and count on DB to detect the deadlock and throw exception. we need to be very careful about performing such batch updates as it can cause some serious errors now and in the future. Regardless of this issue I've decided to add a new table for ovf_generations that only the ovf updater will write to so no problem with batch update, so it'll only read from other tables and we won't have db deadlocks while be able to perform batch update, but we need to decide on methodology that we want to go with generally regards that issue. Line 61: for (VM vm : vmsForUpdate) { Line 62: DbFacade.getInstance() Line 63: .getVmDao() Line 64: .updateVmOvfGeneration(vm.getId(), vm.getStaticData().getDb_generation()); Line 65: } Line 66: } Line 67: List<VmTemplate> templatesForUpdate = getTemplatesForUpdate(DbFacade.getInstance() Line 68: .getVmTemplateDao() Line 69: .getAllVmTemplatesForOvfUpdateForStoragePool(pool.getId())); split to chunks was added, no vms/templates in memory during vds operation. Line 70: if (!templatesForUpdate.isEmpty()) { Line 71: VmTemplateCommand.UpdateTemplateInSpm(pool.getId(), templatesForUpdate); Line 72: for (VmTemplate template : templatesForUpdate) { Line 73: DbFacade.getInstance() Line 89: * returns a list of vms that are valid for update from the given vm list. Line 90: * valid vm is a vm which is not locked and none of it's disks is locked. Line 91: * @param vms Line 92: * @return Line 93: */ Done Line 94: private List<VM> getVmsForUpdate(List<VM> vms) { Line 95: List<VM> toReturn = new ArrayList<VM>(); Line 96: for (VM vm : vms) { Line 97: if (VMStatus.ImageLocked != vm.getstatus()) { Line 131: * @return Line 132: */ Line 133: private boolean verifyDisksNotLocked (List<DiskImage> disks) { Line 134: for (DiskImage disk : disks) { Line 135: if (disk.getimageStatus() == ImageStatus.LOCKED) { as agreed, correct code. Line 136: return false; Line 137: } Line 138: } Line 139: return true; .................................................... File backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/VmCommand.java Line 266: if (getVm() != null) { Line 267: if (getVm().getstatus() == VMStatus.ImageLocked) { Line 268: VmHandler.unlockVm(getVm(), getCompensationContext()); Line 269: } Line 270: throw new RuntimeException("asd"); Done Line 271: } else { Line 272: setCommandShouldBeLogged(false); Line 273: log.warn("VmCommand::EndVmCommand: Vm is null - not performing EndAction on Vm"); Line 274: } .................................................... File backend/manager/modules/common/src/main/java/org/ovirt/engine/core/common/businessentities/VmBase.java Line 107: private boolean autoStartup; Line 108: Line 109: @Column(name = "is_stateless") Line 110: private boolean stateless; Line 111: no need, discussed with yzaslav on patch comments. Line 112: @Column(name = "db_generation") Line 113: private long db_generation; Line 114: Line 115: @Column(name = "ovf_generation") .................................................... File backend/manager/modules/vdsbroker/src/main/java/org/ovirt/engine/core/vdsbroker/irsbroker/UpdateVMVDSCommand.java Line 45: counter++; Line 46: } Line 47: return result; Line 48: } Line 49: Done Line 50: @Override Line 51: protected VDSExceptionBase createDefaultConcreteException(String errorMessage) { Line 52: return new IrsOperationFailedNoFailoverException(errorMessage); Line 53: } .................................................... Commit Message Line 13: vms/templates - update, save, adding/removing a disk, etc. What's more, Line 14: currently updating the ovf (updateVM vdsm call) is usually done within a Line 15: transcation. Line 16: Line 17: The idea behined OvfAutoUpdater is to perform batch ovf update Done Line 18: operations that aggregate all outstanding updates per data center. These Line 19: updates will be done in specifed time intervals which will reduce Line 20: XML-RPC calls and will enable the removal of this syncronous vdsm call Line 21: from all over the code. Line 14: currently updating the ovf (updateVM vdsm call) is usually done within a Line 15: transcation. Line 16: Line 17: The idea behined OvfAutoUpdater is to perform batch ovf update Line 18: operations that aggregate all outstanding updates per data center. These Done Line 19: updates will be done in specifed time intervals which will reduce Line 20: XML-RPC calls and will enable the removal of this syncronous vdsm call Line 21: from all over the code. Line 22: Line 20: XML-RPC calls and will enable the removal of this syncronous vdsm call Line 21: from all over the code. Line 22: Line 23: known issues that will be addressed: Line 24: 1. Unite the operations of vm template ovf update/ vm update to single Done Line 25: vdsm command execution. Line 26: 2. remove updateVmInSpm/updateVmTemplateInSpm from Line 27: VmTemplateCommand/VmCommand Line 28: 3. Remove call to this methods from all calling commands while updating -- To view, visit http://gerrit.ovirt.org/9328 To unsubscribe, visit http://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b5132300fb1f1fd94f771cab15efe5246dbeca8 Gerrit-PatchSet: 3 Gerrit-Project: ovirt-engine Gerrit-Branch: master Gerrit-Owner: Liron Aravot <lara...@redhat.com> Gerrit-Reviewer: Allon Mureinik <amure...@redhat.com> Gerrit-Reviewer: Ayal Baron <aba...@redhat.com> Gerrit-Reviewer: Liron Aravot <lara...@redhat.com> Gerrit-Reviewer: Maor Lipchuk <mlipc...@redhat.com> Gerrit-Reviewer: Michael Kublin <mkub...@redhat.com> Gerrit-Reviewer: Tal Nisan <tni...@redhat.com> Gerrit-Reviewer: Vered Volansky <vvola...@redhat.com> Gerrit-Reviewer: Yair Zaslavsky <yzasl...@redhat.com> Gerrit-Reviewer: liron aravot <liron.ara...@gmail.com> _______________________________________________ Engine-patches mailing list Engine-patches@ovirt.org http://lists.ovirt.org/mailman/listinfo/engine-patches