Martin Peřina has posted comments on this change. Change subject: fklsnr: Introduce standalone fence_kdump listener ......................................................................
Patch Set 12: (3 comments) http://gerrit.ovirt.org/#/c/27201/12/packaging/services/ovirt-fence-kdump-listener/listener.py File packaging/services/ovirt-fence-kdump-listener/listener.py: Line 191: ): Line 192: self._dao.update_heartbeat() Line 193: self._lastHeartbeat = datetime.datetime.utcnow() Line 194: Line 195: def _save_sessions(self): > I think that we can update dump in much lower frequency, 30 seconds should 1) As I wrote above currently we do 2 db inserts/updates per dumping host (one to set status 'dumping' and one to set status 'finished'). 2) If we do this every 30 seconds (for example) I will have to readjust KdumpStartTimeout (currently 30 seconds) in engine (otherwise engine will execute hard fencing on host) and all kdump detection will take much more time. So I don't think we need to complicate whole kdump detection process even further with status updates per longer interval Line 196: # update db state for all updated sessions Line 197: for session in self._sessions.values(): Line 198: if session['dirty']: Line 199: rows_updated = self._dao.update_vds_kdump_status( Line 208: "address '{address}'." Line 209: ).format( Line 210: address=session['address'][0], Line 211: ) Line 212: ) > it is not misconfiguration, this is abuse of our listener, no? It doesn't matter how do you call it :-) But if this is logged only to debug, it will hard for sysadmin to identify that problem is someone using DOS attack to listener (for example). From my personal experience as a sysadmin setting log level to debug for some non working service is one of the last step when identifying problem primary due to HUGE size of debug log with usually contains ton's of unrelated logs ... Line 213: # set status to finished to be removed in next step Line 214: session['status'] = self.SESSION_STATE_CLOSED Line 215: Line 216: if session['status'] == self.SESSION_STATE_FINISHED: Line 237: Line 238: self._afterFirstDbSync = True Line 239: Line 240: def _db_sync(self): Line 241: if self._db_manager.validate_connection(): > try to set database address that no host is there, example 1.1.1.1 Right, I forgot about this scenario, even I don't think the db down will happen too often: 1) If db is down, the engine doesn't work either so sysadmin will try to fix it asap (and if not possible he can shutdown engine and also listener) 2) Main objective is to survive db restarts (for example during package upgrades) which are usually short term 3) It's really hard to properly detect host kdump in engine, if listener will detect db is up again after much longer period than engine Line 242: self._db_connection_valid = True Line 243: try: Line 244: self._heartbeat() Line 245: self._save_sessions() -- To view, visit http://gerrit.ovirt.org/27201 To unsubscribe, visit http://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ieec3bad47bbba860a52a9ff4e2eb7f61277f4e36 Gerrit-PatchSet: 12 Gerrit-Project: ovirt-engine Gerrit-Branch: master Gerrit-Owner: Martin Peřina <mper...@redhat.com> Gerrit-Reviewer: Alon Bar-Lev <alo...@redhat.com> Gerrit-Reviewer: Barak Azulay <bazu...@redhat.com> Gerrit-Reviewer: Eli Mesika <emes...@redhat.com> Gerrit-Reviewer: Martin Peřina <mper...@redhat.com> Gerrit-Reviewer: Oved Ourfali <oourf...@redhat.com> Gerrit-Reviewer: Saggi Mizrahi <smizr...@redhat.com> Gerrit-Reviewer: automat...@ovirt.org Gerrit-Reviewer: oVirt Jenkins CI Server Gerrit-HasComments: Yes _______________________________________________ Engine-patches mailing list Engine-patches@ovirt.org http://lists.ovirt.org/mailman/listinfo/engine-patches