Dudi Maroshi has posted comments on this change. Change subject: hosted-engine: hosted-engine client, with storage connection timeout ......................................................................
Patch Set 6: (4 comments) https://gerrit.ovirt.org/#/c/40392/6/ovirt_hosted_engine_ha/broker/listener.py File ovirt_hosted_engine_ha/broker/listener.py: Line 178: MetadataError, FatalMetadataError, Line 179: SanlockInitializationError, Line 180: BrokerInitializationError, BrokerConnectionError, Line 181: NotImplementedError, BlockBackendCorruptedException, Line 182: IOError, OSError) as e: > Uh... not here. This is a very bad way of catching errors, because it will Done Line 183: response = "failure " + format(str(e)) Line 184: self._log.debug("Response: %s", response) Line 185: util.socket_sendline(self.request, self._log, response) Line 186: except socket.timeout: https://gerrit.ovirt.org/#/c/40392/6/ovirt_hosted_engine_ha/broker/storage_broker.py File ovirt_hosted_engine_ha/broker/storage_broker.py: Line 126: f = os.open(path, direct_flag | os.O_RDONLY) Line 127: os.lseek(f, offset, os.SEEK_SET) Line 128: data = os.read(f, read_size) Line 129: os.close(f) Line 130: except (IOError, OSError) as e: > This is enough, no need for circuit breaker. Done Line 131: self._log.error("Failed to read metadata from %s", Line 132: path, exc_info=True) Line 133: raise RequestError("failed to read metadata: {0}".format(str(e))) Line 134: https://gerrit.ovirt.org/#/c/40392/6/ovirt_hosted_engine_ha/lib/brokerlink.py File ovirt_hosted_engine_ha/lib/brokerlink.py: Line 248: """ Line 249: prevTimeout = self._socket.gettimeout() Line 250: self._socket.settimeout(20) Line 251: self._log.debug("Set socket timeout = %d", 20) Line 252: response = self._communicate(request) > And I actually explained to you that no timeout will help here. The added e I want to solve the hanging issue on the client/broker side. The client/broker hanging can occur with NFS block, or network disruption. I did not find a mechanism to release the client/broker from hanging socket, see my comment before. Line 253: self._socket.settimeout(prevTimeout) Line 254: self._log.debug("Reset socket timeout = %s", str(prevTimeout)) Line 255: try: Line 256: status, message = response.split(" ", 1) Line 248: """ Line 249: prevTimeout = self._socket.gettimeout() Line 250: self._socket.settimeout(20) Line 251: self._log.debug("Set socket timeout = %d", 20) Line 252: response = self._communicate(request) > You are calling self._communicate with is calling util.socket_readline with Testing this mechanism with upstream version. Timeout works! maybe on a single socket only (I do not know how it is implemented). The other options are: 1. Implement a buffered reader on the socket. 2. Let hanging client/broker hang, till released by infrastructure mechanism. Line 253: self._socket.settimeout(prevTimeout) Line 254: self._log.debug("Reset socket timeout = %s", str(prevTimeout)) Line 255: try: Line 256: status, message = response.split(" ", 1) -- To view, visit https://gerrit.ovirt.org/40392 To unsubscribe, visit https://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3be8d3cff0912c7a88ebb00145a17fa6dd2d892d Gerrit-PatchSet: 6 Gerrit-Project: ovirt-hosted-engine-ha Gerrit-Branch: master Gerrit-Owner: Dudi Maroshi <d...@redhat.com> Gerrit-Reviewer: Dudi Maroshi <d...@redhat.com> Gerrit-Reviewer: Jenkins CI Gerrit-Reviewer: Martin Sivák <msi...@redhat.com> Gerrit-Reviewer: Roy Golan <rgo...@redhat.com> Gerrit-Reviewer: Sandro Bonazzola <sbona...@redhat.com> Gerrit-Reviewer: Simone Tiraboschi <stira...@redhat.com> Gerrit-Reviewer: automat...@ovirt.org Gerrit-HasComments: Yes _______________________________________________ Engine-patches mailing list Engine-patches@ovirt.org http://lists.ovirt.org/mailman/listinfo/engine-patches