** Description changed:

  [Impact]
  
   * When cancelling a connector task, the associated resolver task (if
  not finished) is not cancelled and continues running.
  
  Unfortunately, if the resolver task eventually raises an exception
  (e.g., socket.gaierror), the exception goes directly to the event loop's
  exception handler because nothing is awaiting the task anymore.
  
  This results in applications crashing with exceptions such as:
  
  Task exception was never retrieved
  future: <Task finished name='Task-3' coro=<TCPConnector._resolve_host() done, defined at /usr/lib/python3/dist-packages/aiohttp/connector.py:774> exception=gaierror(-2, 'Name or service not known')>
  Traceback (most recent call last):
    File "/usr/lib/python3/dist-packages/aiohttp/connector.py", line 829, in _resolve_host
      addrs = await \
    File "/usr/lib/python3/dist-packages/aiohttp/resolver.py", line 29, in resolve
      infos = await self._loop.getaddrinfo(
    File "/usr/lib/python3.8/asyncio/base_events.py", line 825, in getaddrinfo
      return await self.run_in_executor(
    File "/usr/lib/python3.8/concurrent/futures/thread.py", line 57, in run
      result = self.fn(*self.args, **self.kwargs)
    File "/usr/lib/python3.8/socket.py", line 918, in getaddrinfo
      for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
  socket.gaierror: [Errno -2] Name or service not known
  
   * This aiohttp bug is the root cause of a crash in Subiquity: https://bugs.launchpad.net/ubuntu-power-systems/+bug/1969393
  Currently, we build the Subiquity snap based on deb packages (including python3-aiohttp) from focal. We temporarily moved from python3-aiohttp to python3-requests in Subiquity to work around this crash.
   * Other applications based on python3-aiohttp may be affected and crash, even though no other bug reports have been filed.
  
   * The patch in the debdiff makes sure that the resolver task ends up being
  awaited when the associated connector task gets cancelled.
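
  The failure mode can be sketched with plain asyncio, independently of
  aiohttp. This is an illustrative sketch, not aiohttp's code:
  failing_resolver, connector and reproduce are hypothetical names, OSError
  stands in for socket.gaierror, and asyncio.wait() is used only because it
  leaves the awaited task running when the waiter itself is cancelled.

```python
import asyncio
import gc


async def failing_resolver():
    # Hypothetical stand-in for a slow DNS lookup that eventually fails.
    await asyncio.sleep(0.05)
    raise OSError("Name or service not known")


async def connector(resolver_task):
    # asyncio.wait() does not cancel the tasks it waits on, so the
    # resolver keeps running if this coroutine is cancelled.
    await asyncio.wait({resolver_task})
    return resolver_task.result()


async def reproduce():
    reported = []
    loop = asyncio.get_running_loop()
    # Record what reaches the loop's exception handler instead of logging it.
    loop.set_exception_handler(lambda _loop, ctx: reported.append(ctx["message"]))

    resolver_task = asyncio.ensure_future(failing_resolver())
    connector_task = asyncio.ensure_future(connector(resolver_task))
    await asyncio.sleep(0)  # let both tasks start
    connector_task.cancel()
    try:
        await connector_task
    except asyncio.CancelledError:
        pass

    await asyncio.sleep(0.1)  # the resolver fails while nobody awaits it
    del resolver_task, connector_task
    gc.collect()  # the report is emitted when the finished task is destroyed
    return reported


if __name__ == "__main__":
    print(asyncio.run(reproduce()))
```

  Once the last reference to the finished resolver task goes away, asyncio
  reports the unretrieved exception through the loop's exception handler,
  so the returned list is expected to contain the "Task exception was never
  retrieved" message.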
  
  [Test Plan]
  
-  * The following chunk of code can be executed to reproduce the issue:
- https://paste.ubuntu.com/p/XrnfqVHtBh/ (it takes ~60 seconds to
- execute).
+  * The following chunk of code can be executed to reproduce the issue: 
https://paste.ubuntu.com/p/XrnfqVHtBh/ (it takes ~60 seconds to execute).
+  * The bug is much easier to reproduce if there is some delay in the DNS resolution. One way to simulate some delay is to use a non-local DNS server and run:
+ 
+ # tc qdisc add dev eth0 root netem delay 200m
+ 
+ where eth0 is the name of the interface used for DNS resolutions.
+ This can be reverted using:
+ # tc qdisc del dev eth0 root netem
  
      * On python3-aiohttp 3.6 (focal), the exception handler wakes up
  with:
  
      Task exception was never retrieved
      future: <Task finished name='Task-3' coro=<TCPConnector._resolve_host() done, defined at /usr/lib/python3/dist-packages/aiohttp/connector.py:774> exception=gaierror(-2, 'Name or service not known')>
  
      * When aiohttp is patched, nothing should happen.
  
   * Manually testing the patched library against Subiquity to make sure it solves https://bugs.launchpad.net/ubuntu-power-systems/+bug/1969393.
  This was done on my end using this PPA: https://launchpad.net/~ogayot/+archive/ubuntu/focal-bugfix . Tests were green.
  
  [Where problems could occur]
  
   * Since the patch affects a Python library, any application that
  depends on this library (i.e., python3-aiohttp) on focal would be
  affected by the upload.
  
   * In the unlikely event that this patch introduces a regression,
  applications that depend on python3-aiohttp (i.e., in focal/universe)
  can crash or raise exceptions.
  
   * If any package in focal/main has python3-aiohttp as a Build-Depends,
  a regression could cause said package to FTBFS.
  
  [Other Info]
  
   * The debdiff brings a backport of an upstream patch that is present in
  aiohttp 3.7 and newer versions:
  
  https://github.com/aio-libs/aiohttp/pull/5050
  
   * Upstream bug report: https://github.com/aio-libs/aiohttp/issues/4330

** Description changed:

  [Impact]
  
   * When cancelling a connector task, the associated resolver task (if
  not finished) is not cancelled and continues running.
  
  Unfortunately, if the resolver task eventually raises an exception
  (e.g., socket.gaierror), the exception goes directly to the event loop's
  exception handler because nothing is awaiting the task anymore.
  
  This results in applications crashing with exceptions such as:
  
  Task exception was never retrieved
  future: <Task finished name='Task-3' coro=<TCPConnector._resolve_host() done, defined at /usr/lib/python3/dist-packages/aiohttp/connector.py:774> exception=gaierror(-2, 'Name or service not known')>
  Traceback (most recent call last):
    File "/usr/lib/python3/dist-packages/aiohttp/connector.py", line 829, in _resolve_host
      addrs = await \
    File "/usr/lib/python3/dist-packages/aiohttp/resolver.py", line 29, in resolve
      infos = await self._loop.getaddrinfo(
    File "/usr/lib/python3.8/asyncio/base_events.py", line 825, in getaddrinfo
      return await self.run_in_executor(
    File "/usr/lib/python3.8/concurrent/futures/thread.py", line 57, in run
      result = self.fn(*self.args, **self.kwargs)
    File "/usr/lib/python3.8/socket.py", line 918, in getaddrinfo
      for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
  socket.gaierror: [Errno -2] Name or service not known
  
   * This aiohttp bug is the root cause of a crash in Subiquity: https://bugs.launchpad.net/ubuntu-power-systems/+bug/1969393
  Currently, we build the Subiquity snap based on deb packages (including python3-aiohttp) from focal. We temporarily moved from python3-aiohttp to python3-requests in Subiquity to work around this crash.
   * Other applications based on python3-aiohttp may be affected and crash, even though no other bug reports have been filed.
  
   * The patch in the debdiff makes sure that the resolver task ends up being
  awaited when the associated connector task gets cancelled.
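
  One way to get this behaviour in plain asyncio is sketched below. It is an
  assumption-laden illustration rather than the backported patch itself:
  failing_resolver, drop_exception, connector and run_patched are
  hypothetical names, OSError stands in for socket.gaierror, and the
  resolver's outcome is consumed through a done callback, which may differ
  from what the debdiff does.

```python
import asyncio
import gc
from contextlib import suppress


async def failing_resolver():
    # Hypothetical stand-in for a slow DNS lookup that eventually fails.
    await asyncio.sleep(0.05)
    raise OSError("Name or service not known")


def drop_exception(task):
    # Calling result() marks the task's exception as retrieved.
    with suppress(Exception, asyncio.CancelledError):
        task.result()


async def connector(resolver_task):
    try:
        await asyncio.wait({resolver_task})
    except asyncio.CancelledError:
        # Nobody will await the resolver after this point, so make sure
        # its outcome is still consumed once it finishes.
        resolver_task.add_done_callback(drop_exception)
        raise
    return resolver_task.result()


async def run_patched():
    reported = []
    loop = asyncio.get_running_loop()
    # Record what reaches the loop's exception handler instead of logging it.
    loop.set_exception_handler(lambda _loop, ctx: reported.append(ctx["message"]))

    resolver_task = asyncio.ensure_future(failing_resolver())
    connector_task = asyncio.ensure_future(connector(resolver_task))
    await asyncio.sleep(0)  # let both tasks start
    connector_task.cancel()
    try:
        await connector_task
    except asyncio.CancelledError:
        pass

    await asyncio.sleep(0.1)  # the resolver fails, the callback consumes it
    del resolver_task, connector_task
    gc.collect()
    return reported


if __name__ == "__main__":
    print(asyncio.run(run_patched()))
```

  With the callback in place, the resolver's exception is retrieved as soon
  as the task finishes, so nothing should reach the loop's exception handler.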
  
  [Test Plan]
  
   * The following chunk of code can be executed to reproduce the issue: 
https://paste.ubuntu.com/p/XrnfqVHtBh/ (it takes ~60 seconds to execute).
-  * The bug is much easier to reproduce if there is some delay in the DNS resolution. One way to simulate some delay is to use a non-local DNS server and run:
+  * The bug is much easier to reproduce if there is some delay in the DNS resolution. One way to simulate some delay is to use a non-local DNS server and run:
  
- # tc qdisc add dev eth0 root netem delay 200m
+ # tc qdisc add dev eth0 root netem delay 200ms
  
  where eth0 is the name of the interface used for DNS resolutions.
  This can be reverted using:
  # tc qdisc del dev eth0 root netem
  
      * On python3-aiohttp 3.6 (focal), the exception handler wakes up
  with:
  
      Task exception was never retrieved
      future: <Task finished name='Task-3' coro=<TCPConnector._resolve_host() done, defined at /usr/lib/python3/dist-packages/aiohttp/connector.py:774> exception=gaierror(-2, 'Name or service not known')>
  
      * When aiohttp is patched, nothing should happen.
  
   * Manually testing the patched library against Subiquity to make sure it solves https://bugs.launchpad.net/ubuntu-power-systems/+bug/1969393.
  This was done on my end using this PPA: https://launchpad.net/~ogayot/+archive/ubuntu/focal-bugfix . Tests were green.
  
  [Where problems could occur]
  
   * Since the patch affects a Python library, any application that
  depends on this library (i.e., python3-aiohttp) on focal would be
  affected by the upload.
  
   * In the unlikely event that this patch introduces a regression,
  applications that depend on python3-aiohttp (i.e., in focal/universe)
  can crash or raise exceptions.
  
   * If any package in focal/main has python3-aiohttp as a Build-Depends,
  a regression could cause said package to FTBFS.
  
  [Other Info]
  
   * The debdiff brings a backport of an upstream patch that is present in
  aiohttp 3.7 and newer versions:
  
  https://github.com/aio-libs/aiohttp/pull/5050
  
   * Upstream bug report: https://github.com/aio-libs/aiohttp/issues/4330

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1969817

Title:
  Uncaught exception when connector is cancelled

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/python-aiohttp/+bug/1969817/+subscriptions


