Hello @Łukasz Zemczak

I understand your concern about breaking existing behaviour.

I would like to clarify 2 things:
 1) The change introduced here only ensures that user-requested timeouts are 
never exceeded. That means that if something is currently accepting DHCP offers 
after the specified timeout has expired, then that is buggy behaviour. It would 
make sense that such behaviour would change after this change. If such a user 
exists, they would need to change their timeout values so as to be long enough 
to receive the DHCP offer that they want. I think it makes sense that such 
buggy behaviour would get exposed with this fix.
 2) That said, I consider it very unlikely for that scenario to even exist. The 
main reason is that the bug (going over the specified timeout) happens in the 
case where there is a problem bringing up the interface, so the ipconfig logic 
simply says "try again in 10 seconds" in that case, even if that 10 seconds is 
longer than the entire specified timeout. Since this happens when there are 
interface problems, it can be safely assumed that no DHCP offers will be 
received anyway on that problematic interface (in fact, no DHCP request was 
even sent out). These 10 seconds are just dead time, basically.

So I would consider it unlikely that anyone who is successfully using
ipconfig today would even notice a change. This patch should only affect
those 10-seconds delays caused by problematic interfaces not normal
functioning interfaces, (and even then, it would only affect them if the
user-specified timeout is less than 10 seconds).

I am not sure what you mean by how it differs from focal+. The last time
I checked, the timeout-bug was affecting ipconfig in both focal and
bionic (and upstream HEAD). However, we are only interested in Bionic
since Focal does not even use ipconfig it uses dhclient instead.

I will add a regression-potential section to the description with more
info. Thanks.

I will try to expand the regression-potential section with additional
info about this.

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to klibc in Ubuntu.
https://bugs.launchpad.net/bugs/1947099

Title:
  ipconfig does not honour user-requested timeouts in some cases

Status in klibc package in Ubuntu:
  New
Status in klibc source package in Bionic:
  Incomplete

Bug description:
  
  [Impact]
  In some cases, ipconfig can take a longer time than the user-specified 
timeouts, causing unexpected delays.

  [Test Plan]
  Any situation where ipconfig encounters an error sending the DHCP packet, it 
will automatically set a delay of 10 seconds, which could be longer than the 
user-specified timeout. It can be reproduced by creating a dummy interface and 
attempting to run ipconfig on it with a timeout value of less than 10:

  # ip link add eth1 type dummy
  # date; /usr/lib/klibc/bin/ipconfig -t 2 eth1; date
  Thu Nov 18 04:46:13 EST 2021
  IP-Config: eth1 hardware address ae:e0:f5:9d:7e:00 mtu 1500 DHCP RARP
  IP-Config: no response after 2 secs - giving up
  Thu Nov 18 04:46:23 EST 2021

  ^ Notice above, ipconfig thinks that it waited 2 seconds, but the
  timestamps show an actual delay of 10 seconds.

  [Where problems could occur]
  Please see reproduction steps above. We are seeing this in production too 
(see comment #2).

  [Other Info]
  A patch to fix the issue is being proposed here. It is a safe fix - it only 
checks before going into sleep that the timeout never exceeds the 
user-requested value.

  [Original Description]

  In some cases, ipconfig can take longer than the user-specified
  timeouts, causing unexpected delays.

  in main.c, in function loop(), the process can go into
  process_timeout_event() (or process_receive_event() ) and if it
  encounters an error situation, will set an attempt to "try again
  later" at time equal now + 10 seconds by setting

  s->expire = now + 10;

  This can happen at any time during the main event loop, which can end
  up extending the user-specified timeout if "now + 10" is greater than
  "start_time + user-specified-timeout".

  I believe a patch like the following is needed to avoid this problem:

  --- a/usr/kinit/ipconfig/main.c
  +++ b/usr/kinit/ipconfig/main.c
  @@ -437,6 +437,13 @@ static int loop(void)

                          if (timeout > s->expire - now.tv_sec)
                                  timeout = s->expire - now.tv_sec;
  +
  +                       /* Compensate for already-lost time */
  +                       gettimeofday(&now, NULL);
  +                       if (now.tv_sec + timeout > start + loop_timeout) {
  +                               timeout = loop_timeout - (now.tv_sec - start);
  +                               printf("Lowered timeout to match user request 
= (%d s) \n", timeout);
  +                       }
                  }

  I believe the current behaviour is buggy. This is confirmed when the
  following line is executed:

                          if (loop_timeout >= 0 &&
                              now.tv_sec - start >= loop_timeout) {
                                  printf("IP-Config: no response after %d "
                                         "secs - giving up\n", loop_timeout);
                                  rc = -1;
                                  goto bail;
                          }

  'loop_timeout' is the user-specified time-out. With a value of 2, in
  case of error, this line prints:

  IP-Config: no response after 2 secs - giving up

  So it thinks that it waited 2 seconds - however, in reality it had
  actually waited for 10 seconds.

  The suggested code-change ensures that the timeout that is actually
  used never exceeds the user-specified timeout.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/klibc/+bug/1947099/+subscriptions


-- 
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp

Reply via email to