Ross Finlayson wrote:

I have no doubt that your problem is real. However, your proposed solution - messing with the "BasicTaskScheduler" code - appears to be merely masking your real problem: That incorrect socket numbers are getting passed to the call to "select()" in the event loop.
Ross, I have solved the problem, and I understand its cause. I know that
the socket number is invalid, and it is invalid because of how SingleStep
behaves when my callback performs a reconnection.

Now you have 2 options:

1. Restarting (closing the sockets and opening new ones) within one
single step is allowed - and then SingleStep has a bug - or
2. it is not allowed: the connection must be closed in one event, so
that SingleStep will not call a wrong handler, and then reopened in a
second event, triggered by the first. In that case SingleStep works
correctly. I have chosen this 2nd solution to avoid modifying
BasicTaskScheduler; once I understood the problem, it was easy to find
a workaround.
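
The second option can be illustrated with a minimal sketch. It does not use live555 at all: EventQueue, schedule(), step() and runTwoPhaseRestart() are hypothetical names made up for the illustration, and the "close" and "reopen" entries stand in for the real socket operations. The only point it shows is that the reopen is queued by the close and runs in a later pass of the loop, never in the same one.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal event queue; a stand-in for a real scheduler.
struct EventQueue {
  std::deque<std::function<void()>> pending;

  void schedule(std::function<void()> task) {
    pending.push_back(std::move(task));
  }

  // Runs exactly one queued task, like one pass of an event loop.
  bool step() {
    if (pending.empty()) return false;
    std::function<void()> task = std::move(pending.front());
    pending.pop_front();
    task();
    return true;
  }
};

// Records the order of operations so the two phases are visible.
std::vector<std::string> runTwoPhaseRestart() {
  EventQueue q;
  std::vector<std::string> log;

  // Phase 1 only closes. The reopen is queued, not performed inline.
  q.schedule([&] {
    log.push_back("close");
    q.schedule([&] { log.push_back("reopen"); });  // phase 2, a later pass
  });

  q.step();                         // first pass: old sockets closed
  log.push_back("step boundary");   // any stale per-pass state ends here
  q.step();                         // second pass: new sockets opened
  assert(q.pending.empty());        // nothing else was queued
  return log;
}
```

Calling runTwoPhaseRestart() in this sketch gives the sequence "close", "step boundary", "reopen": the new sockets only come into existence after the pass that closed the old ones has finished.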

Since it's *your* library, it's up to you to choose which. See below.


You should make sure that "TaskScheduler::turnOffBackgroundReadHandling()" gets called on each socket once it's no longer being used (before its number gets reclaimed for another socket). There's apparently something about your custom code that is causing that not to happen for some sockets.
I have logged all the TurnOn's, TurnOff's, and selects to track down
this problem. They are all called - but before the end of SingleStep.
Look, this is my last attempt to explain it:

void BasicTaskScheduler::SingleStep(unsigned maxDelayTime) {
  fd_set readSet = fReadSet; // make a copy for this select() call
  // this is the origin of the problem: taking a local copy.
  // fReadSet contains the 3 socket IDs, referring to:
  // TCP, UDP-RTP, UDP-RTCP
  // for example:
  // TCP -> 2000
  // UDP-RTP -> 2100
  // UDP-RTCP -> 2200

  // ... omitted ...

  int selectResult = select(fMaxNumSockets, &readSet, NULL, NULL,
                            &tv_timeToDelay);
  // after the select, readSet.fd_count is 1
  // and readSet.fd_array[0] is 2000,
  // which in our scenario always refers to the TCP socket

  // ... omitted ...

  // my restart callback is called here, scheduled by a BYE handler
  fDelayQueue.handleAlarm();
  // all 3 sockets defined before are correctly closed
  // and 3 new ones are opened.
  // They are registered in fReadSet / fReadHandlers,
  // but here is the problem again:
  // the local variable readSet still refers to the old selected socket ID,
  // so if handleAlarm changes the connections, the rest of
  // this routine is based on outdated data.
  // This is not consistent: that's why I suggested moving this call to
  // the end.
  // When the socket ID in readSet is reused by the OS for a UDP
  // socket, the scheduler stops looping.
  // For example, it has happened that after the call:
  // TCP -> 2500
  // UDP-RTP -> 2000
  // UDP-RTCP -> 2600

  HandlerIterator iter(*fReadHandlers);
  HandlerDescriptor* handler;

  // ... omitted ...
  // (the trick with fLastHandledSocketNum does not interfere at all)

  // the 3 handlers being scanned are the NEW ones,
  // but one of them has a socket ID (2000) that collides with an old one.
  while ((handler = iter.next()) != NULL) {
    if (FD_ISSET(handler->socketNum, &readSet) &&
       // TRUE for socket 2000: it is the only one in the old readSet.
       // This is meaningless after the reconnection:
       // 2000 is not the TCP socket ID anymore, it is a new UDP socket
       // where no packets will arrive
       FD_ISSET(handler->socketNum, &fReadSet) &&
       // TRUE for socket 2000 in fReadSet:
       // it is one of the 3 new ones
       handler->handlerProc != NULL) {
       // TRUE since it points to the newly set UDP handler

       // ... omitted ...
       // now the UDP handler is called for the previously selected
       // socket 2000, but nothing is incoming on UDP,
       // so it gets stuck in the select inside it.
      (*handler->handlerProc)(handler->clientData, SOCKET_READABLE);
      break;
    }
  }
}
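
The ordering issue traced above can also be reproduced without live555, in a small simulation. This is only a sketch under assumptions: MiniScheduler, singleStep(), runAlarm() and dispatchedHandler() are hypothetical names, std::set stands in for fd_set, and the handler names are just labels. It runs the same scenario twice, once with the alarm handled before the dispatch loop (as in the listing) and once with it handled at the end (as suggested).

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for the scheduler state; this is not live555 code.
struct MiniScheduler {
  std::set<int> readSet;                      // plays the role of fReadSet
  std::map<int, std::string> handlers;        // socket number -> handler label
  std::function<void(MiniScheduler&)> alarm;  // pending delayed task, if any

  void runAlarm() {
    if (alarm) {
      std::function<void(MiniScheduler&)> task = alarm;
      alarm = nullptr;  // the task runs at most once
      task(*this);
    }
  }

  // Returns the label of the handler that gets dispatched, or "" if none.
  std::string singleStep(const std::set<int>& readable, bool alarmFirst) {
    // Snapshot of what select() would report, taken before any handler runs.
    std::set<int> snapshot;
    for (int s : readable)
      if (readSet.count(s)) snapshot.insert(s);

    if (alarmFirst) runAlarm();  // ordering shown in the listing

    std::string called;
    for (const auto& entry : handlers) {
      int sock = entry.first;
      // Same double check as the listing: stale snapshot AND current set.
      if (snapshot.count(sock) && readSet.count(sock)) {
        called = entry.second;
        break;
      }
    }

    if (!alarmFirst) runAlarm();  // ordering suggested as the fix
    return called;
  }
};

// Builds the scenario from the listing and reports which handler runs.
std::string dispatchedHandler(bool alarmFirst) {
  MiniScheduler s;
  s.readSet = {2000, 2100, 2200};
  s.handlers = {{2000, "old TCP"}, {2100, "old UDP"}, {2200, "old UDP"}};
  // The restart: old sockets closed, new ones opened, and 2000 is reused.
  s.alarm = [](MiniScheduler& m) {
    m.readSet = {2500, 2000, 2600};
    m.handlers = {{2500, "new TCP"}, {2000, "new UDP"}, {2600, "new UDP"}};
  };
  std::string called = s.singleStep({2000}, alarmFirst);
  assert(!s.alarm);  // the restart ran exactly once in either ordering
  return called;
}
```

In this sketch dispatchedHandler(true) returns "new UDP", which is the wrong handler for data that was seen on the old TCP socket, while dispatchedHandler(false) returns "old TCP". It only models the bookkeeping, not the blocking read that follows in the real handler.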

I cannot be clearer than this.
If you still hold your position, that's not a problem - really - I just tried to contribute... but this is now becoming overkill.

Thanks again for your great work on live555.

Regards,
   Sigismondo
--
------------------------------------------------------------------
Sigismondo Boschi, PhD                     TotalWire S.r.l.
[EMAIL PROTECTED]                      http://www.totalwire.it
cell: +39 346 7911896                      tel:  +39 051 302696


_______________________________________________
live-devel mailing list
live-devel@lists.live555.com
http://lists.live555.com/mailman/listinfo/live-devel
