On 06/04/2011 12:04, Cristiano Belloni wrote:
Hi to all,
I wrote a custom shared memory source. It inherits from FramedSource.
The shared memory is synchronized via Linux semaphores (simple producer-consumer algorithm), but since I didn't want to subclass TaskScheduler, I still use a "dummy" file descriptor-based communication with live555. In pseudocode:


~~~Client (without live555):

wait on semaphore_empty (blocking)
copy frame into shared memory
write one byte to a dedicated FIFO (this should wake up the select() in live555's TaskScheduler)
post on semaphore_fill

~~~Server (with live555, in SharedMemSource::incomingPacketHandler1())
[turnOnBackgroundReadHandling is called in doGetNextFrame]

wait on semaphore_fill (blocking)
read one byte from the dedicated FIFO (to flush the FIFO buffer)
copy frame from shared memory
post on semaphore_empty

This works. Although the blocking wait on semaphore_fill might make you wonder, the client wakes up my source with the write() to the dedicated FIFO and immediately posts on semaphore_fill, so the server almost never waits, and if it does, it blocks only for a very short time.
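
For concreteness, the handshake above could be sketched as follows. This is only a single-process illustration, assuming POSIX unnamed semaphores and a pipe standing in for the named FIFO; Channel, channelInit, produceFrame and consumeFrame are hypothetical names, not the actual source code:

```cpp
#include <semaphore.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

// Hypothetical shared state. In the real setup the buffer and the semaphores
// would live in shared memory and the pipe would be the dedicated FIFO.
struct Channel {
  sem_t empty;      // free slots, starts at 1
  sem_t fill;       // filled slots, starts at 0
  int fifo[2];      // fifo[0] = read end, fifo[1] = write end
  char frame[64];   // stand-in for the shared frame buffer
  size_t frameSize;
};

bool channelInit(Channel& c) {
  c.frameSize = 0;
  if (sem_init(&c.empty, 0, 1) != 0) return false;
  if (sem_init(&c.fill, 0, 0) != 0) return false;
  return pipe(c.fifo) == 0;
}

// Client side: wait for a free slot, copy the frame, wake the reader, post.
bool produceFrame(Channel& c, const char* data, size_t size) {
  if (size > sizeof c.frame) return false;
  if (sem_wait(&c.empty) != 0) return false;
  memcpy(c.frame, data, size);
  c.frameSize = size;
  char wake = 1;
  if (write(c.fifo[1], &wake, 1) != 1) return false;
  return sem_post(&c.fill) == 0;
}

// Server side: wait for a filled slot, flush the wake-up byte, copy, post.
bool consumeFrame(Channel& c, char* out, size_t& size) {
  if (sem_wait(&c.fill) != 0) return false;
  char wake;
  if (read(c.fifo[0], &wake, 1) != 1) return false;
  memcpy(out, c.frame, c.frameSize);
  size = c.frameSize;
  return sem_post(&c.empty) == 0;
}
```

Note that the sketch does not retry sem_wait() on EINTR, which real code probably should.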

The problem is that, after a while (usually 1 or 2 hours), the client does its cycle and the server never wakes up. It *doesn't* get stuck on the wait, I checked: it simply never wakes up, as if the client's write() was lost (but it *always* succeeds on the client side) or the select() didn't wake up even though the write succeeded.

I would like to emphasize this: the server *never* gets stuck forever on its wait. When it gets stuck, the client is one frame ahead of the server, incomingPacketHandler1() simply is never called anymore and the wait is not even reached.

At this point, I have two questions:

1) To your knowledge, can the select() fail to wake up even if a write() on the other side succeeded? If it can, how is that possible? Note that the system is an embedded ARM processor, and it can get quite busy while acquiring and streaming video.

2) The first thing I do in SharedMemSource::incomingPacketHandler1() is check isCurrentlyAwaitingData(). If it's false, I simply return without running the cycle, and this happens quite often. What's the meaning of isCurrentlyAwaitingData()? I mean, if the select() in TaskScheduler returned, some data must be present on the file/FIFO/socket. How is it possible that the select() returned but there's still no data available? I'm getting really confused about this.
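
For reference on question 1: select() is level-triggered, so a descriptor should be reported readable for as long as unread data remains, not only at the moment of the write(). A minimal sketch that can be used to check this on the target, with a pipe standing in for the FIFO; waitReadable is a hypothetical helper, not part of live555:

```cpp
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

// Returns 1 if fd is readable within timeoutMs, 0 on timeout, -1 on error.
int waitReadable(int fd, long timeoutMs) {
  fd_set readSet;
  FD_ZERO(&readSet);
  FD_SET(fd, &readSet);

  struct timeval tv;
  tv.tv_sec = timeoutMs / 1000;
  tv.tv_usec = (timeoutMs % 1000) * 1000;

  int r = select(fd + 1, &readSet, NULL, NULL, &tv);
  if (r < 0) return -1;  // errno is valid only in this case
  return (r > 0 && FD_ISSET(fd, &readSet)) ? 1 : 0;
}
```

Calling it twice in a row after a single one-byte write() should report the descriptor readable both times; it should stop being readable only once the byte has been read.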

Thanks and regards,
Cristiano Belloni.


--
Belloni Cristiano
Imavis Srl.
www.imavis.com <http://www.imavis.com>
bell...@imavis.com <mailto://bell...@imavis.com>


_______________________________________________
live-devel mailing list
live-devel@lists.live555.com
http://lists.live555.com/mailman/listinfo/live-devel

Update: I put some logs in BasicTaskScheduler to see what happens.

One before the select():

    printf ("[SYNCHROBUG] About to do the select, timeout %d.%d\n", tv_timeToDelay.tv_sec, tv_timeToDelay.tv_usec);
    int selectResult = select(fMaxNumSockets, &readSet, &writeSet, &exceptionSet, &tv_timeToDelay);
    if (selectResult < 0) {
[...]

Two after the select() (one reports an EINTR or EAGAIN error value, should either occur):

#else
    if (errno != EINTR && errno != EAGAIN) {
#endif
        // Unexpected error - treat this as fatal:
#if !defined(_WIN32_WCE)
        perror("BasicTaskScheduler::SingleStep(): select() fails");
#endif
        internalError();
      }
  }
    if (errno == EINTR || errno == EAGAIN) {
       perror ("[SYNCHROBUG] error is");
    }

  printf ("[SYNCHROBUG] Select done, getting sockets\n");
[...]

Two after the first and second passes of the readable-socket check:

    int resultConditionSet = 0;
    if (FD_ISSET(sock, &readSet) && FD_ISSET(sock, &fReadSet)/*sanity check*/) {
       printf ("[SYNCHROBUG] Socket %d found readable on first pass\n", sock);
       resultConditionSet |= SOCKET_READABLE;
    }

    [...]

      if (FD_ISSET(sock, &readSet) && FD_ISSET(sock, &fReadSet)/*sanity check*/) {
         printf ("[SYNCHROBUG] Socket %d found readable on second pass\n", sock);
         resultConditionSet |= SOCKET_READABLE;
      }

And one to check whether we found any readable/writable/excepting socket at all:


if (handler == NULL) {
       fLastHandledSocketNum = -1;//because we didn't call a handler
       printf ("[SYNCHROBUG] No socket found at all\n");
    }


At first everything is OK:

[SYNCHROBUG] About to do the select, timeout 0.0
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] No socket found at all
[SYNCHROBUG] About to do the select, timeout 0.0
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] About to do the select, timeout 0.0
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] Socket 5 found readable on first pass
[SYNCHROBUG] About to do the select, timeout 0.0
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] Socket 5 found readable on second pass
[SYNCHROBUG] About to do the select, timeout 0.0
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] Socket 5 found readable on second pass
[SYNCHROBUG] About to do the select, timeout 0.0
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] Socket 5 found readable on second pass
[SYNCHROBUG] About to do the select, timeout 0.0
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] Socket 5 found readable on second pass
[SYNCHROBUG] About to do the select, timeout 0.0
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] Socket 5 found readable on second pass

(socket 5 must be the FIFO, I guess)


But then select() keeps randomly returning errno=11, aka EAGAIN or "Resource temporarily unavailable":

[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] Socket 5 found readable on second pass
[SYNCHROBUG] About to do the select, timeout 1.480311
[SYNCHROBUG] error is: Resource temporarily unavailable
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] Socket 6 found readable on first pass
[SYNCHROBUG] About to do the select, timeout 1.479309
[SYNCHROBUG] error is: Resource temporarily unavailable
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] Socket 5 found readable on second pass
[SYNCHROBUG] About to do the select, timeout 1.478362
[SYNCHROBUG] error is: Resource temporarily unavailable
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] Socket 6 found readable on first pass
[SYNCHROBUG] About to do the select, timeout 1.477358
[SYNCHROBUG] error is: Resource temporarily unavailable
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] Socket 5 found readable on second pass
[SYNCHROBUG] About to do the select, timeout 1.476431
[SYNCHROBUG] error is: Resource temporarily unavailable
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] Socket 6 found readable on first pass

Now, I don't even know why select() could return EAGAIN (a lot of people say it shouldn't at all, and even my "man 3 select" agrees: http://stackoverflow.com/questions/4193043/select-on-a-pipe-in-blocking-mode-returns-eagain ), but I see this case is handled in your code and ignored, just like the EINTR case:

    if (errno != EINTR && errno != EAGAIN) {
#endif
        // Unexpected error - treat this as fatal:
#if !defined(_WIN32_WCE)
        perror("BasicTaskScheduler::SingleStep(): select() fails");
#endif
        internalError();
      }

[if errno is EINTR or EAGAIN, then the scheduler goes on inspecting the select()'s returned sets].

Could that be the origin of my problems?

Since you obviously can't try my executables on my hardware, please tell me what else I could log. BTW, the RTSP/RTP client in this picture is openRTSP. Here's an ASCII schema :)

client program generating frames ----FIFO---> rtsp server based on live555 ----RTSP/RTP/TCP----> openRTSP

Thank you and best regards,

Cristiano Belloni.

