I'm not sure this is the problem that you're seeing, but I see a problem with the example. It boils down to the fact that futures do not provide concurrency.
That may sound like a surprising claim, because the whole point of futures is to run multiple things at a time. But futures merely offer best-effort parallelism; they do not provide any guarantee of concurrency. As a consequence, trying to treat an fsemaphore as a lock can go wrong: if a future manages to take an fsemaphore lock, but the future is not demanded by the main thread --- or by a chain of future demands that is demanded by the main thread --- then nothing obliges the future to continue running, and it can hold the lock forever.

(I put the blame on fsemaphores. Adding fsemaphores to the future system was something like adding mutation to a purely functional language: the addition makes certain things possible, but it also breaks the local reasoning that the original design was supposed to enable.)

In your example program, I see

  (define workers (do-start-workers))
  (displayln "started")
  (for ((i 10000))
    (mfqueue-enqueue! mfq 1))

where `do-start-workers` creates a chain of futures, but there's no `touch` on the root future while the loop calls `mfqueue-enqueue!`. Therefore, the loop can block on an fsemaphore because some future has taken the lock but stopped running for whatever reason.

In this case, adding

  (thread (lambda () (touch workers)))

after "started" and before the loop might fix the example. In other words, you can use the `thread` concurrency construct in combination with the `future` parallelism construct to ensure progress. (A self-contained sketch of this pattern appears at the end of this message.)

I think this will work because all futures in the program end up in a linear dependency chain. If there were a tree of dependencies, then I think you'd need a `thread` for each `future` to make sure that every future has an active demand.

If you're seeing a deadlock at the `(touch workers)`, though, my explanation doesn't cover what you're seeing. I haven't managed to trigger the deadlock myself.
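To make the failure mode and the `thread`-plus-`touch` fix concrete, here is a minimal, self-contained sketch. It is not your program: the names `lock` and `worker` are invented, and a `printf` (which is not future-safe) stands in for whatever might cause a future to stop running.

  #lang racket

  (define lock (make-fsemaphore 1))

  (define worker
    (future
     (lambda ()
       (fsemaphore-wait lock)            ; the future takes the lock...
       (printf "worker has the lock\n")  ; ...then suspends here, because
                                         ; `printf` is not future-safe
       (fsemaphore-post lock))))

  ;; Without this `thread`, nothing demands `worker`, so nothing
  ;; obliges it to resume and post the lock, and the `fsemaphore-wait`
  ;; below can block forever:
  (thread (lambda () (touch worker)))

  (fsemaphore-wait lock)
  (displayln "main thread got the lock")
  (fsemaphore-post lock)
  (touch worker)                         ; wait for the worker before exiting

Commenting out the `(thread ...)` line turns this into a potential deadlock: the future can take the lock, suspend at the `printf`, and then never run again, leaving the main thread stuck.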
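And for the tree-of-dependencies case, something like the following hypothetical helpers (the names `demand!` and `future/demanded` are mine) would pair every `future` with a demanding `thread`:

  ;; Attach a Racket thread that demands the future's result, so the
  ;; future always has an active demand:
  (define (demand! f)
    (thread (lambda () (touch f)))
    f)

  ;; Create a future that is demanded from the start:
  (define (future/demanded thunk)
    (demand! (future thunk)))

Creating every future in the tree through `future/demanded` would ensure that each one has a thread insisting on its result, so no future can stall indefinitely while holding an fsemaphore.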
At Sat, 23 May 2020 18:51:23 +0200, Dominik Pantůček wrote:
> Hello again with futures!
>
> I started working on futures-based workers and quickly got stuck with a
> deadlock that I think does not originate in my code (although it is two
> semaphores and 8 futures, so I'll refrain from strong opinions here).
>
> I implemented a very simple futures-friendly queue using mutable pairs
> and created a minimal deadlocking example[1]. I am running Racket 3m
> 7.7.0.4, which includes fixes for the futures-related bugs I discovered
> recently.
>
> Sometimes the code just runs fine and shows the numbers of worker
> iterations performed in different futures (as traced by the 'fid'
> argument). But sometimes it locks up in a state where there is one last
> number in the queue (0 - zero) and yet fsemaphore-count for the count
> fsemaphore returns 0, which means the semaphore was decremented twice
> somewhere. The code is really VERY simple, and I do not see a race
> condition within the code that would allow any code path to decrement
> the fsema-count fsemaphore twice once the worker future receives 0.
>
> I am able to reproduce the behavior with racket3m running under gdb and
> get the stack traces for all the threads pretty consistently. The
> deadlock is apparently at:
>
>   2 Thread 0x7ffff7fca700 (LWP 46368) "mfqueue.rkt"
>       futex_wait_cancelable (private=<optimized out>, expected=0,
>       futex_word=0x5555559d8e78) at ../sysdeps/nptl/futex-internal.h:183
>
> But that is just where the issue shows up. The real question is how the
> counter gets decremented twice (given that fsemaphores should be
> futures-safe).
>
> Any hints would be VERY appreciated!
>
> Cheers,
> Dominik
>
> [1] http://pasterack.org/pastes/28883

