We did actually discuss doing something like this a long time ago. The worry that comes to mind is whether it should actually be forbidden (an invariant) or merely strongly frowned upon. It is possible to arrange a situation where you know (based on other knowledge) that a put on a channel will succeed. Should that be allowed? I'd have to hammock on that one for a while.
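
To make the forbidden-versus-frowned-upon question concrete, here is a minimal sketch of a guard that takes the policy as a parameter. It is only an illustration, not anything that exists in core.async: `dispatch-thread?`, `guard-blocking`, and `call-on-thread` are hypothetical names, the check relies on the "async-dispatch-" thread-name prefix mentioned earlier in this thread, and `inc` stands in for a blocking op so the snippet runs without core.async on the classpath.

```clojure
;; Hypothetical helpers; none of these names exist in core.async.

(defn dispatch-thread?
  "True when the current thread's name starts with async-dispatch-,
  the prefix the go-block pool threads are reported to use."
  []
  (.startsWith (.getName (Thread/currentThread)) "async-dispatch-"))

(defn guard-blocking
  "Wraps f, a blocking operation called op-name. With policy :throw the
  wrapper rejects calls made on a dispatch thread (the invariant reading);
  with :warn it prints to *err* and still makes the call (the
  frowned-upon reading). Any other policy makes `case` throw."
  [policy op-name f]
  (fn [& args]
    (when (dispatch-thread?)
      (let [msg (str op-name " called on a go-block dispatch thread")]
        (case policy
          :throw (throw (IllegalStateException. msg))
          :warn  (binding [*out* *err*] (println "WARNING:" msg)))))
    (apply f args)))

(defn call-on-thread
  "Runs thunk on a new thread with the given name and returns its value,
  or the Throwable it threw. Only here to simulate a dispatch thread."
  [^String thread-name thunk]
  (let [result (promise)
        ^Runnable work (fn []
                         (deliver result (try (thunk)
                                              (catch Throwable e e))))]
    (.start (Thread. work thread-name))
    @result))

;; Usage: `inc` is a stand-in for something like >!!.
(def guarded-inc (guard-blocking :throw "inc" inc))
(guarded-inc 1)                                        ; ordinary thread, call goes through
(call-on-thread "async-dispatch-99" #(guarded-inc 1)) ; yields the IllegalStateException
```

Passing the policy as an argument keeps the decision with whoever installs the guard, so a dev environment could choose :throw while a case where the put is known to succeed could settle for :warn.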

On Tuesday, August 29, 2017 at 3:19:51 PM UTC-5, Aaron Iba wrote:
>
> Ahh that makes a lot of sense. Indeed, I'm guilty of doing a blocking >!!
> inside a go-block. I was so careful to avoid other kinds of blocking calls
> (like IO) that I forgot that blocking variants of core.async calls
> themselves were forbidden.
>
> Thank you for pointing this out! I will rewire things to not do this.
>
> Per Gary's suggestion, I also think it'd be useful if core.async blocking
> ops checked a dynamic var (or a property of the thread itself) and at least
> warned if they are being called from a forbidden context. To resolve my
> original issue, I'm considering doing this in my dev environment:
>
> (doseq [v '[<!! >!!]]
>   (alter-var-root (ns-resolve 'clojure.core.async v)
>     (fn [f]
>       (fn [& args]
>         (if (.startsWith (.getName (Thread/currentThread))
>                          "async-dispatch-")
>           (throw (Exception. (str v " called inside async-dispatch")))
>           (apply f args))))))
>
>
> On Tuesday, August 29, 2017 at 1:43:53 PM UTC-4, Gary Trakhman wrote:
>>
>> Hm, I came across a similar ordering invariant (No code called by a go
>> block should ever call the blocking variants of core.async functions) while
>> wrapping an imperative API, and I thought it might be useful to use
>> vars/binding to enforce it. Has this or other approaches been considered
>> in core.async? I could see a *fixed-thread-pool* var being set and >!!
>> checking for false.
>>
>> An analogy in existing clojure.core would be the STM commute's 'must be
>> running in a transaction' check that uses a threadlocal.
>> https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/LockingTransaction.java#L205
>>
>> On Tue, Aug 29, 2017 at 1:30 PM Timothy Baldridge <[email protected]>
>> wrote:
>>
>>> To add to what Alex said, look at this trace:
>>> https://gist.github.com/anonymous/65049ffdd37d43df8f23630928e8fed0#file-thread-dump-out-L1337-L1372
>>>
>>> Here we see a go block calling mapcat, and inside the inner map
>>> something is calling >!!. As Alex mentioned this can be a source of
>>> deadlocks. No code called by a go block should ever call the blocking
>>> variants of core.async functions (<!!, >!!, alts!!, etc.). So I'd start at
>>> the code redacted in those lines and go from there.
>>>
>>>
>>> On Tue, Aug 29, 2017 at 11:09 AM, Alex Miller <[email protected]>
>>> wrote:
>>>
>>>> go blocks are multiplexed over a thread pool which has (by default) 8
>>>> threads. You should never perform any kind of blocking activity inside a go
>>>> block, because if every go block in work happens to end up blocked, you
>>>> will prevent all go blocks from making any further progress. It sounds to
>>>> me like that's what has happened here. The go block threads are named
>>>> "async-dispatch-<n>" and it looks like there are 8 blocked ones in your
>>>> thread dump.
>>>>
>>>> It also looks like they are all blocking on a >!!, which is a blocking
>>>> call. So I would look for a go block that contains a >!! and convert that
>>>> to a >! or do something else to avoid blocking there.
>>>>
>>>>
>>>> On Tuesday, August 29, 2017 at 11:48:25 AM UTC-5, Aaron Iba wrote:
>>>>>
>>>>> My company has a production system that uses core.async extensively.
>>>>> We've been running it 24/7 for over a year with occasional restarts to
>>>>> update things and add features, and so far core.async has been working
>>>>> great.
>>>>>
>>>>> The other day, during a particularly high workload, the whole system
>>>>> got locked up. All the channels seemed blocked at once.
>>>>> I was able to connect with a REPL and poke around, and noticed strange
>>>>> behavior of core.async. Specifically, the following code, when evaluated
>>>>> in the REPL, blocked on the put (third expression):
>>>>>
>>>>> (def c (async/chan))
>>>>> (go-loop []
>>>>>   (when-some [x (<! c)]
>>>>>     (println x)
>>>>>     (recur)))
>>>>> (>!! c true)
>>>>>
>>>>> Whereas on any fresh system, the above expressions obviously succeed.
>>>>>
>>>>> Puts succeeded if they went onto the channel's buffer, but not when
>>>>> they should go through to a consumer. For example with the following
>>>>> expressions, evaluated in the REPL, the first put succeeded (presumably
>>>>> because it went on the buffer), but subsequent puts blocked:
>>>>>
>>>>> (def c (async/chan 1))
>>>>> (def m (async/mult c))
>>>>> (def out (async/chan (async/sliding-buffer 3)))
>>>>> (async/tap m out)
>>>>> (>!! c true) ;; succeeds
>>>>> (>!! c true) ;; blocks forever
>>>>>
>>>>> This leads me to wonder if core.async itself somehow got into a bad
>>>>> state. It's entirely possible I caused this by misusing the API somewhere
>>>>> in the codebase, but we use core.async so extensively that I wouldn't know
>>>>> where to begin looking.
>>>>>
>>>>> I'm wondering if someone more familiar with core.async internals has
>>>>> an idea about what could cause the above situation. Or if we notice it
>>>>> happening again, what could I do to gather more helpful information.
>>>>>
>>>>> I also have a redacted thread dump, in case it's useful:
>>>>>
>>>>> https://gist.github.com/anonymous/65049ffdd37d43df8f23630928e8fed0
>>>>>
>>>>> Any help would be much appreciated,
>>>>>
>>>>> Aaron
>>>>>
>>>>> P.S. core.async has been a godsend in terms of helping us structure
>>>>> and modularize our large system. Thank you to all those who contributed to
>>>>> this wonderful library!
>>>>>
>>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "Clojure" group.
>>>> To post to this group, send email to [email protected]
>>>> Note that posts from new members are moderated - please be patient with
>>>> your first post.
>>>> To unsubscribe from this group, send email to
>>>> [email protected]
>>>> For more options, visit this group at
>>>> http://groups.google.com/group/clojure?hl=en
>>>> ---
>>>> You received this message because you are subscribed to the Google
>>>> Groups "Clojure" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>>> --
>>> “One of the main causes of the fall of the Roman Empire was that–lacking
>>> zero–they had no way to indicate successful termination of their C
>>> programs.”
>>> (Robert Firth)
