That makes sense about them not being designed for that use case. I would 
add, though, that transducers could certainly be used in a parallel context 
*if* the current transducer implementations were abstracted such that you 
could pass internal state generator and modifier functions and use the 
correct ones in whichever context is appropriate (single-threaded 
read/write, sequentially multi-threaded read/write à la core.async, 
concurrently multi-threaded read/write à la core.reducers). In the case of 
`map-indexed`, the fact that its transducer uses a volatile as currently 
implemented is not part of the `map-indexed` "contract", if you will, and 
seems to me to be an implementation detail. One could just as easily write 
the transducer for `map-indexed` as below:

(defn map-indexed-transducer-base [f box-mutable inc-mutable]
  (fn [rf]
    (let [i (box-mutable -1)]
      (fn
        ([] (rf))
        ([result] (rf result))
        ([result input]
          (rf result (f (inc-mutable i) input)))))))

(defn map-indexed-transducer-single-threaded [f]
  (map-indexed-transducer-base f unsynchronized-mutable-long! 
#(unsynchronized-mutable-swap! 
% inc))

(defn map-indexed-transducer-sequentially-multi-threaded [f]
  (map-indexed-transducer-base f volatile! #(vswap! % inc))

(defn map-indexed-transducer-concurrently-multi-threaded [f]
  (map-indexed-transducer-base f atom #(swap! % inc)) ; or an AtomicLong 
variant


On Sunday, April 9, 2017 at 6:47:46 PM UTC-4, tbc++ wrote:
>
> Transducers were never designed to work in parallel context. So I'd define 
> any behavior that arises from using the same transducers in multiple 
> threads *at the same time*, as undefined behavior. 
>
> On Sun, Apr 9, 2017 at 4:39 PM, Alexander Gunnarson <
> [email protected] <javascript:>> wrote:
>
>> I should add that, as Timothy pointed out, if multiple threads mutate and 
>> read the value but only one ever does so at a time, as is the case in 
>> `core.async`, then a volatile is sufficient. My preliminary conclusions 
>> above about volatiles apply only to concurrent mutation via e.g. `fold` or 
>> the like.
>>
>> Also, regarding the locks you mentioned, Seth, I read up a little on the 
>> Java memory model here 
>> <http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html#synchronization>
>>  
>> and I can confirm that a lock is sufficient to provide *both* write *and* 
>> read thread-safety guarantees: 
>>
>> ... acquir[ing a] monitor ... has the effect of invalidating the local 
>>> processor cache so that variables will be reloaded from main memory. We 
>>> will then be able to see all of the writes made visible by the previous 
>>> release.
>>>
>>
>> `Volatile` only provides a subset of these read-safety guarantees, so a 
>> `volatile` in addition to a lock is indeed overkill, if that's what is 
>> happening.
>>
>> On Sunday, April 9, 2017 at 6:19:51 PM UTC-4, Alexander Gunnarson wrote:
>>>
>>> It looks that way to me too, Seth, though I'd have to comb over the 
>>> details of the locks implemented there to give a reasoned opinion of my 
>>> own. But yes, if that's the case, the volatile isn't adding anything.
>>>
>>> Anyway, I'm not trying to poke holes in the current implementation of 
>>> transducers — on the contrary, I'm very appreciative of and impressed by 
>>> the efforts the clojure.core (and core.async) contributors have made on 
>>> that and other fronts. Transducers are an extremely powerful and elegant 
>>> way to express code that would otherwise be a lot more complex and 
>>> difficult to reason about. I'm just trying to figure out where I can get 
>>> away with having unsynchronized mutable versions of stateful transducers 
>>> that currently use volatiles, and where I need even stronger measures of 
>>> thread safety than volatiles.
>>>
>>> To take these thoughts further, I did a simple test to compare the three 
>>> types of mutability we've been talking about (unsynchronized, volatile, and 
>>> atomic — I can reproduce the code here if you'd like) and the takeaway is 
>>> that `map-indexed` really does rely on atomic operations in a multithreaded 
>>> context, as each index depends on the previous index value. When doing a 
>>> `volatile`-based `map-indexed` in parallel on a small collection (8 
>>> elements), the `volatile` value stays consistent — that is, all the correct 
>>> indices are passed to the mapping function. However, over a sufficiently 
>>> large collection (100 elements, though it could happen on smaller scales 
>>> too), the `volatile` value starts to break down: duplicate index values are 
>>> passed to the mapping function and the highest index value only ever 
>>> reaches 97 at the maximum. The same phenomenon happens, of course, with the 
>>> unsynchronized-mutable-box-based `map-indexed`, though it happens at a 
>>> small scale too (calling the unsynchronized `map-indexed` on 8 elements 
>>> operated on by 2 threads produces only 7 unique indices).
>>>
>>> My preliminary conclusions are:
>>> - Unsynchronized mutability is fine in contexts known to be only 
>>> single-threaded, in which I could replace the `volatile` in `map-indexed` 
>>> and other transducers with unsynchronized mutable boxes.
>>> - Volatiles are good when all you want to do is set the value and have 
>>> multiple threads always read the most up-to-date value, without having to 
>>> depend on a previous value via e.g. `inc`.
>>> - Atomic boxes (`atom`, `AtomicLong`, etc.) are necessary when the 
>>> mutable value relies on the previous value via e.g. `inc`, as is the case 
>>> with `map-indexed`.
>>>
>>> My guess is that all this applies to e.g. the unsynchronized `ArrayList` 
>>> in `partition-by` as well, which might need to be a synchronized collection 
>>> or an immutable one boxed in an atom, but I haven't tested this.
>>>
>>> Would you agree with these conclusions, Seth and Timothy?
>>>
>>> On Sunday, April 9, 2017 at 1:56:38 PM UTC-4, Seth Verrinder wrote:
>>>>
>>>> I'll defer to Timothy on the particulars of core.async but it looks 
>>>> like [1] the transducer in channel is protected by a lock. If that's the 
>>>> case volatile isn't adding anything in terms memory barriers.
>>>>
>>>> 1: 
>>>> https://github.com/clojure/core.async/blob/master/src/main/clojure/clojure/core/async/impl/channels.clj#L71
>>>>
>>>> On Sunday, April 9, 2017 at 11:58:00 AM UTC-5, Alexander Gunnarson 
>>>> wrote:
>>>>>
>>>>> Thanks so much for your well-considered reply, Timothy! That makes 
>>>>> sense about volatiles being used in e.g. core.async or core.reducers 
>>>>> contexts where the reducing function that closes over the mutable value 
>>>>> of 
>>>>> the stateful transducer is called in different threads. Why, then, are 
>>>>> unsynchronized ArrayLists used e.g. in 'partition-by'? It's also closed 
>>>>> over by the reducing function in just the same way as the volatile long 
>>>>> value internal to e.g. 'map-indexed'. I'm not yet clear on how one (the 
>>>>> ArrayList) is acceptable being non-volatile and the other (the volatile 
>>>>> long) is unacceptable. When .add is called, an unsynchronized mutable 
>>>>> counter is updated so the ArrayList can insert the next value at the 
>>>>> correct index. Do you have any insight into this? Meanwhile I'll go do 
>>>>> some 
>>>>> digging myself on the Clojure JIRA etc. so I'm more informed on the 
>>>>> subject. 
>>>>
>>>> -- 
>> You received this message because you are subscribed to the Google
>> Groups "Clojure" group.
>> To post to this group, send email to [email protected] 
>> <javascript:>
>> Note that posts from new members are moderated - please be patient with 
>> your first post.
>> To unsubscribe from this group, send email to
>> [email protected] <javascript:>
>> For more options, visit this group at
>> http://groups.google.com/group/clojure?hl=en
>> --- 
>> You received this message because you are subscribed to the Google Groups 
>> "Clojure" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected] <javascript:>.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> -- 
> “One of the main causes of the fall of the Roman Empire was that–lacking 
> zero–they had no way to indicate successful termination of their C 
> programs.”
> (Robert Firth) 
>

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to [email protected]
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Reply via email to