On Tuesday, January 15, 2013 12:36:04 AM UTC+1, Jim foo.bar wrote:
> On 14/01/13 22:15, Marko Topolnik wrote:
>
> But... you were quite clear in stating that your function isn't lazy, and
> you were right to say that: *doall* will not return until *all* the tasks
> have completed. Maybe you did want laziness, after all?
>
>
> I'm not entirely sure what you mean here...The only real reason I am using
> 'doall' is so that I can shutdown the ExecutorService, otherwise I wouldn't
> mind returning a lazy-seq. In fact that is how I originally had it - I was
> never shutting down the pool until recently! Inside the doall however I'm
> polling the pool for answers and *it* looks for finished jobs...I'm not
> deref-ing the futures in a serial manner which has higher chances of
> bocking...Is my thinking not correct?
>
The order in which you are polling is not very relevant given the fact that
*doall* won't return until *all* futures are realized. It's just an
internal detail. As for shutting down the executor service, it is safe to
do so immediatly after all tasks have been submitted:
(defn pool-map [f coll]
(let [exec (Executors/newFixedThreadPool (.availableProcessors
(Runtime/getRuntime)))]
(->> (try (.invokeAll exec (for [x coll] #(f x))) (finally (.shutdown
exec)))
(map #(.get ^FutureTask %)))))
> Yet another issue with your function is that it doesn't submit all the
> tasks up front so it clearly won't prevent longer tasks from delaying the
> shorter ones. But, even if they are submitted all at once, like in my
> version, the executor service will still enqueue them internally, with the
> same end result. You'd need an unbounded thread pool, and that would
> obviously bring in other issues.
>
>
> You're quite right here...Again using 'for' was a remnant from the old
> version...Perhaps 'mapv' would be a better choice for submitting the
> jobs...In any case, is this executor-based approach (with or without the
> completion service) encouraged? Would you care if tomorrow pmap started
> using this approach?
>
Using *mapv* (or simply *doall*, same thing) won't solve the root problem
of tasks waiting in executor's own queue.
Taking into account all that was said, *pool-map* can't offer much more
than *pmap*. You can't know which tasks will take less time until they are
already done. It is theoretically impossible to pre-order them according to
execution time, thereby harvesting the results of the fastest ones earlier,
eventually promoting total concurrency. The way to go is grinding the tasks
down to chunks of predictable runtime and submitting them to a fork/join
solver, which will optimize concurrency through work-stealing.
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to [email protected]
Note that posts from new members are moderated - please be patient with your
first post.
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en