Re: Transducers: sequence versus eduction

vvedee Wed, 01 Apr 2015 01:38:54 -0700

An eduction retains the ability to be composed with further transducers
applied higher up the call chain. The following two are nearly equivalent:
(transduce (take 1e2) + (eduction (filter odd?) (range)))
(transduce (comp (filter odd?) (take 1e2)) + (range))


This will be slower:
(transduce (take 1e2) + (sequence (filter odd?) (range)))

Mean execution times, in the order the three expressions appear above:
Execution time mean : 19.054407 µs
Execution time mean : 19.530890 µs
Execution time mean : 39.955692 µs
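
As a quick sanity check that the three forms really compute the same
thing, here is a minimal REPL sketch. The var names are made up for
illustration, and I use the integer 100 instead of 1e2; it only compares
results, not timings:

```clojure
;; Sum of the first 100 odd numbers, computed three ways.
;; (Names via-eduction / via-comp / via-sequence are hypothetical.)
(def via-eduction
  (transduce (take 100) + (eduction (filter odd?) (range))))

(def via-comp
  (transduce (comp (filter odd?) (take 100)) + (range)))

(def via-sequence
  (transduce (take 100) + (sequence (filter odd?) (range))))

;; The sum of the first n odd numbers is n^2, so each one is 10000.
(println via-eduction via-comp via-sequence)
```

All three terminate on the infinite (range) because take stops the
reduction early.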

I also wonder to what extent eduction can be used as a drop-in
replacement for sequence.
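
One place where the two are clearly not interchangeable is caching. The
sketch below counts how often the mapping function runs when the result
is reduced twice. The counter and the helper counting-inc are
hypothetical names introduced only for this illustration:

```clojure
;; Count how many times the mapping function is actually invoked.
(def calls (atom 0))
(defn counting-inc [x] (swap! calls inc) (inc x))

;; An eduction re-runs the transformation on every reduction.
(def ed (eduction (map counting-inc) [1 2 3]))
(def ed-sums [(reduce + 0 ed) (reduce + 0 ed)])
(def ed-calls @calls)

;; A sequence caches realized elements, so the second pass is free.
(reset! calls 0)
(def sq (sequence (map counting-inc) [1 2 3]))
(def sq-sums [(reduce + 0 sq) (reduce + 0 sq)])
(def sq-calls @calls)

;; The sequence is a seq; the eduction is sequential but not a seq.
(def type-checks [(seq? sq) (seq? ed) (sequential? ed)])

(println ed-sums ed-calls sq-sums sq-calls type-checks)
```

So an eduction is a reasonable replacement when the result is consumed
once, and a sequence is the safer choice when the same result is
traversed repeatedly or when callers expect a real seq.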


On Wednesday, April 1, 2015 at 9:13:07 AM UTC+2, Tassilo Horn wrote:
>
> Hi all, 
>
> I've switched many nested filter/map/mapcat applications in my code to 
> using transducers.  That brought a moderate speedup in certain cases 
> and the deeper the nesting has been before, the clearer the transducers 
> code is in comparison, so yay! :-) 
>
> However, I'm still quite unsure about the difference between `sequence` 
> and `eduction`.  From the docs and experimentation, I came to the 
> assumptions below and I'd be grateful if someone with more knowledge 
> could verify/falsify/add: 
>
>   - Return types differ: Sequence returns a standard lazy seq, eductions 
>     an instance of Eduction. 
>
>   - Eductions are reducible/seqable/iterable, i.e., basically I can use 
>     them wherever a (lazy) seq would also do, so sequence and eduction 
>     are quite interchangeable except when poking at internals, e.g., 
>     (.contains (sequence ...) x) works whereas (.contains (eduction ...) 
>     x) doesn't. 
>
>   - Both compute their contents lazily. 
>
>   - Lazy seqs cache their already realized contents, eductions compute 
>     them over and over again on each iteration. 
>
> Because of that, I came to the conclusion that whenever I ask myself if 
> one of my functions should return a lazy seq or an eduction, I should 
> use these rules: 
>
>   1. If the function is likely to be used like 
>
>      (let [xs (seq-producing-fn args)] 
>        (or (do-stuff-with xs) 
>            (do-other-stuff-with xs) 
>            ...)) 
>
>      that is, the resulting seq is likely to be bound to a variable 
>      which is then used multiple times (and thus lazy seq caching is 
>      beneficial), then use sequence. 
>
>   2. If it is a private function only used internally and never with the 
>      usage pattern of point 1, then definitely use eduction. 
>
>   3. If it's a public function which usually isn't used with a pattern as 
>      in point 1, then I'm unsure.  eduction is probably more efficient, 
>      but sequence fits better into the original "almost everything 
>      returns a lazy seq" design.  Also, the latter has the benefit that 
>      users of the library don't need to know anything about transducers. 
>
> Is that sensible?  Or am I completely wrong with my assumptions about 
> sequence and eduction? 
>
> On a related note, could someone please clarify the statement from the 
> transducers docs for `sequence`? 
>
> ,----[ Docs of sequence at http://clojure.org/transducers ] 
> | The resulting sequence elements are incrementally computed. These 
> | sequences will consume input incrementally as needed and fully realize 
> | intermediate operations.  This behavior differs from the equivalent 
> | operations on lazy sequences. 
> `---- 
>
> I'm curious about the "fully realize intermediate operations" part. 
> Does it mean that in a "traditional" 
>
>     (mapcat #(range %) (range 10000)) 
>
> the inner range is also evaluated lazily but with 
>
>     (sequence (mapcat #(range %)) (range 10000)) 
>
> it is not?  It seems so.  At least dorun-ning these two expressions 
> shows that the "traditional" version is more than twice as fast as the 
> transducer version.  Also, the same seems to hold for 
>
>     (eduction (mapcat #(range %)) (range 10000)) 
>
> which is exactly as fast (or rather slow) as the sequence version. 
>
> But wouldn't that mean that using transducers with mapcat, where the 
> mapcatted function isn't super-cheap, is a bad idea in general, at 
> least from a performance POV? 
>
> Bye, 
> Tassilo 
>
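
On the "fully realize intermediate operations" question quoted above:
here is a small sketch that makes the difference observable. It counts
how many inner elements get realized when only the first element of the
result is requested. The atom realized and the helper noisy-range are
hypothetical names for this illustration, and the exact number realized
by the lazy version depends on chunking, which is an implementation
detail:

```clojure
;; Count how many elements of the inner collection are realized.
(def realized (atom 0))
(defn noisy-range [n]
  (map (fn [x] (swap! realized inc) x) (range n)))

;; Lazy-seq mapcat: asking for the first element realizes only the
;; first chunk of the inner seq, not all 1000 elements.
(def lazy-first (first (mapcat noisy-range [1000])))
(def lazy-realized @realized)

;; Transducer mapcat: each inner collection is reduced to the end
;; as soon as its input element is processed.
(reset! realized 0)
(def xf-first (first (sequence (mapcat noisy-range) [1000])))
(def xf-realized @realized)

(println lazy-first lazy-realized xf-first xf-realized)
```

The outer input is still consumed incrementally; it is each expanded
inner collection that is processed in one go. That also means an
infinite inner collection would never return with the transducer form.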

