Just a quick ping to share that I've kept playing with this PipeGraph toy.
The following example reflects its current state.
* As you can see, scikit-learn models can be used as steps in the nodes of
the graph simply by naming them under the 'step' key, for example:
'Gaussian_Mixture':
{'step': GaussianMixture,
There are two cases: n_jobs > 1 works when the data is smaller, i.e. when the
training docs NumPy array is 15MB. It does not work when the training matrix
is 100MB. My Mac has 16GB RAM.
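
For reference, the two sizes quoted above can be reproduced from an array's shape and element width. This is only a minimal sketch, assuming a dense float64 matrix (8 bytes per element). The shapes are hypothetical and chosen only so the totals come out to 15MB and 100MB, and `dense_size_mb` is an illustrative helper, not part of any library. For a real NumPy array, `arr.nbytes / 1e6` gives the same figure.

```python
from math import prod


def dense_size_mb(shape, itemsize=8):
    """Return the size in MB (10**6 bytes) of a dense array.

    shape    -- tuple of dimensions, e.g. (n_samples, n_features)
    itemsize -- bytes per element; 8 assumes float64
    """
    return prod(shape) * itemsize / 1e6


# Hypothetical shapes, picked only to match the sizes mentioned above.
small = dense_size_mb((15_000, 125))    # the case where n_jobs > 1 works
large = dense_size_mb((100_000, 125))   # the case where it does not

print(small, large)  # prints "15.0 100.0"
```

Reporting the shape and dtype along with the size makes it easier for others to try the same setup.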
In the second case, the jobs die out pretty quickly, in seconds, and the
main Python process seems to die out (min CPU usag