Hi Frank,
You could also use Hadoop with no reducer or with IdentityReducer, which
ensures data locality as long as you start task tracker on Cassandra nodes
where the data resides.
Concerning the difficulty to get tokens in a vnodes environment that's what
Hadoop core functions do. You can ha
Hi,
We are using Cassandra at Orange to manage a big sparse matrix on a cluster of
servers.
On this database we want to run a sparse matrix factorization algorithm.
We need to parallelize this matrix factorization algorithm, for instance by
computing the factorization model rows by rows.
S