When you say "query 1 million records", in my mind I'm saying "dump 1 million
records to another system as a back-office job".
Hadoop will split the job over multiple nodes and will assign a task to read
the range "owned" by each node. From memory, it uses CL ONE (by default) for the
read, so the n…
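For anyone following along, here is a minimal sketch of that wiring, using the ColumnFamilyInputFormat and ConfigHelper that ship with Cassandra 1.x. The keyspace, column family, and addresses below are placeholders, and the shape follows the word_count example in the Cassandra source tree:

    import java.nio.ByteBuffer;
    import java.util.SortedMap;

    import org.apache.cassandra.db.IColumn;
    import org.apache.cassandra.hadoop.ColumnFamilyInputFormat;
    import org.apache.cassandra.hadoop.ConfigHelper;
    import org.apache.cassandra.thrift.SlicePredicate;
    import org.apache.cassandra.thrift.SliceRange;
    import org.apache.cassandra.utils.ByteBufferUtil;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;

    public class RowCountJob {

        // Each mapper is fed rows from the token range "owned" by one node,
        // so the scan is spread across the cluster and reads stay local.
        public static class RowCountMapper
                extends Mapper<ByteBuffer, SortedMap<ByteBuffer, IColumn>, Text, LongWritable> {
            @Override
            protected void map(ByteBuffer key, SortedMap<ByteBuffer, IColumn> columns,
                               Context context) throws java.io.IOException, InterruptedException {
                context.write(new Text("rows"), new LongWritable(1));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = new Job(new Configuration(), "rowcount");
            job.setJarByClass(RowCountJob.class);
            job.setMapperClass(RowCountMapper.class);
            job.setInputFormatClass(ColumnFamilyInputFormat.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(LongWritable.class);

            // Placeholder keyspace/CF and contact point -- adjust for your cluster.
            ConfigHelper.setInputColumnFamily(job.getConfiguration(), "MyKeyspace", "MyCF");
            ConfigHelper.setInputInitialAddress(job.getConfiguration(), "127.0.0.1");
            ConfigHelper.setInputRpcPort(job.getConfiguration(), "9160");
            ConfigHelper.setInputPartitioner(job.getConfiguration(),
                    "org.apache.cassandra.dht.RandomPartitioner");

            // Pull only a bounded slice of columns per row; reads default to
            // CL ONE, as Aaron describes above.
            SlicePredicate predicate = new SlicePredicate().setSlice_range(
                    new SliceRange(ByteBufferUtil.EMPTY_BYTE_BUFFER,
                            ByteBufferUtil.EMPTY_BYTE_BUFFER, false, 10));
            ConfigHelper.setInputSlicePredicate(job.getConfiguration(), predicate);

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }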
Hi Alexandru,
Things got hectic and I put off the project until this weekend. I'm
actually learning about Hadoop right now and how to implement it. I can
respond to this thread when I have something running.
In the meantime, I'd like to bump this email up and see if there are others
who can provi…
Hi Aaron and Martin,
Sorry about my previous reply; I thought you wanted to process only the
row keys in the CF.
I have a similar issue to Martin's, because I see myself being forced to hit
more than a million rows with a query (I only get a few columns from every
row). Aaron, we've talked about thi…
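For what it's worth, a rough sketch of the non-Hadoop alternative: paging through the rows over Thrift with get_range_slices, feeding each page's last key back in as the next start key. The keyspace, column family, and page size below are placeholders:

    import java.util.List;

    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.cassandra.thrift.ColumnParent;
    import org.apache.cassandra.thrift.ConsistencyLevel;
    import org.apache.cassandra.thrift.KeyRange;
    import org.apache.cassandra.thrift.KeySlice;
    import org.apache.cassandra.thrift.SlicePredicate;
    import org.apache.cassandra.thrift.SliceRange;
    import org.apache.cassandra.utils.ByteBufferUtil;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TFramedTransport;
    import org.apache.thrift.transport.TSocket;

    public class RangeScan {
        private static final int PAGE_SIZE = 1000;

        public static void main(String[] args) throws Exception {
            TFramedTransport transport = new TFramedTransport(new TSocket("127.0.0.1", 9160));
            Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
            transport.open();
            client.set_keyspace("MyKeyspace");

            ColumnParent parent = new ColumnParent("MyCF");
            // Only the first few columns of each row, per the description above.
            SlicePredicate predicate = new SlicePredicate().setSlice_range(
                    new SliceRange(ByteBufferUtil.EMPTY_BYTE_BUFFER,
                            ByteBufferUtil.EMPTY_BYTE_BUFFER, false, 3));

            KeyRange range = new KeyRange(PAGE_SIZE)
                    .setStart_key(ByteBufferUtil.EMPTY_BYTE_BUFFER)
                    .setEnd_key(ByteBufferUtil.EMPTY_BYTE_BUFFER);

            while (true) {
                List<KeySlice> page =
                        client.get_range_slices(parent, predicate, range, ConsistencyLevel.ONE);
                for (KeySlice slice : page) {
                    // process slice.getKey() / slice.getColumns() here
                }
                if (page.size() < PAGE_SIZE)
                    break; // last page
                // The start key is inclusive, so the first row of the next page
                // repeats the last row of this one and should be skipped.
                range.setStart_key(page.get(page.size() - 1).getKey());
            }
            transport.close();
        }
    }

With the RandomPartitioner the rows come back in token order rather than key order, so this enumerates everything, just not sorted by key.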
If you want to process 1 million rows, use Hadoop with Hive or Pig. If you use
Hadoop, you are not doing things in real time.
You may need to rephrase the problem.
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 14/02/2012, at 11:00 AM, Martin Arrowsmith wrote:
Hey Martin,
Have you tried the CQL query "SELECT FIRST 0 * FROM cfName"?
Cheers,
Alex
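In case a concrete snippet helps, here is a rough sketch of running that statement through the Thrift execute_cql_query call (keyspace and host are placeholders). The FIRST 0 clause asks for zero columns per row, so, as Alex suggests, only the row keys come back:

    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.cassandra.thrift.Compression;
    import org.apache.cassandra.thrift.CqlResult;
    import org.apache.cassandra.thrift.CqlRow;
    import org.apache.cassandra.utils.ByteBufferUtil;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TFramedTransport;
    import org.apache.thrift.transport.TSocket;

    public class KeysOnly {
        public static void main(String[] args) throws Exception {
            TFramedTransport transport = new TFramedTransport(new TSocket("127.0.0.1", 9160));
            Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
            transport.open();
            client.set_keyspace("MyKeyspace");

            // FIRST 0 limits each row to zero columns: keys only.
            CqlResult result = client.execute_cql_query(
                    ByteBufferUtil.bytes("SELECT FIRST 0 * FROM cfName"),
                    Compression.NONE);

            for (CqlRow row : result.getRows()) {
                System.out.println(ByteBufferUtil.string(row.bufferForKey()));
            }
            transport.close();
        }
    }

One caveat: if I remember right, CQL applies a default LIMIT of 10,000 rows, so a scan of a million keys would still have to be issued in pages.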
On Mon, Feb 13, 2012 at 11:00 PM, Martin Arrowsmith <
arrowsmith.mar...@gmail.com> wrote:
> Hi Experts,
>
> My program is such that it queries all keys on Cassandra. I want to do
> this as quickly as possible, in order to get as close to real-time as possible.
> …
Hi Experts,
My program is such that it queries all keys on Cassandra. I want to do this
as quickly as possible, in order to get as close to real-time as possible.
One solution I heard was to use the sstable2json tool, and read the data
in as JSON. I understand that reading from each line in Cassan…
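For reference, sstable2json is run against a data file on a node's disk, along these lines (the path below is a placeholder; you would want to flush with nodetool first so the SSTables are current):

    bin/sstable2json /var/lib/cassandra/data/MyKeyspace/MyCF-hc-1-Data.db > mycf.json

Keep in mind it only sees the flushed, on-disk data of that one node, so a full dump means running it on every node, once per SSTable, and merging the output.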