> but i'll have to make 5 times as many requests to the database
5 times a small number can be less than 1 big number :)
see http://wiki.apache.org/cassandra/HadoopSupport
It's also covered in the O'Reilly Cassandra book; however, that book is somewhat
out of date.
also search for posts from Jere
We'll try doing multithreaded requests today or tomorrow.
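For reference, the multithreaded idea could look roughly like the sketch below. It is only an illustration under assumptions: `fetch_rows_concurrently` and `fetch_row` are hypothetical names, and `fetch_row` stands in for whatever client call reads one row (it must be safe to call from several threads at once).

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_rows_concurrently(row_keys, fetch_row, max_workers=4):
    """Read several rows in parallel and return {row_key: row_data}.

    fetch_row is a hypothetical callable (an assumption, not a client API)
    that takes one row key and returns that row's data.
    """
    keys = list(row_keys)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input keys.
        results = pool.map(fetch_row, keys)
        return dict(zip(keys, results))
```

Threads only help while the client is waiting on the network; time spent building column objects in Python itself will not be parallelized this way.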
As for tuning down the number of supercolumns per slice, I tried doing
that, but I've noticed that the time was decreasing linearly with the
length of the slice. So, grabbing 1000 per slice would take 1/5 as long as
5000, but i'll have to make
Here's a test I did a while ago about creating column objects in Python:
http://www.mail-archive.com/user@cassandra.apache.org/msg06729.html
As Tyler said, the best approach is to limit the size of the slices.
If you are trying to load 125K super columns with 25 columns each, you are
asking fo
Hi Paolo,
Thanks for the hint - JNA indeed wasn't installed. However, now that
cassandra is actually using it, there doesn't seem to be any change in
terms of speed - still 7 seconds with pycassa.
On Thu, Apr 19, 2012 at 12:14 AM, Paolo Bernardi wrote:
> Look into your Cassandra's logs to see i
Look into your Cassandra's logs to see if JNA is really enabled (it
really should be, by default), and more importantly if JNA is loaded
correctly. You might find some surprising message over there: if this
is the case, just install JNA with your distro's package manager and,
if it still doesn't work,
Hi Tyler and Aaron,
Thanks for your replies.
Tyler,
fetching scs using your pycassa script on our server takes ~7 s -
consistent with the times we've been seeing. Now, we aren't really experts
in Cassandra, but it seems that JNA is enabled by default for Cassandra >
1.0 according to Jeremy (
http
I tested this out with a small pycassa script:
https://gist.github.com/2418598
On my not-very-impressive laptop, I can read 5000 of the super columns in 3
seconds (cold) or 1.5 (warm). Reading in batches of 1000 super columns at
a time gives much better performance; I definitely recommend going w
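The batching approach can be sketched without tying it to a particular client. This is not the script from the gist above, just a minimal illustration: `iter_row_in_batches`, `fetch` and `make_fake_fetch` are hypothetical names, and `fetch(start, count)` is assumed to return up to `count` `(name, value)` pairs in sorted order, starting at `start` inclusive (an empty string meaning the beginning of the row).

```python
def iter_row_in_batches(fetch, batch_size=1000):
    """Yield every (name, value) pair of one wide row, one slice at a time.

    fetch(start, count) is a hypothetical stand-in for the client's slice
    call; it is assumed to return a sorted list with start inclusive.
    """
    start = ""
    first = True
    while True:
        # Later pages start at the last name already seen, so request one
        # extra item and drop the duplicate.
        count = batch_size if first else batch_size + 1
        page = fetch(start, count)
        if not first:
            page = page[1:]
        if not page:
            return
        for item in page:
            yield item
        if len(page) < batch_size:
            return  # a short page means the row is exhausted
        start = page[-1][0]
        first = False


def make_fake_fetch(row):
    """In-memory stand-in for a real slice call, for trying the loop out."""
    names = sorted(row)

    def fetch(start, count):
        selected = [n for n in names if n >= start][:count]
        return [(n, row[n]) for n in selected]

    return fetch
```

With a real client, `fetch` would wrap the slice call and pass `start` and `count` through as the slice start and the column count. Each batch can also be written out as soon as it arrives, so the whole row never has to sit in memory at once.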
On Wed, Apr 18, 2012 at 5:00 PM, Dan Feldman wrote:
> Hi all,
>
> I'm trying to optimize moving data from Cassandra to HDFS using either Ruby
> or Python client. Right now, I'm playing around on my staging server, an 8
> GB single node machine. My data in Cassandra (1.0.8) consist of 2 rows (for
>
Hi all,
I'm trying to optimize moving data from Cassandra to HDFS using either Ruby
or Python client. Right now, I'm playing around on my staging server, an 8
GB single node machine. My data in Cassandra (1.0.8) consist of 2 rows (for
now) with ~150k super columns each (I know, I know - super colu