Re: High latencies for simple queries

2015-03-31 Thread Tyler Hobbs
To clarify, that's in Cassandra 2.1+. In 2.0 and earlier, we used http://code.google.com/a/apache-extras.org/p/cassandra-dbapi2/ for cqlsh. On Tue, Mar 31, 2015 at 10:40 AM, Tyler Hobbs wrote: > The python driver that we bundle with Cassandra for cqlsh is the normal > python driver (https://git

Re: High latencies for simple queries

2015-03-31 Thread Tyler Hobbs
The python driver that we bundle with Cassandra for cqlsh is the normal python driver (https://github.com/datastax/python-driver), although sometimes it's patched for bugfixes or is not an official release. On Sat, Mar 28, 2015 at 5:36 PM, Ben Bromhead wrote: > cqlsh runs on the internal cassand

Re: High latencies for simple queries

2015-03-28 Thread Ben Bromhead
cqlsh runs on the internal cassandra python drivers: cassandra-pylib and cqlshlib. I would not recommend using them at all (nothing wrong with them, they are just not built with external users in mind). I have never used python-driver in anger so I can't comment on whether it is genuinely slower

Re: High latencies for simple queries

2015-03-28 Thread Artur Siekielski
On 03/28/2015 12:13 AM, Ben Bromhead wrote: One other thing to keep in mind / check is that doing these tests locally the cassandra driver will connect using the network stack, whereas postgres supports local connections over a unix domain socket (this is also enabled by default). Unix domain so

Re: High latencies for simple queries

2015-03-27 Thread Ben Bromhead
One other thing to keep in mind / check is that doing these tests locally the cassandra driver will connect using the network stack, whereas postgres supports local connections over a unix domain socket (this is also enabled by default). Unix domain sockets are significantly faster than tcp as you
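The socket-versus-TCP point above can be checked on a given machine rather than taken on faith. The following is a rough sketch only: it times one-byte round trips against a trivial echo server over a Unix domain socket and over TCP loopback. The helper names (`_echo_server`, `mean_round_trip`), the socket file name, and the round count are made up for illustration, and it says nothing about what either database does on top of the transport.

```python
import os
import socket
import tempfile
import threading
import time


def _echo_server(listener):
    """Accept one connection and echo bytes back until the peer closes."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(64)
            if not data:
                break
            conn.sendall(data)


def mean_round_trip(family, address, rounds=2000):
    """Return the mean seconds per one-byte round trip (hypothetical helper)."""
    listener = socket.socket(family, socket.SOCK_STREAM)
    listener.bind(address)
    listener.listen(1)
    server = threading.Thread(target=_echo_server, args=(listener,), daemon=True)
    server.start()

    client = socket.socket(family, socket.SOCK_STREAM)
    client.connect(listener.getsockname())
    if family == socket.AF_INET:
        # Disable Nagle's algorithm so small writes are not held back.
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

    start = time.perf_counter()
    for _ in range(rounds):
        client.sendall(b"x")
        client.recv(64)
    elapsed = time.perf_counter() - start

    client.close()   # the server sees EOF and exits its loop
    server.join()
    listener.close()
    return elapsed / rounds


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        uds = mean_round_trip(socket.AF_UNIX, os.path.join(tmp, "echo.sock"))
    tcp = mean_round_trip(socket.AF_INET, ("127.0.0.1", 0))
    print("unix socket: %.1f us per round trip" % (uds * 1e6))
    print("tcp loopback: %.1f us per round trip" % (tcp * 1e6))
```

The numbers vary with kernel, CPU, and load, so they are best read as a relative comparison on one host.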

Re: High latencies for simple queries

2015-03-27 Thread Laing, Michael
Actually I am in the middle of setting up the same sort of thing for PostgreSQL using psycopg2 and pyev. I'll be using Cassandra and PostgreSQL in an IoT experiment as the backend for swarms of MQTT brokers at something in the 10-100M client range. ml On Fri, Mar 27, 2015 at 4:59 PM, Laing, Mich

Re: High latencies for simple queries

2015-03-27 Thread Laing, Michael
I use callback chaining with the python driver and can confirm that it is very fast. You can "chain the chains" together to perform sequential processing. I do this when retrieving "metadata" and then the referenced "payload" for example, when the metadata has been inverted and the payload is larg

Re: High latencies for simple queries

2015-03-27 Thread Tyler Hobbs
Since you're executing queries sequentially, you may want to look into using callback chaining to avoid the cross-thread signaling that results in the 1ms latencies. Basically, just use session.execute_async() and attach a callback to the returned future that will execute your next query. The cal
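A minimal sketch of the callback chaining described above, assuming a session object that behaves like the DataStax python-driver `Session`: `execute_async()` returns a future, and `add_callbacks(callback, errback)` registers handlers on it. The helper name `run_chained` and the shape of the statement list are illustrative assumptions, not part of the driver.

```python
import threading


def run_chained(session, statements):
    """Run (query, params) pairs strictly in order.

    Each query is issued from the success callback of the previous one,
    so the calling thread is only signalled once, at the end of the chain.
    """
    results = []
    errors = []
    done = threading.Event()
    remaining = iter(statements)

    def issue_next():
        try:
            query, params = next(remaining)
        except StopIteration:
            done.set()  # chain finished, wake the caller
            return
        future = session.execute_async(query, params)
        future.add_callbacks(on_success, on_error)

    def on_success(rows):
        results.append(rows)
        issue_next()  # issue the next query from inside the callback

    def on_error(exc):
        errors.append(exc)
        done.set()  # stop the chain on the first failure

    issue_next()
    done.wait()
    if errors:
        raise errors[0]
    return results


# Usage with the real driver (contact point, keyspace and table are assumptions):
#   from cassandra.cluster import Cluster
#   session = Cluster(["127.0.0.1"]).connect("my_keyspace")
#   run_chained(session, [
#       ("INSERT INTO foo (i, j) VALUES (%s, %s)", (1, "a")),
#       ("SELECT j FROM foo WHERE i = %s", (1,)),
#   ])
```

Callbacks should stay short, since a slow callback delays whatever else is waiting on the thread that runs it.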

Re: High latencies for simple queries

2015-03-27 Thread Artur Siekielski
I think that in your example Postgres spends most of its time waiting for fsync() to complete. On Linux, for a battery-backed RAID controller, it's safe to mount an ext4 filesystem with the "barrier=0" option, which improves fsync() performance a lot. I have partitions mounted with this option and I did a

Re: High latencies for simple queries

2015-03-27 Thread Artur Siekielski
Yes, I'm concerned about the latency. Throughput can be high even when using Python: http://datastax.github.io/python-driver/performance.html. But in my scenarios I need to run queries sequentially, so latencies matter. And Cassandra requires issuing more queries than SQL databases so these lat

Re: High latencies for simple queries

2015-03-27 Thread Ben Bromhead
Latency can be so variable even when testing things locally. I quickly fired up postgres and did the following with psql: ben=# CREATE TABLE foo(i int, j text, PRIMARY KEY(i)); CREATE TABLE ben=# \timing Timing is on. ben=# INSERT INTO foo VALUES(2, 'yay'); INSERT 0 1 Time: 1.162 ms ben=# INSERT I

Re: High latencies for simple queries

2015-03-27 Thread Tyler Hobbs
Just to check, are you concerned about minimizing that latency or maximizing throughput? I'll assume that latency is what you're actually concerned about. A fair amount of that latency is probably happening in the python driver. Although it can easily execute ~8k operations per second (using cpython),

High latencies for simple queries

2015-03-27 Thread Artur Siekielski
I'm running Cassandra locally and I see that the execution time for the simplest queries is 1-2 milliseconds. By a simple query I mean either INSERT or SELECT from a small table with short keys. While this number is not high, it's about 10-20 times slower than PostgreSQL (even if INSERTs are w