Hello,
I have a Spark Streaming job reading data from kafka, processing it and
inserting it into Cassandra. The job is running on a cluster with 3
machines. I use Mesos to submit the job with 3 executors using 1 core each.
The problem is that when all executors are running on the same node, the
i
Hi David,
Could you describe why you chose to include the create date in the
partition key? If the vin is enough "partitioning", meaning that the size
(number of rows x size of row) of each partition is less than 100MB, then
remove the date and just use the create_time, because the date is already
HIVE?
What is the standard in these cases?
F Javier Pareja
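(A quick aside on the ~100MB partition guideline mentioned above: the arithmetic is simple enough to sketch. The row counts and row sizes below are hypothetical illustrations, not numbers from this thread.)

```python
# Back-of-the-envelope check against the ~100 MB partition guideline.
# Row counts and sizes are made-up examples, not taken from the thread.

MAX_PARTITION_BYTES = 100 * 1024 * 1024  # the ~100 MB guideline

def partition_size_ok(rows_per_partition: int, avg_row_bytes: int) -> bool:
    """True if rows x row size stays under the ~100 MB guideline."""
    return rows_per_partition * avg_row_bytes < MAX_PARTITION_BYTES

# Example: one vin writing a 200-byte row every second for a day
rows_per_day = 24 * 60 * 60  # 86,400 rows
print(partition_size_ok(rows_per_day, 200))  # True: fits comfortably
```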
You can use variable-length zig-zag coding to encode an integer if using
a blob. It is used in Avro and Protocol Buffers.
Some examples:
value   hex
0       00
-1      01
1       02
-2      03
2       04
...
-64     7f
64      80 01
...
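The value/hex table above can be reproduced with a short sketch (Protocol-Buffers-style zig-zag plus a base-128 varint; an illustration of the technique, not Avro's or protobuf's actual API):

```python
def zigzag_encode(n: int) -> int:
    """Interleave signed ints so small magnitudes get small codes:
    0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ..."""
    return (n << 1) ^ (n >> 63)  # arithmetic shift; fine for the 64-bit range

def zigzag_decode(u: int) -> int:
    return (u >> 1) ^ -(u & 1)

def varint(u: int) -> bytes:
    """Base-128 varint: 7 payload bits per byte, high bit = 'more follows'."""
    out = bytearray()
    while True:
        byte = u & 0x7F
        u >>= 7
        out.append(byte | 0x80 if u else byte)
        if not u:
            return bytes(out)

# Reproduces the value/hex table above
for value in (0, -1, 1, -2, 2, -64, 64):
    print(value, varint(zigzag_encode(value)).hex())
# -> 0 00, -1 01, 1 02, -2 03, 2 04, -64 7f, 64 8001
```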
On Sat, 10 Mar 2018, 07:52 onmstester onmstester wrote:
> I've found out that blobs have no gain
some sort of binary index for the
clustering keys and for relatively large partitions it can be relatively
expensive to maintain.
F Javier Pareja
On Wed, Mar 7, 2018 at 5:20 PM, Jeff Jirsa wrote:
>
>
> On Wed, Mar 7, 2018 at 7:13 AM, Carlos Rolo wrote:
>
>> Hi Jeff,
>>
uld I find another field for the partition and use
the UUID for the clustering instead?
F Javier Pareja
On Wed, Mar 7, 2018 at 2:36 PM, Jeff Jirsa wrote:
> There is no limit
>
> The token range of murmur3 is 2^64, but Cassandra properly handles token
> overlaps (we use a key that’s effe
Thank you Rahul, but is it a good practice to use a large range here? Or
would it be better to create partitions with more than 1 row (by using a
clustering key)?
From a data query point of view I will be accessing the rows by a UID one
at a time.
F Javier Pareja
On Wed, Mar 7, 2018 at 11:12
this partition key can have?
Is it recommended to have a clustering key to reduce this number by storing
several rows in each partition instead of one row per partition?
Regards,
F Javier Pareja
Doesn't Cassandra have TIMEUUID for these use cases?
Anyways, hopefully someone can help me better understand possible delays
when writing a counter.
F Javier Pareja
On Mon, Mar 5, 2018 at 1:54 PM, Hannu Kröger wrote:
> Traditionally auto increment counters have been used to generate
Hi Kyrulo,
I don't understand how UUIDs are related to counters, but I use counters to
increment the value of a cell in an atomic manner. I could try reading the
value and then writing to the cell but then I would lose the atomicity of
the update.
F Javier Pareja
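(The read-then-write hazard Javier describes is easy to demonstrate in miniature. This is a generic concurrency sketch using in-process threads and a lock, not Cassandra's counter implementation, which does the atomic increment server-side.)

```python
import threading

# Why read-then-write loses updates under concurrency, while an
# atomic increment (what a counter column provides) does not.

class RacyCell:
    def __init__(self):
        self.value = 0
    def update(self):
        v = self.value       # read ...
        self.value = v + 1   # ... then write: another thread may interleave

class AtomicCell:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()
    def update(self):
        with self._lock:     # increment happens as one atomic step
            self.value += 1

def hammer(cell, threads=8, n=10_000):
    ts = [threading.Thread(target=lambda: [cell.update() for _ in range(n)])
          for _ in range(threads)]
    for t in ts: t.start()
    for t in ts: t.join()
    return cell.value

print(hammer(AtomicCell()))  # always 80000
print(hammer(RacyCell()))    # may be less than 80000: lost updates
```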
On Mon, Mar 5, 2018 at 1:
on about how the counter lock is
acquired, is there a shared lock across all the nodes?
Hope I am not oversimplifying things, but I think this will be useful to
better understand how to tune up the system.
Thanks in advance.
F Javier Pareja
Thank you Jürgen,
The default consistency in the library is already ONE. I tried setting it
anyways but it made no difference. Hopefully it is a configuration issue,
that would be very good news!!
Do you have any past/present experience with large counter tables?
F Javier Pareja
On Fri, Mar 2
MiB, capacity 480 MiB,
4635329524 misses, 6673522516 requests, 0.305 recent hit rate, NaN
microseconds miss latency
Percent Repaired : 0.0%
Regards,
Javier
F Javier Pareja
On Fri, Mar 2, 2018 at 7:01 PM, Alain RODRIGUEZ wrote:
> Hi Javier,
>
> The only bottleneck in the writes
there are plenty of them) and only share the CPU
and RAM.
The only bottleneck in the writes as far as I understand it is the commit
log. Shall I create RAID0 (for speed) or install an SSD just for the
commitlog?
Thanks,
Javier
F Javier Pareja
On Fri, Mar 2, 2018 at 12:21 PM, Javier Pareja
Hello everyone,
I have configured a Cassandra cluster with 3 nodes, however I am not
getting the write speed that I was expecting. I have tested against a
counter table because it is the bottleneck of the system.
So with the system idle I run the attached sample code (very simple async
writes wit
table.
- I also enabled tracing in the CQLSH but it showed nothing when querying
this row. It however did when querying other tables...
Thanks again for your reply!! I am very excited to be part of the Cassandra
user base.
Javier
F Javier Pareja
On Mon, Feb 19, 2018 at 8:08 AM, Alain RODRIGUEZ
Hello everyone,
I get a timeout error when reading a particular row from a large counters
table.
I have a storm topology that inserts data into a Cassandra counter table.
This table has 6 partition keys, 4 primary keys and 5 counters.
When data starts to be inserted, I can query the counters cor