There are n vnodes regardless of the size of the physical cluster.
Regards
Milind
On Jun 10, 2013 7:48 AM, "Theo Hultberg" wrote:
> Hi,
>
> The default number of vnodes is 256, is there any significance in this
> number? Since Cassandra's vnodes don't work like for example Riak's, where
> there i
Why would you use Cassandra as the primary store for logging information? Have
you considered Kafka?
You could, of course, then fan out the logs to Cassandra (on a near
real-time basis) and, on a daily basis (if you wish), extract the
"deltas" from Kafka into an RDBMS, with no Pig/Hive etc.
IMO
You would use Cassandra Counters (or other variation of distributed
counting) in case of having determined that a centralized version of
counting is not going to work.
You'd determine the non-feasibility of centralized counting by figuring the
speed at which you need to sustain writes and rea
1. Assuming that the majority of the line items are new and
2. The lookup of an existing line-item will dictate the performance of the
system because reads are slower than writes in C*.
3. Assuming that you are using counters in C*
Therefore eliminate that problem by implementing a bloom filte
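One plausible reading of the Bloom filter suggestion, sketched in Python: keep an in-memory filter of line-item IDs already seen, so that new items (assumed to be the majority) skip the slower read entirely. The class, the `needs_lookup` helper, and the sizing parameters are illustrative assumptions, not something taken from this thread.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits=1 << 20, num_hashes=5):
        # Sizing values are placeholders; tune them for the expected item count.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Double hashing: derive k bit positions from two halves of one digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


def needs_lookup(seen, line_item_id):
    """Return True only when the item may already exist, i.e. when the
    slower read against the store is worth doing (hypothetical helper)."""
    if seen.might_contain(line_item_id):
        return True
    seen.add(line_item_id)
    return False
```

A false positive only costs one unnecessary read; a new line item is never mistaken for an existing one being absent, because the filter has no false negatives.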
Kafka is relatively stable and has an active, well-supported newsgroup as
well.
As discussed by Brian, you would be inverting the paradigm of
store-process. Essentially in your original approach, you are storing the
messages first and then processing them after the fact. In the Kafka model,
you wou
On 1, countandra.org.
On 2, the issue is a little more deep (we have investigated this at
countandra). To approach it a little more comprehensively, the issue has
more to do with events rather than counts (at least IMO).
A similar issue is about averages... countandra does sums and counts quite
Cool. www.countandra.org calls them cascaded counters and it will also be
based on Kafka.
/***
sent from my android...please pardon occasional typos as I respond @ the
speed of thought
/
On Feb 22, 2012 7:22 PM, "Edward Capriolo" wrote:
I have been
My bad ~s/X:X-Value/Y:Y-Value/ after rereading the SELECT.
On Jan 22, 2012 6:40 AM, "Milind Parikh" wrote:
The composite-key approach with coun
The composite-key approach with counters would work very well in this case.
It will also obviate the concern of not knowing the exact column names
a priori...although for efficiency, you might want to look at maintaining a
secondary cache-like CF for lookup
Depending on your data patterns(not to hi
I used rainbird as inspiration for Countandra (& some of publicly available
data structures from rainbird preso). That said, there are significant
differences between the two architectures. Additionally, as Cassandra begins
to provide triggers, some very interesting things will become possible in
Co
You might want to look at the code in countandra.org; regardless of whether
you use it. It uses a model of dynamic composite keys (although static
composite keys would have worked as well). For the actual query, only one
row is hit. This of course only works because the data model is attuned to the
query
Inspired by Twitter's Rainbird project, Countandra is a hierarchical
distributed counting engine at scale.
It provides a complete HTTP-based interface for both posting events and
issuing queries. Events are posted using a FORMS-compatible
syntax. The result of the query is emitted in
For 99% of current applications requiring a persistent datastore, Oracle,
PgSQL and MySQL variants will suffice.
For the 1% of the applications, consider C* if
(a) you have given up on distributed transactions ("ACID"LY; but
NOT "BASE"ICLY)
(b) wondering about this new fangled ho
Why have two rings? Cassandra manages the replication for you; one ring
with physical nodes in two DCs might be a better option. Of course, depending
on the inter-DC failure characteristics, you might need to endure split-brain
for a while.
Use ZooKeeper. Scott Fines has a great library on top of ZK.
On Fri, Sep 16, 2011 at 7:08 PM, Daning Wang wrote:
> We try to implement an ordered queue system in Cassandra(ver 0.8.5). In
> initial design we use a row as queue, a column for each item in queue.
> that means creating new column w
Why not use CouchDB for this use case?
Milind
On Aug 18, 2011 9:07 PM, "Nicholas Neuberger" wrote:
I've been using Cassandra as a database storage device
In order to be predictable @ big data scale, the intensity and periodicity of
STW garbage collection have to be brought down. Assume that SLABS (Cass 2252)
will be available in the main line at some time and assume that this will
have the impact that other projects (HBase etc.) are reporting. I wonder
If I understand this correctly, then the epoch integer would be generated by
each node. Since time always flows forward, the assumption would be, I
suppose, that the epochs would be tagged with the node that generated them
and additionally the counter would carry as much history as necessary (and
p
I believe that the key reason is souped-up performance for the most recent data.
And yes, "an intelligent flush" leaves you vulnerable to some data loss.
On May
Other interesting flavors in distributed caches: Terracotta, GemFire.
A distributed cache together with a complex event processing engine like
OCEP drives a lot of low-latency, high-frequency trading, where nanoseconds
matter.
Most likely because in the wild, you can't assume a reliable DNS.
Just as an aside...this question comes up often in the context of managing
Cassandra clusters, especially in elastic situations. Most CMDBs assume a
static name (host names/static IPs) for nodes. However this often proves to
be mismatche
At the risk of repeating the previous conclusions:
(a) This configuration obviates the need for a patch that I had posted
earlier. This is a good thing.
(b) The reported latency (@Sasha) is less than ordinary latencies in EC2. The
reasons behind this are not well understood. However I wouldn't look
You can't route traffic over private IPs across data centers. This is the
point of the patch.
On Apr 26, 2011 6:59 AM, "pankaj soni" wrote:
one last d
> Just read your paper on this. Must say helped a great deal.
> >>
> >> 1 more query does amazon by default award both external and internal IP
> >> address for each node? or we have to explicitly buy the external IP's?
> >>
> >> I am looking
On Apr 25, 2011 7:43 AM, "Milind Parikh" wrote:
I have authored exactly this paper; please search this ML. Please be aware
of EC2's internal network as you design your deployment. EC2 also does
not support multicast, which is a pain, but not
unable to find documentation of any such deployment
online.
Because of this multi-regions the public-private IP address issue is
important.
pankaj
On Mon, Apr 25, 2011 at 4:55 PM, Milind Parikh
wrote:
>
> It will be thro...
It will be through an overlay n/w. Unfortunately, setting up such a n/w is
complex. Look @ something like OpenVPN.
If multicast is supported, it will be easier. With complex software such as
Cassandra, it is much better to go with the expected flow rather than
devising your own flows. My 2c.
On Apr 25, 2011 3:54 AM, "David Strauss" wrote:
On Fri, 2011-04-22 at 13:31 -0700, Milind Parikh wrote:
> Is there a chance of getting manual confli...
You can actually already perform "manual conflict resolution" in
Same process or not: only successful QR reads after successful QW will
behave with this guarantee.
On Apr 17, 2011 10:04 AM, "James Cipar" wrote:
> For a
William
The issue is regarding whether you will see A or B; with any guarantee of
either. The discussion implies no; until the QW is complete.
On Apr 17, 20
Successful reads after a successful write @Q have the property that once a
value is read @ one Q, the same value will be read at any other Q.
All others are details that will change with implementation; but, IMO, are
not bugs.
James: in your case, I would think that you have not completed a successf
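The quorum guarantee discussed here rests on a simple counting fact: any two majorities of the same replica set share at least one replica, so a quorum read after a completed quorum write must touch a replica that has the write. A small Python sketch that checks this by brute force, assuming a quorum size of floor(N/2) + 1 for replication factor N; the function names are illustrative only.

```python
from itertools import combinations


def quorum_size(replication_factor):
    # Majority of the replicas: floor(N / 2) + 1.
    return replication_factor // 2 + 1


def quorums_always_overlap(replication_factor):
    """Enumerate every possible quorum and confirm each pair intersects."""
    replicas = range(replication_factor)
    size = quorum_size(replication_factor)
    quorums = [set(c) for c in combinations(replicas, size)]
    return all(a & b for a in quorums for b in quorums)
```

The check says nothing about reads that race with an in-flight write; until the quorum write has completed, either the old or the new value may be returned.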
't think I need other ports for basic
setup, right?
If anyone could get 'nodetool repair' working with this patch (across
regions), let me know. It may be I am doing something wrong.
On Wed, Mar 23, 2011 at 1:08 AM, Milind Parikh
wrote:
> @aj
> are you sure...
@aj
Are you sure that all ports are accessible from all nodes?
@sasha
I think that being able to have the semantics of address aNAT address can
enable security from a different perspective. Describing an overlay n/w will
take long here. But that may solve your security concerns over the internet.
> >>>
> >>> Great work here. Can you provide the patch against the 2 files?
> >>>
> >>> Perhaps there's some way to incorporate it into the trunk of cassandra
> so that this is feasible (in a future release) without patching the source
> code.
https://docs.google.com/document/d/13Yc2t4d07290TdiRmSTchuAk9sbp4BeqOpqeYhbcDFM/edit?hl=en
There was an excellent session on vector clocks and synchronous writes in
Cassandra. Here are my gleanings from it.