Thank you, we'll take a look at that tool.
2015-04-06 12:30 GMT+02:00 Srinivasa T N :
> Comparison to OpenTSDB HBase
>
> For one we do not use id’s for strings. The string data (metric names and
> tags) are written to row keys and the appropriate indexes. Because
> Cassandra has much wider rows there
Comparison to OpenTSDB HBase
For one, we do not use IDs for strings. The string data (metric names and
tags) are written to row keys and the appropriate indexes. Because
Cassandra has much wider rows there are far fewer keys written to the
database. The space saved by using IDs is minor and by n
Thanks, is it a kind of OpenTSDB?
2015-04-05 18:28 GMT+02:00 Kevin Burton :
> > Hi, I switched from HBase to Cassandra and try to find problem solution
> for timeseries analysis on top Cassandra.
>
> Depending on what you’re looking for, you might want to check out KairosDB.
>
> 0.95 beta2 just s
> Hi, I switched from HBase to Cassandra and try to find problem solution
> for timeseries analysis on top Cassandra.
Depending on what you’re looking for, you might want to check out KairosDB.
0.95 beta2 just shipped yesterday as well so you have good timing.
https://github.com/kairosdb/kairosdb
Okay, so bucketing by day/week/month is a matter of capacity planning and
of the actual questions I want to ask.
As a conclusion:
I have a table events
CREATE TABLE user_plans (
    id timeuuid,
    user_id timeuuid,
    event_ts timestamp,
    event_type int,
    some_other_attr text,
    PRIMARY KEY (user_id, event_ts)
);
It sounds like your time bucket should be a month, but it depends on the
amount of data per user per day and your main query range. Within the
partition you can then query for a range of days.
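To make the month-bucket idea concrete, here is a minimal Python sketch rather than a definitive implementation. It assumes a hypothetical table events_by_month whose partition key is (user_id, month) and whose clustering column is event_ts; the table name, the month column and the helper names are illustrative and do not come from this thread.

```python
from datetime import datetime


def month_bucket(ts):
    """Return the month bucket for a timestamp as an integer such as 201501."""
    return ts.year * 100 + ts.month


def range_query(user_id, start, end):
    """Build a CQL string and its parameters for a range inside one month.

    Assumes start and end fall in the same month bucket; a wider range
    needs one query per bucket.
    """
    if month_bucket(start) != month_bucket(end):
        raise ValueError("range spans more than one month bucket")
    cql = (
        "SELECT * FROM events_by_month "  # hypothetical table name
        "WHERE user_id = %s AND month = %s "  # equality on the partition key
        "AND event_ts >= %s AND event_ts < %s"  # range on the clustering column
    )
    return cql, (user_id, month_bucket(start), start, end)


cql, params = range_query("some-user", datetime(2015, 1, 1), datetime(2015, 1, 8))
```

The partition key columns are matched by equality and only the clustering column carries the range, which is the shape described above.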
Yes, all of the rows within a partition are stored on one physical node as
well as the replica nodes.
--
>non-equal relation on a partition key is not supported
OK, can I generate a select query like this:
select some_attributes
from events where ymd = 20150101 or ymd = 20150102 or ymd = 20150103 ... or
ymd = 20150331
> The partition key determines which node can satisfy the query
So you mean that all rows with the same *(y
Unfortunately, a non-equal relation on a partition key is not supported.
You would need to bucket by some larger unit, like a month, and then use
the date/time as a clustering column for the row key. Then you could query
within the partition. The partition key determines which node can satisfy
the
Hi, we plan to have 10^8 users and each user could generate 10 events per
day.
So we have:
10^8 * 10 = 10^9 records per day
10^9 * 30 = 3 * 10^10 records per month.
Our analysis time window could be from 1 to 6 months.
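As a rough check of these volumes, the arithmetic can be sketched in Python. The inputs are the figures stated in this message (10^8 users, 10 events per user per day, 30-day months); the per-user row counts are a derived illustration, not measurements.

```python
USERS = 10**8              # planned number of users
EVENTS_PER_USER_DAY = 10   # events each user could generate per day
DAYS_PER_MONTH = 30        # rounded month length

records_per_day = USERS * EVENTS_PER_USER_DAY
records_per_month = records_per_day * DAYS_PER_MONTH

# Rows that land in a single partition if the key is (user, month),
# and rows one user contributes to the longest analysis window (6 months).
rows_per_user_month = EVENTS_PER_USER_DAY * DAYS_PER_MONTH
rows_per_user_six_months = rows_per_user_month * 6

print(records_per_day, records_per_month)
print(rows_per_user_month, rows_per_user_six_months)
```

Under these assumptions a per-user monthly partition holds only a few hundred rows, even though the table as a whole grows by tens of billions of rows per month.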
Right now the PK is PRIMARY KEY (user_id, event_ts), where event_ts is the
exact timestamp of the event.
So you suggest this approac
It depends on the actual number of events per user, but simply bucketing
the partition key can give you the same effect - clustering rows by time
range. A composite partition key could be comprised of the user name and
the date.
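A composite partition key of user and date means a time-window query has to name every bucket it touches. The following Python sketch enumerates those keys; it assumes a yyyymmdd integer bucket (the ymd name is borrowed from the query elsewhere in this thread) and the helper names are hypothetical.

```python
from datetime import date, timedelta


def day_buckets(start, end):
    """List yyyymmdd integer buckets for every day from start to end, inclusive."""
    days = (end - start).days
    return [
        int((start + timedelta(days=i)).strftime("%Y%m%d"))
        for i in range(days + 1)
    ]


def partition_keys(user_id, start, end):
    """Pair the user with each day bucket, giving one partition key per day."""
    return [(user_id, ymd) for ymd in day_buckets(start, end)]


keys = partition_keys("some-user", date(2015, 1, 1), date(2015, 3, 31))
```

For a three-month window this yields one key per day, so with daily buckets the client issues many small per-partition reads; a coarser bucket reduces that count at the cost of larger partitions.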
It also depends on the data rate - is it many events per day or just
Hi, I switched from HBase to Cassandra and am trying to find a solution for
time series analysis on top of Cassandra.
I have an entity named "Event".
"Event" has attributes:
user_id - the user who triggered the event
event_ts - when the event happened
event_type - type of event
some_other_attr - some other attrs we