Re: [EXTERNAL] fine tuning for wide rows and mixed workload system

2019-01-11 Thread Marco Gasparini
n key and clustering key. Can you give > us the table schema? I’m also concerned about the IF EXISTS in your delete. > I think that invokes a light weight transaction – costly for performance. > Is it really required for your use case? > > > > > > Sean Durity > > &

RE: [EXTERNAL] fine tuning for wide rows and mixed workload system

2019-01-11 Thread Durity, Sean R
Subject: [EXTERNAL] fine tuning for wide rows and mixed workload system Hello everyone, I need some advice in order to solve my use case problem. I have already tried some solutions but it didn't work out. Can you help me with the following configuration please? any help is very appreciate

fine tuning for wide rows and mixed workload system

2019-01-11 Thread Marco Gasparini
(write strategy) - no row cache (because of the wide rows' dimension it is better to have no row cache) - gc_grace_seconds = 1 day (unfortunately, I did no repair schedule at all) results: too many timeouts, losing data 2) - added repair schedules - RF=3 (in order to increase read speed) results: - too many timeo

Re: Performance Of IN Queries On Wide Rows

2018-02-21 Thread Jeff Jirsa
ter keys, all the reads are going to be served >> from >> > a single replica. Compared to many concurrent individual equal >> statements >> > you can get the performance gain of leaning on several replicas for >> > parallelism. >> > >> > On

Re: Performance Of IN Queries On Wide Rows

2018-02-21 Thread Carl Mueller
cas for > > parallelism. > > > > On Tue, Feb 20, 2018 at 11:43 AM Gareth Collins < > gareth.o.coll...@gmail.com> > > wrote: > >> > >> Hello, > >> > >> When querying large wide rows for multiple specific values is it > >> be

Re: Performance Of IN Queries On Wide Rows

2018-02-21 Thread Gareth Collins
t; a single replica. Compared to many concurrent individual equal statements > you can get the performance gain of leaning on several replicas for > parallelism. > > On Tue, Feb 20, 2018 at 11:43 AM Gareth Collins > wrote: >> >> Hello, >> >> When querying large w

Re: Performance Of IN Queries On Wide Rows

2018-02-21 Thread Rahul Singh
0, 2018 at 11:43 AM Gareth Collins > > wrote: > > > Hello, > > > > > > When querying large wide rows for multiple specific values is it > > > better to do separate queries for each value...or do it with one query > > > and an "IN"? I am

Re: Performance Of IN Queries On Wide Rows

2018-02-20 Thread Eric Stevens
plicas for parallelism. On Tue, Feb 20, 2018 at 11:43 AM Gareth Collins wrote: > Hello, > > When querying large wide rows for multiple specific values is it > better to do separate queries for each value...or do it with one query > and an "IN"? I am using Cassandra 2

Performance Of IN Queries On Wide Rows

2018-02-20 Thread Gareth Collins
Hello, When querying large wide rows for multiple specific values is it better to do separate queries for each value...or do it with one query and an "IN"? I am using Cassandra 2.1.14 I am asking because I had changed my app to use 'IN' queries and it **appears** to be slowe

Re: Wide rows splitting

2017-09-18 Thread Stefano Ortolani
You might find this interesting: https://medium.com/@foundev/synthetic-sharding-in-cassandra-to-deal-with-large-partitions-2124b2fd788b Cheers, Stefano On Mon, Sep 18, 2017 at 5:07 AM, Adam Smith wrote: > Dear community, > > I have a table with inlinks to URLs, i.e. many URLs point to > http://
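The synthetic sharding pattern linked above splits one logical wide partition across several physical partitions. A minimal sketch for the inlinks use case (table and column names are hypothetical, not from the thread):

```sql
-- Hypothetical sketch of synthetic sharding: a shard number is added to
-- the partition key so one popular URL's inlinks spread over N partitions.
CREATE TABLE inlinks (
    url text,
    shard int,          -- e.g. hash(source_url) % 16, chosen at write time
    source_url text,
    PRIMARY KEY ((url, shard), source_url)
);

-- Reading all inlinks for a URL then takes one query per shard,
-- typically issued in parallel:
-- SELECT source_url FROM inlinks WHERE url = ? AND shard = ?;
```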

Wide rows splitting

2017-09-17 Thread Adam Smith
Dear community, I have a table with inlinks to URLs, i.e. many URLs point to http://google.com, less URLs point to http://somesmallweb.page. It has very wide and very skinny rows - the distribution is following a power law. I do not know a priori how many columns a row has. Also, I can't identify

Re: wide rows

2016-10-18 Thread Yabin Meng
With CQL data modeling, everything is called a "row". But really in CQL, a row is just a logical concept. So if you think of "wide partition" instead of "wide row" (a partition is what is determined by the hash index of the partition key), it will help the understanding a bit: one wide-partition may c

Re: wide rows

2016-10-18 Thread DuyHai Doan
// user table: skinny partition CREATE TABLE user ( user_id uuid, firstname text, lastname text, PRIMARY KEY ((user_id)) ); // sensor_data table: wide partition CREATE TABLE sensor_data ( sensor_id uuid, date timestamp, value double, PRIMARY KEY ((senso
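The snippet above is cut off mid-statement; a plausible completion of the wide-partition example, assuming the usual sensor-data layout (sensor_id as partition key, date as clustering column):

```sql
-- sensor_data table: wide partition (sketch; the original message is truncated)
CREATE TABLE sensor_data (
    sensor_id uuid,
    date timestamp,
    value double,
    PRIMARY KEY ((sensor_id), date)
);
```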

RE: wide rows

2016-10-18 Thread S Ahmed
Hi, Can someone clarify how you would model a "wide" row cassandra table? From what I understand, a wide row table is where you keep appending columns to a given row. The other way to model a table would be the "regular" style where each row contains data so you would during a SELECT you would w

Re: Do partition keys create skinny or wide rows?

2016-10-08 Thread Vladimir Yudovin
>querying them would be inefficient (impossible? Impossible. In the case of multi-column partition key all of them must be restricted in WHERE clause: CREATE TABLE data.table (id1 int, id2 int, primary KEY ((id1,id2))); SELECT * FROM data.table WHERE id1 = 0; InvalidRequest: Error from server: co
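Vladimir's point can be shown end to end: with a multi-column partition key, restricting only part of it is rejected, because the server cannot compute the partition token (sketch, reproducing the thread's schema):

```sql
CREATE TABLE data.table (id1 int, id2 int, PRIMARY KEY ((id1, id2)));

-- Restricting only id1 fails, as quoted in the thread:
-- SELECT * FROM data.table WHERE id1 = 0;
--   InvalidRequest: Error from server: ...

-- With both partition-key columns restricted the query is valid and
-- resolves to a single-partition read:
SELECT * FROM data.table WHERE id1 = 0 AND id2 = 0;
```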

Re: Do partition keys create skinny or wide rows?

2016-10-08 Thread Graham Sanderson
No the employees would end up in arbitrary partitions, and querying them would be inefficient (impossible? - I am levels back on C* so don’t know if ALLOW FILTERING even works for this). I would be tempted to use organization_id only or organization_Id and maybe a few shard bits (if you are wor

Re: Do partition keys create skinny or wide rows?

2016-10-08 Thread Ali Akhtar
In the case of PRIMARY KEY((organization_id, employee_id)), could I still do a query like Select ... where organization_id = x, to get all employees in a particular organization? And, this will put all those employees in the same node, right? On Sun, Oct 9, 2016 at 9:17 AM, Graham Sanderson wrot

Re: Do partition keys create skinny or wide rows?

2016-10-08 Thread Graham Sanderson
Nomenclature is tricky, but PRIMARY KEY((organization_id, employee_id)) will make organization_id, employee_id the partition key which equates roughly to your latter sentence (I’m not sure about the 4 billion limit - that may be the new actual limit, but probably not a good idea). > On Oct 8, 2

Re: Do partition keys create skinny or wide rows?

2016-10-08 Thread Ali Akhtar
the last '4 billion rows' should say '4 billion columns / cells' On Sun, Oct 9, 2016 at 6:34 AM, Ali Akhtar wrote: > Say I have the following primary key: > PRIMARY KEY((organization_id, employee_id)) > > Will this create 1 row whose primary key is the organization id, but it > has a 4 billion c

Do partition keys create skinny or wide rows?

2016-10-08 Thread Ali Akhtar
Say I have the following primary key: PRIMARY KEY((organization_id, employee_id)) Will this create 1 row whose primary key is the organization id, but it has a 4 billion column / cell limit? Or will this create 1 row for each employee in the same organization, so if i have 5 employees, they will
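The two layouts Ali is asking about differ only in where employee_id sits in the primary key (sketch; column names from the thread, types assumed):

```sql
-- Composite partition key: each (organization_id, employee_id) pair is its
-- own partition. No per-organization wide row, but also no way to fetch a
-- whole organization with a single-partition read.
CREATE TABLE employees_by_pair (
    organization_id uuid,
    employee_id uuid,
    name text,
    PRIMARY KEY ((organization_id, employee_id))
);

-- Clustering-key layout: one wide partition per organization, with one
-- clustered row (one set of cells) per employee.
CREATE TABLE employees_by_org (
    organization_id uuid,
    employee_id uuid,
    name text,
    PRIMARY KEY ((organization_id), employee_id)
);
```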

Re: Partition Key - Wide rows?

2016-10-06 Thread Saladi Naidu
Cheers, -Phil From: Ali Akhtar Sent: 2016-10-06 9:04 AM To: user@cassandra.apache.org Subject: Partition Key - Wide rows? Heya, I'm designing some tables, where data needs to be stored in the following hierarchy: Organization -> Team -> Project -> Issues I need to be able to re

Re: Partition Key - Wide rows?

2016-10-06 Thread Jonathan Haddad
get one partition per org (probably bad depending on your > dataset). If partition key is (org id, team id, project id) then you will > have one partition per project which is probably fine ( again, depending on > your dataset). > > Cheers, > > -Phil > -

Re: Partition Key - Wide rows?

2016-10-06 Thread Ali Akhtar
set). > > Cheers, > > -Phil > -- > From: Ali Akhtar > Sent: 2016-10-06 9:04 AM > To: user@cassandra.apache.org > Subject: Partition Key - Wide rows? > > Heya, > > I'm designing some tables, where data needs to be stored

RE: Partition Key - Wide rows?

2016-10-06 Thread Philip Persad
pending on your dataset). Cheers, -Phil -- From: Ali Akhtar Sent: 2016-10-06 9:04 AM To: user@cassandra.apache.org Subject: Partition Key - Wide rows? Heya, I'm designing some tables, where data needs to be stored in the following hierarchy: Organization -> Te

Partition Key - Wide rows?

2016-10-06 Thread Ali Akhtar
eally long, 3 UUIDs + similar length'd issue id? 3) Will this store issues as skinny rows, or wide rows? If an org has a lot of teams, which have a lot of projects, which have a lot of issues, etc, could I have issues w/ running out of the column limit of wide rows? 4) Is there a better way of achieving this scenario?

Re: STCS Compaction with wide rows & TTL'd data

2016-09-02 Thread Kevin O'Connor
On Fri, Sep 2, 2016 at 9:33 AM, Mark Rose wrote: > Hi Kevin, > > The tombstones will live in an sstable until it gets compacted. Do you > have a lot of pending compactions? If so, increasing the number of > parallel compactors may help. Nope, we are pretty well managed on compactions. Only ever

Re: STCS Compaction with wide rows & TTL'd data

2016-09-02 Thread Jonathan Haddad
Also, if you can get to at least 2.0 you can use TimeWindowCompactionStrategy which works a lot better with time series data w/ TTLs than STCS. On Fri, Sep 2, 2016 at 9:53 AM Jonathan Haddad wrote: > What's your gc_grace_seconds set to? Is it possible you have a lot of > tombstones that haven't
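Switching a time-series table to TimeWindowCompactionStrategy, as Jonathan suggests, is a compaction-settings change only. A sketch, assuming a newer Cassandra release that ships TWCS; the keyspace/table name and window parameters are illustrative:

```sql
-- Sketch: move a TTL'd time-series table from STCS to TWCS so that whole
-- expired sstables can be dropped instead of compacted.
ALTER TABLE oauth.access_tokens
  WITH compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'HOURS',
    'compaction_window_size': '1'
  };
```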

Re: STCS Compaction with wide rows & TTL'd data

2016-09-02 Thread Jonathan Haddad
What's your gc_grace_seconds set to? Is it possible you have a lot of tombstones that haven't reached the GC grace time yet? On Thu, Sep 1, 2016 at 12:54 AM Kevin O'Connor wrote: > We're running C* 1.2.11 and have two CFs, one called OAuth2AccessToken and > one OAuth2AccessTokensByUser. OAuth2A

Re: STCS Compaction with wide rows & TTL'd data

2016-09-02 Thread Mark Rose
Hi Kevin, The tombstones will live in an sstable until it gets compacted. Do you have a lot of pending compactions? If so, increasing the number of parallel compactors may help. You may also be able to tun the STCS parameters. Here's a good explanation of how it works: https://shrikantbang.wordpre

STCS Compaction with wide rows & TTL'd data

2016-09-01 Thread Kevin O'Connor
We're running C* 1.2.11 and have two CFs, one called OAuth2AccessToken and one OAuth2AccessTokensByUser. OAuth2AccessToken has the token as the row key, and the columns are some data about the OAuth token. There's a TTL set on it, usually 3600, but can be higher (up to 1 month). OAuth2AccessTokensB

Performance impact of wide rows on read heavy workload

2016-07-21 Thread Bhuvan Rawal
Hi, We are trying to evaluate read performance impact of having a wide row by pushing a partition out into clustering column. From all the information I could gather[1] [2]

Re: Efficient Paging Option in Wide Rows

2016-04-24 Thread Clint Martin
I tend to agree with Carlos. Having multiple row keys and parallelizing your queries will tend to result in faster responses. Keeping partitions relatively small will also help your cluster to manage your data more efficiently, also resulting in better performance. One thing I would recommend is to

Re: Efficient Paging Option in Wide Rows

2016-04-24 Thread Carlos Alonso
Hi Anuj, That's a very good question and I'd like to hear an answer from anyone who can give a detailed answer, but in the mean time I'll try to give my two cents. First of all I think I'd rather split all the values into different partition keys for two reasons: 1.- If you're sure you're accessi

Re: Efficient Paging Option in Wide Rows

2016-04-23 Thread Anuj Wadehra
Hi, Can anyone take this question? Thanks, Anuj Sent from Yahoo Mail on Android On Sat, 23 Apr, 2016 at 2:30 PM, Anuj Wadehra wrote: I think I complicated the question.. so I am trying to put the question crisply.. We have a table defined with clustering key/column. We have 5 different

Re: Efficient Paging Option in Wide Rows

2016-04-23 Thread Anuj Wadehra
I think I complicated the question.. so I am trying to put the question crisply.. We have a table defined with clustering key/column. We have 5 different clustering key values. If we want to fetch all 5 rows, which query option would be faster and why? 1. Given a single primary key/partiti

Efficient Paging Option in Wide Rows

2016-04-22 Thread Anuj Wadehra
Hi, I have a wide row index table so that I can fetch all row keys corresponding to a column value.  Row of index_table will look like: ColValue1:bucket1 >> rowkey1, rowkey2.. rowkeyn..ColValue1:bucketn>> rowkey1, rowkey2.. rowkeyn We will have buckets to avoid hotspots. Row keys of main tabl
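Anuj's bucketed index layout can be written as a table (sketch; names follow the message, types are assumptions):

```sql
-- index_table: one partition per (column value, bucket); row keys of the
-- main table are stored as clustering columns under it. Buckets spread a
-- hot value over several partitions to avoid hotspots.
CREATE TABLE index_table (
    col_value text,
    bucket int,
    row_key text,
    PRIMARY KEY ((col_value, bucket), row_key)
);
```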

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-14 Thread Carlos Alonso
k retrieval. >> >> -- >> *From:* Jack Krupansky >> *To:* user@cassandra.apache.org >> *Sent:* Friday, March 11, 2016 7:25 PM >> >> *Subject:* Re: Strategy for dividing wide rows beyond just adding to the >> partition key >

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-12 Thread Jack Krupansky
is bulk retrieval. > > -- > *From:* Jack Krupansky > *To:* user@cassandra.apache.org > *Sent:* Friday, March 11, 2016 7:25 PM > > *Subject:* Re: Strategy for dividing wide rows beyond just adding to the > partition key > > Thanks, that level of query de

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-12 Thread Jason Kania
Krupansky To: user@cassandra.apache.org Sent: Friday, March 11, 2016 7:25 PM Subject: Re: Strategy for dividing wide rows beyond just adding to the partition key Thanks, that level of query detail gives us a better picture to focus on. I think through this some more over the weekend. Also

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-12 Thread Jason Kania
en we don't know where to start and end. Thanks, Jason From: Carlos Alonso To: "user@cassandra.apache.org" Sent: Friday, March 11, 2016 7:24 PM Subject: Re: Strategy for dividing wide rows beyond just adding to the partition key Hi Jason, If I understand correctly you h

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-11 Thread Jack Krupansky
hing we > considered too but we didn't find any detail on whether that would solve > our timeout problem. If there is a reference for using this approach, it > would be of interest to us to avoid any assumptions on how we would > approach it. > > A question: Is the probability of a timeo

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-11 Thread Carlos Alonso
ion of SELECT DISTINCT. We need to run queries such as >> getting the first or last 5000 sensor readings when we don't know the time >> frame at which they occurred so cannot directly supply the timeShard >> portion of our partition key. >> >> I appreciate your input, >> >> Thanks, >>

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-11 Thread Jason Kania
roach it. A question: Is the probability of a timeout directly linked to a longer seek time in reading through a partition's contents? If that is the case, splitting the partition keys into a separate table would be straightforward. Regards, Jason From: Jack Krupansky To: user@cassandra

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-11 Thread Jack Krupansky
w the time > frame at which they occurred so cannot directly supply the timeShard > portion of our partition key. > > I appreciate your input, > > Thanks, > > Jason > > ------ > *From:* Jack Krupansky > *To:* "user@cassandra.apache.o

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-11 Thread Jason Kania
rectly supply the timeShard portion of our partition key. I appreciate your input, Thanks, Jason From: Jack Krupansky To: "user@cassandra.apache.org" Sent: Friday, March 11, 2016 4:45 PM Subject: Re: Strategy for dividing wide rows beyond just adding to the partition key

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-11 Thread Jack Krupansky
the scope with a > where clause. > > If there is a recommended pattern that solves this, we haven't come across > it. > > I hope makes the problem clearer. > > Thanks, > > Jason > > -- > *From:* Jack Krupansky > *To:* user@ca

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-10 Thread Jonathan Haddad
all partition >> keys. >> >> Hopefully this is clearer. >> >> Again, any suggestions would be appreciated. >> >> Thanks, >> >> Jason >> >> -- >> *From:* Jonathan Haddad >> *To:* user@cassandra.apache.org

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-10 Thread Jonathan Haddad
t; > Thanks, > > Jason > > -- > *From:* Jonathan Haddad > *To:* user@cassandra.apache.org; Jason Kania > *Sent:* Thursday, March 10, 2016 11:21 AM > *Subject:* Re: Strategy for dividing wide rows beyond just adding to the > partition key >

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-10 Thread Jason Kania
ia Sent: Thursday, March 10, 2016 10:42 AM Subject: Re: Strategy for dividing wide rows beyond just adding to the partition key There is an effort underway to support wider rows: https://issues.apache.org/jira/browse/CASSANDRA-9754 This won't help you now though. Even with that improv

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-10 Thread Jason Kania
ld be appreciated. Thanks, Jason From: Jonathan Haddad To: user@cassandra.apache.org; Jason Kania Sent: Thursday, March 10, 2016 11:21 AM Subject: Re: Strategy for dividing wide rows beyond just adding to the partition key Have you considered making the date (or week, or whatever, some

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-10 Thread Jonathan Haddad
, just issue multiple async queries for each day you need. On Thu, Mar 10, 2016 at 5:57 AM Jason Kania wrote: > Hi, > > We have sensor input that creates very wide rows and operations on these > rows have started to timeout regularly. We have been trying to find a > solution to div

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-10 Thread Jack Krupansky
ditional tables. As a general proposition, Cassandra should not be used for heavy filtering - query tables with the filtering criteria baked into the PK is the way to go. -- Jack Krupansky On Thu, Mar 10, 2016 at 8:54 AM, Jason Kania wrote: > Hi, > > We have sensor input that cr

Strategy for dividing wide rows beyond just adding to the partition key

2016-03-10 Thread Jason Kania
Hi, We have sensor input that creates very wide rows and operations on these rows have started to timeout regularly. We have been trying to find a solution to dividing wide rows but keep hitting limitations that move the problem around instead of solving it. We have a partition key consisting of

Re: Wide rows best practices and GC impact

2014-12-04 Thread Jabbar Azam
Hello, I saw this earlier yesterday but didn't want to reply because I didn't know what the cause was. Basically I using wide rows with cassandra 1.x and was inserting data constantly. After about 18 hours the JVM would crash with a dump file. For some reason I removed the compaction

Re: Wide rows best practices and GC impact

2014-12-03 Thread Gianluca Borello
tion. On Dec 3, 2014 6:33 PM, "Robert Coli" wrote: > > On Tue, Dec 2, 2014 at 5:01 PM, Gianluca Borello wrote: >> >> We mainly store time series-like data, where each data point is a binary blob of 5-20KB. We use wide rows, and try to put in the same row all the data that

Re: Wide rows best practices and GC impact

2014-12-03 Thread Robert Coli
On Tue, Dec 2, 2014 at 5:01 PM, Gianluca Borello wrote: > We mainly store time series-like data, where each data point is a binary > blob of 5-20KB. We use wide rows, and try to put in the same row all the > data that we usually need in a single query (but not more than that). As a >

Wide rows best practices and GC impact

2014-12-02 Thread Gianluca Borello
Hi, We have a cluster (2.0.11) of 6 nodes (RF=3), c3.4xlarge instances, about 50 column families. Cassandra heap takes 8GB out of the 30GB of every instance. We mainly store time series-like data, where each data point is a binary blob of 5-20KB. We use wide rows, and try to put in the same row

Re: Re[2]: how wide can wide rows get?

2014-11-13 Thread Takenori Sato
in some of our rows and it's ok. > > -- Original Message -- > From: "Hannu Kröger" > To: "user@cassandra.apache.org" > Sent: 14.11.2014 16:13:49 > Subject: Re: how wide can wide rows get? > > > The theoretical limit is maybe 2 billion but

Re[2]: how wide can wide rows get?

2014-11-13 Thread Plotnik, Alexey
We have 380k of them in some of our rows and it's ok. -- Original Message -- From: "Hannu Kröger" To: "user@cassandra.apache.org" Sent: 14.11.2014 16:13:49 Subject: Re: how wide can wid

Re: how wide can wide rows get?

2014-11-13 Thread Joe Ramsey
You can have up to 2 billion columns but there are some considerations. This article might be of some help. http://www.ebaytechblog.com/2012/08/14/cassandra-data-modeling-best-practices-part-2/#.VGWdT4enCS0

Re: how wide can wide rows get?

2014-11-13 Thread Hannu Kröger
The theoretical limit is maybe 2 billion but recommended max is around 10-20 thousand. Br, Hannu > On 14.11.2014, at 8.10, Adaryl Bob Wakefield, MBA > wrote: > > I’m struggling with this wide row business. Is there an upward limit on the > number of columns you can have? > > Adaryl "Bob"

how wide can wide rows get?

2014-11-13 Thread Adaryl "Bob" Wakefield, MBA
I’m struggling with this wide row business. Is there an upward limit on the number of columns you can have? Adaryl "Bob" Wakefield, MBA Principal Mass Street Analytics 913.938.6685 www.linkedin.com/in/bobwakefieldmba Twitter: @BobLovesData

Re: Wide Rows - Data Model Design

2014-09-19 Thread DuyHai Doan
record_data for each > distinct test_id. > > > On Fri, Sep 19, 2014 at 8:48 AM, DuyHai Doan wrote: > >> "Does my above table falls under the category of wide rows in Cassandra >> or not?" --> It depends on the cardinality. For each distinct test_id, how >

Re: Wide Rows - Data Model Design

2014-09-19 Thread Check Peck
ve table falls under the category of wide rows in Cassandra > or not?" --> It depends on the cardinality. For each distinct test_id, how > many combinations of client_name/record_data do you have ? > > By the way, why do you put the record_data as part of primary key ? > > In

Re: Wide Rows - Data Model Design

2014-09-19 Thread DuyHai Doan
"Does my above table falls under the category of wide rows in Cassandra or not?" --> It depends on the cardinality. For each distinct test_id, how many combinations of client_name/record_data do you have ? By the way, why do you put the record_data as part of primary key ?

Re: Wide Rows - Data Model Design

2014-09-19 Thread Jonathan Lacefield
Datastax/about> <http://feeds.feedburner.com/datastax> <https://github.com/datastax/> On Fri, Sep 19, 2014 at 10:41 AM, Check Peck wrote: > I am trying to use wide rows concept in my data modelling design for > Cassandra. We are using Cassandra 2.0.6. > > CREATE TABLE test_data

Wide Rows - Data Model Design

2014-09-19 Thread Check Peck
I am trying to use wide rows concept in my data modelling design for Cassandra. We are using Cassandra 2.0.6. CREATE TABLE test_data ( test_id int, client_name text, record_data text, creation_date timestamp, last_modified_date timestamp, PRIMARY KEY

Re: CQL 3 and wide rows

2014-05-20 Thread Maciej Miklas
Thank you Nate - now I understand it ! This is real improvement when compared to CLI :) Regards, Maciej On 20 May 2014, at 17:16, Nate McCall wrote: > Something like this might work: > > > cqlsh:my_keyspace> CREATE TABLE my_widerow ( > ... id text, > ...

Re: CQL 3 and wide rows

2014-05-20 Thread Nate McCall
Something like this might work: cqlsh:my_keyspace> CREATE TABLE my_widerow ( ... id text, ... my_col timeuuid, ... PRIMARY KEY (id, my_col) ... ) WITH caching='KEYS_ONLY' AND ... compaction={'class': 'Lev
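Nate's example cuts off mid-option; a plausible completion, assuming LeveledCompactionStrategy was the compaction class being set (the caching value matches the 2.x-era syntax in the snippet):

```sql
CREATE TABLE my_widerow (
    id text,
    my_col timeuuid,
    PRIMARY KEY (id, my_col)
) WITH caching = 'KEYS_ONLY'
  AND compaction = {'class': 'LeveledCompactionStrategy'};
```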

Re: CQL 3 and wide rows

2014-05-20 Thread Maciej Miklas
if I can use wide rows. My idea of a wide row is a row that can hold a large amount of key-value pairs (in any form), where I can filter on those keys to efficiently load only the part which I currently need. Regards, Maciej On 20 May 2014, at 09:06, Aaron Morton wrote: > In a CQL 3 tab

Re: CQL 3 and wide rows

2014-05-20 Thread Maciej Miklas
imary key”. > > -- Jack Krupansky > > From: Aaron Morton > Sent: Tuesday, May 20, 2014 3:06 AM > To: Cassandra User > Subject: Re: CQL 3 and wide rows > > In a CQL 3 table the only **column** names are the ones defined in the table, > in the example below there are three

Re: CQL 3 and wide rows

2014-05-20 Thread Jack Krupansky
To: Cassandra User Subject: Re: CQL 3 and wide rows In a CQL 3 table the only **column** names are the ones defined in the table, in the example below there are three column names. CREATE TABLE keyspace.widerow ( row_key text, wide_row_column text, data_column text

Re: CQL 3 and wide rows

2014-05-20 Thread Aaron Morton
L 3 you would use some > static names plus Map or Set structures, or you could still alter table and > have large number of columns. But still - I do not see Iteration, so it looks > to me that CQL 3 is limited when compared to CLI/Hector. > > > Regards, > Maciej > >

Re: CQL 3 and wide rows

2014-05-19 Thread Maciej Miklas
: > Maciej, > > In CQL3 "wide rows" are expected to be created using clustering columns. So > while the schema will have a relatively smaller number of named columns, the > effect is a wide row. For example: > > CREATE TABLE keyspace.widerow ( > ro

Re: CQL 3 and wide rows

2014-05-19 Thread Maciej Miklas
/dev/blog/does-cql-support-dynamic-columns-wide-rows > > -- Jack Krupansky > > From: Maciej Miklas > Sent: Monday, May 19, 2014 11:20 AM > To: user@cassandra.apache.org > Subject: CQL 3 and wide rows > > Hi *, > > I’ve checked DataStax driver code for CQL 3, a

Re: CQL 3 and wide rows

2014-05-19 Thread Jack Krupansky
You might want to review this blog post on supporting dynamic columns in CQL3, which points out that “the way to model dynamic cells in CQL is with a compound primary key.” See: http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows -- Jack Krupansky From: Maciej Miklas

RE: CQL 3 and wide rows

2014-05-19 Thread James Campbell
Maciej, In CQL3 "wide rows" are expected to be created using clustering columns. So while the schema will have a relatively smaller number of named columns, the effect is a wide row. For example: CREATE TABLE keyspace.widerow ( row_key text, wide_row_column text, data_c

CQL 3 and wide rows

2014-05-19 Thread Maciej Miklas
Hi *, I’ve checked DataStax driver code for CQL 3, and it looks like the column names for a particular table are fully loaded into memory, is this true? Cassandra should support wide rows, meaning tables with millions of columns. Knowing that, I would expect a kind of iterator for column names. Am I

CqlPagingInputFormat: paging through wide rows

2014-04-16 Thread Paolo Estrella
Hello, I've just upgraded to Cassandra 1.2.16. I've also started using the CqlPagingInputFormat within my map/reduce tasks. I have a question with regard to using CqlPagingInputFormat for paging through wide rows. I don't see a way to input more than one column at a time int

Re: how wide to make wide rows in practice?

2013-12-18 Thread Lee Mighdoll
Hi Rob, thanks for the refresher, and the the issue link (fixed today too- thanks Sylvain!). Cheers, Lee On Wed, Dec 18, 2013 at 10:47 AM, Robert Coli wrote: > On Wed, Dec 18, 2013 at 9:26 AM, Lee Mighdoll wrote: > >> What's the current cassandra 2.0 advice on sizing for wide storage engine >

Re: how wide to make wide rows in practice?

2013-12-18 Thread Robert Coli
On Wed, Dec 18, 2013 at 9:26 AM, Lee Mighdoll wrote: > What's the current cassandra 2.0 advice on sizing for wide storage engine > rows? Can we drop the added complexity of managing day/hour partitioning > for time series stores? > "A few hundred megs" at very most is generally recommended. in_

how wide to make wide rows in practice?

2013-12-18 Thread Lee Mighdoll
I think the recommendation once upon a time was to keep wide storage engine internal rows from growing too large. e.g. for time series, it was recommended to partition samples by day or by hour to keep the size manageable. What's the current cassandra 2.0 advice on sizing for wide storage engine

Re: Wide rows (time series data) and ORM

2013-10-23 Thread Les Hartzman
ts > > > I am working as a core commitor in Kundera, please do let me know if you > have any query. > > Sincerely, > -Vivek > > > > On Wed, Oct 23, 2013 at 10:41 PM, Les Hartzman wrote: > >> Hi Vivek, >> >> What I'm looking for are a couple

Re: Wide rows (time series data) and ORM

2013-10-23 Thread Vivek Mishra
#x27;m looking for are a couple of things as I'm gaining an > understanding of Cassandra. With wide rows and time series data, how do you > (or can you) handle this data in an ORM manner? Now I understand that with > CQL3, doing a "select * from time_series_data" will retur

Re: Wide rows (time series data) and ORM

2013-10-23 Thread Hiller, Dean
.apache.org" Date: Wednesday, October 23, 2013 11:12 AM To: "user@cassandra.apache.org" Subject: Re: Wide rows (time series da

Re: Wide rows (time series data) and ORM

2013-10-23 Thread Les Hartzman
Thanks Dean. I'll check that page out. Les On Wed, Oct 23, 2013 at 7:52 AM, Hiller, Dean wrote: > PlayOrm supports different types of wide rows like embedded list in the > object, etc. etc. There is a list of nosql patterns mixed with playorm > patterns on this page > >

Re: Wide rows (time series data) and ORM

2013-10-23 Thread Les Hartzman
Hi Vivek, What I'm looking for are a couple of things as I'm gaining an understanding of Cassandra. With wide rows and time series data, how do you (or can you) handle this data in an ORM manner? Now I understand that with CQL3, doing a "select * from time_series_data" wil

Re: Wide rows (time series data) and ORM

2013-10-23 Thread Hiller, Dean
PlayOrm supports different types of wide rows like embedded list in the object, etc. etc. There is a list of nosql patterns mixed with playorm patterns on this page http://buffalosw.com/wiki/patterns-page/ From: Les Hartzman Reply-To: "user@cassandra

Re: Wide rows (time series data) and ORM

2013-10-23 Thread Vivek Mishra
Can Kundera work with wide rows in an ORM manner? What specifically you looking for? Composite column based implementation can be built using Kundera. With Recent CQL3 developments, Kundera supports most of these. I think POJO needs to be aware of number of fields needs to be persisted(Same as

Wide rows (time series data) and ORM

2013-10-22 Thread Les Hartzman
As I'm becoming more familiar with Cassandra I'm still trying to shift my thinking from relational to NoSQL. Can Kundera work with wide rows in an ORM manner? In other words, can you actually design a POJO that fits the standard recipe for JPA usage? Would the queries return collecti

Re: Wide rows/composite keys clarification needed

2013-10-21 Thread Les Hartzman
So I just saw a post about how Kundera translates all JPQL to CQL. On Mon, Oct 21, 2013 at 4:45 PM, Jon Haddad wrote: > If you're working with CQL, you don't need to worry about the column > names, it's handled for you. > > If you specify multiple keys as part of the primary key, they become >

Re: Wide rows/composite keys clarification needed

2013-10-21 Thread Les Hartzman
What if you plan on using Kundera and JPQL and not CQL? Les On Oct 21, 2013 4:45 PM, "Jon Haddad" wrote: > If you're working with CQL, you don't need to worry about the column > names, it's handled for you. > > If you specify multiple keys as part of the primary key, they become > clustering key

Re: Wide rows/composite keys clarification needed

2013-10-21 Thread Jon Haddad
If you're working with CQL, you don't need to worry about the column names, it's handled for you. If you specify multiple keys as part of the primary key, they become clustering keys and are mapped to the column names. So if you have a sensor_id / time_stamp, all your sensor readings will be i

Re: Wide rows/composite keys clarification needed

2013-10-21 Thread Les Hartzman
So looking at Patrick McFadin's data modeling videos I now know about using compound keys as a way of partitioning data on a by-day basis. My other questions probably go more to the storage engine itself. How do you refer to the columns in the wide row? What kind of names are assigned to the colum

Wide rows/composite keys clarification needed

2013-10-20 Thread Les Hartzman
Please correct me if I'm not describing this correctly. But if I am collecting sensor data and have a table defined as follows: create table sensor_data ( sensor_id int, time_stamp int, // time to the hour granularity voltage float,
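Les's CREATE TABLE is truncated; a plausible completion in the spirit of the compound-key pattern discussed in the replies (the choice of time_stamp as the clustering column is an assumption, not from the message):

```sql
CREATE TABLE sensor_data (
    sensor_id int,
    time_stamp int,   -- time to the hour granularity
    voltage float,
    PRIMARY KEY ((sensor_id), time_stamp)
);
```

With this layout, each sensor_id is one storage-engine partition and each hourly reading becomes a clustered row within it.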

Re: token(), limit and wide rows

2013-08-17 Thread Richard Low
at > > > http://fossies.org/dox/apache-cassandra-1.2.8-src/CqlPagingRecordReader_8java_source.html > > - Jon > > > On Fri, Aug 16, 2013 at 12:08 PM, Keith Freeman <8fo...@gmail.com> wrote: > >> I've run into the same problem, surprised nobody

Re: token(), limit and wide rows

2013-08-16 Thread Jonathan Rhone
into the same problem, surprised nobody's responded to you. Any > time someone asks "how do I page through all the rows of a table in CQL3?", > the standard answer is token() and limit. But as you point out, this > method will often miss some data from wide rows. > > May

Re: token(), limit and wide rows

2013-08-16 Thread Keith Freeman
I've run into the same problem, surprised nobody's responded to you. Any time someone asks "how do I page through all the rows of a table in CQL3?", the standard answer is token() and limit. But as you point out, this method will often miss some data from wide rows. Mayb

token(), limit and wide rows

2013-08-13 Thread Jan Algermissen
Hi, ok, so I found token() [1], and that it is an option for paging through randomly partitioned data. I take it that combining token() and LIMIT is the CQL3 idiom for paging (set aside the fact that one shouldn't really want to page and use C*) Now, when I page through a CF with wide
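The token()+LIMIT idiom Jan describes looks like this (sketch; table and key names are hypothetical). As the replies note, it pages by partition token, so it can skip the remaining rows inside the last wide partition of a page:

```sql
-- First page:
SELECT * FROM ks.wide_table LIMIT 100;

-- Next page: resume strictly after the last partition key seen. Because
-- this advances by token (i.e. by partition), any rows still unread inside
-- that last wide partition are missed -- the problem raised in this thread.
SELECT * FROM ks.wide_table
 WHERE token(pk) > token('last_pk_seen')
 LIMIT 100;
```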

Hadoop - using SlicePredicate with wide rows

2013-07-31 Thread Adam Masters
: // this will cause the predicate to be ignored in favor of scanning everything as a wide row ConfigHelper.setInputColumnFamily(job.getConfiguration(), KEYSPACE, COLUMN_FAMILY, true); This suggests that ignoring the SlicePredicate for wide rows is by design - and this is certainly the behavior I
