Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-14 Thread Carlos Alonso
k retrieval. >> >> ------------------ >> *From:* Jack Krupansky >> *To:* user@cassandra.apache.org >> *Sent:* Friday, March 11, 2016 7:25 PM >> >> *Subject:* Re: Strategy for dividing wide rows beyond just adding to the >> partition key >

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-12 Thread Jack Krupansky
is bulk retrieval. > > -- > *From:* Jack Krupansky > *To:* user@cassandra.apache.org > *Sent:* Friday, March 11, 2016 7:25 PM > > *Subject:* Re: Strategy for dividing wide rows beyond just adding to the > partition key > > Thanks, that level of query de

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-12 Thread Jason Kania
Krupansky To: user@cassandra.apache.org Sent: Friday, March 11, 2016 7:25 PM Subject: Re: Strategy for dividing wide rows beyond just adding to the partition key Thanks, that level of query detail gives us a better picture to focus on. I'll think through this some more over the weekend. Also

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-12 Thread Jason Kania
en we don't know where to start and end. Thanks, Jason From: Carlos Alonso To: "user@cassandra.apache.org" Sent: Friday, March 11, 2016 7:24 PM Subject: Re: Strategy for dividing wide rows beyond just adding to the partition key Hi Jason, If I understand correctly you h

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-11 Thread Jack Krupansky
> where timeShard is a combination of year and week of year > > For known time range based queries, this works great. However, the > specific problem is in knowing the maximum and minimum timeShard values > when we want to select the entire range of data. Our understanding is that >
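For illustration only, here is a minimal Python sketch of the "known time range" case described above. It assumes timeShard is encoded as ISO year * 100 + ISO week number; the thread does not say how the value is actually built, so the encoding and the helper names `time_shard` and `shards_in_range` are hypothetical.

```python
from datetime import date, timedelta
from typing import List


def time_shard(d: date) -> int:
    """Hypothetical encoding: ISO year * 100 + ISO week number."""
    iso_year, iso_week, _ = d.isocalendar()
    return iso_year * 100 + iso_week


def shards_in_range(start: date, end: date) -> List[int]:
    """Return every timeShard touched by the inclusive range [start, end]."""
    if start > end:
        raise ValueError("start must not be after end")
    shards = []
    # Step back to the Monday of the start date's ISO week,
    # then advance one week at a time until we pass the end date.
    cursor = start - timedelta(days=start.isoweekday() - 1)
    while cursor <= end:
        shards.append(time_shard(cursor))
        cursor += timedelta(days=7)
    return shards
```

With the bounds known, each returned value can be supplied as the timeShard portion of the partition key, one query per shard. The sketch does not address the open question in the thread, which is how to find the minimum and maximum shard values when the bounds are not known.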

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-11 Thread Carlos Alonso
ion of SELECT DISTINCT. We need to run queries such as >> getting the first or last 5000 sensor readings when we don't know the time >> frame at which they occurred so cannot directly supply the timeShard >> portion of our partition key. >> >> I appreciate your input, >> >> Thanks, >>

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-11 Thread Jason Kania
roach it. A question: Is the probability of a timeout directly linked to a longer seek time in reading through a partition's contents? If that is the case, splitting the partition keys into a separate table would be straightforward. Regards, Jason From: Jack Krupansky To: user@cassandra

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-11 Thread Jack Krupansky
w the time > frame at which they occurred so cannot directly supply the timeShard > portion of our partition key. > > I appreciate your input, > > Thanks, > > Jason > > ---------- > *From:* Jack Krupansky > *To:* "user@cassandra.apache.o

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-11 Thread Jason Kania
rectly supply the timeShard portion of our partition key. I appreciate your input, Thanks, Jason From: Jack Krupansky To: "user@cassandra.apache.org" Sent: Friday, March 11, 2016 4:45 PM Subject: Re: Strategy for dividing wide rows beyond just adding to the partition key

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-11 Thread Jack Krupansky
the scope with a > where clause. > > If there is a recommended pattern that solves this, we haven't come across > it. > > I hope this makes the problem clearer. > > Thanks, > > Jason > > -- > *From:* Jack Krupansky > *To:* user@ca

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-10 Thread Jonathan Haddad
all partition >> keys. >> >> Hopefully this is clearer. >> >> Again, any suggestions would be appreciated. >> >> Thanks, >> >> Jason >> >> -- >> *From:* Jonathan Haddad >> *To:* user@cassandra.apache.org

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-10 Thread Jonathan Haddad
> Thanks, > > Jason > > -- > *From:* Jonathan Haddad > *To:* user@cassandra.apache.org; Jason Kania > *Sent:* Thursday, March 10, 2016 11:21 AM > *Subject:* Re: Strategy for dividing wide rows beyond just adding to the > partition key >

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-10 Thread Jason Kania
ia Sent: Thursday, March 10, 2016 10:42 AM Subject: Re: Strategy for dividing wide rows beyond just adding to the partition key There is an effort underway to support wider rows: https://issues.apache.org/jira/browse/CASSANDRA-9754 This won't help you now though. Even with that improv

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-10 Thread Jason Kania
ld be appreciated. Thanks, Jason From: Jonathan Haddad To: user@cassandra.apache.org; Jason Kania Sent: Thursday, March 10, 2016 11:21 AM Subject: Re: Strategy for dividing wide rows beyond just adding to the partition key Have you considered making the date (or week, or whatever, some

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-10 Thread Jonathan Haddad
Have you considered making the date (or week, or whatever, some time component) part of your partition key? Something like: create table sensordata ( sensor_id int, day date, ts timestamp, reading int, primary key ((sensor_id, day), ts) ); Then if you know you need data by a particular date range, j

Re: Strategy for dividing wide rows beyond just adding to the partition key

2016-03-10 Thread Jack Krupansky
There is an effort underway to support wider rows: https://issues.apache.org/jira/browse/CASSANDRA-9754 This won't help you now though. Even with that improvement you still may need a more optimal data model since large-scale scanning/filtering is always a very bad idea with Cassandra. The data m

Strategy for dividing wide rows beyond just adding to the partition key

2016-03-10 Thread Jason Kania
Hi, We have sensor input that creates very wide rows, and operations on these rows have started to time out regularly. We have been trying to find a solution to dividing wide rows but keep hitting limitations that move the problem around instead of solving it. We have a partition key consisting of