Re: Query around Data Modelling -2

2022-07-01 Thread Bowen Song via user
*From:* Bowen Song *Sent:* Friday, July 1, 2022 08:48 *To:* user@cassandra.apache.org *Subject:* Re: Query around Data Modelling -2

Re: Query around Data Modelling -2

2022-07-01 Thread MyWorld

RE: Query around Data Modelling -2

2022-06-30 Thread Michiel Saelen
From: Bowen Song Sent: Friday, July 1, 2022 08:48 To: user@cassandra.apache.org Subject: Re: Query around Data Modelling -2

Re: Query around Data Modelling -2

2022-06-30 Thread Bowen Song
e auto-compaction on the table and is relying on weekly scheduled compactions? Or running weekly major compactions? Neither of these sounds right. On 30/06/2022 15:03, MyWorld wrote: Hi all, Another query around data Modelling. We have a existing table with below

Re: Query around Data Modelling -2

2022-06-30 Thread MyWorld
ng on weekly scheduled > compactions? Or running weekly major compactions? Neither of these sounds > right. > On 30/06/2022 15:03, MyWorld wrote: > > Hi all, > > Another query around data Modelling. > > We have a existing table with below structure: > Table(PK,CK, col1,col2,

Re: Query around Data Modelling -2

2022-06-30 Thread Bowen Song
06/2022 15:03, MyWorld wrote: Hi all, Another query around data Modelling. We have a existing table with below structure: Table(PK,CK, col1,col2, col3, col4,col5) Now each Pk here have 1k - 10k Clustering keys. Each PK has size from 10MB to 80MB. We have overall 100+ millions partitions. Also we

Re: Query around Data Modelling -2

2022-06-30 Thread MyWorld
all, > > Another query around data Modelling. > > We have a existing table with below structure: > Table(PK,CK, col1,col2, col3, col4,col5) > > Now each Pk here have 1k - 10k Clustering keys. Each PK has size from 10MB > to 80MB. We have overall 100+ millions partitions. Als

Re: Query around Data Modelling -2

2022-06-30 Thread Jeff Jirsa
other query around data Modelling. > > We have a existing table with below structure: > Table(PK,CK, col1,col2, col3, col4,col5) > > Now each Pk here have 1k - 10k Clustering keys. Each PK has size from 10MB to > 80MB. We have overall 100+ millions partitions. Also we have set le

Query around Data Modelling -2

2022-06-30 Thread MyWorld
Hi all, Another query around data Modelling. We have an existing table with the structure below: Table(PK,CK, col1,col2, col3, col4,col5) Now each PK here has 1k - 10k clustering keys. Each PK has size from 10MB to 80MB. We have 100+ million partitions overall. Also we have set levelled

Re: Query around Data Modelling

2022-06-22 Thread MyWorld
e is still > under 100 MB > > On Thu, Jun 23, 2022, 7:18 AM Jeff Jirsa wrote: > >> How many rows per partition in each model? >> >> >> > On Jun 22, 2022, at 6:38 PM, MyWorld wrote: >> > >> > >> > Hi all, >> > >>

Re: Query around Data Modelling

2022-06-22 Thread Jeff Jirsa
7:18 AM Jeff Jirsa wrote: >>> How many rows per partition in each model? >>> >>> >>> > On Jun 22, 2022, at 6:38 PM, MyWorld wrote: >>> > >>> > >>> > Hi all, >>> > >>> > Just a small query aroun

Re: Query around Data Modelling

2022-06-22 Thread Jeff Jirsa
022, at 6:38 PM, MyWorld wrote: >> > >> > >> > Hi all, >> > >> > Just a small query around data Modelling. >> > Suppose we have to design the data model for 2 different use cases which >> > will query the data on same set of (partition

Re: Query around Data Modelling

2022-06-22 Thread MyWorld
ach model? > > > > On Jun 22, 2022, at 6:38 PM, MyWorld wrote: > > > > > > Hi all, > > > > Just a small query around data Modelling. > > Suppose we have to design the data model for 2 different use cases which > will query the data on same set

RE: Query around Data Modelling

2022-06-22 Thread Michiel Saelen
From: MyWorld Sent: Thursday, June 23, 2022 09:38 To: user@cassandra.apache.org Subject: Query around Data Modelling

Re: Query around Data Modelling

2022-06-22 Thread manish khandelwal
Table1 should be fine. If some column values are not entered, then Cassandra will not create entries for them, so the partition will be almost the same in both cases. On Thu, Jun 23, 2022, 07:08 MyWorld wrote: > Hi all, > > Just a small query around data Modelling. > Suppose we have to design th

Re: Query around Data Modelling

2022-06-22 Thread Jeff Jirsa
How many rows per partition in each model? > On Jun 22, 2022, at 6:38 PM, MyWorld wrote: > > > Hi all, > > Just a small query around data Modelling. > Suppose we have to design the data model for 2 different use cases which will > query the data on same set of (par

Query around Data Modelling

2022-06-22 Thread MyWorld
Hi all, Just a small query around data Modelling. Suppose we have to design the data model for 2 different use cases which will query the data on the same set of (partition+clustering key). So should we maintain a separate table for each, or a single table? Model1 - Combined table Table(Pk,CK, col1

Re: data modelling

2019-03-05 Thread Stefan Miklosovic
" for having queries super fast and tailored for your use case. I suggest to read more about data modelling in general. On Wed, 6 Mar 2019 at 11:19, Bobbie Haynes wrote: > Hi >Could you help modelling this usecase > >I have below table ..I will update tagid's

RE: data modelling

2019-03-05 Thread Kenneth Brotman
the query? If you could have tagid not be a collection, and make it part of the primary key, that would help a lot. From: Kenneth Brotman [mailto:kenbrot...@yahoo.com.INVALID] Sent: Tuesday, March 05, 2019 4:33 PM To: user@cassandra.apache.org Subject: RE: data modelling Hi Bobbie

RE: data modelling

2019-03-05 Thread Kenneth Brotman
&q=jeff%20carpenter%20chapter%205&f=false From: Bobbie Haynes [mailto:haynes30...@gmail.com] Sent: Tuesday, March 05, 2019 4:19 PM To: user@cassandra.apache.org Subject: data modelling Hi Could you help modelling this usecase I have below table ..I will update tagid

data modelling

2019-03-05 Thread Bobbie Haynes
Hi Could you help modelling this usecase I have below table ..I will update tagid's columns set(bigint) based on PK. I have created the secondary index column on tagid to query like below.. Select * from keyspace.customer_sensor_tagids where tagids CONTAINS 11358097; this query is doing th

RE: [EXTERNAL] Re: Reg:- Data modelling For E-Commerce Pattern data modelling for Search

2017-12-28 Thread Durity, Sean R
modelling For E-Commerce Pattern data modelling for Search Hi -- you want to use Elasticsearch with a Cassandra store for the blob data. On Thu, Dec 7, 2017 at 7:39 PM, @Nandan@ mailto:nandanpriyadarshi...@gmail.com>> wrote: Hi Peoples, As currently around the world 60-70% websites are exc

Reg:- Data Modelling Concept ** Amazon Video **

2017-12-21 Thread @Nandan@
Hi All, For Self Exploring, I am trying to do data modeling for Amazon Video [For learning purpose] and trying to check, is it possible to do data modeling for Amazon Video or not. Below are the details:- Amazon Video Contains different columns such as:- 1) Video_title -> One VIdeo One Title 2)

Re: Reg:- Data modelling For E-Commerce Pattern data modelling for Search

2017-12-08 Thread Bradford Stephens
Hi -- you want to use Elasticsearch with a Cassandra store for the blob data. On Thu, Dec 7, 2017 at 7:39 PM, @Nandan@ wrote: > Hi Peoples, > > As currently around the world 60-70% websites are excelling with > E-commerce in which we have to store huge amount of data and select pattern > based o

Re: Reg:- Data modelling For E-Commerce Pattern data modelling for Search

2017-12-07 Thread Jon Haddad
You’re going to have duplicate data no matter what you do. Creating indexes is another representation of the data, it’s not free. Yes, storing it in two places is more work, but I’ve typically had to do that anyways. My search queries are almost never an exact match to my Cassandra data mod

Re: Reg:- Data modelling For E-Commerce Pattern data modelling for Search

2017-12-07 Thread @Nandan@
Thanks. But again my questions come back at the same place that how to do data modeling because If we will do denormalized then we have to allow a lot of data duplication, as well as Insert and Update, will also need to think because based on this we have to insert data into multiple tables at sam

Re: Reg:- Data modelling For E-Commerce Pattern data modelling for Search

2017-12-07 Thread Jon Haddad
I mean ES is great as a search engine. I would use Cassandra as my source of truth, and also index my data in ES. I typed my original message before I walked my dog, I should have also pointed out https://github.com/strapdata/elassandra and https://gi

Re: Reg:- Data modelling For E-Commerce Pattern data modelling for Search

2017-12-07 Thread @Nandan@
Hi Jon, Do you mean Elastic search for storing data or Data should be store into Cassandra and use Elastic Search for Select records from tables. ? On Fri, Dec 8, 2017 at 9:50 AM, Jon Haddad wrote: > 1. No, Apache Cassandra is pretty terrible for search on it’s own. Even > with SASI. > 2. Mayb

Re: Reg:- Data modelling For E-Commerce Pattern data modelling for Search

2017-12-07 Thread Jon Haddad
1. No, Apache Cassandra is pretty terrible for search on its own. Even with SASI. 2. Maybe, but it’s complicated, and doing it right takes a lot of experience. I’d use Elastic Search instead. > On Dec 7, 2017, at 5:39 PM, @Nandan@ wrote: > > Hi Peoples, > > As currently around the world

Reg:- Data modelling For E-Commerce Pattern data modelling for Search

2017-12-07 Thread @Nandan@
Hi Peoples, As currently around the world 60-70% websites are excelling with E-commerce in which we have to store huge amount of data and select pattern based on Partial Search, Text match, Full-Text Search and all. So below questions comes to mind : 1) Is Cassandra a correct choice for data mode

Re: Help in c* Data modelling

2017-07-23 Thread @Nandan@
Hi , The best way will go with per query per table plan.. and distribute the common column into both tables. This will help you to support queries as well as Read and Write will be fast. Only Drawback will be, you have to insert common data into both tables at the same time which can be easily hand

Re: Help in c* Data modelling

2017-07-23 Thread Jonathan Haddad
Using a different table to answer each query is the correct answer here assuming there's a significant amount of data. If you don't have that much data, maybe you should consider using a database like Postgres which gives you query flexibility instead of horizontal scalability. On Sun, Jul 23, 201

Re: Help in c* Data modelling

2017-07-23 Thread techpyaasa .
Hi vladyu/varunbarala Instead of creating second table as you said can I just have one(first) table below and get all rows with status=0. CREATE TABLE IF NOT EXISTS test.user ( account_id bigint, pid bigint, disp_name text, status int, PRIMARY KEY (account_id, pid) ) WITH CLUSTERING ORDER BY (pid

Re: Help in c* Data modelling

2017-07-23 Thread Vladimir Yudovin
Hi, unfortunately ORDER BY is supported for clustering columns only... Winguzone - Cloud Cassandra Hosting On Sun, 23 Jul 2017 12:49:36 -0400 techpyaasa . wrote Hi Varun, Thanks a lot for your reply. In this case if I want to update status(st

Re: Help in c* Data modelling

2017-07-23 Thread techpyaasa .
Hi Varun, Thanks a lot for your reply. In this case if I want to update status(status can be updated for given account_id, pid) , I need to delete existing row in 2nd table & add new one... :( :( Its like hitting cassandra twice for 1 change.. :( On Sun, Jul 23, 2017 at 8:42 PM, Varun Barala

Re: Help in c* Data modelling

2017-07-23 Thread Varun Barala
Hi, You can create pseudo index table. IMO, structure can be:- CREATE TABLE IF NOT EXISTS test.user ( account_id bigint, pid bigint, disp_name text, status int, PRIMARY KEY (account_id, pid) ) WITH CLUSTERING ORDER BY (pid ASC); CREATE TABLE IF NOT EXISTS test.user_index ( account_id bigint, pi

Help in c* Data modelling

2017-07-22 Thread techpyaasa .
Hi , We have a table like below : CREATE TABLE ks.cf ( accountId bigint, pid bigint, dispName text, status > int, PRIMARY KEY (accountId, pid) ) WITH CLUSTERING ORDER BY (pid ASC); We would like to have following queries possible on the above table: select * from site24x7.wm_current_status wh

Reg:- Data Modelling Conceptual [DISCUSS]

2017-06-22 Thread @Nandan@
Hi All, I am working on the data model. Just want to discuss based on below conditions with valid pros and cons. Requirment:- 1) User Registration Module 1.1) Multi types of Users such as Buyer, Seller. 1.2) Registration pages are different for different types of users which contain different numb

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
Yes I am not thinking to go with MV. I am trying to implement by myself. May be some idea will get about doing cassandra-stress about data generation and all. Thanks Jonathan. On Tue, Jun 13, 2017 at 10:44 AM, Jonathan Haddad wrote: > Unless you're willing to put in a lot of time fixing bugs, I'

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread Jonathan Haddad
Unless you're willing to put in a lot of time fixing bugs, I'd recommend avoiding 3.0's materialized views and manage them yourself. On Mon, Jun 12, 2017 at 6:11 PM @Nandan@ wrote: > Correct, Our first concern is to store huge READ and WRITE, for that > Cassandra is our First and Best Choice. Bu

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
Correct, Our first concern is to store huge READ and WRITE, for that Cassandra is our First and Best Choice. But according to Use Case, we need to implement Advance search like Partial text, Phrase search etc.. So we are thinking the best way, that how to implement data model. On Tue, Jun 13, 201

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
Hi Michael , MV is also good option when we have to select based on equality search, but here condition is to developing a model for advance partial search way. And Also , In case of MV, suppose we have 2 DC with 3 Nodes on each DC then MV will replicated data based on 6*6 times which will be anoth

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
Ok , Then let's try to implement and will check by using cassandra-stress to check what will be performance. I worked on another data model for book storage for my company, with same situations like having 1 single table with 80 columns and primary key as bookid uuid. Implemented Solr on top of th

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread Oskar Kjellin
Agree, I meant as Jonathan said to use C* for primary key and as a primary storage and ES as an indexed version of what you have in cassandra. 2017-06-12 19:19 GMT+02:00 DuyHai Doan : > Sorry, I misread some reply I had the impression that people recommend ES > as primary datastore > > On Mon, Ju

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread DuyHai Doan
Sorry, I misread some reply I had the impression that people recommend ES as primary datastore On Mon, Jun 12, 2017 at 7:12 PM, Jonathan Haddad wrote: > Nobody is promoting ES as a primary datastore in this thread. Every > mention of it is to accompany C*. > > > > On Mon, Jun 12, 2017 at 10:03

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread Jonathan Haddad
Nobody is promoting ES as a primary datastore in this thread. Every mention of it is to accompany C*. On Mon, Jun 12, 2017 at 10:03 AM DuyHai Doan wrote: > For all those promoting ES as a PRIMARY datastore, please read this before: > > https://discuss.elastic.co/t/elasticsearch-as-a-primary-d

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread DuyHai Doan
For all those promoting ES as a PRIMARY datastore, please read this before: https://discuss.elastic.co/t/elasticsearch-as-a-primary-database/85733/13 There are a lot of warning before recommending ES as a datastore. The answer from Pilato, ES official evangelist: - You absolutely care about

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread Michael Mior
For queries 1-5 this seems like a potentially good use case for materialized views. Create one table with the videos stored by ID and the materialized views for each of the queries. -- Michael Mior mm...@apache.org 2017-06-11 22:40 GMT-04:00 @Nandan@ : > Hi, > > Currently, I am working on data

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread Jason Brown
removing dev@ from this conversation, as the thread is more appropriately for user@ On Mon, Jun 12, 2017 at 4:51 AM, Eduardo Alonso wrote: > -Virtual tokens are not recommended when using SOLR or > cassandra-lucene-index. > > If you use your table schema you will not have any problem with partit

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread Eduardo Alonso
-Virtual tokens are not recommended when using SOLR or cassandra-lucene-index. If you use your table schema you will not have any problem with partition size because your table is *not* a WIDE row table (it does not have clustering keys) The limit for 1 record with those 15 or 20 columns must not

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
And due to single table videos, maybe it will go with around 15,20 columns, then we need to also think very carefully about partition sizes also. On Mon, Jun 12, 2017 at 6:33 PM, @Nandan@ wrote: > Yes this is only Option I am also thinking like this as my second options. > Before this I was thin

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
Yes this is only Option I am also thinking like this as my second options. Before this I was thinking to do denormalize table based on search columns, but due to partial search this will be not that effective. Now suppose , if we are going with this single table as videos. and implemented with Sol

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread Eduardo Alonso
Using cassandra collections CREATE TABLE videos ( videoid uuid primary key, title text, actor list, producer list, release_date timestamp, description text, music text, etc... ); When using collection you need to take care of its length. Collections are designed to store

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
So In short we have to go with one single table as videos and put primary key as videoid uuid. But then how can we able to handle multiple actor name and producer name. ? On Mon, Jun 12, 2017 at 5:51 PM, Eduardo Alonso wrote: > Yes, you are right. > > Table denormalization is useful just when yo

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread Eduardo Alonso
Yes, you are right. Table denormalization is useful just when you have unique primary keys, not your case. Denormalized tables are only different in its primary key, every denormalized table contains all the data (it just change how it is structured). So, if you need to index it, do it with just o

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
Hi Eduardo, And As we are trying to build an advanced search functionality in which we can able to do partial search based on actor, producer, director, etc. columns. So if we do denormalization of tables then we have to create tables such as below :- video_by_actor video_by_producer video_by_dire

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
Hi Edurado, As you mentioned queries 1-6 , In this condition, we have to proceed with a table like as below :- create table videos ( videoid uuid primary key, title text, actor text, producer text, release_date timestamp, description text, music text, etc... ); This table will help to store video

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread Eduardo Alonso
TLDR shouldBe *PD Eduardo Alonso Vía de las dos Castillas, 33, Ática 4, 3ª Planta 28224 Pozuelo de Alarcón, Madrid Tel: +34 91 828 6473 // www.stratio.com // *@stratiobd * 2017-06-12 10:58 GMT+02:00 Eduardo Alonso : > Hi Nandan: > > So, your system must provide the

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread Eduardo Alonso
Hi Nandan: So, your system must provide these queries: 1 - SELECT video FROM ... WHERE actor = '...'; 2 - SELECT video FROM ... WHERE producer = '...'; 3 - SELECT video FROM ... WHERE music = '...'; 4 - SELECT video FROM ... WHERE actor = '...' AND producer ='...'; 5 - SELECT video FROM ... WHERE

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
But Condition is , I am working with Apache Cassandra Database in which I have to store my data into Cassandra and then have to implement partial search capability. If we need to search based on full search primary key, then it really best and easy to work with Cassandra , but in case of flexible

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread Oskar Kjellin
I haven't run solr with Cassandra myself. I just meant to run elasticsearch as a completely separate service and write there as well. > On 12 Jun 2017, at 09:45, @Nandan@ wrote: > > Do you mean to use Elastic Search with Cassandra? > Even I am thinking to use Apache Solr With Cassandra. > In

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
Do you mean to use Elastic Search with Cassandra? Even I am thinking to use Apache Solr With Cassandra. In that case I have to create distributed tables such as:- 1) video_by_title, video_by_actor, video_by_year etc.. 2) After creating Tables , will have to configure solr core on all tables. Is i

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread Oskar Kjellin
Why not elasticsearch for this use case? It will make your life much simpler > On 12 Jun 2017, at 04:40, @Nandan@ wrote: > > Hi, > > Currently, I am working on data modeling for Video Company in which we have > different types of users as well as different user functionality. > But currentl

Reg:- Cassandra Data modelling for Search

2017-06-11 Thread @Nandan@
Hi, Currently, I am working on data modeling for Video Company in which we have different types of users as well as different user functionality. But currently, my concern is about Search video module based on different fields. Query patterns are as below:- 1) Select video by actor. 2) select vid

Re: Reg:- Data Modelling For Hierarchy Data

2017-06-09 Thread @Nandan@
ent:* vendredi 9 juin 2017 10:27 > *To:* Jacques-Henri Berthemet > *Cc:* user@cassandra.apache.org > *Subject:* Re: Reg:- Data Modelling For Hierarchy Data > > > > Hi, > > Yes, I am following with single Users table. > > Suppose my query patterns are:- > > 1)

RE: Reg:- Data Modelling For Hierarchy Data

2017-06-09 Thread Jacques-Henri Berthemet
-materialized-views/ I don’t have experience on MVs, I’m stuck on 2.2 for now. Regards, -- Jacques-Henri Berthemet From: @Nandan@ [mailto:nandanpriyadarshi...@gmail.com] Sent: vendredi 9 juin 2017 10:27 To: Jacques-Henri Berthemet Cc: user@cassandra.apache.org Subject: Re: Reg:- Data Modelling For

Re: Reg:- Data Modelling For Hierarchy Data

2017-06-09 Thread @Nandan@
ng as all of them don’t exceed 64k, but you > could create dedicate columns for all attributes that you know will always > be there. > > > > *--* > > *Jacques-Henri Berthemet* > > > > *From:* @Nandan@ [mailto:nandanpriyadarshi...@gmail.com] > *Sent:* vendredi

RE: Reg:- Data Modelling For Hierarchy Data

2017-06-09 Thread Jacques-Henri Berthemet
exceed 64k, but you could create dedicate columns for all attributes that you know will always be there. -- Jacques-Henri Berthemet From: @Nandan@ [mailto:nandanpriyadarshi...@gmail.com] Sent: vendredi 9 juin 2017 03:14 To: user@cassandra.apache.org Subject: Reg:- Data Modelling For Hierarchy Data

Reg:- Data Modelling For Hierarchy Data

2017-06-08 Thread @Nandan@
Hi, I am working on Music database where we have multiple order of users of our portal. Different category of users is having some common attributes but some different attributes based on their registration. This becomes a hierarchy pattern. I am attaching one sample hierarchy pattern of User Modu

Re: Reg:- Data Modelling Concepts

2017-05-20 Thread Gedeon Kamga
One issue that will be encountered with this data model is the unbounded partition growth. Partition will continue to grow indefinitely over time and there will be a risk to hit the limit of 2 billions columns per partition. Consider a composite partition key. Thanks, Gedeon On Wed, May 17, 201
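The composite partition key suggested above is commonly built by adding a time bucket next to the natural key, so that no single partition keeps growing forever. A minimal Python sketch of that idea follows; the monthly bucket, the function name, and the `book_title` key are assumptions for illustration and are not taken from the thread.

```python
from datetime import datetime, timezone
from typing import Tuple


def bucketed_partition_key(book_title: str, updated_at: datetime) -> Tuple[str, str]:
    """Return a (title, month bucket) pair to use as a composite partition key.

    Hypothetical helper: the bucket granularity (one month) is an assumption;
    pick it so that one bucket stays comfortably small for your write rate.
    """
    # Normalise to UTC so the same instant always lands in the same bucket.
    bucket = updated_at.astimezone(timezone.utc).strftime("%Y-%m")
    return (book_title, bucket)


may = bucketed_partition_key("Dune", datetime(2017, 5, 20, tzinfo=timezone.utc))
june = bucketed_partition_key("Dune", datetime(2017, 6, 1, tzinfo=timezone.utc))
```

Updates made in different months land in different partitions, at the cost that a reader has to know (or iterate over) the buckets it wants.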

Reg:- Data Modelling Documentation

2017-05-18 Thread @Nandan@
Hi Team, Just as Information, When Data modeling Document will be published on official link. Waiting for its so long time. Please update the document. Currently no any documents present. And please put document, which can be understood by the absolute beginner as well as others also. Thanks in A

Re: Reg:- Data Modelling Concepts

2017-05-17 Thread Anthony Grasso
Hi Nandan, If there is a requirement to answer a query "What are the changes to a book made by a particular user?", then yes the schema you have proposed can work. To obtain the list of updates for a book by a user from the *book_title_by_user* table will require the partition key (*book_title*),

Re: Reg:- Data Modelling Concepts

2017-05-16 Thread @Nandan@
Hi Jon, We need to keep tracking of all updates like 'User' of our platform can check what changes made before. I am thinking in this way.. CREATE TABLE book_info ( book_id uuid, book_title text, author_name text, updated_at timestamp, PRIMARY KEY(book_id)); This table will contain details about a

Re: Reg:- Data Modelling Concepts

2017-05-16 Thread Jonathan Haddad
Sorry, I hit return a little early. What you want is called "event sourcing": https://martinfowler.com/eaaDev/EventSourcing.html Think of it as time series applied to state (instead of mutable state) CREATE TABLE book ( name text, ts timeuuid, author text, primary key(bookid, ts) ); for example
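To make the event sourcing idea concrete, here is a small Python sketch, not tied to any Cassandra driver. It assumes each stored row is an event of the form (timestamp, changed fields); that shape and the function names are hypothetical. Only new values are stored, and both the current state and the history of a field are derived by replaying events in time order.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical event shape: (timestamp, {field: new_value}).
Event = Tuple[int, Dict[str, str]]


def current_state(events: Iterable[Event]) -> Dict[str, str]:
    """Replay events oldest-first; later values overwrite earlier ones."""
    state: Dict[str, str] = {}
    for _, changes in sorted(events, key=lambda e: e[0]):
        state.update(changes)
    return state


def history_of(events: Iterable[Event], field: str) -> List[str]:
    """Return the successive values of one field, oldest first."""
    ordered = sorted(events, key=lambda e: e[0])
    return [changes[field] for _, changes in ordered if field in changes]


events = [
    (2, {"author": "B"}),
    (1, {"author": "A", "name": "X"}),
    (3, {"author": "C"}),
]
```

With these events, `history_of(events, "author")` yields the A, B, C sequence without any old value being stored a second time.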

Re: Reg:- Data Modelling Concepts

2017-05-16 Thread Jonathan Haddad
I don't understand why you need to store the old value a second time. If you know that the value went from A -> B -> C, just store the new value, not the old. You can see that it changed from A->B->C without storing it twice. On Tue, May 16, 2017 at 6:36 PM @Nandan@ wrote: > The requirement is

Reg:- Data Modelling Concepts

2017-05-16 Thread @Nandan@
The requirement is to create DB in which we have to keep data of Updated values as well as which user update the particular book details and what they update. We are like to create a schema which store book info, as well as the history of the update, made based on book_title, author, publisher, pr

Re: Reg:- Data Modelling based on Update History details

2017-05-15 Thread Anthony Grasso
Hi Nandan, Interesting project! One thing that helps define the schema is knowing what queries will be made to the database up front. It sounds like you have an idea already of what those queries will be. I want to confirm that these are the queries that the database needs to answer. - *What

Reg:- Data Modelling based on Update History details

2017-05-14 Thread @Nandan@
Hi , I am currently working on Book Management System in which I have a table which contains Books details in which PRIMARY KEY is book_id uuid. The requirement is to create DB in which we have to keep data of Updated values as well as which user update the particular book details and what they upd

Re: Query on Data Modelling of a specific usecase

2017-04-20 Thread Naresh Yadav
Hi Jon, Thanks for your guidance. In above mentioned table i can have different scale depending on Report. One report may have 1 rows. Second report may have half million rows. Third report may have 1 million rows. Fourth report may have 10 million rows. As this is timeseries data that was

Re: Query on Data Modelling of a specific usecase

2017-04-19 Thread Jon Haddad
How much data do you plan to store in each table? I’ll be honest, this doesn’t sound like a Cassandra use case at first glance. 1 table per report x 1000 is going to be a bad time. Odds are with different queries, you’ll need multiple views, so lets call that a handful of tables per report.

Re: Query on Data Modelling of a specific usecase

2017-04-18 Thread Naresh Yadav
Looking for cassandra expert's recommendation on above usecase, please reply. On Mon, Apr 17, 2017 at 7:37 PM, Naresh Yadav wrote: > Hi all, > > This is my existing table configured on apache-cassandra-3.0.9: > > CREATE TABLE report_id1 ( >mc_id text, >tag_id text, >e_date timestamp.

Query on Data Modelling of a specific usecase

2017-04-17 Thread Naresh Yadav
Hi all, This is my existing table configured on apache-cassandra-3.0.9: CREATE TABLE report_id1 ( mc_id text, tag_id text, e_date timestamp, value text, PRIMARY KEY ((mc_id, tag_id), e_date) ); I create table dynamically for each report from application. Need to support upto 1000 re

Re: Help with data modelling (from MySQL to Cassandra)

2017-03-27 Thread Zoltan Lorincz
Great suggestion! Thanks Avi! On Mon, Mar 27, 2017 at 3:47 PM, Avi Kivity wrote: > You can use static columns to and just one table: > > > CREATE TABLE documents ( > > doc_id uuid, > > element_id uuid, > > description text static, > > doc_title text static, > > element_title

Re: Help with data modelling (from MySQL to Cassandra)

2017-03-27 Thread Avi Kivity
You can use static columns and just one table: CREATE TABLE documents ( doc_id uuid, element_id uuid, description text static, doc_title text static, element_title text, PRIMARY KEY (doc_id, element_id) ); The static columns are present once per unique doc_id.

Re: Help with data modelling (from MySQL to Cassandra)

2017-03-27 Thread Zoltan Lorincz
Thank you Matija, because i am newbie, it was not clear for me that i am able to query by the partition key (not providing the clustering key), sorry about that! Zoltan. On Mon, Mar 27, 2017 at 1:54 PM, Matija Gobec wrote: > Thats exactly what I described. IN queries can be used sometimes but I

Re: Help with data modelling (from MySQL to Cassandra)

2017-03-27 Thread Matija Gobec
Thats exactly what I described. IN queries can be used sometimes but I usually run parallel async as Alexander explained. On Mon, Mar 27, 2017 at 12:08 PM, Zoltan Lorincz wrote: > Hi Alexander, > > thank you for your help! I think we found the answer: > > CREATE TABLE documents ( > doc_id uu

Re: Help with data modelling (from MySQL to Cassandra)

2017-03-27 Thread Zoltan Lorincz
Hi Alexander, thank you for your help! I think we found the answer: CREATE TABLE documents ( doc_id uuid, description text, title text, PRIMARY KEY (doc_id) ); CREATE TABLE nodes ( doc_id uuid, element_id uuid, title text, PRIMARY KEY (doc_id, element_id) ); We

Re: Help with data modelling (from MySQL to Cassandra)

2017-03-26 Thread Alexander Dejanovski
Hi Zoltan, you must try to avoid multi partition queries as much as possible. Instead, use asynchronous queries to grab several partitions concurrently. Try to send no more than ~100 queries at the same time to avoid DDOS-ing your cluster. This would leave you roughly with 1000+ async queries gro
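The advice above (one query per partition, issued asynchronously, with a cap on how many are in flight) can be sketched in Python with `asyncio`. This is an illustration under assumptions: `fetch_one` is a hypothetical stand-in for whatever asynchronous single-partition read your driver offers, and the default cap of 100 simply mirrors the figure in the message.

```python
import asyncio
from typing import Any, Awaitable, Callable, Iterable, List


async def fetch_all(
    keys: Iterable[Any],
    fetch_one: Callable[[Any], Awaitable[Any]],
    max_in_flight: int = 100,
) -> List[Any]:
    """Run fetch_one for every key, never more than max_in_flight at once."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def guarded(key: Any) -> Any:
        # Wait for a free slot before issuing the query.
        async with semaphore:
            return await fetch_one(key)

    # gather returns results in the same order as the input keys.
    return await asyncio.gather(*(guarded(key) for key in keys))


async def fake_fetch(key: int) -> int:
    """Placeholder for a real single-partition read (assumption)."""
    await asyncio.sleep(0)
    return key * 2


results = asyncio.run(fetch_all(range(10), fake_fetch, max_in_flight=3))
```

The semaphore bounds the load on the cluster while still letting slow partitions overlap with fast ones, which a single large `IN` query cannot do.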

Re: Help with data modelling (from MySQL to Cassandra)

2017-03-26 Thread Zoltan Lorincz
Querying by (doc_id and element_id ) OR just by (element_id) is fine, but the real question is, will it be efficient to query 100k+ primary keys in the elements table? e.g. SELECT * FROM elements WHERE element_id IN (element_id1, element_id2, element_id3, element_id100K+) ? The elements_id

Re: Help with data modelling (from MySQL to Cassandra)

2017-03-26 Thread Matija Gobec
Have one table hold document metadata (doc_id, title, description, ...) and have another table elements where partition key is doc_id and clustering key is element_id. Only problem here is if you need to query and/or update element just by element_id but I don't know your queries up front. On Sun,

Help with data modelling (from MySQL to Cassandra)

2017-03-26 Thread Zoltan Lorincz
Dear cassandra users, We have the following structure in MySql: documents->[doc_id(primary key), title, description] elements->[element_id(primary key), doc_id(index), title, description] Notation: table name->[column1(key or index), column2, …] We want to transfer the data to Cassandra. Each

Estimating partition size for C*2.X and C*3.X and Time Series Data Modelling.

2016-06-20 Thread G P
f the same tables in MSSQL to C* is not recommended due to the way C*2.X stores its data. I took the DS220: Data Modelling Course, that showcases two formulas for estimating a partition size based on the Table design. [cid:image003.png@01D1CB16.9A41FD30] [cid:image004.png@01D1CB16.9A41FD30] Not

Re: Data modelling, including cleanup

2016-04-10 Thread Bo Finnerup Madsen
materialised views so that you don’t need > to keep two tables up to date manually: > http://www.datastax.com/dev/blog/new-in-cassandra-3-0-materialized-views > > Hannu > > On 17 Mar 2016, at 12:05, Bo Finnerup Madsen > wrote: > > Hi, > > We are pretty new to data

Re: Data modelling, including cleanup

2016-03-19 Thread Hannu Kröger
log/new-in-cassandra-3-0-materialized-views> Hannu > On 17 Mar 2016, at 12:05, Bo Finnerup Madsen wrote: > > Hi, > > We are pretty new to data modelling in cassandra, and are having a bit of a > challenge creating a model that caters both for queries and updates. >

Data modelling, including cleanup

2016-03-19 Thread Bo Finnerup Madsen
Hi, We are pretty new to data modelling in cassandra, and are having a bit of a challenge creating a model that caters both for queries and updates. Let me try to explain it using the users example from http://www.datastax.com/dev/blog/basic-rules-of-cassandra-data-modeling They define two

Re: Example Data Modelling

2015-07-11 Thread Jérôme Mainaud
>>>>> On Tue, Jul 7, 2015 at 11:32 AM, Peer, Oded wrote: >>>>> >>>>>> The data model suggested isn’t optimal for the “end of month” query >>>>>> you want to run since you are not querying by partition key. >>>>>> >

Re: Example Data Modelling

2015-07-08 Thread Saladi Naidu
month as clustering and keep employee details as static columns so they wont be repeated  Naidu Saladi From: Srinivasa T N To: "user@cassandra.apache.org" Sent: Tuesday, July 7, 2015 3:07 AM Subject: Re: Example Data Modelling Thanks for the inputs. Now my question is how

Re: Example Data Modelling

2015-07-07 Thread John Sanda
or a specific month which might > cause hotspots on those nodes. > > > > Choose the approach that works best for you. > > > > > > *From:* Carlos Alonso [mailto:i...@mrcalonso.com > ] > *Sent:* Monday, July 06, 2015 7:04 PM > *To:* user@cassandra.apache.or

Re: Example Data Modelling

2015-07-07 Thread Carlos Alonso
>>>>> where month = 1” which requires filtering and has unpredictable >>>>> performance. >>>>> >>>>> >>>>> >>>>> For this type of query to be fast you can use the “month” column as >>>>> the pa

Re: Example Data Modelling

2015-07-07 Thread Jérôme Mainaud
This approach also has drawbacks: >>>> >>>> 1. This data model creates a wide row. Depending on the number of >>>> employees this partition might be very large. You should limit partition >>>> sizes to 25MB >>>> >>>> 2. D
