Re: data model to store large volume syslog

2013-03-13 Thread Aaron Turner
On Wed, Mar 13, 2013 at 4:23 AM, Mohan L wrote:
>
> On Fri, Mar 8, 2013 at 9:42 PM, aaron morton wrote:
>>
>> > 1). create a column family 'cfrawlog' which stores raw log as received.
>> > row key could be 'ddmmhh'(new row is added for each hour or less), each
>> > 'column name' is uuid w

Re: data model to store large volume syslog

2013-03-13 Thread Mohan L
On Fri, Mar 8, 2013 at 9:42 PM, aaron morton wrote:
>
> 1). create a column family 'cfrawlog' which stores raw log as received.
> row key could be 'ddmmhh'(new row is added for each hour or less), each
> 'column name' is uuid with 'value' is raw log data. Since we are also going
> to use this

Re: data model to store large volume syslog

2013-03-08 Thread aaron morton
> 1). create a column family 'cfrawlog' which stores raw log as received. row
> key could be 'ddmmhh'(new row is added for each hour or less), each
> 'column name' is uuid with 'value' is raw log data. Since we are also going
> to use this log for forensics purpose, so it will help us to hav
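
For readers skimming the thread, the layout in the quoted proposal can be sketched in a few lines of Python. This is only an illustration, not code from any poster: a plain dict stands in for the 'cfrawlog' column family, `hourly_row_key` and `store_raw_log` are hypothetical helper names, and using a time-based (version 1) UUID for the column name is an assumption, since the quoted text says only "uuid".

```python
import uuid
from datetime import datetime, timezone


def hourly_row_key(ts):
    """Build a row key in the quoted 'ddmmhh' form: day, month, hour."""
    return ts.strftime("%d%m%H")


def store_raw_log(cf, raw_line, ts):
    """Store one raw log line under its hourly row.

    cf is a dict standing in for the column family:
    {row_key: {column_name: raw_line}}.
    """
    row_key = hourly_row_key(ts)
    column_name = uuid.uuid1()  # assumption: time-based UUID as the column name
    cf.setdefault(row_key, {})[column_name] = raw_line
    return row_key, column_name


cfrawlog = {}
ts = datetime(2013, 3, 8, 21, 42, tzinfo=timezone.utc)
key, col = store_raw_log(cfrawlog, "<34>Mar  8 21:42:00 host sshd[42]: example", ts)
```

Every line received in the same hour lands in the same row, which is the property the replies in this thread discuss.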

RE: data model to store large volume syslog

2013-03-07 Thread moshe.kranc
A row key based on the hour will create write hot spots: for an entire hour, all writes go to the same node, i.e., the node where the row resides. You need a row key that distributes writes evenly across all your C* nodes, e.g., time concatenated with a sequence cou
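
One way to picture the hot-spot concern is a row key that combines the hour with a small bucket number, so that a single hour maps to several rows instead of one. The sketch below is an assumption-laden illustration, not the poster's scheme: the snippet above is cut off before the scheme is fully described, and `NUM_BUCKETS`, the `:` separator, and `bucketed_row_key` are all hypothetical.

```python
from collections import Counter
from datetime import datetime

NUM_BUCKETS = 8  # hypothetical; would be tuned to cluster size and write rate


def bucketed_row_key(ts, seq, num_buckets=NUM_BUCKETS):
    """Row key made of the hour ('ddmmhh') plus a bucket derived from a sequence number."""
    return f"{ts.strftime('%d%m%H')}:{seq % num_buckets}"


# Simulate 1000 writes arriving within the same hour.
ts = datetime(2013, 3, 7, 10, 0)
keys = [bucketed_row_key(ts, seq) for seq in range(1000)]
spread = Counter(keys)  # how many writes each row key receives
```

With a hash-based partitioner, the distinct row keys for one hour can be placed on different nodes, so writes are no longer funneled into a single row. The trade-off is on the read side: fetching one hour of logs means reading all `NUM_BUCKETS` rows and merging the results.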