Re: Cassandra behaviour

2010-07-26 Thread tsuraan
> It's reading through keys in the index and adding offset information
> about roughly every 128th entry in RAM, in order to speed up reads.
> Performing a binary search in an sstable from scratch would be
> expensive. Because of the high cost of disk seeks, most storage
> systems use btrees with a
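The sampling idea in the quoted reply can be sketched in a few lines of Python. This is an illustration only, not Cassandra's code: `index_entries` is a hypothetical in-memory stand-in for the sorted on-disk index, and `build_sample` and `lookup` are names made up for this sketch. Every 128th key is kept in RAM, a lookup binary-searches the sampled keys, and then at most one interval of the full index is scanned.

```python
from bisect import bisect_right

SAMPLE_INTERVAL = 128  # "roughly every 128th entry", per the quoted reply


def build_sample(index_entries, interval=SAMPLE_INTERVAL):
    """Keep the key and position of every `interval`-th entry.

    `index_entries` is a list of (key, data_offset) pairs sorted by key;
    it stands in for the on-disk index (an assumption of this sketch).
    """
    sample_keys, sample_positions = [], []
    for position in range(0, len(index_entries), interval):
        sample_keys.append(index_entries[position][0])
        sample_positions.append(position)
    return sample_keys, sample_positions


def lookup(index_entries, sample, key, interval=SAMPLE_INTERVAL):
    """Return the data offset for `key`, or None if it is absent."""
    sample_keys, sample_positions = sample
    # Binary search the small in-memory sample for the last sampled key <= key.
    i = bisect_right(sample_keys, key) - 1
    if i < 0:
        return None  # key sorts before the first entry
    start = sample_positions[i]
    # Scan at most one interval of the full index (one short sequential read).
    for entry_key, offset in index_entries[start:start + interval]:
        if entry_key == key:
            return offset
        if entry_key > key:
            break
    return None
```

The trade-off is memory against scan length: a smaller interval means more sampled keys held in RAM but fewer index entries read per lookup.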

Re: Cassandra behaviour

2010-07-26 Thread tsuraan
> My guess:
> Your test is beating up your system. The system may need more memory
> or disk throughput or CPU in order to keep up with that particular
> test.
Yeah, I am testing on a pretty wimpy machine; I just wanted to get some practice getting Cassandra up and running, and I ran into this pro

Re: Cassandra behaviour

2010-07-26 Thread tsuraan
> Bloom filters are indeed linear in size with respect to the number of
> items (assuming a constant target false positive rate). While I have
> not looked at how Cassandra calculates the bloom filter sizes, I feel
> pretty confident in saying that it won't dynamically replace bloom
> filters with
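The linear relationship mentioned in the quoted reply follows from the standard textbook sizing formula for a Bloom filter, m = -n ln(p) / (ln 2)^2 bits for n items at false positive rate p. The sketch below uses that general formula; it is not taken from Cassandra's source, and `bloom_bits` is a hypothetical helper name.

```python
import math


def bloom_bits(n_items, false_positive_rate):
    """Bits needed by an optimally configured Bloom filter.

    Textbook formula: m = -n * ln(p) / (ln 2)^2. For a fixed p this is
    a constant number of bits per item, so size grows linearly with n.
    """
    return math.ceil(-n_items * math.log(false_positive_rate) / (math.log(2) ** 2))


one_million = bloom_bits(1_000_000, 0.01)
two_million = bloom_bits(2_000_000, 0.01)
print(one_million / 1_000_000)    # bits per item at a 1% false positive rate
print(two_million / one_million)  # doubling the items doubles the bits
```

At a 1% target rate this works out to a little under 10 bits per item, regardless of how many items the filter holds.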

Cassandra behaviour

2010-07-26 Thread tsuraan
I have a system where we're currently using Postgres for all our data storage needs, but on a large table the index checks for primary keys are really slowing us down on insert. Cassandra sounds like a good alternative (not saying Postgres and Cassandra are equivalent; just that I think they are b

0.7, 0.8 roadmaps

2010-07-06 Thread tsuraan
Is there a document anywhere with estimated release dates for the 0.7 and 0.8 versions of Cassandra? I've seen https://issues.apache.org/jira/browse/CASSANDRA/fixforversion/12314533, which indicates progress towards 0.7, but I haven't had much luck with finding date estimates. I'm especially inte

Re: Concurrent SuperColumn update question

2010-04-23 Thread tsuraan
> On Thu, Apr 22, 2010 at 11:34 AM, tsuraan wrote:
>> Suppose I have a SuperColumn CF where one of the SuperColumns in each
>> row is being treated as a list (e.g. keys only, values are just
>> empty).  In this list, values will only ever be added; deletion never
>

Concurrent SuperColumn update question

2010-04-22 Thread tsuraan
Suppose I have a SuperColumn CF where one of the SuperColumns in each row is being treated as a list (e.g. keys only, values are just empty). In this list, values will only ever be added; deletion never occurs. If I have two processes simultaneously add values to this list (on different nodes, wh

Re: Re: Modelling assets and user permissions

2010-04-20 Thread tsuraan
> It seems to me you might get by with putting the actual assets into
> Cassandra (possibly breaking them up into chunks depending on how big
> they are) and storing the pointers to them in Postgres along with all
> the other metadata.  If it were me, I'd split each file into a fixed
> chunksize an
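A rough sketch of the fixed-size chunking suggested in the quoted reply is shown here. The rest of that message is cut off, so everything beyond "split each file into a fixed chunk size" is an assumption: the 64 KiB default, the `asset_id:index` key layout, and the helper names `iter_chunks` and `chunk_columns` are all hypothetical, and no Cassandra client calls are shown.

```python
def iter_chunks(data: bytes, chunk_size: int = 64 * 1024):
    """Yield (chunk_index, chunk_bytes) pairs of at most `chunk_size` bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(data), chunk_size):
        yield start // chunk_size, data[start:start + chunk_size]


def chunk_columns(asset_id: str, data: bytes, chunk_size: int = 64 * 1024):
    """Map hypothetical chunk keys to chunk bytes for a single asset.

    Zero-padded indexes keep the keys in chunk order when sorted as text.
    """
    return {
        f"{asset_id}:{index:08d}": chunk
        for index, chunk in iter_chunks(data, chunk_size)
    }


def reassemble(columns: dict) -> bytes:
    """Rebuild the original bytes by joining chunks in key order."""
    return b"".join(columns[key] for key in sorted(columns))
```

Only the asset identifier and chunk count would then need to live alongside the other metadata; the chunk keys can be recomputed from those two values.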

Re: Re: Modelling assets and user permissions

2010-04-20 Thread tsuraan
> I'm curious as to how you would have so many asset / user permissions that
> you couldn't use a standard relational database to model them. Is this some
> sort of multi-tenant system where you're providing some generalized asset
> check-out mechanism to many, many customers? Even so, I'm not sure

Re: Modelling assets and user permissions

2010-04-20 Thread tsuraan
> Suppose I have a CF that holds some sort of assets that some users of
> my program have access to, and that some do not.  In SQL-ish terms it
> would look something like this:
>
> TABLE Assets (
>  asset_id serial primary key,
>  ...
> );
>
> TABLE Users (
>  user_id serial primary key,
>  user_n

Modelling assets and user permissions

2010-04-19 Thread tsuraan
Suppose I have a CF that holds some sort of assets that some users of my program have access to, and that some do not. In SQL-ish terms it would look something like this:
TABLE Assets (
 asset_id serial primary key,
 ...
);
TABLE Users (
 user_id serial primary key,
 user_name text
);
TABLE