Re: Range scan performance in 0.6.0 beta2

2010-03-29 Thread Jonathan Ellis
I see what you mean -- you have understood correctly. On Mon, Mar 29, 2010 at 8:13 AM, Henrik Schröder wrote: > On Mon, Mar 29, 2010 at 14:15, Jonathan Ellis wrote: >> >> On Mon, Mar 29, 2010 at 4:06 AM, Henrik Schröder >> wrote: >> > On Fri, Mar 26, 2010 at 14:47, Jonathan Ellis wrote: >> >>

Re: Range scan performance in 0.6.0 beta2

2010-03-29 Thread Mike Malone
On Mon, Mar 29, 2010 at 7:13 AM, Henrik Schröder wrote: > On Mon, Mar 29, 2010 at 14:15, Jonathan Ellis wrote: > >> On Mon, Mar 29, 2010 at 4:06 AM, Henrik Schröder >> wrote: >> > On Fri, Mar 26, 2010 at 14:47, Jonathan Ellis >> wrote: >> >> It's a unique index then? And you're trying to read

Re: Range scan performance in 0.6.0 beta2

2010-03-29 Thread Henrik Schröder
On Mon, Mar 29, 2010 at 14:15, Jonathan Ellis wrote: > On Mon, Mar 29, 2010 at 4:06 AM, Henrik Schröder > wrote: > > On Fri, Mar 26, 2010 at 14:47, Jonathan Ellis wrote: > >> It's a unique index then? And you're trying to read things ordered by > >> the index, not just "give me keys with that

Re: Range scan performance in 0.6.0 beta2

2010-03-29 Thread Jonathan Ellis
On Mon, Mar 29, 2010 at 4:06 AM, Henrik Schröder wrote: > On Fri, Mar 26, 2010 at 14:47, Jonathan Ellis wrote: >> It's a unique index then?  And you're trying to read things ordered by >> the index, not just "give me keys with that have a column with this >> value?" > > Yes, because if we have mo

Re: Range scan performance in 0.6.0 beta2

2010-03-29 Thread Henrik Schröder
On Fri, Mar 26, 2010 at 14:47, Jonathan Ellis wrote: > On Fri, Mar 26, 2010 at 7:40 AM, Henrik Schröder > wrote: > > For each indexvalue we insert a row where the key is indexid + ":" + > > indexvalue encoded as hex string, and the row contains only one column, > > where the name is the object k

Re: Range scan performance in 0.6.0 beta2

2010-03-26 Thread Jonathan Ellis
On Fri, Mar 26, 2010 at 7:40 AM, Henrik Schröder wrote: > For each indexvalue we insert a row where the key is indexid + ":" + > indexvalue encoded as hex string, and the row contains only one column, > where the name is the object key encoded as a bytearray, and the value is > empty. It's a uniq
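The row layout quoted above can be sketched roughly as follows. This is only an illustration of the description in the message, not the poster's actual code: the function names `index_row_key` and `index_row` are hypothetical, and it assumes that only the index value (not the index id) is hex-encoded.

```python
import binascii


def index_row_key(index_id: str, index_value: bytes) -> str:
    """Build an index row key as indexid + ":" + hex(indexvalue).

    Assumption: only the index value is hex-encoded. Lowercase hex uses
    two characters per byte and keeps byte order, so keys within one
    index sort the same way as the raw values do.
    """
    return index_id + ":" + binascii.hexlify(index_value).decode("ascii")


def index_row(index_id: str, index_value: bytes, object_key: bytes):
    """Return (row key, columns) for one index entry.

    The row holds a single column: the name is the object key as bytes
    and the value is empty, as described in the message.
    """
    return index_row_key(index_id, index_value), {object_key: b""}


if __name__ == "__main__":
    key, columns = index_row("idx1", b"\x00\xff", b"object-42")
    print(key, columns)
```

Because the hex encoding preserves byte order, a range scan over row keys that share the same index id prefix would visit entries in index-value order, provided the cluster uses an order-preserving partitioner.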

Re: Range scan performance in 0.6.0 beta2

2010-03-26 Thread Henrik Schröder
> > So all the values for an entire index will be in one row? That > doesn't sound good. > > You really want to put each index [and each table] in its own CF, but > until we can do that dynamically (0.7) you could at least make the > index row keys a tuple of (indexid, indexvalue) and the column n

Re: Range scan performance in 0.6.0 beta2

2010-03-25 Thread Jonathan Ellis
On Thu, Mar 25, 2010 at 8:33 AM, Henrik Schröder wrote: > Hi everyone, > > We're trying to implement a virtual datastore for our users where they can > set up "tables" and "indexes" to store objects and have them indexed on > arbitrary properties. And we did a test implementation for Cassandra in

Re: Range scan performance in 0.6.0 beta2

2010-03-25 Thread Sylvain Lebresne
On Thu, Mar 25, 2010 at 5:31 PM, Henrik Schröder wrote: > On Thu, Mar 25, 2010 at 15:17, Sylvain Lebresne wrote: >> >> I don't know If that could play any role, but if ever you have >> disabled the assertions >> when running cassandra (that is, you removed the -ea line in >> cassandra.in.sh), the

Re: Range scan performance in 0.6.0 beta2

2010-03-25 Thread Nathan McCall
I noticed you turned key caching off in your ColumnFamily declaration. Have you tried experimenting with it turned on and playing with the key caching configuration? Also, have you looked at the JMX output to see which commands are pending execution? That is always helpful to me in hunting down bottlenecks. -Nate

Re: Range scan performance in 0.6.0 beta2

2010-03-25 Thread Henrik Schröder
On Thu, Mar 25, 2010 at 15:17, Sylvain Lebresne wrote: > I don't know If that could play any role, but if ever you have > disabled the assertions > when running cassandra (that is, you removed the -ea line in > cassandra.in.sh), there > was a bug in 0.6beta2 that will make read in row with lots o

Re: Range scan performance in 0.6.0 beta2

2010-03-25 Thread Sylvain Lebresne
I don't know if that could play any role, but if you have ever disabled assertions when running Cassandra (that is, you removed the -ea line in cassandra.in.sh), there was a bug in 0.6beta2 that makes reads in rows with lots of columns quite slow. Another problem you may have is if you have

Range scan performance in 0.6.0 beta2

2010-03-25 Thread Henrik Schröder
Hi everyone, We're trying to implement a virtual datastore for our users where they can set up "tables" and "indexes" to store objects and have them indexed on arbitrary properties. And we did a test implementation for Cassandra in the following way: Objects are stored in one columnfamily, each k