Possible Bug in 0.8.2 with DynamicComposites?

2011-08-10 Thread Todd Nine
Hi guys,
  I've been dealing with a problem in my JPA plugin for a couple days now.
I've been able to create a native test in 0.8.2 that reproduces the issue.
Here is the test.


https://gist.github.com/3ce70eab8102d2555626


Essentially, here is what is happening.

A dynamic composite with the following ordering is created in a column

UTF8Type+BytesType(reversed=true).

Two columns are then inserted.  Without the composite encoding, these are the
two values:

"jeans" + 129384000L

"jeans" + 129409920L


Here are the byte values in hex (with spaces added to make the encoding of the
composite easier to read).  The format is a 2-byte comparator header (4 hex
digits), a 2-byte length (4 hex digits), n field bytes, and a 1-byte
end-of-component marker, then it repeats.

Inserted:

8073 0005 6a65616e73 00   8042 0008 012d4b889b80 00
8073 0005 6a65616e73 00   8042 0008 012d3c158780 00

Query start

8073 0005 6a65616e73 00

Query end

8073 0005 6a65616e73 01

Returned from Hector Results

8073 0005 6a65616e73 00   8042 0008 012d3c158780 00
8073 0005 6a65616e73 00   8042 0008 012d4b889b80 00
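
For readers following the hex dumps, here is a minimal Python sketch of the layout described above. It assumes each component is a 2-byte header (high bit set, low byte an alias character), a 2-byte big-endian length, the value bytes, and a 1-byte end-of-component marker. The name `parse_dynamic_composite` is made up for illustration. The long is written with two leading zero bytes on the assumption that the dumps above lost them, since the length field says 0008 but only 6 bytes are shown.

```python
import struct

def parse_dynamic_composite(raw: bytes):
    """Split an encoded composite name into (alias, value, end_of_component) tuples."""
    components = []
    pos = 0
    while pos < len(raw):
        # 2-byte header: high bit set means the low byte is a comparator alias.
        (header,) = struct.unpack_from(">H", raw, pos)
        pos += 2
        if not header & 0x8000:
            raise ValueError("full comparator names are not handled in this sketch")
        alias = chr(header & 0xFF)
        # 2-byte big-endian length, then the value itself.
        (length,) = struct.unpack_from(">H", raw, pos)
        pos += 2
        value = raw[pos:pos + length]
        pos += length
        # 1-byte end-of-component marker (00 in stored names, 01 in the query end).
        eoc = raw[pos]
        pos += 1
        components.append((alias, value, eoc))
    return components

# First inserted column; the "0000" prefix on the long is an assumption (see above).
name = bytes.fromhex(
    "8073" "0005" "6a65616e73" "00"
    "8042" "0008" "0000012d4b889b80" "00"
)
for alias, value, eoc in parse_dynamic_composite(name):
    print(alias, value.hex(), eoc)
```

With the aliases listed later in the thread, `s` would map to UTF8Type and `B` to BytesType(reversed=true).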


Given that the first value is sorted normally, and the second value is
reversed, I would expect the higher long value to appear before the lower
one (the longs are dates) when the first value in the composite is equal.
Is this the expected behavior, or is this a bug?
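
The expectation in the paragraph above can be made concrete with a small Python sketch. This is not Cassandra's comparator; it only models "first component ascending, second component descending" on the two second-component values as they appear in the dumps. The names `compare_names` and `expected` are hypothetical.

```python
from functools import cmp_to_key

def compare_names(a, b):
    """Order by first component ascending, then second component descending."""
    first_a, second_a = a
    first_b, second_b = b
    if first_a != first_b:
        return -1 if first_a < first_b else 1
    if second_a != second_b:
        # Reversed comparator: the larger byte string sorts first.
        return -1 if second_a > second_b else 1
    return 0

# The two columns in the order Hector returned them.
columns = [
    (b"jeans", bytes.fromhex("012d3c158780")),
    (b"jeans", bytes.fromhex("012d4b889b80")),
]
expected = sorted(columns, key=cmp_to_key(compare_names))
print([second.hex() for _, second in expected])
```

Under this model the column ending in 4b889b80 comes first, which is the opposite of the order returned by Hector.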

Thanks,
Todd


Re: Possible Bug in 0.8.2 with DynamicComposite range scans?

2011-08-10 Thread Sylvain Lebresne
Well, this seems to be on the Hector side.

I've tried the same example using the CLI, and:

[default@unknown] create keyspace test;
642e6f90-c336-11e0--242d50cf1fd5
Waiting for schema agreement...
... schemas agree across the cluster
[default@unknown] use test;
Authenticated to keyspace: test
[default@test] create column family foobar with
comparator=DynamicCompositeType and key_validation_class=AsciiType and
default_validation_class=AsciiType;
40032380-c337-11e0--242d50cf1fd5
Waiting for schema agreement...
... schemas agree across the cluster
[default@test] set foobar[k]['UTF8Type@jeans:BytesType(reversed=true)@1'] = a;
Value inserted.
[default@test] get foobar[k];
=> (column=UTF8Type@jeans:BytesType(reversed=true)@01, value=a,
timestamp=1312970389512000)
Returned 1 results.
[default@test] set foobar[k]['UTF8Type@jeans:BytesType(reversed=true)@2'] = a;
Value inserted.
[default@test] get foobar[k];
=> (column=UTF8Type@jeans:BytesType(reversed=true)@02, value=a,
timestamp=1312970410712000)
=> (column=UTF8Type@jeans:BytesType(reversed=true)@01, value=a,
timestamp=1312970389512000)
Returned 2 results.

Now, the last query is not exactly the one you do, since it does a full-row
query, but the CLI doesn't support setting the start and end of a slice.
However, I have tried hard-coding the exact query into the CLI (with
start='UTF8Type@jeans' and end='UTF8Type@jeans:!'), and it still returns the
columns in the right order (with the biggest second component first).

--
Sylvain



[VOTE] Release Apache Cassandra 0.8.4

2011-08-10 Thread Sylvain Lebresne
We just fixed a fairly serious bug with counters (CASSANDRA-3006 -- it is
serious in that it "corrupts" counters). Cassandra 0.8.3 also shipped with a
potential small upgrade problem (CASSANDRA-3011). There is no reason to wait
to give those to users (especially the counter fix), so I propose the
following artifacts for release as 0.8.4.

SVN: https://svn.apache.org/repos/asf/cassandra/branches/cassandra-0.8@r1156253
Artifacts: 
https://repository.apache.org/content/repositories/orgapachecassandra-015/org/apache/cassandra/apache-cassandra/0.8.4/
Staging repository:
https://repository.apache.org/content/repositories/orgapachecassandra-015/

The artifacts as well as a debian package are also available here:
http://people.apache.org/~slebresne/

Because 0.8.3 was released only 2 days ago, there haven't been many changes
since then, and said changes are mainly trivial, so I propose an expedited
vote of 24 hours (longer if needed).

[1]: http://goo.gl/DGZP5 (CHANGES.txt)
[2]: http://goo.gl/c8ZfD (NEWS.txt)


Re: Possible Bug in 0.8.2 with DynamicComposite range scans?

2011-08-10 Thread Todd Nine
Hi Sylvain,

  I noticed a couple of things that differ in your usage from what I'm
doing.  


I've defined the column family with these aliases.   I tried this in the
CLI, but it won't create the column family with these values.

DynamicComposite(a=>AsciiType,b=>BytesType,i=>IntegerType,x=>LexicalUUIDType,l=>LongType,t=>TimeUUIDType,s=>UTF8Type,u=>UUIDType,A=>AsciiType(reversed=true),B=>BytesType(reversed=true),I=>IntegerType(reversed=true),X=>LexicalUUIDType(reversed=true),L=>LongType(reversed=true),T=>TimeUUIDType(reversed=true),S=>UTF8Type(reversed=true),U=>UUIDType(reversed=true))


Using the aliases above, I initially got the same (correct) results as
you when the longs in my tests ranged from 0 to 9.  However, when I
inserted the longs that represent the dates, I started seeing this
error.  Can you try it again using the long values and the aliases in my
test?  These higher values seem to cause the issue.

129384000l
129409920l



Thanks,
Todd




Re: Possible Bug in 0.8.2 with DynamicComposite range scans?

2011-08-10 Thread Todd Nine
Hi Sylvain,
  I did a bit more digging, and I may have found the issue, but I
haven't yet determined the root cause.   This is all from the 0.8.2
release source.

When performing the range scan for my test the method
"getColumnComparator" on line 106 of the SliceQueryFilter is invoked.
It's using the BytesType comparator, so it is comparing the second
component.  

However, the "reversed" boolean flag is set to false, so it's not
correctly utilizing the columnReverseComparator instance when performing
range scans.

This seems to be a disconnect between a component being specified as
"reversed" in the composite itself and "reversed" being specified in the
range query.  For each component, wouldn't you need to do this?

reversed = user reversed ^ composite reversed

This is the table I came up with for range scanning.  True is forward,
false is reverse

User    Component   Scan direction
false   false       false
false   true        true
true    false       true
true    true        false




Thanks,


-- 
todd 
CHIEF SOFTWARE ENGINEER

todd nine| spidertracks ltd |  117a the square 
po box 5203 | palmerston north 4441 | new zealand 
P: +64 6 353 3395
E: t...@spidertracks.co.nz W: www.spidertracks.com 




Re: Possible Bug in 0.8.2 with DynamicComposite range scans?

2011-08-10 Thread Todd Nine
Correction to my truth table.  True is reversed, false is forward.  I
just caught that typo.


User    Component   Scan direction
false   false       false
false   true        true
true    false       true
true    true        false
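
The rule proposed earlier in the thread, reversed = user reversed ^ composite reversed, reproduces this table. A minimal Python sketch follows, with `effective_reversed` as a hypothetical helper name rather than anything in the Cassandra source:

```python
from itertools import product

def effective_reversed(user_reversed: bool, component_reversed: bool) -> bool:
    """XOR of the two flags: exactly one of them set means scan in reverse."""
    return user_reversed ^ component_reversed

print("User   Component  Scan direction")
for user, component in product([False, True], repeat=2):
    print(f"{user!s:<6} {component!s:<10} {effective_reversed(user, component)}")
```

Here True means reversed and False means forward, matching the corrected reading of the table.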





Re: [VOTE] Release Apache Cassandra 0.8.4

2011-08-10 Thread Jonathan Ellis
+1




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: [VOTE] Release Apache Cassandra 0.8.4

2011-08-10 Thread Brandon Williams
+1
