Re: Bitmap indexes - reviving CASSANDRA-1472

2013-04-10 Thread Matt Stump
Druid was our inspiration to layer bitmap indexes on top of Cassandra.
Druid doesn't work for us because our data set is too large. We would need
many hundreds of nodes just for the pre-processed data. What I envisioned
was the ability to perform Druid-style queries (no aggregation) without the
limitations imposed by having the entire dataset in memory. I primarily
need to query whether a user performed some event, but I also intend to add
trigram indexes for LIKE, ILIKE, or possibly regex-style matching.
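
To make the trigram idea concrete, here's a rough sketch (toy code, not
our actual prototype) of how a trigram index could back LIKE-style
matching: each trigram maps to a bitmap of the rows containing it, and the
candidate set for a pattern is the intersection of its trigrams' bitmaps,
followed by a verification pass. All names are made up for illustration.

// Toy trigram index over row IDs; assumes small, uncompressed bitmaps.
#include <algorithm>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

using Bitmap = std::vector<uint64_t>;  // one bit per row ID

static void set_bit(Bitmap& bm, size_t row) {
    size_t word = row / 64;
    if (word >= bm.size()) bm.resize(word + 1, 0);
    bm[word] |= (uint64_t{1} << (row % 64));
}

struct TrigramIndex {
    std::map<std::string, Bitmap> postings;  // trigram -> rows containing it

    void add(size_t row, const std::string& text) {
        for (size_t i = 0; i + 3 <= text.size(); ++i)
            set_bit(postings[text.substr(i, 3)], row);
    }

    // Candidate rows for LIKE '%needle%': intersect the bitmaps of every
    // trigram in the needle. Survivors still need a verification pass
    // against the stored text (patterns shorter than 3 chars can't use this).
    Bitmap candidates(const std::string& needle) const {
        Bitmap result;
        bool first = true;
        for (size_t i = 0; i + 3 <= needle.size(); ++i) {
            auto it = postings.find(needle.substr(i, 3));
            if (it == postings.end()) return {};  // some trigram never seen
            if (first) { result = it->second; first = false; continue; }
            result.resize(std::min(result.size(), it->second.size()));
            for (size_t w = 0; w < result.size(); ++w)
                result[w] &= it->second[w];
        }
        return result;
    }
};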

I wasn't aware of CONCISE; thanks for the pointer. We are currently
evaluating FastBit, which is a very similar project:
https://sdm.lbl.gov/fastbit/


On Wed, Apr 10, 2013 at 5:49 PM, Brian O'Neill wrote:

>
> How does this compare with Druid?
> https://github.com/metamx/druid
>
> We're currently evaluating Acunu, Vertica and Druid...
>
> http://brianoneill.blogspot.com/2013/04/bianalytics-on-big-datacassandra.html
>
> With its bitmapped indexes, Druid appears to have the most potential.
> They boast some pretty impressive stats, especially WRT handling
> "real-time" updates and adding new dimensions.
>
> They also use a compression algorithm, CONCISE, to cut down on the space
> requirements.
> http://ricerca.mat.uniroma3.it/users/colanton/concise.html
>
> I haven't looked too deep into the Druid code, but I've been meaning to
> see if it could be backed by C*.
>
> We'd be game to join the hunt if you pursue such a beast. (with your code,
> or with portions of Druid)
>
> -brian
>
>
> On Apr 10, 2013, at 5:40 PM, mrevilgnome wrote:
>
> > What do you think about set manipulation via indexes in Cassandra? I'm
> > interested in answering queries such as give me all users that performed
> > event 1, 2, and 3, but not 4. If the answer is yes then I can make a case
> > for spending my time on C*. The only downside for us would be that our
> > current prototype is in C++, so we would lose some performance and the
> > ability to dedicate an entire machine to caching/performing queries.
> >
> >
> > On Wed, Apr 10, 2013 at 11:57 AM, Jonathan Ellis 
> wrote:
> >
> >> If you mean, "Can someone help me figure out how to get started updating
> >> these old patches to trunk and cleaning out the Avro?" then yes, I've
> been
> >> knee-deep in indexing code recently.
> >>
> >>
> >> On Wed, Apr 10, 2013 at 11:34 AM, mrevilgnome 
> >> wrote:
> >>
> >>> I'm currently building a distributed cluster on top of cassandra to
> >> perform
> >>> fast set manipulation via bitmap indexes. This gives me the ability to
> >>> perform unions, intersections, and set subtraction across sub-queries.
> >>> Currently I'm storing index information for thousands of dimensions as
> >>> cassandra rows, and my cluster keeps this information cached,
> distributed
> >>> and replicated in order to answer queries.
> >>>
> >>> Every couple of days I think to myself this should really exist in C*.
> >>> Given all the benefits, would there be any interest in
> >>> reviving CASSANDRA-1472?
> >>>
> >>> Some downsides are that this is very memory intensive, even for sparse
> >>> bitmaps.
> >>>
> >>
> >>
> >>
> >> --
> >> Jonathan Ellis
> >> Project Chair, Apache Cassandra
> >> co-founder, http://www.datastax.com
> >> @spyced
> >>
>
> --
> Brian ONeill
> Lead Architect, Health Market Science (http://healthmarketscience.com)
> mobile:215.588.6024
> blog: http://weblogs.java.net/blog/boneill42/
> blog: http://brianoneill.blogspot.com/
>
>


Re: Bitmap indexes - reviving CASSANDRA-1472

2013-04-11 Thread Matt Stump
You could embed Lucene, but then you pretty much have DSE Search, and there
are people on this list in a better position than I am to describe
the difficulty in making that scale. By rolling your own you get simplicity
and control. If you use a uniform index size you can just assign chunks of
it to the Cassandra ring, making it easy to distribute queries. I think that
using Lucene in this way would cause most of the benefit of the library to
be lost and add unnecessary complexity. If Lucene were easy, then given the
team's experience with both Lucene and C* I think it would have been done
already.

Sorry if it's a fuzzy answer, but I haven't run down every technical angle
on the integration with C* yet. The idea is still very much at the
"wouldn't it be very cool if this thing lived in Cassandra" stage. It would
be the nail in the coffin for Impala, Redshift, et al.


On Thu, Apr 11, 2013 at 3:15 PM, Jason Rutherglen <
jason.rutherg...@gmail.com> wrote:

> What's the advantage over Lucene?
>


Re: Bitmap indexes - reviving CASSANDRA-1472

2013-04-12 Thread Matt Stump
It looks like there is some interest, so I'm going to disgorge everything
I've learned/considered in the past couple of weeks so that we have a
consistent base. I'm going to break down how the indexes work, the
different optimizations and drawbacks, and try to address the
points/questions that people have raised. I've broken it down by subject.

Basic Equality:
Bitmap indexes are essentially giant arrays of bits, with one bit per
possibility. If you want to know which rows have the boolean value "event1"
set to true, then you set the bit for those rows to 1 in the index. For
example, the index 0100100101 would mean rows 1, 4, 7 and 9 contain
"event1". If you want to know which rows contain event1 and event2,
then you do a bitwise AND over the two indexes to get the set intersection,
e.g. 00100101 AND 00001101 results in 00000101. From this you can build
complex queries by doing simple set arithmetic. Each of these sets is
called a query dimension.
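
As a concrete illustration of that set arithmetic, here's a minimal C++
sketch (the language of my prototype, though this isn't its code), assuming
fixed-width uncompressed bitmaps stored as 64-bit words. It answers the
"performed event 1, 2 and 3, but not 4" style of query from earlier in the
thread.

// Illustration only: bitwise set arithmetic over equality bitmaps.
#include <cstdint>
#include <vector>

using Bitmap = std::vector<uint64_t>;

// Rows with event1 AND event2 AND event3 AND NOT event4.
// Assumes all four bitmaps cover the same number of rows (same length).
Bitmap example_query(const Bitmap& e1, const Bitmap& e2,
                     const Bitmap& e3, const Bitmap& e4) {
    Bitmap out(e1.size());
    for (size_t w = 0; w < out.size(); ++w)
        out[w] = e1[w] & e2[w] & e3[w] & ~e4[w];
    return out;
}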

Range Queries:
If you want to encode ranges, such as "give me all users who have a counter
in the integer interval [0, 2]", then you need a two-dimensional bitmap
index. The first dimension is which values in [0, 7] have been hit:
10010011. The second dimension is, for each of those possible values, which
rows contain that value. So for value 0 there would be another index,
00100010, which means that rows 3 and 7 contain value 0. This forms a
giant two-dimensional array.

[1] [00100010]
[0] [00001010]
[0] [10100110]
[1] [00100010]
[0] [00101010]
[0] [01101010]
[0] [00000010]
[1] [00100110]
[1] [00000010]

To figure out who has a counter value in [0, 2], you take the union of the
sets [00100010], [00001010], and [10100110], which is [10101110].
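
A small sketch of that two-dimensional layout, eliding the first dimension
("has this value been hit at all") and assuming small integer values with
uncompressed bitmaps; names are illustrative. A range query [lo, hi] is
just the OR of the per-value bitmaps in the range.

// One bitmap of rows per distinct value; range query = union of bitmaps.
#include <cstdint>
#include <vector>

using Bitmap = std::vector<uint64_t>;

struct ValueIndex {
    std::vector<Bitmap> rows_with_value;  // indexed by value 0..N-1

    Bitmap range(size_t lo, size_t hi) const {  // rows with value in [lo, hi]
        Bitmap out;
        for (size_t v = lo; v <= hi && v < rows_with_value.size(); ++v) {
            const Bitmap& bm = rows_with_value[v];
            if (bm.size() > out.size()) out.resize(bm.size(), 0);
            for (size_t w = 0; w < bm.size(); ++w) out[w] |= bm[w];
        }
        return out;
    }
};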

Binning:
Each index has an address size which limits the total number of rows/values
you can encode in an index. So if you have a 32-bit index you can encode a
total of 4,294,967,296 positions. If you need to encode more values than are
possible in that address space you can do one of two things: increase the
address space or perform binning. Binning is essentially a hash collision,
meaning two or more values are assigned to the same bin. For example, if you
wanted to index floats you could use the same index as above, but if you
want to know which rows contain real numbers in [0, 1.9] then you would
need to check the actual value for the rows in the result set. This is
called a candidate check, which is very expensive, often the most expensive
part of a query.
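
A rough sketch of binning plus the candidate check. To keep it short,
row-ID lists stand in for bitmaps, it assumes non-negative values, and all
names are illustrative.

// Coarse bins narrow the search; surviving rows are re-checked against
// their raw values (the expensive candidate check).
#include <cmath>
#include <cstddef>
#include <vector>

struct BinnedFloatIndex {
    double bin_width = 1.0;
    std::vector<std::vector<size_t>> bins;  // bin -> row IDs
    std::vector<double> raw_values;         // needed for the candidate check

    void add(size_t row, double v) {
        size_t b = static_cast<size_t>(std::floor(v / bin_width));
        if (b >= bins.size()) bins.resize(b + 1);
        bins[b].push_back(row);
        if (row >= raw_values.size()) raw_values.resize(row + 1, 0.0);
        raw_values[row] = v;
    }

    // Rows with value in [lo, hi]: scan the candidate bins, then filter
    // each candidate row against its actual stored value.
    std::vector<size_t> range(double lo, double hi) const {
        std::vector<size_t> out;
        size_t first = static_cast<size_t>(std::floor(lo / bin_width));
        size_t last  = static_cast<size_t>(std::floor(hi / bin_width));
        for (size_t b = first; b <= last && b < bins.size(); ++b)
            for (size_t row : bins[b])
                if (raw_values[row] >= lo && raw_values[row] <= hi)
                    out.push_back(row);
        return out;
    }
};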

Segments:
If you increase the address size of the index then you run into
space/memory problems. A 32-bit index is 4 GB when fully materialized; a
64-bit index is 2 PB. To solve this problem you use segments/partitions or
compression. Segments work through the use of a sparse table or interval
list. You break the index up into chunks of equal size, let's say 1 KB. If
a bit gets flipped at position 4098, then you go to segment 0x02 and, if it
exists, flip that bit. If that segment doesn't exist then you create it and
set the bit. The advantage is that you only have to create segments that
contain flipped bits. The downside is that with a wide distribution of
flipped bits you end up with many segments that have only one or two bits
flipped, and the rest is empty space, i.e. wasted memory.
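
A minimal sketch of such a segmented bitmap, where a segment is only
allocated once a bit inside it is actually set. Segment size and names are
illustrative, not what the prototype uses.

// Lazily-allocated fixed-size segments over a large bit address space.
#include <cstdint>
#include <unordered_map>
#include <vector>

class SegmentedBitmap {
    static constexpr uint64_t kSegmentBits = 8192;  // 1 KB per segment
    std::unordered_map<uint64_t, std::vector<uint64_t>> segments_;  // id -> words

public:
    void set(uint64_t pos) {
        auto& words = segments_[pos / kSegmentBits];  // created lazily
        if (words.empty()) words.assign(kSegmentBits / 64, 0);
        uint64_t offset = pos % kSegmentBits;
        words[offset / 64] |= (uint64_t{1} << (offset % 64));
    }

    bool test(uint64_t pos) const {
        auto it = segments_.find(pos / kSegmentBits);
        if (it == segments_.end()) return false;  // segment never allocated
        uint64_t offset = pos % kSegmentBits;
        return (it->second[offset / 64] >> (offset % 64)) & 1;
    }
};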

Compression:
The other approach is to use compression. The trick is that you need a
compression algorithm that doesn't require decompression before doing the
bitwise operations. This means you need a run-length encoding (RLE). There
are three leading contenders for king of the hill: CONCISE, which is used
by Druid; WAH, which is used by FastBit; and PWAH, which I don't think is
used by anybody yet. They all work the same way, which is to store large
runs of zeros or ones as single encoded values. So you get an index like
the PWAH example of 11 blocks of 1 bits, a literal block of 15 bits, and 18
blocks of 0 bits, each stored as one encoded word. The downside to this
approach is that you lose the ability to index directly to a position in
the index. If you want to perform an update you've either got to
read/decode to that position and split the index, or rebuild the entire
index. This is why Druid and FastBit are very good for static data sets but
can't deal with fast-changing data.
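
To show why RLE still allows bitwise operations, here's a simplified sketch
that ANDs two run-encoded bitmaps by walking their run lists directly,
never expanding either one. Real WAH/CONCISE/PWAH are word-aligned and more
compact; treat this only as an illustration of the idea.

// Each bitmap is a back-to-back sequence of runs (bit value + length).
#include <algorithm>
#include <cstdint>
#include <vector>

struct Run { bool bit; uint64_t len; };
using RleBitmap = std::vector<Run>;

RleBitmap rle_and(const RleBitmap& a, const RleBitmap& b) {
    RleBitmap out;
    size_t i = 0, j = 0;
    uint64_t ra = a.empty() ? 0 : a[0].len;
    uint64_t rb = b.empty() ? 0 : b[0].len;
    while (i < a.size() && j < b.size()) {
        uint64_t step = std::min(ra, rb);       // advance by the shorter run
        bool bit = a[i].bit && b[j].bit;
        if (!out.empty() && out.back().bit == bit) out.back().len += step;
        else out.push_back({bit, step});
        ra -= step; rb -= step;
        if (ra == 0 && ++i < a.size()) ra = a[i].len;
        if (rb == 0 && ++j < b.size()) rb = b[j].len;
    }
    return out;  // covers the common prefix of the two bitmaps
}

The runs are consumed in order, so neither operand is ever materialized at
its full bit length.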

A hybrid approach to performance:
You can take a hybrid approach to performance, which means combining
segments and compression. You break the index address space up into
segments and then perform compression within each of those segments, which
negates the problem of many segments with only one bit flipped but
consuming an entire segment's worth of memory. It also limits the cost of
having to split the encoding for any compressed segment. As segments fill
up you can compress and join them together using the standard SSTable or
LevelDB approach.
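
As a sketch of the "compress a segment once it fills up" step, here's a
conversion from a raw word-based segment into the same simple run
representation used in the previous sketch (redefined here so the snippet
stands alone); illustrative only.

// Turn a raw segment into runs; merged/compacted segments would then be
// operated on with run-wise AND/OR as above.
#include <cstdint>
#include <vector>

struct Run { bool bit; uint64_t len; };

std::vector<Run> compress_segment(const std::vector<uint64_t>& words) {
    std::vector<Run> runs;
    for (uint64_t w : words) {
        for (int i = 0; i < 64; ++i) {
            bool bit = (w >> i) & 1;
            if (!runs.empty() && runs.back().bit == bit) ++runs.back().len;
            else runs.push_back({bit, 1});
        }
    }
    return runs;
}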

Distributing work across the ring:
If you use segments and a uniform index size, you can assign segments of
the index to different portions of the C* ring.
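
One way that could look: give each fixed-size segment its own partition
key, so Cassandra's partitioner spreads and replicates segments of one
logical index around the ring. The key format below is hypothetical, not an
existing schema.

// Hypothetical: derive a per-segment partition key for an index.
#include <cstdint>
#include <cstdio>
#include <string>

std::string segment_partition_key(const std::string& index_name,
                                  uint64_t segment_id) {
    // e.g. "event1:0000000000000042" -> hashed by the partitioner to a token
    char buf[17];
    std::snprintf(buf, sizeof(buf), "%016llx",
                  static_cast<unsigned long long>(segment_id));
    return index_name + ":" + buf;
}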

Re: Bitmap indexes - reviving CASSANDRA-1472

2013-04-15 Thread Matt Stump
I spent some time this afternoon thinking about ways forward. I need to
make progress regardless of whether or not my eventual work makes it into
C*. In order to do so, I was thinking about creating an index management
library and query engine in C++. Because of the nature of bitmap indexes,
it's OK if the engine on any node spits out partial results with a list of
the indexes it needs in order to complete the query. It would be up to the
caller to get information to the index library, and to shuttle around
partial bitmaps if necessary.

I like this plan for a couple of reasons. I get the tight control over
memory and the performance of C++. I can wrap a C++ library and make it
accessible to C* via JNA. If the work doesn't make it into C* I can just
wrap it in riak_core and still meet my goals. A library fills an empty
ecosystem niche between full-blown databases and collections.
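
For the JNA route, the C++ engine would be exposed through a plain C
facade, since JNA binds to C functions. Below is a hypothetical sketch of
what that surface could look like; every name is invented, including the
way partial results and missing segments are reported.

// Hypothetical extern "C" facade a JNA binding could call into.
#include <cstddef>
#include <cstdint>

struct QueryEngine;  // opaque handle implemented by the C++ library

extern "C" {
    QueryEngine* bitmap_engine_open(const char* data_dir);
    void         bitmap_engine_close(QueryEngine* engine);

    // Evaluate a query; returns 0 on success. If some index segments are
    // not local, the engine reports a partial result plus the segment IDs
    // the caller still needs to fetch, as described above.
    int bitmap_engine_query(QueryEngine* engine,
                            const char*  query,
                            uint64_t*    result_rows,      size_t result_capacity,
                            size_t*      result_count,
                            uint64_t*    missing_segments, size_t missing_capacity,
                            size_t*      missing_count);
}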

The unknown areas of risk in my mind are that I don't know enough about the
implementation of indexes and the query planner in C* to know whether or
not this is crazy. I've read through the old patches for the aborted bitmap
index attempt and it looks pretty simple, but I lack context, which is hard
to gather from old disembodied code. Are there any docs for the query
planner or indexes other than the code and the bug DB?

I spent some time benchmarking over the weekend, and my prototype is
capable of answering 3-clause queries against fully materialized 4 GB
uncompressed indexes (4B encoded values) in ~240 ms on my laptop. I'm
fairly certain I'm running into memory bandwidth limitations. There might
be one or two small tweaks still available to me, but they're all OS and
hardware dependent.

I would love any feedback, even if it's to say "Matt, you're crazy, go home".

Thanks,
Matt Stump




Add libcql to the wikis

2013-06-24 Thread Matt Stump
Can someone with permissions please link to libcql from the wikis? I got
yelled at by users who had trouble finding the library.

The URL for libcql:
https://github.com/mstump/libcql

The relevant wiki pages I could identify:
http://wiki.apache.org/cassandra/ClientOptions
http://www.planetcassandra.org/Download/ViewDownloads/ClientDriver

Thanks,
Matt Stump