I can't see us ever committing a dependency on a custom C++ library to
the core, for the same reason that despite passionate advocacy we'll
probably never have Scala code in the tree -- saying that potential
contributors need to be familiar with Java *and* Language X to
contribute to the core is ju
I spent some time this afternoon thinking about ways forward. I need to
make progress regardless of whether or not my eventual work makes it into
C*. In order to do so, I was thinking about creating an index management
library and query engine in C++. Because of the nature of bitmap indexes
it's ok
It looks like there is some interest so I'm going to disgorge everything
I've learned/considered in the past couple weeks just so that we have a
consistent base. I'm going to break down how the indexes work, different
optimizations and drawbacks and try to address the points/questions that
people h
I am not sure about the collection case. But for compact storage you can
specify multiple-ranges in a slice query.
https://issues.apache.org/jira/browse/CASSANDRA-3885
I am not sure this will get you all the way to bit-map indexes but in a
wide row scenario it seems like you could support a "even
Something like this?
SELECT * FROM users
WHERE user_id IN (SELECT user_id FROM events WHERE type IN (1, 2, 3))
AND user_id NOT IN (SELECT user_id FROM events WHERE type = 4)
This doesn't really look like a Cassandra query to me. More like a
query for Hive (or Drill, or Impala).
But, I know Sylv
Brian,
The Solr StatsComponent performs aggregations.
http://wiki.apache.org/solr/StatsComponent
I recommend using DataStax DSE Search...
On Fri, Apr 12, 2013 at 10:09 AM, Brian O'Neill wrote:
> @Jason,
>
> I have a lot of experience with SOLR + ES, but mainly for search. (i.e.
> Finding the
@Jason,
I have a lot of experience with SOLR + ES, but mainly for search. (i.e.
Finding the most relevant records given a query)
That's been working well, but now we have requirements to support
dashboards. Those dashboards have aggregations in them (sum, average,
count(s), etc). I have limited
You could embed Lucene, but then you pretty much have DSE search, and there
are people on this list in a better position than I to describe
the difficulty in making that scale. By rolling your own you get simplicity
and control. If you use a uniform index size you can just assign chunks of
it to th
What's the advantage over Lucene?
On Wed, Apr 10, 2013 at 10:43 PM, Matt Stump wrote:
> Druid was our inspiration to layer bitmap indexes on top of Cassandra.
> Druid doesn't work for us because our data set is too large. We would need
> many hundreds of nodes just for the pre-processed data. Wh
information shared in this discussion is quite informative for developers.
Would like to go through this kind of discussion in the group.
On Thu, Apr 11, 2013 at 9:14 AM, Brandon Williams wrote:
> On Wed, Apr 10, 2013 at 9:50 PM, Carl Yeksigian
> wrote:
>
> > This discussion is off topic for t
On Wed, Apr 10, 2013 at 9:50 PM, Carl Yeksigian wrote:
> This discussion is off topic for the dev list. If you want to continue it,
> please move to user@.
>
I disagree entirely, this is absolutely dev-oriented.
-Brandon
This discussion is off topic for the dev list. If you want to continue it,
please move to user@.
Thanks,
Carl
On Wed, Apr 10, 2013 at 10:43 PM, Matt Stump wrote:
> Druid was our inspiration to layer bitmap indexes on top of Cassandra.
> Druid doesn't work for us because our data set is too larg
Druid was our inspiration to layer bitmap indexes on top of Cassandra.
Druid doesn't work for us because our data set is too large. We would need
many hundreds of nodes just for the pre-processed data. What I envisioned
was the ability to perform Druid-style queries (no aggregation) without the
limi
How does this compare with Druid?
https://github.com/metamx/druid
We're currently evaluating Acunu, Vertica and Druid...
http://brianoneill.blogspot.com/2013/04/bianalytics-on-big-datacassandra.html
With its bitmapped indexes, Druid appears to have the most potential.
They boast some pretty im
What do you think about set manipulation via indexes in Cassandra? I'm
interested in answering queries such as "give me all users that performed
event 1, 2, and 3, but not 4". If the answer is yes, then I can make a case
for spending my time on C*. The only downside for us would be our current
prototy
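
The set query described above (all users that performed events 1, 2, and 3,
but not 4) maps onto bitwise operations over one bitmap per event type. The
following is a minimal Python sketch of that idea, not the implementation
discussed in the thread. The names build_bitmaps and users_matching are
hypothetical, and the sketch assumes user ids are small non-negative integers
so that a plain Python int can stand in for an uncompressed bitmap.

```python
from functools import reduce


def build_bitmaps(events):
    """Build one bitmap per event type from (user_id, event_type) pairs.

    Assumption: user_id is a small non-negative integer; bit i of a
    bitmap is set when user i performed that event type.
    """
    bitmaps = {}
    for user_id, event_type in events:
        bitmaps[event_type] = bitmaps.get(event_type, 0) | (1 << user_id)
    return bitmaps


def users_matching(bitmaps, required, excluded):
    """Return user ids that performed every required type and no excluded type."""
    if not required:
        return []
    # AND the required bitmaps together: a bit survives only if it is set in all.
    result = reduce(lambda a, b: a & b, (bitmaps.get(t, 0) for t in required))
    # AND NOT each excluded bitmap: clear users that performed an excluded type.
    for event_type in excluded:
        result &= ~bitmaps.get(event_type, 0)
    # Decode the remaining set bits back into user ids.
    return [i for i in range(result.bit_length()) if (result >> i) & 1]


events = [
    (0, 1), (0, 2), (0, 3),          # user 0: events 1, 2, 3
    (1, 1), (1, 2), (1, 3), (1, 4),  # user 1: events 1, 2, 3 and 4
    (2, 1), (2, 2),                  # user 2: events 1 and 2 only
]
matches = users_matching(build_bitmaps(events), required=[1, 2, 3], excluded=[4])
```

A real index would use a compressed bitmap format and a mapping from row keys
to bit positions; both are left out here to keep the set logic visible.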
If you mean, "Can someone help me figure out how to get started updating
these old patches to trunk and cleaning out the Avro?" then yes, I've been
knee-deep in indexing code recently.
On Wed, Apr 10, 2013 at 11:34 AM, mrevilgnome wrote:
> I'm currently building a distributed cluster on top of