A very interesting and detailed article, thank you DuyHai. I think this should
be part of general Cassandra documentation.
--
Jacques-Henri Berthemet
From: DuyHai Doan [mailto:doanduy...@gmail.com]
Sent: Thursday, February 22, 2018 7:04 PM
To: user
Subject: Re: Secondary Indexes C* 3.0
Read
Read this: http://www.doanduyhai.com/blog/?p=13191
On Thu, Feb 22, 2018 at 6:44 PM, Akash Gangil wrote:
> To provide more context, I was going through this
> https://docs.datastax.com/en/cql/3.3/cql/cql_using/useWhenIndex.html#
> useWhenIndex__highCardCol
>
> On Thu, Feb 22, 2018 at 9:35 AM,
To provide more context, I was going through this
https://docs.datastax.com/en/cql/3.3/cql/cql_using/useWhenIndex.html#useWhenIndex__highCardCol
On Thu, Feb 22, 2018 at 9:35 AM, Akash Gangil wrote:
> Hi,
>
> I was wondering if there are recommendations around the cardinality of
> secondary index
Hello, I created https://issues.apache.org/jira/browse/CASSANDRA-7766 about
that
Fabrice LARCHER
2014-08-13 14:58 GMT+02:00 DuyHai Doan :
> Hello Fabrice.
>
> A quick hint, try to create your secondary index WITHOUT the "IF NOT
> EXISTS" clause to see if you still have the bug.
>
> Another ide
Hello Fabrice.
A quick hint, try to create your secondary index WITHOUT the "IF NOT
EXISTS" clause to see if you still have the bug.
Another idea is to activate query tracing on client side to see what's
going on underneath.
On Wed, Aug 13, 2014 at 2:48 PM, Fabrice Larcher
wrote:
> Hello,
>
OK, thanks for the information.
Gareth
On Thu, Aug 1, 2013 at 3:53 PM, Robert Coli wrote:
> On Thu, Aug 1, 2013 at 12:49 PM, Gareth Collins
> wrote:
>>
>> Would this be correct? Just making sure I understand how to best use
>> secondary indexes in Cassandra with time series data.
>
>
> In gener
Thanks a lot.
Regards,
Shahab
On Thu, Aug 1, 2013 at 8:32 PM, Robert Coli wrote:
> On Thu, Aug 1, 2013 at 2:34 PM, Shahab Yunus wrote:
>
>> Can you shed some more light (or point towards some other resource) on
>> why you think built-in Secondary Indexes should not be used easily or
>> witho
On Thu, Aug 1, 2013 at 2:34 PM, Shahab Yunus wrote:
> Can you shed some more light (or point towards some other resource) on
> why you think built-in Secondary Indexes should not be used easily or
> without much consideration? Thanks.
>
1) Secondary indexes are more or less modeled like a manu
Hi Robert,
Can you shed some more light (or point towards some other resource) on
why you think built-in Secondary Indexes should not be used easily or
without much consideration? Thanks.
Regards,
Shahab
On Thu, Aug 1, 2013 at 3:53 PM, Robert Coli wrote:
> On Thu, Aug 1, 2013 at 12:49 PM, G
On Thu, Aug 1, 2013 at 12:49 PM, Gareth Collins
wrote:
> Would this be correct? Just making sure I understand how to best use
> secondary indexes in Cassandra with time series data.
>
In general unless you ABSOLUTELY NEED the one unique feature of built-in
Secondary Indexes (atomic update of base
Ok. I always know the row key before I start the Cassandra read operation. A
full system could have 300-500k columns so secondary indexes don't seem a good
idea here. I think the best option will be to query a range of columns for the
given row key.
Thanks a bunch guys.
On Mar 20, 2013, at 11:2
> When I query for user_id = "user1" and order_attr1 = 1991 I want to get the
> order_num. Is this possible without super columns?
If you only have a few hundred columns you can read them all back and filter
client side.
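
A minimal Python sketch of the client-side filtering suggested above. The row is modeled as a plain list of dicts, and the names `orders` and `filter_orders` are hypothetical; a real client would first fetch the columns for the known row key.

```python
def filter_orders(columns, **wanted):
    """Return the order_num of every column entry matching all wanted values."""
    return [
        col["order_num"]
        for col in columns
        if all(col.get(name) == value for name, value in wanted.items())
    ]


# Hypothetical contents of the row for user_id = "user1".
orders = [
    {"order_num": 101, "order_attr1": 1991, "order_attr2": "a"},
    {"order_num": 102, "order_attr1": 2003, "order_attr2": "b"},
    {"order_num": 103, "order_attr1": 1991, "order_attr2": "c"},
]

matches = filter_orders(orders, order_attr1=1991)
```

With a few hundred columns per row this is a single read followed by a cheap in-memory scan.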
Secondary indexes are used when you do not know the row you want to get b
Hi Aaron,
I did mean 1000 columns. But I see your point.
The current CF schema has user_id as the row key and unnamed column
order_num = order info as the col-val pair. The plan is to add named
columns order_attr1, order_attr2... order_attr18. When I query for user_id
= "user1" and order_attr1 = 1
> Assuming we have 1000 columns in 1 row of the column family and about 900 of
> them have
>
> NamedColumn1=1 and of those 900 only 10 of them also have NamedColumn2=1.
Am assuming you mean 1,000 rows not columns.
> does Cassandra
> optimize this in any way by fetching only the 10 versus the 9
Thanks guys. I am working with Andy on this project.
Further questions on the secondary indexes:
Assuming we have 1000 columns in 1 row of the column family and about
900 of them have
NamedColumn1=1 and of those 900 only 10 of them also have NamedColumn2=1. If I
query for columns which have Named
> - Will that result in Cassandra creating 18 new column families,
> one for each index?
Inserts will be slower, as each insert will potentially result in 18 additional
inserts. This is just the same as an RDBMS: more indexes == more insert work.
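
To make the write amplification concrete, here is a toy in-memory model in Python. It is not Cassandra code: `IndexedTable` and its fields are hypothetical, and it only counts one extra write per indexed column present in the insert.

```python
from collections import defaultdict


class IndexedTable:
    """Toy table that counts the writes caused by secondary indexes."""

    def __init__(self, indexed_columns):
        self.rows = {}
        # One index structure per indexed column: value -> set of row keys.
        self.indexes = {name: defaultdict(set) for name in indexed_columns}
        self.write_count = 0

    def insert(self, key, columns):
        self.rows[key] = dict(columns)
        self.write_count += 1  # the base write
        for name, index in self.indexes.items():
            if name in columns:
                index[columns[name]].add(key)
                self.write_count += 1  # one extra write per indexed column


attr_names = [f"order_attr{i}" for i in range(1, 19)]  # 18 indexed columns
table = IndexedTable(attr_names)

table.insert("order-1", {name: 1 for name in attr_names})
writes_full_row = table.write_count

table.insert("order-2", {"order_attr1": 1, "order_attr2": 2})
writes_sparse_row = table.write_count - writes_full_row
```

In this model a row that sets all 18 indexed columns costs 19 writes, while a row that sets only two of them costs 3, which is why the extra cost is "potentially" 18.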
> - If a given column is not specified in any rows
I do not think this is a good use case for Cassandra alone, assuming the
queries can be any combination of the 18 columns.
I would consider using some combination of Cassandra and Solr, where Solr
provides the indexing/search, and Cassandra provides the bulk store.
From: Andy Stec [mailto:andys.
On Thu, Jun 7, 2012 at 5:41 AM, aaron morton wrote:
> Sounds good. Do you want to make the change ?
>
Done.
>
> Thanks for taking the time.
>
Thanks for giving the answer!
Jim
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 7/06/2
Sounds good. Do you want to make the change ?
Thanks for taking the time.
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 7/06/2012, at 7:54 AM, Jim Ancona wrote:
> On Tue, Jun 5, 2012 at 4:30 PM, Jim Ancona wrote:
> It might be a good idea fo
On Tue, Jun 5, 2012 at 4:30 PM, Jim Ancona wrote:
> It might be a good idea for the documentation to reflect the tradeoffs
> more clearly.
Here's a proposed addition to the Secondary Index FAQ at
http://wiki.apache.org/cassandra/SecondaryIndexes
Q: How does choice of Consistency Level affect c
On Mon, Jun 4, 2012 at 2:34 PM, aaron morton wrote:
> IIRC index slices work a little differently with consistency, they need to
> have CL level nodes available for all token ranges. If you drop it to CL
> ONE the read is local only for a particular token range.
>
Yes, this is what we observed. W
IIRC index slices work a little differently with consistency, they need to have
CL level nodes available for all token ranges. If you drop it to CL ONE the
read is local only for a particular token range.
The problem when doing index reads is the nodes that contain the results can no
longer be
rsion of the
> schema on a new node? Is there a reason to apply the migrations?
>
> - Mike
>
> From: aaron morton [mailto:aa...@thelastpickle.com]
> Sent: Tuesday, March 06, 2012 4:14 AM
> To: user@cassandra.apache.org
> Subject: Re: Secondary indexes don't go away afte
on of the schema on a
new node? Is there a reason to apply the migrations?
- Mike
From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Tuesday, March 06, 2012 4:14 AM
To: user@cassandra.apache.org
Subject: Re: Secondary indexes don't go away after metadata change
When the new node co
elastpickle.com]
> Sent: Monday, March 05, 2012 3:58 AM
> To: user@cassandra.apache.org
> Subject: Re: Secondary indexes don't go away after metadata change
>
> The secondary index CF's are marked as no longer required / marked as
> compacted. Under 1.x they would then be delet
2012 3:58 AM
To: user@cassandra.apache.org
Subject: Re: Secondary indexes don't go away after metadata change
The secondary index CF's are marked as no longer required / marked as
compacted. Under 1.x they would then be deleted reasonably quickly, and
definitely deleted after a restart
The secondary index CF's are marked as no longer required / marked as
compacted. Under 1.x they would then be deleted reasonably quickly, and
definitely deleted after a restart.
Is there a zero length .Compacted file there ?
> Also, when adding a new node to the ring the new node will build i
Perfect, Aaron, Thanks a lot
From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Tuesday, February 14, 2012 12:54 AM
To: user@cassandra.apache.org
Subject: Re: Secondary indexes and cardinality
Heard that indexing a field with high cardinality is not good.
http://www.datastax.com/docs
> Heard that indexing a field with high cardinality is not good.
http://www.datastax.com/docs/0.7/data_model/secondary_indexes
> Will there be any performance improvement? Is this the way secondary indexes
> are maintained?
Updating secondary indexes requires a read and a write.
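
A rough Python sketch of why an indexed update needs a read as well as a write: the old value has to be looked up so the stale index entry can be removed. This is a toy model under that assumption, not Cassandra's implementation, and `IndexedColumn` is a hypothetical name.

```python
class IndexedColumn:
    """Toy model of one indexed column with read-before-write updates."""

    def __init__(self):
        self.base = {}   # row key -> current indexed value
        self.index = {}  # indexed value -> set of row keys
        self.reads = 0
        self.writes = 0

    def put(self, key, value):
        # Read the old value so the stale index entry can be dropped.
        self.reads += 1
        old = self.base.get(key)
        if old is not None and old != value:
            self.index[old].discard(key)
        # Write the new value and its index entry.
        self.base[key] = value
        self.index.setdefault(value, set()).add(key)
        self.writes += 1

    def lookup(self, value):
        return sorted(self.index.get(value, set()))


col = IndexedColumn()
col.put("user1", "UT")
col.put("user1", "TX")  # update: the "UT" entry must be removed
```

Without the read, `lookup("UT")` would keep returning `user1` after the update.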
> Also this ma
https://issues.apache.org/jira/browse/CASSANDRA-3488
On Nov 12, 2011, at 9:52 AM, Jeremy Hanna wrote:
> It sounds like that's just a message in compactionstats that's a no-op. This
> is reporting for about an hour that it's building a secondary index on a
> specific column family. Not sure if
It sounds like that's just a message in compactionstats that's a no-op. This
is reporting for about an hour that it's building a secondary index on a
specific column family. Not sure if that's the same thing. I see in the data
directory definitely new files written with the old index names we
Sounds like it could be
https://issues.apache.org/jira/browse/CASSANDRA-3123 ? (which was
fixed in 0.8.5).
On Fri, Nov 11, 2011 at 9:10 PM, Jeremy Hanna
wrote:
> We're using 0.8.4 in our cluster and two nodes needed rebuilding. When
> building and streaming data to the nodes, there were multipl
Upgrade. https://issues.apache.org/jira/browse/CASSANDRA-2320
On Fri, Aug 5, 2011 at 6:58 PM, Aurynn Shaw wrote:
> Answered my own question; a snapshot gets taken when you drop a CF, per:
>
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Cassandra-take-a-snapshot-after-a-column
Answered my own question; a snapshot gets taken when you drop a CF, per:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Cassandra-take-a-snapshot-after-a-column-family-update-td6222772.html
So, I can recover to a known-good working position, and delete my
indices properly.
T
> it will probably be better to denormalize and store
> some precomputed data
Yes, if you know there are queries you need to serve it is better to support
those directly in the data model.
Cheers
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle
OK, got some results (below).
2 nodes, one on localhost, second on LAN, reading with
ConsistencyLevel.ONE, buffer_size=512 rows (that's how many rows
pycassa will get on one connection, then it will use the last row_id as
start row for the next query)
Query types:
1) get_range - just added limit of 1024
Can you provide some more information on the query you are running ? How many
terms are you selecting with?
How long does it take to return 1024 rows? IMHO that's a reasonably big slice
to get.
The server will pick the most selective equality predicate, and then filter the
results from that
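
The strategy described above (start from the most selective equality predicate, then filter the candidates) can be sketched in Python as follows. How the server actually estimates selectivity is not stated here; this toy version simply assumes exact match counts are available, and all names are hypothetical.

```python
def build_index(rows, column):
    """Map each value of `column` to the set of row keys holding it."""
    index = {}
    for key, columns in rows.items():
        if column in columns:
            index.setdefault(columns[column], set()).add(key)
    return index


def indexed_query(rows, indexes, predicates):
    """Return (driving column, sorted keys matching every equality predicate)."""
    indexed = [c for c in predicates if c in indexes]
    if not indexed:
        raise ValueError("at least one predicate must be on an indexed column")
    # Assumption: selectivity is measured by the exact number of matches.
    best = min(indexed, key=lambda c: len(indexes[c].get(predicates[c], ())))
    candidates = indexes[best].get(predicates[best], set())
    matches = sorted(
        key for key in candidates
        if all(rows[key].get(c) == v for c, v in predicates.items())
    )
    return best, matches


rows = {
    "u1": {"state": "UT", "year": 1970},
    "u2": {"state": "UT", "year": 1980},
    "u3": {"state": "UT", "year": 1970},
    "u4": {"state": "TX", "year": 1970},
}
indexes = {c: build_index(rows, c) for c in ("state", "year")}

driving, result = indexed_query(rows, indexes, {"state": "UT", "year": 1980})
```

Here `year = 1980` matches one row and `state = "UT"` matches three, so the `year` index drives the query and only one candidate has to be filtered.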
I just added a new page to the wiki:
http://wiki.apache.org/cassandra/SecondaryIndexes
On Apr 3, 2011, at 7:37 PM, Drew Kutcharian wrote:
> Yea I know, I just didn't know anyone can update it.
>
>
> On Apr 3, 2011, at 1:26 PM, Joe Stump wrote:
>
>>
>> On Apr 3, 2011, at 2:22 PM, Dre
Yea I know, I just didn't know anyone can update it.
On Apr 3, 2011, at 1:26 PM, Joe Stump wrote:
>
> On Apr 3, 2011, at 2:22 PM, Drew Kutcharian wrote:
>
>> Thanks Tyler. Can you update the wiki with these answers so they are stored
>> there for others to see too?
>
> Dude, it's a wiki.
On Apr 3, 2011, at 2:22 PM, Drew Kutcharian wrote:
> Thanks Tyler. Can you update the wiki with these answers so they are stored
> there for others to see too?
Dude, it's a wiki.
Thanks Tyler. Can you update the wiki with these answers so they are stored
there for others to see too?
On Apr 3, 2011, at 12:51 PM, Tyler Hobbs wrote:
> I'm not familiar with some of the details, but I'll try to answer your
> questions in general. Secondary indexes are implemented as a slig
I'm not familiar with some of the details, but I'll try to answer your
questions in general. Secondary indexes are implemented as a slightly
special separate column family with the indexed value serving as the key;
most of the properties of secondary indexes follow from that.
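
As an illustration of that description, here is a small Python model in which the index is just another "column family" keyed by the indexed value. Nested dicts stand in for column families, and `users` and `build_index_cf` are hypothetical names, not part of any Cassandra API.

```python
# Base column family: row key -> {column name: value}.
users = {
    "alice": {"state": "UT"},
    "bob": {"state": "TX"},
    "carol": {"state": "UT"},
}


def build_index_cf(column_family, column):
    """Build an index column family: indexed value -> {base row key: empty}."""
    index_cf = {}
    for row_key, columns in column_family.items():
        if column in columns:
            index_cf.setdefault(columns[column], {})[row_key] = b""
    return index_cf


state_index = build_index_cf(users, "state")
utah_users = sorted(state_index["UT"])
```

In this model every base row sharing a value lands in the same index row, so the index row for a common value grows with the number of matching base rows.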
On Sun, Apr 3, 2011
You'd need to drop and recreate the index (but see
https://issues.apache.org/jira/browse/CASSANDRA-2320 when doing this).
On Mon, Mar 14, 2011 at 6:07 AM, Terje Marthinussen
wrote:
> Hi,
> Should it be expected that secondary indexes are automatically regenerated
> when importing data using json2
I would expect they get created on the fly while importing. If not I think
its a bug...
Bye,
Norman
2011/3/14 Terje Marthinussen
> Hi,
>
> Should it be expected that secondary indexes are automatically regenerated
> when importing data using json2sstable?
> Or is there some manual procedure th
Info on secondary indexes
http://www.datastax.com/dev/blog/whats-new-cassandra-07-secondary-indexes
Some answers to your other questions are also in there as well as a discussion
about the limitations.
Hope that helps.
Aaron
On 7/03/2011, at 3:54 PM, Mark wrote:
> I haven't looked at Cass
Thanks a lot for the info
Sebastien
On 2 February 2011 16:53, Jonathan Ellis wrote:
> On Wed, Feb 2, 2011 at 7:37 AM, Sébastien Druon
> wrote:
> > Hi!
> > I would like to know if secondary indexes are foreseen for super columns
> /
> > columns inside of super columns?
>
> No.
>
> > If yes, wil
On Wed, Feb 2, 2011 at 7:37 AM, Sébastien Druon wrote:
> Hi!
> I would like to know if secondary indexes are foreseen for super columns /
> columns inside of super columns?
No.
> If yes, will it be in a near future?
Probably not.
--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of
I will frame my question in a different way.
Each user in my system subscribes to updates from selected other users
(updates are aggregated from outside) and tags the users to which he/she
is subscribed.
In my current design, I have a column family called "Followers" keyed by
userid in w
One approach is to ask yourself questions as to how you would use this
information, for example
- how often do you go from user to tags
- how often would you want to go from tag->users.
- What kind of reporting would you want to do on tags and how often
- Can multiple people add the sa
I have a very similar use case in my system; I've solved it as follows.
If all your users have a unique id, such as a login userid,
you could create a new column family, keyed by the userid, and add columns
which have no value, but whose column name is the tag value.
Searching these tags later will
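
A minimal Python sketch of that layout, with a dict standing in for the column family. The names `user_tags`, `add_tag`, and `tags_for` are hypothetical.

```python
# Column family: row key (userid) -> {column name (tag): empty value}.
user_tags = {}


def add_tag(userid, tag):
    """Store the tag as a column name with an empty value."""
    user_tags.setdefault(userid, {})[tag] = b""


def tags_for(userid):
    """Return the tags for a user, sorted as an ordered column slice would be."""
    return sorted(user_tags.get(userid, {}))


add_tag("user1", "friends")
add_tag("user1", "work")
add_tag("user1", "friends")  # adding the same tag twice is idempotent
tags = tags_for("user1")
```

Because the tag is the column name, adding a tag twice overwrites the same column instead of creating a duplicate.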
When you say 'comparator = TimeUUIDType', you're saying that all column names
are TimeUUIDs. So, when you try to create an index on a column with name
'uuid_nonindexed', it complains because that's a string, not a TimeUUID.
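
The same check can be reproduced with Python's standard `uuid` module. This is only an analogy for what the comparator validates, not Cassandra's code, and `is_valid_timeuuid_name` is a hypothetical helper.

```python
import uuid


def is_valid_timeuuid_name(name):
    """Return True only if `name` parses as a version 1 (time-based) UUID."""
    try:
        return uuid.UUID(name).version == 1
    except ValueError:
        # Plain strings such as 'uuid_nonindexed' are not UUIDs at all.
        return False


good_name = str(uuid.uuid1())
good = is_valid_timeuuid_name(good_name)
bad = is_valid_timeuuid_name("uuid_nonindexed")
```

A column family that needs string column names such as 'uuid_nonindexed' would need a comparator that accepts strings.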
- Tyler
On Mon, Dec 13, 2010 at 6:16 PM, Frank LoVecchio wrote:
> I was
On Thu, Dec 9, 2010 at 12:16 PM, David Boxenhorn wrote:
> What do you mean by, "The included secondary indexes still aren't good at
> finding keys for ranges of indexed values, such as " name > 'b' and name <
> 'c' "."?
>
> Do you mean that secondary indexes don't support range queries at all?
ht
What do you mean by, "The included secondary indexes still aren't good at
finding keys for ranges of indexed values, such as " name > 'b' and name <
'c' "."?
Do you mean that secondary indexes don't support range queries at all?
Besides supporting range queries, I see the importance of secondary
OPP is not yet obsolete.
The included secondary indexes still aren't good at finding keys for ranges
of indexed values, such as " name > 'b' and name < 'c' ". This is something
that an OPP index would be good at. Of course, you can do something similar
with one or more rows, so it's not that big
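
A sketch of the "one or more rows" alternative, assuming the indexed names are stored as sorted column names in a single index row. Python's `bisect` stands in for an ordered column slice, and `index_row` and `slice_range` are hypothetical names.

```python
import bisect

# Hypothetical index row: columns sorted by (name, base row key).
index_row = sorted([
    ("alice", "k1"),
    ("bob", "k2"),
    ("brian", "k3"),
    ("carol", "k4"),
])


def slice_range(row, low, high):
    """Return entries whose name is strictly between low and high."""
    names = [name for name, _ in row]
    start = bisect.bisect_right(names, low)
    end = bisect.bisect_left(names, high)
    return row[start:end]


in_range = slice_range(index_row, "b", "c")  # name > 'b' and name < 'c'
```

Because the columns are kept in sorted order, the range is a contiguous slice rather than a scan over every indexed value.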
- OPP becomes obsolete (OOP is not obsolete!)
- primary indexes become obsolete if you ever want to do a range query
(which you probably will...), better to assign a random row id
Taken together, it's likely that very little will remain of your old
database schema...
Am I right?
https://issues.apache.org/jira/browse/CASSANDRA-1571
On Fri, Oct 1, 2010 at 3:41 PM, Jonathan Ellis wrote:
> Yes, this is a bug. Can you create a ticket?
>
> On Fri, Oct 1, 2010 at 4:02 AM, Petr Odut wrote:
> > Hi,
> > I have CF User with secondary index on "email" column.
> > When I remove a
Yes, this is a bug. Can you create a ticket?
On Fri, Oct 1, 2010 at 4:02 AM, Petr Odut wrote:
> Hi,
> I have CF User with secondary index on "email" column.
> When I remove a whole row,
> del User['row']
> SI is not updated and get_indexed_slices still returns the deleted data.
> Is it still ope
On Fri, Oct 1, 2010 at 2:37 AM, J T wrote:
> Hi,
> I've managed to get secondary indexes working on Normal Columns. I've even
> managed to get multiple IndexExpressions to work when querying.
> However, it's not clear to me
> 1) If I should be able to have a secondary index on a sub-column of a
> s