Hi all,
I have been struggling with Cassandra’s lack of ad hoc query support (I know
this is an anti-pattern of Cassandra, but sometimes management come over
and ask me to run stuff and it’s impossible to explain that it will take me
a while when it would take about 10 seconds in MySQL) so I have
Hi Andres,
This looks awesome, many thanks for your work on this. Just out of
curiosity, how does this compare to the DSE Cassandra with embedded Solr?
Do they provide very similar functionality? Is there a list of obvious pros
and cons of one versus the other?
Thanks!
Matthew
*From:* A
fact make part of your analysis job? Kind of a pre-process/prep step?
Regards,
Shahab
On Wed, Jun 3, 2015 at 10:48 AM, Matthew Johnson
wrote:
Hi all,
I am trying to store some data (user actions in our application) for future
analysis (probably using Spark). I understand best practice i
Hi all,
I am trying to store some data (user actions in our application) for future
analysis (probably using Spark). I understand best practice is to store it
in denormalized form, and this will definitely make some of our future
queries much easier. But I have a problem with denormalizing the d
PM, Matthew Johnson
wrote:
Hi gurus,
We have ordered some hardware for a 3-node cluster, but its ETA is 6 to 8
weeks. In the meantime, I have been lent a single server that I can use. I
am wondering what the best way is to set up my single node (SN), so I can
then move to the 3-node cluster
Hi gurus,
We have ordered some hardware for a 3-node cluster, but its ETA is 6 to 8
weeks. In the meantime, I have been lent a single server that I can use. I
am wondering what the best way is to set up my single node (SN), so I can
then move to the 3-node cluster (3N) when the hardware arrives.
partition, and most of my records are written exactly once. So, I just let
the tombstones get written and they’ll eventually get compacted out and
life will go on.
It’s annoying and not ideal, but what can you do?
On Apr 29, 2015, at 2:36 AM, Matthew Johnson
wrote:
Hi all,
I have some
Hi all,
I have some fields that I am storing into Cassandra, but some of them could
be null at any given point. As there are quite a lot of them, it makes the
code much more readable if I don’t check each one for null before adding it
to the INSERT.
I can see a few Jiras around CQL 3 supporti
Hi Neha,
I guess it depends why you are adding a new node – do you need more storage
capacity, do you want better resilience, or are you trying to increase
performance?
If you add a new node with the same amount of storage as the previous two,
but you increase the RF, you will use up all of t
thing other than
increase GC pauses.
On Fri, Apr 24, 2015 at 11:50 AM Phil Yang wrote:
2015-04-23 22:16 GMT+08:00 Matthew Johnson :
In HBase, we do something like:
Put put = new Put(id);
put.add(myPojo.getTimestamp(), myPojo.getValue());
put.add(myPojo.getMySecondTimes
Hi Jimmy,
I have very limited experience with Cassandra so far, but from following
some tutorials to create keyspaces, create tables, and insert data, it
definitely seems to me like creating keyspaces and tables is way slower
than inserting data. Perhaps a more experienced user can confirm if th
to issue an ‘ALTER TABLE’
statement for every new column. I read one suggestion, which is to use
collections instead - so basically have a single pre-defined column which
is a Map, say, and then add ‘timestamp : value’ into that map instead of a
new column for every timestamp. Would you say this is
sSimpleClientBoundStatements_t.html
Jim Witschey
Software Engineer in Test | jim.witsc...@datastax.com
On Thu, Apr 23, 2015 at 9:28 AM, Matthew Johnson
wrote:
> Hi all,
>
>
>
> Currently looking at switching from HBase to Cassandra, and one big
> difference so far is that in HBas
Hi all,
Currently looking at switching from HBase to Cassandra, and one big
difference so far is that in HBase, we create a ‘Put’ object, add to it a
set of column/value pairs, and send the Put to the server. So far in
Cassandra 2.1.4 the tutorials seem to suggest using CQL3, which I really
like
Hi Bill,
To remove your address from the list, send a message to:
Cheers,
Matt
*From:* Bill Tsay [mailto:bt...@splunk.com]
*Sent:* 22 April 2015 15:36
*To:* user@cassandra.apache.org
*Subject:* unsubscribe
*From: *Mich Talebzadeh
*Reply-To: *"user@cassandra.apache.org"
*Da
do you expect - mostly read, or mostly write?
On Wed, Apr 22, 2015 at 5:06 PM, Matthew Johnson
wrote:
Hi Ali, Brian,
Thanks for the suggestion – we have previously used Solr (SolrCloud for
distribution) for a lot of other products; presumably this will do the same
job as ElasticSearch? Or does
ght find it better to use elasticsearch for your aggregate queries
and analytics. Cassandra is more of just a data store.
On Apr 22, 2015 4:42 PM, "Matthew Johnson" wrote:
Hi all,
Currently we are setting up a “big” data cluster, but we are only going to
have a couple of servers
Hi all,
Currently we are setting up a “big” data cluster. We are only going to
have a couple of servers to start with, but we need to be able to scale out
quickly when usage ramps up. Previously we have used Hadoop/HBase for our
big data cluster, but since we are starting this one on only two
on the same network, but if you can't be, you'll need to
use the public IP in listen_address.
On Mon, Apr 20, 2015 at 9:47 AM Matthew Johnson
wrote:
Hi all,
I have set up a Cassandra cluster with 2.1.4 on some existing AWS boxes,
just as a POC. Cassandra servers connect to each other over
Hi all,
I have set up a Cassandra cluster with 2.1.4 on some existing AWS boxes,
just as a POC. Cassandra servers connect to each other over their internal
AWS IP addresses (172.x.x.x) aliased in /etc/hosts as sales1, sales2 and
sales3.
I connect to it from my local dev environment using the
Hi Colin,
To remove your address from the list, send a message to:
Cheers,
Matt
*From:* Colin Clark [mailto:co...@clark.ws]
*Sent:* 20 April 2015 14:10
*To:* user@cassandra.apache.org
*Subject:* Re: Adding nodes to existing cluster
unsubscribe
On Apr 20, 2015, at 8:08 AM, C