Re: What is wrong in this token function

2016-03-11 Thread Matt Kennedy
The conversation around the partitioner sidetracks a bit from your original question. You originally asked: >> Business case: Show me all events for a given customer in a given time frame In RDBMS it will be (Query1) where customer_id = '289' and event_time >= '2016-03-01 18:45:00+' and even

Re: How to measure the write amplification of C*?

2016-03-10 Thread Matt Kennedy
eful given the difference between the synthetic workload used to create those ratings and the workload that Cassandra is producing for your particular case. You can find out more about those here: https://www.jedec.org/standards-documents/docs/jesd219a Matt Kennedy Sr. Product Manager, DSE Core

Re: How to measure the write amplification of C*?

2016-03-10 Thread Matt Kennedy
ndurance drive to replace your next round of failed units. Matt Kennedy Partner Architect | +1.703.582.5017 | matt.kenn...@datastax.com

Re: How to measure the write amplification of C*?

2016-03-10 Thread Matt Kennedy
//hothardware.com/news/google-data-center-ssd-research-report-offers-surprising-results-slc-not-more-reliable-than-mlc-flash and the paper that article mentions: http://0b4af6cdc2f0c5998459-c0245c5c937c5dedcca3f1764ecc9b2f.r43.cf2.rackcdn.com/23105-fast16-papers-schroeder.pdf Hope this helps. Ma

Re: Pig output to Cassandra

2011-03-10 Thread Matt Kennedy
On its way... https://issues.apache.org/jira/browse/CASSANDRA-1828 On Mar 10, 2011, at 11:17 PM, Mark wrote: > I thought I read somewhere that Pig has an output format that can write to > Cassandra but I am unable to find any documentation on this. Is this possible > and if so can someone pleas

Re: Understanding index builds (updated: crashed cluster)

2011-03-10 Thread Matt Kennedy
Great, that worked, thanks for your time. On Thu, Mar 10, 2011 at 4:57 PM, Jonathan Ellis wrote: > Drop the index, then restart once more. It shouldn't try to rebuild > the index after that. > > On Thu, Mar 10, 2011 at 3:36 PM, Matt Kennedy > wrote: > > Sorry, I wa

Re: Understanding index builds (updated: crashed cluster)

2011-03-10 Thread Matt Kennedy
more data than you can index > in-memory. > > You should wait for the next Hudson build (which will include 2295) > and use that. Or, create your indexes before adding the data. > > On Thu, Mar 10, 2011 at 12:26 PM, Matt Kennedy > wrote: > > Well it looks like the index

Re: Understanding index builds (updated: crashed cluster)

2011-03-10 Thread Matt Kennedy
e I get to a certain number of indexes on the column family? Thanks, Matt On Wed, Mar 9, 2011 at 8:40 PM, Jonathan Ellis wrote: > https://issues.apache.org/jira/browse/CASSANDRA-2294 > https://issues.apache.org/jira/browse/CASSANDRA-2295 > > On Wed, Mar 9, 2011 at 5:47 PM, Matt Kenned

Understanding index builds

2011-03-09 Thread Matt Kennedy
I'm trying to gain some insight into what happens with a cluster when indexes are being built, or when CFs with indexed columns are being written to. Over the past couple of days we've been doing some loads into a CF with 29 indexed columns. Eventually, the nodes just got overwhelmed and the clie

Cluster not starting up

2011-03-04 Thread Matt Kennedy
I'm currently the proud owner of an 8-node cluster that won't start up. Yesterday a developer was doing very high-volume writes to our cluster via a Hadoop job that read an HDFS file, ran six concurrent mappers on each of the 8 nodes, and used Hector to do the load, and it sort of kill

Re: How to use JConsole to connect to a Cassandra cluster in Amazon EC2?

2011-03-03 Thread Matt Kennedy
If you edit the $CASSANDRA_HOME/conf/cassandra-env.sh script, you should be able to set up SSL for the JMX connection. That should allow you to connect directly from a locally running JConsole to the JMX port on the public IP of your EC2 instance. On Wed, Mar 2, 2011 at 9:10 PM, Sameer Faroo
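A minimal sketch of what such an edit to cassandra-env.sh might look like, assuming the standard JVM system properties for JMX over SSL. The port, public IP, keystore path, and password below are hypothetical placeholders; none of them come from the original message.

```shell
# Hypothetical additions to $CASSANDRA_HOME/conf/cassandra-env.sh.
# Every value here is a placeholder, not something stated in the thread.
JVM_OPTS="${JVM_OPTS:-}"
JMX_PORT="${JMX_PORT:-7199}"        # assumed; keep whatever port your script already sets
PUBLIC_IP="203.0.113.10"            # placeholder for the EC2 instance's public IP
KEYSTORE="/path/to/keystore.jks"    # placeholder keystore path

# Expose JMX on the chosen port and require SSL plus authentication.
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.port=$JMX_PORT"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.ssl=true"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.authenticate=true"

# Keystore holding the server certificate presented to JConsole.
JVM_OPTS="$JVM_OPTS -Djavax.net.ssl.keyStore=$KEYSTORE"
JVM_OPTS="$JVM_OPTS -Djavax.net.ssl.keyStorePassword=changeit"

# Make RMI advertise the public address instead of the private one.
JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=$PUBLIC_IP"
```

On the client side, JConsole would need a truststore containing the server certificate, passed for example as `jconsole -J-Djavax.net.ssl.trustStore=/path/to/truststore.jks`. JMX also opens a second RMI port, so the EC2 security group may need to allow more than the single JMX port.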

Re: How to restrict access to cassandra jmx to private network?

2011-02-27 Thread Matt Kennedy
Apparently it is tricky; I found this: http://vafer.org/blog/20061010091658/ On Feb 26, 2011, at 4:17 PM, ruslan usifov wrote: > Hello > > For example if servers in cluster have two network interfaces, one of which > is private (accessible only from local network). Is it possible to bind jmx >

Re: map reduce job over indexed range of keys

2011-02-24 Thread Matt Kennedy
Right, so I'm interpreting silence as a confirmation on all points. I opened: https://issues.apache.org/jira/browse/CASSANDRA-2245 https://issues.apache.org/jira/browse/CASSANDRA-2246 to work on these. On Wed, Feb 23, 2011 at 5:31 PM, Matt Kennedy wrote: > Let me start out by sayin

map reduce job over indexed range of keys

2011-02-23 Thread Matt Kennedy
a into Pig with this syntax: rows = LOAD 'cassandra://mykeyspace/mycolumnfamily?country=UK' using CassandraStorage(); I'd like to get some feedback on that syntax. Thanks, Matt Kennedy

Re: Pig not reading all cassandra data

2011-02-17 Thread Matt Kennedy
aves it on for everything else. So far, I can't see any negative side effects from it. Thoughts? On Fri, Feb 11, 2011 at 3:37 PM, Matt Kennedy wrote: > Sorry it has taken me a while to get back to this. I'm still trying to get > to the bottom of this to find where the discon

Re: Pig not reading all cassandra data

2011-02-11 Thread Matt Kennedy
es so with only one mapper. It looks like the Pig map combiner isn't using the split.getLength call to determine how the maps get combined as I originally suspected. I'll update when I figure more out. -Matt On Sat, Feb 5, 2011 at 1:01 AM, Jonathan Ellis wrote: > On Fri, Feb 4, 201

Re: Pig not reading all cassandra data

2011-02-04 Thread Matt Kennedy
Found the culprit. There is a new feature in Pig 0.8 that will try to reduce the number of splits used to speed up the whole job. Since the ColumnFamilyInputFormat lists the input size as zero, this feature eliminates all of the splits except for one. The workaround is to disable this featu
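A hedged sketch of one way to apply such a workaround, assuming the feature in question is Pig's split combination and that it is controlled by the `pig.splitCombination` property. The message above is cut off before it names the setting, so treat the property as an assumption; the script name is hypothetical.

```shell
# Assumption: the Pig 0.8 feature described above is split combination,
# toggled by the pig.splitCombination property (true by default).
PIG_OPTS="${PIG_OPTS:-}"

# Pass the property to the Pig JVM through PIG_OPTS so splits are not merged.
PIG_OPTS="$PIG_OPTS -Dpig.splitCombination=false"
export PIG_OPTS

# Then run the job as usual (hypothetical script name):
#   pig my_cassandra_job.pig
```

Setting it through `PIG_OPTS` keeps the change scoped to the shell session that launches the Cassandra jobs, leaving the default in place for other jobs.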